Re: Mutation of bytes is too large for the maximum size of

2018-09-25 Thread Saladi Naidu
Soumya, thanks for the suggestion, and yes, enabling debugging is an option, but it
is very tedious, and the chatter often clutters the log and makes it hard to debug.
 
Naidu Saladi 
 

On Tuesday, September 18, 2018 5:04 PM, Soumya Jena 
 wrote:
 

 The client should notice this on their side. If you want to see it in the server
log, one idea is to enable debug mode. You can set it specifically for
org.apache.cassandra.transport, something like:

nodetool setlogginglevel org.apache.cassandra.transport DEBUG

If you are lucky enough :) (i.e. not too much chatter around the same time), you
should see the query just before that WARN message appears in the log.
You can turn off the debugging once you get the info. 
Good luck !!
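(A minimal sketch of that workflow; the logger name and nodetool usage are as quoted above, while the log path and grep pattern are only illustrative and may differ on your install.)

# raise logging for the native-transport package on the node emitting the warning
nodetool setlogginglevel org.apache.cassandra.transport DEBUG

# wait for the warning to reappear, then pull the statements logged just before it
# (log path assumed; adjust to your installation)
grep -B 20 "Mutation of" /var/log/cassandra/system.log

# restore the normal level once the offending query is captured
nodetool setlogginglevel org.apache.cassandra.transport INFO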

On Mon, Sep 17, 2018 at 9:06 PM Saladi Naidu  
wrote:

Any clues on this topic? Naidu Saladi 
 

On Thursday, September 6, 2018 9:41 AM, Saladi Naidu 
 wrote:
 

 We are receiving the following error:

9140-    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.10.jar:3.0.10]
9141-    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
9142:WARN  [SharedPool-Worker-1] 2018-09-06 14:29:46,071 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
9143-java.lang.IllegalArgumentException: Mutation of 16777251 bytes is too large for the maximum size of 16777216
9144-    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:256) ~[apache-cassandra-3.0.10.jar:3.0.10]
I found the following link that explains the cause:
"By design intent the maximum allowed segment size is 50% of the configured
commit_log_segment_size_in_mb. This is so Cassandra avoids writing segments
with large amounts of empty space. To elaborate: up to two 32MB segments will
fit into 64MB, however 40MB will only fit once, leaving a larger amount of
unused space."
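(A quick check of the numbers in the warning above, assuming the default commit_log_segment_size_in_mb of 32:

    max mutation size = commit_log_segment_size_in_mb / 2
                      = 32 MB / 2 = 16 * 1024 * 1024 = 16777216 bytes

The rejected mutation was 16777251 bytes, i.e. 35 bytes over that limit, which matches the exception.)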
"I would like to find what table/column family this write/mutation is causing 
this error so that I can reach out to right application team, log does not 
provide any details regarding the mutation at all, is there a way to find that 
out
[Linked article: "Mutation of bytes is too large for the maximum size of" - Summary: Apache Cassandra will discard mutations larger than a predetermined size. This note addresses why this h...]
 


 Naidu Saladi 


   


   

Re: Mutation of bytes is too large for the maximum size of

2018-09-17 Thread Saladi Naidu
Any clues on this topic? Naidu Saladi 
 

On Thursday, September 6, 2018 9:41 AM, Saladi Naidu 
 wrote:
 

 We are receiving the following error:

9140-    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.10.jar:3.0.10]
9141-    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
9142:WARN  [SharedPool-Worker-1] 2018-09-06 14:29:46,071 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
9143-java.lang.IllegalArgumentException: Mutation of 16777251 bytes is too large for the maximum size of 16777216
9144-    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:256) ~[apache-cassandra-3.0.10.jar:3.0.10]
I found the following link that explains the cause:
"By design intent the maximum allowed segment size is 50% of the configured
commit_log_segment_size_in_mb. This is so Cassandra avoids writing segments
with large amounts of empty space. To elaborate: up to two 32MB segments will
fit into 64MB, however 40MB will only fit once, leaving a larger amount of
unused space."
"I would like to find what table/column family this write/mutation is causing 
this error so that I can reach out to right application team, log does not 
provide any details regarding the mutation at all, is there a way to find that 
out
[Linked article: "Mutation of bytes is too large for the maximum size of" - Summary: Apache Cassandra will discard mutations larger than a predetermined size. This note addresses why this h...]
 


 Naidu Saladi 


   

Mutation of bytes is too large for the maximum size of

2018-09-06 Thread Saladi Naidu
We are receiving the following error:
9140-    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-3.0.10.jar:3.0.10]
9141-    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]

9142:WARN  [SharedPool-Worker-1] 2018-09-06 14:29:46,071 
AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-1,5,main]: {}

9143-java.lang.IllegalArgumentException: Mutation of 16777251 bytes is too 
large for the maximum size of 16777216

9144-    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:256) 
~[apache-cassandra-3.0.10.jar:3.0.10]


I found the following link that explains the cause:
"By design intent the maximum allowed segment size is 50% of the configured
commit_log_segment_size_in_mb. This is so Cassandra avoids writing segments
with large amounts of empty space. To elaborate: up to two 32MB segments will
fit into 64MB, however 40MB will only fit once, leaving a larger amount of
unused space."
"I would like to find what table/column family this write/mutation is causing 
this error so that I can reach out to right application team, log does not 
provide any details regarding the mutation at all, is there a way to find that 
out
[Linked article: "Mutation of bytes is too large for the maximum size of" - Summary: Apache Cassandra will discard mutations larger than a predetermined size. This note addresses why this h...]
 


 Naidu Saladi 


Re: Write Time of a Row in Multi DC Cassandra Cluster

2018-07-10 Thread Saladi Naidu
Simon, trace would be a significant burden on the cluster and it would have to be
on all the time. I am trying to find a way to know when a row was written, on an
on-demand basis; is there a way to determine that? Naidu Saladi
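(A minimal sketch of the trace suggestion from the reply below, run on demand from a cqlsh session so nothing is enabled cluster-wide; the key column and value are hypothetical.)

TRACING ON;

SELECT authorizations_json
FROM token_authorizations
WHERE token_id = 'example-key';   -- hypothetical key column and value

TRACING OFF;

The trace output lists every replica and data center contacted, with microsecond timings, for just that statement.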
 

On Tuesday, July 10, 2018 2:24 AM, Simon Fontana Oscarsson 
 wrote:
 

 Have you tried trace?
-- 
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscars...@ericsson.com
www.ericsson.com

On mån, 2018-07-09 at 19:30 +, Saladi Naidu wrote:
> Cassandra is an eventually consistent DB; how do we find out when a row is
> actually written in a multi-DC environment? Here is the problem I am trying to
> solve:
> 
> - I have a multi-DC (3 DCs) Cassandra cluster/ring. One of the applications
> wrote a row to DC1 (using LOCAL_QUORUM) and, within a span of 50 ms, tried to
> read the same row from DC2 and could not find it. Both of our DCs have
> sub-millisecond latency at the network level, usually <2 ms. We promised 20 ms
> consistency. In this case the application could not find the row in DC2 within
> 50 ms.
> 
> I tried to use "select WRITETIME(authorizations_json) from
> token_authorizations where " to find when the row was written in each DC,
> but both DCs returned the same timestamp. After further research I found that
> from protocol V3 onwards the timestamp is supplied at the client level, so
> WRITETIME does not help:
> "https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
> 
> So how to determine when the row is actually written in each DC?
> 
>  
> Naidu Saladi 

   

Re: Write Time of a Row in Multi DC Cassandra Cluster

2018-07-10 Thread Saladi Naidu
Alain, thanks for the response, and I completely agree with your approach, but
there is a small caveat: we have another DC in Europe. Right now this keyspace is
not replicating there, but eventually it will be added. The EU DC has significant
latency (200 ms RTT), so going with EACH_QUORUM would not be feasible. We can
reset the SLAs for consistency, but my question remains: how do we determine when
the row was written to the remote DC? Is there any way to determine that? Naidu
Saladi
 

On Tuesday, July 10, 2018 8:56 AM, Alain RODRIGUEZ  
wrote:
 

 Hello,

 I have a multi-DC (3 DCs) Cassandra cluster/ring. One of the applications wrote
a row to DC1 (using LOCAL_QUORUM) and, within a span of 50 ms, tried to read the
same row from DC2 and could not find the row.

 [...]

So how to determine when the row is actually written in each DC? 

To me, the guarantee you are trying to achieve could be obtained using
'EACH_QUORUM' for writes (i.e. a 'local_quorum' in each DC) and 'LOCAL_QUORUM'
for reads, for example. You would then have strong consistency, as long as the
same client application runs the write and then the read, or it sends a trigger
for the second call sequentially, after validating the write, in some way.

Both of our DCs have sub-millisecond latency at the network level, usually <2 ms.
We promised 20 ms consistency. In this case the application could not find the
row in DC2 within 50 ms.


In these conditions, using 'EACH_QUORUM' might not be too much of a burden for
the coordinator and the client. The writes are already being processed; this
would increase the latency at the coordinator level (and thus at the client
level), but you would be sure that every DC has the row on a majority of its
replicas before triggering the read.
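(A minimal sketch of that write/read split from cqlsh; drivers expose the same per-statement consistency setting. The token_id column and value are hypothetical.)

-- writing side: require a quorum of replicas in every DC
CONSISTENCY EACH_QUORUM;
INSERT INTO token_authorizations (token_id, authorizations_json)
VALUES ('example-key', '{}');

-- reading side, in any DC: a local quorum is then sufficient
CONSISTENCY LOCAL_QUORUM;
SELECT authorizations_json FROM token_authorizations WHERE token_id = 'example-key';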
C*heers,
---
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain
The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-07-10 8:24 GMT+01:00 Simon Fontana Oscarsson 
:

Have you tried trace?
-- 
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscarsson@ericsson.com
www.ericsson.com

On mån, 2018-07-09 at 19:30 +0000, Saladi Naidu wrote:
> Cassandra is an eventually consistent DB; how do we find out when a row is
> actually written in a multi-DC environment? Here is the problem I am trying to
> solve:
> 
> - I have a multi-DC (3 DCs) Cassandra cluster/ring. One of the applications
> wrote a row to DC1 (using LOCAL_QUORUM) and, within a span of 50 ms, tried to
> read the same row from DC2 and could not find it. Both of our DCs have
> sub-millisecond latency at the network level, usually <2 ms. We promised 20 ms
> consistency. In this case the application could not find the row in DC2 within
> 50 ms.
> 
> I tried to use "select WRITETIME(authorizations_json) from
> token_authorizations where " to find when the row was written in each DC,
> but both DCs returned the same timestamp. After further research I found that
> from protocol V3 onwards the timestamp is supplied at the client level, so
> WRITETIME does not help:
> "https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
> 
> So how to determine when the row is actually written in each DC?
> 
>  
> Naidu Saladi 



   

Write Time of a Row in Multi DC Cassandra Cluster

2018-07-09 Thread Saladi Naidu
Cassandra is an eventually consistent DB; how do we find out when a row is
actually written in a multi-DC environment? Here is the problem I am trying to
solve:
- I have a multi-DC (3 DCs) Cassandra cluster/ring. One of the applications wrote
a row to DC1 (using LOCAL_QUORUM) and, within a span of 50 ms, tried to read the
same row from DC2 and could not find the row. Both of our DCs have
sub-millisecond latency at the network level, usually <2 ms. We promised 20 ms
consistency. In this case the application could not find the row in DC2 within
50 ms.
I tried to use "select WRITETIME(authorizations_json) from token_authorizations
where " to find when the row was written in each DC, but both DCs returned the
same timestamp. After further research I found that from protocol V3 onwards the
timestamp is supplied at the client level, so WRITETIME does not help:
"https://docs.datastax.com/en/developer/java-driver/3.4/manual/query_timestamps/"
So how to determine when the row is actually written in each DC?
 Naidu Saladi 


Re: Partition Key - Wide rows?

2016-10-06 Thread Saladi Naidu
It depends on the partition/primary key design. In order to execute all 3 queries,
the partition key should be the org id and the others clustering keys. If there
are many orgs it will be OK, but if it is one org then a single partition will
hold all the data, and that is not good. Naidu Saladi
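(A minimal sketch of that layout; the column types and the title column are assumptions, and the issue id is kept as text per the description below.)

CREATE TABLE issues_by_org (
    org_id     uuid,
    team_id    uuid,
    project_id uuid,
    issue_id   text,
    title      text,    -- hypothetical payload column
    PRIMARY KEY (org_id, team_id, project_id, issue_id)
);

-- supported:     WHERE org_id = ?
--                WHERE org_id = ? AND team_id = ?
--                WHERE org_id = ? AND team_id = ? AND project_id = ?
-- not supported: WHERE issue_id = ? alone (that needs a separate lookup table)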
 

On Thursday, October 6, 2016 12:14 PM, Ali Akhtar  
wrote:
 

 Thanks, Phil.
1- In my use-case, it's probably okay to partition all the org data together.
This is for a b2b enterprise SaaS application, the customers will be 
organizations.
So it is probably okay to store each org's data next to each other, right?
2- I'm thinking of having the primary key be: (org_id, team_id, project_id, 
issue_id). 
In the above case, will there be a skinny row per issue, or a wide row per org 
/ team / project?
3- Just to double check, with the above primary key, can I still query using 
just the org_id, org + team id, and org + team + project id?
4- If I wanted to refer to a particular issue, it looks like I'd need to send 
all 4 parameters. That may be problematic. Is there a better way of modeling 
this data?


On Thu, Oct 6, 2016 at 9:30 PM, Philip Persad  wrote:



1) No.  Your first 3 queries will work but not the last one (get issue by id).  
In Cassandra when you query you must include every preceding portion of the 
primary key.

2) 64 bytes (16 * 4), or somewhat more if storing as strings?  I don't think 
that's something I'd worry too much about.

3) Depends on how you build your partition key.  If partition key is (org id), 
then you get one partition per org (probably bad depending on your dataset).  
If partition key is (org id, team id, project id) then you will have one 
partition per project which is probably fine ( again, depending on your 
dataset).

Cheers,

-Phil

From: Ali Akhtar
Sent: 2016-10-06 9:04 AM
To: user@cassandra.apache.org
Subject: Partition Key - Wide rows?

Heya,
I'm designing some tables, where data needs to be stored in the following 
hierarchy:
Organization -> Team -> Project -> Issues
I need to be able to retrieve issues:
- For the whole org - using org id
- For a team (org id + team id)
- For a project (org id + team id + project id)
- If possible, by using just the issue id

I'm considering using all 4 ids as the primary key. The first 3 will use UUIDs,
except issue id which will be an alphanumeric string, unique per project.

1) Will this setup allow using all 4 query scenarios?
2) Will this make the primary key really long, 3 UUIDs + a similar length'd issue id?
3) Will this store issues as skinny rows, or wide rows? If an org has a lot of teams, which have a lot of projects, which have a lot of issues, etc, could I have issues w/ running out of the column limit of wide rows?
4) Is there a better way of achieving this scenario?







   

Re: system_distributed.repair_history table

2016-10-06 Thread Saladi Naidu
Thanks for the response. It makes sense to truncate it periodically since it is
only for debugging purposes. Naidu Saladi
 

On Wednesday, October 5, 2016 8:03 PM, Chris Lohfink <clohfin...@gmail.com> 
wrote:
 

 The only current solution is to truncate it periodically. I opened
https://issues.apache.org/jira/browse/CASSANDRA-12701 about it, if you are
interested in following it.
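(A minimal sketch of that periodic cleanup, run from cqlsh; TRUNCATE applies cluster-wide, so once per cluster is enough, and this assumes nothing else depends on the stored repair history.)

TRUNCATE system_distributed.repair_history;
TRUNCATE system_distributed.parent_repair_history;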
On Wed, Oct 5, 2016 at 4:23 PM, Saladi Naidu <naidusp2...@yahoo.com> wrote:

We are seeing the following warnings in system.log. As
compaction_large_partition_warning_threshold_mb in cassandra.yaml is at its
default value of 100, we are seeing these warnings:

110:WARN  [CompactionExecutor:91798] 2016-10-05 00:54:05,554 BigTableWriter.java:184 - Writing large partition system_distributed/repair_history:gccatmer:mer_admin_job (115943239 bytes)
111:WARN  [CompactionExecutor:91798] 2016-10-05 00:54:13,303 BigTableWriter.java:184 - Writing large partition system_distributed/repair_history:gcconfigsrvcks:user_activation (163926097 bytes)

When I looked at the table definition, it is partitioned by keyspace and column
family, and the repair history is maintained under this partition. When I looked
at the row counts, most of the partitions have >200,000 rows, and these will keep
growing because of the partitioning strategy. There is no TTL on this, so any
idea what the solution is for reducing the partition size?

I also looked at the size_estimates table for this column family and found that
the mean partition size for each range is 50,610,179, which is very large
compared to any other table.



   

system_distributed.repair_history table

2016-10-05 Thread Saladi Naidu
We are seeing the following warnings in system.log. As
compaction_large_partition_warning_threshold_mb in cassandra.yaml is at its
default value of 100, we are seeing these warnings:
110:WARN  [CompactionExecutor:91798] 2016-10-05 00:54:05,554 
BigTableWriter.java:184 - Writing large partition 
system_distributed/repair_history:gccatmer:mer_admin_job (115943239 bytes)
111:WARN  [CompactionExecutor:91798] 2016-10-05 00:54:13,303 
BigTableWriter.java:184 - Writing large partition 
system_distributed/repair_history:gcconfigsrvcks:user_activation (163926097 
bytes)


When I looked at the table definition, it is partitioned by keyspace and column
family, and the repair history is maintained under this partition. When I looked
at the row counts, most of the partitions have >200,000 rows, and these will keep
growing because of the partitioning strategy. There is no TTL on this, so any
idea what the solution is for reducing the partition size?

I also looked at the size_estimates table for this column family and found that
the mean partition size for each range is 50,610,179, which is very large
compared to any other table.

Re: Many keyspaces pattern

2015-11-24 Thread Saladi Naidu
I can think of the following features to solve this:
1. If you know after how long the data should be removed, then use the TTL
feature (see the sketch below).
2. Use a time-series model for the data and use an inverted index to query the
data by time period.
Naidu Saladi
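(A minimal sketch of the TTL option, assuming a 90-day retention; the table reuses the computation_results definition from the original post below.)

CREATE TABLE computation_results (
    batch_id int,
    id1      int,
    id2      int,
    value    double,
    PRIMARY KEY ((batch_id, id1), id2)
) WITH CLUSTERING ORDER BY (id2 ASC)
  AND default_time_to_live = 7776000;   -- 90 days, in seconds

-- or per write, overriding the table default:
INSERT INTO computation_results (batch_id, id1, id2, value)
VALUES (42, 1, 1, 3.14) USING TTL 7776000;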
 


On Tuesday, November 24, 2015 6:49 AM, Jack Krupansky 
 wrote:
 

 How often is sometimes - closer to 20% of the batches or 2%?
How are you querying batches, both current and older ones?
As always, your queries should drive your data models.
If deleting a batch is very infrequent, maybe best to not do it and simply have 
logic in the app to ignore deleted batches - if your queries would reference 
them at all.
What reasons would you have to delete a batch? Depending on the nature of the 
reason there may be an alternative.
Make sure your cluster is adequately provisioned so that these expensive 
operations can occur in parallel to reduce their time and resources per node.
Do all batches eventually get aged and deleted or are you expecting that most 
batches will live for many years to come? Have you planned for how you will 
grow the cluster over time?
Maybe bite the bullet and use a background process to delete a batch if 
deletion is competing too heavily with query access - if they really need to be 
deleted at all.
Number of keyspaces - and/or tables - should be limited to "low hundreds", and 
even then you are limited by RAM and CPU of each node. If a keyspace has 14 
tables, then 250/14 = 20 would be a recommended upper limit for number of key 
spaces. Even if your total number of tables was under 300 or even 200, you 
would need to do a proof of concept implementation to verify that your specific 
data works well on your specific hardware.

-- Jack Krupansky
On Tue, Nov 24, 2015 at 5:05 AM, Jonathan Ballet  wrote:

Hi,

we are running an application which produces every night a batch with several 
hundreds of Gigabytes of data. Once a batch has been computed, it is never 
modified (nor updates nor deletes), we just keep producing new batches every 
day.

Now, we are *sometimes* interested to remove a complete specific batch 
altogether. At the moment, we are accumulating all these data into only one 
keyspace which has a batch ID column in all our tables which is also part of 
the primary key. A sample table looks similar to this:

  CREATE TABLE computation_results (
      batch_id int,
      id1 int,
      id2 int,
      value double,
      PRIMARY KEY ((batch_id, id1), id2)
  ) WITH CLUSTERING ORDER BY (id2 ASC);

But we found out it is very difficult to remove a specific batch as we need to 
know all the IDs to delete the entries and it's both time and resource 
consuming (ie. it takes a long time and I'm not sure it's going to scale at 
all.)

So, we are currently looking into having each of our batches in a keyspace of 
their own so removing a batch is merely equivalent to delete a keyspace. 
Potentially, it means we will end up having several hundreds of keyspaces in 
one cluster, although most of the time only the very last one will be used (we 
might still want to access the older ones, but that would be a very seldom 
use-case.) At the moment, the keyspace has about 14 tables and is probably not 
going to evolve much.
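(A minimal sketch of the keyspace-per-batch idea described above; the keyspace name, DC name, and replication factor are illustrative, and batch_id disappears from the table since the keyspace itself identifies the batch.)

CREATE KEYSPACE batch_20151124
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

CREATE TABLE batch_20151124.computation_results (
    id1   int,
    id2   int,
    value double,
    PRIMARY KEY (id1, id2)
);

-- removing the whole batch later is a single DDL statement:
DROP KEYSPACE batch_20151124;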


Are there any counter-indications of using lot of keyspaces (300+) into one 
Cassandra cluster?
Are there any good practices that we should follow?
After reading the "Anti-patterns in Cassandra > Too many keyspaces or tables", 
does it mean we should plan ahead to already split our keyspace among several 
clusters?

Finally, would there be any other way to achieve what we want to do?

Thanks for your help!

 Jonathan




  

Collections (MAP) data in Column Family

2015-10-14 Thread Saladi Naidu
We are running Apache Cassandra 2.1.9. One of our column families has a MAP-type
column. We are seeing an unusual data size for the column family (SSTables) with
only a few thousand rows. While debugging, I looked at one of the SSTables and I
see some unusual data in it.

Below is the JSON of one row key's data:
1. There is the usual column name, key-value pair and timestamp for the MAP
column, all_products.
2. After the key-value pairs, I see clustering-column-style data in the MAP with
a "t" marker in between, and this is literally repeated for millions of cells:
all_products:_","all_products:!",1442797965371999,"t",1442797965

Any clues on what is happening here? I know the "d" marker means marked for
delete and the "e" marker means TTL, but I don't know what the "t" marker is for.
[{"key": "55736100", "cells": [["","",1444101633184000],           
["active","false",1444101633184000],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442806687091999,"t",1442806687],           
["all_products:_","all_products:!",1443410022982999,"t",1443410022],           
["all_products:_","all_products:!",1443410595224999,"t",1443410595],           
["all_products:_","all_products:!",1443679978903999,"t",1443679978],           
["all_products:_","all_products:!",1444011801906999,"t",1444011801],           
["all_products:_","all_products:!",1444101633183999,"t",1444101633],           
["all_products:3135393730323130305f63735f435a","313539373032313030",1444101633184000],
           
["all_products:3135393730323130305f64655f4154","313539373032313030",1444101633184000],
           
["all_products:3135393730323130305f64655f4348","313539373032313030",1444101633184000],
           
["all_products:3135393730323130305f64655f4445","313539373032313030",1444101633184000],
           
["all_products:3135393730323130305f656e5f4348","313539373032313030",1444101633184000],
.["all_products:3233393238333430305f69745f4348","323339323833343030",1444101633184000],
           ["all_products:_","all_products:!",1442797965371999,"t",1442797965], 
          ["all_products:_","all_products:!",1442797965371999,"t",1442797965],  
         ["all_products:_","all_products:!",1442806687091999,"t",1442806687],   
        ["all_products:_","all_products:!",1442797965371999,"t",1442797965],    
       ["all_products:_","all_products:!",1442797965371999,"t",1442797965],     
      ["all_products:_","all_products:!",1442806687091999,"t",1442806687],      
     ["all_products:_","all_products:!",1442797965371999,"t",1442797965],       
    ["all_products:_","all_products:!",1442797965371999,"t",1442797965],        
   ["all_products:_","all_products:!",1442806687091999,"t",1442806687],         
  ["all_products:_","all_products:!",1442797965371999,"t",1442797965],          
 ["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442806687091999,"t",1442806687],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442806687091999,"t",1442806687],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442806687091999,"t",1442806687],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442806687091999,"t",1442806687],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442797965371999,"t",1442797965],           
["all_products:_","all_products:!",1442806687091999,"t",1442806687],

 Naidu Saladi 


Re: LTCS Strategy Resulting in multiple SSTables

2015-09-16 Thread Saladi Naidu
Nate, yes, we are in the process of upgrading to 2.1.9. Meanwhile I am looking
into correcting the problem; do you know any recovery options to reduce the
number of SSTables? As the SSTables keep increasing, read performance is
deteriorating. Naidu Saladi

  From: Nate McCall <n...@thelastpickle.com>
 To: Cassandra Users <user@cassandra.apache.org>; Saladi Naidu 
<naidusp2...@yahoo.com> 
 Sent: Tuesday, September 15, 2015 4:53 PM
 Subject: Re: LTCS Strategy Resulting in multiple SSTables
   
That's an early 2.1/known buggy version. There have been several issues fixed 
since which could cause that behavior. Most likely 
https://issues.apache.org/jira/browse/CASSANDRA-9592 ? 
Upgrade to 2.1.9 and see if the problem persists. 


On Tue, Sep 15, 2015 at 8:31 AM, Saladi Naidu <naidusp2...@yahoo.com> wrote:

We are on 2.1.2 and planning to upgrade to 2.1.9  Naidu Saladi 

  From: Marcus Eriksson <krum...@gmail.com>
 To: user@cassandra.apache.org; Saladi Naidu <naidusp2...@yahoo.com> 
 Sent: Tuesday, September 15, 2015 1:53 AM
 Subject: Re: LTCS Strategy Resulting in multiple SSTables
   
if you are on Cassandra 2.2, it is probably this: 
https://issues.apache.org/jira/browse/CASSANDRA-10270


On Tue, Sep 15, 2015 at 4:37 AM, Saladi Naidu <naidusp2...@yahoo.com> wrote:

We are using Level Tiered Compaction Strategy on a Column Family. Below are 
CFSTATS from two nodes in same cluster, one node has 880 SStables in L0 whereas 
one node just has 1 SSTable in L0. In the node where there are multiple 
SStables, all of them are small size and created same time stamp. We ran 
Compaction, it did not result in much change, node remained with huge number of 
SStables. Due to this large number of SSTables, Read performance is being 
impacted
In same cluster, under same keyspace, we are observing this discrepancy in 
other column families as well. What is going wrong? What is the solution to fix 
this
---NODE1---
   Table: category_ranking_dedup
   SSTable count: 1
   SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
   Space used (live): 2012037
   Space used (total): 2012037
   Space used by snapshots (total): 0
   SSTable Compression Ratio: 0.07677216119569073
   Memtable cell count: 990
   Memtable data size: 32082
   Memtable switch count: 11
   Local read count: 2842
   Local read latency: 3.215 ms
   Local write count: 18309
   Local write latency: 5.008 ms
   Pending flushes: 0
   Bloom filter false positives: 0
   Bloom filter false ratio: 0.0
   Bloom filter space used: 816
   Compacted partition minimum bytes: 87
   Compacted partition maximum bytes: 25109160
   Compacted partition mean bytes: 22844
   Average live cells per slice (last five minutes): 338.84588318085855
   Maximum live cells per slice (last five minutes): 10002.0
   Average tombstones per slice (last five minutes): 36.53307529908515
   Maximum tombstones per slice (last five minutes): 36895.0

---NODE2---
   Table: category_ranking_dedup
   SSTable count: 808
   SSTables in each level: [808/4, 0, 0, 0, 0, 0, 0, 0, 0]
   Space used (live): 291641980
   Space used (total): 291641980
   Space used by snapshots (total): 0
   SSTable Compression Ratio: 0.1431106696818256
   Memtable cell count: 4365293
   Memtable data size: 3742375
   Memtable switch count: 44
   Local read count: 2061
   Local read latency: 31.983 ms
   Local write count: 30096
   Local write latency: 27.449 ms
   Pending flushes: 0
   Bloom filter false positives: 0
   Bloom filter false ratio: 0.0
   Bloom filter space used: 54544
   Compacted partition minimum bytes: 87
   Compacted partition maximum bytes: 25109160
   Compacted partition mean bytes: 634491
   Averag

Re: LTCS Strategy Resulting in multiple SSTables

2015-09-15 Thread Saladi Naidu
We are on 2.1.2 and planning to upgrade to 2.1.9  Naidu Saladi 

  From: Marcus Eriksson <krum...@gmail.com>
 To: user@cassandra.apache.org; Saladi Naidu <naidusp2...@yahoo.com> 
 Sent: Tuesday, September 15, 2015 1:53 AM
 Subject: Re: LTCS Strategy Resulting in multiple SSTables
   
if you are on Cassandra 2.2, it is probably this: 
https://issues.apache.org/jira/browse/CASSANDRA-10270


On Tue, Sep 15, 2015 at 4:37 AM, Saladi Naidu <naidusp2...@yahoo.com> wrote:

We are using Level Tiered Compaction Strategy on a Column Family. Below are 
CFSTATS from two nodes in same cluster, one node has 880 SStables in L0 whereas 
one node just has 1 SSTable in L0. In the node where there are multiple 
SStables, all of them are small size and created same time stamp. We ran 
Compaction, it did not result in much change, node remained with huge number of 
SStables. Due to this large number of SSTables, Read performance is being 
impacted
In same cluster, under same keyspace, we are observing this discrepancy in 
other column families as well. What is going wrong? What is the solution to fix 
this
---NODE1---
   Table: category_ranking_dedup
   SSTable count: 1
   SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 0]
   Space used (live): 2012037
   Space used (total): 2012037
   Space used by snapshots (total): 0
   SSTable Compression Ratio: 0.07677216119569073
   Memtable cell count: 990
   Memtable data size: 32082
   Memtable switch count: 11
   Local read count: 2842
   Local read latency: 3.215 ms
   Local write count: 18309
   Local write latency: 5.008 ms
   Pending flushes: 0
   Bloom filter false positives: 0
   Bloom filter false ratio: 0.0
   Bloom filter space used: 816
   Compacted partition minimum bytes: 87
   Compacted partition maximum bytes: 25109160
   Compacted partition mean bytes: 22844
   Average live cells per slice (last five minutes): 338.84588318085855
   Maximum live cells per slice (last five minutes): 10002.0
   Average tombstones per slice (last five minutes): 36.53307529908515
   Maximum tombstones per slice (last five minutes): 36895.0

---NODE2---
   Table: category_ranking_dedup
   SSTable count: 808
   SSTables in each level: [808/4, 0, 0, 0, 0, 0, 0, 0, 0]
   Space used (live): 291641980
   Space used (total): 291641980
   Space used by snapshots (total): 0
   SSTable Compression Ratio: 0.1431106696818256
   Memtable cell count: 4365293
   Memtable data size: 3742375
   Memtable switch count: 44
   Local read count: 2061
   Local read latency: 31.983 ms
   Local write count: 30096
   Local write latency: 27.449 ms
   Pending flushes: 0
   Bloom filter false positives: 0
   Bloom filter false ratio: 0.0
   Bloom filter space used: 54544
   Compacted partition minimum bytes: 87
   Compacted partition maximum bytes: 25109160
   Compacted partition mean bytes: 634491
   Average live cells per slice (last five minutes): 416.1780688985929
   Maximum live cells per slice (last five minutes): 10002.0
   Average tombstones per slice (last five minutes): 45.11547792333818
   Maximum tombstones per slice (last five minutes): 36895.0


 Naidu Saladi 




   

LTCS Strategy Resulting in multiple SSTables

2015-09-14 Thread Saladi Naidu
We are using Level Tiered Compaction Strategy on a Column Family. Below are 
CFSTATS from two nodes in same cluster, one node has 880 SStables in L0 whereas 
one node just has 1 SSTable in L0. In the node where there are multiple 
SStables, all of them are small size and created same time stamp. We ran 
Compaction, it did not result in much change, node remained with huge number of 
SStables. Due to this large number of SSTables, Read performance is being 
impacted
In same cluster, under same keyspace, we are observing this discrepancy in 
other column families as well. What is going wrong? What is the solution to fix 
this
---NODE1---

   Table: category_ranking_dedup

   SSTable count: 1

   SSTables in each level: [1, 0, 0, 0, 0, 0, 0, 0, 
0]

   Space used (live): 2012037

   Space used (total): 2012037

   Space used by snapshots (total): 0

   SSTable Compression Ratio: 0.07677216119569073

   Memtable cell count: 990

   Memtable data size: 32082

   Memtable switch count: 11

   Local read count: 2842

   Local read latency: 3.215 ms

   Local write count: 18309

   Local write latency: 5.008 ms

   Pending flushes: 0

   Bloom filter false positives: 0

   Bloom filter false ratio: 0.0

   Bloom filter space used: 816

   Compacted partition minimum bytes: 87

   Compacted partition maximum bytes: 25109160

   Compacted partition mean bytes: 22844

   Average live cells per slice (last five 
minutes): 338.84588318085855

   Maximum live cells per slice (last five 
minutes): 10002.0

   Average tombstones per slice (last five 
minutes): 36.53307529908515

   Maximum tombstones per slice (last five 
minutes): 36895.0

 NODE2---  

Table: category_ranking_dedup

   SSTable count: 808

   SSTables in each level: [808/4, 0, 0, 0, 0, 0, 
0, 0, 0]

   Space used (live): 291641980

   Space used (total): 291641980

   Space used by snapshots (total): 0

   SSTable Compression Ratio: 0.1431106696818256

   Memtable cell count: 4365293

   Memtable data size: 3742375

   Memtable switch count: 44

   Local read count: 2061

   Local read latency: 31.983 ms

   Local write count: 30096

   Local write latency: 27.449 ms

   Pending flushes: 0

   Bloom filter false positives: 0

   Bloom filter false ratio: 0.0

   Bloom filter space used: 54544

   Compacted partition minimum bytes: 87

   Compacted partition maximum bytes: 25109160

   Compacted partition mean bytes: 634491

   Average live cells per slice (last five 
minutes): 416.1780688985929

   Maximum live cells per slice (last five 
minutes): 10002.0

   Average tombstones per slice (last five 
minutes): 45.11547792333818

   Maximum tombstones per slice (last five 
minutes): 36895.0




 Naidu Saladi 


Data Distribution in Table/Column Family

2015-08-27 Thread Saladi Naidu
Is there a way to find out how data is distributed within a column family on
each node? nodetool shows how data is distributed across nodes, but that only
shows the total data per node. We are seeing heavy load on one node and I suspect
that partitioning is not distributing the data equally, but to prove that to the
development team we need the stats for that table. Naidu Saladi


Re: DROP Table

2015-07-13 Thread Saladi Naidu
Sebastian, thank you so much for providing a detailed explanation. I still have
some questions and I need to provide some clarifications:
1. We do not have code that creates the tables dynamically. All DDL operations
are done through the DataStax DevCenter tool. When you say allow the schema to
settle, do you mean we should provide a proper consistency level? I don't think
there is a provision to do that in the tool. Or should I change the SYSTEM
keyspace definition to a replication factor equal to the number of nodes?
2. In the steps described below for correcting this problem, when you say move
the data from the old directory to the new one, do you mean move the .db files?
That will override the current files, right?
3. Do we have to rename the directory to remove the CFID, i.e. just the column
family name without the CFID? After that, update the system table as well? Naidu
Saladi

  From: Sebastian Estevez sebastian.este...@datastax.com
 To: user@cassandra.apache.org; Saladi Naidu naidusp2...@yahoo.com 
 Sent: Friday, July 10, 2015 5:25 PM
 Subject: Re: DROP Table
   
#1 The cause of this problem is a CREATE TABLE statement collision. Do not
generate tables dynamically from multiple clients, even with IF NOT EXISTS. The
first thing you need to do is fix your code so that this does not happen. Just
create your tables manually from cqlsh, allowing time for the schema to settle.

#2 Here's the fix:

1) Change your code to not automatically re-create tables (even with IF NOT
EXISTS).
2) Run a rolling restart to ensure the schema matches across nodes. Run nodetool
describecluster around your cluster and check that there is only one schema
version.

ON EACH NODE:
3) Check your filesystem and see if you have two directories for the table in
question in the data directory.

IF THERE ARE TWO OR MORE DIRECTORIES:
4) Identify from schema_column_families which cf ID is the new one (currently in
use):

cqlsh -e "select * from system.schema_column_families" | grep <table name>

5) Move the data from the old one to the new one and remove the old directory.
6) If there are multiple old ones, repeat 5 for every old directory.
7) Run nodetool refresh.

IF THERE IS ONLY ONE DIRECTORY:
No further action is needed.
All the best,
Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com


DataStax is the fastest, most scalable distributed database technology, 
delivering Apache Cassandra to the world’s most innovative enterprises. 
Datastax is built to be agile, always-on, and predictably scalable to any size. 
With more than 500 customers in 45 countries, DataStax is the database 
technology and transactional backbone of choice for the worlds most innovative 
companies such as Netflix, Adobe, Intuit, and eBay. 


On Fri, Jul 10, 2015 at 12:15 PM, Saladi Naidu naidusp2...@yahoo.com wrote:

My understanding is that the Cassandra file structure follows the naming
convention:

/cassandra/data/<keyspace>/<table>/

Whereas our file structure is as below: each table has multiple directories, and
when we drop and recreate tables these directories remain. Also, when we dropped
the table, one node was down; when it came back we tried to run nodetool repair,
and the repair kept failing, referring to the CFID error listed below.

drwxr-xr-x. 16 cass cass 4096 May 24 06:49 ../
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_by_user-e0eec95019a211e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 application_info-4dba2bf0054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_info-a0ee65d019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 configproperties-228ea2e0c13811e4aa1d4ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_activation-95d005f019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:16 user_app_permission-9fddcd62ffbe11e4a25a45259f96ec68/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_credential-86cfff1019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_info-2fa076221b1011e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 user_info-36028c00054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:15 user_info-fe1d7b101a5711e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jun 25 10:16 user_role-9ed0ca30ffbe11e4b71d09335ad2d5a9/

WARN [Thread-2579] 2015-07-02 16:02:27,523 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=218e3c90-1b0e-11e5-a34b-d7c17b3e318a
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:322) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:302

DROP Table

2015-07-10 Thread Saladi Naidu
My understanding is that the Cassandra file structure follows the naming
convention:

/cassandra/data/<keyspace>/<table>/



Whereas our file structure is as below: each table has multiple directories, and
when we drop and recreate tables these directories remain. Also, when we dropped
the table, one node was down; when it came back we tried to run nodetool repair,
and the repair kept failing, referring to the CFID error listed below.

drwxr-xr-x. 16 cass cass 4096 May 24 06:49 ../
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_by_user-e0eec95019a211e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 application_info-4dba2bf0054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 application_info-a0ee65d019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 configproperties-228ea2e0c13811e4aa1d4ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_activation-95d005f019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:16 user_app_permission-9fddcd62ffbe11e4a25a45259f96ec68/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_credential-86cfff1019a311e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jul  2 11:09 user_info-2fa076221b1011e58b954ffc8e9bfaa6/
drwxr-xr-x.  2 cass cass 4096 Jun 25 10:15 user_info-36028c00054f11e58b954ffc8e9bfaa6/
drwxr-xr-x.  3 cass cass 4096 Jun 25 10:15 user_info-fe1d7b101a5711e58b954ffc8e9bfaa6/
drwxr-xr-x.  4 cass cass 4096 Jun 25 10:16 user_role-9ed0ca30ffbe11e4b71d09335ad2d5a9/



WARN [Thread-2579] 2015-07-02 16:02:27,523 IncomingTcpConnection.java:91 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=218e3c90-1b0e-11e5-a34b-d7c17b3e318a
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:322) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:302) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:272) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:168) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:150) ~[apache-cassandra-2.1.2.jar:2.1.2]
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:82) ~[apache-cassandra-2.1.2.jar:2.1.2]

  Naidu Saladi 


Read Repair

2015-07-08 Thread Saladi Naidu
Suppose I have a row of existing data with a set of attribute values, call it
State1, and I issue an update to some columns with QUORUM consistency. If the
write succeeds on one node, Node1, and fails on the remaining nodes, then, as
there is no rollback, Node1's row will have the new state, State2, and the rest
of the nodes' rows will have the old state, State1. If I do a read and Cassandra
detects the state difference, it will issue a read repair, which results in the
new state, State2, being propagated to the other nodes. But from the
application's point of view the update never happened, because it received an
exception. How should this kind of situation be handled? Naidu Saladi


Re: Example Data Modelling

2015-07-08 Thread Saladi Naidu
If you go with month as the partition key then you need to duplicate the data. I
don't think going with the name as the partition key is good data-model practice,
as it will create a hotspot. Also, I believe your queries will mostly be by
employee, not by month.
You can make the employee id the partition key and month a clustering column, and
keep the employee details as static columns so they won't be repeated (see the
sketch below). Naidu Saladi
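(A minimal sketch of that layout; the column names and types are assumed from the RDBMS description further down, and STATIC columns are stored once per partition, i.e. once per employee.)

CREATE TABLE salaries_by_employee (
    empid              varchar,
    month              int,
    fn                 varchar STATIC,   -- employee details, stored once per empid
    ln                 varchar STATIC,
    phone              varchar STATIC,
    address            varchar STATIC,
    basic              int,
    flexible_allowance float,
    PRIMARY KEY (empid, month)
);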

  From: Srinivasa T N seen...@gmail.com
 To: user@cassandra.apache.org user@cassandra.apache.org 
 Sent: Tuesday, July 7, 2015 3:07 AM
 Subject: Re: Example Data Modelling
   
Thanks for the inputs.

Now my question is how should the app populate the duplicate data, i.e., if I 
have an employee record (along with his FN, LN,..) for the month of Apr and 
later I am populating the same record for the month of may (with salary 
changed), should my application first read/fetch the corresponding data for apr 
and re-insert with modification for month of may?

Regards,
Seenu.



On Tue, Jul 7, 2015 at 11:32 AM, Peer, Oded oded.p...@rsa.com wrote:

The data model suggested isn't optimal for the "end of month" query you want to
run, since you are not querying by partition key. The query would look like
"select EmpID, FN, LN, basic from salaries where month = 1", which requires
filtering and has unpredictable performance. For this type of query to be fast
you can use the "month" column as the partition key and "EmpID" as the clustering
column. This approach also has drawbacks:
1. This data model creates a wide row. Depending on the number of employees this
partition might be very large. You should limit partition sizes to 25MB.
2. Distributing data according to month means that only a small number of nodes
will hold all of the salary data for a specific month, which might cause hotspots
on those nodes.
Choose the approach that works best for you.

From: Carlos Alonso [mailto:i...@mrcalonso.com]
Sent: Monday, July 06, 2015 7:04 PM
To: user@cassandra.apache.org
Subject: Re: Example Data Modelling

Hi Srinivasa,

I think you're right. In Cassandra you should favor denormalisation when in an
RDBMS you find a relationship like this. I'd suggest a cf like this:

CREATE TABLE salaries (
    EmpID varchar,
    FN varchar,
    LN varchar,
    Phone varchar,
    Address varchar,
    month integer,
    basic integer,
    flexible_allowance float,
    PRIMARY KEY (EmpID, month)
);

That way the salaries will be partitioned by EmpID and clustered by month, which
I guess is the natural sorting you want.

Hope it helps, Cheers!
Carlos Alonso | Software Engineer | @calonso

On 6 July 2015 at 13:01, Srinivasa T N seen...@gmail.com wrote:
Hi,
I have a basic doubt: I have an RDBMS with the following two tables:

   Emp - EmpID, FN, LN, Phone, Address
   Sal - Month, Empid, Basic, Flexible Allowance

   My use case is to print the Salary slip at the end of each month and the 
slip contains emp name and his other details.

   Now, if I want to have the same in cassandra, I will have a single cf with 
emp personal details and his salary details.  Is this the right approach?  
Should we have the employee personal details duplicated each month?

Regards,
Seenu. 



  

Re: Catastrophe Recovery.

2015-06-15 Thread Saladi Naidu
Alain, great write-up on the recovery procedure. You covered both the RF and the
consistency levels. As mentioned, the two anti-entropy mechanisms, hinted
handoffs and read repair, work for temporary node outages and incremental
recovery. In case of disaster/catastrophic recovery, nodetool repair is the best
way to recover.
Would the procedure below have ensured the node was added properly to the cluster?
Adding nodes to an existing cluster | DataStax Cassandra 2.0 Documentation -
Steps to add nodes when using virtual nodes. (docs.datastax.com)

   Naidu Saladi 

  From: Jean Tremblay jean.tremb...@zen-innovations.com
 To: user@cassandra.apache.org user@cassandra.apache.org 
 Sent: Monday, June 15, 2015 10:58 AM
 Subject: Re: Catastrophe Recovery.
   
That is really wonderful. Thank you very much Alain. You gave me a lot of 
trails to investigate. Thanks again for you help.



On 15 Jun 2015, at 17:49 , Alain RODRIGUEZ arodr...@gmail.com wrote:
Hi, it looks like you're starting to use Cassandra.
Welcome.
I invite you to read from here as much as you can 
http://docs.datastax.com/en/cassandra/2.1/cassandra/gettingStartedCassandraIntro.html.
When a node loses some data you have various anti-entropy mechanisms:

- Hinted handoff -- for writes that occurred while the node was down and that
other nodes know about (exclusively).
- Read repair -- on each read, you can set a chance to check other nodes for
automatic correction.
- Repair (called either manual / anti-entropy / full / ...) -- which takes care
of giving a node back its missing data, either only for the ranges this node owns
(-pr) or for all its data (its ranges plus its replicas); see the sketch after
this list. This is something you generally want to perform on all nodes on a
regular basis, at an interval lower than the lowest gc_grace_seconds set on any
of your tables.
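(A minimal sketch of that third option on a 2.1 node; -pr repairs only the primary ranges, so it must then be scheduled on every node in turn.)

# full repair of this node's data (its ranges plus the replicas it holds)
nodetool repair

# or primary-range-only repair, run on every node on a rolling schedule
nodetool repair -pr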
Also, you are getting wrong values because you probably have a consistency level
(CL) that is too low. If you want this to never happen you have to set the read
(R) and write (W) consistency levels such that R + W > RF (Replication Factor);
if not, you can see what you are currently seeing. I advise you to set your
consistency to 'local_quorum' (or 'quorum' in a single-DC environment). Also,
with 3 nodes, you should set RF to 3; if not, you won't be able to reach strong
consistency due to the formula I just gave you.
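(A worked example of that inequality with the numbers from this thread, where RF = 3 and QUORUM = floor(RF/2) + 1 = 2:

    W = QUORUM = 2, R = QUORUM = 2  ->  R + W = 4 > RF = 3, so every read overlaps the written replicas.
    W = ONE = 1,    R = ONE = 1     ->  R + W = 2 < RF = 3, so a read may hit only replicas that missed the write.)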
There is a lot more to know, you should read about this all. Using Cassandra 
without knowing about its internals would lead you to very poor and unexpected 
results.
To answer your questions:
For what I understand, if you have a fixed node with no data it will 
automatically bootstrap and recover all its old data from its neighbour while 
doing the joining phase. Is this correct?

-- Not at all, unless it joins the ring for the first time, which is not your
case. Though it will (by default) slowly recover while you read.
After such catastrophe, and after the joining phase is done should the cluster 
not be ready to deliver always consistent data if there was no inserts or 
delete during the catastrophe?
No, we can't ensure that, except by dropping the node and bootstrapping a new
one. What we can make sure of is that there are enough replicas remaining to
serve consistent data (search for RF and CL).
After the bootstrap of a broken node is finish, i.e. after the joining phase, 
is there not simply a repair to be done on that node using “node repair?
This premise is false: the bootstrap/joining phase ≠ a broken node coming back.
You are right about repair: if a broken node (or one down for too long - default
3 hours) comes back, you have to repair. But repair is slow; make sure you can
afford it, see my previous answer.
Testing is a really good idea but you also have to read a lot imho.
Good luck,
C*heers,
Alain

2015-06-15 11:13 GMT+02:00 Jean Tremblay jean.tremb...@zen-innovations.com:




Hi,

I have a cluster of 3 nodes RF: 2.
There are about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for seeds and the
throughput have been changed.
I have tested a scenario where one node crashes and loses all its data. I deleted
all data on this node after having stopped Cassandra. At this point I noticed
that the cluster was giving proper results, which is what I expect from a
cluster DB.
I then restarted that node and I observed that the node was joining the cluster.
After an hour or so the old "defect" node was up and normal. I noticed that its
hard disk was loaded with much less data than its neighbours.
When I was querying the DB, the cluster was giving me different results for
successive identical queries. I guess the old "defect" node was giving me fewer
rows than it should have.
1) For what I understand, if you have a fixed node with no data it will 
automatically bootstrap and recover all its 

Re: Is Table created in all the nodes if the default consistency level used

2015-03-16 Thread Saladi Naidu
There are 3 different things we are talking about here:
1. SimpleStrategy vs. NetworkTopologyStrategy matters when you have a single DC
vs. multiple DCs.
2. In both cases you can specify a replication factor; obviously in the
SimpleStrategy case you don't mention a DC, whereas with NetworkTopologyStrategy
you can specify each DC's replication requirements separately.
3. Now, if your question refers to a single DC, then even if your system keyspace
is SimpleStrategy and your user keyspace is NetworkTopologyStrategy, it should
not matter and Table_test will be created on all nodes.
4. If your system_auth keyspace is set to fewer replicas than the number of
nodes, you will face AUTH issues (see the sketch below). Naidu Saladi
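(A minimal sketch of the system_auth adjustment mentioned in point 4; the DC name and replica count are hypothetical, and a repair of system_auth is needed afterwards for existing credentials to reach the new replicas.)

ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

-- then, on each node:
-- nodetool repair system_auth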

  From: 鄢来琼 laiqiong@gtafe.com
 To: user@cassandra.apache.org user@cassandra.apache.org 
 Sent: Monday, March 16, 2015 2:13 AM
 Subject: Re: Is Table created in all the nodes if the default consistency 
level used
   
Hi Daemeon,

Yes, I use the "NetworkTopologyStrategy" strategy for "Table_test", but the
system keyspace is a Cassandra-internal keyspace and its strategy is
LocalStrategy. So my question is how to guarantee that "Table_test" is created on
all the nodes before any R/W operations?

Thanks.

Peter

From: daemeon reiydelle [mailto:daeme...@gmail.com]
Sent: March 16, 2015 14:35
To: user@cassandra.apache.org
Subject: Re: Is Table created in all the nodes if the default consistency level used
   If you want to guarantee that the data is written to all nodes before the 
code returns, then yes you have to use consistency all. Otherwise there is a 
small risk of outdated data being served if a node goes offline longer than 
hints timeouts. Somewhat looser options that can assure multiple copies are 
written, as you probably know, are quorum or a hard coded value. This applies 
to a typical installation with a substantial number of nodes of course, not a 
small 2-3 node cluster. I am curious why localStrategy when you have such 
concerns about data consistency that you want to assure all nodes get data 
written. Can you elaborate on your use case? 
 
...
“Life should not be a journey to the grave with the intention of arriving 
safely in a
pretty and well preserved body, but rather to skid in broadside in a cloud of 
smoke,
thoroughly used up, totally worn out, and loudly proclaiming “Wow! What a Ride!”
- Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Sun, Mar 15, 2015 at 8:11 PM, 鄢来琼 laiqiong@gtafe.com wrote:

Could you tell me whether the metadata of the new table is built on all the nodes
after executing the following statement:

cassandra_session.execute_async(
    """CREATE TABLE Table_test (
        ID uuid,
        Time timestamp,
        Value double,
        Date timestamp,
        PRIMARY KEY ((ID, Date), Time)
    ) WITH COMPACT STORAGE;"""
)

As I know, the system keyspace is used to store the metadata, but its strategy is
LocalStrategy, which only stores the metadata of the local node. So I want to
know whether the table is created on all the nodes; should I add a
consistency_level setting to the above statement to make sure "CREATE TABLE" will
be executed on all the nodes?

Thanks.   Peter