Re: Read-repair working, repair not working?

2013-02-11 Thread Brian Fleming

Hi Aaron,
 
Many thanks for your reply - answers below.
 
Cheers,
 
Brian
 
 
 What CL are you using for reads and writes?
 I would first build a test case to ensure correct operation when using strong 
 consistency. i.e. QUORUM write and read. Because you are using RF 2 per DC I 
 assume you are not using LOCAL_QUORUM because that is 2 and you would not 
 have any redundancy in the DC.
CL.ONE : this is primarily for performance reasons but also because there are 
only three local nodes as you suggest and we need at least some resiliency.  
In the context of this issue, I considered increasing this to CL.LOCAL_QUORUM 
but the behaviour suggests that none of the 3 local nodes have the data (say I 
make 100 requests : all 100 initially fail and subsequently all 100 succeed), 
so I'm not sure it'll help?
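(As a quick sketch of how to test that from cassandra-cli — the column family name and key are placeholders:

  consistencylevel as LOCAL_QUORUM;
  get MyCF['some-key'];    # repeat the failing read at LOCAL_QUORUM and compare with CL.ONE

If the rows really exist on none of the 3 local replicas, the read should still miss at LOCAL_QUORUM, which would confirm the hypothesis.)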
 
 
 Dropped mutations in a multi DC setup may be a sign of network congestion or 
 overloaded nodes.
This DC is remote in terms of network topology - it's in Asia (Hong Kong) while 
the rest of the cluster is in Europe/North America, so network latency rather 
than congestion could be a cause?  However I see some pretty aggressive data 
transfer speeds during the initial repairs & the data footprint approximately 
matches the nodes elsewhere in the ring, so something doesn't add up?
 
Here are the tpstats for one of these nodes :
Pool Name               Active   Pending   Completed   Blocked   All time blocked
ReadStage                    0         0     4919185         0                  0
RequestResponseStage         0         0    16869994         0                  0
MutationStage                0         0    16764910         0                  0
ReadRepairStage              0         0        3703         0                  0
ReplicateOnWriteStage        0         0           0         0                  0
GossipStage                  0         0      845225         0                  0
AntiEntropyStage             0         0       52441         0                  0
MigrationStage               0         0        4362         0                  0
MemtablePostFlusher          0         0         952         0                  0
StreamStage                  0         0          24         0                  0
FlushWriter                  0         0         960         0                  5
MiscStage                    0         0        3592         0                  0
AntiEntropySessions          4         4         121         0                  0
InternalResponseStage        0         0           0         0                  0
HintedHandoff                1         2          55         0                  0
 
Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR             150597
BINARY                       0
READ                    781490
MUTATION                853846
REQUEST_RESPONSE             0
The numbers of READ_REPAIR, READ & MUTATION operations are non-negligible.  
The nodes in Europe/North America have effectively zero dropped messages.  This 
suggests network latency is probably a significant factor? 
[the network ping from Europe to a HK node is ~250ms, so I wouldn’t have 
expected it to be such a problem?]
 

 It would, but the INFO logging for the AES is pretty good. I would hold off 
 for now.
Ok.
 
 [AES session logging]
Yes, I see the expected start/end logs, so that's another thing off the list.
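(For reference, a quick way to pull those session boundaries out of the logs — the log path is an assumption, adjust for your install:

  grep 'repair #' /var/log/cassandra/system.log    # shows the "new session:" and completion lines per AES session
)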
 
 


On 10 Feb 2013, at 20:12, aaron morton aa...@thelastpickle.com wrote:

 I’d request data, nothing would be returned, I would then re-request the 
 data and it would correctly be returned:
 What CL are you using for reads and writes?
 
 I see a number of dropped ‘MUTATION’ operations : just under 5% of the total 
 ‘MutationStage’ count.
 Dropped mutations in a multi DC setup may be a sign of network congestion or 
 overloaded nodes. 
 
 
 -  Could anybody suggest anything specific to look at to see why the 
 repair operations aren’t having the desired effect?
 I would first build a test case to ensure correct operation when using strong 
 consistency. i.e. QUORUM write and read. Because you are using RF 2 per DC I 
 assume you are not using LOCAL_QUORUM because that is 2 and you would not 
 have any redundancy in the DC. 
 
 
 
 -  Would increasing logging level to ‘DEBUG’ show read-repair 
 activity (to confirm that this is happening, when  for what proportion of 
 total requests)?
 It would, but the INFO logging for the AES is pretty good. I would hold off 
 for now. 
 
 
 -  Is there something obvious that I could be missing here?
 When a new AES session starts it logs this
 
logger.info(String.format("[repair #%s] new session: will sync %s on range %s for %s.%s", getName(), repairedNodes(), range, tablename, Arrays.toString(cfnames)));
 
 When it completes it logs this
 
 

Re: Cassandra 1.1.2 -> 1.1.8 upgrade

2013-02-11 Thread Alain RODRIGUEZ
Not sure this will be useful for you, but nodetool drain hasn't worked
properly for a while. If you are using counters I recommend you
remove the commit logs after you have drained and stopped the node, before
restarting it, to avoid replaying counts.

https://issues.apache.org/jira/browse/CASSANDRA-4446
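A minimal sketch of that sequence (the commit log path and service commands are assumptions — adjust for your install):

  nodetool -h localhost drain          # flush memtables and stop accepting writes
  sudo service cassandra stop
  rm /var/lib/cassandra/commitlog/*    # so counter mutations are not replayed on startup
  sudo service cassandra start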

Hope this will help,

Alain


2013/2/11 Michal Michalski mich...@opera.com


  2) Upgrade one node at a time, running the cluster in a mixed
 1.1.2-1.1.9 configuration for a number of days.


 I'm about to upgrade my 1.1.0 cluster and
 http://www.datastax.com/docs/1.1/install/upgrading#info says:

 If you are upgrading to Cassandra 1.1.9 from a version earlier than
 1.1.7, all nodes must be upgraded before any streaming can take place.
 Until you upgrade all nodes, you cannot add version 1.1.7 nodes or later to
 a 1.1.7 or earlier cluster.

 Which one is correct then? Can I run a mixed 1.1.2 (in my case 1.1.0) & 
 1.1.9 cluster or not?

 M.



Re: High CPU usage during repair

2013-02-11 Thread aaron morton
 What machine size?
 m1.large 
If you are seeing high CPU move to an m1.xlarge, that's the sweet spot. 

 That's normally ok. How many are waiting?
 
 I have seen 4 this morning 
That's not really abnormal. 
The pending task count goes up when a file *may* be eligible for compaction, 
not when there is a compaction task waiting. 

If you suddenly create a number of new SSTables for a CF the pending count will 
rise, however one of the tasks may compact all the sstables waiting for 
compaction. So the count will suddenly drop as well. 

 Just to make sure I understand you correctly, you suggest that I change 
 throughput to 12 regardless of whether repair is ongoing or not. I will do it 
 using nodetool and change the yaml file in case a restart will occur in the 
 future? 
Yes. 
If you are seeing performance degrade during compaction or repair try reducing 
the throughput. 
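A sketch of both changes (the host is a placeholder):

  nodetool -h <host> setcompactionthroughput 12    # takes effect immediately, but is lost on restart

and in cassandra.yaml, so the value survives a restart:

  compaction_throughput_mb_per_sec: 12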

I would attribute most of the problems you have described to using m1.large. 

Cheers
 

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/02/2013, at 9:16 AM, Tamar Fraenkel ta...@tok-media.com wrote:

 Hi!
 Thanks for the response.
 See my answers and questions below.
 Thanks!
 Tamar
 
 Tamar Fraenkel 
 Senior Software Engineer, TOK Media 
 
 
 ta...@tok-media.com
 Tel:   +972 2 6409736 
 Mob:  +972 54 8356490 
 Fax:   +972 2 5612956 
 
 
 
 
 On Sun, Feb 10, 2013 at 10:04 PM, aaron morton aa...@thelastpickle.com 
 wrote:
 During repair I see high CPU consumption, 
 Repair reads the data and computes a hash, this is a CPU intensive operation.
 Is the CPU overloaded or just under load?
  Usually just load, but in the past two weeks I have seen CPU of over 90%!
 I run Cassandra version 1.0.11, on a 3 node setup on EC2 instances.
 
 What machine size?
 m1.large 
 
 there are compactions waiting.
 That's normally ok. How many are waiting?
 
 I have seen 4 this morning 
 I thought of adding a call to my repair script, before repair starts to do:
 nodetool setcompactionthroughput 0
 and then when repair finishes call
 nodetool setcompactionthroughput 16
 That will remove throttling on compaction and the validation compaction used 
 for the repair. Which may in turn add additional IO load, CPU load and GC 
 pressure. You probably do not want to do this. 
 
 Try reducing the compaction throughput to say 12 normally and see the effect.
 
 Just to make sure I understand you correctly, you suggest that I change 
 throughput to 12 regardless of whether repair is ongoing or not. I will do it 
 using nodetool and change the yaml file in case a restart will occur in the 
 future? 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 11/02/2013, at 1:01 AM, Tamar Fraenkel ta...@tok-media.com wrote:
 
 Hi!
 I run repair weekly, using a scheduled cron job.
 During repair I see high CPU consumption, and messages in the log file
 INFO [ScheduledTasks:1] 2013-02-10 11:48:06,396 GCInspector.java (line 122) 
 GC for ParNew: 208 ms for 1 collections, 1704786200 used; max is 3894411264
 From time to time, there are also messages of the form
 INFO [ScheduledTasks:1] 2012-12-04 13:34:52,406 MessagingService.java (line 
 607) 1 READ messages dropped in last 5000ms
 
 Using opscenter, jmx and nodetool compactionstats I can see that during the 
 time the CPU consumption is high, there are compactions waiting.
 
 I run Cassandra version 1.0.11, on a 3 node setup on EC2 instances.
 I have the default settings:
 compaction_throughput_mb_per_sec: 16
 in_memory_compaction_limit_in_mb: 64
 multithreaded_compaction: false
 compaction_preheat_key_cache: true
 
 I am thinking on the following solution, and wanted to ask if I am on the 
 right track:
 I thought of adding a call to my repair script, before repair starts to do:
 nodetool setcompactionthroughput 0
 and then when repair finishes call
 nodetool setcompactionthroughput 16
 
 Is this a right solution?
 Thanks,
 Tamar
 
 Tamar Fraenkel 
 Senior Software Engineer, TOK Media 
 
 
 
 ta...@tok-media.com
 Tel:   +972 2 6409736 
 Mob:  +972 54 8356490 
 Fax:   +972 2 5612956 
 
 
 
 



Re: CQL 3 compound row key error

2013-02-11 Thread aaron morton
That sounds like a bug, or something that is still under work. Sylvain has his 
finger on all things CQL. 

Can you raise a ticket on https://issues.apache.org/jira/browse/CASSANDRA

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/02/2013, at 4:01 PM, Shahryar Sedghi shsed...@gmail.com wrote:

 I am moving my application from 1.1 to 1.2.1 to utilize secondary indexes and 
 simplify the data model. In 1.1 I was concatenating some fields into one, 
 separated by :, for the row key and it was a big string. In V1.2 I use a 
 compound row key, shown in the following test case (interval and seq):
 
 
 CREATE TABLE  test(
 interval text,
 seq int, 
 id int,
 severity int,
 PRIMARY KEY ((interval, seq), id)) 
 WITH CLUSTERING ORDER BY (id DESC);
 --
 CREATE INDEX ON test(severity);
 
 
  select * from test where severity = 3 and  interval = 't' and seq =1;
 
 results:
 
 Bad Request: Start key sorts after end key. This is not allowed; you probably 
 should not specify end key at all under random partitioner
 
 If I define the table as this:
 
 CREATE TABLE  test(
 interval text,
 id int,
 severity int,
 PRIMARY KEY (interval, id))
 WITH CLUSTERING ORDER BY (id DESC);
 
  select * from test where severity = 3 and  interval = 't1';
 
 Works fine. Is it a bug?
 
 Thanks in Advance
 
 Shahryar
 
 
 
 -- 
 Life is what happens while you are making other plans. ~ John Lennon



Re: High CPU usage during repair

2013-02-11 Thread Tamar Fraenkel
Thank you very much! Due to monetary limitations I will keep the m1.large
for now, but try the throughput modification.
Tamar

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956




On Mon, Feb 11, 2013 at 11:30 AM, aaron morton aa...@thelastpickle.com wrote:

  What machine size?

 m1.large

 If you are seeing high CPU move to an m1.xlarge, that's the sweet spot.

 That's normally ok. How many are waiting?

 I have seen 4 this morning

 That's not really abnormal.
 The pending task count goes up when a file *may* be eligible for
 compaction, not when there is a compaction task waiting.

 If you suddenly create a number of new SSTables for a CF the pending count
 will rise, however one of the tasks may compact all the sstables waiting
 for compaction. So the count will suddenly drop as well.

 Just to make sure I understand you correctly, you suggest that I change
 throughput to 12 regardless of whether repair is ongoing or not. I will do
 it using nodetool and change the yaml file in case a restart will occur in
 the future?

 Yes.
 If you are seeing performance degrade during compaction or repair try
 reducing the throughput.

 I would attribute most of the problems you have described to using
 m1.large.

 Cheers


-
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 11/02/2013, at 9:16 AM, Tamar Fraenkel ta...@tok-media.com wrote:

 Hi!
 Thanks for the response.
 See my answers and questions below.
 Thanks!
 Tamar

  *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956




 On Sun, Feb 10, 2013 at 10:04 PM, aaron morton aa...@thelastpickle.com wrote:

 During repair I see high CPU consumption,

 Repair reads the data and computes a hash, this is a CPU intensive
 operation.
 Is the CPU overloaded or just under load?

  Usually just load, but in the past two weeks I have seen CPU of over 90%!

 I run Cassandra version 1.0.11, on a 3 node setup on EC2 instances.

 What machine size?

 m1.large


 there are compactions waiting.

 That's normally ok. How many are waiting?

 I have seen 4 this morning

 I thought of adding a call to my repair script, before repair starts to
 do:
 nodetool setcompactionthroughput 0
 and then when repair finishes call
 nodetool setcompactionthroughput 16

 That will remove throttling on compaction and the validation compaction
 used for the repair. Which may in turn add additional IO load, CPU load and
 GC pressure. You probably do not want to do this.

 Try reducing the compaction throughput to say 12 normally and see the
 effect.

 Just to make sure I understand you correctly, you suggest that I change
 throughput to 12 regardless of whether repair is ongoing or not. I will do
 it using nodetool and change the yaml file in case a restart will occur in
 the future?

 Cheers


-
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 11/02/2013, at 1:01 AM, Tamar Fraenkel ta...@tok-media.com wrote:

 Hi!
 I run repair weekly, using a scheduled cron job.
 During repair I see high CPU consumption, and messages in the log file
 INFO [ScheduledTasks:1] 2013-02-10 11:48:06,396 GCInspector.java (line
 122) GC for ParNew: 208 ms for 1 collections, 1704786200 used; max is
 3894411264
 From time to time, there are also messages of the form
 INFO [ScheduledTasks:1] 2012-12-04 13:34:52,406 MessagingService.java
 (line 607) 1 READ messages dropped in last 5000ms

 Using opscenter, jmx and nodetool compactionstats I can see that during
 the time the CPU consumption is high, there are compactions waiting.

 I run Cassandra version 1.0.11, on a 3 node setup on EC2 instances.
 I have the default settings:
 compaction_throughput_mb_per_sec: 16
 in_memory_compaction_limit_in_mb: 64
 multithreaded_compaction: false
 compaction_preheat_key_cache: true

 I am thinking on the following solution, and wanted to ask if I am on the
 right track:
 I thought of adding a call to my repair script, before repair starts to
 do:
 nodetool setcompactionthroughput 0
 and then when repair finishes call
 nodetool setcompactionthroughput 16

 Is this a right solution?
 Thanks,
 Tamar

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media



 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956







Re: Read-repair working, repair not working?

2013-02-11 Thread aaron morton
 CL.ONE : this is primarily for performance reasons …
This makes reasoning about correct behaviour a little harder. 

If there is any way you can run some tests with R + W > N strong consistency I 
would encourage you to do so. You will then have a baseline of what works.

  (say I make 100 requests : all 100 initially fail and subsequently all 100 
 succeed), so I'm not sure it'll help?
The high number of inconsistencies seems to match with the massive number of 
dropped Mutation messages. Even if Anti Entropy is running, if the node in HK 
is dropping so many messages there will be inconsistencies. 
  

It looks like the HK node is overloaded. I would check the logs for GC 
messages, check for VM steal in a virtualised env, check for sufficient CPU + 
memory resources, check for IO stress. 
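A quick triage pass for those checks might look like this (a sketch; the log path is an assumption):

  grep GCInspector /var/log/cassandra/system.log | tail    # long or frequent GC pauses
  vmstat 5       # a non-zero 'st' column means the hypervisor is stealing CPU
  iostat -x 5    # high await / %util points to IO stress
  nodetool tpstats    # growing Pending counts mean the stages cannot keep up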

 20 node cluster running v1.0.7 split between 5 data centres,
 I’ve had some intermittent issues with a new data centre (3 nodes, RF=2) I
Do all DC's have the same number of nodes ?

Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/02/2013, at 9:13 PM, Brian Fleming bigbrianflem...@gmail.com wrote:

 
 Hi Aaron,
  
 Many thanks for your reply - answers below.
  
 Cheers,
  
 Brian
  
  
  What CL are you using for reads and writes?
  I would first build a test case to ensure correct operation when using 
  strong consistency. i.e. QUORUM write and read. Because you are using RF 2 
  per DC I assume you are not using LOCAL_QUORUM because that is 2 and you 
  would not have any redundancy in the DC.
 CL.ONE : this is primarily for performance reasons but also because there are 
 only three local nodes as you suggest and we need at least some resiliency.  
 In the context of this issue, I considered increasing this to CL.LOCAL_QUORUM 
 but the behaviour suggests that none of the 3 local nodes have the data (say 
 I make 100 requests : all 100 initially fail and subsequently all 100 
 succeed), so I'm not sure it'll help?
  
  
  Dropped mutations in a multi DC setup may be a sign of network congestion 
  or overloaded nodes.
 This DC is remote in terms of network topology - it's in Asia (Hong Kong) 
 while the rest of the cluster is in Europe/North America, so network latency 
 rather than congestion could be a cause?  However I see some pretty 
 aggressive data transfer speeds during the initial repairs & the data 
 footprint approximately matches the nodes elsewhere in the ring, so something 
 doesn't add up?
  
 Here are the tpstats for one of these nodes :
 Pool Name               Active   Pending   Completed   Blocked   All time blocked
 ReadStage                    0         0     4919185         0                  0
 RequestResponseStage         0         0    16869994         0                  0
 MutationStage                0         0    16764910         0                  0
 ReadRepairStage              0         0        3703         0                  0
 ReplicateOnWriteStage        0         0           0         0                  0
 GossipStage                  0         0      845225         0                  0
 AntiEntropyStage             0         0       52441         0                  0
 MigrationStage               0         0        4362         0                  0
 MemtablePostFlusher          0         0         952         0                  0
 StreamStage                  0         0          24         0                  0
 FlushWriter                  0         0         960         0                  5
 MiscStage                    0         0        3592         0                  0
 AntiEntropySessions          4         4         121         0                  0
 InternalResponseStage        0         0           0         0                  0
 HintedHandoff                1         2          55         0                  0
  
 Message type           Dropped
 RANGE_SLICE                  0
 READ_REPAIR             150597
 BINARY                       0
 READ                    781490
 MUTATION                853846
 REQUEST_RESPONSE             0
 The numbers of READ_REPAIR, READ & MUTATION operations are non-negligible.  
 The nodes in Europe/North America have effectively zero dropped messages.  
 This suggests network latency is probably a significant factor? 
 [the network ping from Europe to a HK node is ~250ms, so I wouldn’t have 
 expected it to be such a problem?]
  
 
  It would, but the INFO logging for the AES is pretty good. I would hold off 
  for now.
 Ok.
  
  [AES session logging]
 Yes, I see the expected start/end logs, so that's another thing off the list.
  
  
 
 
 On 10 Feb 2013, at 20:12, aaron morton aa...@thelastpickle.com wrote:
 
 I’d request data, nothing would be returned, I would then re-request the 
 data and it would correctly be returned:
 
 

Re: Cassandra 1.1.2 -> 1.1.8 upgrade

2013-02-11 Thread aaron morton

You can always run them. But in some situations repair cannot be used, and in 
this case new nodes cannot be added. The news.txt file is your friend there. 

As a general rule when upgrading a cluster I move one node to the new version 
and let it soak in for an hour or so. Just to catch any crazy. I then upgrade 
all the nodes and run through upgradesstables. You can stagger upgradesstables 
to run on every RF'th node in the cluster to reduce the impact.
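As a sketch of that staggered run (host names are placeholders; with RF=3 each batch touches only one replica per range):

  for h in node1 node4 node7; do nodetool -h $h upgradesstables; done
  for h in node2 node5 node8; do nodetool -h $h upgradesstables; done
  for h in node3 node6 node9; do nodetool -h $h upgradesstables; done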

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/02/2013, at 8:05 PM, Michal Michalski mich...@opera.com wrote:

 
 2) Upgrade one node at a time, running the cluster in a mixed
 1.1.2-1.1.9 configuration for a number of days.
 
 I'm about to upgrade my 1.1.0 cluster and
 http://www.datastax.com/docs/1.1/install/upgrading#info says:
 
 If you are upgrading to Cassandra 1.1.9 from a version earlier than 1.1.7, 
 all nodes must be upgraded before any streaming can take place. Until you 
 upgrade all nodes, you cannot add version 1.1.7 nodes or later to a 1.1.7 or 
 earlier cluster.
 
 Which one is correct then? Can I run a mixed 1.1.2 (in my case 1.1.0) & 1.1.9 
 cluster or not?
 
 M.



Re: Cassandra 1.1.2 -> 1.1.8 upgrade

2013-02-11 Thread Michal Michalski
OK, thanks Aaron. I ask because NEWS.txt is not a big help in the case of 
< 1.1.5 versions, because there's no info on them in it (especially on 
1.1.7, which seems to be the most important one in this case, according 
to the DataStax upgrade instructions) ;-)


https://github.com/apache/cassandra/blob/trunk/NEWS.txt

M.


On 11.02.2013 11:05, aaron morton wrote:

You can always run them. But in some situations repair cannot be used, and in 
this case new nodes cannot be added. The news.txt file is your friend there.

As a general rule when upgrading a cluster I move one node to the new version 
and let it soak in for an hour or so. Just to catch any crazy. I then upgrade 
all the nodes and run through upgradesstables. You can stagger upgradesstables 
to run on every RF'th node in the cluster to reduce the impact.





Re: CQL 3 compound row key error

2013-02-11 Thread Shahryar Sedghi
Thanks Aaron. Opened CASSANDRA-5240:
https://issues.apache.org/jira/browse/CASSANDRA-5240


On Mon, Feb 11, 2013 at 4:34 AM, aaron morton aa...@thelastpickle.com wrote:

 That sounds like a bug, or something that is still under work. Sylvain has
 his finger on all things CQL.

 Can you raise a ticket on https://issues.apache.org/jira/browse/CASSANDRA

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 11/02/2013, at 4:01 PM, Shahryar Sedghi shsed...@gmail.com wrote:

 I am moving my application from 1.1 to 1.2.1 to utilize secondary indexes
 and simplify the data model. In 1.1 I was concatenating some fields into
 one, separated by :, for the row key and it was a big string. In V1.2 I use a
 compound row key, shown in the following test case (interval and seq):


 CREATE TABLE  test(
 interval text,
 seq int,
 id int,
 severity int,
 PRIMARY KEY ((interval, seq), id))
 WITH CLUSTERING ORDER BY (id DESC);
 --
 CREATE INDEX ON test(severity);


  select * from test where severity = 3 and  interval = 't' and seq =1;

 results:

 Bad Request: Start key sorts after end key. This is not allowed; you
 probably should not specify end key at all under random partitioner

 If I define the table as this:

 CREATE TABLE  test(
 interval text,
 id int,
 severity int,
 PRIMARY KEY (interval, id))
 WITH CLUSTERING ORDER BY (id DESC);

  select * from test where severity = 3 and  interval = 't1';

 Works fine. Is it a bug?

 Thanks in Advance

 Shahryar



 --
 Life is what happens while you are making other plans. ~ John Lennon





-- 
Life is what happens while you are making other plans. ~ John Lennon


RuntimeException during leveled compaction

2013-02-11 Thread Andre Sprenger
Hi,

I'm running a 6 node Cassandra 1.1.5 cluster on EC2. We switched to
leveled compaction a couple of weeks ago;
this has been successful. Some days ago 3 of the nodes started to log the
following exception during compaction of
a particular column family:

ERROR [CompactionExecutor:726] 2013-02-11 13:02:26,582
AbstractCassandraDaemon.java (line 135) Exception in thread
Thread[CompactionExecutor:726,1,main]
java.lang.RuntimeException: Last written key
DecoratedKey(84590743047470232854915142878708713938,
3133353533383530323237303130313030303232313537303030303132393832)
>= current key DecoratedKey(28357704665244162161305918843747894551,
31333430313336313830333831303130313030303230313632303030303036363338)
writing into
/var/cassandra/data/AdServer/EventHistory/Adserver-EventHistory-tmp-he-68638-Data.db
at
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
at
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
at
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
at
org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
at
org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Compaction does not happen any more for the column family and read
performance gets worse because of the growing
number of data files accessed during reads. Looks like one or more of the
data files are corrupt and have keys
that are stored out of order.
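(If the problem really is out-of-order keys in an sstable, one avenue — an assumption on my part, not verified against this data, so take a snapshot first and try it on one node — is to rewrite the affected sstables with scrub:

  nodetool -h <host> snapshot AdServer
  nodetool -h <host> scrub AdServer EventHistory
)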

Any help to resolve this situation would be greatly appreciated.

Thanks
Andre


Re: Cassandra jmx stats ReadCount

2013-02-11 Thread aaron morton
Are you using counters? They require a read before write.

Also secondary index CF's require a read before write. 

Cheers
 
-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/02/2013, at 1:26 PM, Daning Wang dan...@netseer.com wrote:

 We have an 8 node cluster on Cassandra 1.1.0, with replication factor 3. We 
 found that when you just insert data, not only does WriteCount increase, the 
 ReadCount also increases.
 
 How could this happen? I am under the impression that readCount only counts 
 the reads from client.
 
 Thanks,
 
 Daning
 
 



RE: unbalanced ring

2013-02-11 Thread Stephen.M.Thompson
Aaron, thanks for your feedback.

.125
num_tokens: 256
# initial_token:

.126
num_tokens: 256
#initial_token:

.127
num_tokens: 256
# initial_token:

This all looks correct.  So when you say to do this with a clean setup, what 
are you asking me to do?  Is it enough to blow away /var/lib/cassandra and 
reload the data?  Also destroy my Cassandra install (which is just un-tar) and 
reinstall from nothing?

Stephen Thompson
Wells Fargo Corporation
Internet Authentication & Fraud Prevention
704.427.3137 (W) | 704.807.3431 (C)

This message may contain confidential and/or privileged information, and is 
intended for the use of the addressee only. If you are not the addressee or 
authorized to receive this for the addressee, you must not use, copy, disclose, 
or take any action based on this message or any information herein. If you have 
received this message in error, please advise the sender immediately by reply 
e-mail and delete this message. Thank you for your cooperation.

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, February 11, 2013 12:51 PM
To: user@cassandra.apache.org
Subject: Re: unbalanced ring

The tokens are not right, not right at all. Some are too short and some are too 
tall.

More technically they do not appear to be randomly arranged. The tokens for the 
.125 node all start with -3, the 126 node only has negative tokens and the 127 
node mostly has positive tokens.

Check that on each node the initial_token yaml setting is commented out, and 
that num_tokens is set to 256.

If you can reproduce this fault with a clean setup please raise a ticket at 
https://issues.apache.org/jira/browse/CASSANDRA

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/02/2013, at 10:36 AM, stephen.m.thomp...@wellsfargo.com wrote:


I found when I tried to do queries after sending this that although it shows a 
ton of data, it would no longer return ANYTHING for any query ... always 0 
rows.  So something was severely hosed.  I blew away the data and reloaded from 
database ... the data set is a little smaller than before.  It shows up 
somewhat more balanced, although I'm still curious why the third node is so 
much smaller than the first two.

[root@Config3482VM1 apache-cassandra-1.2.1]# bin/nodetool status
Datacenter: 28
==
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.28.205.125  994.89 MB  255     33.7%             3daab184-61f0-49a0-b076-863f10bc8c6c  205
UN  10.28.205.126  966.17 MB  256     99.9%             55bbd4b1-8036-4e32-b975-c073a7f0f47f  205
UN  10.28.205.127  699.79 MB  257     66.4%             d240c91f-4901-40ad-bd66-d374a0ccf0b9  205
[root@Config3482VM1 apache-cassandra-1.2.1]#

And yes, that is the entire content of the output from the status call, 
unedited.   I have attached the output from nodetool ring.  To answer a couple 
of the questions from below from Eric:

* One data center (28)?  One rack (205)? Three nodes?
Yes, that's right.  We're just doing a proof of concept at the 
moment so this is three VMWare servers.

* How many keyspaces, and what are the replication strategies?
There is one keyspace, and it has only one CF at this point.

[default@KEYSPACE_NAME] describe;
Keyspace: KEYSPACE_NAME:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [28:2]

* TL;DR  What Aaron Said(tm)  In the absence of rack/dc aware replication, your 
allocation is suspicious.

I'm not sure what you mean by this.

Steve

-Original Message-
From: Eric Evans [mailto:eev...@acunu.com]
Sent: Thursday, February 07, 2013 9:56 AM
To: user@cassandra.apache.org
Subject: Re: unbalanced ring

On Wed, Feb 6, 2013 at 2:02 PM, stephen.m.thomp...@wellsfargo.com 
wrote:
 Thanks Aaron.  I ran the cassandra-shuffle job and did a rebuild and
 compact on each of the nodes.



 [root@Config3482VM1 apache-cassandra-1.2.1]# bin/nodetool status

 Datacenter: 28

 ==

 Status=Up/Down

 |/ State=Normal/Leaving/Joining/Moving

 --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  10.28.205.125  1.7 GB     255     33.7%             3daab184-61f0-49a0-b076-863f10bc8c6c  205
 UN  10.28.205.126  591.44 MB  256     99.9%             55bbd4b1-8036-4e32-b975-c073a7f0f47f  205
 UN  10.28.205.127  112.28 MB  257     66.4%             d240c91f-4901-40ad-bd66-d374a0ccf0b9  205

Sorry, I have to ask: is this the complete output?  Have you perhaps sanitized 
it in some way?

It seems like there is some piece of missing context here.  Can you tell us:

* Is this a cluster that was upgraded to virtual nodes (that would include a 
1.2.x cluster initialized with 

Time complexity of cassandra operations

2013-02-11 Thread Tim Wintle
Hi,

I've tried searching for this all over the place, but I can't find an
answer anywhere...

What is the (theoretical) time complexity of basic C* operations?

I assume that single lookups are O(log(R/N)) for R rows across N nodes
(as SST lookups should be O(log(n)) and there are R/N rows per node).

Writes with consistency 1 should be O(1) as they're just appended..

But what about iteration through rows? I don't think I know enough about
how iteration is implemented to guess what the complexity is here.

Tim



what addresses to use in EC2 cluster (whenever an instance restarts it gets a new private ip)?

2013-02-11 Thread Brian Tarbox
How do I configure my cluster to run in EC2?  In my cassandra.yaml I have
IP addresses under seed_provider, listen_address and rpc_address.

I tried setting up my cluster using just the EC2 private addresses but when
one of my instances failed and I restarted it there was a new private
address.  Suddenly my cluster thought it had five nodes rather than four.

Then I tried using Elastic IP addresses (permanent addresses) but it turns
out you get charged for network traffic between elastic addresses even if
they are within the cluster.

So...how do you configure the cluster when the IP addresses can change out
from under you?

Thanks.

Brian Tarbox


Re: Directory structure after upgrading 1.0.8 to 1.2.1

2013-02-11 Thread aaron morton
I think it's a little more subtle than that: 
https://issues.apache.org/jira/browse/CASSANDRA-5242

Cheers


-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/02/2013, at 10:21 PM, Desimpel, Ignace ignace.desim...@nuance.com 
wrote:

 Yes, they are new directories. I did some debugging …
 The Cassandra code is org.apache.cassandra.db.Directories::migrateFile.
 It is detecting that it is a manifest (based on the .json extension).
 But then it does not take into account that something like 
 MyColumnFamily-old.json can exist. It then uses MyColumnFamily-old as 
 a directory name in a call to the function destDir = getOrCreate(ksDir, 
 dirname, additionalPath), while it should be MyColumnFamily.
  
 So I guess that the cfname computation should be adapted to include the 
 “-old.json” manifest files.
  
 Ignace
  
 From: aaron morton [mailto:aa...@thelastpickle.com] 
 Sent: vrijdag 8 februari 2013 03:09
 To: user@cassandra.apache.org
 Subject: Re: Directory structure after upgrading 1.0.8 to 1.2.1
  
 the -old.json is an artefact of Levelled Compaction. 
  
 You should see a non -old file in the current CF folder. 
  
 I'm not sure what would have created the -old CF dir. Does the timestamp 
 indicate it was created at the time the server first started as a 1.2 node?
  
 Cheers
  
  
 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand
  
 @aaronmorton
 http://www.thelastpickle.com
  
 On 7/02/2013, at 10:39 PM, Desimpel, Ignace ignace.desim...@nuance.com 
 wrote:
 
 
 After upgrading from 1.0.8 I see that now the directory structure has changed 
 and has a structure like keyspace/columnfamily (part of the 1.1.x 
 migration).
 But I also see that directories appear like keyspace/columnfamily-old, and 
 the content of that ‘old’ directory is only one file, columnfamily-old.json.
  
 Questions :
 Should this xxx-old.json file be in the other directory?
 Should the extra directory xxx-old not be created?
 Or was that intentionally done and is it allowed to remove these directories 
 ( manually … )?
  
 Thanks



Cassandra benchmark

2013-02-11 Thread Kanwar Sangha
Hi - I am trying to do benchmark using the Cassandra-stress tool. They have 
given an example to insert data across 2 nodes -


/tools/stress/bin/stress -d 192.168.1.101,192.168.1.102 -n 1000
But when I run this across my 2 node cluster, I see the same keys in both 
nodes. Replication is not enabled. Should it not have unique keys in both nodes 
?

Thanks,
Kanwar





Re: what addresses to use in EC2 cluster (whenever an instance restarts it gets a new private ip)?

2013-02-11 Thread Andrey Ilinykh
You have to use private IPs, but if an instance dies you have to bootstrap
it with the replace_token flag. If you use EC2 I'd recommend Netflix's Priam
tool. It manages all that stuff, plus you get S3 backups.
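A sketch of the replace flag (the token value is a placeholder; set it to the dead node's token before starting the replacement):

  # e.g. appended in cassandra-env.sh:
  JVM_OPTS="$JVM_OPTS -Dcassandra.replace_token=<token_of_dead_node>"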


Andrey


On Mon, Feb 11, 2013 at 11:35 AM, Brian Tarbox tar...@cabotresearch.com wrote:

 How do I configure my cluster to run in EC2?  In my cassandra.yaml I have
 IP addresses under seed_provider, listen_address and rpc_address.

 I tried setting up my cluster using just the EC2 private addresses but
 when one of my instances failed and I restarted it there was a new private
 address.  Suddenly my cluster thought it had five nodes rather than four.

 Then I tried using Elastic IP addresses (permanent addresses) but it turns
 out you get charged for network traffic between elastic addresses even if
 they are within the cluster.

 So...how do you configure the cluster when the IP addresses can change out
 from under you?

 Thanks.

 Brian Tarbox



Re: Cassandra libraries for Golang

2013-02-11 Thread Ben Hood
Hi Boris,

I use this one with Cassandra 1.2+ (you'll need to turn the native port on):

https://github.com/titanous/gocql 
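(For reference, a sketch of the cassandra.yaml settings involved — option names as in the 1.2 yaml, verify against yours:

  start_native_transport: true    # off by default in 1.2
  native_transport_port: 9042
)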

HTH,

Ben


On Friday, 8 February 2013 at 16:40, Boris Solovyov wrote:

 Hi,
 
 I'm developing Go application. I see there is gossie, which doesn't support 
 the native binary protocol, and there is gocql, which I tried. I wasn't able 
 to connect to my Cassandra server. I got EOF after a timeout. I didn't 
 investigate it much further. I wanted to ask the list, what is status of 
 Cassandra libraries for Go? Is anyone using one of them successfully in 
 production? Does it really matter whether I use the new native protocol? 
 
 - Boris 



Re: Cassandra 1.1.2 -> 1.1.8 upgrade

2013-02-11 Thread Mike
So upgradesstables is recommended as part of the upgrade to 1.1.3 
if you are using counter columns?


Also, there was a general recommendation (in another response to my 
question) to run upgradesstables because of:


upgradesstables always needs to be done between majors. While 1.1.2 -> 
1.1.8 is not a major, due to an unforeseen bug in the conversion to 
microseconds you'll need to run upgradesstables.


Is this referring to: https://issues.apache.org/jira/browse/CASSANDRA-4432

Does anyone know the impact of not running upgradesstables? Or of possibly 
not running it for several days?


Thanks,
-Mike

On 2/10/2013 3:27 PM, aaron morton wrote:

I would do #1.

You can play with nodetool setcompactionthroughput to speed things up, 
but beware nothing comes for free.


Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/02/2013, at 6:40 AM, Mike mthero...@yahoo.com wrote:



Thank you,

Another question on this topic.

Upgrading from 1.1.2 -> 1.1.9 requires running upgradesstables, which 
will take many hours on our dataset (about 12).  For this upgrade, is 
it recommended that I:


1) Upgrade all the DB nodes to 1.1.9 first, then go around the ring 
and run a staggered upgrade of the sstables over a number of days.
2) Upgrade one node at a time, running the cluster in a mixed 
1.1.2-1.1.9 configuration for a number of days.


I would prefer #1, as with #2, streaming will not work until all the 
nodes are upgraded.


I appreciate your thoughts,
-Mike

On 1/16/2013 11:08 AM, Jason Wee wrote:
always check NEWS.txt. For instance, for cassandra 1.1.3 you need to 
run nodetool upgradesstables if your cf has counters.



On Wed, Jan 16, 2013 at 11:58 PM, Mike mthero...@yahoo.com wrote:


Hello,

We are looking to upgrade our Cassandra cluster from 1.1.2 ->
1.1.8 (or possibly 1.1.9 depending on timing).  It is my
understanding that rolling upgrades of Cassandra are supported,
so as we upgrade our cluster, we can do so one node at a time
without experiencing downtime.

Has anyone had any gotchas recently that I should be aware of
before performing this upgrade?

In order to upgrade, are the JAR files the only thing that needs
to change?  Can everything remain as-is?

Thanks,
-Mike










Re: Upgrade to Cassandra 1.2

2013-02-11 Thread Daning Wang
Thanks Aaron.

I tried to migrate an existing cluster (ver 1.1.0) to 1.2.1 but failed.

- I followed http://www.datastax.com/docs/1.2/install/upgrading, and have
merged cassandra.yaml with the following parameters:

num_tokens: 256
#initial_token: 0

the initial_token is commented out; the current token should be obtained from
the system schema

- I did a rolling upgrade; during the upgrade, I got Broken Pipe errors from
the nodes with the old version, is that normal?

- After I upgraded 3 nodes (still have 5 to go), I found it is totally wrong:
the first node upgraded owns 99.2% of the ring

[cassy@d5:/usr/local/cassy conf]$  ~/bin/nodetool -h localhost status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
DN  10.210.101.117  45.01 GB  254     99.2%  f4b6afe3-7e2e-4c61-96e8-12a529a31373  rack1
UN  10.210.101.120  45.43 GB  256      0.4%  0fd912fb-3187-462b-8c8a-7d223751b649  rack1
UN  10.210.101.111  27.08 GB  256      0.4%  bd4c37bc-07dd-488b-bfab-e74e32c26f6e  rack1


What was wrong? please help. I could provide more information if you need.

Thanks,

Daning



On Mon, Feb 4, 2013 at 9:16 AM, aaron morton aa...@thelastpickle.com wrote:

 There is a command line utility in 1.2 to shuffle the tokens…

 http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

 $ ./cassandra-shuffle --help
 Missing sub-command argument.
 Usage: shuffle [options] sub-command

 Sub-commands:
  create   Initialize a new shuffle operation
  ls   List pending relocations
  clearClear pending relocations
  en[able] Enable shuffling
  dis[able]Disable shuffling

 Options:
  -dc,  --only-dc   Apply only to named DC (create only)
  -tp,  --thrift-port   Thrift port number (Default: 9160)
  -p,   --port  JMX port number (Default: 7199)
  -tf,  --thrift-framed Enable framed transport for Thrift (Default:
 false)
  -en,  --and-enableImmediately enable shuffling (create only)
  -H,   --help  Print help information
  -h,   --host  JMX hostname or IP address (Default: localhost)
  -th,  --thrift-host   Thrift hostname or IP address (Default: JMX
 host)

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 3/02/2013, at 11:32 PM, Manu Zhang owenzhang1...@gmail.com wrote:

 On Sun 03 Feb 2013 05:45:56 AM CST, Daning Wang wrote:

 I'd like to upgrade from 1.1.6 to 1.2.1; one big feature in 1.2 is
 that it can have multiple tokens on one node, but there is only one
 token per node in 1.1.6.

 how can I upgrade to 1.2.1 and then break up the token to take advantage
 of this feature? I went through this doc but it does not say how to
 change num_tokens

 http://www.datastax.com/docs/1.2/install/upgrading

 Is there other doc about this upgrade path?

 Thanks,

 Daning


 I think for each node you need to change the num_tokens option in
 conf/cassandra.yaml (this only splits the current range into num_tokens
 parts) and run the bin/cassandra-shuffle command (this spreads it all over
 the ring).
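(A sketch of that sequence, using the sub-commands from the help output quoted above:

  # in conf/cassandra.yaml on every node, before restarting on 1.2:
  num_tokens: 256
  # then, once the whole cluster is on 1.2:
  ./cassandra-shuffle create
  ./cassandra-shuffle enable
)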





Cassandra 1.2 Atomic Batches and Thrift API

2013-02-11 Thread Drew Kutcharian
Hey Guys,

Is the new atomic batch feature in Cassandra 1.2 available via the thrift API? 
If so, how can I use it?

-- Drew



Spike in latency, one node keeps firing Interval min max errors

2013-02-11 Thread Drew Broadley
Hi there,

I have a cluster of three nodes running Cassandra 1.2.0

I received alerts from my monitoring, and then discovered this huge spike
in cluster latency:
https://dl.dropbox.com/u/3444322/Screen%20Shot%202013-02-12%20at%205.07.49%20PM.png

Investigating what is going on, there is no load on any node, iostat shows
nothing more than idle operations, and I've restarted all nodes.

In the system.log I keep noticing this on ONE node only:
== /var/log/cassandra/system.log ==
ERROR [ReadStage:563] 2013-02-12 04:31:30,013 CassandraDaemon.java (line
133) Exception in thread Thread[ReadStage:563,5,main]
java.lang.AssertionError: Interval min > max
 at
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:250)
at org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
 at org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
 at
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
at
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
 at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
at
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
 at
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
at
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
 at
org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
 at
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
at
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
 at
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:286)
at
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:61)
 at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1362)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1222)
 at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1134)
at org.apache.cassandra.db.Table.getRow(Table.java:348)
 at
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
at
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1048)
 at
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1506)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)

-- 

Cheers,
Drew Broadley

*Broadley Speaking :)*

e: d...@broadley.org.nz
p: +64 (0)21 519 711
m:P O Box 488, Wellington, New Zealand
w:http://blog.drew.broadley.org.nz/
ln:http://nz.linkedin.com/in/drewbroadley


Re: Estimating write throughput with LeveledCompactionStrategy

2013-02-11 Thread Ивaн Cобoлeв
Yup, we set it to 100M. Currently we have around 1TB of data per
node (getting to level 5 now) + data pieces are rather large (small
tables would flush more often).

Yes, you're right, it's slower, thus building mental models is more
time-effective than experimenting :)

Ivan

2013/2/6 Wei Zhu wz1...@yahoo.com:
 I have been struggling with the LCS myself. I observed that for the higher
 level compaction,(from level 4 to 5) it involves much more SSTables than
 compacting from lower level. One compaction could take an hour or more. By
 the way, you set the your SSTable size to be 100M?

 Thanks.
 -Wei

 
 From: Ивaн Cобoлeв sobol...@gmail.com
 To: user@cassandra.apache.org
 Sent: Wednesday, February 6, 2013 2:42 AM
 Subject: Estimating write throughput with LeveledCompactionStrategy

 Dear Community,

 Could anyone please give me a hand with understanding what am I
 missing while trying to model how LeveledCompactionStrategy works:
 https://docs.google.com/spreadsheet/ccc?key=0AvNacZ0w52BydDQ3N2ZPSks2OHR1dlFmMVV4d1E2eEE#gid=0

 Logs mostly contain something like this:
 INFO [CompactionExecutor:2235] 2013-02-06 02:32:29,758
 CompactionTask.java (line 221) Compacted to
 [chunks-hf-285962-Data.db,chunks-hf-285963-Data.db,chunks-hf-285964-Data.db,chunks-hf-285965-Data.db,chunks-hf-285966-Data.db,chunks-hf-285967-Data.db,chunks-hf-285968-Data.db,chunks-hf-285969-Data.db,chunks-hf-285970-Data.db,chunks-hf-285971-Data.db,chunks-hf-285972-Data.db,chunks-hf-285973-Data.db,chunks-hf-285974-Data.db,chunks-hf-285975-Data.db,chunks-hf-285976-Data.db,chunks-hf-285977-Data.db,chunks-hf-285978-Data.db,chunks-hf-285979-Data.db,chunks-hf-285980-Data.db,].
 2,255,863,073 to 1,908,460,931 (~84% of original) bytes for 36,868
 keys at 14.965795MB/s.  Time: 121,614ms.

 Thus the spreadsheet is parameterized with a throughput of 15MB/s and a
 survivor ratio of 0.9.
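 (As a sanity check on that figure, assuming the reported rate is computed on
 the output size: 1,908,460,931 bytes / 121,614 ms ≈ 15.69 MB/s ≈ 14.97 MiB/s,
 which matches the 14.965795MB/s in the log line above.)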

 1) Projected result actually differs from what I observe - what am I
 missing?
 2) Are there any metrics on write throughput with LCS per node anyone
 could possibly share?

 Thank you very much in advance,
 Ivan