Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-22 Thread Brice Dutheil
Reads are mostly limited by IO, so I'd set concurrent_read to something
related to your disks; we have set it to 64 (though we have SSDs).
Writes are mostly limited by CPU, so the number of cores matters; we set
concurrent_write to 48 or 128 (depending on the CPU on the nodes).

Careful with LCS: it is not recommended for write-heavy workloads. LCS is
good for optimizing reads, in that it avoids having to read many SSTables.
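
As a rough cassandra.yaml sketch of the above (example values only, not
recommendations; note the settings are spelled concurrent_reads /
concurrent_writes in the yaml itself):

    # cassandra.yaml -- example values; tune to your own disks and cores
    concurrent_reads: 64    # reads are mostly IO-bound: size from the drives
    concurrent_writes: 48   # writes are mostly CPU-bound: size from the cores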

-- Brice

On Wed, Apr 22, 2015 at 6:53 AM, Anishek Agarwal anis...@gmail.com wrote:

 Thanks Brice for the input,

 I am confused as to how to calculate the value of concurrent_read; the
 following is what I found recommended on sites and in the configuration docs.

 concurrent_read: some places say 16 x the number of drives, others 4 x the
 number of cores.
 Which of the above should I pick? I have a 40-core CPU with 3 disks (non-SSD),
 one used for the commitlog and the other two for data directories, and I have
 3 nodes in my cluster.

 I think there are tools out there that measure the max write speed to disk;
 I am going to run them too, to find out the write throughput I can get and
 check that I am not trying to overachieve something. Currently we are stuck
 at 35 MBps.

 @Sebastian
 the concurrent_compactors is at the default value of 32 for us, and I think
 that should be fine.
 Since we have a lot of cores I thought it would be better to use
 multithreaded_compaction, but I think I will try one set with it turned off
 again.

 Question is still:

 how do I find what write load I should aim for per node such that it is
 able to compact data while inserting? Is it just trial and error, or is
 there a certain QPS I can target per node?

 Our business case is
 -- a new client comes and we create a new keyspace for him; initially there
 will be lots of new keys (I think size tiered might work better here)
 -- as time progresses we are going to update the existing keys very
 frequently (I think LCS will work better here -- we are going with this
 strategy for its long-term benefit)




 On Wed, Apr 22, 2015 at 4:17 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Yes I was referring to multithreaded_compaction, but just because we didn't
 get bitten by this setting doesn't mean it's right, and the jira is a clear
 indication of that ;)

 @Anishek that reminds me of these settings to look at as well:

- concurrent_write and concurrent_read both need to be adapted to
your actual hardware though.

  Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD’s.

 Yes that is typically the case; SSDs are more and more common, but so are
 multi-core CPUs, and the trend to multiple cores is not going to stop; just
 look at the next Intel *flagship*: Knights Landing
 http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
 = *72 cores*.

 Nowadays it is not rare to have boxes with multi-core CPUs; either way, if
 the cores are not used because of some IO bottleneck there's no reason to be
 licensed for them, and if IO is not an issue the CPUs are most probably next
 in line. A node, though, is much more about a combination of that plus much
 more added value, like the linear scaling of Cassandra. And I'm not even
 listing the other nifty integrations that DSE ships.

 But on this matter I believe we shouldn’t hijack the original thread
 purpose.

 — Brice

 On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
 sebastian.este...@datastax.com wrote:

 I want to draw a distinction between a) multithreaded compaction (the
 jira I just pointed to) and b) concurrent_compactors. I'm not clear on
 which one you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD's.


 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
Sorry, I take that back: we will modify different keys across threads, not
the same key; our storm topology is going to use field grouping to route
updates for the same keys to the same set of bolts.

On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com wrote:

 @Brice: I don't think so, as I am giving each thread a specific key range
 with no overlaps, so that does not seem to be the case now. However, we will
 have to test the case where we modify the same key across threads -- do you
 think that will cause a problem? As far as I have read, LCS is recommended
 for such cases. Should I just switch back to SizeTieredCompactionStrategy?


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it be that the app is inserting _duplicate_ keys?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to
 a cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek










Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
@Brice: I don't think so, as I am giving each thread a specific key range
with no overlaps, so that does not seem to be the case now. However, we will
have to test the case where we modify the same key across threads -- do you
think that will cause a problem? As far as I have read, LCS is recommended
for such cases. Should I just switch back to SizeTieredCompactionStrategy?
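
A hedged CQL sketch of that switch, using the table from this thread
(existing SSTables get reorganized by background compactions afterwards):

    -- assumes the test.test_bits table defined earlier in the thread
    ALTER TABLE test.test_bits
      WITH compaction = {'class': 'SizeTieredCompactionStrategy'};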


On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
wrote:

 Could it be that the app is inserting _duplicate_ keys?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to
 a cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek









Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
I'm not sure I get everything about the storm stuff, but my understanding of
LCS is that the compaction count may increase the more one updates data
(that's why I was wondering about duplicate primary keys).

Another option is that the code is sending too many write requests/s to the
cassandra cluster. I don't know how many nodes you have, but the fewer nodes
there are, the more compactions.
Also I'd look at the CPU / load; maybe the config is too *restrictive*.
Look at the following properties in the cassandra.yaml:

   - compaction_throughput_mb_per_sec: by default the value is 16; you may
   want to increase it, but be careful on mechanical drives. If already on
   SSDs, IO is rarely the issue; we have 64 (with SSDs).
   - multithreaded_compaction: by default it is false; we enabled it.

Compaction threads are niced, so it shouldn't be much of an issue for serving
production r/w requests. But you never know; always keep an eye on IO and
CPU.
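
A minimal cassandra.yaml sketch of those two properties (example values only;
note that later messages in this thread warn against multithreaded_compaction,
and it was removed in 2.1):

    # cassandra.yaml -- example values only
    compaction_throughput_mb_per_sec: 64   # default 16; throttles compaction IO
    multithreaded_compaction: true         # default false; see CASSANDRA-6142 before enabling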

— Brice

On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com wrote:

 Sorry, I take that back: we will modify different keys across threads, not
 the same key; our storm topology is going to use field grouping to route
 updates for the same keys to the same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Brice: I don't think so, as I am giving each thread a specific key range
 with no overlaps, so that does not seem to be the case now. However, we will
 have to test the case where we modify the same key across threads -- do you
 think that will cause a problem? As far as I have read, LCS is recommended
 for such cases. Should I just switch back to SizeTieredCompactionStrategy?


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it be that the app is inserting _duplicate_ keys?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver
 to a cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek










Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Carlos Rolo
Are you on version 2.1.x?

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek









Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Could it be that the app is inserting _duplicate_ keys?

-- Brice

On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will be
 doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek








Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Sebastian Estevez
I want to draw a distinction between a) multithreaded compaction (the jira
I just pointed to) and b) concurrent_compactors. I'm not clear on which one
you are recommending at this stage.

a) Multithreaded compaction is what I warned against in my last note. b)
Concurrent compactors is the number of separate compaction tasks (on
different tables) that can run simultaneously. You can crank this up
without much risk though the old default of num cores was too aggressive
(CASSANDRA-7139). 2 seems to be the sweet-spot.

Cassandra is, more often than not, disk constrained though this can change
for some workloads with SSD's.
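
A one-line cassandra.yaml sketch of that recommendation (the
pre-CASSANDRA-7139 default was the number of cores):

    # cassandra.yaml -- cap the number of simultaneous compaction tasks
    concurrent_compactors: 2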


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
wrote:

 Oh, thank you Sebastian for this input and the ticket reference!
 We did notice an increase in CPU usage, but kept the concurrent compaction
 low enough for our usage; by default it uses the number of cores. We did
 use a number up to 30% of our available cores. But under heavy load clearly
 CPU is the bottleneck, and we have 2 CPUs with 8 hyper-threaded cores per
 node.

 On a related topic: I'm a bit concerned by datastax communication; usually
 people talk about IO as being the weak spot, but in our case it's more
 about CPU. Fortunately, Moore's law doesn't really apply vertically
 anymore; now we have multi-core processors *and* the trend is going that
 way. Yet Datastax terms feel a bit *antiquated* and maybe a bit too much
 Oracle-y: http://www.datastax.com/enterprise-terms
 Node licensing is more appropriate for this century.

 -- Brice

 On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 I'm not sure I get everything about the storm stuff, but my understanding of
 LCS is that the compaction count may increase the more one updates data
 (that's why I was wondering about duplicate primary keys).

 Another option is that the code is sending too many write requests/s to
 the cassandra cluster. I don't know how many nodes you have, but the fewer
 nodes there are, the more compactions.
 Also I'd look at the CPU / load; maybe the config is too *restrictive*.
 Look at the following properties in the cassandra.yaml:

    - compaction_throughput_mb_per_sec: by default the value is 16; you
    may want to increase it, but be careful on mechanical drives. If
    already on SSDs, IO is rarely the issue; we have 64 (with SSDs).
    - multithreaded_compaction: by default it is false; we enabled it.

 Compaction threads are niced, so it shouldn't be much of an issue for
 serving production r/w requests. But you never know; always keep an eye on
 IO and CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Sorry, I take that back: we will modify different keys across threads, not
 the same key; our storm topology is going to use field grouping to route
 updates for the same keys to the same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, 

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Sebastian Estevez
Do not enable multithreaded compaction. Overhead usually outweighs any
benefit. It's removed in 2.1 because it harms more than helps:

https://issues.apache.org/jira/browse/CASSANDRA-6142

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
wrote:

 I'm not sure I get everything about the storm stuff, but my understanding of
 LCS is that the compaction count may increase the more one updates data
 (that's why I was wondering about duplicate primary keys).

 Another option is that the code is sending too many write requests/s to the
 cassandra cluster. I don't know how many nodes you have, but the fewer nodes
 there are, the more compactions.
 Also I'd look at the CPU / load; maybe the config is too *restrictive*.
 Look at the following properties in the cassandra.yaml:

    - compaction_throughput_mb_per_sec: by default the value is 16; you
    may want to increase it, but be careful on mechanical drives. If
    already on SSDs, IO is rarely the issue; we have 64 (with SSDs).
    - multithreaded_compaction: by default it is false; we enabled it.

 Compaction threads are niced, so it shouldn't be much of an issue for serving
 production r/w requests. But you never know; always keep an eye on IO and
 CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Sorry, I take that back: we will modify different keys across threads, not
 the same key; our storm topology is going to use field grouping to route
 updates for the same keys to the same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Brice: I don't think so, as I am giving each thread a specific key range
 with no overlaps, so that does not seem to be the case now. However, we will
 have to test the case where we modify the same key across threads -- do you
 think that will cause a problem? As far as I have read, LCS is recommended
 for such cases. Should I just switch back to SizeTieredCompactionStrategy?


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it be that the app is inserting _duplicate_ keys?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you
 will be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
  wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver
 to a cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek











Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
Thanks Brice for the input,

I am confused as to how to calculate the value of concurrent_read; the
following is what I found recommended on sites and in the configuration docs.

concurrent_read: some places say 16 x the number of drives, others 4 x the
number of cores.
Which of the above should I pick? I have a 40-core CPU with 3 disks (non-SSD),
one used for the commitlog and the other two for data directories, and I have
3 nodes in my cluster.
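
Worked out for the hardware described above (both are rules of thumb from the
docs, not hard answers):

    16 x number of data drives = 16 x 2  = 32    # commitlog disk excluded
    4  x number of cores       = 4  x 40 = 160

With non-SSD disks reads are IO-bound, so the drive-based figure (32) is the
safer starting point; the core-based figure assumes the disks can keep up.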

I think there are tools out there that measure the max write speed to disk;
I am going to run them too, to find out the write throughput I can get and
check that I am not trying to overachieve something. Currently we are stuck
at 35 MBps.
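
A quick way to get that number is a sequential-write test with dd (a sketch;
/data1 stands in for one of the data directories, and tools like fio or
bonnie++ give more realistic figures):

    # write 1 GB and include the final flush in the timing
    dd if=/dev/zero of=/data1/ddtest bs=1M count=1024 conv=fdatasync
    rm /data1/ddtest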

@Sebastian
the concurrent_compactors is at the default value of 32 for us, and I think
that should be fine.
Since we have a lot of cores I thought it would be better to use
multithreaded_compaction, but I think I will try one set with it turned off
again.

Question is still:

how do I find what write load I should aim for per node such that it is able
to compact data while inserting? Is it just trial and error, or is there a
certain QPS I can target per node?
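
One empirical approach is to drive load with the bundled stress tool while
watching nodetool compactionstats, backing off the rate until pending
compactions stay flat. A sketch using the 2.0-era syntax (flags from memory --
check tools/bin/cassandra-stress --help on your version):

    # 10M inserts from 75 threads against the three nodes
    tools/bin/cassandra-stress -o insert -n 10000000 -t 75 -d node1,node2,node3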

Our business case is
-- a new client comes and we create a new keyspace for him; initially there
will be lots of new keys (I think size tiered might work better here)
-- as time progresses we are going to update the existing keys very
frequently (I think LCS will work better here -- we are going with this
strategy for its long-term benefit)




On Wed, Apr 22, 2015 at 4:17 AM, Brice Dutheil brice.duth...@gmail.com
wrote:

 Yes I was referring to multithreaded_compaction, but just because we didn't
 get bitten by this setting doesn't mean it's right, and the jira is a clear
 indication of that ;)

 @Anishek that reminds me of these settings to look at as well:

- concurrent_write and concurrent_read both need to be adapted to your
actual hardware though.

  Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD’s.

 Yes that is typically the case; SSDs are more and more common, but so are
 multi-core CPUs, and the trend to multiple cores is not going to stop; just
 look at the next Intel *flagship*: Knights Landing
 http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
 = *72 cores*.

 Nowadays it is not rare to have boxes with multi-core CPUs; either way, if
 the cores are not used because of some IO bottleneck there's no reason to be
 licensed for them, and if IO is not an issue the CPUs are most probably next
 in line. A node, though, is much more about a combination of that plus much
 more added value, like the linear scaling of Cassandra. And I'm not even
 listing the other nifty integrations that DSE ships.

 But on this matter I believe we shouldn’t hijack the original thread
 purpose.

 — Brice

 On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
 sebastian.este...@datastax.com wrote:

 I want to draw a distinction between a) multithreaded compaction (the jira
 I just pointed to) and b) concurrent_compactors. I'm not clear on which one
 you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD's.


 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Oh, thank you Sebastian for this input and the ticket reference!
 We did notice an increase in CPU usage, but kept the concurrent
 compaction low enough for our usage; by default it uses the number of
 cores. We did use a number up to 30% of our available cores. But under
 heavy load clearly CPU is the bottleneck and we 

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Yes I was referring to multithreaded_compaction, but just because we didn't
get bitten by this setting doesn't mean it's right, and the jira is a clear
indication of that ;)

@Anishek that reminds me of these settings to look at as well:

   - concurrent_write and concurrent_read both need to be adapted to your
   actual hardware though.

 Cassandra is, more often than not, disk constrained though this can change
for some workloads with SSD’s.

Yes that is typically the case; SSDs are more and more common, but so are
multi-core CPUs, and the trend to multiple cores is not going to stop; just
look at the next Intel *flagship*: Knights Landing
http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
= *72 cores*.

Nowadays it is not rare to have boxes with multi-core CPUs; either way, if
the cores are not used because of some IO bottleneck there's no reason to be
licensed for them, and if IO is not an issue the CPUs are most probably next
in line. A node, though, is much more about a combination of that plus much
more added value, like the linear scaling of Cassandra. And I'm not even
listing the other nifty integrations that DSE ships.

But on this matter I believe we shouldn’t hijack the original thread
purpose.

— Brice

On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
sebastian.este...@datastax.com wrote:

I want to draw a distinction between a) multithreaded compaction (the jira
 I just pointed to) and b) concurrent_compactors. I'm not clear on which one
 you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can change
 for some workloads with SSD's.


 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Oh, thank you Sebastian for this input and the ticket reference!
 We did notice an increase in CPU usage, but kept the concurrent
 compaction low enough for our usage; by default it uses the number of
 cores. We did use a number up to 30% of our available cores. But under
 heavy load clearly CPU is the bottleneck, and we have 2 CPUs with 8
 hyper-threaded cores per node.

 On a related topic: I'm a bit concerned by datastax communication;
 usually people talk about IO as being the weak spot, but in our case it's
 more about CPU. Fortunately, Moore's law doesn't really apply vertically
 anymore; now we have multi-core processors *and* the trend is going that
 way. Yet Datastax terms feel a bit *antiquated* and maybe a bit too much
 Oracle-y: http://www.datastax.com/enterprise-terms
 Node licensing is more appropriate for this century.

 -- Brice

 On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
@Marcus I did look, and that is where I got the above, but it doesn't show
any detail about moving from L0 -> L1. Any specific arguments I should try
with?

On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek






Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Marcus Eriksson
nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
sstable level information

and, it is also likely that since you get so many L0 sstables, you will be
doing size tiered compaction in L0 for a while.
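
For example (a sketch; output format varies by version, but it includes a
line with each SSTable's level, and the data path shown is the stock default
-- adjust to your data_file_directories):

    # print metadata, including the LCS level, for each SSTable on disk
    tools/bin/sstablemetadata /var/lib/cassandra/data/test/test_bits/*-Data.db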

On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek







Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
the some_bits column has about 14-15 bytes of data per key.

On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek



Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Marcus Eriksson
you need to look at nodetool compactionstats - there is probably a big
L0 -> L1 compaction going on that blocks other compactions from starting
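
For example (a sketch; the column layout varies by version):

    # shows active compactions plus the pending-task count; a long-running
    # L0 -> L1 compaction shows up here with its completed/total bytes
    nodetool -h 127.0.0.1 compactionstats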

On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek





LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
Hello,

I am inserting about 100 million entries via datastax-java driver to a
cassandra cluster of 3 nodes.

Table structure is as follows:

create keyspace test with replication = {'class':
'NetworkTopologyStrategy', 'DC' : 3};

CREATE TABLE test_bits(id bigint primary key, some_bits text) with
gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
and compression={'sstable_compression' : ''};

I have 75 threads that are inserting data into the above table, with each
thread having non-overlapping keys.

I see that the number of pending tasks via nodetool compactionstats keeps
increasing, and it looks from nodetool cfstats like test.test_bits has
SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

Why is compaction not kicking in?

thanks
anishek
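
On reading the levels line above: in [154/4, 8, 0, ...] the first entry means
154 SSTables sit in L0 against its soft cap of 4 (the /4 marks the overflow),
8 are in L1, and nothing has been promoted further -- i.e. ingest is
outrunning L0 -> L1 promotion. A sketch of the check:

    # per-table stats; under LCS look for the 'SSTables in each level' line
    nodetool cfstats test.test_bits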


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
I am on version 2.0.14; will update once I get the stats up for the writes
again.


On Tue, Apr 21, 2015 at 4:46 PM, Carlos Rolo r...@pythian.com wrote:

 Are you on version 2.1.x?

 Regards,

 Carlos Juzarte Rolo
 Cassandra Consultant

 Pythian - Love your data

 rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
 Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
 www.pythian.com

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as follows:

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key, some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 I have 75 threads that are inserting data into the above table, with each
 thread having non-overlapping keys.

 I see that the number of pending tasks via nodetool compactionstats keeps
 increasing, and it looks from nodetool cfstats like test.test_bits has
 SSTable levels [154/4, 8, 0, 0, 0, 0, 0, 0, 0].

 Why is compaction not kicking in?

 thanks
 anishek










Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Oh, thank you Sebastian for this input and the ticket reference!
We did notice an increase in CPU usage, but kept the concurrent compaction
low enough for our usage; by default it uses the number of cores. We did
use a number up to 30% of our available cores. But under heavy load clearly
CPU is the bottleneck, and we have 2 CPUs with 8 hyper-threaded cores per
node.

On a related topic: I'm a bit concerned by datastax communication; usually
people talk about IO as being the weak spot, but in our case it's more
about CPU. Fortunately, Moore's law doesn't really apply vertically
anymore; now we have multi-core processors *and* the trend is going that
way. Yet Datastax terms feel a bit *antiquated* and maybe a bit too much
Oracle-y: http://www.datastax.com/enterprise-terms
Node licensing is more appropriate for this century.

-- Brice

On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 I'm not sure I get everything about the storm stuff, but my understanding of
 LCS is that the compaction count may increase the more one updates data
 (that's why I was wondering about duplicate primary keys).

 Another option is that the code is sending too many write requests/s to
 the cassandra cluster. I don't know how many nodes you have, but the fewer
 nodes there are, the more compactions.
 Also I'd look at the CPU / load; maybe the config is too *restrictive*.
 Look at the following properties in the cassandra.yaml:

    - compaction_throughput_mb_per_sec: by default the value is 16; you
    may want to increase it, but be careful on mechanical drives. If
    already on SSDs, IO is rarely the issue; we have 64 (with SSDs).
    - multithreaded_compaction: by default it is false; we enabled it.

 Compaction threads are niced, so it shouldn't be much of an issue for
 serving production r/w requests. But you never know; always keep an eye on
 IO and CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Sorry, I take that back: we will modify different keys across threads, not
 the same key; our storm topology is going to use field grouping to route
 updates for the same keys to the same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Brice: I don't think so, as I am giving each thread a specific key range
 with no overlaps, so that does not seem to be the case now. However, we will
 have to test the case where we modify the same key across threads -- do you
 think that will cause a problem? As far as I have read, LCS is recommended
 for such cases. Should I just switch back to SizeTieredCompactionStrategy?


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
  wrote:

 Could it be that the app is inserting _duplicate_ keys?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you
 will be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look, and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
  wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal 
 anis...@gmail.com wrote:

 Hello,

 I am inserting about 100 million entries via