Re: Bootstrap performance.

2015-04-21 Thread Robert Coli
On Mon, Apr 20, 2015 at 8:09 PM, Dikang Gu dikan...@gmail.com wrote:

 Why do you say streaming is single threaded? I see a lot of background
 streaming threads running, for example:


Imprecise:

Each stream is a single thread.

As I said, first place to look is throttles... but I would not be surprised
if the overall number of threads available to streaming is a meaningful
bound.
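
(For reference, the streaming throttle lives in cassandra.yaml, shown here with what I believe was the stock default of that era rather than anything cluster-specific:

stream_throughput_outbound_megabits_per_sec: 200

It can also be adjusted at runtime with nodetool setstreamthroughput.)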

=Rob


Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-21 Thread Anuj Wadehra
Thanks Brice!!


We are using Red Hat Linux 6.4, 24 cores, 64 GB RAM, and SSDs in RAID 5. CPUs are not 
overloaded even at peak load. I don't think IO is an issue: iostat shows 
await < 17 at all times, and the %util attribute in iostat usually jumps from 0 to 
100 and comes back immediately. I'm not an expert at analyzing IO, but things 
look OK. We are using STCS and not using logged batches. We are making around 
12k writes/sec across 5 CFs (one with 4 secondary indexes) and 2300 reads/sec on each node 
of a 3 node cluster. 2 CFs have wide rows with max data of around 100 MB per row.  
 We have further reduced in_memory_compaction_limit_in_mb to 125, though we are still 
getting logs saying "compacting large row".


We are planning to upgrade to 2.0.14 as 2.1 is not yet production ready.


I would appreciate it if you could answer the queries posted in the initial mail.


Thanks

Anuj Wadehra


Sent from Yahoo Mail on Android

From:Brice Dutheil brice.duth...@gmail.com
Date:Tue, 21 Apr, 2015 at 10:22 pm
Subject:Re: Handle Write Heavy Loads in Cassandra 2.0.3

This is an intricate matter; I cannot say for sure which parameters are good and 
which are wrong, since too many things changed at once.

However, there are many things to consider:

What is your OS?
Do your nodes have SSDs or mechanical drives?
How many cores do you have?
Is it the CPUs or the IO that is overloaded?
What is the write request/s per node and cluster wide?
What is the compaction strategy of the tables you are writing into?
Are you using LOGGED BATCH statements?

With heavy writes, it is NOT recommended to use LOGGED BATCH statements.
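
As a minimal, hedged sketch of the alternative with the DataStax Java driver - individual async writes or an UNLOGGED batch - where the keyspace, table and column names are invented for illustration:

import com.datastax.driver.core.*;

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect();

// Hypothetical table: CREATE TABLE ks.events (id bigint PRIMARY KEY, payload text)
PreparedStatement ps = session.prepare("INSERT INTO ks.events (id, payload) VALUES (?, ?)");

// UNLOGGED batch: skips the batch log, avoiding the extra write amplification of LOGGED batches
BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
for (long i = 0; i < 100; i++) {
    batch.add(ps.bind(i, "payload-" + i));
}
session.execute(batch);

Even unlogged, a large multi-partition batch still funnels everything through one coordinator, so plain executeAsync per row is often the better fit when raw write throughput is the goal.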

In our 2.0.14 cluster we have experienced node unavailability due to long Full 
GC pauses. We discovered bogus legacy data: a single outlier was so wrong that 
it updated the same CQL rows hundreds of thousands of times with duplicate data. Given 
that the tables we were writing to were configured to use LCS, this resulted in 
keeping Memtables in memory long enough to promote them to the old generation 
(the MaxTenuringThreshold default is 1).
Handling this data proved to be the thing to fix; with default GC settings the 
cluster (10 nodes) handles 39 write requests/s.

Note Memtables are allocated on heap with 2.0.x. With 2.1.x they will be 
allocated off-heap.
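
(For reference, the 2.1 cassandra.yaml knob controlling this is, if memory serves, memtable_allocation_type:

memtable_allocation_type: heap_buffers    # offheap_buffers / offheap_objects move memtable data off-heap

with heap_buffers being the default.)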

​


-- Brice


On Tue, Apr 21, 2015 at 5:12 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

Any suggestions or comments on this one?? 


Thanks

Anuj Wadhera


Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Mon, 20 Apr, 2015 at 11:51 pm
Subject:Re: Handle Write Heavy Loads in Cassandra 2.0.3

Small correction: we are making writes to 5 CFs and reading from one at high 
speed. 




Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Mon, 20 Apr, 2015 at 7:53 pm
Subject:Handle Write Heavy Loads in Cassandra 2.0.3

Hi, 
 
Recently, we discovered that millions of mutations were getting dropped on our 
cluster. Eventually, we solved this problem by increasing the value of 
memtable_flush_writers from 1 to 3. We usually write 3 CFs simultaneously and 
one of them has 4 secondary indexes. 
 
New changes also include: 
concurrent_compactors: 12 (earlier it was default) 
compaction_throughput_mb_per_sec: 32 (earlier it was the default) 
in_memory_compaction_limit_in_mb: 400 (earlier it was the default of 64) 
memtable_flush_writers: 3 (earlier 1) 
 
After making the above changes, our write heavy workload scenarios started giving 
promotion failed exceptions in the GC logs. 
 
We have done JVM tuning and Cassandra config changes to solve this: 
 
MAX_HEAP_SIZE=12G (increased heap from 8G to reduce fragmentation) 
HEAP_NEWSIZE=3G 
 
JVM_OPTS=$JVM_OPTS -XX:SurvivorRatio=2 (We observed that even at 
SurvivorRatio=4, our survivor space was getting 100% utilized under heavy write 
load and we thought that minor collections were directly promoting objects to 
Tenured generation) 
 
JVM_OPTS=$JVM_OPTS -XX:MaxTenuringThreshold=20 (lots of objects were moving 
from Eden to Tenured on each minor collection; may be related to medium-lived 
objects from Memtables and compactions, as suggested by a heap dump) 
 
JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20 
JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions 
JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity 
JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs 
JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768 
JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark 
JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3 
JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000 //though it's default value 
JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways 
JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled 
JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking 
JVM_OPTS=$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70 (we reduced the value 
to avoid concurrent mode failures) 
 
Cassandra config: 
compaction_throughput_mb_per_sec: 24 
memtable_total_space_in_mb: 1000 (to make memtable flushes frequent; the default is 
1/4 of the heap, which creates more long-lived objects) 
 
Questions: 
1. 

Re: Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Anuj Wadehra
What's your SSTable count for the CF? I hope compactions are working fine. Also 
check the full stack trace of the FileNotFoundException; if it's related to 
compaction, you can try cleaning the compactions_in_progress folder in the system 
folder in the data directory. There are JIRA issues relating to that.


Thanks

Anuj Wadehra


Sent from Yahoo Mail on Android

From:Laing, Michael michael.la...@nytimes.com
Date:Tue, 21 Apr, 2015 at 10:21 pm
Subject:Re: Cassandra tombstones being created by updating rows with TTL's

Hmm - we read/write with LOCAL_QUORUM always - I'd recommend that, as that is 
your 'consistency' defense.


We use Python, so I am not familiar with the Java driver - but 'file not found' 
indicates something is inconsistent. 
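
For the Java driver side, a minimal hedged sketch - the consistency level can be set per statement or once for the whole Cluster; the schema names below are placeholders, not the actual table:

import com.datastax.driver.core.*;

// Default LOCAL_QUORUM for every request made through this Cluster
Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
        .build();
Session session = cluster.connect();

// Or per statement
Statement read = new SimpleStatement("SELECT * FROM my_ks.my_table WHERE id = ?", 42L);
read.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
session.execute(read);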


On Tue, Apr 21, 2015 at 12:22 PM, Walsh, Stephen stephen.wa...@aspect.com 
wrote:

Thanks for all your help Michael,

 

Our data will change through the day, so data with a TTL will eventually get 
dropped, and new data will appear.

I’d imagine the entire table may expire and start over 7-10 times a day.

 

 

 

But on the GC topic, the Java driver now gives this error on the query.

I also get “Request did not complete within rpc_timeout.” In cqlsh.

 

#

com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout 
during read query at consistency ONE (1 responses were required but only 0 
replica responded)

    at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.driver.core.Responses$Error.asException(Responses.java:100) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:140) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:249) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:433) ~[cassandra-driver-core-2.1.4.jar:na]
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)
    at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:61) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:38) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:168) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.shaded.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66) ~[cassandra-driver-core-2.1.4.jar:na]
    at com.datastax.shaded.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[cassandra-driver-core-2.1.4.jar:na]
#

 

 

These queries were taking about 1 second to run when gc_grace_seconds was at 10 seconds 
(the same duration as the TTL).

 

Also seeing a lot of this stuff in the log file:

 

#

ERROR [ReadStage:71] 2015-04-21 17:11:07,597 CassandraDaemon.java (line 199) 
Exception in thread Thread[ReadStage:71,5,main]

java.lang.RuntimeException: java.lang.RuntimeException: 
java.io.FileNotFoundException: 
/var/lib/cassandra/data/keyspace/table/keyspace-table-jb-5-Data.db (No such 
file or directory)

    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /var/lib/cassandra/data/keyspace/table/keyspace-table-jb-5-Data.db



 

 

Maybe this is a 1 step back 2 steps forward approach?

Any ideas?

 

 

 

 

From: Laing, Michael [mailto:michael.la...@nytimes.com] 
Sent: 21 April 2015 17:09


To: user@cassandra.apache.org
Subject: Re: Cassandra tombstones being created by updating rows with TTL's

 

Discussions previously on the list show why this is not a problem in much more 
detail.

 

If something changes in your cluster: node down, new node, etc - you run repair 
for sure.

 

We also run periodic repairs prophylactically.

 

But if you never delete and always ttl by the same amount, you do not have to 
worry about zombie data being resurrected - the main reason for running repair 
within gc_grace_seconds.

 

 

 

On Tue, Apr 21, 2015 at 11:49 AM, Walsh, Stephen stephen.wa...@aspect.com 
wrote:

Many thanks Michael, 

I will give these settings a go.

How do you do your periodic nodetool repairs in this situation? From what I read, 
we need to start doing this also.

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
Sorry, I take that back: we will modify different keys across threads, not the
same key. Our storm topology is going to use field grouping to get updates
for the same keys to the same set of bolts.

On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com wrote:

 @Bruice : I dont think so as i am giving each thread a specific key range
 with no overlaps this does not seem to be the case now. However we will
 have to test where we have to modify the same key across threads -- do u
 think that will cause a problem ? As far as i have read LCS is recommended
 for such cases. should i just switch back to SizeTiredCompactionStrategy.


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt show
 any detail about moving from L0 -L1 any specific arguments i should try
 with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 - L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to
 a cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 
 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with
 each thread having non over lapping keys.

 I see that the number of pending tasks via nodetool
 compactionstats keeps increasing and looks like from nodetool cfstats
 test.test_bits has SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek










Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
@Brice: I don't think so, as I am giving each thread a specific key range
with no overlaps, so this does not seem to be the case now. However, we will
have to test the case where we modify the same key across threads -- do you
think that will cause a problem? As far as I have read, LCS is recommended
for such cases. Should I just switch back to SizeTieredCompactionStrategy?


On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
wrote:

 Could it that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt show
 any detail about moving from L0 -L1 any specific arguments i should try
 with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 - L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to
 a cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 
 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with
 each thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits 
 has
 SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek









Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
I'm not sure I get everything about the storm stuff, but my understanding of
LCS is that the compaction count may increase the more one updates data (that's
why I was wondering about duplicate primary keys).

Another option is that the code is sending too many write requests/s to the
cassandra cluster. I don't know how many nodes you have, but the fewer nodes
there are, the more compactions.
Also I'd look at the CPU / load; maybe the config is too *restrictive*.
Look at the following properties in cassandra.yaml:

   - compaction_throughput_mb_per_sec, by default the value is 16, you may
   want to increase it but be careful on mechanical drives, if already in SSD
   IO is rarely the issue, we have 64 (with SSDs)
   - multithreaded_compaction by default it is false, we enabled it.

Compaction threads are niced, so it shouldn't be much of an issue for serving
production r/w requests. But you never know; always keep an eye on IO and
CPU.

— Brice

On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com wrote:

sorry i take that back we will modify different keys across threads not the
 same key, our storm topology is going to use field grouping to get updates
 for same keys to same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Bruice : I dont think so as i am giving each thread a specific key range
 with no overlaps this does not seem to be the case now. However we will
 have to test where we have to modify the same key across threads -- do u
 think that will cause a problem ? As far as i have read LCS is recommended
 for such cases. should i just switch back to SizeTiredCompactionStrategy.


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt
 show any detail about moving from L0 -L1 any specific arguments i should
 try with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a
 big L0 - L1 compaction going on that blocks other compactions from 
 starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver
 to a cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 
 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with
 each thread having non over lapping keys.

 I see that the number of pending tasks via nodetool
 compactionstats keeps increasing and looks like from nodetool cfstats
 test.test_bits has SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek








  ​


Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Walsh, Stephen
We were chatting with Jon Haddad about a week ago about our tombstone issue 
using Cassandra 2.0.14.
To Summarize

We have a 3 node cluster with replication-factor=3 and compaction = SizeTiered
We use 1 keyspace with 1 table
Each row has about 40 columns
Each row has a TTL of 10 seconds

We insert about 500 rows per second in a prepared batch** (about 3 MB in network 
overhead)
We query the entire table once per second

**This is to enable consistent data, i.e. the batch is transactional, so we get all 
queried data from one insert and not a mix of 2 or more.
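
For what it's worth, a rough sketch of that insert pattern with the Java driver, assuming invented column names (the real table has ~40 columns), an already-connected Session, and the 10 second TTL described above:

import com.datastax.driver.core.*;

// Placeholder schema: my_ks.my_table (id bigint PRIMARY KEY, col1 text, ...)
// 'session' is an already-connected com.datastax.driver.core.Session
PreparedStatement ps = session.prepare(
        "INSERT INTO my_ks.my_table (id, col1) VALUES (?, ?) USING TTL 10");

// One LOGGED batch per second, matching the "all queried data from one insert" intent
BatchStatement batch = new BatchStatement();          // Type.LOGGED is the default
for (long id = 0; id < 500; id++) {                   // stands in for the ~500 rows built each second
    batch.add(ps.bind(id, "value-" + id));
}
session.execute(batch);

Note that every cell written this way still becomes a tombstone once its TTL expires, which is exactly the behaviour discussed below.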


It seems the rows we insert every second are never deleted by the TTL - or so we 
thought.
After some time we got this message on the query side


###
ERROR [ReadStage:91] 2015-04-21 12:27:03,902 SliceQueryFilter.java (line 206) 
Scanned over 10 tombstones in keyspace.table; query aborted (see 
tombstone_failure_threshold)
ERROR [ReadStage:91] 2015-04-21 12:27:03,931 CassandraDaemon.java (line 199) 
Exception in thread Thread[ReadStage:91,5,main]
java.lang.RuntimeException: 
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
###


So we know tombstones are in fact being created.
The solution was to change the table schema and set gc_grace_seconds to 60 
seconds.
This worked for 20 seconds, then we saw this


###
Read 500 live and 3 tombstoned cells in keyspace.table (see 
tombstone_warn_threshold). 1 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
###

So every 20 seconds we hit the threshold again (500 inserts x 20 seconds = 10,000 tombstones).
So now we have gc_grace_seconds set to 10 seconds.
But it feels very wrong to have it at such a low number, especially if we move to a 
larger cluster. This just won't fly.
What are we doing wrong?

We shouldn't increase the tombstone threshold as that is extremely dangerous.


Best Regards
Stephen Walsh






This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.


Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-21 Thread Anuj Wadehra
Any suggestions or comments on this one?? 


Thanks

Anuj Wadhera


Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Mon, 20 Apr, 2015 at 11:51 pm
Subject:Re: Handle Write Heavy Loads in Cassandra 2.0.3

Small correction: we are making writes to 5 CFs and reading from one at high 
speed. 



Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Mon, 20 Apr, 2015 at 7:53 pm
Subject:Handle Write Heavy Loads in Cassandra 2.0.3

Hi, 
 
Recently, we discovered that millions of mutations were getting dropped on our 
cluster. Eventually, we solved this problem by increasing the value of 
memtable_flush_writers from 1 to 3. We usually write 3 CFs simultaneously and 
one of them has 4 secondary indexes. 
 
New changes also include: 
concurrent_compactors: 12 (earlier it was default) 
compaction_throughput_mb_per_sec: 32 (earlier it was the default) 
in_memory_compaction_limit_in_mb: 400 (earlier it was the default of 64) 
memtable_flush_writers: 3 (earlier 1) 
 
After making the above changes, our write heavy workload scenarios started giving 
promotion failed exceptions in the GC logs. 
 
We have done JVM tuning and Cassandra config changes to solve this: 
 
MAX_HEAP_SIZE=12G (increased heap from 8G to reduce fragmentation) 
HEAP_NEWSIZE=3G 
 
JVM_OPTS=$JVM_OPTS -XX:SurvivorRatio=2 (We observed that even at 
SurvivorRatio=4, our survivor space was getting 100% utilized under heavy write 
load and we thought that minor collections were directly promoting objects to 
Tenured generation) 
 
JVM_OPTS=$JVM_OPTS -XX:MaxTenuringThreshold=20 (lots of objects were moving 
from Eden to Tenured on each minor collection; may be related to medium-lived 
objects from Memtables and compactions, as suggested by a heap dump) 
 
JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20 
JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions 
JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity 
JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs 
JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768 
JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark 
JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3 
JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000 //though it's default value 
JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways 
JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled 
JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking 
JVM_OPTS=$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70 (we reduced the value 
to avoid concurrent mode failures) 
 
Cassandra config: 
compaction_throughput_mb_per_sec: 24 
memtable_total_space_in_mb: 1000 (to make memtable flushes frequent; the default is 
1/4 of the heap, which creates more long-lived objects) 
 
Questions: 
1. Why did increasing memtable_flush_writers and in_memory_compaction_limit_in_mb 
cause promotion failures in the JVM? Does more memtable_flush_writers mean more 
memtables in memory? 

2. Still, objects are getting promoted at high speed to the Tenured space. CMS is 
running on the old gen every 4-5 minutes under heavy write load. Around 750+ minor 
collections of up to 300ms happened in 45 mins. Do you see any problems with the new 
JVM tuning and Cassandra config? Does the justification given for those 
changes sound logical? Any suggestions? 
3. What is the best practice for reducing heap fragmentation/promotion failure 
when allocation and promotion rates are high? 
 
Thanks 
Anuj 
 
 




Re: Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Laing, Michael
If you never delete except by ttl, and always write with the same ttl (or
monotonically increasing), you can set gc_grace_seconds to 0.

That's what we do. There have been discussions on the list over the last
few years re this topic.
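
As a concrete sketch of that change (the table name is a placeholder; the same statements work from cqlsh or through an existing Java driver Session):

// Only safe when rows are never deleted explicitly and always carry the same (or a growing) TTL
session.execute("ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 0");

// Optionally let the server apply the TTL uniformly instead of on every INSERT
session.execute("ALTER TABLE my_ks.my_table WITH default_time_to_live = 10");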

ml

On Tue, Apr 21, 2015 at 11:14 AM, Walsh, Stephen stephen.wa...@aspect.com
wrote:

  We were chatting to Jon Haddena about a week ago about our tombstone
 issue using Cassandra 2.0.14

 To Summarize



 We have a 3 node cluster with replication-factor=3 and compaction =
 SizeTiered

 We use 1 keyspace with 1 table

 Each row have about 40 columns

 Each row has a TTL of 10 seconds



 We insert about 500 rows per second in a prepared batch** (about 3mb in
 network overhead)

 We query the entire table once per second



 **This is too enable consistent data, E.G batch in transactional, so we
 get all queried data from one insert and not a mix of 2 or more.





 Seems every second we insert, the rows are never deleted by the TTL, or so
 we thought.

 After some time we got this message on the query side





 ###

 ERROR [ReadStage:91] 2015-04-21 12:27:03,902 SliceQueryFilter.java (line
 206) Scanned over 10 tombstones in keyspace.table; query aborted (see
 tombstone_failure_threshold)

 ERROR [ReadStage:91] 2015-04-21 12:27:03,931 CassandraDaemon.java (line
 199) Exception in thread Thread[ReadStage:91,5,main]

 java.lang.RuntimeException:
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException

 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 at java.lang.Thread.run(Thread.java:745)

 Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException

 ###





 So we know tombstones are infact being created.

 Solution was to change the table schema and set gc_grace_seconds to run
 every 60 seconds.

 This worked for 20 seconds, then we saw this





 ###

 Read 500 live and 3 tombstoned cells in keyspace.table (see
 tombstone_warn_threshold). 1 columns was requested, slices=[-],
 delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}

 ###



 So every 20 seconds (500 inserts x 20 seconds = 10,000 tombstones)

 So now we have the gc_grace_seconds set to 10 seoncds.

 But its feels very wrong to have it at a low number, especially if we move
 to a larger cluster. This just wont fly.

 What are we doing wrong?



 We shouldn’t increase the tombstone threshold as that is extremely
 dangerous.





 Best Regards

 Stephen Walsh












  This email (including any attachments) is proprietary to Aspect Software,
 Inc. and may contain information that is confidential. If you have received
 this message in error, please do not read, copy or forward this message.
 Please notify the sender immediately, delete it from your system and
 destroy any copies. You may not further disclose or distribute this email
 or its attachments.



Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Carlos Rolo
Are you on version 2.1.x?

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek









RE: Connecting to Cassandra cluster in AWS from local network

2015-04-21 Thread Matthew Johnson
Thanks everyone for the suggestions!



I have used the following code to create my cluster from my dev environment
and it seems to be working perfectly:



cluster = Cluster.builder()
        .addContactPoints(nodes)
        .withAddressTranslater(new AddressTranslater() {
            // Map the private AWS IPs the cluster advertises to the public NAT
            // addresses that are reachable from the local dev network.
            public InetSocketAddress translate(InetSocketAddress address) {
                String newAddress = null;
                if (address != null && address.getAddress() != null) {
                    if (address.getHostName().equals("172.x.x.237")) newAddress = "54.x.x.157";
                    if (address.getHostName().equals("172.x.x.170")) newAddress = "54.x.x.208";
                    if (address.getHostName().equals("172.x.x.150")) newAddress = "54.x.x.142";
                }
                return new InetSocketAddress(newAddress, address.getPort());
            }
        }).build();





Cheers,

Matt





From: Russell Bradberry [mailto:rbradbe...@gmail.com]
Sent: 20 April 2015 19:06
To: user@cassandra.apache.org
Subject: Re: Connecting to Cassandra cluster in AWS from local network



I would like to note that this will require all clients to connect over the
external IP address. If you have clients within Amazon that need to connect
over the private IP address, this would not be possible. If you have a mix
of clients that need to connect over the private IP address and the public one,
then one of the solutions outlined in
https://datastax-oss.atlassian.net/browse/JAVA-145 may be more appropriate.



-Russ



From: Alex Popescu
Reply-To: user@cassandra.apache.org
Date: Monday, April 20, 2015 at 2:00 PM
To: user
Subject: Re: Connecting to Cassandra cluster in AWS from local network



You'll have to configure your nodes to:



1. use AWS internal IPs for inter-node connection (check listen_address)
and

2. use the AWS public IP for client-to-node connections (check rpc_address)



Depending on the setup, there might be other interesting conf options in
cassandra.yaml (broadcast_address, listen_interface, rpc_interface).



[1]:
http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html



On Mon, Apr 20, 2015 at 9:50 AM, Jonathan Haddad j...@jonhaddad.com wrote:

Ideally you'll be on the same network, but if you can't be, you'll need to
use the public ip in listen_address.



On Mon, Apr 20, 2015 at 9:47 AM Matthew Johnson matt.john...@algomi.com
wrote:

Hi all,



I have set up a Cassandra cluster with 2.1.4 on some existing AWS boxes,
just as a POC. Cassandra servers connect to each other over their internal
AWS IP addresses (172.x.x.x) aliased in /etc/hosts as sales1, sales2 and
sales3.



I connect to it from my local dev environment using the seed’s external NAT
address (54.x.x.x), aliased in my Windows hosts file as sales3 (my seed).



When I try to connect, it connects fine, and can retrieve some data (I have
very limited amounts of data in there, but it seems to retrieve ok), but I
also get lots of stacktraces in my log where my dev environment is trying
to connect to Cassandra on the internal IP (presumably the Cassandra seed
node tells my dev env where to look):





INFO  2015-04-20 16:34:14,808 [CASSANDRA-CLIENT] {main} Cluster - New Cassandra host sales3/54.x.x.142:9042 added
INFO  2015-04-20 16:34:14,808 [CASSANDRA-CLIENT] {main} Cluster - New Cassandra host /172.x.x.237:9042 added
INFO  2015-04-20 16:34:14,808 [CASSANDRA-CLIENT] {main} Cluster - New Cassandra host /172.x.x.170:9042 added
Connected to cluster: Test Cluster
Datatacenter: datacenter1; Host: /172.x.x.170; Rack: rack1
Datatacenter: datacenter1; Host: sales3/54.x.x.142; Rack: rack1
Datatacenter: datacenter1; Host: /172.x.x.237; Rack: rack1
DEBUG 2015-04-20 16:34:14,901 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-0} Connection - Connection[sales3/54.x.x.142:9042-2, inFlight=0, closed=false] Transport initialized and ready
DEBUG 2015-04-20 16:34:14,901 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-0} Session - Added connection pool for sales3/54.x.x.142:9042
DEBUG 2015-04-20 16:34:19,850 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-1} Connection - Connection[/172.x.x.237:9042-1, inFlight=0, closed=false] Error connecting to /172.x.x.237:9042 (connection timed out: /172.x.x.237:9042)
DEBUG 2015-04-20 16:34:19,850 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-1} Connection - Defuncting connection to /172.x.x.237:9042
com.datastax.driver.core.TransportException: [/172.x.x.237:9042] Cannot connect





Does anyone have any experience with connecting to AWS clusters from dev
machines? How have you set up your aliases to get around this issue?



Current setup in sales3 (seed node) cassandra.yaml:



- seeds: sales3
listen_address: sales3
rpc_address: sales3



Current setup in other nodes (eg sales2) cassandra.yaml:



- seeds: sales3
listen_address: sales2


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Could it be that the app is inserting _duplicate_ keys?

-- Brice

On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will be
 doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt show
 any detail about moving from L0 -L1 any specific arguments i should try
 with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 - L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek








RE: Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Walsh, Stephen
Many thanks Michael,
I will give these settings a go.
How do you do your periodic nodetool repairs in this situation? From what I read, 
we need to start doing this also.

https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair


From: Laing, Michael [mailto:michael.la...@nytimes.com]
Sent: 21 April 2015 16:26
To: user@cassandra.apache.org
Subject: Re: Cassandra tombstones being created by updating rows with TTL's

If you never delete except by ttl, and always write with the same ttl (or 
monotonically increasing), you can set gc_grace_seconds to 0.

That's what we do. There have been discussions on the list over the last few 
years re this topic.

ml

On Tue, Apr 21, 2015 at 11:14 AM, Walsh, Stephen stephen.wa...@aspect.com wrote:
We were chatting to Jon Haddena about a week ago about our tombstone issue 
using Cassandra 2.0.14
To Summarize

We have a 3 node cluster with replication-factor=3 and compaction = SizeTiered
We use 1 keyspace with 1 table
Each row have about 40 columns
Each row has a TTL of 10 seconds

We insert about 500 rows per second in a prepared batch** (about 3mb in network 
overhead)
We query the entire table once per second

**This is too enable consistent data, E.G batch in transactional, so we get all 
queried data from one insert and not a mix of 2 or more.


Seems every second we insert, the rows are never deleted by the TTL, or so we 
thought.
After some time we got this message on the query side


###
ERROR [ReadStage:91] 2015-04-21 12:27:03,902 SliceQueryFilter.java (line 206) 
Scanned over 10 tombstones in keyspace.table; query aborted (see 
tombstone_failure_threshold)
ERROR [ReadStage:91] 2015-04-21 12:27:03,931 CassandraDaemon.java (line 199) 
Exception in thread Thread[ReadStage:91,5,main]
java.lang.RuntimeException: 
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
###


So we know tombstones are infact being created.
Solution was to change the table schema and set gc_grace_seconds to run every 
60 seconds.
This worked for 20 seconds, then we saw this


###
Read 500 live and 3 tombstoned cells in keyspace.table (see 
tombstone_warn_threshold). 1 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
###

So every 20 seconds (500 inserts x 20 seconds = 10,000 tombstones)
So now we have the gc_grace_seconds set to 10 seoncds.
But its feels very wrong to have it at a low number, especially if we move to a 
larger cluster. This just wont fly.
What are we doing wrong?

We shouldn’t increase the tombstone threshold as that is extremely dangerous.


Best Regards
Stephen Walsh






This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.



Re: Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Laing, Michael
Discussions previously on the list show why this is not a problem in much
more detail.

If something changes in your cluster: node down, new node, etc - you run
repair for sure.

We also run periodic repairs prophylactically.

But if you never delete and always ttl by the same amount, you do not have
to worry about zombie data being resurrected - the main reason for running
repair within gc_grace_seconds.



On Tue, Apr 21, 2015 at 11:49 AM, Walsh, Stephen stephen.wa...@aspect.com
wrote:

  Maybe thanks Michael,

 I will give these setting a go,

 How do you do you periodic node-tool repairs in the situation, for what I
 read we need to start doing this also.



 https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair





 *From:* Laing, Michael [mailto:michael.la...@nytimes.com]
 *Sent:* 21 April 2015 16:26
 *To:* user@cassandra.apache.org
 *Subject:* Re: Cassandra tombstones being created by updating rows with
 TTL's



 If you never delete except by ttl, and always write with the same ttl (or
 monotonically increasing), you can set gc_grace_seconds to 0.



 That's what we do. There have been discussions on the list over the last
 few years re this topic.



 ml



 On Tue, Apr 21, 2015 at 11:14 AM, Walsh, Stephen stephen.wa...@aspect.com
 wrote:

  We were chatting to Jon Haddena about a week ago about our tombstone
 issue using Cassandra 2.0.14

 To Summarize



 We have a 3 node cluster with replication-factor=3 and compaction =
 SizeTiered

 We use 1 keyspace with 1 table

 Each row have about 40 columns

 Each row has a TTL of 10 seconds



 We insert about 500 rows per second in a prepared batch** (about 3mb in
 network overhead)

 We query the entire table once per second



 **This is too enable consistent data, E.G batch in transactional, so we
 get all queried data from one insert and not a mix of 2 or more.





 Seems every second we insert, the rows are never deleted by the TTL, or so
 we thought.

 After some time we got this message on the query side





 ###

 ERROR [ReadStage:91] 2015-04-21 12:27:03,902 SliceQueryFilter.java (line
 206) Scanned over 10 tombstones in keyspace.table; query aborted (see
 tombstone_failure_threshold)

 ERROR [ReadStage:91] 2015-04-21 12:27:03,931 CassandraDaemon.java (line
 199) Exception in thread Thread[ReadStage:91,5,main]

 java.lang.RuntimeException:
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException

 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 at java.lang.Thread.run(Thread.java:745)

 Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException

 ###





 So we know tombstones are infact being created.

 Solution was to change the table schema and set gc_grace_seconds to run
 every 60 seconds.

 This worked for 20 seconds, then we saw this





 ###

 Read 500 live and 3 tombstoned cells in keyspace.table (see
 tombstone_warn_threshold). 1 columns was requested, slices=[-],
 delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}

 ###



 So every 20 seconds (500 inserts x 20 seconds = 10,000 tombstones)

 So now we have the gc_grace_seconds set to 10 seoncds.

 But its feels very wrong to have it at a low number, especially if we move
 to a larger cluster. This just wont fly.

 What are we doing wrong?



 We shouldn’t increase the tombstone threshold as that is extremely
 dangerous.





 Best Regards

 Stephen Walsh













 This email (including any attachments) is proprietary to Aspect Software,
 Inc. and may contain information that is confidential. If you have received
 this message in error, please do not read, copy or forward this message.
 Please notify the sender immediately, delete it from your system and
 destroy any copies. You may not further disclose or distribute this email
 or its attachments.





Is 2.1.5 ready for upgrade?

2015-04-21 Thread Dikang Gu
Hi guys,

We have some issues with streaming in 2.1.2. We find that there are a lot
of patches in 2.1.5. Is it ready for upgrade?

Thanks.
-- 
Dikang


Re: Is 2.1.5 ready for upgrade?

2015-04-21 Thread Brian Sam-Bodden
Robert,
Can you elaborate more please?

Cheers,
Brian

On Tuesday, April 21, 2015, Robert Coli rc...@eventbrite.com wrote:

 On Tue, Apr 21, 2015 at 2:25 PM, Dikang Gu dikan...@gmail.com wrote:

 We have some issues with streaming in 2.1.2. We find that there are a lot
 of patches in 2.1.5. Is it ready for upgrade?


 I personally would not run either version in production at this time, but
 if forced, would prefer 2.1.5 over 2.1.2.

 =Rob




-- 
Cheers,
Brian
http://www.integrallis.com


Error while building from source code

2015-04-21 Thread Jay Ken
Hi,

I am trying to build the source bundle downloaded from

http://apache.arvixe.com/cassandra/2.1.4/apache-cassandra-2.1.4-src.tar.gz

but when I run ant build I get the following error. Any idea why I am getting
BUILD FAILED? It seems to be looking for the dependency
org.apache.cassandra:cassandra-coverage-deps:jar:2.1.4-SNAPSHOT




BUILD FAILED

/Users/user1/apache-cassandra-2.1.4-src/build.xml:572: Unable to resolve
artifact: Missing:

--

1) com.sun:tools:jar:0


  Try downloading the file manually from the project website.


  Then, install it using the command:

  mvn install:install-file -DgroupId=com.sun -DartifactId=tools
-Dversion=0 -Dpackaging=jar -Dfile=/path/to/file


  Alternatively, if you host your own repository you can deploy the file
there:

  mvn deploy:deploy-file -DgroupId=com.sun -DartifactId=tools
-Dversion=0 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url]
-DrepositoryId=[id]


  Path to dependency:

  1) org.apache.cassandra:cassandra-coverage-deps:jar:2.1.4-SNAPSHOT

  2) net.sourceforge.cobertura:cobertura:jar:2.0.3

  3) com.sun:tools:jar:0


--

1 required artifact is missing.


for artifact:

  org.apache.cassandra:cassandra-coverage-deps:jar:2.1.4-SNAPSHOT


from the specified remote repositories:

  central (http://repo1.maven.org/maven2)

Thanks,
Jay


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Sebastian Estevez
I want to draw a distinction between a) multithreaded compaction (the jira
I just pointed to) and b) concurrent_compactors. I'm not clear on which one
you are recommending at this stage.

a) Multithreaded compaction is what I warned against in my last note. b)
Concurrent compactors is the number of separate compaction tasks (on
different tables) that can run simultaneously. You can crank this up
without much risk though the old default of num cores was too aggressive
(CASSANDRA-7139). 2 seems to be the sweet-spot.
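
In cassandra.yaml terms that recommendation is simply the following (a sketch to be validated against your own workload):

concurrent_compactors: 2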

Cassandra is, more often than not, disk constrained though this can change
for some workloads with SSD's.


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
wrote:

 Oh, thank you Sebastian for this input and the ticket reference !
 We did notice an increase in CPU usage, but kept the concurrent compaction
 low enough for our usage, by default it takes the number of cores. We did
 use a number up to 30% of our available cores. But under heavy load clearly
 CPU is the bottleneck and we have 2 CPU with 8 hyper threaded cores per
 node.

 In a related topic : I’m a bit concerned by datastax communication,
 usually people talk about IO as being the weak spot, but in our case it’s
 more about CPU. Fortunately the Moore law doesn’t really apply anymore
 vertically, now we have multi-core processors *and* the trend is
 going that way. Yet DataStax terms feel a bit *antiquated* and maybe a
 bit too much Oracle-y : http://www.datastax.com/enterprise-terms
 Node licensing is more appropriate for this century.
 ​

 -- Brice

 On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 I’m not sure I get everything about storm stuff, but my understanding of
 LCS is that compaction count may increase the more one update data (that’s
 why I was wondering about duplicate primary keys).

 Another option is that the code is sending too much write request/s to
 the cassandra cluster. I don’t know haw many nodes you have, but the less
 node there is the more compactions.
 Also I’d look at the CPU / load, maybe the config is too *restrictive*,
 look at the following properties in the cassandra.yaml

- compaction_throughput_mb_per_sec, by default the value is 16, you
may want to increase it but be careful on mechanical drives, if already 
 in
SSD IO is rarely the issue, we have 64 (with SSDs)
- multithreaded_compaction by default it is false, we enabled it.

 Compaction thread are niced, so it shouldn’t be much an issue for
 serving production r/w requests. But you never know, always keep an eye on
 IO and CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 sorry i take that back we will modify different keys across threads not
 the same key, our storm topology is going to use field grouping to get
 updates for same keys to same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, 

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Sebastian Estevez
Do not enable multithreaded compaction. Overhead usually outweighs any
benefit. It's removed in 2.1 because it harms more than helps:

https://issues.apache.org/jira/browse/CASSANDRA-6142

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
wrote:

 I’m not sure I get everything about storm stuff, but my understanding of
 LCS is that compaction count may increase the more one update data (that’s
 why I was wondering about duplicate primary keys).

 Another option is that the code is sending too much write request/s to the
 cassandra cluster. I don’t know haw many nodes you have, but the less node
 there is the more compactions.
 Also I’d look at the CPU / load, maybe the config is too *restrictive*,
 look at the following properties in the cassandra.yaml

- compaction_throughput_mb_per_sec, by default the value is 16, you
may want to increase it but be careful on mechanical drives, if already in
SSD IO is rarely the issue, we have 64 (with SSDs)
- multithreaded_compaction by default it is false, we enabled it.

 Compaction thread are niced, so it shouldn’t be much an issue for serving
 production r/w requests. But you never know, always keep an eye on IO and
 CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 sorry i take that back we will modify different keys across threads not
 the same key, our storm topology is going to use field grouping to get
 updates for same keys to same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Bruice : I dont think so as i am giving each thread a specific key
 range with no overlaps this does not seem to be the case now. However we
 will have to test where we have to modify the same key across threads -- do
 u think that will cause a problem ? As far as i have read LCS is
 recommended for such cases. should i just switch back to
 SizeTiredCompactionStrategy.


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you
 will be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt
 show any detail about moving from L0 -L1 any specific arguments i should
 try with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a
 big L0 - L1 compaction going on that blocks other compactions from 
 starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
  wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver
 to a cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text)
 with gc_grace_seconds=0 and compaction = {'class':
 'LeveledCompactionStrategy'} and compression={'sstable_compression' : 
 ''};

 have 75 threads that are inserting data into the above table with
 each thread having non over lapping keys.

 I see that the number of pending tasks via nodetool
 compactionstats keeps increasing and looks like from nodetool 
 cfstats
 test.test_bits has SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 
 0],

 Why is compaction not kicking in ?

 thanks
 anishek








  ​



Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-21 Thread Brice Dutheil
Hi, I cannot really answer your question with any rock-solid certainty.

When we had problems, we did mainly two things

   - Analyzed the GC logs (with censum from jClarity; this tool IS really
   awesome, and it’s a good investment, even more so if production is running
   other Java applications)
   - Took a heap dump of Cassandra around a GC, which helped in narrowing
   down the actual issue (a sketch of both follows below)
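
For reference, a minimal sketch of collecting that kind of evidence (the paths
and the PID lookup are illustrative; the GC flags are standard HotSpot options,
not anything specific to censum or to a particular Cassandra version):

    # verbose GC logging, e.g. appended in cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"

    # heap dump of the running node for offline analysis
    jmap -dump:live,format=b,file=/tmp/cassandra-heap.hprof $(pgrep -f CassandraDaemon)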

I don’t know precisely how to answer, but :

   - concurrent_compactors could be lowered to 10; it seems from another
   thread here that it can be harmful, see
   https://issues.apache.org/jira/browse/CASSANDRA-6142
   - memtable_flush_writers: we set it to 2
   - compaction_throughput_mb_per_sec could probably be increased; on SSDs
   that should help
   - trickle_fsync: don’t forget this one too if you’re on SSDs (illustrative
   values for all of these follow below)
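
As a rough sketch of those knobs in cassandra.yaml (the values below are
illustrative starting points only, not recommendations; tune them against your
own hardware and measurements):

    # cassandra.yaml -- illustrative values only
    concurrent_compactors: 10
    memtable_flush_writers: 2
    compaction_throughput_mb_per_sec: 32
    trickle_fsync: true                  # only worthwhile on SSDs
    trickle_fsync_interval_in_kb: 10240

Compaction throughput can also be changed at runtime with
nodetool setcompactionthroughput 32, which avoids a restart while experimenting.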

Touching JVM heap parameters can be hazardous, increasing heap may seem
like a nice thing, but it can increase GC time in the worst case scenario.

Also, increasing MaxTenuringThreshold is probably wrong too. As you
probably know, it means objects are copied from Eden to Survivor 0/1 and then
to the other Survivor on each subsequent collection until that threshold is
reached, at which point they are promoted to the Old generation. That also
applies to Memtables, so it *may* mean several copies on each young GC, and
memtables are not small objects, which can take a while for a system that is
supposed to stay *available*. Another fact to take into account is that on
each collection the active survivor space (S0/S1) has to be big enough for the
memtable to fit there, along with the other surviving objects.
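
As a rough back-of-the-envelope example of that sizing constraint, using the
HEAP_NEWSIZE=3G and SurvivorRatio=2 values quoted elsewhere in this thread (and
the standard HotSpot meaning of SurvivorRatio = Eden / one survivor space):

    young gen = 3072 MB, SurvivorRatio=2  =>  Eden = 2 x survivor
    young gen = Eden + 2 survivors = 4 x survivor
    each survivor ~ 3072 / 4 = 768 MB, Eden ~ 1536 MB

So every surviving memtable (plus everything else still live in the young
generation) has to fit in roughly 768 MB on each collection, potentially for up
to MaxTenuringThreshold collections before promotion.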

So I would rather work on the real cause rather than on the GC. One thing caught
my attention:

Though still getting logs saying “compacting large row”.

Could it be that the model is based on wide rows? That could be a problem,
for several reasons not limited to compactions. If that is so, I’d advise
revising the data model.
​

-- Brice

On Tue, Apr 21, 2015 at 7:53 PM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 Thanks Brice!!

 We are using Red Hat Linux 6.4..24 cores...64Gb Ram..SSDs in RAID5..CPU
 are not overloaded even in peak load..I dont think IO is an issue as iostat
 shows await17 all times..util attrbute in iostat usually increases from 0
 to 100..and comes back immediately..m not an expert on analyzing IO but
 things look ok..We are using STCS..and not using Logged batches..We are
 making around 12k writes/sec in 5 cf (one with 4 sec index) and 2300
 reads/sec on each node of 3 node cluster. 2 CFs have wide rows with max
 data of around 100mb per row.   We have further reduced
 in_memory_compaction_limit_in_mb to 125.Though still getting logs saying
 compacting large row.

 We are planning to upgrade to 2.0.14 as 2.1 is not yet production ready.

 I would appreciate if you could answer the queries posted in initial mail.

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
   *From*:Brice Dutheil brice.duth...@gmail.com
 *Date*:Tue, 21 Apr, 2015 at 10:22 pm

 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 This is an intricate matter, I cannot say for sure what are good
 parameters from the wrong ones, too many things changed at once.

 However there’s many things to consider

- What is your OS ?
- Do your nodes have SSDs or mechanical drives ? How many cores do you
have ?
- Is it the CPUs or IOs that are overloaded ?
- What is the write request/s per node and cluster wide ?
- What is the compaction strategy of the tables you are writing into ?
- Are you using LOGGED BATCH statement.

 With heavy writes, it is *NOT* recommend to use LOGGED BATCH statements.

 In our 2.0.14 cluster we have experimented node unavailability due to long
 Full GC pauses. We discovered bogus legacy data, a single outlier was so
 wrong that it updated hundred thousand time the same CQL rows with
 duplicate data. Given the tables we were writing to were configured to use
 LCS, this resulted in keeping Memtables in memory long enough to promote
 them in the old generation (the MaxTenuringThreshold default is 1).
 Handling this data proved to be the thing to fix, with default GC settings
 the cluster (10 nodes) handle 39 write requests/s.

 Note Memtables are allocated on heap with 2.0.x. With 2.1.x they will be
 allocated off-heap.
 ​

 -- Brice

 On Tue, Apr 21, 2015 at 5:12 PM, Anuj Wadehra anujw_2...@yahoo.co.in
 wrote:

 Any suggestions or comments on this one??

 Thanks
 Anuj Wadhera

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
 *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Mon, 20 Apr, 2015 at 11:51 pm
 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 Small correction: we are making writes in 5 cf an reading frm one at high
 speeds.



 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 

Cluster imbalance caused due to #Num_Tokens

2015-04-21 Thread Tiwari, Tarun
Hi,

While setting up a cluster for our POC, when we installed Cassandra on the 1st
node we set num_tokens: 256, while on the next 2 nodes, which were added later, we
left it blank in cassandra.yaml.

This made our cluster an unbalanced one, with nodetool status showing 99% of the
load on one server. Now even after setting num_tokens to 256 on the other 2 nodes,
it doesn't seem to take effect. The wiki article
http://wiki.apache.org/cassandra/VirtualNodes/Balance doesn't seem to provide
steps to recover from this situation.

I read that there was a nodetool balance kind of command in Cassandra 0.7 but it
no longer exists.

UN  Node3  23.72 MB   1   0.4%   41a71df-7e6c-40ab-902f-237697eaaf3e  rack1
UN  Node2  79.35 MB   1   0.5%   98c493b-f661-491e-9d1f-1803f859528b  rack1
UN  Node1  86.93 MB   256 99.1%  a35ccca-556c-4f77-aa6d-7e3dad41ecf8  rack1

Is there something that we can do now to balance the cluster?

Regards,
Tarun
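
For reference, one commonly suggested recovery path, sketched with illustrative
paths (num_tokens is only applied when a node bootstraps, so the two
single-token nodes generally have to be rebuilt one at a time; double-check
against the documentation for your version before doing this for real):

    # on Node2, then Node3, one node at a time:
    nodetool decommission                 # run on the node that is leaving the ring
    # stop cassandra and clear its old state
    rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
    # in cassandra.yaml set: num_tokens: 256   (leave initial_token unset)
    # start cassandra again so the node re-bootstraps with 256 vnodes
    # once both nodes are back, run nodetool cleanup on Node1 to drop data it no longer owns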


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
Thanks Brice for the input,

I am confused as to how to calculate the value of concurrent_read;
the following is what I found recommended on sites and in the configuration docs.

concurrent_read : some places say 16 x number of drives, others 4 x number of
cores.
Which of the above should I pick? I have a 40-core CPU with 3 disks (non-SSD),
one used for the commitlog and the other two for data directories, and I have
3 nodes in my cluster.
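
For reference, a rough worked example with the numbers above (illustrative
only; note the actual cassandra.yaml option names are concurrent_reads and
concurrent_writes):

    16 x data drives = 16 x 2  = 32
    4  x cores       = 4  x 40 = 160   (far too optimistic for spinning disks)

    # cassandra.yaml -- a conservative sketch for 2 spinning data disks
    concurrent_reads: 32

On mechanical drives the drive-based figure is usually the binding one, so
starting low and raising it while watching iostat is the safer approach.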

I think there are tools out there that measure the max write speed to disk; I
am going to run them too, to find out the write throughput I can get and to
check that I am not trying to overachieve something. Currently we are stuck at
35 MBps.
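
For reference, a quick-and-dirty way to get a raw sequential write baseline
(the path and size are illustrative; write to a data disk, not the commitlog
disk, and remove the test file afterwards):

    dd if=/dev/zero of=/path/to/data_dir/ddtest bs=1M count=4096 oflag=direct

Dedicated tools such as fio give more realistic mixed-workload numbers, but
even dd shows whether 35 MBps is anywhere near the hardware limit.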

@Sebastian
the concurrent_compactors is at the default value of 32 for us and I think that
should be fine.
Since we have a lot of cores I thought it would be better to use
multithreaded_compaction,
but I think I will try one run with it turned off again.

The question is still:

how do I find what write load I should aim for per node such that it is
able to compact data while inserting? Is it just trial and error, or is there
a certain QPS I can target per node?

Our business case is:
-- a new client comes and we create a new keyspace for him; initially there will
be lots of new keys (I think size-tiered might work better here)
-- as time progresses we are going to update the existing keys very
frequently (I think LCS will work better here -- we are going with this
strategy for the long-term benefit)




On Wed, Apr 22, 2015 at 4:17 AM, Brice Dutheil brice.duth...@gmail.com
wrote:

 Yes I was referring to multithreaded_compaction, but just
 because we didn’t get bitten by this setting doesn’t mean it’s right,
 and the jira is a clear indication of that ;)

 @Anishek that reminds me of these settings to look at as well:

- concurrent_write and concurrent_read both need to be adapted to your
actual hardware though.

  Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD’s.

 Yes that is typically the case, SSDs are more and more common but so are
 multi-core CPUs and the trend to multiple cores is not going to stop ; just
 look at the next Intel *flagship* : Knights Landing
 http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
 = *72 cores*.

 Nowadays it is not rare to have boxes with multicore CPU, either way if
 they are not used because of some IO bottleneck there’s no reason to be
 licensed for that, and if IO is not an issue the CPUs are most probably
 next in line. While node is much more about a combination of that plus much
 more added value like the linear scaling of Cassandra. And I’m not even
 listing the other nifty integration that DSE ships in.

 But on this matter I believe we shouldn’t hijack the original thread
 purpose.

 — Brice

 On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
 sebastian.este...@datastax.com wrote:

 I want to draw a distinction between a) multithreaded compaction (the jira
 I just pointed to) and b) concurrent_compactors. I'm not clear on which one
 you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD's.
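
 For reference, a sketch of the two settings being distinguished here, in
 cassandra.yaml terms (values are illustrative):

     # cassandra.yaml -- illustrative
     concurrent_compactors: 2          # (b) how many separate compaction tasks may run at once
     multithreaded_compaction: false   # (a) parallelism inside a single compaction; best left off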


 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 http://www.datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the world’s
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Oh, thank you Sebastian for this input and the ticket reference !
 We did notice an increase in CPU usage, but kept the concurrent
 compaction low enough for our usage, by default it takes the number of
 cores. We did use a number up to 30% of our available cores. But under
 heavy load clearly CPU is the bottleneck and we 

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Yes I was referring to multithreaded_compaction, but just because
we didn’t get bitten by this setting doesn’t mean it’s right, and the
jira is a clear indication of that ;)

@Anishek that reminds me of these settings to look at as well:

   - concurrent_write and concurrent_read both need to be adapted to your
   actual hardware though.

 Cassandra is, more often than not, disk constrained though this can change
for some workloads with SSD’s.

Yes that is typically the case, SSDs are more and more common but so are
multi-core CPUs and the trend to multiple cores is not going to stop ; just
look at the next Intel *flagship* : Knights Landing
http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
= *72 cores*.

Nowadays it is not rare to have boxes with multicore CPU, either way if
they are not used because of some IO bottleneck there’s no reason to be
licensed for that, and if IO is not an issue the CPUs are most probably
next in line. While node is much more about a combination of that plus much
more added value like the linear scaling of Cassandra. And I’m not even
listing the other nifty integration that DSE ships in.

But on this matter I believe we shouldn’t hijack the original thread
purpose.

— Brice

On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
sebastian.este...@datastax.com wrote:

I want to draw a distinction between a) multithreaded compaction (the jira
 I just pointed to) and b) concurrent_compactors. I'm not clear on which one
 you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can change
 for some workloads with SSD's.


 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 http://www.datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the world’s
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Oh, thank you Sebastian for this input and the ticket reference !
 We did notice an increase in CPU usage, but kept the concurrent
 compaction low enough for our usage, by default it takes the number of
 cores. We did use a number up to 30% of our available cores. But under
 heavy load clearly CPU is the bottleneck and we have 2 CPU with 8 hyper
 threaded cores per node.

 In a related topic : I’m a bit concerned by datastax communication,
 usually people talk about IO as being the weak spot, but in our case it’s
 more about CPU. Fortunately Moore’s law doesn’t really apply anymore
 vertically; now we have multi-core processors *and* the trend is
 going that way. Yet the Datastax terms feel a bit *antiquated* and maybe a
 bit too much Oracle-y : http://www.datastax.com/enterprise-terms
 Node licensing is more appropriate for this century.
 ​

 -- Brice

 On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 http://www.datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
@Marcus I did look and that is where I got the above, but it doesn't show any
detail about moving from L0 -> L1. Any specific arguments I should try with?

On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com wrote:

 you need to look at nodetool compactionstats - there is probably a big L0
 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek






Network transfer to one node twice as others

2015-04-21 Thread Anishek Agarwal
Hello,

We are using cassandra 2.0.14 and have a cluster of 3 nodes. I have a
writer test (written in java) that runs 50 threads to populate data to a
single table in a single keyspace.

when I look at iftop, I see that the amount of network transfer
happening on two of the nodes is the same, but on one of the nodes it is almost
twice as much as on the other two. Any reason that would be the case?

Thanks
Anishek


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Marcus Eriksson
nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
sstable level information

and, it is also likely that since you get so many L0 sstables, you will be
doing size tiered compaction in L0 for a while.
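
For reference, a rough sketch of pulling the level out of each sstable (the
data path follows the default layout and is illustrative):

    # run from the Cassandra install directory; adjust the path to your setup
    for f in /var/lib/cassandra/data/test/test_bits/*-Data.db; do
        echo "$f"; tools/bin/sstablemetadata "$f" | grep -i level
    done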

On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com wrote:

 @Marcus I did look and that is where I got the above, but it doesn't show
 any detail about moving from L0 -> L1. Any specific arguments I should try
 with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big L0
 -> L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek







Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
the some_bits column has about 14-15 bytes of data per key.

On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek



Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Marcus Eriksson
you need to look at nodetool compactionstats - there is probably a big L0
-> L1 compaction going on that blocks other compactions from starting
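
For reference, the commands being discussed (keyspace/table from the original
post; the throughput change is optional and illustrative):

    nodetool compactionstats            # pending tasks and any long-running L0 -> L1 compaction
    nodetool cfstats test.test_bits     # shows the per-level sstable counts quoted above
    nodetool setcompactionthroughput 0  # optionally unthrottle compaction while testing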

On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek





LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
Hello,

I am inserting about 100 million entries via datastax-java driver to a
cassandra cluster of 3 nodes.

Table structure is as

create keyspace test with replication = {'class':
'NetworkTopologyStrategy', 'DC' : 3};

CREATE TABLE test_bits(id bigint primary key , some_bits text) with
gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
and compression={'sstable_compression' : ''};

have 75 threads that are inserting data into the above table with each
thread having non over lapping keys.

I see that the number of pending tasks via nodetool compactionstats keeps
increasing and looks like from nodetool cfstats test.test_bits has
SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

Why is compaction not kicking in ?

thanks
anishek


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Anishek Agarwal
I am on version 2.0.14, will update once I get the stats up for the writes
again.


On Tue, Apr 21, 2015 at 4:46 PM, Carlos Rolo r...@pythian.com wrote:

 Are you on version 2.1.x?

 Regards,

 Carlos Juzarte Rolo
 Cassandra Consultant

 Pythian - Love your data

 rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
 http://linkedin.com/in/carlosjuzarterolo*
 Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
 www.pythian.com

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek




 --






Re: Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Laing, Michael
Hmm - we read/write with Local Quorum always - I'd recommend that as that
is your 'consistency' defense.

We use python, so I am not familiar with the java driver - but 'file not
found' indicates something is inconsistent.
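
For reference, a minimal sketch of pinning LOCAL_QUORUM with the Java driver
mentioned above (2.1.x; the contact point, keyspace, table and query are all
illustrative) -- in cqlsh the equivalent is the CONSISTENCY LOCAL_QUORUM
command:

    // imports from com.datastax.driver.core.*
    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect("my_keyspace");

    Statement read = new SimpleStatement("SELECT payload FROM my_table WHERE id = ?", 42)
            .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
    session.execute(read);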

On Tue, Apr 21, 2015 at 12:22 PM, Walsh, Stephen stephen.wa...@aspect.com
wrote:

  Thanks for all your help Michael,



 Our data will change through the day, so data with a TTL will eventually
 get dropped, and new data will appear.

 I’d imagine the entire table may expire and start over 7-10 times a day.







 But on the GC topic, the java driver now gives this error on the query.

 I also get “Request did not complete within rpc_timeout.” in cqlsh.



 #

 com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra
 timeout during read query at consistency ONE (1 responses were required but
 only 0 replica responded)

 at
 com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.driver.core.Responses$Error.asException(Responses.java:100)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:140)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:249)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:433)
 ~[cassandra-driver-core-2.1.4.jar:na]

 Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException:
 Cassandra timeout during read query at consistency ONE (1 responses were
 required but only 0 replica responded)

 at
 com.datastax.driver.core.Responses$Error$1.decode(Responses.java:61)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.driver.core.Responses$Error$1.decode(Responses.java:38)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:168)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.shaded.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
 ~[cassandra-driver-core-2.1.4.jar:na]

 at
 com.datastax.shaded.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 ~[cassandra-driver-core-2.1.4.jar:na]

 #





 These queries were taking about 1 second to run when the gc was at 10
 seconds (same duration as the TTL).



 Also seeing a lot of this stuff in the log file



 #

 ERROR [ReadStage:71] 2015-04-21 17:11:07,597 CassandraDaemon.java (line
 199) Exception in thread Thread[ReadStage:71,5,main]

 java.lang.RuntimeException: java.lang.RuntimeException:
 java.io.FileNotFoundException:
 /var/lib/cassandra/data/keyspace/table/keyspace-table-jb-5-Data.db (No such
 file or directory)

 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 at java.lang.Thread.run(Thread.java:745)

 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException:
 /var/lib/cassandra/data/keyspace/table/keyspace-table-jb-5-Data.db

 





 Maybe this is a 1 step back 2 steps forward approach?

 Any ideas?









 *From:* Laing, Michael [mailto:michael.la...@nytimes.com]
 *Sent:* 21 April 2015 17:09

 *To:* user@cassandra.apache.org
 *Subject:* Re: Cassandra tombstones being created by updating rows with
 TTL's



 Discussions previously on the list show why this is not a problem in much
 more detail.



 If something changes in your cluster: node down, new node, etc - you run
 repair for sure.



 We also run periodic repairs prophylactically.



 But if you never delete and always ttl by the same amount, you do not have
 to worry about zombie data being resurrected - the main reason for running
 repair within gc_grace_seconds.







 On Tue, Apr 21, 2015 at 11:49 AM, Walsh, Stephen stephen.wa...@aspect.com
 wrote:

  Many thanks Michael,

 I will give these settings a go,

 How do you do your periodic nodetool repairs in this situation? From what I
 read we need to start doing this also.



 https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair





 *From:* Laing, Michael [mailto:michael.la...@nytimes.com]
 *Sent:* 21 April 2015 16:26
 *To:* user@cassandra.apache.org
 *Subject:* Re: Cassandra tombstones being created by updating rows with
 TTL's



 If you never delete except by ttl, and always write with the same ttl (or
 monotonically increasing), you can set gc_grace_seconds to 0.



 That's what we do. There have been discussions on the 

Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-21 Thread Brice Dutheil
This is an intricate matter, I cannot say for sure what are good parameters
from the wrong ones, too many things changed at once.

However there’s many things to consider

   - What is your OS ?
   - Do your nodes have SSDs or mechanical drives ? How many cores do you
   have ?
   - Is it the CPUs or IOs that are overloaded ?
   - What is the write request/s per node and cluster wide ?
   - What is the compaction strategy of the tables you are writing into ?
   - Are you using LOGGED BATCH statement.

With heavy writes, it is *NOT* recommend to use LOGGED BATCH statements.

In our 2.0.14 cluster we have experimented node unavailability due to long
Full GC pauses. We discovered bogus legacy data, a single outlier was so
wrong that it updated hundred thousand time the same CQL rows with
duplicate data. Given the tables we were writing to were configured to use
LCS, this resulted in keeping Memtables in memory long enough to promote
them in the old generation (the MaxTenuringThreshold default is 1).
Handling this data proved to be the thing to fix, with default GC settings
the cluster (10 nodes) handle 39 write requests/s.

Note Memtables are allocated on heap with 2.0.x. With 2.1.x they will be
allocated off-heap.
​

-- Brice

On Tue, Apr 21, 2015 at 5:12 PM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 Any suggestions or comments on this one??

 Thanks
 Anuj Wadhera

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
   *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Mon, 20 Apr, 2015 at 11:51 pm
 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 Small correction: we are making writes in 5 cf an reading frm one at high
 speeds.


 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
 *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Mon, 20 Apr, 2015 at 7:53 pm
 *Subject*:Handle Write Heavy Loads in Cassandra 2.0.3

 Hi,

 Recently, we discovered that  millions of mutations were getting dropped
 on our cluster. Eventually, we solved this problem by increasing the value
 of memtable_flush_writers from 1 to 3. We usually write 3 CFs
 simultaneously and one of them has 4 Secondary Indexes.

 New changes also include:
 concurrent_compactors: 12 (earlier it was default)
 compaction_throughput_mb_per_sec: 32(earlier it was default)
 in_memory_compaction_limit_in_mb: 400 ((earlier it was default 64)
 memtable_flush_writers: 3 (earlier 1)

 After making the above changes, our write-heavy workload scenarios started
 giving promotion failed exceptions in the gc logs.

 We have done JVM tuning and Cassandra config changes to solve this:

 MAX_HEAP_SIZE=12G (Increased heap from 8G to 12G to reduce fragmentation)
 HEAP_NEWSIZE=3G

 JVM_OPTS=$JVM_OPTS -XX:SurvivorRatio=2 (We observed that even at
 SurvivorRatio=4, our survivor space was getting 100% utilized under heavy
 write load and we thought that minor collections were directly promoting
 objects to Tenured generation)

 JVM_OPTS=$JVM_OPTS -XX:MaxTenuringThreshold=20 (Lots of objects were
 moving from Eden to Tenured on each minor collection..may be related to
 medium life objects related to Memtables and compactions as suggested by
 heapdump)

 JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20
 JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions
 JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity
 JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs
 JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768
 JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark
 JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3
 JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000 //though it's default value
 JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways
 JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled
 JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking
 JVM_OPTS=$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70 (to avoid
 concurrent failures we reduced value)

 Cassandra config:
 compaction_throughput_mb_per_sec: 24
 memtable_total_space_in_mb: 1000 (to make memtable flushes frequent; the default
 is 1/4 of the heap, which creates more long-lived objects)

 Questions:
 1. Why increasing memtable_flush_writers and
 in_memory_compaction_limit_in_mb caused promotion failures in JVM? Does
 more memtable_flush_writers mean more memtables in memory?

 2. Still, objects are getting promoted at high speed to the Tenured space. CMS
 is running on the Old gen every 4-5 minutes under heavy write load. Around
 750+ minor collections of up to 300ms happened in 45 mins. Do you see any
 problems with the new JVM tuning and Cassandra config? Does the justification
 given for those changes sound logical? Any suggestions?
 3. What is the best practice for reducing heap fragmentation/promotion
 failure when allocation and promotion rates are high?

 Thanks
 Anuj







Re: CQL 3.x Update ...USING TIMESTAMP...

2015-04-21 Thread Tyler Hobbs
On Mon, Apr 20, 2015 at 4:02 PM, Sachin Nikam skni...@gmail.com wrote:

 #1. We have 2 data centers located close by with plans to expand to more
 data centers which are even further away geographically.
 #2. How will this impact light weight transactions when there is high
 level of network contention for cross data center traffic.


If you are only expecting updates to a given document from one DC, then you
could use LOCAL_SERIAL for the LWT operations.  If you can't do that, then
LWT are probably not a great option for you.


 #3. Do you know of any real examples where companies have used light
 weight transactions in a multi-data center traffic.


I don't know who's doing that off the top of my head, but I imagine they're
using LOCAL_SERIAL.
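
For reference, a minimal sketch of a lightweight transaction pinned to
LOCAL_SERIAL (table, columns and values are illustrative; the serial
consistency level is set through the driver, here the DataStax Java driver):

    -- CQL: a conditional update, i.e. a lightweight transaction
    UPDATE documents SET body = 'v2' WHERE doc_id = 123 IF version = 1;

    // Java driver: keep the Paxos round within the local data center
    statement.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);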


-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Is 2.1.5 ready for upgrade?

2015-04-21 Thread Robert Coli
On Tue, Apr 21, 2015 at 2:25 PM, Dikang Gu dikan...@gmail.com wrote:

 We have some issues with streaming in 2.1.2. We find that there are a lot
 of patches in 2.1.5. Is it ready for upgrade?


I personally would not run either version in production at this time, but
if forced, would prefer 2.1.5 over 2.1.2.

=Rob


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Oh, thank you Sebastian for this input and the ticket reference !
We did notice an increase in CPU usage, but kept the concurrent compaction
low enough for our usage, by default it takes the number of cores. We did
use a number up to 30% of our available cores. But under heavy load clearly
CPU is the bottleneck and we have 2 CPU with 8 hyper threaded cores per
node.

In a related topic : I’m a bit concerned by datastax communication, usually
people talk about IO as being the weak spot, but in our case it’s more
about CPU. Fortunately Moore’s law doesn’t really apply anymore
vertically; now we have multi-core processors *and* the trend is going
that way. Yet the Datastax terms feel a bit *antiquated* and maybe a bit too
much Oracle-y : http://www.datastax.com/enterprise-terms
Node licensing is more appropriate for this century.
​

-- Brice

On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 http://www.datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the world’s
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 I’m not sure I get everything about the storm stuff, but my understanding of
 LCS is that the compaction count may increase the more one updates data (that’s
 why I was wondering about duplicate primary keys).

 Another option is that the code is sending too many write requests/s to
 the cassandra cluster. I don’t know how many nodes you have, but the fewer
 nodes there are, the more compaction each one has to do.
 Also I’d look at the CPU / load, maybe the config is too *restrictive*,
 look at the following properties in the cassandra.yaml

 - compaction_throughput_mb_per_sec: the default value is 16, and you
 may want to increase it, but be careful on mechanical drives; if you are
 already on SSDs, IO is rarely the issue. We have 64 (with SSDs).
 - multithreaded_compaction: false by default; we enabled it.

 Compaction threads are niced, so they shouldn’t be much of an issue for
 serving production r/w requests. But you never know, always keep an eye on
 IO and CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Sorry, I take that back: we will modify different keys across threads, not
 the same key. Our storm topology is going to use field grouping to route
 updates for the same keys to the same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Brice : I don't think so, as I am giving each thread a specific key
 range with no overlaps, so this does not seem to be the case now. However we
 will have to test the case where we modify the same key across threads -- do
 you think that will cause a problem? As far as I have read, LCS is
 recommended for such cases. Should I just switch back to
 SizeTieredCompactionStrategy?


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
  wrote:

 Could it that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you
 will be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where I got the above, but it doesn't
 show any detail about moving from L0 -> L1. Any specific arguments I should
 try with?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a
 big L0 -> L1 compaction going on that blocks other compactions from
 starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
  wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal 
 anis...@gmail.com wrote:

 Hello,

 I am inserting about 100 million entries via 

RE: Cassandra tombstones being created by updating rows with TTL's

2015-04-21 Thread Walsh, Stephen
Thanks for all your help Michael,

Our data will change through the day, so data with a TTL will eventually get 
dropped, and new data will appear.
I’d imagine the entire table may expire and start over 7-10 times a day.



But on the GC topic, the java driver now gives this error on the query.
I also get “Request did not complete within rpc_timeout.” in cqlsh.

#
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout 
during read query at consistency ONE (1 responses were required but only 0 
replica responded)
at 
com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69)
 ~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.driver.core.Responses$Error.asException(Responses.java:100) 
~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:140)
 ~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:249) 
~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:433) 
~[cassandra-driver-core-2.1.4.jar:na]
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra 
timeout during read query at consistency ONE (1 responses were required but 
only 0 replica responded)
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:61) 
~[cassandra-driver-core-2.1.4.jar:na]
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:38) 
~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:168) 
~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.shaded.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
 ~[cassandra-driver-core-2.1.4.jar:na]
at 
com.datastax.shaded.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 ~[cassandra-driver-core-2.1.4.jar:na]
#


These queries were taking about 1 second to run when the gc was at 10 seconds 
(same duration as the TTL).

Also seeing a lot of this stuff in the log file

#
ERROR [ReadStage:71] 2015-04-21 17:11:07,597 CassandraDaemon.java (line 199) 
Exception in thread Thread[ReadStage:71,5,main]
java.lang.RuntimeException: java.lang.RuntimeException: 
java.io.FileNotFoundException: 
/var/lib/cassandra/data/keyspace/table/keyspace-table-jb-5-Data.db (No such 
file or directory)
at 
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2008)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
/var/lib/cassandra/data/keyspace/table/keyspace-table-jb-5-Data.db



Maybe this is a 1 step back 2 steps forward approach?
Any ideas?




From: Laing, Michael [mailto:michael.la...@nytimes.com]
Sent: 21 April 2015 17:09
To: user@cassandra.apache.org
Subject: Re: Cassandra tombstones being created by updating rows with TTL's

Discussions previously on the list show why this is not a problem in much more 
detail.

If something changes in your cluster: node down, new node, etc - you run repair 
for sure.

We also run periodic repairs prophylactically.

But if you never delete and always ttl by the same amount, you do not have to 
worry about zombie data being resurrected - the main reason for running repair 
within gc_grace_seconds.



On Tue, Apr 21, 2015 at 11:49 AM, Walsh, Stephen 
stephen.wa...@aspect.commailto:stephen.wa...@aspect.com wrote:
Many thanks Michael,
I will give these settings a go,
How do you do your periodic nodetool repairs in this situation? From what I read
we need to start doing this also.

https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
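
For reference, a minimal sketch of scheduling that (the keyspace and timing are
illustrative; stagger the schedule so that nodes do not repair at the same
time, and keep each node's repair interval well inside gc_grace_seconds):

    # crontab entry on each node -- weekly primary-range repair
    0 2 * * 0  nodetool repair -pr my_keyspace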


From: Laing, Michael 
[mailto:michael.la...@nytimes.commailto:michael.la...@nytimes.com]
Sent: 21 April 2015 16:26
To: user@cassandra.apache.orgmailto:user@cassandra.apache.org
Subject: Re: Cassandra tombstones being created by updating rows with TTL's

If you never delete except by ttl, and always write with the same ttl (or 
monotonically increasing), you can set gc_grace_seconds to 0.
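
For reference, a minimal sketch of that pattern in CQL (keyspace, table and the
10-second TTL are illustrative):

    -- nothing is ever deleted explicitly and every write carries the same TTL
    ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 0;
    INSERT INTO my_keyspace.my_table (id, payload) VALUES (1, 'x') USING TTL 10;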

That's what we do. There have been discussions on the list over the last few 
years re this topic.

ml

On Tue, Apr 21, 2015 at 11:14 AM, Walsh, Stephen 
stephen.wa...@aspect.commailto:stephen.wa...@aspect.com wrote:
We were chatting to Jon Haddena about a week ago about our tombstone issue 
using Cassandra 2.0.14
To Summarize

We have a 3 node cluster with replication-factor=3 and compaction = SizeTiered