Re: parallel processing - splitting data

2017-01-19 Thread Benjamin Roth
If you have 4 nodes with RF 4, then all data is on every node. So you can just slice the whole token range into 4 pieces and let each node process one slice. Determining local ranges also only helps if you read with CL_ONE.

Re: parallel processing - splitting data

2017-01-19 Thread Frank Hughes
I have tried to retrieve the token range and slice it in 4, but the response I get for the following code is different on each node: TokenRange[] tokenRanges = unwrapTokenRanges(metadata.getTokenRanges(keyspaceName, localHost)).toArray(new TokenRange[0]); On each node, the 1024 token ranges are

Re: parallel processing - splitting data

2017-01-19 Thread Benjamin Roth
I meant the global whole token range, which is -(2^64/2) to ((2^64) / 2 - 1). I remember there are classes that already generate the right slices, but I don't know by heart which one it was.
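
The slicing discussed in this thread (cutting the full token range, -(2^63) to 2^63 - 1, into equal pieces) can be sketched as follows. This is only a minimal sketch, assuming the cluster uses Murmur3Partitioner; the class name TokenSlices and the keyspace, table, and key names (my_ks, my_table, id) are hypothetical and do not come from the thread. BigInteger is used because the range holds 2^64 tokens, which overflows a long.

```java
import java.math.BigInteger;

class TokenSlices {
    // Murmur3Partitioner token bounds: -(2^63) and 2^63 - 1.
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Returns n inclusive {start, end} pairs that together cover the whole token range. */
    static long[][] split(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        BigInteger total = MAX.subtract(MIN).add(BigInteger.ONE); // 2^64 tokens
        BigInteger count = BigInteger.valueOf(n);
        long[][] slices = new long[n][2];
        for (int i = 0; i < n; i++) {
            // Slice i starts at MIN + floor(total * i / n) ...
            BigInteger start = MIN.add(total.multiply(BigInteger.valueOf(i)).divide(count));
            // ... and ends one token before the start of slice i + 1.
            BigInteger next = MIN.add(total.multiply(BigInteger.valueOf(i + 1L)).divide(count));
            slices[i][0] = start.longValueExact();
            slices[i][1] = next.subtract(BigInteger.ONE).longValueExact();
        }
        return slices;
    }

    public static void main(String[] args) {
        // Hypothetical keyspace, table, and partition key; one query per node.
        for (long[] s : split(4)) {
            System.out.printf(
                "SELECT * FROM my_ks.my_table WHERE token(id) >= %d AND token(id) <= %d;%n",
                s[0], s[1]);
        }
    }
}
```

Each of the four processes would take one slice and run its query against the local node. In a real client the bounds would normally be passed as bound values of a prepared statement rather than formatted into the query string.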

parallel processing - splitting data

2017-01-19 Thread Frank Hughes
Hello there, I'm running a 4-node cluster of Cassandra 3.9 with a replication factor of 4. I want to be able to run a Java process on each node that selects only 25% of the data on that node, so I can process all of the data in parallel across the nodes. What is the best way to do this with the java

Re: parallel processing - splitting data

2017-01-19 Thread siddharth verma
Hi Frank, You could try this: https://github.com/siddv29/cfs. I have processed 1.2 billion rows in 480 seconds with just 20 threads on the client side. C* 3.0.9, Nodes = 6, RF = 3. Have a go at it. You might be surprised. Regards,

JVM state determined to be unstable. Exiting forcefully. what is Java Stability Inspector ?? why it is stopping DSE?

2017-01-19 Thread Pranay akula
For the last few days I have been seeing DSE shut down on some of the nodes in the Cassandra cluster due to the error below, and I need to kill the Java process and restart the DSE service. I have cross-checked reads, writes, and compactions, and nothing looks suspicious, but I am seeing full GC pauses on these

Re: Unreliable JMX metrics

2017-01-19 Thread Sun, Guan
Thanks for your reply, Malte. I replaced the nodes and the new nodes have new IPs, but I have removed the dead nodes using nodetool. Does Cassandra still keep the info of dead nodes for 72 hours even though I have removed them? Thanks, Guan

Re: JVM state determined to be unstable. Exiting forcefully. what is Java Stability Inspector ?? why it is stopping DSE?

2017-01-19 Thread Alain RODRIGUEZ
Hi Pranay, > what can be the reason for this It can be due to a JVM / GC misconfiguration or to some abnormal activity in Cassandra. Often, GC issues are a consequence and not the root cause of an issue in Cassandra. > how to debug that ?? how to fine grain why on those particular nodes this

Re: Unreliable JMX metrics

2017-01-19 Thread kurt Greaves
Yes. You will likely still be able to see the nodes in nodetool gossipinfo.

3.0.8 AssertionError

2017-01-19 Thread sfesc...@gmail.com
The first time a client connects to the cluster and sends a bunch of inserts in parallel I get this assertion error. All subsequent inserts (including just retrying the first requests) work fine. The only assert in PrecisionTime is checking if the millis argument is <= the current millis. What

Re: JVM state determined to be unstable. Exiting forcefully. what is Java Stability Inspector ?? why it is stopping DSE?

2017-01-19 Thread Pranay akula
What I have observed is 2-3 old-gen GCs in the 1-2 minutes before the OOM, which I rarely see, and I have seen hinted handoffs accumulate on the nodes that went down, and mutation drops as well. I really don't know how to analyse the hprof file; is there any guide or blog that can help me analyse it? our