Hello there,

I'm running a 4-node Cassandra 3.9 cluster with a replication factor of
4.

I want to run a Java process on each node that selects only 25% of the
data held on that node, so that I can process all of the data in parallel
across the nodes.

What is the best way to do this with the Java driver?

I was assuming I could retrieve the token ranges for each node and page
through the data using these ranges, but this includes the replicated data.
I was hoping there was a way of selecting only the data that a node is
primarily responsible for, avoiding the replicated data.
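
To make the idea concrete, here is a minimal sketch of what I mean by the
ranges a node is primarily responsible for. It does not use the driver at
all: the ring is a plain map from token to node name, and the class and
method names (PrimaryRanges, primaryRanges, Range) are my own invention for
illustration. It assumes Murmur3Partitioner, so tokens are signed 64-bit
values, and it assumes each token's primary range is (previous token,
token], with the range that wraps around the ring split in two.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of any driver API.
final class PrimaryRanges {

    /** Token range where start is exclusive and end is inclusive. */
    public static final class Range {
        public final long start;
        public final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "(" + start + ", " + end + "]";
        }
    }

    /**
     * Maps each node to the ranges for which it owns the closing token.
     * The ring is keyed by token, so TreeMap iterates it in ring order.
     */
    public static Map<String, List<Range>> primaryRanges(TreeMap<Long, String> ring) {
        Map<String, List<Range>> result = new LinkedHashMap<>();
        if (ring.isEmpty()) {
            return result;
        }
        long previous = ring.lastKey(); // predecessor of the smallest token
        boolean first = true;
        for (Map.Entry<Long, String> entry : ring.entrySet()) {
            long token = entry.getKey();
            List<Range> owned =
                    result.computeIfAbsent(entry.getValue(), k -> new ArrayList<>());
            if (first) {
                // The range ending at the smallest token wraps around the ring,
                // so split it into (largest, MAX] and (MIN, smallest].
                if (previous < Long.MAX_VALUE) {
                    owned.add(new Range(previous, Long.MAX_VALUE));
                }
                if (token > Long.MIN_VALUE) {
                    owned.add(new Range(Long.MIN_VALUE, token));
                }
                first = false;
            } else {
                owned.add(new Range(previous, token));
            }
            previous = token;
        }
        return result;
    }
}
```

Each Range would then become one paged query of the form
SELECT ... FROM my_table WHERE token(pk) > ? AND token(pk) <= ?
(my_table and pk are placeholders), bound to start and end, and run only by
the process on the node that owns the range. In a real setup the ring
contents would presumably come from the driver's cluster metadata rather
than being filled in by hand.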

Many thanks for any help and guidance,

Frank Hughes
