Re: load distribution that I can't explain

2017-09-13 Thread kaveh minooie

I am using RoundRobin

cluster = Cluster.builder()...( socket stuff, pool option stuff ... )
    .withLoadBalancingPolicy( new RoundRobinPolicy() )
    .addContactPoints( hosts )
    .build();



On 09/13/2017 03:02 AM, kurt greaves wrote:
Are you using a load balancing policy? That sounds like you are only 
using node2 as a coordinator.


--
Kaveh Minooie

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: load distribution that I can't explain

2017-09-13 Thread kurt greaves
Are you using a load balancing policy? That sounds like you are only using
node2 as a coordinator.


Re: load distribution that I can't explain

2017-09-12 Thread kaveh minooie

Hi kurt, thanks for responding.

I understand that this query is very resource-intensive. My question is 
why I only see its effect on the same node. Considering that I have a 
replication factor of 2, I was hoping to see this load evenly 
distributed between those 2 nodes. The query runs hundreds of times on 
each run, but the load always seems to be on node2. That is what I 
am trying to figure out.



On 09/11/2017 06:25 PM, kurt greaves wrote:
Your first query will effectively have to perform table scans to satisfy 
what you are asking. If a query requires ALLOW FILTERING to be 
specified, it means that Cassandra can't really optimise that query in 
any way and it's going to have to query a lot of data (all of it...) to 
satisfy the result.
Because you've only specified one attribute of the partitioning key, 
Cassandra doesn't know where to look for that data, and will need to 
query all of it to find partitions matching that restriction.


If you want to select distinct, you should probably do it in a 
distributed manner using token range scans; however, this is generally 
not a good use case for Cassandra. If you really need to know your 
partitioning keys, you should probably store them in a separate cache.




--
Kaveh Minooie




Re: load distribution that I can't explain

2017-09-11 Thread kurt greaves
Your first query will effectively have to perform table scans to satisfy
what you are asking. If a query requires ALLOW FILTERING to be specified,
it means that Cassandra can't really optimise that query in any way and
it's going to have to query a lot of data (all of it...) to satisfy the
result.
Because you've only specified one attribute of the partitioning key,
Cassandra doesn't know where to look for that data, and will need to query
all of it to find partitions matching that restriction.

If you want to select distinct, you should probably do it in a distributed
manner using token range scans; however, this is generally not a good use
case for Cassandra. If you really need to know your partitioning keys, you
should probably store them in a separate cache.
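
As a rough illustration of the token range scan idea, here is a minimal sketch in plain Java that splits the token ring into contiguous subranges and builds one SELECT DISTINCT statement per subrange, so that each statement can be executed separately (and in parallel) instead of as one cluster-wide scan. It assumes the default Murmur3Partitioner, whose tokens span -2^63 to 2^63 - 1. The thread does not name the table or its partition key columns, so "ks.tbl" and "k1, k2" are hypothetical placeholders, as are the class and method names.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of the Cassandra driver.
final class TokenRangeScan {
    // Token bounds, assuming Murmur3Partitioner: [-2^63, 2^63 - 1].
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Splits the whole ring into `parts` contiguous {start, end} pairs. */
    static List<long[]> split(int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        // BigInteger avoids overflow: the ring is wider than a signed long.
        BigInteger width = MAX.subtract(MIN);
        BigInteger n = BigInteger.valueOf(parts);
        List<long[]> ranges = new ArrayList<>();
        BigInteger start = MIN;
        for (int i = 1; i <= parts; i++) {
            BigInteger end = (i == parts)
                    ? MAX
                    : MIN.add(width.multiply(BigInteger.valueOf(i)).divide(n));
            ranges.add(new long[] { start.longValueExact(), end.longValueExact() });
            start = end; // the next range begins where this one ends
        }
        return ranges;
    }

    /**
     * Builds the CQL for one subrange. The lower bound is exclusive, except
     * for the first subrange, so that adjacent subranges do not overlap.
     */
    static String query(String table, String keyColumns,
                        long start, long end, boolean firstRange) {
        String lowerOp = firstRange ? ">=" : ">";
        return "SELECT DISTINCT " + keyColumns + " FROM " + table
                + " WHERE token(" + keyColumns + ") " + lowerOp + " " + start
                + " AND token(" + keyColumns + ") <= " + end;
    }

    public static void main(String[] args) {
        List<long[]> ranges = split(8);
        for (int i = 0; i < ranges.size(); i++) {
            long[] r = ranges.get(i);
            // "ks.tbl" and "k1, k2" are placeholders for the real schema.
            System.out.println(query("ks.tbl", "k1, k2", r[0], r[1], i == 0));
        }
    }
}
```

Each generated statement restricts the scan to one slice of the ring, so the work can be spread across coordinators and replicas rather than landing on one node. Splitting along the cluster's actual token ownership would localize the reads further, but that needs ring metadata from the driver and is left out of this sketch.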
