Re: load distribution that I can't explain

kaveh minooie Tue, 12 Sep 2017 09:52:00 -0700

Hi kurt, thanks for responding.

I understand that that query is very resource-intensive. My question was why I only see its effect on the same node. Considering that I have a replication factor of 2, I was hoping to see this load evenly distributed among those 2 nodes. That query runs hundreds of times on each run, but the load always seems to be on node2. That is what I am trying to figure out.



On 09/11/2017 06:25 PM, kurt greaves wrote:

Your first query will effectively have to perform table scans to satisfy what you are asking. If a query requires ALLOW FILTERING to be specified, it means that Cassandra can't really optimise that query in any way and it's going to have to query a lot of data (all of it...) to satisfy the result. Because you've only specified one attribute of the partitioning key, Cassandra doesn't know where to look for that data, and will need to query all of it to find partitions matching that restriction.
If you want to select distinct you should probably do it in a distributed manner using token range scans, however this is generally not a good use case for Cassandra. If you really need to know your partitioning keys you should probably store them in a separate cache.
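
For illustration, here is a rough sketch of what splitting a SELECT DISTINCT into token range scans could look like. It only builds the CQL strings and does not talk to a cluster. It assumes the default Murmur3Partitioner (tokens from -2^63 to 2^63-1); the table name my_ks.my_table and the key columns key_a and key_b are hypothetical placeholders, not names from this thread.

```python
# Sketch only: assumes Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_ring(n):
    """Split the full token range into n contiguous (start, end) slices."""
    total = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (total * i) // n for i in range(n + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table, key_cols, n):
    """Build one SELECT DISTINCT per slice, restricted by token().

    key_cols must list every partition key column, in order.
    """
    cols = ", ".join(key_cols)
    queries = []
    for i, (start, end) in enumerate(split_token_ring(n)):
        # The first slice includes its lower bound; later slices exclude it
        # so that adjacent slices never overlap.
        lower = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT DISTINCT {cols} FROM {table} "
            f"WHERE token({cols}) {lower} {start} AND token({cols}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    # Hypothetical table and partition key columns.
    for q in range_queries("my_ks.my_table", ["key_a", "key_b"], 4):
        print(q)
```

Each generated statement can then be run independently (and in parallel) by the client, so every scan touches only the replicas that own that slice of the ring instead of one coordinator handling the whole table. The number of slices is a tuning choice; more slices mean smaller scans per query.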


--
Kaveh Minooie

