Re: Skip Scan

James Heather Mon, 05 Oct 2015 02:09:26 -0700

I'll leave someone else to comment on the Phoenix specifics.

I recall from some experiments on MySQL that if you have a massive load of IDs to pass, it's quicker if you split them into batches of some reasonable (but still large) size, and that for this, you would want to sort them first. I don't think it made any difference whether the IDs were sorted within an individual SQL statement, but you want to split into batches that cover disjoint ranges, so the easiest is to sort the whole lot first and then split.
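As a rough illustration of the sort-then-split idea, here is a minimal Python sketch. The helper names (disjoint_batches, in_clause_query), the table and column names, and the "?" placeholder style are all assumptions made for the example, not anything from MySQL or Phoenix, and a sensible batch size would have to be found by experiment.

```python
from typing import Iterable, List


def disjoint_batches(ids: Iterable[int], batch_size: int) -> List[List[int]]:
    """Sort the IDs, then cut them into consecutive chunks.

    Because the whole list is sorted before splitting, every chunk
    covers a range of values that does not overlap any other chunk.
    Duplicates are dropped first so a value cannot straddle two chunks.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(set(ids))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def in_clause_query(table: str, column: str, batch: List[int]) -> str:
    """Build a parameterised IN query for one batch (hypothetical helper).

    The table and column names are interpolated directly, so they must
    come from trusted code, never from user input. The ID values are
    bound separately through the "?" placeholders.
    """
    placeholders = ", ".join("?" for _ in batch)
    return f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    for batch in disjoint_batches([9, 1, 5, 3, 7, 1], batch_size=2):
        print(in_clause_query("my_table", "id", batch), batch)
```

Each batch would then be run as its own statement, with the batch values bound as parameters.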

This might be MySQL-specific, though. I think each query was being turned into a range scan, from the lowest ID in the IN clause to the highest, which was why it was useful to get the ranges disjoint and not too huge.


James

On 05/10/15 09:49, Sumit Nigam wrote:

Hi,
Would it make any difference if I were to pass non-sorted IDs (secondary indexed) to a huge IN clause? I assume that the skip scan optimization would work in either case.
Also, can anyone let me know if there is some limit on the number of IDs in a large IN clause beyond which I get diminishing returns? Or is it plainly dependent on specific workloads and the memory of the region servers?
Thanks,
Sumit
