Unfortunately that’s pretty much what I’m doing now, and the results are large 
enough that pulling them back and sorting them causes fairly dramatic GC issues.
If I could get them in sorted order I would no longer need to retain them; I 
could just process them and discard them, eliminating my GC issues.
I think the way I’ll end up working around this in the short term is to pull 
pages of data from a batch scanner, sort those, then combine the paged results. 
That should be manageable.
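
A minimal sketch of that paging workaround, in plain Java with no Accumulo 
dependency. The names PagedSortMerge, sortedPages, merge and pageSize are 
hypothetical, and the unordered iterator merely stands in for whatever a batch 
scanner returns. It cuts the stream into fixed-size pages, sorts each page, 
then does a k-way merge that hands elements to a consumer in order. Note that 
this sketch still keeps the sorted pages in memory; spilling them elsewhere is 
left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

final class PagedSortMerge {

    /** Cursor over one sorted page: its current head plus the remaining elements. */
    private static final class Cursor<T> {
        T head;
        final Iterator<T> rest;

        Cursor(Iterator<T> it) {
            this.rest = it;
            this.head = it.next(); // caller guarantees the page is non-empty
        }
    }

    /** Cuts an unordered stream into pages of at most pageSize elements and sorts each page. */
    static <T extends Comparable<? super T>> List<List<T>> sortedPages(Iterator<T> unordered, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> pages = new ArrayList<>();
        List<T> page = new ArrayList<>(pageSize);
        while (unordered.hasNext()) {
            page.add(unordered.next());
            if (page.size() == pageSize) {
                Collections.sort(page);
                pages.add(page);
                page = new ArrayList<>(pageSize);
            }
        }
        if (!page.isEmpty()) { // trailing partial page
            Collections.sort(page);
            pages.add(page);
        }
        return pages;
    }

    /** K-way merge of already sorted pages; each element goes to the sink in global order. */
    static <T extends Comparable<? super T>> void merge(List<List<T>> pages, Consumer<? super T> sink) {
        Comparator<Cursor<T>> byHead = (a, b) -> a.head.compareTo(b.head);
        PriorityQueue<Cursor<T>> heap = new PriorityQueue<>(byHead);
        for (List<T> p : pages) {
            if (!p.isEmpty()) {
                heap.add(new Cursor<>(p.iterator()));
            }
        }
        while (!heap.isEmpty()) {
            Cursor<T> c = heap.poll();   // page with the smallest head
            sink.accept(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();  // advance and re-insert
                heap.add(c);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> unordered = Arrays.asList(5, 1, 4, 2, 3, 9, 0);
        List<List<Integer>> pages = sortedPages(unordered.iterator(), 3);
        merge(pages, x -> System.out.println(x)); // process and discard
    }
}
```

The merge only ever holds one cursor per page in the heap, so the consumer can 
process and drop each element as it arrives.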

Rob Povey

From: Keith Turner <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, October 28, 2015 at 8:04 AM
To: "[email protected]" <[email protected]>
Subject: Re: Is there a sensible way to do this? Sequential Batch Scanner

Will the results always fit into memory?  If so, you could put the results from 
the batch scanner into an ArrayList and sort it.
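
A minimal sketch of that suggestion, written against plain Java interfaces so 
it runs without Accumulo. SortedScanResults and collectSorted are hypothetical 
names. The assumption is that the scanner can be iterated as key/value entries 
whose keys are Comparable, which is how a BatchScanner is normally consumed; 
here a list of String/Integer entries stands in for it.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SortedScanResults {

    /** Drains every entry into an ArrayList, then sorts the list by key. */
    static <K extends Comparable<? super K>, V> List<Map.Entry<K, V>> collectSorted(
            Iterable<? extends Map.Entry<K, V>> scanner) {
        List<Map.Entry<K, V>> results = new ArrayList<>();
        for (Map.Entry<K, V> entry : scanner) {
            results.add(entry); // everything is retained until the sort completes
        }
        results.sort(Map.Entry.comparingByKey());
        return results;
    }

    public static void main(String[] args) {
        // Stand-in for unordered batch scanner output.
        List<Map.Entry<String, Integer>> unordered = new ArrayList<>();
        unordered.add(new AbstractMap.SimpleEntry<>("row3", 3));
        unordered.add(new AbstractMap.SimpleEntry<>("row1", 1));
        unordered.add(new AbstractMap.SimpleEntry<>("row2", 2));
        for (Map.Entry<String, Integer> e : collectSorted(unordered)) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

This only works while the whole result set fits in memory, since nothing can 
be released before the sort has finished.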

On Tue, Oct 27, 2015 at 6:21 PM, Rob Povey <[email protected]> wrote:
What I want is something that behaves like a BatchScanner (i.e. takes a 
collection of Ranges in a single RPC), but preserves the scan ordering.
I understand this would greatly impact performance, but in my case I can 
manually partition my request on the client, and send one request per tablet.
I can’t use scanners, because in some cases I have tens of thousands of 
non-consecutive ranges.
If I use a single-threaded BatchScanner, and only request data from a single 
tablet, am I guaranteed ordering?
This appears to work correctly in my small tests (albeit slower than a single 
1-thread BatchScanner call), but I don’t really want to have to rely on it if 
the semantics aren’t guaranteed.
If not, is there another “efficient” way to do this?

Thanks

Rob Povey

