Can you write the results back to a temporary Accumulo table?

On Oct 28, 2015 4:00 PM, "Rob Povey" <[email protected]> wrote:
> Unfortunately that’s pretty much what I’m doing now, and the results are
> large enough that pulling them back and sorting them causes fairly dramatic
> GC issues.
> If I could get them in sorted order I would no longer need to retain them;
> I could just process them and discard them, eliminating my GC issues.
> I think the way I’ll end up working around this in the short term is to
> pull pages of data from a batch scanner, sort those, then combine the paged
> results. That should be manageable.
>
> Rob Povey
>
> From: Keith Turner <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Wednesday, October 28, 2015 at 8:04 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Is there a sensible way to do this? Sequential Batch Scanner
>
> Will the results always fit into memory? If so, you could put the results
> from the batch scanner into an ArrayList and sort it.
>
> On Tue, Oct 27, 2015 at 6:21 PM, Rob Povey <[email protected]> wrote:
>
>> What I want is something that behaves like a BatchScanner (i.e. takes a
>> collection of Ranges in a single RPC), but preserves the scan ordering.
>> I understand this would greatly impact performance, but in my case I can
>> manually partition my request on the client, and send one request per
>> tablet.
>> I can’t use scanners, because in some cases I have tens of thousands of
>> non-consecutive ranges.
>> If I use a single-threaded BatchScanner, and only request data from a
>> single tablet, am I guaranteed ordering?
>> This appears to work correctly in my small tests (albeit slower than a
>> single 1-thread batch scanner call), but I don’t really want to have to
>> rely on it if the semantics aren’t guaranteed.
>> If not, is there another “efficient” way to do this?
>>
>> Thanks
>>
>> Rob Povey
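
For what it's worth, the "sort each page, then combine the paged results" workaround described above could be sketched roughly as a k-way merge over already-sorted pages. This is only an illustration under assumptions: the class name PagedMerge and the method mergeSortedPages are hypothetical, nothing here is Accumulo API, and it assumes each page has already been pulled from the batch scanner and sorted on the client. With Accumulo entries the comparator would presumably compare on the entry's Key.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

// Hypothetical helper, not part of Accumulo: merges pages that are each
// already sorted and hands elements to a consumer in global sorted order.
public class PagedMerge {

    // One cursor per page: the current head element plus the rest of the page.
    private static final class Cursor<T> {
        T head;
        final Iterator<T> rest;

        Cursor(T head, Iterator<T> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    public static <T> void mergeSortedPages(List<List<T>> pages,
                                            Comparator<? super T> order,
                                            Consumer<? super T> sink) {
        // The heap is ordered by each cursor's current head element.
        PriorityQueue<Cursor<T>> heap =
            new PriorityQueue<>((a, b) -> order.compare(a.head, b.head));

        // Seed the heap with the first element of every non-empty page.
        for (List<T> page : pages) {
            Iterator<T> it = page.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor<>(it.next(), it));
            }
        }

        // Repeatedly emit the smallest head, then advance that page's cursor.
        while (!heap.isEmpty()) {
            Cursor<T> c = heap.poll();
            sink.accept(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                heap.add(c);
            }
        }
    }
}
```

Because the merged elements go to a consumer rather than into one combined list, each one can be processed and dropped as it comes out. The pages themselves still have to be held somewhere while the merge runs, so the page size remains the knob for memory pressure.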
