Re: Scanning with many singleton ranges?

Josh Elser Thu, 02 Apr 2015 15:35:47 -0700

That seems perfectly reasonable to me, IMO. I'm surprised to hear the tserver crashed.

Taking a quick glance at the code, it looks like this would be a good place to do some optimization in the BatchScanner's impl (TabletServerBatchReaderImpl). The BatchScanner will bin the ranges to the tablets and the servers hosting those tablets. Normally, this would be spread out, but, in your single-server case, all 1M ranges would go to a single TabletServer in one RPC call.

I'm guessing a good optimization here would be to check the size of a batch of Ranges for a single tabletserver, and when above a certain threshold, split the batch in half and try to reprocess each half (the recursion would naturally keep splitting until we get down to some high-watermark).
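
To make the splitting idea concrete, here is a rough sketch in Java. It is not the actual TabletServerBatchReaderImpl code; it works on a generic list standing in for the Ranges binned to one tabletserver, and the class name BatchSplitter, the method splitToThreshold, and the maxBatchSize parameter are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {

    /**
     * Recursively halves a batch until every piece holds at most
     * maxBatchSize items. maxBatchSize is a hypothetical threshold,
     * not an existing Accumulo setting.
     */
    static <T> List<List<T>> splitToThreshold(List<T> batch, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        List<List<T>> out = new ArrayList<>();
        split(batch, maxBatchSize, out);
        return out;
    }

    private static <T> void split(List<T> batch, int maxBatchSize, List<List<T>> out) {
        if (batch.isEmpty()) {
            return; // nothing to send
        }
        if (batch.size() <= maxBatchSize) {
            // Small enough: this piece would become one request.
            out.add(new ArrayList<>(batch));
            return;
        }
        // Too large: split in half and reprocess each half.
        int mid = batch.size() / 2;
        split(batch.subList(0, mid), maxBatchSize, out);
        split(batch.subList(mid, batch.size()), maxBatchSize, out);
    }
}
```

With 1,000,000 items and a threshold of 10,000, this halves seven times and yields 128 batches of roughly 7,800 items each, with the original order preserved across batches.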

Point being, if your client VM constructed the Ranges without issue, the BatchScanner impl should be smart enough to not knock over a TabletServer.

What was the reason the tserver died? OOME? Was there anything at the end of the log files or in the .out/.err files?


- Josh

Dylan Hutchison wrote:

A friend of mine has a use case where he wants to scan ~1M individual
rows, scattered across a ~15GB table.  He performed the following:

1. Gather a List of Range objects, each one a singleton range spanning
an entire row.
2. Create a BatchScanner with one read thread.
3. Set the ranges via BatchScanner.setRanges()
4. Start iterating through the scanner.

Performing these steps crashed the TabletServer for my friend (haven't
had time to verify it myself yet). We're using a single-node standalone
1.6.1 Accumulo instance.

Is this a bad way to use Accumulo?  I advised my friend to batch the
reads into groups of ~10k ranges and see if that helps.  I wanted to
check with the community and see if we're doing something weird.  If the
behavior should have worked, I can try to put together a test case
reproducing it, that creates a table with many entries and then scans
with many ranges.
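
The client-side workaround of batching the reads into groups of ~10k ranges could look roughly like the following Java sketch. It uses a generic list in place of Accumulo Range objects so that it stands alone; the class name RangeChunker and the method chunk are hypothetical, and the step of passing each chunk to BatchScanner.setRanges() and iterating the results is only indicated in a comment.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeChunker {

    /**
     * Splits items into consecutive chunks of at most chunkSize elements.
     * In the scenario above, each chunk would be handed to
     * BatchScanner.setRanges() and scanned before moving to the next.
     */
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

With ~1M ranges and a chunk size of 10,000, this gives about 100 separate scans instead of one very large request.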

Thanks,
Dylan Hutchison
