That seems perfectly reasonable to me. I'm surprised to hear the
tserver crashed.
Taking a quick glance at the code, it looks like this would be a good
place to do some optimization in the BatchScanner's impl
(TabletServerBatchReaderImpl). The BatchScanner will bin the ranges to
the tablets and the servers hosting those tablets. Normally, this would
be spread out, but in your single-server case, all 1M rows would go
to a single TabletServer in one RPC call.
I'm guessing a good optimization here would be to check the size of a
batch of Ranges for a single tabletserver, and when above a certain
threshold, split the batch in half and try to reprocess each half (the
recursion would naturally keep splitting until every piece is at or
below that threshold).
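To make the idea concrete, here is a rough sketch of the halving logic in plain Java. It is not the actual TabletServerBatchReaderImpl code: the class BatchSplitter, the method splitToThreshold, and the maxBatchSize parameter are all hypothetical, and a generic List<T> stands in for the Ranges binned to one tserver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Accumulo.
public class BatchSplitter {

    // Recursively halves a batch until every piece holds at most
    // maxBatchSize elements. Element order is preserved.
    public static <T> List<List<T>> splitToThreshold(List<T> batch, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        List<List<T>> out = new ArrayList<>();
        split(batch, maxBatchSize, out);
        return out;
    }

    private static <T> void split(List<T> batch, int maxBatchSize, List<List<T>> out) {
        if (batch.isEmpty()) {
            return; // nothing to send
        }
        if (batch.size() <= maxBatchSize) {
            out.add(new ArrayList<>(batch)); // small enough: one RPC's worth
            return;
        }
        // Too large: split in half and reprocess each half.
        int mid = batch.size() / 2;
        split(batch.subList(0, mid), maxBatchSize, out);
        split(batch.subList(mid, batch.size()), maxBatchSize, out);
    }
}
```

Each resulting piece would then go out as its own request instead of one giant RPC. What the threshold should be is an open question; the sketch just takes it as a parameter.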
Point being, if your client VM constructed the Ranges without issue, the
BatchScanner impl should be smart enough to not knock over a TabletServer.
What was the reason the tserver died? OOME? Was there anything at the
end of the log files or in the .out/.err files?
- Josh
Dylan Hutchison wrote:
A friend of mine has a use case where he wants to scan ~1M individual
rows, scattered across a ~15GB table. He performed the following:
1. Gather a List of Range objects, each one a singleton range spanning
an entire row.
2. Create a BatchScanner with one read thread.
3. Set the ranges via BatchScanner.setRanges()
4. Start iterating through the scanner.
Performing these steps crashed the TabletServer for my friend (haven't
had time to verify it myself yet). We're using a single-node standalone
1.6.1 Accumulo instance.
Is this a bad way to use Accumulo? I advised my friend to batch the
reads into groups of ~10k ranges and see if that helps. I wanted to
check with the community and see if we're doing something weird. If the
behavior should have worked, I can try to put together a test case
that reproduces it by creating a table with many entries and then
scanning it with many ranges.
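The client-side batching workaround mentioned above could look roughly like the sketch below. It is plain Java with no Accumulo dependency: the class RangeChunker and the method chunk are hypothetical names, and a generic List<T> stands in for the List of Range objects. Each chunk would then be passed to BatchScanner.setRanges() and fully iterated before moving on to the next one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Accumulo.
public class RangeChunker {

    // Splits items into consecutive chunks of at most chunkSize elements.
    // The last chunk may be smaller than the others.
    public static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

With ~1M ranges and a chunk size of 10,000, this yields about 100 chunks, so the scan becomes about 100 smaller requests rather than one large one.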
Thanks,
Dylan Hutchison