A friend of mine has a use case where he wants to scan ~1M individual rows,
scattered across a ~15GB table.  He performed the following:

1. Gather a List of Range objects, each one a singleton range spanning an
entire row.
2. Create a BatchScanner with one read thread.
3. Set the ranges via BatchScanner.setRanges().
4. Start iterating through the scanner.

These steps crashed the TabletServer for my friend (I haven't had time to
verify it myself yet). We're using a single-node, standalone Accumulo 1.6.1
instance.

Is this a bad way to use Accumulo?  I advised my friend to batch the reads
into groups of ~10k ranges and see if that helps.  I wanted to check with
the community and see if we're doing something weird.  If this should have
worked, I can try to put together a test case that reproduces it by creating
a table with many entries and then scanning it with many ranges.
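
In case it helps the discussion, here is a rough sketch of the batching I
suggested, in plain Java.  The names RangeBatcher and partition are made up
for illustration and are not part of Accumulo.  The idea would be to hand
each batch to BatchScanner.setRanges() in turn and drain the scanner before
moving on, rather than setting all ~1M ranges at once.  I have not tried
this against a real instance, so treat it as an assumption about what might
help, not a confirmed fix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an Accumulo class): splits a list into
// consecutive batches of at most batchSize elements each.
final class RangeBatcher {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<T>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for the real List of Range objects: 25,000 row IDs.
        List<String> rows = new ArrayList<String>();
        for (int i = 0; i < 25000; i++) {
            rows.add("row" + i);
        }
        List<List<String>> batches = partition(rows, 10000);
        // In the real code, each batch would be passed to
        // BatchScanner.setRanges() and scanned before the next one.
        System.out.println("batches: " + batches.size());
    }
}
```

With the real data, the element type would be Range instead of String, and
the batch size (~10k here) is just a guess that would need tuning.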

Thanks,
Dylan Hutchison
