[jira] [Created] (ACCUMULO-3710) Scanning with many singleton ranges crashes tserver

Dylan Hutchison (JIRA) Thu, 02 Apr 2015 22:15:07 -0700

Dylan Hutchison created ACCUMULO-3710:
-----------------------------------------


             Summary: Scanning with many singleton ranges crashes tserver
                 Key: ACCUMULO-3710
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3710
             Project: Accumulo
          Issue Type: Bug
          Components: client, tserver
    Affects Versions: 1.6.1
            Reporter: Dylan Hutchison


Setup: single-node standalone 1.6.1 Accumulo instance.
Use case: scan ~1M individual rows, scattered across a ~15GB table.  
The following steps crash the TabletServer:

1. Gather a List of Range objects, each one a singleton range spanning an 
entire row.
2. Create a BatchScanner with one read thread.
3. Set the ranges via BatchScanner.setRanges()
4. Start iterating through the scanner.

One workaround is to batch the reads into groups of ~10k ranges.
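
A minimal sketch of that workaround, assuming the ranges are already held in a List on the client. The class name RangeBatches and the method partition are hypothetical helpers, not part of the Accumulo API, and the helper is written generically so it runs without Accumulo on the classpath:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeBatches {
    /**
     * Splits items into consecutive batches of at most batchSize elements,
     * preserving the original order. Hypothetical helper for illustration.
     */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch returned by partition(ranges, 10000) would then be passed to BatchScanner.setRanges() and fully iterated before moving on to the next batch, so no single request carries all ~1M ranges.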

Comment from Josh Elser:
{quote}
Taking a quick glance at the code, it looks like this would be a good place to 
do some optimization in the BatchScanner's impl (TabletServerBatchReaderImpl). 
The BatchScanner will bin the ranges to the tablets and the servers hosting 
those tablets. Normally, this would be spread out, but, in your single server 
case, all 1M rows would all go to a single TabletServer in one RPC call.

I'm guessing a good optimization here would be to check the size of a batch of 
Ranges for a single tabletserver, and when above a certain threshold, split the 
batch in half and try to reprocess each half (the recursion would naturally 
keep splitting until we get down to some high-watermark).

Point being, if your client VM constructed the Ranges without issue, the 
BatchScanner impl should be smart enough to not knock over a TabletServer.
{quote}
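
The halving idea in the comment above could look roughly like the following. This is only a sketch of the recursion, not the actual TabletServerBatchReaderImpl code; the class name SplitByHalving, the method split, and the maxSize threshold are all hypothetical, and the element type is left generic so the example is self-contained:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitByHalving {
    /**
     * Recursively halves batch until every piece has at most maxSize
     * elements, appending the pieces to out in their original order.
     * Hypothetical helper for illustration only.
     */
    public static <T> void split(List<T> batch, int maxSize, List<List<T>> out) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be at least 1");
        }
        if (batch.size() <= maxSize) {
            // Small enough: emit as one piece (skip empty input).
            if (!batch.isEmpty()) {
                out.add(new ArrayList<>(batch));
            }
            return;
        }
        // Too large: split in half and reprocess each half.
        int mid = batch.size() / 2;
        split(batch.subList(0, mid), maxSize, out);
        split(batch.subList(mid, batch.size()), maxSize, out);
    }
}
```

With halving, the pieces end up between roughly half the threshold and the threshold itself, so the threshold acts as an upper bound on the size of any one request rather than an exact batch size.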

Verified to cause an OOME, per tserver_localhost.out:
{quote}
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 12833"...
{quote}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
