[ 
https://issues.apache.org/jira/browse/ACCUMULO-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Tubbs resolved ACCUMULO-3710.
-----------------------------------------
    Resolution: Abandoned

Closing this stale issue. If this is still a problem, please create a new issue 
or PR at https://github.com/apache/accumulo

> Scanning with many singleton ranges crashes tserver
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3710
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3710
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>    Affects Versions: 1.6.1
>            Reporter: Shana Hutchison
>            Priority: Major
>
> Setup: single-node standalone 1.6.1 Accumulo instance.
> Use case: scan ~1M individual rows, scattered across a ~15GB table.  
> The following steps crash the TabletServer:
> 1. Gather a List of Range objects, each one a singleton range spanning an 
> entire row.
> 2. Create a BatchScanner with one read thread.
> 3. Set the ranges via BatchScanner.setRanges()
> 4. Start iterating through the scanner.
> One workaround is to batch the reads into groups of ~10k ranges.
> Comment from Josh Elser:
> {quote}
> Taking a quick glance at the code, it looks like this would be a good place 
> to do some optimization in the BatchScanner's impl 
> (TabletServerBatchReaderImpl). The BatchScanner will bin the ranges to the 
> tablets and the servers hosting those tablets. Normally, this would be spread 
> out, but, in your single server case, all 1M rows would go to a single 
> TabletServer in one RPC call.
> I'm guessing a good optimization here would be to check the size of a batch 
> of Ranges for a single tabletserver, and when above a certain threshold, 
> split the batch in half and try to reprocess each half (the recursion would 
> naturally keep splitting until we get down to some high-watermark).
> Point being, if your client VM constructed the Ranges without issue, the 
> BatchScanner impl should be smart enough to not knock over a TabletServer.
> {quote}
> Verified that this causes an OOME, per tserver_localhost.out:
> {quote}
> #
> # java.lang.OutOfMemoryError: Java heap space
> # -XX:OnOutOfMemoryError="kill -9 %p"
> #   Executing /bin/sh -c "kill -9 12833"...
> {quote}
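
For anyone hitting the same problem, the two ideas in the description (fixed-size batches of ~10k ranges on the client, and the recursive halving suggested in the quoted comment) can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual TabletServerBatchReaderImpl code and not a committed fix: the class name RangeBatching and the methods partition and splitInHalves are hypothetical, and the lists are generic so the sketch runs without Accumulo on the classpath. In real client code the elements would be Range objects and each batch would be handed to BatchScanner.setRanges().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Accumulo.
final class RangeBatching {

    // Reporter's workaround: cut the full list into consecutive
    // groups of at most batchSize elements (e.g. ~10k ranges each).
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Halving idea from the quoted comment: if a batch is above the
    // threshold, split it in half and reprocess each half recursively.
    static <T> void splitInHalves(List<T> batch, int maxSize, List<List<T>> out) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        if (batch.isEmpty()) {
            return;
        }
        if (batch.size() <= maxSize) {
            out.add(new ArrayList<>(batch));
            return;
        }
        int mid = batch.size() / 2;
        splitInHalves(batch.subList(0, mid), maxSize, out);
        splitInHalves(batch.subList(mid, batch.size()), maxSize, out);
    }

    public static void main(String[] args) {
        // Integers stand in for Range objects in this sketch.
        List<Integer> ranges = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            ranges.add(i);
        }
        for (List<Integer> batch : partition(ranges, 10)) {
            // With Accumulo, this is where one would call
            // scanner.setRanges(batch) and then iterate the scanner.
            System.out.println("batch of " + batch.size() + " ranges");
        }
    }
}
```

Fixed-size partitioning gives predictable batch sizes on the client; the halving variant produces roughly equal batches and would fit more naturally where the per-server batch size is only known after binning, as the quoted comment suggests. The right threshold depends on tserver heap size and is not established in this issue.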



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
