[ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376919#comment-14376919
 ] 

Jonathan Lawlor commented on HBASE-11544:
-----------------------------------------

[~apurtell] [~lhofhansl] Thanks for bringing up these discussion points. I have 
included some notes below on the design decisions made here, and it would be 
great to hear your thoughts on them.

bq. If scanning millions of rows, millions of objects?

Yes.

bq. The size estimations are done up in RSRpcServices

To avoid out-of-memory errors that resulted from very large rows, the size 
calculation was pushed all the way down into StoreScanner to be performed 
between cells (rather than between rows in RSRpcServices). This means that we 
may reach the size limit in the middle of a row and form a partial result.
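
As a rough illustration (not the code in the patch), checking the size limit 
between cells could look like the sketch below. The Cell class, the nextCells 
method, and the maxResultSize parameter are all hypothetical stand-ins, and 
the real StoreScanner does considerably more than this.

```java
import java.util.ArrayList;
import java.util.List;

final class PartialRowSketch {
    /** Hypothetical stand-in for a cell; only its heap size matters here. */
    static final class Cell {
        final String qualifier;
        final long heapSize;

        Cell(String qualifier, long heapSize) {
            this.qualifier = qualifier;
            this.heapSize = heapSize;
        }
    }

    /**
     * Copies cells of a single row into out, starting at index start.
     * The accumulated size is checked after every cell, so the method can
     * stop in the middle of the row.
     *
     * @return true if the row was cut short (a partial result was formed)
     */
    static boolean nextCells(List<Cell> row, int start, long maxResultSize, List<Cell> out) {
        long accumulated = 0;
        for (int i = start; i < row.size(); i++) {
            Cell cell = row.get(i);
            out.add(cell);
            accumulated += cell.heapSize;
            boolean cellsRemain = i < row.size() - 1;
            // Size check between cells rather than between rows.
            if (accumulated >= maxResultSize && cellsRemain) {
                return true;
            }
        }
        return false; // reached the end of the row
    }

    public static void main(String[] args) {
        List<Cell> row = new ArrayList<>();
        row.add(new Cell("q1", 40));
        row.add(new Cell("q2", 40));
        row.add(new Cell("q3", 40));

        List<Cell> out = new ArrayList<>();
        boolean partial = nextCells(row, 0, 70, out);
        System.out.println(out.size() + " cells returned, partial=" + partial);
    }
}
```

A caller that receives true would resume the same row from the next cell 
index on its following call, instead of assuming that the row is complete.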

With the size calculation pushed all the way down to StoreScanner, we needed 
some way of communicating upwards to the RegionScanner and RSRpcServices when a 
partial result is formed (i.e. we reach the size limit in the middle of a row). 
At first, the intention was to NOT change the return type from boolean. 
However, the implementation with the boolean return type ended up requiring 
many repetitions of the size calculation. 

With the boolean return type, the RegionScanner and RSRpcServices both needed 
to calculate the result size (in addition to the calculation that had been 
pushed down to StoreScanner). RegionScanner and RSRpcServices needed to do this 
in order to check whether or not the size limit had been reached, because a 
boolean that only indicates whether more values exist cannot communicate this 
upwards. The problems with this approach were:
* The size calculation was being repeated too much
* The state was not explicit enough. Cells were being returned from 
StoreScanner and it was up to the caller of StoreScanner#next to figure out why 
these were the cells being returned (size limit reached? batch limit reached?). 
The only way for the state to bubble up from the StoreScanner was to repeat all 
of the logic that made the StoreScanner return those Cells.
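
To make the duplication concrete, here is a deliberately simplified sketch. 
The three methods only mimic the layering of StoreScanner, RegionScanner and 
RSRpcServices; their names, signatures, and the sizePasses counter are 
hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BooleanReturnSketch {
    /** Counts how many times a layer had to work out the result size. */
    static int sizePasses = 0;

    static long sizeOf(List<Long> cellSizes) {
        sizePasses++;
        long total = 0;
        for (long size : cellSizes) {
            total += size;
        }
        return total;
    }

    /** Innermost layer: enforces the limit between cells. */
    static boolean storeScannerNext(List<Long> row, long limit, List<Long> out) {
        sizePasses++; // the running total below is the first size pass
        long total = 0;
        for (long cell : row) {
            out.add(cell);
            total += cell;
            if (total >= limit) {
                break;
            }
        }
        // "More values exist" is all the caller learns; not why we stopped.
        return true;
    }

    /** Middle layer: has to re-measure out to learn whether the limit was hit. */
    static boolean regionScannerNext(List<Long> row, long limit, List<Long> out) {
        boolean more = storeScannerNext(row, limit, out);
        if (sizeOf(out) >= limit) {
            // A real scanner would stop pulling further rows here.
        }
        return more;
    }

    /** Outer layer: re-measures yet again to decide whether to respond now. */
    static boolean rpcShouldRespondNow(List<Long> row, long limit, List<Long> out) {
        regionScannerNext(row, limit, out);
        return sizeOf(out) >= limit;
    }

    public static void main(String[] args) {
        List<Long> out = new ArrayList<>();
        boolean respond = rpcShouldRespondNow(Arrays.asList(40L, 40L, 40L), 70, out);
        System.out.println("respond=" + respond + ", size passes=" + sizePasses);
    }
}
```

In this toy version the same cells are measured once per layer, and each 
layer has to rediscover for itself that the size limit was the reason the 
inner scanner stopped.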

NextState was introduced to make this communication more explicit and avoid 
replication of size calculations. 
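
For readers who have not opened the patch, the idea can be sketched roughly 
as follows. This is a simplified illustration under my own assumptions: the 
enum constants, the fields, and the storeNext and rpcDecision helpers are 
hypothetical and may not match the NextState class in the patch.

```java
final class ScanStateSketch {
    /** Why the scanner stopped (constant names are illustrative only). */
    enum State { MORE_VALUES, NO_MORE_VALUES, SIZE_LIMIT_REACHED, BATCH_LIMIT_REACHED }

    /** Carries the reason for stopping together with the size already computed. */
    static final class NextState {
        final State state;
        final long resultSize;

        NextState(State state, long resultSize) {
            this.state = state;
            this.resultSize = resultSize;
        }
    }

    /** Innermost layer: the only place where cell sizes are added up. */
    static NextState storeNext(long[] cellSizes, long maxResultSize, int batchLimit) {
        if (cellSizes.length == 0) {
            return new NextState(State.NO_MORE_VALUES, 0);
        }
        long size = 0;
        for (int i = 0; i < cellSizes.length; i++) {
            size += cellSizes[i];
            if (i == cellSizes.length - 1) {
                break; // row complete, no limit hit mid-row
            }
            if (size >= maxResultSize) {
                return new NextState(State.SIZE_LIMIT_REACHED, size);
            }
            if (i + 1 >= batchLimit) {
                return new NextState(State.BATCH_LIMIT_REACHED, size);
            }
        }
        return new NextState(State.MORE_VALUES, size);
    }

    /** Outer layer: reads the state instead of repeating the calculation. */
    static String rpcDecision(NextState next) {
        switch (next.state) {
            case SIZE_LIMIT_REACHED:
                return "return partial result of " + next.resultSize + " bytes";
            case BATCH_LIMIT_REACHED:
                return "return batch";
            case NO_MORE_VALUES:
                return "scan finished";
            default:
                return "keep scanning";
        }
    }

    public static void main(String[] args) {
        NextState next = storeNext(new long[] {40, 40, 40}, 70, 10);
        System.out.println(next.state + ": " + rpcDecision(next));
    }
}
```

The trade-off raised in the discussion is visible here: every call allocates 
a small object, but the callers no longer need to re-measure the cells or 
guess why the scanner returned.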

Any alternative approaches are welcomed. If there is a way to keep the boolean 
return type and avoid replication of the size calculation, we could certainly 
try that alternative. Or, if repeating the size calculation is less costly than 
the NextState, perhaps we should go down that route.

> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Jonathan Lawlor
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-11544-branch_1_0-v1.patch, 
> HBASE-11544-branch_1_0-v2.patch, HBASE-11544-v1.patch, HBASE-11544-v2.patch, 
> HBASE-11544-v3.patch, HBASE-11544-v4.patch, HBASE-11544-v5.patch, 
> HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v6.patch, 
> HBASE-11544-v7.patch, HBASE-11544-v8-branch-1.patch, HBASE-11544-v8.patch, 
> gc.j.png, hits.j.png, mean.png, net.j.png
>
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass out a certain size threshold 
> rather than keep accumulating till we OOME.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
