[jira] [Commented] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME

Mikhail Antonov (JIRA) Fri, 25 Jul 2014 00:44:31 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074175#comment-14074175
 ]


Mikhail Antonov commented on HBASE-11544:
-----------------------------------------

bq.  Anything from 1k to 128k should be good as chunk size. 64k seems fine.

bq. or simply never see a cell if it is too big to fit into this size.

[~lhofhansl] as the max cell size is now 10 MB IIRC, it sounds like a robust 
solution should be able to split a cell and pass along just a portion of the 
byte array that represents the cell value?
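
To make the cell-splitting idea concrete, here is a minimal sketch of cutting 
a cell value into fixed-size chunks. The class and method names 
(CellValueChunker, split) are hypothetical and are not part of the HBase API; 
the 64k default just follows the chunk size quoted above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an HBase class: splits one cell value into
// read-only views of at most chunkSize bytes each, without copying.
final class CellValueChunker {
    // 64k, the chunk size suggested earlier in this thread.
    static final int DEFAULT_CHUNK_SIZE = 64 * 1024;

    static List<ByteBuffer> split(byte[] value, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<ByteBuffer> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < value.length) {
            // The last chunk may be shorter than chunkSize.
            int length = Math.min(chunkSize, value.length - offset);
            // wrap() shares the backing array; slice() rebases position to 0.
            chunks.add(ByteBuffer.wrap(value, offset, length)
                    .slice()
                    .asReadOnlyBuffer());
            offset += length;
        }
        return chunks;
    }
}
```

With this sketch a 10 MB value would go out as 160 chunks of 64k, and an empty 
value yields no chunks at all. How the client reassembles the chunks is left 
open here.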

Thinking about [~enis]'s note on the MVCC read point, I think yeah, sending 
partial rows might be a much bigger change (though throttling at the byte 
level would definitely be more efficient than at the row level).

To address the OOM issue, as a first cut maybe we can have two thresholds on 
the HRS side: one for the total amount of memory (a % of HRS heap size?) that 
scanner buffers may take across all clients, and a second for the max cache 
size of an individual scanner?

The first threshold would be used to reject new scanners when the HRS feels 
it's about to OOM because too many clients are trying to connect, and the 
second would prevent one client from eating up all the memory by opening 
scanners over big rows/cells. Thoughts?
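
A rough sketch of how the two thresholds could fit together, just to make the 
proposal easier to discuss. All names here (ScannerMemoryBudget, 
tryAdmitScanner, canBuffer) are hypothetical and not existing HBase code, and 
reserving the full per-scanner limit up front is an assumption of this sketch 
(a deliberately conservative one).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not HBase code: one global budget shared by all
// scanners on a region server, plus a cap on each scanner's buffer.
final class ScannerMemoryBudget {
    private final long globalLimitBytes;
    private final long perScannerLimitBytes;
    private final AtomicLong reservedBytes = new AtomicLong();

    ScannerMemoryBudget(long globalLimitBytes, long perScannerLimitBytes) {
        if (globalLimitBytes <= 0 || perScannerLimitBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.globalLimitBytes = globalLimitBytes;
        this.perScannerLimitBytes = perScannerLimitBytes;
    }

    // Derives the global limit from a fraction of the JVM's max heap.
    static ScannerMemoryBudget fromHeapFraction(double fraction,
                                                long perScannerLimitBytes) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        return new ScannerMemoryBudget((long) (maxHeap * fraction),
                                       perScannerLimitBytes);
    }

    // Threshold 1: admit a new scanner only if its worst-case buffer still
    // fits in the global budget; otherwise the caller rejects the scanner.
    boolean tryAdmitScanner() {
        while (true) {
            long current = reservedBytes.get();
            if (perScannerLimitBytes > globalLimitBytes - current) {
                return false;
            }
            if (reservedBytes.compareAndSet(current,
                                            current + perScannerLimitBytes)) {
                return true;
            }
        }
    }

    // Gives the reservation back when an admitted scanner is closed.
    void releaseScanner() {
        reservedBytes.addAndGet(-perScannerLimitBytes);
    }

    // Threshold 2: true if one more cell fits in this scanner's buffer.
    // On false, the server would return what it has gathered so far.
    boolean canBuffer(long bufferedBytes, long nextCellBytes) {
        return nextCellBytes <= perScannerLimitBytes - bufferedBytes;
    }
}
```

Note that a single cell bigger than the per-scanner limit never passes 
canBuffer, so the caller would still have to return at least one cell per 
call (or split it) to avoid never seeing that cell.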

[~stack] - could you give some details on the avg/max size of a row/cell in 
your tests, just to estimate what the default values of those thresholds 
might be?



> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>              Labels: noob
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass out a certain size threshold 
> rather than keep accumulating till we OOME.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
