[ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310373#comment-14310373
 ] 

Lars Hofhansl commented on HBASE-11544:
---------------------------------------

Somewhat related: RPC is inefficient because it keeps the network idle while 
the client is producing the request and the server is producing the response.
Scanning in HBase is particularly dumb about this with caching. We fetch a 
buffer's worth of results, then the client works through that buffer, and only 
when it reaches the end does the client say "Oh shit" and request the next 
buffer. We could start loading the next buffer from the server while the client 
is still working through the current one (sort of a poor man's streaming).
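
To make the idea concrete, here is a rough sketch of the prefetching, assuming a hypothetical `PrefetchingScanner` class and a `Supplier` that stands in for one scan RPC (an empty list means the scan is done). None of these names are HBase's actual client API; this only illustrates overlapping the next fetch with the client's work on the current buffer.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical sketch; this is not the HBase ClientScanner API. */
class PrefetchingScanner<T> implements Iterator<T> {
    // Stands in for one scan RPC; returning an empty list signals end of scan.
    private final Supplier<List<T>> fetchBatch;
    private Iterator<T> current = Collections.emptyIterator();
    // The batch currently being loaded in the background.
    private CompletableFuture<List<T>> next;
    private boolean exhausted = false;

    PrefetchingScanner(Supplier<List<T>> fetchBatch) {
        this.fetchBatch = fetchBatch;
        this.next = CompletableFuture.supplyAsync(fetchBatch);
    }

    @Override
    public boolean hasNext() {
        while (!current.hasNext() && !exhausted) {
            // Blocks only if the background fetch has not finished yet.
            List<T> batch = next.join();
            if (batch.isEmpty()) {
                exhausted = true;
            } else {
                current = batch.iterator();
                // Start loading the following batch right away, so the
                // fetch overlaps with the caller working through this one.
                next = CompletableFuture.supplyAsync(fetchBatch);
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

Only one fetch is ever outstanding, so the supplier is never called concurrently. The cost is that up to two buffers are held in memory at once, which matters given the OOME concern in this issue.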

Not sure that fits into this work, but thought I'd mention it.


> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Jonathan Lawlor
>            Priority: Critical
>              Labels: beginner
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Server-side, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass a certain size threshold, 
> rather than keep accumulating until we OOME.
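
A minimal sketch of the size check described in the issue, assuming a hypothetical `SizeBoundedBatcher` helper where each `byte[]` stands in for one cell and `maxBytes` is an illustrative threshold. The names and the size accounting are assumptions for illustration, not HBase's server-side scanner code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; names are illustrative, not HBase internals. */
final class SizeBoundedBatcher {
    private SizeBoundedBatcher() {}

    /**
     * Collects up to maxRows cells, but returns early once the accumulated
     * size reaches maxBytes. At least one cell is returned whenever any
     * remain, so the scan always makes progress.
     */
    static List<byte[]> nextBatch(Iterator<byte[]> cells, int maxRows, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long accumulated = 0;
        while (cells.hasNext() && batch.size() < maxRows) {
            byte[] cell = cells.next();
            batch.add(cell);
            accumulated += cell.length;
            if (accumulated >= maxBytes) {
                break; // past the threshold: hand back what we have
            }
        }
        return batch;
    }
}
```

With caching set to 1000 and large cells, a loop like this would return a short batch instead of accumulating all 1000 rows; the client then simply asks again for the rest.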



