[ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Lawlor updated HBASE-11544:
------------------------------------
    Release Note: 
Results returned from RPC calls may now be returned as partials.

When is a Result marked as a partial? 
When the server must stop the scan because the max size limit has been reached. 
This means that the LAST Result returned within the ScanResult's Result array 
may be marked as a partial if the scan's max size limit caused the scan to stop 
in the middle of a row.
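
The rule above can be sketched with a small toy model. This is only an 
illustration, not HBase code: the class SizeLimitedScanSketch, the Chunk type, 
and the nextResponse method are hypothetical names, and rows are reduced to 
arrays of cell sizes in bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a size-limited scan response; not HBase's actual classes. */
final class SizeLimitedScanSketch {

    /** The cells of one row that made it into a response. */
    static final class Chunk {
        final int rowIndex;     // which input row the cells came from
        final int cellCount;    // how many of the row's cells were included
        final boolean partial;  // true if the row still has cells on the server

        Chunk(int rowIndex, int cellCount, boolean partial) {
            this.rowIndex = rowIndex;
            this.cellCount = cellCount;
            this.partial = partial;
        }
    }

    /**
     * Builds one response. rowCellSizes[r][c] is the size in bytes of cell c
     * of row r. Accumulation stops once the running total reaches
     * maxResultSize, even if that happens in the middle of a row.
     */
    static List<Chunk> nextResponse(int[][] rowCellSizes, long maxResultSize) {
        List<Chunk> response = new ArrayList<>();
        long accumulated = 0;
        for (int row = 0; row < rowCellSizes.length; row++) {
            int[] cells = rowCellSizes[row];
            int taken = 0;
            while (taken < cells.length && accumulated < maxResultSize) {
                accumulated += cells[taken];
                taken++;
            }
            if (taken > 0) {
                // Only the chunk cut off by the limit can be partial,
                // and it is always the last one added.
                response.add(new Chunk(row, taken, taken < cells.length));
            }
            if (accumulated >= maxResultSize) {
                break;
            }
        }
        return response;
    }
}
```

In this model every chunk except possibly the last describes a whole row, which 
matches the statement that only the LAST Result may be marked as a partial.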

Incompatible Change: The return type of InternalScanners and RegionScanners has 
been changed from boolean to NextState.
The previous boolean return value can be accessed via NextState#hasMoreValues().
NextState provides more context as to what happened inside the scanner.
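
For callers that looped on the old boolean, the migration might look roughly 
like the sketch below. It uses simplified stand-ins rather than the real HBase 
types: the NextState class here models only hasMoreValues() (the one method 
named in this note), and ToyScanner and drain are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Stand-in types for illustration; not the HBase classes of the same name. */
final class NextStateSketch {

    /** Minimal stand-in: the real type carries more context than this. */
    static final class NextState {
        private final boolean moreValues;

        NextState(boolean moreValues) {
            this.moreValues = moreValues;
        }

        /** Equivalent of the boolean that next() used to return. */
        boolean hasMoreValues() {
            return moreValues;
        }
    }

    /** Hypothetical scanner that emits one value per call to next(). */
    static final class ToyScanner {
        private final Iterator<String> values;

        ToyScanner(List<String> values) {
            this.values = values.iterator();
        }

        NextState next(List<String> out) {
            if (values.hasNext()) {
                out.add(values.next());
            }
            return new NextState(values.hasNext());
        }
    }

    /** Old style was: do { more = scanner.next(out); } while (more); */
    static List<String> drain(ToyScanner scanner) {
        List<String> out = new ArrayList<>();
        NextState state;
        do {
            state = scanner.next(out);
        } while (state.hasMoreValues());
        return out;
    }
}
```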

The caching default has been changed to Integer.MAX_VALUE.
This value works together with the new maxResultSize value from HBASE-12976 
(defaults to 2MB).
Results are now returned from the server on the basis of size rather than 
number of rows.
This makes better use of the network, since row size varies among tables.
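
How the two limits interact can be shown with a little arithmetic. The sketch 
below is an assumption-laden toy, not HBase code: BatchLimitSketch and 
rowsInResponse are hypothetical names, and it counts whole rows only, ignoring 
partial Results.

```java
/** Toy arithmetic for a row-count limit combined with a size limit. */
final class BatchLimitSketch {

    /**
     * Returns how many rows go into one response when both the caching
     * (row count) limit and the maxResultSize (byte) limit apply. The
     * response is closed as soon as either limit is reached.
     */
    static int rowsInResponse(long[] rowSizes, int caching, long maxResultSize) {
        long bytes = 0;
        int rows = 0;
        while (rows < rowSizes.length && rows < caching && bytes < maxResultSize) {
            bytes += rowSizes[rows];
            rows++;
        }
        return rows;
    }
}
```

With caching at Integer.MAX_VALUE the row-count check never triggers, so the 
size limit alone decides where a response ends: few rows per response for 
tables with large rows, many for tables with small rows.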

Protobuf models have changed for Result, ScanRequest, and ScanResponse to 
support new partial Results

Partial Results should be invisible to the application layer unless 
Scan#setAllowPartials is set.

Scan#setAllowPartials has been added to allow the application to request to see 
the partial Results returned by the server rather than have the ClientScanner 
form the complete Result prior to returning it to the application
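
The difference between the two modes can be sketched as follows. This is a 
simplified model under stated assumptions, not the ClientScanner 
implementation: PartialStitchSketch, Piece, and deliver are hypothetical names, 
cells are plain strings, and a partial flag on a piece is taken to mean that 
more cells for the same row follow.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of combining partial pieces into complete rows on the client. */
final class PartialStitchSketch {

    /** One Result-like piece as received from the server. */
    static final class Piece {
        final String row;
        final List<String> cells;
        final boolean partial; // true if more cells for this row follow

        Piece(String row, List<String> cells, boolean partial) {
            this.row = row;
            this.cells = cells;
            this.partial = partial;
        }
    }

    /**
     * With allowPartials the pieces are handed over untouched. Otherwise
     * consecutive pieces of the same row are merged, so the caller only
     * ever sees complete rows.
     */
    static List<Piece> deliver(List<Piece> fromServer, boolean allowPartials) {
        if (allowPartials) {
            return new ArrayList<>(fromServer);
        }
        List<Piece> complete = new ArrayList<>();
        String pendingRow = null;
        List<String> pendingCells = new ArrayList<>();
        for (Piece piece : fromServer) {
            if (pendingRow != null && !pendingRow.equals(piece.row)) {
                // The row changed, so the buffered row is finished.
                complete.add(new Piece(pendingRow, pendingCells, false));
                pendingCells = new ArrayList<>();
            }
            pendingRow = piece.row;
            pendingCells.addAll(piece.cells);
            if (!piece.partial) {
                complete.add(new Piece(pendingRow, pendingCells, false));
                pendingRow = null;
                pendingCells = new ArrayList<>();
            }
        }
        if (pendingRow != null) {
            // End of input: flush whatever is still buffered.
            complete.add(new Piece(pendingRow, pendingCells, false));
        }
        return complete;
    }
}
```

Merging on the client keeps memory pressure off the server, at the cost of the 
client having to buffer a whole row before handing it to the application.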

To disable the use of partial Results on the server, set 
ScanRequest.Builder#setClientHandlesPartials() to false in the ScanRequest 
issued to the server.

Partial Results should allow the server to return large rows in parts rather 
than accumulate all the cells for that particular row and run out of memory

  was:
Results returned from RPC calls may now be returned as partials
   - When is a Result marked as a partial? 
*** When the server must stop the scan because the max size limit has been 
reached. Means that the LAST Result returned within the ScanResult's Result 
array may be marked as a partial if the scan's max size limit caused it to stop 
in the middle of a row.
* Incompatible Change: The return type of InternalScanners and RegionScanners 
has been changed to NextState from boolean
** The previous boolean return value can be accessed via 
NextState#hasMoreValues()
** Provides more context as to what happened inside the scanner
* Caching default has been changed to Integer.Max_Value 
** This value works together with the new maxResultSize value from HBASE-12976 
(defaults to 2MB) 
** Results returned from server on basis of size rather than number of rows
** Provides better use of network since row size varies amongst tables
* Protobuf models have changed for Result, ScanRequest, and ScanResponse to 
support new partial Results
* Partial Results should be invisible to application layer unless 
Scan#setAllowPartials is set
* Scan#setAllowPartials has been added to allow the application to request to 
see the partial Results returned by the server rather than have the 
ClientScanner form the complete Result prior to returning it to the application
* To disable the use of partial Results on the server, set 
ScanRequest.Builder#setClientHandlesPartials() to be false in the ScanRequest 
issued to server
* Partial Results should allow the server to return large rows in parts rather 
than accumulate all the cells for that particular row and run out of memory


> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Jonathan Lawlor
>            Priority: Critical
>              Labels: beginner
>         Attachments: HBASE-11544-branch_1_0-v1.patch, 
> HBASE-11544-branch_1_0-v2.patch, HBASE-11544-v1.patch, HBASE-11544-v2.patch, 
> HBASE-11544-v3.patch, HBASE-11544-v4.patch, HBASE-11544-v5.patch, 
> HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v6.patch, 
> HBASE-11544-v7.patch, gc.j.png, hits.j.png, mean.png, net.j.png
>
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass a certain size threshold 
> rather than keep accumulating till we OOME.



