[ https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Lawlor updated HBASE-11544:
------------------------------------
Release Note:
Results returned from RPC calls may now be returned as partials.
* When is a Result marked as a partial?
** When the server must stop the scan because the max size limit has been
reached. This means that the LAST Result returned within the ScanResult's
Result array may be marked as a partial if the scan's max size limit caused it
to stop in the middle of a row.
* Incompatible Change: The return type of InternalScanners and RegionScanners
has been changed from boolean to NextState
** The previous boolean return value can be accessed via
NextState#hasMoreValues()
** NextState provides more context as to what happened inside the scanner
* The caching default has been changed to Integer.MAX_VALUE
** This value works together with the new maxResultSize value from HBASE-12976
(defaults to 2MB)
** Results are now returned from the server on the basis of size rather than
number of rows
** This makes better use of the network, since row size varies among tables
* Protobuf models have changed for Result, ScanRequest, and ScanResponse to
support the new partial Results
* Partial Results should be invisible to the application layer unless
Scan#setAllowPartials is set
* Scan#setAllowPartials has been added to allow the application to request to
see the partial Results returned by the server rather than have the
ClientScanner form the complete Result prior to returning it to the application
* To disable the use of partial Results on the server, set
ScanRequest.Builder#setClientHandlesPartials() to false in the ScanRequest
issued to the server
* Partial Results should allow the server to return large rows in parts rather
than accumulate all the cells for that particular row and run out of memory
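
As a rough illustration of the client-side behaviour described in the release
note, the sketch below stitches partial chunks back into complete rows unless
the caller asks to see the partials. It is not HBase code: Chunk, assemble, and
the allowPartials flag are hypothetical stand-ins for Result, ClientScanner,
and Scan#setAllowPartials, and it assumes the final chunk of every row arrives
with its partial flag cleared.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are real HBase classes. */
class PartialResultSketch {

    /** One piece of a row as it might arrive from a single RPC. */
    static final class Chunk {
        final String row;
        final List<String> cells;
        final boolean partial; // assumption: true means more cells of this row follow

        Chunk(String row, List<String> cells, boolean partial) {
            this.row = row;
            this.cells = cells;
            this.partial = partial;
        }
    }

    /**
     * Returns the chunks untouched when allowPartials is true. Otherwise
     * merges consecutive partial chunks into one complete chunk per row.
     * Cells of a row whose final chunk has not arrived yet are held back.
     */
    static List<Chunk> assemble(List<Chunk> fromServer, boolean allowPartials) {
        if (allowPartials) {
            return new ArrayList<>(fromServer);
        }
        List<Chunk> complete = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        for (Chunk c : fromServer) {
            pending.addAll(c.cells);
            if (!c.partial) {
                // Final chunk of the row: emit everything gathered so far.
                complete.add(new Chunk(c.row, pending, false));
                pending = new ArrayList<>();
            }
        }
        return complete;
    }
}
```

With allowPartials set to false the caller only ever sees whole rows, which is
the sense in which partial Results stay invisible to the application layer.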
was:
Results returned from RPC calls may now be returned as partials
- When is a Result marked as a partial?
*** When the server must stop the scan because the max size limit has been
reached. Means that the LAST Result returned within the ScanResult's Result
array may be marked as a partial if the scan's max size limit caused it to stop
in the middle of a row.
* Incompatible Change: The return type of InternalScanners and RegionScanners
has been changed to NextState from boolean
** The previous boolean return value can be accessed via
NextState#hasMoreValues()
** Provides more context as to what happened inside the scanner
* Caching default has been changed to Integer.Max_Value
** This value works together with the new maxResultSize value from HBASE-12976
(defaults to 2MB)
** Results returned from server on basis of size rather than number of rows
** Provides better use of network since row size varies amongst tables
* Protobuf models have changed for Result, ScanRequest, and ScanResponse to
support new partial Results
* Partial Results should be invisible to application layer unless
Scan#setAllowPartials is set
* Scan#setAllowPartials has been added to allow the application to request to
see the partial Results returned by the server rather than have the
ClientScanner form the complete Result prior to returning it to the application
* To disable the use of partial Results on the server, set
ScanRequest.Builder#setClientHandlesPartials() to be false in the ScanRequest
issued to server
* Partial Results should allow the server to return large rows in parts rather
than accumulate all the cells for that particular row and run out of memory
> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return
> batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
> Key: HBASE-11544
> URL: https://issues.apache.org/jira/browse/HBASE-11544
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Jonathan Lawlor
> Priority: Critical
> Labels: beginner
> Attachments: HBASE-11544-branch_1_0-v1.patch,
> HBASE-11544-branch_1_0-v2.patch, HBASE-11544-v1.patch, HBASE-11544-v2.patch,
> HBASE-11544-v3.patch, HBASE-11544-v4.patch, HBASE-11544-v5.patch,
> HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v6.patch,
> HBASE-11544-v7.patch, gc.j.png, hits.j.png, mean.png, net.j.png
>
>
> Running some tests, I set hbase.client.scanner.caching=1000. Dataset has
> large cells. I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the
> client whatever we've gathered once we pass a certain size threshold
> rather than keep accumulating till we OOME.
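
A minimal sketch of the size-threshold idea in the description above, assuming
each cell is represented only by its size in bytes. Batch, scan, and the plain
maxResultSize parameter are hypothetical names chosen for illustration; this is
not the implementation from the attached patches.

```java
import java.util.List;

/** Hypothetical sketch of size-limited accumulation; not HBase code. */
class SizeLimitedScanSketch {

    /** What one call hands back to the client. */
    static final class Batch {
        int cellCount;   // number of cells gathered
        long bytes;      // total size of the gathered cells
        boolean partial; // true if the limit cut the last row short
    }

    /**
     * Walks rows in order, adding cells until the running size reaches
     * maxResultSize, then returns what has been gathered instead of
     * continuing to accumulate.
     */
    static Batch scan(List<List<Integer>> rowsOfCellSizes, long maxResultSize) {
        Batch batch = new Batch();
        for (List<Integer> row : rowsOfCellSizes) {
            for (int i = 0; i < row.size(); i++) {
                batch.cellCount++;
                batch.bytes += row.get(i);
                if (batch.bytes >= maxResultSize) {
                    // Threshold passed: stop here. The batch is partial only
                    // if cells of the current row are still unread.
                    batch.partial = i < row.size() - 1;
                    return batch;
                }
            }
        }
        return batch; // ran out of rows before hitting the limit
    }
}
```

The check runs after each cell is added, so a batch can overshoot the limit by
at most one cell, but it never grows without bound.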
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)