[
https://issues.apache.org/jira/browse/HBASE-27558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bryan Beaudreault updated HBASE-27558:
--------------------------------------
Fix Version/s: 2.6.0, 3.0.0-alpha-4
Release Note: Scan quotas and hbase.server.scanner.max.result.size will
now be enforced against the total bytes of actual blocks scanned by the
request. Block bytes scanned will typically be much higher for heavily filtered
scans, since the cost will include the size of every block read rather than
just the size of the returned cells. If you have heavily filtered scans, you
may need to account for that in your defined quotas.
Assignee: Bryan Beaudreault
Resolution: Fixed
Status: Resolved (was: Patch Available)
Thanks [~zhangduo] for the review!
Pushed to master and branch-2.
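To illustrate the release note above, here is a minimal sketch of why a heavily filtered scan hits its limit sooner once block bytes are charged. This is not HBase code: the class name, method name, and all numbers (block size, bytes returned per block, limit) are hypothetical and chosen only for illustration.

```java
// Hypothetical sketch, not HBase code: compares charging a scan for the
// cells it returns versus charging it for every block it reads.
final class BlockBytesSketch {

    /**
     * Counts how many blocks a scan reads before the charged bytes reach the limit.
     *
     * @param limitBytes            size limit applied to the scan (assumed positive)
     * @param blockSize             bytes per block read from storage
     * @param returnedBytesPerBlock bytes of cells that survive the filter in each block
     * @param chargeBlockBytes      true to charge whole blocks, false to charge returned cells only
     * @param totalBlocks           number of blocks in the scanned range
     */
    static int blocksReadBeforeLimit(long limitBytes, long blockSize,
                                     long returnedBytesPerBlock,
                                     boolean chargeBlockBytes, int totalBlocks) {
        long charged = 0;
        int blocksRead = 0;
        while (blocksRead < totalBlocks && charged < limitBytes) {
            blocksRead++;
            charged += chargeBlockBytes ? blockSize : returnedBytesPerBlock;
        }
        return blocksRead;
    }

    public static void main(String[] args) {
        long limit = 2L * 1024 * 1024; // illustrative 2 MiB limit
        long block = 64L * 1024;       // illustrative 64 KiB block
        long returned = 100L;          // filter keeps ~100 bytes of cells per block

        // Charging returned cells only: all 1000 blocks are read in one request.
        System.out.println(blocksReadBeforeLimit(limit, block, returned, false, 1000));
        // Charging block bytes: the request stops after 32 blocks.
        System.out.println(blocksReadBeforeLimit(limit, block, returned, true, 1000));
    }
}
```

With these made-up numbers, the same scan is cut off after 32 blocks instead of 1000, which is why quotas sized for returned cells may need revisiting.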
> Scan quotas and limits should account for total block IO
> --------------------------------------------------------
>
> Key: HBASE-27558
> URL: https://issues.apache.org/jira/browse/HBASE-27558
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> Scan and Multi requests pull the byte throughput limit from
> Quotas.getReadAvailable(). Multis validate the result inline in
> RSRpcServices, by checking the accumulated
> {{RpcCallContext.getResponseCellSize}} and {{getResponseBlockSize}} against
> the read available after each action. Scans make use of
> {{ScannerContext}}, and only check the total cell serialized size and
> {{cell.heapSize()}}.
> The handling for Multis was added in HBASE-14978. The block size is checked
> because regardless of the actual cell size, the regionserver needs to retain
> entire blocks backing those cells for the lifetime of a request. If the
> retained blocks grow too large, a regionserver can OOM or experience heavy
> GC pressure.
> So multis validate read available against the actual block size retained for
> the responses, but scans only account for cell sizes. We should extend the
> same block size accounting to scans by having ScannerContext track block
> bytes scanned.
> Large scans can read over ranges of both returned and filtered rows.
> Regardless of what's returned to the user, the server-side cost of the scan
> is affected just as much by filtered rows as by non-filtered rows.
> Both Scans and Multis take the Math.min of Quotas read available and
> hbase.server.scanner.max.result.size. Scans further take the min of that and
> Scan.setMaxResultSize.
>
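The limit selection described in the last paragraph of the description can be sketched as below. This is a minimal illustration under stated assumptions, not the actual RSRpcServices code: the class and method names are hypothetical, and all three inputs are assumed to be positive byte counts (handling of unset values is left out).

```java
// Hypothetical sketch, not HBase code: how the effective byte limit is chosen.
final class ScanLimitSketch {

    /**
     * Limit shared by Scans and Multis: the smaller of the quota's read
     * available and hbase.server.scanner.max.result.size.
     */
    static long serverLimit(long quotaReadAvailable, long serverMaxResultSize) {
        return Math.min(quotaReadAvailable, serverMaxResultSize);
    }

    /**
     * Scans further take the min of the shared limit and the value the
     * client set through Scan.setMaxResultSize.
     */
    static long scanLimit(long quotaReadAvailable, long serverMaxResultSize,
                          long clientMaxResultSize) {
        return Math.min(serverLimit(quotaReadAvailable, serverMaxResultSize),
                        clientMaxResultSize);
    }
}
```

For example, `ScanLimitSketch.scanLimit(100L, 50L, 20L)` yields 20, while `ScanLimitSketch.scanLimit(10L, 50L, 20L)` yields 10 because the quota is the tightest bound.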
--
This message was sent by Atlassian Jira
(v8.20.10#820010)