[
https://issues.apache.org/jira/browse/HBASE-27558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655600#comment-17655600
]
Bryan Beaudreault commented on HBASE-27558:
-------------------------------------------
As of HBASE-18294, ScannerContext dataSize and heapSize fields are almost
identical. dataSize is "cell.getSerializedSize() + Bytes.SIZEOF_INT" per
PrivateCellUtil.estimatedSerializedSizeOf. heapSize is
"cell.getSerializedSize() + FIXED_OVERHEAD", per all of the cell
implementations of that method. The fixed overhead will often be on the order
of 50-60 bytes depending on the extra fields in each object. It seems sort of
pointless to have 2 such similar values, and from a read perspective the
heapSize is actually incorrect.
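To make the comparison concrete, here is a rough sketch of the two per-cell
estimates. It is not HBase code: the class and method names are made up for
illustration, and the 56-byte overhead is just an assumed value picked from
the 50-60 byte range mentioned above. Only the 4-byte int size matches
Bytes.SIZEOF_INT.

```java
// Illustrative stand-ins for the per-cell accounting described above; not HBase code.
final class CellSizeSketch {
  // Same value as Bytes.SIZEOF_INT in HBase.
  static final int SIZEOF_INT = 4;
  // Assumption: a representative fixed overhead in the 50-60 byte range.
  static final int ASSUMED_FIXED_OVERHEAD = 56;

  // dataSize contribution: serialized size plus one int.
  static long dataSize(int serializedSize) {
    return (long) serializedSize + SIZEOF_INT;
  }

  // heapSize contribution: serialized size plus the fixed object overhead.
  static long heapSize(int serializedSize) {
    return (long) serializedSize + ASSUMED_FIXED_OVERHEAD;
  }

  public static void main(String[] args) {
    int serialized = 100; // hypothetical cell of 100 serialized bytes
    System.out.println("dataSize=" + dataSize(serialized));
    System.out.println("heapSize=" + heapSize(serialized));
    // The gap is a constant, independent of the cell's contents.
    System.out.println("difference=" + (heapSize(serialized) - dataSize(serialized)));
  }
}
```

Under these assumptions the two values differ by a constant per cell, which
is why tracking both adds little information.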
On the server side, the actual memory retained for a read must include the
actual length of the block(s) backing those cells. The full blocks are held in
memory until the request is finished and they are released. So for
ScannerContext I suggest we increment heapSize by cell.heapSize() -
cell.getSerializedSize(). We’d also increment it by blockSize for each block
loaded (and retained) during the request.
Additionally, I will add a new "blockSize" field to ScannerContext, which
will be incremented for all blocks read during the request (not just those
retained). The difference between this and heapSize would depend on how many
of the requested blocks could be released early due to filters (see
HBASE-27227).
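The proposed accounting could look roughly like the sketch below. This is a
hypothetical stand-in, not the actual ScannerContext: the class, the method
names, and the "retained" flag are assumptions for illustration, and the
64 KB block size in the example is just a sample value.

```java
// Sketch only: a hypothetical stand-in for the proposed ScannerContext accounting.
final class ScanProgressSketch {
  private long heapSize;  // bytes retained on the heap for this request
  private long blockSize; // bytes of all blocks read, retained or not

  // Per cell: add only the object overhead, because the serialized bytes
  // live in a backing block that is counted separately.
  void onCell(long cellHeapSize, long cellSerializedSize) {
    heapSize += cellHeapSize - cellSerializedSize;
  }

  // Per block read: always count toward blockSize; count toward heapSize
  // only when the block stays retained until the request finishes.
  void onBlockRead(long blockBytes, boolean retained) {
    blockSize += blockBytes;
    if (retained) {
      heapSize += blockBytes;
    }
  }

  long getHeapSize() {
    return heapSize;
  }

  long getBlockSize() {
    return blockSize;
  }

  public static void main(String[] args) {
    ScanProgressSketch progress = new ScanProgressSketch();
    progress.onBlockRead(65536, true);  // block backing a returned cell
    progress.onCell(156, 100);          // cell with 56 bytes of object overhead
    progress.onBlockRead(65536, false); // block released early by a filter
    System.out.println("heapSize=" + progress.getHeapSize());
    System.out.println("blockSize=" + progress.getBlockSize());
  }
}
```

In this example blockSize ends up at twice heapSize's block portion, because
the second block was read but released before the request finished.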
> Scan quotas and limits should account for total block IO
> --------------------------------------------------------
>
> Key: HBASE-27558
> URL: https://issues.apache.org/jira/browse/HBASE-27558
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Priority: Major
>
> Scan and Multi requests pull the byte throughput limit from
> Quotas.getReadAvailable(). Multis validate the result inline in
> RSRpcServices, by checking the accumulated
> {{RpcCallContext.getResponseCellSize}} and {{getResponseBlockSize}} against
> the read available after each action. Scans make use of
> {{ScannerContext}}, and only check the total cell serialized size and
> {{cell.heapSize()}}.
> The handling for Multis was added in HBASE-14978. The block size is checked
> because regardless of the actual cell size, the regionserver needs to retain
> entire blocks backing those cells for the lifetime of a request. If the
> retained blocks grow too large, a regionserver can OOM or experience heavy
> GC pressure.
> So multis validate read available against the actual block size retained for
> the responses, but scans only account for cell sizes. We should extend the
> same block support to scans through ScannerContext tracking block bytes
> scanned.
> Large scans can read over ranges of both returned and filtered rows.
> Regardless of what's returned to the user, the server-side cost of the scan
> is affected just as much by filtered rows as by non-filtered ones.
> Both Scans and Multis take the Math.min of Quotas read available and
> hbase.server.scanner.max.result.size. Scans further take the min of that and
> Scan.setMaxResultSize.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)