[ https://issues.apache.org/jira/browse/HBASE-27570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Beaudreault updated HBASE-27570:
--------------------------------------
Description:
Various request types in RSRpcServices call
[addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335]
to estimate the block IO of a request: whenever the buffer backing the current
cell changes, the capacity of the new buffer is added to the total. This is
only an estimate and can be inaccurate in some circumstances, such as when the
ordering of gets in a multiget causes the same buffer to be counted twice.
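The double counting can be illustrated with a minimal sketch. It is not the actual addSize code: the class name, the estimate method, and the use of plain byte arrays as stand-ins for block buffers are all assumptions made for illustration. It only mimics the idea of adding a buffer's capacity each time the backing buffer differs from the previous one.

```java
import java.util.List;

public class BufferChangeEstimate {
    // Hypothetical stand-in for the heuristic described above: add the
    // capacity of a buffer whenever it differs (by identity) from the
    // buffer that backed the previous cell.
    static long estimate(List<byte[]> backingBuffers) {
        long total = 0;
        byte[] last = null;
        for (byte[] buf : backingBuffers) {
            if (buf != last) {
                total += buf.length;
                last = buf;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two assumed 64 KiB blocks; each array plays the role of one block buffer.
        byte[] blockA = new byte[64 * 1024];
        byte[] blockB = new byte[64 * 1024];

        // Cells arrive grouped by block: each block is counted once.
        System.out.println(estimate(List.of(blockA, blockA, blockB))); // 131072

        // Same two blocks, but the gets interleave: blockA is counted twice.
        System.out.println(estimate(List.of(blockA, blockB, blockA))); // 196608
    }
}
```

In both calls only two distinct buffers are involved, yet the interleaved ordering reports three buffers' worth of bytes, which is the kind of inaccuracy described above.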
As of HBASE-27558, StoreScanner has a hook for tracking block IO, invoked
through the HFileReaderImpl.reportBlockSize method. It is currently used to
feed block IO into ScannerContext for limit tracking. We can add a call to
RpcCall.incrementResponseBlockSize in the same place so that we get an accurate
accounting of block IO for all request types.
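A rough sketch of the proposed shape, under stated assumptions: none of the classes below are HBase classes. CallStats, LimitTracker, and Reader are hypothetical stand-ins for the RPC call, the scanner limit tracking, and the reader that reports block sizes. The point is only that a single hook, fired where blocks are actually read, can feed both consumers.

```java
import java.util.function.IntConsumer;

public class BlockIoHookSketch {
    /** Hypothetical per-request accumulator (stand-in for the RPC call). */
    static final class CallStats {
        long responseBlockSize;
        void incrementResponseBlockSize(long bytes) { responseBlockSize += bytes; }
    }

    /** Hypothetical limit tracker (stand-in for scanner limit tracking). */
    static final class LimitTracker {
        long blockSize;
        void add(int bytes) { blockSize += bytes; }
    }

    /** Hypothetical reader that reports the size of every block it loads. */
    static final class Reader {
        private final IntConsumer blockSizeHook;
        Reader(IntConsumer blockSizeHook) { this.blockSizeHook = blockSizeHook; }
        void loadBlock(int sizeBytes) { blockSizeHook.accept(sizeBytes); }
    }

    // Loads the given blocks through one hook and returns the bytes
    // accumulated on the call.
    static long run(int... blockSizes) {
        CallStats call = new CallStats();
        LimitTracker limits = new LimitTracker();
        // One hook updates both the limit tracker and the call's counter.
        IntConsumer hook = size -> {
            limits.add(size);
            call.incrementResponseBlockSize(size);
        };
        Reader reader = new Reader(hook);
        for (int size : blockSizes) {
            reader.loadBlock(size);
        }
        return call.responseBlockSize;
    }

    public static void main(String[] args) {
        System.out.println(run(65536, 65536, 4096)); // 135168
    }
}
```

Because the counter is updated at the point where a block is read, the total does not depend on how cells are later ordered in the response.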
was:
Currently Get and Multiget call a method in RSRpcServices
[addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335],
which attempts to estimate block IO by pulling the capacity of the underlying
cell buffer when that buffer changes. This is just an estimate and can be
inaccurate in certain circumstances, such as when the ordering of gets in a
multiget causes the same buffer to be counted twice.
As of HBASE-27558, ScannerContext now tracks the block IO for each read
request. Gets and Multigets make use of a default scanner context which only
enforces batch size and isn't exposed to RSRpcServices. We can make a small
change to create a ScannerContext with LimitScope.ROW, and use the
ScannerContext.getBlockSize() to get the exact block IO consumed by a query.
> Unify tracking of block IO across all read request types
> --------------------------------------------------------
>
> Key: HBASE-27570
> URL: https://issues.apache.org/jira/browse/HBASE-27570
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Priority: Major