[ 
https://issues.apache.org/jira/browse/HBASE-27570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault updated HBASE-27570:
--------------------------------------
    Description: 
Various requests in RSRpcServices use 
[addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335]
 to estimate the block IO of the request by adding the capacity of the 
underlying cell buffer whenever that buffer changes. This is only an estimate 
and can be inaccurate in certain circumstances, such as when the ordering of 
gets in a multiget causes the same buffer to be counted twice.

As of HBASE-27558, we have a hook in StoreScanner for tracking block IO via 
the HFileReaderImpl.reportBlockSize method call. We use it to get block IO 
into ScannerContext for tracking limits. We can add a call to 
RpcCall.incrementResponseBlockSize in the same place so that we get an 
accurate accounting of block IO for all request types.
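
To illustrate the double counting described above, here is a minimal 
stand-alone sketch. It is not the HBase code: BlockIoSketch, 
estimateByBufferChange and countViaHook are hypothetical names, and plain 
ByteBuffers stand in for HFile blocks. The first method imitates the 
"add the capacity when the buffer changes" estimate. The second imitates a 
hook that is told the size of each block once, at the time the block is read.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntConsumer;

/** Hypothetical illustration only; none of these names are HBase APIs. */
class BlockIoSketch {

  /**
   * Estimate in the style described above: walk the buffers backing the
   * returned cells and add a buffer's capacity each time it differs from
   * the previous one.
   */
  static long estimateByBufferChange(List<ByteBuffer> cellBuffers) {
    long total = 0;
    ByteBuffer last = null;
    for (ByteBuffer buf : cellBuffers) {
      if (buf != last) { // identity check: did the backing buffer change?
        total += buf.capacity();
        last = buf;
      }
    }
    return total;
  }

  /**
   * Hook-style accounting: each block read reports its size exactly once,
   * independent of the order in which cells are later returned.
   */
  static long countViaHook(List<ByteBuffer> blocksRead) {
    long[] total = {0};
    IntConsumer hook = size -> total[0] += size; // stand-in for the real hook
    for (ByteBuffer block : blocksRead) {
      hook.accept(block.capacity());
    }
    return total[0];
  }

  public static void main(String[] args) {
    ByteBuffer a = ByteBuffer.allocate(64 * 1024);
    ByteBuffer b = ByteBuffer.allocate(64 * 1024);

    // Cells arrive backed by a, then b, then a again (e.g. interleaved gets).
    long estimated = estimateByBufferChange(Arrays.asList(a, b, a));
    // Only two distinct blocks were actually read.
    long exact = countViaHook(Arrays.asList(a, b));

    System.out.println("estimated=" + estimated + " exact=" + exact);
  }
}
```

In this sketch the estimate counts buffer a twice because it only remembers 
the most recently seen buffer, while the hook-based total reflects the two 
blocks that were read.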

  was:
Currently Get and Multiget call a method in RSRpcServices 
[addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335],
 which attempts to estimate block IO by pulling the capacity of the underlying 
cell buffer when that buffer changes. This is just an estimate and can be 
inaccurate in certain circumstances, such as when the ordering of gets in a 
multiget causes the same buffer to be counted twice.

As of HBASE-27558, ScannerContext now tracks the block IO for each read 
request. Gets and Multigets make use of a default scanner context which only 
enforces batch size and isn't exposed to RSRpcServices. We can make a small 
change to create a ScannerContext with LimitScope.ROW, and use the 
ScannerContext.getBlockSize() to get the exact block IO consumed by a query.


> Unify tracking of block IO across all read request types
> --------------------------------------------------------
>
>                 Key: HBASE-27570
>                 URL: https://issues.apache.org/jira/browse/HBASE-27570
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> Various requests in RSRpcServices use 
> [addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335]
>  to estimate the block IO of the request by adding the capacity of the 
> underlying cell buffer whenever that buffer changes. This is only an estimate 
> and can be inaccurate in certain circumstances, such as when the ordering of 
> gets in a multiget causes the same buffer to be counted twice.
> As of HBASE-27558, we have a hook in StoreScanner for tracking block IO via 
> the HFileReaderImpl.reportBlockSize method call. We use it to get block IO 
> into ScannerContext for tracking limits. We can add a call to 
> RpcCall.incrementResponseBlockSize in the same place so that we get an 
> accurate accounting of block IO for all request types.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
