Bryan Beaudreault created HBASE-27570:
-----------------------------------------

             Summary: Get and Multiget should use ScannerContext to track block IO
                 Key: HBASE-27570
                 URL: https://issues.apache.org/jira/browse/HBASE-27570
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault


Currently, Get and Multiget call the RSRpcServices method 
[addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335],
which estimates block IO by adding the capacity of the underlying cell 
buffer whenever that buffer changes. This is only an estimate and can be 
inaccurate in some circumstances, for example when the ordering of gets in a 
multiget causes the same buffer to be counted twice.
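
To illustrate the double counting, here is a minimal standalone sketch. It is 
not the RSRpcServices code: the class and method names are hypothetical, and 
plain java.nio.ByteBuffer objects stand in for the buffers backing the 
returned cells. It assumes the estimate adds a buffer's capacity whenever the 
buffer differs from the one seen for the previous result.

```java
import java.nio.ByteBuffer;
import java.util.List;

public class BufferChangeEstimateSketch {

    // Hypothetical stand-in for the estimate: add a buffer's capacity each time
    // the buffer backing the current result differs from the previous one.
    static long estimateByBufferChange(List<ByteBuffer> resultBuffers) {
        long total = 0;
        ByteBuffer last = null;
        for (ByteBuffer buf : resultBuffers) {
            if (buf != last) {           // identity check: "did the buffer change?"
                total += buf.capacity();
                last = buf;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer blockA = ByteBuffer.allocate(64);
        ByteBuffer blockB = ByteBuffer.allocate(128);

        // Results arrive from A, A, then B: each buffer is counted once.
        System.out.println(estimateByBufferChange(List.of(blockA, blockA, blockB))); // 192

        // Results arrive from A, B, then A again: A is counted twice.
        System.out.println(estimateByBufferChange(List.of(blockA, blockB, blockA))); // 256
    }
}
```

The same two buffers yield 192 or 256 bytes depending only on the order in 
which the results arrive, which is the kind of inaccuracy described above.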

As of HBASE-27558, ScannerContext tracks the block IO for each read 
request. Gets and Multigets currently use a default scanner context, which 
only enforces batch size and isn't exposed to RSRpcServices. We can make a 
small change to create a ScannerContext with LimitScope.ROW in RSRpcServices 
and use ScannerContext.getBlockSize() to get the exact block IO consumed by 
a query.
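
The shape of the proposed change can be sketched with a toy model. None of 
the names below are HBase APIs: ReadContext is a hypothetical stand-in for a 
scanner context and ToyRegion for the read path. The sketch only shows the 
idea of the caller creating the context and passing it in, so that it can 
read the accumulated block bytes afterwards instead of estimating them.

```java
public class ExposedContextSketch {

    /** Hypothetical per-request accumulator; stands in for a scanner context. */
    static final class ReadContext {
        private long blockBytes;

        void addBlockBytes(long bytes) {
            blockBytes += bytes;
        }

        long getBlockBytes() {
            return blockBytes;
        }
    }

    /** Toy read path: every block it loads is charged to the context it is given. */
    static final class ToyRegion {
        private final long[] blockSizes;

        ToyRegion(long... blockSizes) {
            this.blockSizes = blockSizes;
        }

        // Current shape: the context is created internally, so the caller
        // never sees what was accumulated.
        void get(int blockIndex) {
            get(blockIndex, new ReadContext());
        }

        // Proposed shape: the caller supplies the context and inspects it later.
        void get(int blockIndex, ReadContext ctx) {
            ctx.addBlockBytes(blockSizes[blockIndex]);
        }
    }

    public static void main(String[] args) {
        ToyRegion region = new ToyRegion(64, 128);
        ReadContext ctx = new ReadContext();
        region.get(0, ctx);
        region.get(1, ctx);
        System.out.println(ctx.getBlockBytes()); // 192
    }
}
```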



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
