Bryan Beaudreault created HBASE-27570:
-----------------------------------------
Summary: Get and Multiget should use ScannerContext to track block IO
Key: HBASE-27570
URL: https://issues.apache.org/jira/browse/HBASE-27570
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
Currently, Get and Multiget call a method in RSRpcServices,
[addSize|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1303-L1335],
which attempts to estimate block IO by adding the capacity of the underlying
cell buffer whenever that buffer changes. This is only an estimate and can be
inaccurate in certain circumstances, such as when the ordering of gets in a
multiget causes the same buffer to be counted twice.
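
To illustrate the double counting, here is a minimal, self-contained sketch. It is not the RSRpcServices code: the class and method names are hypothetical, and plain ByteBuffers stand in for the blocks backing each cell. It only mimics the "add the capacity whenever the backing buffer changes" heuristic and compares it with counting each distinct buffer once.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HBase.
public class BlockSizeEstimateSketch {

  // Mimics the heuristic: add the buffer capacity every time the backing
  // buffer differs from the one seen for the previous cell.
  static long estimateByBufferChange(List<ByteBuffer> cellBuffers) {
    long total = 0;
    ByteBuffer last = null;
    for (ByteBuffer buf : cellBuffers) {
      if (buf != last) {
        total += buf.capacity();
        last = buf;
      }
    }
    return total;
  }

  // Reference value: count each distinct buffer exactly once.
  // Identity comparison is required because ByteBuffer.equals compares content.
  static long countDistinctBuffers(List<ByteBuffer> cellBuffers) {
    Set<ByteBuffer> seen = Collections.newSetFromMap(new IdentityHashMap<>());
    long total = 0;
    for (ByteBuffer buf : cellBuffers) {
      if (seen.add(buf)) {
        total += buf.capacity();
      }
    }
    return total;
  }

  public static void main(String[] args) {
    ByteBuffer blockA = ByteBuffer.allocate(64 * 1024);
    ByteBuffer blockB = ByteBuffer.allocate(64 * 1024);
    // Gets ordered so that blockA is visited, left, and then visited again.
    List<ByteBuffer> order = List.of(blockA, blockB, blockA);
    System.out.println(estimateByBufferChange(order)); // 196608: blockA counted twice
    System.out.println(countDistinctBuffers(order));   // 131072
  }
}
```

With the gets reordered as blockA, blockA, blockB, both methods return 131072, so the estimate depends on request ordering rather than on the blocks touched.
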
As of HBASE-27558, ScannerContext tracks the block IO for each read
request. Gets and Multigets currently use a default scanner context, which only
enforces batch size and isn't exposed to RSRpcServices. We can make a small
change to create a ScannerContext with LimitScope.ROW and use
ScannerContext.getBlockSize() to get the exact block IO consumed by a query.
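
The shape of the proposed change might look roughly like the sketch below. This is an assumption-laden illustration, not HBase code: RequestContext is a hypothetical stand-in for ScannerContext, and readRow stands in for the read path, which is assumed to report each block to the context at the moment it loads it. The point is only that the RPC layer creates the context, passes it down, and reads the total back instead of inferring it from cell buffers.

```java
import java.util.List;

// Hypothetical illustration only; these are stand-ins, not HBase classes.
public class BlockTrackingSketch {

  // Stand-in for a per-request scanner context that accumulates block bytes.
  static final class RequestContext {
    private long blockSize;

    void addBlock(int bytes) {
      blockSize += bytes;
    }

    long getBlockSize() {
      return blockSize;
    }
  }

  // Stand-in for the read path: it reports every block it loads for one get.
  // The sizes are supplied by the caller here purely for demonstration.
  static void readRow(RequestContext ctx, int[] blocksLoaded) {
    for (int size : blocksLoaded) {
      ctx.addBlock(size);
    }
  }

  // Stand-in for the RPC handler: one context is created per request,
  // shared by all gets in the multiget, and queried once at the end.
  static long handleMultiGet(List<int[]> blocksLoadedPerGet) {
    RequestContext ctx = new RequestContext();
    for (int[] blocks : blocksLoadedPerGet) {
      readRow(ctx, blocks);
    }
    return ctx.getBlockSize();
  }

  public static void main(String[] args) {
    long total = handleMultiGet(List.of(
        new int[] {64 * 1024},
        new int[] {64 * 1024, 32 * 1024}));
    System.out.println(total); // 163840
  }
}
```

Because the accounting happens where blocks are loaded, the total no longer depends on the order in which the results are later serialized.
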