Apache9 commented on code in PR #4967:
URL: https://github.com/apache/hbase/pull/4967#discussion_r1082300876
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:
##########
@@ -140,4 +140,10 @@ public interface HFileScanner extends Shipper, Closeable {
*/
@Override
void close();
+
+ /**
+ * Returns the block size in bytes for the current block. Will only return a
value once per block,
+ * otherwise 0. Used for calculating block IO in ScannerContext.
+ */
+ int getCurrentBlockSizeOnce();
Review Comment:
I think it is better to introduce a method called recordBlockSize? The
comment could say that the implementation should make sure that for every block
we only record once.
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java:
##########
@@ -3616,8 +3620,9 @@ public ScanResponse scan(final RpcController controller,
final ScanRequest reque
if (region.getCoprocessorHost() != null) {
Boolean bypass = region.getCoprocessorHost().preScannerNext(scanner,
results, rows);
if (!results.isEmpty()) {
+ Object lastBlock = null;
for (Result r : results) {
- lastBlock.setValue(addSize(rpcCall, r, lastBlock.getValue()));
+ lastBlock = addSize(rpcCall, r, lastBlock);
Review Comment:
I mean here, we still call addSize with lastBlock, and in addSize we have
some logic for limiting.
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java:
##########
@@ -3398,7 +3398,6 @@ private void scan(HBaseRpcController controller,
ScanRequest request, RegionScan
}
boolean mayHaveMoreCellsInRow =
scannerContext.mayHaveMoreCellsInRow();
Result r = Result.create(values, null, stale,
mayHaveMoreCellsInRow);
- lastBlock.setValue(addSize(rpcCall, r, lastBlock.getValue()));
Review Comment:
Then could we just remove the lastBlock? Seems we still have it declared in
the caller method.
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java:
##########
@@ -608,6 +610,11 @@ private boolean nextInternal(List<Cell> results,
ScannerContext scannerContext)
return
scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
}
if (!shouldStop) {
+ // Read nothing as the cells were filtered, but still need to check
time limit.
Review Comment:
This method is on the critial path of reading, and the code here will
executed every time when we get a row, so it may affect scan performance if we
add more checks here.
I just mean is it a must to have a check here? Why we do not need to check
here in the past...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]