[GitHub] [hbase] Apache9 commented on a diff in pull request #4967: HBASE-27558 Scan quotas and limits should account for total block IO

GitBox Fri, 20 Jan 2023 02:08:40 -0800


Apache9 commented on code in PR #4967:
URL: https://github.com/apache/hbase/pull/4967#discussion_r1082300876



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:
##########
@@ -140,4 +140,10 @@ public interface HFileScanner extends Shipper, Closeable {
    */
   @Override
   void close();
+
+  /**
+   * Returns the block size in bytes for the current block. Will only return a 
value once per block,
+   * otherwise 0. Used for calculating block IO in ScannerContext.
+   */
+  int getCurrentBlockSizeOnce();

Review Comment:
   I think it is better to introduce a method called recordBlockSize? The 
comment could say that the implementation should make sure that for every block 
we only record once.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java:
##########
@@ -3616,8 +3620,9 @@ public ScanResponse scan(final RpcController controller, 
final ScanRequest reque
         if (region.getCoprocessorHost() != null) {
           Boolean bypass = region.getCoprocessorHost().preScannerNext(scanner, 
results, rows);
           if (!results.isEmpty()) {
+            Object lastBlock = null;
             for (Result r : results) {
-              lastBlock.setValue(addSize(rpcCall, r, lastBlock.getValue()));
+              lastBlock = addSize(rpcCall, r, lastBlock);

Review Comment:
   I mean here, we still call addSize with lastBlock, and in addSize we have 
some logic for limiting.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java:
##########
@@ -3398,7 +3398,6 @@ private void scan(HBaseRpcController controller, 
ScanRequest request, RegionScan
             }
             boolean mayHaveMoreCellsInRow = 
scannerContext.mayHaveMoreCellsInRow();
             Result r = Result.create(values, null, stale, 
mayHaveMoreCellsInRow);
-            lastBlock.setValue(addSize(rpcCall, r, lastBlock.getValue()));

Review Comment:
   Then could we just remove the lastBlock? Seems we still have it declared in 
the caller method.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java:
##########
@@ -608,6 +610,11 @@ private boolean nextInternal(List<Cell> results, 
ScannerContext scannerContext)
           return 
scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
         }
         if (!shouldStop) {
+          // Read nothing as the cells were filtered, but still need to check 
time limit.

Review Comment:
   This method is on the critial path of reading, and the code here will 
executed every time when we get a row, so it may affect scan performance if we 
add more checks here.
   I just mean is it a must to have a check here? Why we do not need to check 
here in the past...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [hbase] Apache9 commented on a diff in pull request #4967: HBASE-27558 Scan quotas and limits should account for total block IO

Reply via email to