haridsv commented on code in PR #7136:
URL: https://github.com/apache/hbase/pull/7136#discussion_r2184703931
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:
##########
@@ -1754,6 +1758,9 @@ protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
headerBuf = HEAP.allocate(hdrSize);
readAtOffset(is, headerBuf, hdrSize, false, offset, pread);
headerBuf.rewind();
+ if (isScanMetricsEnabled) {
+ ThreadLocalServerSideScanMetrics.addBytesReadFromFs(hdrSize);
+ }
Review Comment:
Per the comment above, this block should execute very rarely, and when it does
it ends up being costly. I wonder if we should log a warning and monitor for
such occurrences. This is of course unrelated to the current PR, but starting a
discussion anyway.
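If we go the warning route, something along these lines could work. This is purely an illustrative sketch, not existing HBase code: the class, `recordSlowHeaderRead`, and `WARN_EVERY` are made-up names, and a real implementation would use `LOG.warn` rather than stdout. The idea is to rate-limit the warning so the rare-but-costly path cannot flood the log:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: counts occurrences of the rare slow path and emits a
// warning on the 1st, 101st, 201st, ... occurrence so the log is not flooded.
public class SlowPathMonitorSketch {
    private static final AtomicLong slowHeaderReads = new AtomicLong();
    private static final long WARN_EVERY = 100;

    static boolean recordSlowHeaderRead() {
        long n = slowHeaderReads.incrementAndGet();
        boolean shouldWarn = n % WARN_EVERY == 1;
        if (shouldWarn) {
            // In HBase this would be LOG.warn(...) instead of stdout.
            System.out.println("WARN: separate header read from FS, occurrence #" + n);
        }
        return shouldWarn;
    }

    public static void main(String[] args) {
        System.out.println(recordSlowHeaderRead()); // first occurrence warns
        System.out.println(recordSlowHeaderRead()); // second does not
    }
}
```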
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SegmentScanner.java:
##########
@@ -336,22 +337,40 @@ private Segment getSegment() {
*/
protected void updateCurrent() {
ExtendedCell next = null;
+ boolean isScanMetricsEnabled = ThreadLocalServerSideScanMetrics.isScanMetricsEnabled();
+ int totalBytesRead = 0;
try {
while (iter.hasNext()) {
next = iter.next();
+ if (isScanMetricsEnabled) {
+ // Batch collect bytes to reduce method call overhead
+ totalBytesRead += Segment.getCellLength(next);
+ }
if (next.getSequenceId() <= this.readPoint) {
current = next;
+ // Add accumulated bytes before returning
+ if (isScanMetricsEnabled && totalBytesRead > 0) {
+ ThreadLocalServerSideScanMetrics.addBytesReadFromMemstore(totalBytesRead);
+ }
return;// skip irrelevant versions
}
// for backwardSeek() stay in the boundaries of a single row
if (stopSkippingKVsIfNextRow && segment.compareRows(next, stopSkippingKVsRow) > 0) {
current = null;
+ // Add accumulated bytes before returning
+ if (isScanMetricsEnabled && totalBytesRead > 0) {
+ ThreadLocalServerSideScanMetrics.addBytesReadFromMemstore(totalBytesRead);
+ }
return;
}
} // end of while
current = null; // nothing found
+ // Add accumulated bytes at the end
+ if (isScanMetricsEnabled && totalBytesRead > 0) {
+ ThreadLocalServerSideScanMetrics.addBytesReadFromMemstore(totalBytesRead);
+ }
Review Comment:
The loop continues and may accumulate more, so adding here seems incorrect,
especially since we are not resetting `totalBytesRead`; besides, it will get
added again anyway when the loop ends. Am I missing something?
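To make the concern concrete, here is a standalone toy (not the SegmentScanner code; `flushWithoutReset` is a made-up name) showing what happens when an accumulator is flushed mid-loop without being reset:

```java
// Illustration only: if an accumulator is flushed into a metric without being
// reset, the already-flushed portion is flushed again on every later flush.
public class DoubleCountSketch {
    // Flushes `total` into the metric on each iteration and once more at the
    // end, never resetting it, so earlier reads are counted multiple times.
    static int flushWithoutReset(int[] reads) {
        int metric = 0;
        int total = 0;
        for (int r : reads) {
            total += r;
            metric += total; // flush without reset
        }
        metric += total; // final flush repeats everything already flushed
        return metric;
    }

    public static void main(String[] args) {
        // Only 30 bytes were read, but the metric reports 70 (10 + 30 + 30).
        System.out.println(flushWithoutReset(new int[] { 10, 20 }));
    }
}
```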
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java:
##########
@@ -1114,6 +1115,15 @@ private HFileBlock getCachedBlock(BlockCacheKey cacheKey, boolean cacheBlock, bo
compressedBlock.release();
}
}
+ boolean isScanMetricsEnabled = ThreadLocalServerSideScanMetrics.isScanMetricsEnabled();
+ if (isScanMetricsEnabled) {
+ int cachedBlockBytesRead = cachedBlock.getOnDiskSizeWithHeader();
+ // Account for the header size of the next block if it exists
+ if (cachedBlock.getNextBlockOnDiskSize() > 0) {
+ cachedBlockBytesRead += cachedBlock.headerSize();
+ }
+ ThreadLocalServerSideScanMetrics.addBytesReadFromBlockCache(cachedBlockBytesRead);
+ }
Review Comment:
A nit observation: the block may get evicted if its encoding doesn't match the
expected encoding, which I think will happen when the encoding format has
changed since the blocks were cached. The block then gets read from disk
anyway, so we could avoid counting these blocks here.
##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/metrics/ThreadLocalServerSideScanMetrics.java:
##########
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client.metrics;
+
+import java.util.concurrent.atomic.AtomicInteger;
+import org.apache.yetus.audience.InterfaceAudience;
+
[email protected]
+public final class ThreadLocalServerSideScanMetrics {
+ private ThreadLocalServerSideScanMetrics() {
+ }
+
+ private static final ThreadLocal<Boolean> isScanMetricsEnabled = new ThreadLocal<>() {
+ @Override
+ protected Boolean initialValue() {
+ return false;
+ }
+ };
+
+ private static final ThreadLocal<AtomicInteger> bytesReadFromFs = new ThreadLocal<>() {
+ @Override
+ protected AtomicInteger initialValue() {
+ return new AtomicInteger(0);
+ }
+ };
+
+ private static final ThreadLocal<AtomicInteger> bytesReadFromBlockCache = new ThreadLocal<>() {
+ @Override
+ protected AtomicInteger initialValue() {
+ return new AtomicInteger(0);
+ }
+ };
+
+ private static final ThreadLocal<AtomicInteger> bytesReadFromMemstore = new ThreadLocal<>() {
+ @Override
+ protected AtomicInteger initialValue() {
+ return new AtomicInteger(0);
+ }
+ };
+
+ public static final void setScanMetricsEnabled(boolean enable) {
+ isScanMetricsEnabled.set(enable);
+ }
+
+ public static final int addBytesReadFromFs(int bytes) {
+ return bytesReadFromFs.get().addAndGet(bytes);
+ }
+
+ public static final int addBytesReadFromBlockCache(int bytes) {
+ return bytesReadFromBlockCache.get().addAndGet(bytes);
+ }
+
+ public static final int addBytesReadFromMemstore(int bytes) {
+ return bytesReadFromMemstore.get().addAndGet(bytes);
+ }
+
+ public static final boolean isScanMetricsEnabled() {
+ return isScanMetricsEnabled.get();
+ }
+
+ public static final AtomicInteger getBytesReadFromFsCounter() {
+ return bytesReadFromFs.get();
+ }
+
+ public static final AtomicInteger getBytesReadFromBlockCacheCounter() {
+ return bytesReadFromBlockCache.get();
+ }
+
+ public static final AtomicInteger getBytesReadFromMemstoreCounter() {
+ return bytesReadFromMemstore.get();
+ }
+
+ public static final int getBytesReadFromFsAndReset() {
+ return getBytesReadFromFsCounter().getAndSet(0);
+ }
+
+ public static final int getBytesReadFromBlockCacheAndReset() {
+ return getBytesReadFromBlockCacheCounter().getAndSet(0);
+ }
+
+ public static final int getBytesReadFromMemstoreAndReset() {
+ return getBytesReadFromMemstoreCounter().getAndSet(0);
+ }
+
+ public static final void reset() {
+ getBytesReadFromFsAndReset();
+ getBytesReadFromBlockCacheAndReset();
+ getBytesReadFromMemstoreAndReset();
Review Comment:
Are the threads used here coming from a pool? Do those threads also get used
for other purposes? Just wondering whether calling `remove()` would be a better
option, in case the thread-locals can lie around unused.
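For reference, a minimal standalone sketch (not the HBase class) of the difference: after `remove()`, the next `get()` re-runs the initial-value supplier, so a pooled thread does not keep the old counter object alive between unrelated tasks:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal illustration of ThreadLocal.remove(): it drops this thread's entry,
// so the stale counter becomes collectable and the next get() starts fresh.
public class ThreadLocalRemoveSketch {
    static final ThreadLocal<AtomicInteger> bytesRead =
        ThreadLocal.withInitial(() -> new AtomicInteger(0));

    public static void main(String[] args) {
        bytesRead.get().addAndGet(128);
        AtomicInteger stale = bytesRead.get();
        bytesRead.remove();                    // drop the entry from this thread's map
        AtomicInteger fresh = bytesRead.get(); // withInitial supplier runs again
        System.out.println(stale != fresh);    // true: old counter no longer referenced
        System.out.println(fresh.get());       // 0
    }
}
```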
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/ParallelSeekHandler.java:
##########
@@ -46,12 +51,25 @@ public ParallelSeekHandler(KeyValueScanner scanner, ExtendedCell keyValue, long
this.keyValue = keyValue;
this.readPoint = readPoint;
this.latch = latch;
+ this.isScanMetricsEnabled = ThreadLocalServerSideScanMetrics.isScanMetricsEnabled();
+ this.bytesReadFromFs = ThreadLocalServerSideScanMetrics.getBytesReadFromFsCounter();
+ this.bytesReadFromBlockCache =
+ ThreadLocalServerSideScanMetrics.getBytesReadFromBlockCacheCounter();
}
@Override
public void process() {
try {
+ ThreadLocalServerSideScanMetrics.setScanMetricsEnabled(isScanMetricsEnabled);
+ if (isScanMetricsEnabled) {
+ ThreadLocalServerSideScanMetrics.reset();
+ }
scanner.seek(keyValue);
Review Comment:
Exactly what read activity does this incur?
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java:
##########
@@ -145,7 +150,18 @@ private static boolean hasNonce(HRegion region, long nonce) {
} finally {
region.smallestReadPointCalcLock.unlock(ReadPointCalculationLock.LockType.RECORDING_LOCK);
}
+ boolean isScanMetricsEnabled = scan.isScanMetricsEnabled();
+ ThreadLocalServerSideScanMetrics.setScanMetricsEnabled(isScanMetricsEnabled);
+ if (isScanMetricsEnabled) {
+ ThreadLocalServerSideScanMetrics.reset();
+ }
initializeScanners(scan, additionalScanners);
+ if (isScanMetricsEnabled) {
+ bytesReadFromFs += ThreadLocalServerSideScanMetrics.getBytesReadFromFsAndReset();
Review Comment:
This corresponds to bytes read for the trailer and metadata, which only happens
when the file is read for the first time or when the cached information is
evicted, correct? That makes me wonder whether this should be tracked
separately instead of being added to `bytesReadFromFs`, since it is typically a
one-time cost. However, tracking it separately could also give insight into the
occasional latency spikes that result from these cache evictions.
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SegmentScanner.java:
##########
@@ -336,22 +339,39 @@ private Segment getSegment() {
*/
protected void updateCurrent() {
ExtendedCell next = null;
+ int totalBytesRead = 0;
try {
while (iter.hasNext()) {
next = iter.next();
+ if (isScanMetricsEnabled) {
+ // Batch collect bytes to reduce method call overhead
+ totalBytesRead += Segment.getCellLength(next);
+ }
if (next.getSequenceId() <= this.readPoint) {
current = next;
+ // Add accumulated bytes before returning
+ if (isScanMetricsEnabled && totalBytesRead > 0) {
+ ThreadLocalServerSideScanMetrics.addBytesReadFromMemstore(totalBytesRead);
+ }
return;// skip irrelevant versions
}
// for backwardSeek() stay in the boundaries of a single row
if (stopSkippingKVsIfNextRow && segment.compareRows(next, stopSkippingKVsRow) > 0) {
current = null;
+ // Add accumulated bytes before returning
+ if (isScanMetricsEnabled && totalBytesRead > 0) {
+ ThreadLocalServerSideScanMetrics.addBytesReadFromMemstore(totalBytesRead);
+ }
Review Comment:
Can't we move this into a `finally` block and avoid repeating it before both
`return` statements?
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CompoundBloomFilter.java:
##########
@@ -120,7 +120,7 @@ private boolean containsInternal(byte[] key, int keyOffset, int keyLength, ByteB
return result;
}
- private HFileBlock getBloomBlock(int block) {
+ public HFileBlock getBloomBlock(int block) {
Review Comment:
You are making this and a few others public for the sake of testing? You can
add:
```
@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.UNITTEST)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]