steveloughran commented on code in PR #4458:
URL: https://github.com/apache/hadoop/pull/4458#discussion_r923663742


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/StreamStatisticNames.java:
##########
@@ -387,6 +387,46 @@ public final class StreamStatisticNames {
   public static final String BLOCKS_RELEASED
       = "blocks_released";
 
+  /**
+   * Total number of prefetching operations executed.
+   */
+  public static final String STREAM_READ_PREFETCH_OPERATIONS
+      = "stream_read_prefetch_operations";
+
+  /**
+   * Total number of block in disk cache.
+   */
+  public static final String STREAM_READ_BLOCKS_IN_FILE_CACHE
+      = "stream_read_blocks_in_cache";
+
+  /**
+   * Total number of active prefetch operations.
+   */
+  public static final String STREAM_READ_ACTIVE_PREFETCH_OPERATIONS
+      = "stream_read_active_prefetch_operations";
+
+  /**
+   * Total bytes of memory in use by this input stream.
+   */
+  public static final String STREAM_READ_ACTIVE_MEMORY_IN_USE
+      = "stream_read_active_memory_in_use";
+
+  /**
+   * count/duration of reading a remote block.
+   * IO.

Review Comment:
   why the IO here and below?



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/common/CachingBlockManager.java:
##########
@@ -330,6 +341,10 @@ private void readBlock(BufferData data, boolean 
isPrefetch, BufferData.State...
         this.read(buffer, offset, size);
         buffer.flip();
         data.setReady(expectedState);
+
+        if(isPrefetch) {

Review Comment:
   nit, spacing



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/common/BufferPool.java:
##########
@@ -236,11 +244,15 @@ public synchronized void close() {
       }
     }
 
+    int currentPoolSize = this.pool.numCreated();

Review Comment:
   how about you cut the `this.` here and from the existing calls in the same 
method. not our project's style



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/common/CachingBlockManager.java:
##########
@@ -350,16 +365,18 @@ private void readBlock(BufferData data, boolean 
isPrefetch, BufferData.State...
   private static class PrefetchTask implements Supplier<Void> {
     private final BufferData data;
     private final CachingBlockManager blockManager;
+    private final Instant taskQueuedStartTime;
 
-    PrefetchTask(BufferData data, CachingBlockManager blockManager) {
+    PrefetchTask(BufferData data, CachingBlockManager blockManager, Instant 
taskQueuedStartTime) {
       this.data = data;
       this.blockManager = blockManager;
+      this.taskQueuedStartTime = taskQueuedStartTime;
     }
 
     @Override
     public Void get() {
       try {
-        this.blockManager.prefetch(data);
+        this.blockManager.prefetch(data, taskQueuedStartTime);
       } catch (Exception e) {
         LOG.error("error during prefetch", e);

Review Comment:
   i worry about this. seen problems with transient network failures and abfs 
where the logs are flooded with many parallel stack traces before it recovers.



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/common/CachingBlockManager.java:
##########
@@ -330,6 +341,10 @@ private void readBlock(BufferData data, boolean 
isPrefetch, BufferData.State...
         this.read(buffer, offset, size);
         buffer.flip();
         data.setReady(expectedState);
+
+        if(isPrefetch) {
+          this.prefetchingStatistics.prefetchOperationCompleted();

Review Comment:
   what happens if the read failed? is the failure count and duration before 
failure recorded separately from the successful calls?



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/common/BufferPool.java:
##########
@@ -56,26 +58,32 @@ public class BufferPool implements Closeable {
   // Allows associating metadata to each buffer in the pool.
   private Map<BufferData, ByteBuffer> allocated;
 
+  private PrefetchingStatistics prefetchingStatistics;
+
   /**
    * Initializes a new instance of the {@code BufferPool} class.
    *
    * @param size number of buffer in this pool.
    * @param bufferSize size in bytes of each buffer.
+   * @param prefetchingStatistics statistics for this stream.
    *
    * @throws IllegalArgumentException if size is zero or negative.
    * @throws IllegalArgumentException if bufferSize is zero or negative.
    */
-  public BufferPool(int size, int bufferSize) {
+  public BufferPool(int size, int bufferSize, PrefetchingStatistics 
prefetchingStatistics) {

Review Comment:
   we are going to have to rename this class; 
org.apache.hadoop.io.ByteBufferPool has long existed and it will only cause 
confusion. not in this PR, but soon



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to