[GitHub] [hadoop] steveloughran commented on a change in pull request #2154: HADOOP-17113. Adding ReadAhead Counters in ABFS

GitBox Tue, 21 Jul 2020 03:19:43 -0700


steveloughran commented on a change in pull request #2154:
URL: https://github.com/apache/hadoop/pull/2154#discussion_r457991108




##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java
##########
@@ -285,6 +292,96 @@ public void testWithNullStreamStatistics() throws IOException {
     }
   }
 
+  /**
+   * Testing readAhead counters in AbfsInputStream with 30 seconds timeout.
+   */
+  @Test(timeout = TIMEOUT_30_SECONDS)
+  public void testReadAheadCounters() throws IOException {
+    describe("Test to check correct values for readAhead counters in "
+        + "AbfsInputStream");
+
+    AzureBlobFileSystem fs = getFileSystem();
+    AzureBlobFileSystemStore abfss = fs.getAbfsStore();
+    Path readAheadCountersPath = path(getMethodName());
+
+    /*
+     * Setting the block size for readAhead as 4KB.
+     */
+    abfss.getAbfsConfiguration().setReadBufferSize(CUSTOM_BLOCK_BUFFER_SIZE);
+
+    AbfsOutputStream out = null;
+    AbfsInputStream in = null;
+
+    try {
+
+      /*
+       * Creating a file of 1MB size.
+       */
+      out = createAbfsOutputStreamWithFlushEnabled(fs, readAheadCountersPath);
+      out.write(defBuffer);
+      out.close();
+
+      in = abfss.openFileForRead(readAheadCountersPath, fs.getFsStatistics());
+
+      /*
+       * Reading 1KB after each i * KB positions. Hence the reads are from 0
+       * to 1KB, 1KB to 2KB, and so on.. for 5 operations.
+       */
+      for (int i = 0; i < 5; i++) {
+        in.seek(ONE_KB * i);
+        in.read(defBuffer, ONE_KB * i, ONE_KB);
+      }
+      AbfsInputStreamStatisticsImpl stats =
+          (AbfsInputStreamStatisticsImpl) in.getStreamStatistics();
+
+      /*
+       * Since, readAhead is done in background threads. Sometimes, the
+       * threads aren't finished in the background and could result in
+       * inaccurate results. So, we wait till we have the accurate values
+       * with a limit of 30 seconds as that's when the test times out.
+       *
+       */
+      while (stats.getRemoteBytesRead() < CUSTOM_READ_AHEAD_BUFFER_SIZE
+          || stats.getReadAheadBytesRead() < CUSTOM_BLOCK_BUFFER_SIZE) {
+        Thread.sleep(THREAD_SLEEP_10_SECONDS);
+      }
+
+      /*
+       * Verifying the counter values of readAheadBytesRead and remoteBytesRead.
+       *
+       * readAheadBytesRead : Since, we read 1KBs 5 times, that means we go
+       * from 0 to 5KB in the file. The bufferSize is set to 4KB, and since
+       * we have 8 blocks of readAhead buffer. We would have 8 blocks of 4KB
+       * buffer. Our read is till 5KB, hence readAhead would ideally read 2
+       * blocks of 4KB which is equal to 8KB. But, sometimes to get more than
+       * one block from readAhead buffer we might have to wait for background
+       * threads to fill the buffer and hence we might do remote read which
+       * would be faster. Therefore, readAheadBytesRead would be equal to or
+       * greater than 4KB.
+       *
+       * remoteBytesRead : Since, the bufferSize is set to 4KB and the number
+       * of blocks or readAheadQueueDepth is equal to 8. We would read 8 * 4
+       * KB buffer on the first read, which is equal to 32KB. But, if we are not
+       * able to read some bytes that were in the buffer after doing
+       * readAhead, we might use remote read again. Thus, the bytes read
+       * remotely could also be greater than 32Kb.
+       *
+       */
+      Assertions.assertThat(stats.getReadAheadBytesRead()).describedAs(
+          "Mismatch in readAheadBytesRead counter value")
+          .isGreaterThanOrEqualTo(CUSTOM_BLOCK_BUFFER_SIZE);
+
+      Assertions.assertThat(stats.getRemoteBytesRead()).describedAs(
+          "Mismatch in remoteBytesRead counter value")
+          .isGreaterThanOrEqualTo(CUSTOM_READ_AHEAD_BUFFER_SIZE);
+
+    } catch (InterruptedException e) {
+      e.printStackTrace();

Review comment:
       Can't we just throw this, i.e. declare InterruptedException on the test method? If not, at least use LOG rather than printStackTrace().
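
To illustrate the two options, here is a minimal standalone sketch, not the actual patch. The class and method names (`InterruptHandlingSketch`, `awaitCondition`, `awaitConditionLogging`) are hypothetical, and it uses `java.util.logging` only to stay self-contained; the Hadoop test would presumably use its existing SLF4J `LOG` instead.

```java
import java.util.function.BooleanSupplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; none of these names come from the PR.
final class InterruptHandlingSketch {

  private static final Logger LOG =
      Logger.getLogger(InterruptHandlingSketch.class.getName());

  /**
   * Option 1: do not catch at all. The InterruptedException propagates and
   * the caller (for example a JUnit test method) declares it, so the test
   * fails visibly instead of swallowing the interrupt.
   */
  public static void awaitCondition(BooleanSupplier done, long pollMillis)
      throws InterruptedException {
    while (!done.getAsBoolean()) {
      Thread.sleep(pollMillis);
    }
  }

  /**
   * Option 2: if the exception really has to be caught, log it and restore
   * the interrupt flag rather than calling printStackTrace().
   *
   * @return true if the condition was met, false if interrupted first.
   */
  public static boolean awaitConditionLogging(BooleanSupplier done,
      long pollMillis) {
    try {
      awaitCondition(done, pollMillis);
      return true;
    } catch (InterruptedException e) {
      LOG.log(Level.WARNING, "Interrupted while waiting for condition", e);
      // Sleep cleared the flag when it threw; set it again for callers.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    long deadline = System.currentTimeMillis() + 50;
    // Option 1 in use: main simply declares the exception.
    awaitCondition(() -> System.currentTimeMillis() >= deadline, 10);
    System.out.println("condition met");
  }
}
```

In the test under review, option 1 would amount to widening the signature (for example `throws IOException, InterruptedException`) and dropping the catch block; the `@Test(timeout = ...)` already bounds the wait.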




