fuatbasik commented on code in PR #7763:
URL: https://github.com/apache/hadoop/pull/7763#discussion_r2182648632


##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AAnalyticsAcceleratorStreamReading.java:
##########
@@ -194,4 +223,96 @@ public void testInvalidConfigurationThrows() throws Exception {
         () -> S3SeekableInputStreamConfiguration.fromConfiguration(connectorConfiguration));
   }
 
+  /**
+   * TXT files are classified as SEQUENTIAL format and use SequentialPrefetcher (which requests the entire 10MB file).
+   * RangeOptimiser splits ranges larger than maxRangeSizeBytes (8MB) using partSizeBytes (8MB).
+   * The 10MB range gets split into [0-8MB) and [8MB-10MB).
+   * Each split range becomes a separate Block, resulting in 2 GET requests.
+   */
+  @Test
+  public void testLargeFileMultipleGets() throws Throwable {
+    describe("Large file should trigger multiple GET requests");
+
+    Path dest = path("large-test-file.txt");
+    byte[] data = dataset(10 * S_1M, 256, 255);
+    writeDataset(getFileSystem(), dest, data, 10 * S_1M, 1024, true);
+
+    byte[] buffer = new byte[S_1M * 10];
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      IOStatistics ioStats = inputStream.getIOStatistics();
+      inputStream.readFully(buffer);
+
+      verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_GET_REQUESTS, 2);
+      // Because S3A passes in the metadata (content length) on file open, we expect AAL to make no HEAD requests
+      verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_HEAD_REQUESTS, 0);
+    }
+  }
+
+  @Test
+  public void testSmallFileSingleGet() throws Throwable {
+    describe("Small file should trigger only one GET request");
+
+    Path dest = path("small-test-file.txt");
+    byte[] data = dataset(S_1M, 256, 255);
+    writeDataset(getFileSystem(), dest, data, S_1M, 1024, true);
+
+    byte[] buffer = new byte[S_1M];
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      IOStatistics ioStats = inputStream.getIOStatistics();
+      inputStream.readFully(buffer);
+
+      verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_GET_REQUESTS, 1);
+      // Because S3A passes in the metadata (content length) on file open, we expect AAL to make no HEAD requests
+      verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_HEAD_REQUESTS, 0);
+    }
+  }
+
+
+  @Test
+  public void testRandomSeekPatternGets() throws Throwable {
+    describe("Random seek pattern should optimize GET requests");
+
+    Path dest = path("seek-test.txt");
+    byte[] data = dataset(5 * S_1M, 256, 255);
+    writeDataset(getFileSystem(), dest, data, 5 * S_1M, 1024, true);
+
+    byte[] buffer = new byte[S_1M];
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      IOStatistics ioStats = inputStream.getIOStatistics();
+
+      inputStream.read(buffer);
+      inputStream.seek(2 * S_1M);
+      inputStream.read(new byte[512 * S_1K]);
+      inputStream.seek(3 * S_1M);
+      inputStream.read(new byte[512 * S_1K]);
+
+      verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_GET_REQUESTS, 1);
+      verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_HEAD_REQUESTS, 0);
+    }
+  }
+
+
+  @Test
+  public void testSequentialStreamsNoDuplicateGets() throws Throwable {

Review Comment:
   we probably want another test where we open a stream and close it without reading from it. If that's a Parquet file or a small object, there will still be some reads on AAL.
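   
   Something like the sketch below could be a starting point (the test name is just a placeholder). It reuses the helpers and statistic names from the diff above (describe, path, dataset, writeDataset, verifyStatisticCounterValue, S_1M, STREAM_READ_ANALYTICS_*); the GET count is deliberately not asserted because, as noted, AAL may read eagerly for Parquet or small objects, so the expected values would need to be confirmed against actual AAL behaviour:
   
     @Test
     public void testOpenCloseWithoutRead() throws Throwable {
       describe("Opening and closing a stream without reading from it");
   
       Path dest = path("open-close-test.txt");
       byte[] data = dataset(S_1M, 256, 255);
       writeDataset(getFileSystem(), dest, data, S_1M, 1024, true);
   
       IOStatistics ioStats;
       try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
         ioStats = inputStream.getIOStatistics();
         // no reads: the stream is opened and closed immediately
       }
   
       // S3A passes the content length in on open, so no HEAD is expected even without a read
       verifyStatisticCounterValue(ioStats, STREAM_READ_ANALYTICS_HEAD_REQUESTS, 0);
       // GET count is format/size dependent (e.g. Parquet footer prefetch), so it is
       // not asserted here; inspect ioStats to decide what the right expectation is.
     }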



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


