ahmarsuhail opened a new pull request, #4458: URL: https://github.com/apache/hadoop/pull/4458
### Description of PR

This PR adds iostats for the prefetching input stream.

This PR depends on https://github.com/apache/hadoop/pull/4386, which fixes issues after the rebase. Once that is merged, this PR can also update `ITestS3PrefetchingInputStream` with assertions on more stats.

The following stats are added:

**Counters:**

- `STREAM_READ_PREFETCH_OPERATIONS`: Total number of prefetch operations requested.

**Duration:**

- `STREAM_READ_REMOTE_BLOCK_READ`: Time taken to read a full block of data from S3.
- `STREAM_READ_BLOCK_ACQUIRE_AND_READ`: Time taken to acquire a buffer from the pool and read data into it. This is not recorded for prefetching operations, only when the first block of data is read (on the first read or after a seek).
- `ACTION_EXECUTOR_ACQUIRED`: Time taken to acquire an executor, either by the prefetching task or by the caching task.

**Gauges:**

- `STREAM_READ_BLOCKS_IN_FILE_CACHE`: Blocks currently in the file cache.
- `STREAM_READ_ACTIVE_PREFETCH_OPERATIONS`: Currently active prefetch operations.
- `STREAM_READ_ACTIVE_MEMORY_IN_USE`: Active memory currently in use.

There are still more stats that can be added, especially around cache usage. I'll follow up with another PR for those if we're happy with the approach taken here.

### How was this patch tested?

Tested in eu-west-1 by running `mvn -Dparallel-tests -DtestsThreadCount=16 clean verify`
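
To illustrate how the three kinds of stats listed above differ, here is a minimal standalone sketch. It is not the code in this PR and does not use the Hadoop IOStatistics API: `PrefetchStatsSketch` and all of its methods are hypothetical, and the string keys simply reuse the constant names from the description (the actual string values in Hadoop may differ). The idea is that a counter only ever goes up, a gauge goes up when an operation starts and back down when it finishes, and a duration accumulates the elapsed time of the operation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a statistics store; NOT the Hadoop IOStatistics API. */
final class PrefetchStatsSketch {
  // Keys reuse the constant names from the PR description (assumed, for illustration only).
  static final String PREFETCH_OPS = "STREAM_READ_PREFETCH_OPERATIONS";
  static final String ACTIVE_PREFETCH_OPS = "STREAM_READ_ACTIVE_PREFETCH_OPERATIONS";
  static final String REMOTE_BLOCK_READ = "STREAM_READ_REMOTE_BLOCK_READ";

  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();
  private final Map<String, AtomicLong> gauges = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> durationNanos = new ConcurrentHashMap<>();

  /** Counter: monotonically increasing total. */
  void incrementCounter(String key) {
    counters.computeIfAbsent(key, k -> new LongAdder()).increment();
  }

  /** Gauge: current value, adjusted up or down by delta. */
  long addToGauge(String key, long delta) {
    return gauges.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
  }

  /** Duration: accumulate elapsed nanoseconds for an operation. */
  void addDuration(String key, long nanos) {
    durationNanos.computeIfAbsent(key, k -> new LongAdder()).add(nanos);
  }

  long counter(String key) {
    LongAdder c = counters.get(key);
    return c == null ? 0 : c.sum();
  }

  long gauge(String key) {
    AtomicLong g = gauges.get(key);
    return g == null ? 0 : g.get();
  }

  long totalDurationNanos(String key) {
    LongAdder d = durationNanos.get(key);
    return d == null ? 0 : d.sum();
  }

  /** Simulates one prefetch: count it, mark it active, and time the block read. */
  void prefetch(Runnable readBlock) {
    incrementCounter(PREFETCH_OPS);
    addToGauge(ACTIVE_PREFETCH_OPS, 1);
    long start = System.nanoTime();
    try {
      readBlock.run();
    } finally {
      // Record the duration and release the gauge even if the read fails.
      addDuration(REMOTE_BLOCK_READ, System.nanoTime() - start);
      addToGauge(ACTIVE_PREFETCH_OPS, -1);
    }
  }

  public static void main(String[] args) {
    PrefetchStatsSketch stats = new PrefetchStatsSketch();
    stats.prefetch(() -> { /* pretend to read a block */ });
    stats.prefetch(() -> { /* pretend to read another block */ });
    System.out.println("prefetch ops requested: " + stats.counter(PREFETCH_OPS));      // 2
    System.out.println("active prefetch ops:    " + stats.gauge(ACTIVE_PREFETCH_OPS)); // 0
    System.out.println("block read nanos:       " + stats.totalDurationNanos(REMOTE_BLOCK_READ));
  }
}
```

In this sketch the gauge is decremented in a `finally` block so that a failed read does not leave the active-operations value permanently inflated; whether the PR handles failures the same way is not stated in the description.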
