[ https://issues.apache.org/jira/browse/HADOOP-18246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630739#comment-17630739 ]
ASF GitHub Bot commented on HADOOP-18246:
-----------------------------------------

ahmarsuhail commented on code in PR #5120:
URL: https://github.com/apache/hadoop/pull/5120#discussion_r1017404868


##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -43,6 +43,12 @@ Multiple blocks may be read in parallel.
 |`fs.s3a.prefetch.block.size` |Size of a block |`8M` |
 |`fs.s3a.prefetch.block.count` |Number of blocks to prefetch |`8` |
 
+Although, default size of the block for prefetching the input stream is 8 MB,
+minimum size allowed to set is 1 byte for a block.
+User should set the block size with the understanding that smaller block sizes increases the number of blocks.
+Thus, smaller block size affects the performance by increasing the overhead for reading and prefetching
+each block.

Review Comment:
   ```suggestion
   The default size of a block is 8MB, and the minimum allowed block size is 1 byte.
   Decreasing block size will increase the number of blocks to be read for a file.
   A smaller block size may negatively impact performance as the number of prefetches required will increase.
   ```


> Remove lower limit on s3a prefetching/caching block size
> --------------------------------------------------------
>
>                 Key: HADOOP-18246
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18246
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: 3.4.0
>            Reporter: Daniel Carl Jones
>            Assignee: Ankit Saurabh
>            Priority: Minor
>              Labels: pull-request-available
>
> The minimum allowed block size currently is {{PREFETCH_BLOCK_DEFAULT_SIZE}}
> (8MB).
> {code:java}
> this.prefetchBlockSize = intOption(
>     conf, PREFETCH_BLOCK_SIZE_KEY,
>     PREFETCH_BLOCK_DEFAULT_SIZE, PREFETCH_BLOCK_DEFAULT_SIZE);{code}
> [https://github.com/apache/hadoop/blob/3aa03e0eb95bbcb066144706e06509f0e0549196/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L487-L488]
> Why is this the case and should we lower or remove it?
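
To make the trade-off concrete, here is a minimal standalone sketch of the minimum-value check and of how block size drives block count. It is not the Hadoop implementation: the class name, the `Map`-based configuration, and the `blockCount` helper are assumptions for illustration, and `intOption` here only mimics the four-argument shape (conf, key, default, minimum) seen in the quoted snippet.

```java
import java.util.HashMap;
import java.util.Map;

/** Standalone illustration; hypothetical names, not Hadoop code. */
public class PrefetchBlockSizeSketch {

    // Stand-ins for the constants named in the issue (values taken from the doc table).
    static final String PREFETCH_BLOCK_SIZE_KEY = "fs.s3a.prefetch.block.size";
    static final int PREFETCH_BLOCK_DEFAULT_SIZE = 8 * 1024 * 1024; // 8 MB

    /** Read an int option, rejecting any value below {@code min}. */
    static int intOption(Map<String, Integer> conf, String key, int defVal, int min) {
        int v = conf.getOrDefault(key, defVal);
        if (v < min) {
            throw new IllegalArgumentException(String.format(
                "Value of %s: %d is below the minimum value %d", key, v, min));
        }
        return v;
    }

    /** Number of blocks needed to cover {@code fileSize} bytes (ceiling division). */
    static long blockCount(long fileSize, int blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put(PREFETCH_BLOCK_SIZE_KEY, 1024 * 1024); // user asks for 1 MB blocks

        // Minimum equal to the default (as in the issue): the 1 MB setting is rejected.
        try {
            intOption(conf, PREFETCH_BLOCK_SIZE_KEY,
                PREFETCH_BLOCK_DEFAULT_SIZE, PREFETCH_BLOCK_DEFAULT_SIZE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // Minimum of 1 byte (as in the reviewed docs): the 1 MB setting is accepted.
        int blockSize = intOption(conf, PREFETCH_BLOCK_SIZE_KEY,
            PREFETCH_BLOCK_DEFAULT_SIZE, 1);

        long fileSize = 64L * 1024 * 1024; // 64 MB file
        System.out.println(blockCount(fileSize, PREFETCH_BLOCK_DEFAULT_SIZE)); // 8
        System.out.println(blockCount(fileSize, blockSize));                   // 64
    }
}
```

Under these assumptions, a 64 MB file needs 8 blocks at the 8 MB default and 64 blocks at 1 MB, which is the increase in prefetches that the suggested wording warns about.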