[ 
https://issues.apache.org/jira/browse/HADOOP-18246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630739#comment-17630739
 ] 

ASF GitHub Bot commented on HADOOP-18246:
-----------------------------------------

ahmarsuhail commented on code in PR #5120:
URL: https://github.com/apache/hadoop/pull/5120#discussion_r1017404868


##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -43,6 +43,12 @@ Multiple blocks may be read in parallel.
 |`fs.s3a.prefetch.block.size`    |Size of a block    |`8M`    |
 |`fs.s3a.prefetch.block.count`    |Number of blocks to prefetch    |`8`    |
 
+Although, default size of the block for prefetching the input stream is 8 MB, 
+minimum size allowed to set is 1 byte for a block.
+User should set the block size with the understanding that smaller block sizes increases the number of blocks.
+Thus, smaller block size affects the performance by increasing the overhead for reading and prefetching
+each block.

Review Comment:
   ```suggestion
   The default size of a block is 8MB, and the minimum allowed block size is 1 byte.
   Decreasing block size will increase the number of blocks to be read for a file.
   A smaller block size may negatively impact performance as the number of prefetches required will increase.
   ```
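
To make the trade-off in the suggested wording concrete, here is a minimal sketch of the arithmetic. It assumes a file is covered by fixed-size blocks with a possibly shorter final block. The class and method names are hypothetical and are not part of hadoop-aws.

```java
// Hypothetical helper for illustration only; not part of hadoop-aws.
final class BlockCountSketch {

    /** Number of fixed-size blocks needed to cover fileSize bytes (ceiling division). */
    static long blockCount(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long fileSize = 64L * 1024 * 1024;   // example file size: 64 MB
        long[] blockSizes = {
            8L * 1024 * 1024,                // the documented default, 8 MB
            1L * 1024 * 1024,                // 1 MB
            1L                               // the new minimum, 1 byte
        };
        for (long blockSize : blockSizes) {
            System.out.printf("block size %d bytes -> %d blocks%n",
                blockSize, blockCount(fileSize, blockSize));
        }
    }
}
```

For a 64 MB file this gives 8 blocks at the 8 MB default, 64 blocks at 1 MB, and 67108864 blocks at the 1-byte minimum, which is why the docs should warn about very small values.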





> Remove lower limit on s3a prefetching/caching block size
> --------------------------------------------------------
>
>                 Key: HADOOP-18246
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18246
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: 3.4.0
>            Reporter: Daniel Carl Jones
>            Assignee: Ankit Saurabh
>            Priority: Minor
>              Labels: pull-request-available
>
> The minimum allowed block size currently is {{PREFETCH_BLOCK_DEFAULT_SIZE}} (8MB).
> {code:java}
> this.prefetchBlockSize = intOption(
>     conf, PREFETCH_BLOCK_SIZE_KEY, PREFETCH_BLOCK_DEFAULT_SIZE, PREFETCH_BLOCK_DEFAULT_SIZE);
> {code}
> [https://github.com/apache/hadoop/blob/3aa03e0eb95bbcb066144706e06509f0e0549196/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L487-L488]
> Why is this the case and should we lower or remove it?
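
The behaviour described in the issue can be sketched as follows, assuming the fourth argument of `intOption` acts as a lower bound, as the description implies. The `intOption` below is a hypothetical stand-in that reads from a plain `Map`; it is not the Hadoop method, and the class name is invented for illustration.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the Hadoop implementation.
final class MinimumOptionSketch {
    static final String PREFETCH_BLOCK_SIZE_KEY = "fs.s3a.prefetch.block.size";
    static final int PREFETCH_BLOCK_DEFAULT_SIZE = 8 * 1024 * 1024; // 8 MB

    /** Reads an int option, falling back to defVal and rejecting values below min. */
    static int intOption(Map<String, String> conf, String key, int defVal, int min) {
        String raw = conf.get(key);
        int value = (raw == null) ? defVal : Integer.parseInt(raw.trim());
        if (value < min) {
            throw new IllegalArgumentException(
                "Value of " + key + ": " + value + " is below the minimum value " + min);
        }
        return value;
    }

    public static void main(String[] args) {
        // A user asks for 4 MB blocks.
        Map<String, String> conf =
            Collections.singletonMap(PREFETCH_BLOCK_SIZE_KEY, "4194304");

        // Passing the default as the minimum rejects anything smaller than 8 MB.
        try {
            intOption(conf, PREFETCH_BLOCK_SIZE_KEY,
                PREFETCH_BLOCK_DEFAULT_SIZE, PREFETCH_BLOCK_DEFAULT_SIZE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // With a minimum of 1 byte, the same setting is accepted.
        int accepted = intOption(conf, PREFETCH_BLOCK_SIZE_KEY,
            PREFETCH_BLOCK_DEFAULT_SIZE, 1);
        System.out.println("accepted: " + accepted);
    }
}
```

Under that assumption, using the default as the minimum means the option can only be raised, never lowered, which is the restriction the issue questions.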



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
