[
https://issues.apache.org/jira/browse/HADOOP-17937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571832#comment-17571832
]
Steve Loughran commented on HADOOP-17937:
-----------------------------------------
Seen this again. As noted, we should postpone buffer allocation until the
first write.
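The fix being proposed can be sketched as follows. This is a minimal, hedged illustration of on-demand direct-buffer allocation, not the actual S3ABlockOutputStream code; the class name `LazyByteBufferBlock` and its methods are hypothetical:

```java
import java.nio.ByteBuffer;

/**
 * Sketch of a data block that defers its direct-buffer allocation until
 * the first write(). Zero-byte files then never touch off-heap memory,
 * avoiding the 32 MB-per-block cost seen in the OOM below.
 * Illustrative only; not the real S3A block implementation.
 */
class LazyByteBufferBlock {
  private final int blockSize;
  private ByteBuffer buffer;  // stays null until data actually arrives

  LazyByteBufferBlock(int blockSize) {
    this.blockSize = blockSize;
  }

  /** Allocate the off-heap buffer lazily, on the first write. */
  void write(byte[] data, int off, int len) {
    if (buffer == null) {
      buffer = ByteBuffer.allocateDirect(blockSize);
    }
    buffer.put(data, off, len);
  }

  /** True once the direct buffer has been allocated. */
  boolean isAllocated() {
    return buffer != null;
  }

  /** Bytes buffered so far; an empty file reports zero with no allocation. */
  int dataSize() {
    return buffer == null ? 0 : buffer.position();
  }
}
```

With this shape, creating and closing a block without writing to it costs no direct memory; the same lazy pattern would apply to the byte-array and disk-backed variants.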
> ITestS3ADeleteFilesOneByOne.testBulkRenameAndDelete OOM: Direct buffer memory
> ------------------------------------------------------------------------------
>
> Key: HADOOP-17937
> URL: https://issues.apache.org/jira/browse/HADOOP-17937
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3, test
> Environment: fs.s3a.fast.upload.buffer = "bytebuffer"
> Reporter: Steve Loughran
> Priority: Minor
>
> On a test setup with bytebuffer, the parallel zero-byte file create phase
> OOMed.
> fs.s3a.fast.upload.buffer = "bytebuffer" [core-site.xml]
> fs.s3a.fast.upload.active.blocks = "8" [core-site.xml]
> fs.s3a.multipart.size = "32M" [core-site.xml]
> Root cause: the bytebuffer is allocated on block creation, so every empty
> file took up 32 MB of off-heap storage, only for it to be released unused
> in close().
> If this allocation were postponed until the first write(), empty files
> wouldn't need any memory allocation. Doing the same on-demand creation for
> the byte-array and filesystem buffers would also have benefits.
> This has implications for HADOOP-17195, which has abfs using a fork of the
> buffering code; changing the code there to allocate on demand would be a
> good incentive for s3a to adopt it.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)