[ https://issues.apache.org/jira/browse/MAPREDUCE-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705306#comment-17705306 ]
ASF GitHub Bot commented on MAPREDUCE-7432:
-------------------------------------------

steveloughran commented on PR #5378:
URL: https://github.com/apache/hadoop/pull/5378#issuecomment-1484965578

   not going to merge this just yet; been getting complaints about memory use in some jobs during commit.

   I think I will have to merge manifest load with the file commit phase, which isn't done right now. problem there is that directories need to be created before the renames begin; that needs to be optimised to not duplicate dir creation for every task, but not be too blocking either.

   will write some scale tests first to see whether the OOMs are coming from the committer or problems with abfs input streams. null hypothesis: my code

> Make Manifest Committer the default for abfs and gcs
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-7432
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7432
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 3.3.5
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Switch to the manifest committer as default for abfs and gcs
> * abfs: needed for performance, scale and resilience under some failure modes
> * gcs: provides correctness through atomic task commit and better job commit performance

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
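
Editor's sketch of the idea in the comment above (merging manifest load with the file commit phase, while creating each destination directory only once). This is not the manifest committer's code: `TaskManifest`, `RecordingStore`, `commit_streaming` and `load_manifests` are hypothetical names invented for illustration, the store only records calls in memory, and the sketch is single-threaded, so it does not address the "not too blocking" concern. It assumes manifests can be consumed one at a time, so that only the set of already-created directory paths is retained across tasks rather than every task's file list.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Set, Tuple


@dataclass
class TaskManifest:
    """Hypothetical stand-in for one task's manifest."""
    dirs: List[str]                 # destination directories the task needs
    renames: List[Tuple[str, str]]  # (source, destination) file renames


@dataclass
class RecordingStore:
    """Hypothetical in-memory store that only records the calls made to it."""
    created: List[str] = field(default_factory=list)
    renamed: List[Tuple[str, str]] = field(default_factory=list)

    def mkdirs(self, path: str) -> None:
        self.created.append(path)

    def rename(self, src: str, dst: str) -> None:
        self.renamed.append((src, dst))


def commit_streaming(manifests: Iterable[TaskManifest], store: RecordingStore) -> None:
    """Process manifests one at a time instead of loading them all first.

    Only the set of directory paths already created is kept between
    manifests; each manifest's rename list can be discarded once processed.
    """
    seen_dirs: Set[str] = set()
    for manifest in manifests:
        # Directories must exist before this manifest's renames begin,
        # but a directory shared by many tasks is created only once.
        for directory in manifest.dirs:
            if directory not in seen_dirs:
                seen_dirs.add(directory)
                store.mkdirs(directory)
        for src, dst in manifest.renames:
            store.rename(src, dst)


def load_manifests() -> Iterator[TaskManifest]:
    """Hypothetical lazy loader: yields manifests rather than returning a list."""
    yield TaskManifest(["out/a", "out/b"], [("tmp/t1/a/f1", "out/a/f1")])
    yield TaskManifest(["out/a"], [("tmp/t2/a/f2", "out/a/f2")])


store = RecordingStore()
commit_streaming(load_manifests(), store)
print(store.created)
print(store.renamed)
```

In this toy run both tasks list `out/a`, yet `mkdirs` is issued for it once. A real implementation would also need to handle concurrency and failures, which the sketch leaves out.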