[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705306#comment-17705306
 ] 

ASF GitHub Bot commented on MAPREDUCE-7432:
-------------------------------------------

steveloughran commented on PR #5378:
URL: https://github.com/apache/hadoop/pull/5378#issuecomment-1484965578

   Not going to merge this just yet; I've been getting complaints about memory 
use in some jobs during commit. I think I will have to merge manifest loading 
with the file commit phase, which isn't done right now.
   
   The problem there is that directories need to be created before the renames 
begin; directory creation needs to be optimised so that it isn't duplicated for 
every task, but doesn't block the renames for too long either. 
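   
   One way to picture the directory problem above is the sketch below. It is 
only an illustration, not the committer's code: the class `DirDedupSketch`, its 
`Manifest` type and the counters are hypothetical names, and the mkdir and 
rename calls are simulated with counters rather than filesystem operations. The 
assumption is that each task manifest is committed as soon as it has been 
loaded, that a shared map holds one future per directory so a directory is 
created at most once, and that a task's renames wait only on that task's own 
directories.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the Hadoop codebase.
class DirDedupSketch {

  /** Stand-in for one task's manifest: target directories plus source-to-dest renames. */
  static final class Manifest {
    final List<String> dirs;
    final List<Map.Entry<String, String>> renames;

    Manifest(List<String> dirs, List<Map.Entry<String, String>> renames) {
      this.dirs = dirs;
      this.renames = renames;
    }
  }

  // One future per directory, shared by every manifest that lists it.
  private final Map<String, CompletableFuture<Void>> created = new ConcurrentHashMap<>();
  private final ExecutorService pool;
  final AtomicInteger mkdirCalls = new AtomicInteger();   // counts simulated mkdirs
  final AtomicInteger renameCalls = new AtomicInteger();  // counts simulated renames

  DirDedupSketch(ExecutorService pool) {
    this.pool = pool;
  }

  /** Schedules creation of a directory once; later callers reuse the same future. */
  private CompletableFuture<Void> ensureDir(String dir) {
    return created.computeIfAbsent(dir,
        d -> CompletableFuture.runAsync(mkdirCalls::incrementAndGet, pool));
  }

  /** Commits one manifest: its renames start only after its own directories exist. */
  CompletableFuture<Void> commit(Manifest m) {
    CompletableFuture<?>[] dirs = m.dirs.stream()
        .map(this::ensureDir)
        .toArray(CompletableFuture[]::new);
    return CompletableFuture.allOf(dirs)
        .thenRunAsync(() -> m.renames.forEach(r -> renameCalls.incrementAndGet()), pool);
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      DirDedupSketch sketch = new DirDedupSketch(pool);
      // Two tasks that both write under out/a.
      Manifest t1 = new Manifest(
          List.of("out/a", "out/b"),
          List.of(Map.entry("tmp/t1/f1", "out/a/f1"), Map.entry("tmp/t1/f2", "out/b/f2")));
      Manifest t2 = new Manifest(
          List.of("out/a"),
          List.of(Map.entry("tmp/t2/f3", "out/a/f3")));
      CompletableFuture.allOf(sketch.commit(t1), sketch.commit(t2)).join();
      System.out.println("mkdirs=" + sketch.mkdirCalls.get()
          + " renames=" + sketch.renameCalls.get());
    } finally {
      pool.shutdown();
    }
  }
}
```

   Running `main` prints `mkdirs=2 renames=3`: `out/a` appears in both 
manifests but is created once. Because nothing holds a global barrier, one 
task's renames are not held up by directories that only other tasks need.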
   
   Will write some scale tests first to see whether the OOMs are coming from 
the committer or from problems with abfs input streams. Null hypothesis: my code.




> Make Manifest Committer the default for abfs and gcs
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-7432
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7432
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 3.3.5
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Switch to the manifest committer as default for abfs and gcs
> * abfs: needed for performance, scale and resilience under some failure modes
> * gcs: provides correctness through atomic task commit and better job commit 
> performance



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

