[ https://issues.apache.org/jira/browse/MAPREDUCE-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813160#comment-17813160 ]
ASF GitHub Bot commented on MAPREDUCE-7470:
-------------------------------------------

steveloughran commented on PR #6469:
URL: https://github.com/apache/hadoop/pull/6469#issuecomment-1921008573

   Like I said on the jira, I don't want this.

   It has the same scale issues encountered on abfs as #[6399](https://github.com/apache/hadoop/pull/6399) and #[6378](https://github.com/apache/hadoop/pull/6378), and the same correctness problems on GCS as v2, as in "incorrect task commit semantics", unless v1 commit can be made to rely not on atomic directory rename but on "atomic file rename", which does work there.

   * which cloud store have you tested this against? Does it actually have the rename semantics needed for v1 task commit?
   * what was the depth/width of the directory structure?
   * did you try a terasort?
   * did you try multiple jobs through spark at the same time? Memory is a problem there: #5728

   Even if the store meets the v1 correctness pre-requisites, I would like to see a comparison with the same job you have tested run through the manifest committer, ideally with any profiling to highlight where it could be improved.


> multi-thread mapreduce committer
> --------------------------------
>
>                 Key: MAPREDUCE-7470
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7470
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: TianyiMa
>            Priority: Major
>              Labels: mapreduce, pull-request-available
>         Attachments: MAPREDUCE-7470.0.patch
>
>
> In cloud environments, such as aws, aliyun etc., the network delay is
> non-trivial when we commit thousands of files.
> In our situation, the ping delay is about 0.03ms in the IDC, but when we
> move to the cloud, the ping delay is about 3ms, which is roughly 100x
> slower. We found that committing tens of thousands of files costs a few
> tens of minutes. The more files there are, the longer it takes.
> So we propose a new committer algorithm, which is a variant of committer
> algorithm version 1, called 3.
> In this new algorithm 3, in order to decrease the commit time, we use a
> thread pool to commit the job's final output.
> Our test results in cloud production show that the new algorithm 3
> decreases the commit time by a factor of several tens.
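
To make the two ideas in this thread concrete (a thread pool for the final commit, and a commit that relies only on atomic file rename rather than atomic directory rename), here is a minimal sketch. It is not the code in MAPREDUCE-7470.0.patch and not the manifest committer. It assumes a local filesystem and uses `java.nio.file` in place of the Hadoop `FileSystem` API, and the class name `ParallelFileCommitSketch` and method `commitFiles` are hypothetical. It says nothing about whether a given cloud store offers the rename semantics asked about above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration only; names and structure are assumptions. */
public class ParallelFileCommitSketch {

    /**
     * Moves every regular file under pendingDir to the same relative path
     * under destDir, one atomic rename per file, on a fixed-size thread pool.
     * Returns the number of files moved.
     */
    public static int commitFiles(Path pendingDir, Path destDir, int threads)
            throws IOException, InterruptedException {
        // List the files first so the walk is finished before renames start.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(pendingDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Path source : files) {
                Path target = destDir.resolve(pendingDir.relativize(source));
                pending.add(pool.submit(() -> {
                    try {
                        Files.createDirectories(target.getParent());
                        // Per-file atomic rename; no directory rename is used.
                        Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }));
            }
            // Wait for every rename. A failure here leaves the destination
            // partially populated; a real committer needs a recovery plan.
            for (Future<?> f : pending) {
                try {
                    f.get();
                } catch (ExecutionException e) {
                    throw new IOException("file commit failed", e.getCause());
                }
            }
            return files.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("commit-sketch");
        Path pendingDir = Files.createDirectories(root.resolve("pending/part"));
        Files.write(pendingDir.resolve("part-00000"), new byte[] {1});
        Files.write(pendingDir.resolve("part-00001"), new byte[] {2});
        Path destDir = Files.createDirectories(root.resolve("dest"));
        int moved = commitFiles(root.resolve("pending"), destDir, 4);
        System.out.println("moved " + moved + " files");
    }
}
```

The sketch issues renames in parallel to overlap per-operation latency, which is the effect the issue description is after. It also holds the full file listing in memory, so it does not address the scale and memory concerns raised in the review.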