[ 
https://issues.apache.org/jira/browse/HADOOP-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872362#comment-16872362
 ] 

Steve Loughran commented on HADOOP-16357:
-----------------------------------------

-1 to the submitted patch: it breaks the unit and integration tests which 
expect the default to be "fail".

Yetus caught this in the unit tests, though it would still have had to make it 
through your manual job submission. [~Prabhu Joseph], sorry to put extra 
homework onto you, but we are equally strict with each other for hadoop-aws and 
hadoop-azure:

https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/testing.html#Policy_for_submitting_patches_which_affect_the_hadoop-aws_module

The PR I've submitted does pass all tests on the last run.

> TeraSort Job failing on S3 DirectoryStagingCommitter: destination path exists
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-16357
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16357
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0, 3.2.0
>            Reporter: Prabhu Joseph
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: MAPREDUCE-7216-001.patch
>
>
> The TeraSort job fails on S3 with the exception below. TeraSort creates the 
> output path and writes the partition file there, but DirectoryStagingCommitter 
> expects the output path not to exist.
> {code}
> 9/06/07 14:13:34 INFO mapreduce.Job: Job job_1559891760159_0011 failed with state FAILED due to: Job setup failed : org.apache.hadoop.fs.PathExistsException: `s3a://bucket/OUTPUT': Setting job as Task committer attempt_1559891760159_0011_m_000000_0: Destination path exists and committer conflict resolution mode is "fail"
>       at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.failDestinationExists(StagingCommitter.java:878)
>       at org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter.setupJob(DirectoryStagingCommitter.java:71)
>       at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:255)
>       at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:235)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
> Creating the partition file in /tmp or some other directory fixes the issue.
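
For readers following the thread, here is a rough sketch of why the quoted job fails at setup. It is a toy in-memory model, not Hadoop code: the class `ConflictModeSketch`, its methods, and the map standing in for the object store are all hypothetical. It assumes the three staging conflict modes (fail, append, replace) that hadoop-aws selects through `fs.s3a.committer.staging.conflict-mode`, with "fail" rejecting an existing destination during job setup, as the stack trace above suggests.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy in-memory model of destination conflict handling; not Hadoop code. */
final class ConflictModeSketch {

    /** The three conflict modes named by the staging committers. */
    enum ConflictMode { FAIL, APPEND, REPLACE }

    // Hypothetical stand-in for the object store: directory -> file names.
    private final Map<String, Set<String>> store = new HashMap<>();

    /** Create the directory if needed and add one file to it. */
    void writeFile(String dir, String name) {
        store.computeIfAbsent(dir, k -> new TreeSet<>()).add(name);
    }

    /** Job setup: only FAIL rejects a destination that already exists. */
    void setupJob(String dest, ConflictMode mode) {
        if (mode == ConflictMode.FAIL && store.containsKey(dest)) {
            throw new IllegalStateException("Destination path exists and committer "
                + "conflict resolution mode is \"fail\": " + dest);
        }
    }

    /** Job commit: REPLACE clears the destination first, APPEND keeps old files. */
    void commitJob(String dest, ConflictMode mode, Set<String> output) {
        if (mode == ConflictMode.REPLACE) {
            store.remove(dest);
        }
        store.computeIfAbsent(dest, k -> new TreeSet<>()).addAll(output);
    }

    /** List the file names currently under a directory (empty if absent). */
    Set<String> list(String dir) {
        return store.getOrDefault(dir, new TreeSet<>());
    }
}
```

In this model, writing any file under the destination before `setupJob` runs is enough to trip the FAIL check, which matches the report that moving the partition file elsewhere avoids the failure. A job that genuinely wants to write into an existing directory would pick append or replace for itself rather than rely on a changed default.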



