[jira] [Updated] (HADOOP-16357) Fix TeraSort Job failing on S3 DirectoryStagingCommitter

2019-06-10 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated HADOOP-16357:
---
Attachment: HADOOP-16357-001.patch

> Fix TeraSort Job failing on S3 DirectoryStagingCommitter
> 
>
> Key: HADOOP-16357
> URL: https://issues.apache.org/jira/browse/HADOOP-16357
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0, 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Minor
> Attachments: HADOOP-16357-001.patch, MAPREDUCE-7216-001.patch
>
>
> The TeraSort job fails on S3 with the exception below. TeraSort creates the 
> output path and writes the partition file into it, but 
> DirectoryStagingCommitter expects the output path not to exist.
> {code}
> 9/06/07 14:13:34 INFO mapreduce.Job: Job job_1559891760159_0011 failed with state FAILED due to: Job setup failed : org.apache.hadoop.fs.PathExistsException: `s3a://bucket/OUTPUT': Setting job as Task committer attempt_1559891760159_0011_m_00_0: Destination path exists and committer conflict resolution mode is "fail"
>   at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.failDestinationExists(StagingCommitter.java:878)
>   at org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter.setupJob(DirectoryStagingCommitter.java:71)
>   at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:255)
>   at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:235)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> Creating the partition file in /tmp or another directory outside the output 
> path fixes the issue.
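
The ordering problem in the description can be illustrated with a small, self-contained sketch. It is a toy model, not Hadoop code and not the attached patch: `FakeStore`, `setupJob`, and `jobSetupSucceeds` are hypothetical names, and the paths and the `_partition.lst` file name are chosen only for illustration. It assumes that writing a file implicitly creates its parent directory and that job setup rejects an existing destination, as the "fail" conflict mode in the exception suggests.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Toy model of the failure described above. This is NOT Hadoop code:
 * FakeStore and setupJob are hypothetical stand-ins for the file system
 * and for a committer running in "fail" conflict mode.
 */
public class PartitionFilePlacement {

    /** Minimal stand-in for a file system: the set of paths that exist. */
    static final class FakeStore {
        private final Set<String> paths = new HashSet<>();

        /** Creates a file and, implicitly, every parent directory above it. */
        void create(String file) {
            paths.add(file);
            for (int i = file.lastIndexOf('/'); i > 0; i = file.lastIndexOf('/', i - 1)) {
                paths.add(file.substring(0, i));
            }
        }

        boolean exists(String path) {
            return paths.contains(path);
        }
    }

    /** Hypothetical job setup that refuses to run if the destination exists. */
    static void setupJob(FakeStore store, String outputPath) {
        if (store.exists(outputPath)) {
            throw new IllegalStateException(
                "Destination path exists and conflict resolution mode is \"fail\": " + outputPath);
        }
    }

    /** Writes the partition file first, then attempts job setup. */
    static boolean jobSetupSucceeds(String partitionFile, String outputPath) {
        FakeStore store = new FakeStore();
        store.create(partitionFile);
        try {
            setupJob(store, outputPath);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Partition file inside the output path: setup is rejected.
        System.out.println(jobSetupSucceeds("/OUTPUT/_partition.lst", "/OUTPUT"));
        // Partition file elsewhere: the output path does not exist yet, setup passes.
        System.out.println(jobSetupSucceeds("/tmp/terasort/_partition.lst", "/OUTPUT"));
    }
}
```

Running `main` prints `false` and then `true`: in this model the only difference between the failing and the passing case is where the partition file is written before job setup.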



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16357) Fix TeraSort Job failing on S3 DirectoryStagingCommitter

2019-06-10 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-16357:

Issue Type: Sub-task  (was: Bug)
Parent: HADOOP-15620

> Fix TeraSort Job failing on S3 DirectoryStagingCommitter
> 
>
> Key: HADOOP-16357
> URL: https://issues.apache.org/jira/browse/HADOOP-16357
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0, 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Minor
> Attachments: MAPREDUCE-7216-001.patch
>
>