[ 
https://issues.apache.org/jira/browse/HADOOP-17584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yinan zhan updated HADOOP-17584:
--------------------------------
    Description: 
The S3A magic committer's isRecoverySupported() returns false, so when the 
application master (AM) restarts after an AM JVM crash, all tasks are rerun. 
The pendingset files that the previous attempt left in the magic path are not 
cleaned up. A pendingset file is named jobAttemptPath + 
taskAttemptID.getTaskID() + ".pendingset", and in the S3A magic committer 
jobAttemptPath is actually the job ID path, not the job attempt ID path. These 
pendingset files are therefore overwritten when the rerun tasks commit.

However, if a speculative task beats the original task in the new AM attempt, 
a pendingset file from the previous attempt may still be present at job commit, 
so the wrong data is committed.
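
The naming scheme described above can be sketched as follows. This is a 
minimal illustration, not the committer's actual code: the class, the helper 
methods, the "/" separators, the bucket name, and the job and attempt IDs are 
all hypothetical, and an in-memory map stands in for the object store. It only 
shows that a path built from the job ID and the task ID is the same for every 
attempt of a task, across AM attempts.

```java
import java.util.HashMap;
import java.util.Map;

class PendingsetNaming {
    // Hypothetical helper: the "job attempt path" is built from the job ID only,
    // so it does not change when the AM is restarted.
    static String jobAttemptPath(String magicDir, String jobId) {
        return magicDir + "/" + jobId;
    }

    // Derive a task ID string from a task attempt ID string by dropping the
    // trailing attempt number, e.g.
    // attempt_1614000000000_0001_m_000003_0 -> task_1614000000000_0001_m_000003
    static String taskIdOf(String taskAttemptId) {
        String withoutAttempt =
            taskAttemptId.substring(0, taskAttemptId.lastIndexOf('_'));
        return withoutAttempt.replaceFirst("^attempt_", "task_");
    }

    // jobAttemptPath + task ID + ".pendingset", as in the description.
    static String pendingsetPath(String magicDir, String jobId, String taskAttemptId) {
        return jobAttemptPath(magicDir, jobId) + "/" + taskIdOf(taskAttemptId) + ".pendingset";
    }

    public static void main(String[] args) {
        String magicDir = "s3a://example-bucket/out/__magic"; // hypothetical
        String jobId = "job_1614000000000_0001";              // hypothetical

        // In-memory stand-in for the object store: path -> attempt that wrote it.
        Map<String, String> store = new HashMap<>();

        // First AM attempt: task 3 commits, then the AM JVM crashes.
        String first = "attempt_1614000000000_0001_m_000003_0";
        store.put(pendingsetPath(magicDir, jobId, first), first);

        // Second AM attempt: the same task is rerun under a new attempt ID.
        String rerun = "attempt_1614000000000_0001_m_000003_1000";
        String rerunPath = pendingsetPath(magicDir, jobId, rerun);

        System.out.println("path: " + rerunPath);
        System.out.println("present before rerun commits, written by: " + store.get(rerunPath));

        // The rerun's commit lands on the same path and replaces the old file.
        store.put(rerunPath, rerun);
        System.out.println("after rerun commits, written by: " + store.get(rerunPath));
    }
}
```

Until the rerun commits, the only file at that path is the one written by the 
previous AM attempt, which is the window the description is concerned about.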

  was:
The S3A magic committer's isRecoverySupported() returns false, so when the 
application master (AM) restarts after an AM JVM crash, all tasks are rerun. 
The pendingset files that the previous attempt left in the magic path are not 
cleaned up. A pendingset file is named jobAttemptPath + 
taskAttemptID.getTaskID() + ".pendingset". These pendingset files are 
overwritten when the rerun tasks commit.

However, if a speculative task beats the original task in the new AM attempt, 
a pendingset file from the previous attempt may still be present at job commit, 
so the wrong data is committed.


> s3a committer 
> --------------
>
>                 Key: HADOOP-17584
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17584
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: yinan zhan
>            Priority: Major
>
> The S3A magic committer's isRecoverySupported() returns false, so when the 
> application master (AM) restarts after an AM JVM crash, all tasks are rerun. 
> The pendingset files that the previous attempt left in the magic path are not 
> cleaned up. A pendingset file is named jobAttemptPath + 
> taskAttemptID.getTaskID() + ".pendingset", and in the S3A magic committer 
> jobAttemptPath is actually the job ID path, not the job attempt ID path. These 
> pendingset files are therefore overwritten when the rerun tasks commit.
> However, if a speculative task beats the original task in the new AM attempt, 
> a pendingset file from the previous attempt may still be present at job 
> commit, so the wrong data is committed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
