[jira] [Commented] (HADOOP-15782) Clarify committers.md around v2 failure handling

Steve Loughran (JIRA) Tue, 25 Sep 2018 04:44:10 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-15782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627203#comment-16627203
 ]


Steve Loughran commented on HADOOP-15782:
-----------------------------------------

How about this: "not sure that v2 task recovery is correct".

Specifically: v2 task commit renames files from the attempt dir into the 
destination. If that fails partway through, then both MR and Spark assume that 
the entire task can be retried. This has the following flaws:

* A rerun task may generate files with different names. If so, those files 
from the first attempt which were already moved into place will still be 
there. Outcome: the output of two attempts may be in the destination.
* If the created filenames are the same, and the first attempt hasn't actually 
failed but has only paused for some time (GC pauses, VM hangs, etc.), then on 
resuming it will continue its renames, potentially overwriting one or more 
files of the other attempt's output. Outcome: the data may be a mix of two 
attempts.
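
A minimal sketch of the first flaw, using the local filesystem as a stand-in. 
This is not the FileOutputCommitter code: the function names, the file names 
and the `fail_after` knob are all hypothetical, and the only assumption taken 
from the description above is that task commit is a file-by-file rename into 
the destination.

```python
import os
import tempfile
from pathlib import Path


def write_attempt(attempt_dir: Path, files: dict) -> None:
    """Write a task attempt's output into its private attempt directory."""
    attempt_dir.mkdir(parents=True)
    for name, data in files.items():
        (attempt_dir / name).write_text(data)


def v2_style_task_commit(attempt_dir: Path, dest: Path, fail_after=None) -> None:
    """Rename each file straight into dest, one at a time (not atomic overall).

    fail_after is a hypothetical knob: raise after that many renames to
    simulate the task attempt dying partway through its commit.
    """
    for count, src in enumerate(sorted(attempt_dir.iterdir())):
        if fail_after is not None and count == fail_after:
            raise RuntimeError("task attempt failed mid-commit")
        os.replace(src, dest / src.name)


def run_scenario() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        dest = root / "dest"
        dest.mkdir()

        # Attempt 1 writes two files but dies after renaming only one of them.
        a1 = root / "attempt_1"
        write_attempt(a1, {"part-0000-a1": "attempt1", "part-0001-a1": "attempt1"})
        try:
            v2_style_task_commit(a1, dest, fail_after=1)
        except RuntimeError:
            pass  # the framework assumes the whole task can simply be retried

        # Attempt 2 produces differently named files and commits fully.
        a2 = root / "attempt_2"
        write_attempt(a2, {"part-0000-a2": "attempt2", "part-0001-a2": "attempt2"})
        v2_style_task_commit(a2, dest)

        return {p.name: p.read_text() for p in sorted(dest.iterdir())}


result = run_scenario()
print(result)
```

The destination ends up holding one leftover file from the failed attempt 
alongside the complete output of the retry.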

If each attempt creates precisely one file, and the name of the file is the 
same in both, then these problems don't arise. There's no partial commit of 
files: the second attempt will overwrite the first completely, and if a paused 
attempt resumes, it will completely overwrite the output of the later one. 
Provided that doesn't happen partway through ongoing work (deletion of task 
attempt dirs will cause the rename to fail, obviously), the requirement 
for speculative/retriable tasks, "output from either attempt is valid", will be 
met; downstream code will have to deal with it.
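
A rough in-memory model of the paused-attempt case, comparing multi-file and 
single-file output. It assumes both attempts use identical file names and that 
a rename overwrites an existing destination file; the dict standing in for the 
destination, the generator used as a pause point, and all names are 
illustrative only.

```python
def commit_steps(attempt_output: dict, dest: dict):
    """Per-file 'rename' into dest, yielding before each one as a pause point."""
    for name in sorted(attempt_output):
        yield name
        dest[name] = attempt_output[name]  # overwrites any existing file


def paused_then_resumed(n_files: int, renames_before_pause: int) -> dict:
    """Attempt 1 stalls mid-commit, attempt 2 commits fully, attempt 1 resumes."""
    names = [f"part-{i:04d}" for i in range(n_files)]
    attempt1 = {n: "attempt1" for n in names}
    attempt2 = {n: "attempt2" for n in names}
    dest = {}

    first = commit_steps(attempt1, dest)
    # Advancing k + 1 times completes k renames and stops before the next one.
    for _ in range(renames_before_pause + 1):
        next(first)

    for _ in commit_steps(attempt2, dest):
        pass  # the retry commits all of its files

    for _ in first:
        pass  # attempt 1 wakes up and finishes its remaining renames
    return dest


# Three files, stall after one rename: the destination mixes both attempts.
mixed = paused_then_resumed(n_files=3, renames_before_pause=1)
# One file, stall before the only rename: the destination has a single source.
single = paused_then_resumed(n_files=1, renames_before_pause=0)
print(mixed)
print(single)
```

With three files the destination holds data from both attempts; with one file 
it always comes from exactly one attempt, whichever renamed last.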

That said, I'm happy to be wrong if I've misunderstood something: that commit 
code is complex enough that I had to step through it with a debugger, taking 
notes, to understand what was going on.

Right now I don't trust v2. It's worse on object stores, as time-to-rename is 
potentially much longer, so the probability of task failure during rename is 
higher.

See also [a zero rename 
committer|https://github.com/steveloughran/zero-rename-committer/releases/download/tag_draft_003/a_zero_rename_committer.pdf];
 review & corrections welcome there.

> Clarify committers.md around v2 failure handling
> ------------------------------------------------
>
>                 Key: HADOOP-15782
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15782
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 3.1.0, 3.1.1
>            Reporter: Gera Shegalov
>            Priority: Major
>
> The doc file 
> {{hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md}} 
> refers to the default file output committer (v2) as not supporting job and 
> task recovery throughout the doc:
> {quote}or just by rerunning everything (The "v2" algorithm and Spark).
> {quote}
> This is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
