[
https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340077#comment-14340077
]
Jeff Zhang commented on TEZ-1909:
---------------------------------
Attached a patch. [~hitesh], please help review it.
* Split the summary events and recovery events into per-attempt directories,
as in the following listing:
{code}
-rw-r--r-- 1 jzhang supergroup 1661372 2015-02-27 14:58 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/1/dag_1424916692162_0036_1.recovery
-rw-r--r-- 1 jzhang supergroup 145 2015-02-27 14:58 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/1/summary
-rw-r--r-- 1 jzhang supergroup 39284 2015-02-27 14:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/2/dag_1424916692162_0036_1.recovery
-rw-r--r-- 1 jzhang supergroup 222 2015-02-27 14:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/2/summary
-rw-r--r-- 1 jzhang supergroup 19028 2015-02-27 13:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/3/dag_1424916692162_0036_1.recovery
-rw-r--r-- 1 jzhang supergroup 448 2015-02-27 13:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/3/summary
{code}
* Remove dataRecoveredFlagFile. I think its purpose is to detect whether the AM
was shut down while recovering. Since we no longer copy data to the new
attempt dir, no data can be lost in that case, so the next AM attempt is not
affected.
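
As an illustration of the per-attempt layout, here is a minimal sketch of how
the files could be grouped by attempt number and visited in attempt order,
without copying anything between attempt directories. It is not the code in
the patch. The function name recovery_files_by_attempt is hypothetical, and
the sketch assumes only the <recovery root>/<attempt>/<file> layout shown in
the listing.

```python
from pathlib import PurePosixPath


def recovery_files_by_attempt(paths):
    """Group recovery file paths by attempt number, in ascending attempt order.

    Assumes the layout <recovery root>/<attempt number>/<file name>.
    Hypothetical helper for illustration only, not part of Tez.
    """
    grouped = {}
    for raw in paths:
        path = PurePosixPath(raw)
        attempt_dir = path.parent.name
        if not attempt_dir.isdigit():
            # Skip anything that is not inside a numbered attempt directory.
            continue
        grouped.setdefault(int(attempt_dir), []).append(path.name)
    # Sort numerically by attempt so that attempt 10 comes after attempt 2.
    return [(attempt, sorted(names)) for attempt, names in sorted(grouped.items())]


root = "/tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery"
listing = [
    f"{root}/2/summary",
    f"{root}/1/dag_1424916692162_0036_1.recovery",
    f"{root}/3/summary",
    f"{root}/1/summary",
    f"{root}/2/dag_1424916692162_0036_1.recovery",
    f"{root}/3/dag_1424916692162_0036_1.recovery",
]

for attempt, names in recovery_files_by_attempt(listing):
    print(attempt, names)
```

A later attempt would then read the earlier attempts' files in place, in this
order, rather than relying on a copy in its own directory.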
> Remove need to copy over all events from attempt 1 to attempt 2 dir
> -------------------------------------------------------------------
>
> Key: TEZ-1909
> URL: https://issues.apache.org/jira/browse/TEZ-1909
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Hitesh Shah
> Assignee: Jeff Zhang
> Attachments: TEZ-1909-1.patch
>
>
> Use of file versions should prevent the need for copying over data into a
> second attempt dir. Care needs to be taken with "last corrupt record"
> handling.