[ 
https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340077#comment-14340077
 ] 

Jeff Zhang commented on TEZ-1909:
---------------------------------

Attached the patch. [~hitesh], please help review it.

* Split the summary events and recovery events into per-attempt directories, 
as in the following listing:
{code}
-rw-r--r--   1 jzhang supergroup    1661372 2015-02-27 14:58 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/1/dag_1424916692162_0036_1.recovery
-rw-r--r--   1 jzhang supergroup        145 2015-02-27 14:58 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/1/summary
-rw-r--r--   1 jzhang supergroup      39284 2015-02-27 14:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/2/dag_1424916692162_0036_1.recovery
-rw-r--r--   1 jzhang supergroup        222 2015-02-27 14:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/2/summary
-rw-r--r--   1 jzhang supergroup      19028 2015-02-27 13:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/3/dag_1424916692162_0036_1.recovery
-rw-r--r--   1 jzhang supergroup        448 2015-02-27 13:59 /tmp/temp-1397147140/.tez/application_1424916692162_0036/recovery/3/summary
{code}
* Remove dataRecoveredFlagFile. As I understand it, the flag exists to detect 
whether the AM was shut down in the middle of recovery. Since we no longer copy 
data to the new attempt dir, no data can be lost in that case, so removing the 
flag won't affect the next AM attempt.
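
To illustrate the idea (not the patch itself), here is a minimal Python sketch of how a new attempt could find the recovery data of earlier attempts in place instead of copying it. It assumes the layout shown in the listing above: one directory per attempt number under recovery/, each holding a summary file and one or more *.recovery files. The function names prior_attempt_dirs and recovery_files are hypothetical, and the sketch uses the local filesystem rather than the Hadoop FileSystem API that Tez actually uses.

```python
from pathlib import Path


def prior_attempt_dirs(recovery_root: Path, current_attempt: int) -> list[Path]:
    """Return existing attempt directories older than current_attempt, oldest first."""
    attempts = []
    for child in recovery_root.iterdir():
        # Assumption: attempt directories are named by attempt number ("1", "2", ...).
        if child.is_dir() and child.name.isdecimal() and int(child.name) < current_attempt:
            attempts.append((int(child.name), child))
    # Sort numerically so that attempt 10 comes after attempt 2.
    return [path for _, path in sorted(attempts)]


def recovery_files(recovery_root: Path, current_attempt: int) -> list[Path]:
    """List summary and *.recovery files from earlier attempts, in attempt order.

    Files are only located here and would be read in place; nothing is copied
    into the current attempt's directory.
    """
    files = []
    for attempt_dir in prior_attempt_dirs(recovery_root, current_attempt):
        summary = attempt_dir / "summary"
        if summary.is_file():
            files.append(summary)
        files.extend(sorted(attempt_dir.glob("*.recovery")))
    return files
```

With this layout, attempt 3 would call recovery_files(root, 3) and replay the files of attempts 1 and 2 in order while writing only to recovery/3. How a truncated or corrupt last record in an earlier attempt's file should be handled is not covered by this sketch.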


> Remove need to copy over all events from attempt 1 to attempt 2 dir
> -------------------------------------------------------------------
>
>                 Key: TEZ-1909
>                 URL: https://issues.apache.org/jira/browse/TEZ-1909
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1909-1.patch
>
>
> Use of file versions should prevent the need for copying over data into a 
> second attempt dir. Care needs to be taken with "last corrupt record" 
> handling. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
