[
https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262006#comment-14262006
]
Hitesh Shah edited comment on TEZ-1909 at 12/31/14 7:23 AM:
------------------------------------------------------------
Actually both.
Today, we end up copying over data from the previous attempt into the current
attempt's directory. (The attempt-specific directory already exists, which
covers part 1 of your comment.) It might be better to just have a chain of
partial files to reduce the copy overhead.
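
To make the "chain of partial files" idea concrete, here is a rough sketch of
how a reader could walk per-attempt files in order instead of relying on a
merged copy. Everything in it is an assumption for illustration: the class
name, the method names, and the length-prefixed record layout are hypothetical
and are not Tez's actual recovery format. It also drops an incomplete trailing
record, which is one possible way to treat the "last corrupt record" case.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AttemptChainSketch {

    // Hypothetical record layout: 4-byte length prefix, then the payload bytes.
    static byte[] encode(String... events) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (String event : events) {
            byte[] payload = event.getBytes(StandardCharsets.UTF_8);
            buf.write(ByteBuffer.allocate(4).putInt(payload.length).array(), 0, 4);
            buf.write(payload, 0, payload.length);
        }
        return buf.toByteArray();
    }

    // Reads one attempt's file and stops at the first record that is
    // incomplete or has an implausible length (e.g. the writer died mid-record).
    static void readAttempt(byte[] file, List<String> sink) {
        ByteBuffer in = ByteBuffer.wrap(file);
        while (in.remaining() >= 4) {
            int len = in.getInt();
            if (len < 0 || len > in.remaining()) {
                return; // truncated or corrupt trailing record: ignore it
            }
            byte[] payload = new byte[len];
            in.get(payload);
            sink.add(new String(payload, StandardCharsets.UTF_8));
        }
    }

    // Walks the chain from the oldest attempt to the newest. Nothing is
    // copied between attempt directories; each file is read where it lives.
    static List<String> readChain(List<byte[]> attemptFilesOldestFirst) {
        List<String> events = new ArrayList<>();
        for (byte[] file : attemptFilesOldestFirst) {
            readAttempt(file, events);
        }
        return events;
    }

    public static void main(String[] args) {
        // Attempt 1 died while writing its third record, so cut it short.
        byte[] full = encode("dag-submitted", "vertex-started", "half-written");
        byte[] attempt1 = Arrays.copyOf(full, full.length - 3);
        byte[] attempt2 = encode("vertex-finished");
        System.out.println(readChain(Arrays.asList(attempt1, attempt2)));
    }
}
```

With this shape, attempt N only ever appends to its own file, and recovery
cost is a sequential read across N small files rather than a copy of all
earlier data on every restart.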
> Remove need to copy over all events from attempt 1 to attempt 2 dir
> -------------------------------------------------------------------
>
> Key: TEZ-1909
> URL: https://issues.apache.org/jira/browse/TEZ-1909
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Hitesh Shah
> Assignee: Jeff Zhang
>
> Use of file versions should remove the need to copy data over into a
> second attempt dir. Care needs to be taken with "last corrupt record"
> handling.