[
https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588753#action_12588753
]
Devaraj Das commented on HADOOP-3245:
-------------------------------------
Doug, that makes sense. However, the concern is that for each task update we
need to do a sync to dfs (sync because we want to be sure the info is actually
in the dfs). That might be expensive when TIPs complete at a high rate, no?
Also, even in this case, we most likely still need to do an edits-log kind of
merge, since the file will just contain a bunch of updates.
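
To make the "edits log kind of merge" concrete, here is a minimal in-memory
sketch. It is only an illustration under assumptions: TaskUpdateLog, Update,
append and mergeInto are hypothetical names, not existing Hadoop classes, and a
real version would append to (and sync) a dfs file instead of a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names are illustrative and not part of Hadoop.
final class TaskUpdateLog {
    /** One appended record: a task id and the state it was reported in. */
    static final class Update {
        final String taskId;
        final String state;

        Update(String taskId, String state) {
            this.taskId = taskId;
            this.state = state;
        }
    }

    // Stand-in for the append-only update file kept on the dfs.
    private final List<Update> pending = new ArrayList<>();

    /** Appends one task update; a real version would sync to dfs here. */
    void append(String taskId, String state) {
        pending.add(new Update(taskId, state));
    }

    /** Number of updates appended since the last merge. */
    int pendingCount() {
        return pending.size();
    }

    /**
     * Folds the pending updates into a copy of the given snapshot.
     * Updates are replayed in append order, so the latest state per task wins.
     * The log is emptied afterwards, as it would be after a periodic merge.
     */
    Map<String, String> mergeInto(Map<String, String> snapshot) {
        Map<String, String> merged = new LinkedHashMap<>(snapshot);
        for (Update u : pending) {
            merged.put(u.taskId, u.state);
        }
        pending.clear();
        return merged;
    }
}
```

Anything appended after the last mergeInto call and before a crash exists only
in the update file, so recovery depends on that file having been synced.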
There is one problem with this approach of periodic merges. We lose the
information for tasks that complete in the interval between the last merge and
the time when the JobTracker crashes. So on the next restart the JobTracker
would try to re-execute these tasks, but their outputs would already be present
on the dfs, and conflicts will happen when save output is invoked for these new
attempts (this problem becomes a bit more complicated with side files in the
picture). We probably should have a strategy (job configurable?) to handle such
cases on a per-path basis: OVERWRITE (if an output path already exists) or
ACCEPT (accept what is already there). Assuming idempotent tasks, we probably
should have the default as ACCEPT...
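
A rough sketch of how the per-path OVERWRITE/ACCEPT strategy could behave when
a re-executed attempt saves its output. Everything here is an assumption made
for illustration: OutputCommitSketch, setPolicy and saveOutput are hypothetical
names, and a HashMap stands in for the dfs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not an existing Hadoop API; a map stands in for the dfs.
final class OutputCommitSketch {
    /** What to do when the output path of a new attempt already exists. */
    enum ConflictPolicy { OVERWRITE, ACCEPT }

    private final Map<String, String> store = new HashMap<>();
    private final Map<String, ConflictPolicy> perPath = new HashMap<>();
    private final ConflictPolicy defaultPolicy;

    OutputCommitSketch(ConflictPolicy defaultPolicy) {
        this.defaultPolicy = defaultPolicy;
    }

    /** Overrides the job-wide default for a single output path. */
    void setPolicy(String path, ConflictPolicy policy) {
        perPath.put(path, policy);
    }

    /**
     * Saves the output of a task attempt.
     * Returns true if the contents were written, false if an existing
     * output was kept because the effective policy for the path is ACCEPT.
     */
    boolean saveOutput(String path, String contents) {
        ConflictPolicy policy = perPath.getOrDefault(path, defaultPolicy);
        if (store.containsKey(path) && policy == ConflictPolicy.ACCEPT) {
            return false; // keep what the earlier attempt already produced
        }
        store.put(path, contents);
        return true;
    }

    /** Returns the stored contents for a path, or null if nothing is there. */
    String read(String path) {
        return store.get(path);
    }
}
```

With ACCEPT as the default, a re-executed attempt of an idempotent task leaves
the earlier output untouched; paths that must be regenerated can be switched to
OVERWRITE individually.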
> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
> Key: HADOOP-3245
> URL: https://issues.apache.org/jira/browse/HADOOP-3245
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Devaraj Das
> Fix For: 0.18.0
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be
> applied for things like jobs being able to survive jobtracker restarts.