[ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600364#action_12600364 ]
Amar Kamat commented on HADOOP-3245:
------------------------------------
bq. III) The logic for detecting lost TT should not rely on missing data
structures but use some kind of book keeping. We can now use 'missing data
structures logic' for detecting when the TT should SYNC. Note that detecting a
TT as lost (missing TT details) is different from declaring it as lost (10min
gap in heartbeat).
These are two different cases:
1) _Lost TT_ will have _initial contact_ as *false* while the previous
heartbeat will be present.
2) _Restarted JT_ will have _initial contact_ as *false* while the previous
heartbeat will be missing.
Since the two cases can be told apart this way, there is no need to fix the
lost TT logic.
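The distinction between the two cases can be sketched roughly as below. This
is only an illustration, not the JobTracker code: the class, enum and
parameter names are hypothetical, and the FIRST_CONTACT branch (a TT that
reports _initial contact_ as *true*) is an assumption added to make the
example complete.

```java
// Hypothetical sketch; names are illustrative and not taken from the Hadoop source.
final class HeartbeatClassifier {

    enum Kind {
        FIRST_CONTACT,        // assumption: TT reports initial contact = true
        LOST_TRACKER,         // case 1: previous heartbeat is present
        JOBTRACKER_RESTARTED  // case 2: previous heartbeat is missing, TT should SYNC
    }

    // Classifies a heartbeat from a TT whose details are missing on the JT side.
    static Kind classify(boolean initialContact, boolean previousHeartbeatPresent) {
        if (initialContact) {
            return Kind.FIRST_CONTACT;
        }
        // initial contact is false in both remaining cases, so the previous
        // heartbeat record is what tells them apart.
        return previousHeartbeatPresent ? Kind.LOST_TRACKER : Kind.JOBTRACKER_RESTARTED;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, true));   // prints LOST_TRACKER
        System.out.println(classify(false, false));  // prints JOBTRACKER_RESTARTED
    }
}
```

Under these assumptions only the second outcome would trigger a SYNC, so the
existing lost TT handling stays untouched.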
> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
> Key: HADOOP-3245
> URL: https://issues.apache.org/jira/browse/HADOOP-3245
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Devaraj Das
> Assignee: Amar Kamat
> Fix For: 0.18.0
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be
> applied for things like jobs being able to survive jobtracker restarts.