[ https://issues.apache.org/jira/browse/HADOOP-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689024#action_12689024 ]
Devaraj Das commented on HADOOP-5394:
-------------------------------------
It would be good to always invoke getRestartCount() on JobTracker startup.
The code in init() that creates the restart count file can also be moved there,
so that the file is created only when it doesn't already exist.
JobTracker recovery should be disabled for the current run when the file
doesn't exist (even if the configuration enables recovery).
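
A minimal sketch of the startup flow suggested above, written against the local
filesystem with java.nio instead of the real JobTracker code. The class
RestartCountSketch, the method onStartup(), and the plain-text file format are
all hypothetical, and incrementing the stored count on every startup is an
assumption about what getRestartCount() does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; not the actual JobTracker implementation. */
final class RestartCountSketch {

    /** Outcome of the startup check: the count to use and whether to recover. */
    static final class StartupState {
        final int restartCount;
        final boolean recoveryEnabled;

        StartupState(int restartCount, boolean recoveryEnabled) {
            this.restartCount = restartCount;
            this.recoveryEnabled = recoveryEnabled;
        }
    }

    /**
     * Runs on every startup, regardless of the recovery setting.
     * If the restart count file is missing, it is created with count 0 and
     * recovery is disabled for this run. Otherwise the stored count is
     * incremented (an assumption) and the configured recovery setting is kept.
     */
    static StartupState onStartup(Path restartCountFile, boolean recoveryConfigured) {
        try {
            if (Files.notExists(restartCountFile)) {
                Files.write(restartCountFile, "0".getBytes(StandardCharsets.UTF_8));
                return new StartupState(0, false);
            }
            String stored = new String(
                    Files.readAllBytes(restartCountFile), StandardCharsets.UTF_8).trim();
            int next = Integer.parseInt(stored) + 1;
            Files.write(restartCountFile,
                    Integer.toString(next).getBytes(StandardCharsets.UTF_8));
            return new StartupState(next, recoveryConfigured);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, a first start always creates the file and skips recovery, and
only later starts honor the configured recovery flag.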
> JobTracker might schedule 2 attempts of the same task with the same attempt
> id across restarts
> ----------------------------------------------------------------------------------------------
>
> Key: HADOOP-5394
> URL: https://issues.apache.org/jira/browse/HADOOP-5394
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Amar Kamat
> Assignee: Amar Kamat
> Priority: Critical
> Attachments: HADOOP-5394-v1.2.patch, HADOOP-5394-v1.5.patch
>
>
> This can happen when the JobTracker is restarted more than once. In such
> cases, the JobTracker depends on the job history file for the next restart
> count. If the new restart count is not flushed to the file, there is a
> fair chance that on the next restart the JobTracker will schedule a new
> attempt with an existing id. This can cause problems with the side-effect
> files and can also leave the JobTracker in an inconsistent state.
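
To illustrate the flushing concern in the description, here is a minimal sketch
of writing a counter durably before it is relied upon. It is a local-filesystem
analogy using java.nio, not the job history code path. The class
DurableCounterSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; not the actual JobTracker implementation. */
final class DurableCounterSketch {

    /**
     * Writes the count to a sibling temp file, forces it to storage, then
     * moves it into place. A crash before the move leaves the old count
     * intact; whether an atomic move replaces an existing target is
     * implementation specific per the Files.move documentation.
     */
    static void persist(Path file, int restartCount) {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        byte[] bytes = Integer.toString(restartCount).getBytes(StandardCharsets.UTF_8);
        try {
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING)) {
                ByteBuffer buf = ByteBuffer.wrap(bytes);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // flush data and metadata before the count is trusted
            }
            Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the last persisted count. */
    static int read(Path file) {
        try {
            return Integer.parseInt(
                    new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the ordering is that a new attempt id should only be derived from
a count that has already reached stable storage.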