[
https://issues.apache.org/jira/browse/OOZIE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Kanter updated OOZIE-1483:
---------------------------------
Description:
To support the JobTracker recovering jobs on restart, we need to configure
the launcher job to be restarted by the JT, but not any of the launched jobs
({{mapred.job.restart.recover}}). This way, the launcher job will simply start
over when the JT recovers it; if we allow the JT to recover the actual jobs,
then they will interfere with the restarted launcher.
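A minimal sketch of the split described above, assuming the launcher and the launched job each get their own copy of the job configuration. Plain dicts stand in for Hadoop job configurations, and the function name {{split_recovery_settings}} and the sample {{mapred.job.name}} value are hypothetical; only {{mapred.job.restart.recover}} comes from this issue. This is not Oozie's actual code.

```python
# Property named in the issue description.
RECOVER_KEY = "mapred.job.restart.recover"


def split_recovery_settings(base_conf):
    """Return (launcher_conf, action_conf) copies of base_conf.

    Hypothetical helper: the launcher copy is marked recoverable by the
    JobTracker, the launched-job copy is not. base_conf is left unmodified.
    """
    launcher_conf = dict(base_conf)
    action_conf = dict(base_conf)
    launcher_conf[RECOVER_KEY] = "true"   # JT may restart the launcher
    action_conf[RECOVER_KEY] = "false"    # JT must not recover launched jobs
    return launcher_conf, action_conf


# Illustrative usage with a made-up base configuration.
base = {"mapred.job.name": "example-action"}
launcher, action = split_recovery_settings(base)
```

With this split, only the launcher is eligible for recovery, so a recovered launcher starts over and resubmits its work instead of competing with a recovered copy of the launched job.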
This should be fairly trivial except for the MapReduce action because of the
optimization where the launcher finishes instead of waiting for the actual job
and Oozie does an "id swap". Trying to add support for JT to recover the MR
action doesn't seem feasible as we'd run into a lot of trickiness and some race
conditions due to the id swap.
Instead, I think we should remove the MR optimization because it will allow us
to support recoverability for the MR action as well. This also has the
benefit of simplifying the code because we'd be getting rid of all of the id
swap stuff and also making the MR action consistent with the other actions.
The only downside is that the MR action will take an extra Map slot just like
the other actions.
was:
To support for the JobTracker to recover jobs on restart, we need to configure
the launcher job to be restarted by the JT, but not any of the launched jobs
({{mapred.job.restart.recover}}. This way, the launcher job will simply start
over when the JT recovers it; if we allow the JT to recover the actual jobs,
then they will interfere.
This should be fairly trivial except for the MapReduce action because of the
optimization where the launcher finishes instead of waiting for the actual job
and Oozie does an "id swap". Trying to add support for JT to recover the MR
action doesn't seem feasible as we'd run into a lot of trickiness and some race
conditions due to the id swap.
Instead, I think we should remove the MR optimization because it will allow us
to to support the recoverability for the MR action as well. This also has the
benefit of simplifying the code because we'd be getting rid of all of the id
swap stuff and also making the MR action consistent with the other actions.
The only downside is that the MR action will take an extra Map slot just like
the other actions.
> Support for Job Recoverability
> ------------------------------
>
> Key: OOZIE-1483
> URL: https://issues.apache.org/jira/browse/OOZIE-1483
> Project: Oozie
> Issue Type: Improvement
> Reporter: Robert Kanter
> Assignee: Robert Kanter
>
> To support the JobTracker recovering jobs on restart, we need to
> configure the launcher job to be restarted by the JT, but not any of the
> launched jobs ({{mapred.job.restart.recover}}). This way, the launcher job
> will simply start over when the JT recovers it; if we allow the JT to recover
> the actual jobs, then they will interfere with the restarted launcher.
> This should be fairly trivial except for the MapReduce action because of the
> optimization where the launcher finishes instead of waiting for the actual
> job and Oozie does an "id swap". Trying to add support for JT to recover the
> MR action doesn't seem feasible as we'd run into a lot of trickiness and some
> race conditions due to the id swap.
> Instead, I think we should remove the MR optimization because it will allow
> us to support recoverability for the MR action as well. This also has
> the benefit of simplifying the code because we'd be getting rid of all of the
> id swap stuff and also making the MR action consistent with the other
> actions. The only downside is that the MR action will take an extra Map slot
> just like the other actions.