[
https://issues.apache.org/jira/browse/OOZIE-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101835#comment-13101835
]
Hadoop QA commented on OOZIE-102:
---------------------------------
anew remarked:
Although "normal" and "catchup" jobs are actually treated equally by Oozie, I
think there is a true difference in use cases:
- A catchup job is started to run/redo computations whose nominal time is often
far back in the past, and the input data is typically historical data and
already available at the creation time of the job. However, a catchup most
likely also requires many more jobs to be run than current computations do.
- A current job has data dependencies on other jobs that have just finished or
are expected to finish very soon. If any of these jobs are late, then this job
has to wait. This is more likely to happen than in catchup mode.
Therefore, it is desirable to expire a catchup job sooner than a current
job. On the other hand, for simplicity's sake, and because Oozie currently does
not have an explicit notion of "catchup", it is desirable to have the same
timeout for both types of jobs.
I do not have a strong preference, but a slight tendency towards keeping it
simple. So for now, I vote for a single timeout which kicks in when max(Nominal
Time, Created Time) + timeout > Current Time.
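
To make the single-timeout proposal concrete, here is a minimal sketch in Python. It is not Oozie code: the function and parameter names are hypothetical, and it assumes the timeout counts as elapsed once the current time is later than max(Nominal Time, Created Time) plus the timeout.

```python
from datetime import datetime, timedelta


def reference_time(nominal_time: datetime, created_time: datetime) -> datetime:
    """Return the later of the two times; the timeout is counted from here."""
    return max(nominal_time, created_time)


def has_timed_out(nominal_time: datetime, created_time: datetime,
                  timeout: timedelta, current_time: datetime) -> bool:
    # Assumption: the input check has timed out once the current time
    # is strictly later than the reference time plus the timeout.
    deadline = reference_time(nominal_time, created_time) + timeout
    return current_time > deadline


timeout = timedelta(hours=2)
now = datetime(2011, 9, 10, 12, 0)

# Catchup-style case: nominal time far in the past, job created an hour ago,
# so the timeout is counted from the creation time.
catchup = has_timed_out(datetime(2011, 1, 1), datetime(2011, 9, 10, 11, 0),
                        timeout, now)

# Current-style case: job created days ago, nominal time three hours ago,
# so the timeout is counted from the nominal time.
current = has_timed_out(datetime(2011, 9, 10, 9, 0), datetime(2011, 9, 1),
                        timeout, now)
```

With one rule, the catchup case is measured from its creation time and the current case from its nominal time, without Oozie needing an explicit notion of "catchup".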
> GH-67: input data check should have a timeout for catch-up mode too.
> ---------------------------------------------------------------------
>
> Key: OOZIE-102
> URL: https://issues.apache.org/jira/browse/OOZIE-102
> Project: Oozie
> Issue Type: Bug
> Reporter: Hadoop QA
>
> For normal, timeout when Nominal Time + timeout > current time
> For catchup, timeout when Created Time + timeout > current time