[ https://issues.apache.org/jira/browse/OOZIE-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498142#comment-14498142 ]
ARVIND KUMAR JAJOO commented on OOZIE-1431:
-------------------------------------------
We need this issue fixed for our current use case.
We use a cron schedule for the coordinator frequency and want to use the same
schedule for the dataset frequency. The schedule is 0 3 * * 1-6, which means
daily except Sunday.
If we use coord:days(1) for the dataset frequency instead, the two do not line
up, because there is no coordinator instance for Sunday.
We also have other cron schedules for coordinator jobs, such as
15-30/5 10 * * *. It is not clear what dataset frequency could be defined for
such cases.
If both attributes are frequencies, then both should support cron schedules.
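To illustrate the mismatch, here is a minimal sketch of the setup described above. The application name, start and end dates, dataset name, and paths are hypothetical placeholders and are not taken from our actual jobs:
{code:xml}
<coordinator-app name="daily-except-sunday" frequency="0 3 * * 1-6"
                 start="2015-04-01T03:00Z" end="2015-12-31T03:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- Cron syntax is not accepted here, so the dataset falls back to
         coord:days(1), which defines an instance for every day, Sunday included -->
    <dataset name="input" frequency="${coord:days(1)}"
             initial-instance="2015-04-01T03:00Z" timezone="UTC">
      <uri-template>${nameNode}/data/input/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="in" dataset="input">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${nameNode}/apps/example-wf</app-path>
    </workflow>
  </action>
</coordinator-app>
{code}
With this definition the coordinator skips Sunday while the dataset still has a Sunday instance, so an offset such as coord:current(-1) on Monday would point at data that, assuming it is produced on the same cron schedule, never exists.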
> Dataset frequencies should accept cron syntax
> ---------------------------------------------
>
> Key: OOZIE-1431
> URL: https://issues.apache.org/jira/browse/OOZIE-1431
> Project: Oozie
> Issue Type: Sub-task
> Reporter: Robert Kanter
> Assignee: Bowen Zhang
>
> For example, instead of
> {code:xml}
> <datasets>
> <dataset name="raw-logs" frequency="${coord:minutes(20)}"
> initial-instance="2010-01-01T00:00Z" timezone="UTC">
>
> <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/input-data/rawLogs/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
> </dataset>
> <dataset name="aggregated-logs" frequency="${coord:hours(1)}"
> initial-instance="2010-01-01T01:00Z" timezone="UTC">
>
> <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/output-data/aggregator/aggregatedLogs/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
> </dataset>
> </datasets>
> {code}
> we should be able to specify something like
> {code:xml}
> <datasets>
> <dataset name="raw-logs" frequency="00 09-18 * * 1-5"
> initial-instance="2010-01-01T00:00Z" timezone="UTC">
>
> <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/input-data/rawLogs/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
> </dataset>
> <dataset name="aggregated-logs" frequency="${coord:hours(1)}"
> initial-instance="2010-01-01T01:00Z" timezone="UTC">
>
> <uri-template>${nameNode}/user/${coord:user()}/${examplesRoot}/output-data/aggregator/aggregatedLogs/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
> </dataset>
> </datasets>
> {code}
> In the second version, the frequency is {{00 09-18 * * 1-5}} instead of
> {{$\{coord:minutes(20)}}}, which means one instance on the hour from 9am to
> 6pm, Mon-Fri, instead of one every 20 minutes.