Thanks, V.

There's only one done flag per day, so what's the benefit of using
start/end-instances?

Also, from the doc:
The ${coord:current(int offset)} EL function resolves to coordinator
action creation time minus the specified offset multiplied by the dataset
frequency. This EL function is properly defined in a subsequent section.

It sounds like current(23) will end up being 23*(24 hours) from now, when
you set dataset freq to hours(24). Does days(1) actually differ from
hours(24)?
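
To make the question concrete, here is a rough sketch in plain Python (not
Oozie code) of how "24 elapsed hours" and "1 calendar day" diverge across the
11/1 DST change in US/Pacific. The helper names are made up for illustration,
and whether Oozie's coord:hours(24) and coord:days(1) map onto these two
behaviors exactly is an assumption here, not something confirmed in this thread.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# America/Los_Angeles is the canonical tz database name for US/Pacific.
PACIFIC = ZoneInfo("America/Los_Angeles")


def add_elapsed_hours(dt, hours):
    """Hypothetical helper: add a fixed amount of elapsed time."""
    # Do the arithmetic in UTC so the DST shift is not hidden,
    # then convert back to Pacific wall-clock time.
    shifted = dt.astimezone(timezone.utc) + timedelta(hours=hours)
    return shifted.astimezone(PACIFIC)


def add_calendar_days(dt, days):
    """Hypothetical helper: add calendar days, keeping the wall-clock time."""
    # Adding a timedelta to an aware datetime moves the wall clock only;
    # the UTC offset is re-resolved for the resulting date.
    return dt + timedelta(days=days)


# A 7pm Pacific nominal time on the day before DST ended in 2015.
start = datetime(2015, 10, 31, 19, 0, tzinfo=PACIFIC)

by_hours = add_elapsed_hours(start, 24)
by_days = add_calendar_days(start, 1)

print(by_hours.isoformat())  # 2015-11-01T18:00:00-08:00
print(by_days.isoformat())   # 2015-11-01T19:00:00-08:00
```

If the two frequencies do behave this way, a coordinator on hours(24) and a
dataset on days(1) would drift apart by an hour after the change, which may be
relevant to which dataset instance current(0) resolves to.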

Admittedly I am pretty confused by the coordinator config, but my
coordinator is basically just copied and pasted from the daily input
dataset example in
https://oozie.apache.org/docs/3.1.3-incubating/CoordinatorFunctionalSpec.html

Thanks!
Alvin



On 11/5/15, 5:01 AM, "Vincent Peplinski" <[email protected]> wrote:

>Hi Alvin,
>
>In cases like this I would set the coordinator frequency to coord:days(1)
>and the dataset frequency to coord:hours(24).
>
>My input-events would be set as follows:
><start-instance>coord:current(00)</start-instance>
><end-instance>coord:current(23)</end-instance>
>
>This will result in checking for the done-flag in each hour of the day.
>
>I would then schedule the job kickoff at 00:00 every day.
>
>V.
>
>  Original Message
>From: Alvin Chyan
>Sent: Wednesday, November 4, 2015 1:54 PM
>To: [email protected]
>Reply To: [email protected]
>Subject: oozie coordinator not waiting for dataset after daylight savings
>
>Hi all,
>Did anyone else experience some bizarre issues with oozie's coordinator
>after daylight savings time change? Our coordinator was submitted weeks
>ago at 7pm and scheduled to run every 24 hours. The coordinator is
>supposed to wait for an input dataset though, so it normally waits until
>about midnight before the workflow is materialized. However, ever since
>daylight savings on 11/1, the coordinator would no longer wait and just
>materialize a workflow instance immediately at 7pm.
>
>Here's a part of our coordinator definition:
><coordinator-app xmlns="uri:oozie:coordinator:0.2" name="merge"
>start="${coord:conf('schedule.start')}"
>end="${coord:conf('schedule.end')}"
>timezone="US/Pacific"
>frequency="${coord:hours(24)}">
><controls>
><timeout>-1</timeout>
><concurrency>1</concurrency>
></controls>
><datasets>
><dataset name="all-iters-complete" frequency="${coord:days(1)}"
>initial-instance="${coord:conf('start')}"
>timezone="US/Pacific">
><uri-template>${coord:conf('namenode')}/process_info/${YEAR}_${MONTH}_${DAY}</uri-template>
><done-flag>up_to_eod_iters_SUCCESS</done-flag>
></dataset>
></datasets>
>
><input-events>
><data-in name="input" dataset="all-iters-complete">
><instance>${coord:current(0)}</instance>
></data-in>
></input-events>
>...
>
>
>The dataset /process_info/2015_11_01/up_to_eod_iters_SUCCESS gets created
>early on 2015_11_02, but the workflow kicked off before then.
>
>One configuration we had that might affect this was in our oozie-site.xml:
><property>
><name>oozie.processing.timezone</name>
><value>GMT-0800</value>
></property>
>
>
>Thanks!
>Alvin
