[ 
https://issues.apache.org/jira/browse/FALCON-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492373#comment-14492373
 ] 

Alex C commented on FALCON-1149:
--------------------------------

Hi there,

Just to let you know, I tried your suggestion of using 'today(24,0)' only for 
the end, and unfortunately it doesn't work (I still get the same error where 
the input is in WAITING state).

Also, unfortunately the workaround I am using doesn't quite produce the desired 
results. Although it resolves the WAITING problem and p2 can proceed to 
COMPLETED, the partition specification in f2 '/f2/${YEAR}/${MONTH}/${DAY}' 
means that the day written is yesterday instead of today.
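
For reference, a sketch of the two p2 input variants discussed here. The 
element and attribute names come from the p2 definition in the issue 
description below; the start value in the first variant is an assumption 
(left at the original 'today(0,0)'). The two lines are alternatives, not meant 
to appear together in one <inputs> block:
{code:xml}
<!-- Variant 1 (suggested): only the end changed; start assumed unchanged.
     Result: the p2 input stays in WAITING state. -->
<input name="input" feed="f1" start="today(0,0)" end="today(24,0)" />

<!-- Variant 2 (current workaround): start and end both changed.
     Result: p2 reaches COMPLETED, but the f2 partition is dated yesterday. -->
<input name="input" feed="f1" start="today(24,0)" end="today(24,0)" />
{code}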

Would you happen to know if a workaround is also possible for the specification 
'${DAY}'?

Thanks

> The 'today' EL date expression is resolving to yesterday's date, for process 
> instance input feed ranges
> -------------------------------------------------------------------------------------------------------
>
>                 Key: FALCON-1149
>                 URL: https://issues.apache.org/jira/browse/FALCON-1149
>             Project: Falcon
>          Issue Type: Bug
>    Affects Versions: 0.5, 0.6
>         Environment: HDP 2.1 sandbox, HDP 2.2 sandbox; server in UTC
>            Reporter: Alex C
>            Assignee: Ajay Yadava
>
> *Steps to reproduce* 
> 1. Submit a cluster named 'sandbox':
> {code:xml}
> <cluster colo="local" description="Sandbox Cluster" name="sandbox" 
> xmlns="uri:falcon:cluster:0.1">
>   <interfaces>
>     <interface type="readonly" 
> endpoint="hftp://sandbox.hortonworks.com:50070" version="2.2.0" />
>     <interface type="write" endpoint="hdfs://sandbox.hortonworks.com:8020" 
> version="2.2.0" />
>     <interface type="execute" endpoint="sandbox.hortonworks.com:8050" 
> version="2.2.0" />
>     <interface type="workflow" 
> endpoint="http://sandbox.hortonworks.com:11000/oozie/" version="4.0.0" />
>     <interface type="messaging" 
> endpoint="tcp://sandbox.hortonworks.com:61616?daemon=true" version="5.1.6" />
>   </interfaces>
>   <locations>
>     <location name="staging" path="/apps/falcon/sandbox/staging" />
>     <location name="temp" path="/tmp" />
>     <location name="working" path="/apps/falcon/sandbox/working" />
>   </locations>
> </cluster>
> {code}
> 2. Submit a feed f1:
> {code:xml}
> <feed name="f1" description="f1" xmlns="uri:falcon:feed:0.1">
>   <frequency>days(1)</frequency>
>   <timezone>UTC</timezone>
>   <late-arrival cut-off="hours(48)" />
>   <clusters>
>     <cluster name="sandbox" type="source">
>       <validity start="2013-01-01T13:00Z" end="2099-12-31T13:00Z" />
>       <retention limit="months(9999)" action="delete" />
>     </cluster>
>   </clusters>
>   <locations>
>     <location type="data"
>       path="/f1/${YEAR}/${MONTH}/${DAY}" />
>   </locations>
>   <ACL owner="ambari-qa" group="users" permission="0775" />
>   <schema location="/none" provider="none" />
> </feed>
> {code}
> 3. Submit a process p1:
> {code:xml}
> <process name="p1" xmlns="uri:falcon:process:0.1">
>   <clusters>
>     <cluster name="sandbox">
>       <validity start="<TODAY>T08:30Z" end="2099-12-31T00:00Z"/>
>     </cluster>
>   </clusters>
>   <parallel>1</parallel>
>   <order>FIFO</order>
>   <frequency>days(1)</frequency>
>   <outputs>
>     <output name="output" feed="f1" instance="today(0,0)" />
>   </outputs>
>   <properties>
>   </properties>
>   <workflow name="p1-wf" engine="oozie" path="/apps/p1" />
>   <retry policy="periodic" delay="minutes(60)" attempts="24" />
> </process>
> {code}
> 4. Submit a feed f2:
> {code:xml}
> <feed name="f2" description="f2" xmlns="uri:falcon:feed:0.1">
>   <frequency>days(1)</frequency>
>   <timezone>UTC</timezone>
>   <late-arrival cut-off="hours(48)" />
>   <clusters>
>     <cluster name="sandbox" type="source">
>       <validity start="2013-01-01T13:00Z" end="2099-12-31T13:00Z" />
>       <retention limit="months(9999)" action="delete" />
>     </cluster>
>   </clusters>
>   <locations>
>     <location type="data"
>       path="/f2/${YEAR}/${MONTH}/${DAY}" />
>   </locations>
>   <ACL owner="ambari-qa" group="users" permission="0775" />
>   <schema location="/none" provider="none" />
> </feed>
> {code}
> 5. Submit a process p2:
> {code:xml}
> <process name="p2" xmlns="uri:falcon:process:0.1">
>   <clusters>
>     <cluster name="sandbox">
>       <validity start="<TODAY>T08:30Z" end="2099-12-31T00:00Z"/>
>     </cluster>
>   </clusters>
>   <parallel>1</parallel>
>   <order>FIFO</order>
>   <frequency>days(1)</frequency>
>   <inputs>
>     <input name="input" feed="f1" start="today(0,0)" end="today(0,0)" />
>   </inputs>
>   <outputs>
>     <output name="output" feed="f2" instance="today(0,0)" />
>   </outputs>
>   <workflow name="p2-wf" engine="oozie" path="/apps/p2" />
>   <retry policy="periodic" delay="minutes(60)" attempts="24" />
> </process>
> {code}
> 6. Note that:
> - Process p1 has no input feed (the data is fetched from some other location 
> by p1).
> - Feed f1 is referenced in the output of p1, and also referenced in the input 
> of p2.
> - All feeds are daily, and process input feed ranges and output feeds are 
> daily, by way of the 'today(0,0)' EL expression.
> 7. Finally, schedule all feeds and processes after 08:30Z on a given day, 
> 'today'.
> *Expected:*
> 1. The first scheduled instance for p1 proceeds to COMPLETED, and produces a 
> partition in f1 for 'today'
> 2. The first scheduled instance for p2 proceeds to COMPLETED, and produces a 
> partition in f2 for 'today', since it looks for and finds a corresponding 
> partition for 'today' in f1.
> *Actual:*
> 1. The first scheduled instance for p1 proceeds to COMPLETED, and produces a 
> partition in f1 for 'today'
> 2. However, the first scheduled instance for p2 is left in WAITING state, 
> since it is looking for a partition in f1 for 'yesterday', which does not 
> exist (and will never exist).
> I am currently working around this unexpected behaviour by specifying the 
> input feed range start and end for p2 as 'today(24,0)' instead of 'today(0,0)'.
> Please advise if this is indeed a) a bug or b) a mistake in the configuration.
> Many thanks,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
