[ 
https://issues.apache.org/jira/browse/FALCON-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066062#comment-15066062
 ] 

Pallavi Rao commented on FALCON-1676:
-------------------------------------

This happens because of the way Falcon creates the coordinator definition:
{code}
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<coordinator-app xmlns="uri:oozie:coordinator:0.3"
                 name="FALCON_PROCESS_DEFAULT_DP-BaseSummaryProcess"
                 frequency="${coord:minutes(30)}" start="2014-12-09T10:00Z"
                 end="2099-01-01T00:00Z" timezone="UTC">
..
    <datasets>
        <dataset name="ConversionEnhance" frequency="${coord:minutes(30)}"
                 initial-instance="2013-02-26T08:00Z" timezone="UTC">
            <uri-template>hdfs://emerald/data/fetl/conversionenhance/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
            <done-flag>_SUCCESS</done-flag>
        </dataset>
...
                <property>
                    <name>ConversionEnhance</name>
                    <value>${dataIn('ConversionEnhance', '*/{MATCH}')}</value>
                </property>
{code}

The base directory (without the partition) is specified as the dataset on 
which the coordinator waits for data to become available. The resolved input 
path, however, has the partition appended.
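
To make the mismatch concrete, here is a minimal Python sketch. It is not Falcon or Oozie code: the helper names are hypothetical, and it assumes the done-flag is looked up directly under the resolved dataset instance directory and that the partition is simply appended to that directory when the input path is resolved.

```python
from datetime import datetime

# Values copied from the coordinator definition above.
URI_TEMPLATE = ("hdfs://emerald/data/fetl/conversionenhance/"
                "${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}")
DONE_FLAG = "_SUCCESS"
PARTITION = "*/{MATCH}"


def resolve_instance(template: str, t: datetime) -> str:
    """Hypothetical helper: substitute the time variables in a uri-template."""
    fields = {
        "YEAR": f"{t:%Y}",
        "MONTH": f"{t:%m}",
        "DAY": f"{t:%d}",
        "HOUR": f"{t:%H}",
        "MINUTE": f"{t:%M}",
    }
    for key, value in fields.items():
        template = template.replace("${" + key + "}", value)
    return template


def waited_on_path(t: datetime) -> str:
    """Path whose existence gates the run (assumed: base dir + done-flag)."""
    return resolve_instance(URI_TEMPLATE, t) + "/" + DONE_FLAG


def resolved_input_path(t: datetime) -> str:
    """Path handed to the process (assumed: base dir + partition)."""
    return resolve_instance(URI_TEMPLATE, t) + "/" + PARTITION


if __name__ == "__main__":
    instance = datetime(2014, 12, 9, 10, 0)
    print("waits on:", waited_on_path(instance))
    print("reads   :", resolved_input_path(instance))
```

Under these assumptions the availability check never looks inside the partition directory, which is why the process waits on the parent directory as a whole.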




> When a partition is specified in input feed, Falcon should only wait for 
> data availability in a partition
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: FALCON-1676
>                 URL: https://issues.apache.org/jira/browse/FALCON-1676
>             Project: Falcon
>          Issue Type: Bug
>            Reporter: Pallavi Rao
>
> When a process uses a feed with a partition as its input, Falcon waits for 
> data to be available in all partitions (the parent dir), rather than just 
> waiting for data availability in that particular partition.
> Example process input:
> {code}
> <inputs>
>         <input name="ConversionEnhance" feed="FETL-ConversionEnhance"
>                start="now(0,-30)" end="now(0,-30)" partition="*/{MATCH}"/>
> {code}
> If the consumer doesn't want to wait for all the data to be available and 
> cares only about the data in that partition, the user is currently forced 
> to create a feed per partition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
