Hi,
In an Oozie coordinator, we can use ${coord:current(int n)} to create a
data pipeline with a coordinator application. The documentation says that
"${coord:current(int n)} represents the nth dataset instance for a synchronous
dataset, relative to the coordinator action creation (materialization) time.
The coordinator action creation (materialization) time is computed based on the
coordinator job start time and its frequency. The nth dataset instance is
computed based on the dataset's initial-instance datetime, its frequency and
the (current) coordinator action creation (materialization) time."
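To check my understanding of that definition, here is a minimal Python sketch of the calculation as I read it: take the dataset instance at or before the action's nominal time, then move n dataset frequencies from there. This is not Oozie code. The function name coord_current, the hourly frequency, and the initial-instance of 2013-12-11 00:00 are assumptions made only for illustration.

```python
from datetime import datetime, timedelta


def coord_current(n, initial_instance, frequency, nominal_time):
    """Sketch of the nth dataset instance relative to an action's nominal time.

    initial_instance: the dataset's initial-instance datetime
    frequency:        the dataset frequency as a timedelta
    nominal_time:     the coordinator action creation (materialization) time
    """
    # Number of whole dataset periods between the initial instance and the
    # nominal time (floor division of two timedeltas yields an int).
    elapsed_periods = (nominal_time - initial_instance) // frequency
    return initial_instance + frequency * (elapsed_periods + n)


if __name__ == "__main__":
    initial = datetime(2013, 12, 11, 0)   # assumed initial-instance
    hourly = timedelta(hours=1)           # assumed dataset frequency
    nominal = datetime(2013, 12, 12, 2)   # action materialized at 2013-12-12-02

    # Print the partition paths for current(-4) through current(0).
    for n in range(-4, 1):
        instance = coord_current(n, initial, hourly, nominal)
        print(n, "/data/dth=" + instance.strftime("%Y-%m-%d-%H"))
```

Under these assumptions, a range from current(-4) to current(0) covers every hourly instance from 2013-12-11-22 through 2013-12-12-02, including hours for which step 1 may not have written a partition.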
However, our case is different: the coordinator starts at, for example,
2013-12-12-02, and step 1 outputs multiple partitions, such as
/data/dth=2013-12-11-22, /data/dth=2013-12-11-23, and /data/dth=2013-12-12-02.
We want to process all of these newly generated partitions in step 2. That is,
step 2 takes the output of step 1 as its input and processes the data in the
new partitions one by one. If we define a dataset in step 2 with the
uri-template below, how should we define the input events (in <data-in>) and
pass parameters (in a configuration property) to step 2?
<uri-template>
hdfs://xxx:8020/data/dth=${YEAR}-${MONTH}-${DAY}-${HOUR}
</uri-template>
Does Oozie support this kind of pipeline?
Thanks,
Huiting