The newly generated partitions should be part of data-out. You can pass the partitions using the coord:dataOut() EL function.
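As a rough sketch of what that could look like (not a tested configuration): the coordinator declares the partition directory as a data-out event and hands the resolved URI to the workflow through a configuration property. The coordinator name, start/end times, initial-instance, the event name "newPartitions", the property name "outputPartitions" and the app-path are all made up for illustration; only the uri-template comes from your mail. The sketch declares a single instance, coord:current(0), so adjust it to whatever instances your step 1 actually writes.

```xml
<!-- Illustrative only: names, dates and app-path below are assumptions. -->
<coordinator-app name="partition-pipeline" frequency="${coord:hours(1)}"
                 start="2013-12-12T02:00Z" end="2013-12-13T02:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- Hourly dataset using the uri-template from the question. -->
    <dataset name="partitions" frequency="${coord:hours(1)}"
             initial-instance="2013-12-11T00:00Z" timezone="UTC">
      <uri-template>hdfs://xxx:8020/data/dth=${YEAR}-${MONTH}-${DAY}-${HOUR}</uri-template>
    </dataset>
  </datasets>
  <output-events>
    <!-- Declare the partition written by this action as an output event. -->
    <data-out name="newPartitions" dataset="partitions">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <!-- Hypothetical workflow location. -->
      <app-path>hdfs://xxx:8020/apps/pipeline-wf</app-path>
      <configuration>
        <property>
          <!-- coord:dataOut() resolves the URI of the named data-out event. -->
          <name>outputPartitions</name>
          <value>${coord:dataOut('newPartitions')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

Inside the workflow, the value is then available as ${outputPartitions} and can be passed on to the action that processes the partitions.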

Regards,
Rohini

On Thu, Dec 12, 2013 at 2:12 AM, Huiting Li <[email protected]> wrote:

> Hi,
>
> In oozie coordinator, we can Using ${coord:current(int n)} to create a
> data-pipeline using a coordinator application. It's said that
> "${coord:current(int n)} represents the nth dataset instance for a
> synchronous dataset, relative to the coordinator action creation
> (materialization) time. The coordinator action creation (materialization)
> time is computed based on the coordinator job start time and its frequency.
> The nth dataset instance is computed based on the dataset's
> initial-instance datetime, its frequency and the (current) coordinator
> action creation (materialization) time."
> However, our case is: coordinator starts at for example 2013-12-12-02,
> step 1 outputs multiple partitioned data, like partitions
> /data/dth=2013-12-11-22, /data/dth=2013-12-11-23, /data/dth=2013-12-12-02.
> We want to process all these newly generated partitions in step 2. That
> means, step 2 take the output of step 1 as its input, and will process
> data in the new partitions one by one. So if we define a dataset like
> below in step 2, how could we define the input events (in </data-in>) and
> pass parameters(in configuration property) to step2?
> <uri-template>
> hdfs://xxx:8020/data/dth=${YEAR}-${MONTH}-${DAY}-${HOUR}
> </uri-template>
>
> Does oozie support such kind of pipeline?
>
> Thanks,
> Huiting
