I have three workflows which I wish to coordinate.

* WF-A partitions a single input into multiple outputs
* WF-B aggregates the partitions of all WF-A workflows at the time it is run
* WF-C processes a single aggregate partition created by WF-B

There are some more constraints on this system:

* WF-A is started by an external process. Its start time is random. Each
WF-A is independent of the others.
* WF-B cannot run concurrently with another WF-B.
* Each WF-C is independent of the others, except that no two WF-Cs can
process the same partition simultaneously, and once a WF-C has succeeded
no other WF-C should reprocess its data.
* The entire system should be driven by the external process which launches
WF-A (i.e., there is no clock in this system).
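
To make the constraints concrete, here is a rough sketch in plain Python of
the semantics I have in mind. This is not Oozie code: the Coordinator class,
its method names, and the (key, value) shape of the WF-A outputs are all
assumptions made up for illustration, and the state lives in memory only.

```python
import threading
from collections import defaultdict


class Coordinator:
    """Toy in-memory model of the constraints above (hypothetical names)."""

    def __init__(self):
        self._state = threading.Lock()   # guards the bookkeeping below
        self._wf_b = threading.Lock()    # held while a WF-B is running
        self._runs = 0                   # number of WF-B runs so far
        self.pending = []                # WF-A outputs not yet aggregated
        self.aggregates = {}             # (run, key) -> aggregated values
        self.claimed = set()             # partitions a WF-C is working on
        self.done = set()                # partitions a WF-C has completed

    def wf_a_finished(self, outputs):
        """Record the (key, value) outputs of one WF-A run."""
        with self._state:
            self.pending.extend(outputs)

    def run_wf_b(self):
        """Aggregate everything pending at the time of the call.

        Returns the new partition ids, or None if another WF-B is running.
        """
        if not self._wf_b.acquire(blocking=False):
            return None
        try:
            with self._state:
                batch, self.pending = self.pending, []
                self._runs += 1
                run = self._runs
            grouped = defaultdict(list)
            for key, value in batch:
                grouped[key].append(value)
            with self._state:
                for key, values in grouped.items():
                    self.aggregates[(run, key)] = values
            return sorted((run, key) for key in grouped)
        finally:
            self._wf_b.release()

    def try_start_wf_c(self, partition):
        """Claim a partition for a WF-C; False if it is claimed or done."""
        with self._state:
            if partition in self.claimed or partition in self.done:
                return False
            self.claimed.add(partition)
            return True

    def wf_c_finished(self, partition, success):
        """Release the claim; a successful partition is never handed out again."""
        with self._state:
            self.claimed.discard(partition)
            if success:
                self.done.add(partition)
```

In this model nothing runs on a timer: the external process would call
wf_a_finished() and then run_wf_b(), and each returned partition id would be
handed to a WF-C via try_start_wf_c(). A failed WF-C releases its claim so
the partition can be retried.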

I feel like this system may be expressible with Oozie using coordinators
(and perhaps bundles), and some custom MapReduce actions. However, I would
appreciate some thoughts on how I might construct this, as it isn't
completely clear to me how to proceed.

Thanks,
Chris

