Support for workflow actions running across multiple big data frameworks in parallel?

Som Satpathy Mon, 17 Jun 2013 15:26:59 -0700

Hi,

I have a question about a use case where a user wants to design and
execute workflows on YARN/Mesos that run Hadoop and Spark in parallel.
Does Oozie currently support, or will it support in the future, actions that
can execute in a framework-agnostic manner?


For simplicity, consider an example in which a user defines a workflow as
follows:

Step 1: MapReduce job on Hadoop to do some initial data processing (input:
data from S3, output: data into S3)
Step 2: Spark job to further process the data (input: data from S3, output:
data into S3)

The assumption here is that both Hadoop and Spark are running in the same
cluster, managed by YARN/Mesos.

Can Oozie support such a use case today?

Thanks,
Som
