Hi,

I have a question about a use case in which a user wants to design and
execute workflows on a YARN- or Mesos-managed cluster that runs Hadoop and
Spark side by side. Does Oozie currently support, or will it support in the
future, actions that can execute in a framework-agnostic manner?

For simplicity, let's consider an example in which a user defines a workflow
as follows:

Step 1: MapReduce job on Hadoop to do some initial data processing (input:
data from S3, output: data into S3)
Step 2: Spark job to further process the data (input: data from S3, output:
data into S3)

The assumption here is that both Hadoop and Spark are running in the same
cluster, managed by YARN/Mesos.

Can Oozie support such a use case today?

Thanks,
Som
