Hi,
I'm looking for open source workflow tools/engines that would let us
schedule Spark jobs on a Cassandra cluster. Since there are plenty of
alternatives out there, such as Oozie, Azkaban, Luigi, Chronos, etc., I
wanted to check with people here to see what they are using today.

My requirements for the workflow engine are:

1. First-class support for Spark and Cassandra.
2. Good open source community support and well tested at production scale.
3. Should be easy to define job dependencies using XML or a web interface.
For example: Job A depends on Job B and Job C, so run Job A after B and C
have finished.
4. Time-based recurring scheduling: run the Spark jobs at a given time
every hour, day, week, or month.
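
To make requirement 3 concrete, here is a minimal sketch in Python of the
ordering I have in mind. It is not tied to any of the tools above. The job
names and the run_order helper are made up for illustration, and it only
computes an order; it does not launch Spark jobs or handle the time-based
part from requirement 4.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names: each key maps to the set of jobs it depends on.
dependencies = {
    "job_a": {"job_b", "job_c"},  # Job A runs only after B and C finish
    "job_b": set(),
    "job_c": set(),
}


def run_order(deps):
    """Return one valid order in which every job follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)  # job_b and job_c (in either order), then job_a
```

A real engine would also need retries, failure handling, and parallel
execution of independent jobs such as B and C.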

Thanks for your input.
