Looks like Oozie can satisfy most of your requirements.
On Fri, Aug 7, 2015 at 8:43 AM, Vikram Kone <vikramk...@gmail.com> wrote:
> Hi,
> I'm looking for open source workflow tools/engines that allow us to
> schedule Spark jobs on a DataStax Cassandra cluster. Since there are tonnes
> of alternatives out there like Oozie, Azkaban, Luigi, Chronos etc., I
> wanted to check with people here to see what they are using today.
>
> Some of the requirements of the workflow engine that I'm looking for are:
>
> 1. First-class support for submitting Spark jobs on Cassandra. Not some
> wrapper Java code to submit tasks.
> 2. Active open source community support and well tested at production
> scale.
> 3. Should be dead easy to write job dependencies using XML or a web
> interface. Ex: job A depends on job B and job C, so run job A after B and
> C are finished. Don't need to write full-blown Java applications to specify
> job parameters and dependencies. Should be very simple to use.
> 4. Time-based recurrent scheduling. Run the Spark jobs at a given time
> every hour or day or week or month.
> 5. Job monitoring, alerting on failures and email notifications on a daily
> basis.
>
> I have looked at Ooyala's Spark job server, which seems to be geared towards
> making Spark jobs run faster by sharing contexts between the jobs but isn't
> a full-blown workflow engine per se. A combination of Spark job server and
> workflow engine would be ideal.
>
> Thanks for the inputs
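
For what it's worth, the dependency rule in requirement 3 (run job A only after B and C have finished) is a topological ordering of the job graph. Here is a minimal Python sketch of that idea, just to make the semantics concrete. It is not how Oozie or any of the other engines implements it, the job names are hypothetical placeholders, and it assumes Python 3.9+ for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each key maps to the set of jobs it must wait for.
dependencies = {
    "job_a": {"job_b", "job_c"},
    "job_b": set(),
    "job_c": set(),
}


def run_order(deps):
    """Return batches of jobs; every job in a batch has all its dependencies met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        # Jobs whose dependencies are all done; these could run in parallel.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # A real scheduler would submit the jobs here and wait for completion.
        sorter.done(*ready)
    return batches


batches = run_order(dependencies)
print(batches)  # [['job_b', 'job_c'], ['job_a']]
```

Jobs B and C land in the first batch because neither depends on anything, and job A runs only once both are marked done. A workflow engine expresses the same graph declaratively instead of in code.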