Hi. I'm looking at existing open source workflow engines we could use to schedule Spark jobs with intricate dependencies on a DataStax Cassandra cluster. Currently we use crontab to schedule the jobs and want to move to something more robust and highly available. There are two main problems with cron as we run it today:

1. Single point of failure: the cron tasks that do spark-submit all run on a single machine, and if that machine goes down, all the jobs are kaput until the node comes back up.
2. No dependencies: there's no easy way to specify dependencies between cron tasks to model a DAG.
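For context, the current crontab looks roughly like this (the spark-submit invocations, class names, and jar paths are made-up placeholders, not our real jobs):

```
# Illustrative only: every job fires on its own schedule from this one box.
# The only way to express "A depends on B and C" is a time gap, and we just
# hope B and C have finished before A starts.
0 1 * * * /opt/spark/bin/spark-submit --class com.example.JobB /opt/jobs/job-b.jar
0 1 * * * /opt/spark/bin/spark-submit --class com.example.JobC /opt/jobs/job-c.jar
0 3 * * * /opt/spark/bin/spark-submit --class com.example.JobA /opt/jobs/job-a.jar
```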
One of the workflow engines I'm looking at is Azkaban, where job authoring and dependency configuration are easy via the web UI and REST API. But it also has a single point of failure: the Azkaban master. I'm open to running the workflow engine on a separate cluster, but since Spark doesn't natively allow remote job submission, we're stuck running the workflow engine on the same Cassandra cluster. High availability of the Spark master is already taken care of in DataStax's version of Cassandra, so all I need to do is provide HA for Azkaban. Is Mesos the right tool for this? I can either go Spark + Mesos + ZooKeeper, if Mesos provides the ability to configure jobs with dependencies (i.e. run job A only after jobs B and C have finished; see the .job sketch below for what I mean by dependencies), or Spark + Azkaban + ZooKeeper, if Mesos doesn't provide job dependency features. Advice? Thanks.
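Here's that dependency sketch in Azkaban terms, assuming I'm reading the Azkaban docs right (same placeholder commands and paths as the cron example above):

```
# jobB.job -- no dependencies, runs immediately
type=command
command=/opt/spark/bin/spark-submit --class com.example.JobB /opt/jobs/job-b.jar

# jobC.job -- also runs immediately
type=command
command=/opt/spark/bin/spark-submit --class com.example.JobC /opt/jobs/job-c.jar

# jobA.job -- runs only after jobB and jobC both succeed
type=command
dependencies=jobB,jobC
command=/opt/spark/bin/spark-submit --class com.example.JobA /opt/jobs/job-a.jar
```

The `dependencies=` line is the piece cron has no equivalent for, so whichever of Mesos or Azkaban I end up with needs something like it.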

