+1
On a related note, there is a lot of interest in Hadoop and Spark from the HPC community, which typically runs Slurm, PBS, or SGE to control jobs (as opposed to Yarn and Mesos). Currently, there are several projects that launch Yarn clusters (or MR1 clusters) inside PBS jobs [1], but this is not ideal. It would be much better to run spark-submit pbs://master.whatever.org ... and have the job run directly.

I would also appreciate help on how to move such a project forward for Spark, since Spark has performance benefits over Hadoop, and I don't think Hadoop can currently be disentangled from Yarn.

I think I would need to define a new PbsExecutorBackend and PbsSchedulerBackend. IPython approaches this by writing a job script and shelling out to command-line tools like qsub, qdel, and qstat, because most job schedulers use these tools as a front end [2]. That way we should be able to support Slurm, PBS, and SGE in one shot, rather than implementing a wire format for RPC against each scheduler.
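To make the shell-out idea concrete, here is a minimal sketch of what such a backend might do: render a PBS job script and hand it to qsub, in the same spirit as IPython's launcher. Everything here is illustrative; the resource flags, script contents, and function names (make_pbs_script, submit) are my own assumptions, not an existing Spark API.

```python
import shlex
import subprocess

def make_pbs_script(app_jar, executor_class, num_nodes=2, cores=4, mem_gb=8):
    """Render a hypothetical PBS job script that would start a Spark
    executor backend on the allocated nodes. The #PBS directives and the
    spark-class invocation are illustrative only; a real
    PbsSchedulerBackend would also pass the driver URL so executors can
    register back with the driver."""
    lines = [
        "#!/bin/bash",
        f"#PBS -l nodes={num_nodes}:ppn={cores}",
        f"#PBS -l mem={mem_gb}gb",
        "#PBS -N spark-executor",
        f"$SPARK_HOME/bin/spark-class {executor_class} {shlex.quote(app_jar)}",
    ]
    return "\n".join(lines) + "\n"

def submit(script_path):
    """Shell out to qsub, mirroring IPython's launcher approach.
    Returns the job id that qsub prints on stdout."""
    result = subprocess.run(["qsub", script_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Swapping the script header and the submit/cancel/status commands (sbatch/scancel/squeue for Slurm, qsub/qdel/qstat for PBS and SGE) would then be the only scheduler-specific part.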

Thanks,
Ewan Higgs

[1] https://hadoop.apache.org/docs/r1.2.1/hod_scheduler.html
https://github.com/glennklockwood/hpchadoop
http://jaliyacgl.blogspot.be/2008/08/hadoop-as-batch-job-using-pbs.html
https://github.com/hpcugent/hanythingondemand

[2] http://ipython.org/ipython-doc/stable/parallel/parallel_process.html
https://github.com/ipython/ipython/blob/master/IPython/parallel/apps/launcher.py#L1150

On 31/01/15 09:55, Anjana Fernando wrote:
Hi everyone,

I've been experimenting with Spark and am somewhat of a newbie. I was
wondering if there is any way I can use a custom cluster manager
implementation with Spark. As I understand it, the built-in modes
currently supported are standalone, Mesos, and Yarn. My requirement is
basically a simple clustering solution with high availability of the
master. I don't want to use a separate ZooKeeper cluster, since that
would complicate my deployment; rather, I would like to use something
like Hazelcast, which has a peer-to-peer cluster coordination
implementation.

I found that there is already this JIRA [1], which requests a custom
persistence engine, I guess for storing state information. So basically,
what I would want to do is use Hazelcast for leader election, to make an
existing node the master, and to look up the state information from
distributed memory. I'd appreciate any help on how to achieve this. And
if it is useful for a wider audience, hopefully I can contribute it back
to the project.

[1] https://issues.apache.org/jira/browse/SPARK-1180
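To illustrate the two pieces that plan needs, here is a toy sketch of a persistence engine plus a leader-election agent. This is not Spark's or Hazelcast's API; the class and method names are hypothetical, and the local dict and lock stand in for the replicated maps and distributed lock a Hazelcast-backed implementation would actually use.

```python
import threading

class InMemoryPersistenceEngine:
    """Toy stand-in for the pluggable persistence engine SPARK-1180 asks
    for. It persists named master state (workers, apps, drivers); a real
    implementation would back the dict with a Hazelcast replicated map."""
    def __init__(self):
        self._store = {}
        self._lock = threading.Lock()

    def persist(self, name, obj):
        with self._lock:
            self._store[name] = obj

    def unpersist(self, name):
        with self._lock:
            self._store.pop(name, None)

    def read(self, prefix):
        """Return all objects whose key starts with prefix, e.g. to
        recover every registered worker after a master failover."""
        with self._lock:
            return [v for k, v in self._store.items() if k.startswith(prefix)]

class SimpleLeaderElection:
    """Illustrative leader election: the first node to claim the lock
    becomes master, and re-acquiring is idempotent for the holder. A
    distributed lock from Hazelcast would play this role across JVMs."""
    def __init__(self):
        self._leader = None
        self._lock = threading.Lock()

    def try_acquire(self, node_id):
        with self._lock:
            if self._leader is None:
                self._leader = node_id
            return self._leader == node_id
```

The idea is that on failover, the node that wins the election reads the persisted worker/app state back and takes over as master, with no separate ZooKeeper cluster involved.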

Cheers,
Anjana.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org