+1
On a related note, there is a lot of interest in Hadoop and Spark from the HPC community, which typically runs Slurm, PBS, or SGE to control jobs (as opposed to Yarn and Mesos). Currently, there are several projects that launch Yarn clusters (or MR1 clusters) inside PBS jobs [1], but this is not ideal. It would be much better to run spark-submit pbs://master.whatever.org ... and have the job run directly.

I would also appreciate help on how to move such a project forward for Spark, since Spark has performance benefits over Hadoop, and I don't think Hadoop can currently be disentangled from Yarn.

I think I would need to define a new PbsExecutorBackend and PbsSchedulerBackend. IPython approaches this by writing a job script and shelling out to command-line tools like qsub, qdel, and qstat, because most job schedulers use these tools as a front end [2]. That way we should be able to support Slurm, PBS, and SGE in one shot, rather than implementing a wire format for RPC against each scheduler.
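To make the shell-out idea concrete, here is a minimal sketch of what such a backend might do: render a PBS job script and hand it to qsub, in the same spirit as IPython's launcher. Everything here is illustrative; the resource flags, script contents, and function names (make_pbs_script, submit) are my own assumptions, not an existing Spark API.

```python
import shlex
import subprocess

def make_pbs_script(app_jar, executor_class, num_nodes=2, cores=4, mem_gb=8):
    """Render a hypothetical PBS job script that would start a Spark
    executor backend on the allocated nodes. The #PBS directives and the
    spark-class invocation are illustrative only; a real
    PbsSchedulerBackend would also pass the driver URL so executors can
    register back with the driver."""
    lines = [
        "#!/bin/bash",
        f"#PBS -l nodes={num_nodes}:ppn={cores}",
        f"#PBS -l mem={mem_gb}gb",
        "#PBS -N spark-executor",
        f"$SPARK_HOME/bin/spark-class {executor_class} {shlex.quote(app_jar)}",
    ]
    return "\n".join(lines) + "\n"

def submit(script_path):
    """Shell out to qsub, mirroring IPython's launcher approach.
    Returns the job id that qsub prints on stdout."""
    result = subprocess.run(["qsub", script_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Swapping the script header and the submit/cancel/status commands (sbatch/scancel/squeue for Slurm, qsub/qdel/qstat for PBS and SGE) would then be the only scheduler-specific part.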

Thanks,
Ewan Higgs

[1] https://hadoop.apache.org/docs/r1.2.1/hod_scheduler.html
https://github.com/glennklockwood/hpchadoop
http://jaliyacgl.blogspot.be/2008/08/hadoop-as-batch-job-using-pbs.html
https://github.com/hpcugent/hanythingondemand

[2] http://ipython.org/ipython-doc/stable/parallel/parallel_process.html
https://github.com/ipython/ipython/blob/master/IPython/parallel/apps/launcher.py#L1150

On 31/01/15 09:55, Anjana Fernando wrote:
Hi everyone,

I've been experimenting with Spark and am somewhat of a newbie. I was
wondering if there is any way I can use a custom cluster manager
implementation with Spark. As I understand it, the built-in modes
currently supported are standalone, Mesos, and Yarn. My requirement is
basically a simple clustering solution with high availability of the
master. I don't want to use a separate ZooKeeper cluster, since that
would complicate my deployment; rather, I would like to use something
like Hazelcast, which has a peer-to-peer cluster coordination
implementation.

I found that there is already this JIRA [1], which requests a custom
persistence engine, I guess for storing state information. So basically,
what I would want to do is use Hazelcast for leader election, to make an
existing node the master, and to look up the state information from
distributed memory. I'd appreciate any help on how to achieve this. And
if it is useful for a wider audience, hopefully I can contribute it back
to the project.

[1] https://issues.apache.org/jira/browse/SPARK-1180
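To illustrate the two pieces that plan needs, here is a toy sketch of a persistence engine plus a leader-election agent. This is not Spark's or Hazelcast's API; the class and method names are hypothetical, and the local dict and lock stand in for the replicated maps and distributed lock a Hazelcast-backed implementation would actually use.

```python
import threading

class InMemoryPersistenceEngine:
    """Toy stand-in for the pluggable persistence engine SPARK-1180 asks
    for. It persists named master state (workers, apps, drivers); a real
    implementation would back the dict with a Hazelcast replicated map."""
    def __init__(self):
        self._store = {}
        self._lock = threading.Lock()

    def persist(self, name, obj):
        with self._lock:
            self._store[name] = obj

    def unpersist(self, name):
        with self._lock:
            self._store.pop(name, None)

    def read(self, prefix):
        """Return all objects whose key starts with prefix, e.g. to
        recover every registered worker after a master failover."""
        with self._lock:
            return [v for k, v in self._store.items() if k.startswith(prefix)]

class SimpleLeaderElection:
    """Illustrative leader election: the first node to claim the lock
    becomes master, and re-acquiring is idempotent for the holder. A
    distributed lock from Hazelcast would play this role across JVMs."""
    def __init__(self):
        self._leader = None
        self._lock = threading.Lock()

    def try_acquire(self, node_id):
        with self._lock:
            if self._leader is None:
                self._leader = node_id
            return self._leader == node_id
```

The idea is that on failover, the node that wins the election reads the persisted worker/app state back and takes over as master, with no separate ZooKeeper cluster involved.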

Cheers,
Anjana.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org