We usually run Spark in HA with the following stack:

-> Apache Mesos
-> Marathon - an init/control system for starting, stopping, and maintaining
always-on applications (mainly Spark Streaming drivers).
-> Chronos - a general-purpose scheduler for Mesos that supports job
dependency graphs.
-> Spark Job Server - primarily for its ability to reuse shared contexts
across multiple jobs.
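
To make the Marathon piece concrete: keeping a Spark Streaming driver
always-on under Marathon amounts to registering an app definition like the
sketch below, which Marathon will restart whenever the command exits. The
paths, class name, and ZooKeeper address here are placeholders, not our
actual setup:

```json
{
  "id": "/streaming/clickstream",
  "cmd": "/opt/spark/bin/spark-submit --master mesos://zk://zk1:2181/mesos --class com.example.ClickstreamJob /opt/jobs/clickstream.jar",
  "cpus": 1.0,
  "mem": 2048,
  "instances": 1
}
```

You POST this to Marathon's /v2/apps endpoint; with instances set to 1,
Marathon keeps exactly one copy of the driver running somewhere on the
cluster.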

This thread has a more detailed discussion:
http://apache-spark-user-list.1001560.n3.nabble.com/How-do-you-run-your-spark-app-td7935.html
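
On the single-point-of-failure concern with Spark Job Server raised below:
the server itself has no built-in HA, so one mitigation is to run several
instances and have the client fail over between them. This is a minimal
client-side sketch (the endpoint list and probe are assumptions, not part
of the Job Server API):

```python
# Sketch: client-side failover across several Spark Job Server instances.
# spark-jobserver has no built-in HA, so the client picks a live instance
# before submitting jobs. Endpoints and the probe are illustrative.

def first_healthy(endpoints, probe):
    """Return the first endpoint for which probe(endpoint) succeeds.

    `probe` is any callable returning True for a healthy instance,
    e.g. an HTTP GET against the instance's root URL. Probe failures
    (exceptions or False) cause the next endpoint to be tried.
    """
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except Exception:
            continue  # treat a raising probe as an unhealthy instance
    raise RuntimeError("no healthy Spark Job Server instance")
```

The probe could be as simple as an HTTP request to each instance; since
every Job Server instance manages its own contexts, the client must resubmit
context creation after failing over.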


Thanks
Best Regards

On Mon, Jan 12, 2015 at 10:08 PM, preeze <etan...@gmail.com> wrote:

> Dear community,
>
> I've been searching the internet for quite a while to find out what is the
> best architecture to support HA for a spark client.
>
> We run an application that connects to a standalone Spark cluster and
> caches
> a big chunk of data for subsequent intensive calculations. To achieve HA
> we'll need to run several instances of the application on different hosts.
>
> Initially I explored the option to reuse (i.e. share) the same executors
> set
> between SparkContext instances of all running applications. Found it
> impossible.
>
> So, every application, which creates an instance of SparkContext, has to
> spawn its own executors. Externalizing and sharing executors' memory cache
> with Tachyon is a semi-solution since each application's executors will
> keep
> using their own set of CPU cores.
>
> Spark-jobserver is another possibility. It manages SparkContext itself and
> accepts job requests from multiple clients for the same context which is
> brilliant. However, this becomes a new single point of failure.
>
> Now I am exploring if it's possible to run the Spark cluster in YARN
> cluster
> mode and connect to the driver from multiple clients.
>
> Is there anything I am missing, guys?
> Any suggestion is highly appreciated!
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Apache-Spark-client-high-availability-tp10088.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
>
