We usually run Spark in HA with the following stack:

- Apache Mesos
- Marathon - init/control system for starting, stopping, and maintaining always-on applications (mainly Spark Streaming).
- Chronos - general-purpose scheduler for Mesos; supports job dependency graphs.
- Spark Job Server - primarily for its ability to reuse shared contexts across multiple jobs.
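As a rough sketch of the Marathon piece of that stack, an app definition that keeps a Spark Streaming driver always running could look like the following. This is only illustrative: the app id, jar path, class name, ZooKeeper hosts, and resource numbers are all hypothetical, and the exact fields should be checked against your Marathon version's docs.

```json
{
  "id": "spark-streaming-driver",
  "cmd": "/opt/spark/bin/spark-submit --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos --class com.example.StreamingJob /opt/jobs/streaming-job.jar",
  "cpus": 2,
  "mem": 4096,
  "instances": 1
}
```

The point of running the driver under Marathon is that Marathon restarts the task if it exits, which is what gives the "always-on" behaviour for streaming jobs.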
This thread has a better discussion:
http://apache-spark-user-list.1001560.n3.nabble.com/How-do-you-run-your-spark-app-td7935.html

Thanks
Best Regards

On Mon, Jan 12, 2015 at 10:08 PM, preeze <etan...@gmail.com> wrote:

> Dear community,
>
> I've been searching the internet for quite a while to find out what is the
> best architecture to support HA for a Spark client.
>
> We run an application that connects to a standalone Spark cluster and
> caches a big chunk of data for subsequent intensive calculations. To
> achieve HA we'll need to run several instances of the application on
> different hosts.
>
> Initially I explored the option to reuse (i.e. share) the same executor
> set between the SparkContext instances of all running applications, and
> found it impossible.
>
> So every application that creates a SparkContext instance has to spawn
> its own executors. Externalizing and sharing the executors' memory cache
> with Tachyon is only a partial solution, since each application's
> executors will keep using their own set of CPU cores.
>
> Spark-jobserver is another possibility. It manages the SparkContext
> itself and accepts job requests from multiple clients against the same
> context, which is brilliant. However, it becomes a new single point of
> failure.
>
> Now I am exploring whether it's possible to run the Spark cluster in
> YARN cluster mode and connect to the driver from multiple clients.
>
> Is there anything I am missing, guys?
> Any suggestion is highly appreciated!
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Apache-Spark-client-high-availability-tp10088.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
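On the Spark-jobserver option discussed above: the shared-context behaviour is driven by its configuration plus REST API. A minimal config sketch might look like the one below; the setting names follow the spark-jobserver documentation, but the master URL, port, and resource values here are hypothetical placeholders.

```hocon
spark {
  # Master of the cluster the jobserver submits to (placeholder host)
  master = "spark://spark-master:7077"

  jobserver {
    # REST port clients submit jobs to
    port = 8090
  }

  # Defaults applied to contexts created via the REST API
  context-settings {
    num-cpu-cores = 2
    memory-per-node = "512m"
  }
}
```

In use, a client would first create a long-lived named context over REST and then submit multiple jobs against it (e.g. POST to /contexts/shared-context, then POST jobs with ?context=shared-context), which is what lets many clients reuse one set of executors - while also making the jobserver itself the single point of failure the mail describes.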