Spark runs in different modes: local (Spark, or anything else, does not manage resources) and standalone (Spark itself manages resources), plus others (see below).
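For concreteness, the modes map onto spark-submit flags roughly as follows. This is only a sketch: the application jar, class name and master host are placeholders, not anything from this thread.

```shell
# Local mode: driver, scheduler and executor all run in one JVM, here with 4 threads
spark-submit --master local[4] --class com.example.MyApp myapp.jar

# Standalone mode: Spark's own cluster manager; the master address is given explicitly
spark-submit --master spark://master-host:7077 --class com.example.MyApp myapp.jar

# YARN cluster mode: the driver runs inside the YARN application master;
# the ResourceManager address is picked up from the Hadoop configuration
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp myapp.jar

# YARN client mode: the driver stays in the client process
spark-submit --master yarn --deploy-mode client --class com.example.MyApp myapp.jar
```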
These are from my notes, excluding Mesos, which I have not used:

- Spark Local - Spark runs on the local host. This is the simplest set-up, best suited for learners who want to understand the different concepts of Spark and for those performing unit testing.
- Spark Standalone - a simple cluster manager included with Spark that makes it easy to set up a cluster.
- YARN Cluster Mode - the Spark driver runs inside an application master process that is managed by YARN on the cluster, and the client can go away after initiating the application. This is invoked with --master yarn and --deploy-mode cluster.
- YARN Client Mode - the driver runs in the client process, and the application master is only used for requesting resources from YARN. Unlike Spark standalone mode, in which the master's address is specified in the --master parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration, so the --master parameter is simply yarn. This is invoked with --deploy-mode client.

So Local mode is the simplest configuration of Spark and does not require a cluster. A user on the local host can launch and experiment with Spark. In this mode the driver program (SparkSubmit), the resource manager and the executor all exist within the same JVM. The JVM itself is the worker thread. In Local mode you do not need to start a master or slaves/workers. It is pretty simple, and you can run as many JVMs (spark-submit) as your resources allow (resources meaning memory and cores).

HTH

Dr Mich Talebzadeh

LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*

http://talebzadehmich.wordpress.com

On 19 June 2016 at 10:39, Takeshi Yamamuro <linguin....@gmail.com> wrote:

> There are many technical differences inside, though how you use them is
> almost the same.
> yea, in standalone mode, spark runs in a cluster way: see
> http://spark.apache.org/docs/1.6.1/cluster-overview.html
>
> // maropu
>
> On Sun, Jun 19, 2016 at 6:14 PM, Ashok Kumar <ashok34...@yahoo.com> wrote:
>
>> thank you
>>
>> What are the main differences between local mode and standalone mode? I
>> understand local mode does not support a cluster. Is that the only
>> difference?
>>
>> On Sunday, 19 June 2016, 9:52, Takeshi Yamamuro <linguin....@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> In local mode, spark runs in a single JVM that has a master and one
>> executor with `k` threads.
>>
>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala#L94
>>
>> // maropu
>>
>> On Sun, Jun 19, 2016 at 5:39 PM, Ashok Kumar <
>> ashok34...@yahoo.com.invalid> wrote:
>>
>> Hi,
>>
>> I have been told Spark in local mode is the simplest for testing. The
>> Spark documentation covers little on local mode except the cores used in
>> --master local[k].
>>
>> Where are the driver program, executor and resources? Do I need to start
>> worker threads, and how many apps can I run safely without exceeding the
>> memory allocated, etc.?
>>
>> Thanking you
>>
>> --
>> ---
>> Takeshi Yamamuro
>
> --
> ---
> Takeshi Yamamuro
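To try the local mode discussed in this thread, a minimal session might look like the following. This is a sketch that assumes Spark is installed and its bin directory is on the PATH; nothing else (no master, no workers) needs to be started first.

```shell
# Start an interactive shell in local mode with 2 worker threads
# in a single JVM; the driver, scheduler and executor all live there
spark-shell --master local[2]

# Inside the shell, sc.master reports "local[2]", and a small job
# exercises the executor threads:
#   scala> sc.master
#   scala> sc.parallelize(1 to 100).sum()
```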