Sorry for the incorrect information. Where can I pick up these architectural/design concepts for Spark? I seem to have misunderstood the responsibilities of the master and the driver.
On Mon, Feb 17, 2014 at 10:51 PM, Michael (Bach) Bui <[email protected]> wrote:

> Spark has the concepts of Driver and Master.
>
> The Driver is the Spark program that you run on your local machine. The
> SparkContext resides in the driver, together with the DAG scheduler. The
> Master is responsible for managing cluster resources, e.g. giving the
> Driver the workers that it needs. The Master can be either a Mesos master
> (for a Mesos cluster), a Spark master (for a Spark standalone cluster), or
> the ResourceManager (for a Hadoop cluster).
>
> Given the resources assigned by the Master, the Driver will use the DAG to
> assign tasks to workers.
>
> So yes, the results of Spark's actions will be sent back to the driver,
> which is your local console.
>
>
> On Feb 17, 2014, at 10:54 AM, David Thomas <[email protected]> wrote:
>
> So if I do a Spark action, say collect, will I be able to see the result
> on my local console? Or would it be available only on the cluster master?
>
>
> On Mon, Feb 17, 2014 at 9:50 AM, purav aggarwal <[email protected]> wrote:
>
>> Your local machine simply submits your job (in the form of a jar) to the
>> cluster.
>> The master node is where the SparkContext object is created, a DAG of
>> your job is formed, and tasks (stages) are assigned to different workers,
>> which are not aware of anything but the computation of the tasks assigned
>> to them.
>>
>>
>> On Mon, Feb 17, 2014 at 10:07 PM, David Thomas <[email protected]> wrote:
>>
>>> Where is the SparkContext object created then? On my local machine or on
>>> the master node in the cluster?
>>>
>>>
>>> On Mon, Feb 17, 2014 at 4:17 AM, Nhan Vu Lam Chi <[email protected]> wrote:
>>>
>>>> Your local app will be called the "driver program", which creates jobs
>>>> and submits them to the cluster for running.
>>>>
>>>>
>>>> On Mon, Feb 17, 2014 at 9:19 AM, David Thomas <[email protected]> wrote:
>>>>
>>>>> From the docs
>>>>> <https://spark.incubator.apache.org/docs/latest/spark-standalone.html>:
>>>>>
>>>>> *Connecting an Application to the Cluster*
>>>>> *To run an application on the Spark cluster, simply pass the
>>>>> spark://IP:PORT URL of the master to the SparkContext constructor.*
>>>>>
>>>>> Could someone enlighten me on what happens if I run the app from, say,
>>>>> Eclipse on my local machine, but use the URL of the master node, which
>>>>> is in the cloud. What role does my local JVM play then?
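To make the division of labor concrete, here is a minimal sketch of such a driver program, written against the Spark 0.9-era SparkContext constructor. The master URL (spark://master-host:7077), app name, and jar path are placeholders, not values from this thread:

import org.apache.spark.SparkContext

object DriverExample {
  def main(args: Array[String]) {
    // The SparkContext is created here, in the local JVM -- this process
    // is the driver, and the DAG scheduler lives inside it.
    val sc = new SparkContext(
      "spark://master-host:7077",        // placeholder standalone-master URL
      "DriverExample",                   // application name
      System.getenv("SPARK_HOME"),       // Spark home on the workers
      Seq("target/driver-example.jar"))  // placeholder jar shipped to workers

    // The map transformation executes out on the cluster's workers...
    val squares = sc.parallelize(1 to 10).map(x => x * x)

    // ...but collect is an action, so its result is sent back to this
    // driver JVM and printed on the local console, as described above.
    squares.collect().foreach(println)

    sc.stop()
  }
}

Run this from the local machine (e.g. inside Eclipse) and the collected results print locally, not on the master: the master (standalone, Mesos, or YARN) only brokers the workers, while scheduling and action results stay with this local driver process.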
