The data from a collect would get aggregated on the driver. Since the JVM for the application is invoked from your local machine (the Spark driver), I think you should be able to print it on your console.
On Mon, Feb 17, 2014 at 10:24 PM, David Thomas <[email protected]> wrote:

> So if I do a spark action, say, collect, will I be able to see the result
> on my local console? Or would it be available only on the cluster master?
>
>
> On Mon, Feb 17, 2014 at 9:50 AM, purav aggarwal <[email protected]> wrote:
>
>> Your local machine simply submits your job (in the form of a jar) to the
>> cluster.
>> The master node is where the SparkContext object is created, a DAG of
>> your job is formed, and tasks (stages) are assigned to different workers -
>> which are not aware of anything but the computation of the task assigned
>> to them.
>>
>>
>> On Mon, Feb 17, 2014 at 10:07 PM, David Thomas <[email protected]> wrote:
>>
>>> Where is the SparkContext object created then? On my local machine or on
>>> the master node in the cluster?
>>>
>>>
>>> On Mon, Feb 17, 2014 at 4:17 AM, Nhan Vu Lam Chi <[email protected]> wrote:
>>>
>>>> Your local app will be called the "driver program", which creates jobs
>>>> and submits them to the cluster for running.
>>>>
>>>>
>>>> On Mon, Feb 17, 2014 at 9:19 AM, David Thomas <[email protected]> wrote:
>>>>
>>>>> From the docs
>>>>> <https://spark.incubator.apache.org/docs/latest/spark-standalone.html>:
>>>>>
>>>>> *Connecting an Application to the Cluster*
>>>>> *To run an application on the Spark cluster, simply pass the
>>>>> spark://IP:PORT URL of the master to the SparkContext constructor.*
>>>>>
>>>>> Could someone enlighten me on what happens if I run the app from, say,
>>>>> Eclipse on my local machine, but use the URL of the master node which
>>>>> is on the cloud. What role does my local JVM play then?
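
For reference, the setup being discussed could be sketched roughly as follows. This is a minimal Scala sketch, not runnable as-is: "spark://IP:PORT" and the app name are placeholders, and it assumes a reachable standalone master. The point it illustrates is that the JVM running main() is the driver - the SparkContext lives there - and collect() brings results back to that JVM, which is why they print on the local console rather than on the master.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CollectExample {
  def main(args: Array[String]): Unit = {
    // The JVM running this main() is the *driver*: the SparkContext is
    // created here, the DAG is built here, and stages are scheduled from
    // here onto the cluster's workers.
    // "spark://IP:PORT" is a placeholder for your standalone master URL.
    val conf = new SparkConf()
      .setAppName("CollectExample")
      .setMaster("spark://IP:PORT")
    val sc = new SparkContext(conf)

    // The map computation runs on the workers...
    val doubled = sc.parallelize(1 to 5).map(_ * 2)

    // ...but collect() ships the results back to this driver JVM, so the
    // println output appears on the local console, not on the master.
    doubled.collect().foreach(println)

    sc.stop()
  }
}
```

If this were launched from Eclipse on a local machine, the local JVM would play the driver role described above, while the master only coordinates resource allocation among the workers.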
