Thanks everyone, it all makes sense now.
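For anyone who finds this thread later, here is a minimal sketch of the setup described below. The SparkContext is constructed in the driver (the local JVM) and pointed at the cluster's master URL; the app name, host, and port are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DriverDemo {
  def main(args: Array[String]): Unit = {
    // The driver runs here, in the local JVM. Only the master URL
    // points at the cluster ("master-host:7077" is a placeholder).
    val conf = new SparkConf()
      .setAppName("DriverDemo")
      .setMaster("spark://master-host:7077")
    val sc = new SparkContext(conf)

    // The map tasks execute on the workers; collect() ships the
    // results back to the driver, so they print on the local console.
    val doubled = sc.parallelize(1 to 10).map(_ * 2).collect()
    println(doubled.mkString(", "))

    sc.stop()
  }
}
```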

On Mon, Feb 17, 2014 at 10:21 AM, Michael (Bach) Bui <[email protected]> wrote:

> Spark has the concepts of a Driver and a Master.
>
> The Driver is the Spark program that you run on your local machine.
> SparkContext resides in the driver together with the DAG scheduler.
> The Master is responsible for managing cluster resources, e.g. giving the
> Driver the workers that it needs. The Master can be the Mesos master
> (for a Mesos cluster), the Spark master (for a Spark standalone cluster),
> or the ResourceManager (for a Hadoop cluster).
> Given the resources assigned by the Master, the Driver will use the DAG
> to assign tasks to workers.
>
> So yes, the results of Spark's actions will be sent back to the driver,
> which is your local console.
>
>
> On Feb 17, 2014, at 10:54 AM, David Thomas <[email protected]> wrote:
>
> So if I do a Spark action, say, collect, will I be able to see the result
> on my local console? Or would it be available only on the cluster
> master?
>
>
> On Mon, Feb 17, 2014 at 9:50 AM, purav aggarwal <
> [email protected]> wrote:
>
>> Your local machine simply submits your job (in the form of a jar) to the
>> cluster.
>> The master node is where the SparkContext object is created, a DAG of
>> your job is formed, and tasks (stages) are assigned to different workers -
>> which are not aware of anything but the computation of the task assigned
>> to them.
>>
>>
>> On Mon, Feb 17, 2014 at 10:07 PM, David Thomas <[email protected]> wrote:
>>
>>> Where is the SparkContext object created then? On my local machine or on
>>> the master node in the cluster?
>>>
>>>
>>> On Mon, Feb 17, 2014 at 4:17 AM, Nhan Vu Lam Chi <[email protected]> wrote:
>>>
>>>> Your local app will be called the "driver program", which creates jobs
>>>> and submits them to the cluster for running.
>>>>
>>>>
>>>>> On Mon, Feb 17, 2014 at 9:19 AM, David Thomas <[email protected]> wrote:
>>>>
>>>>> From the docs
>>>>> <https://spark.incubator.apache.org/docs/latest/spark-standalone.html>:
>>>>>
>>>>>
>>>>> *Connecting an Application to the Cluster*
>>>>>
>>>>> *To run an application on the Spark cluster, simply pass the
>>>>> spark://IP:PORT URL of the master to the SparkContext constructor.*
>>>>>
>>>>> Could someone enlighten me on what happens if I run the app from, say,
>>>>> Eclipse on my local machine, but use the URL of the master node, which
>>>>> is in the cloud? What role does my local JVM play then?
>>>>>
>>>>
>>>>
>>>
>>
>
>
