David, actually, it's the driver that "creates" and holds a reference to the SparkContext. The master in this context is only a resource manager: it knows where the workers are, how many there are, and so on, and hands out cluster resources accordingly.
The SparkContext object can get serialized/deserialized and instantiated/made available elsewhere (e.g., on the worker nodes), but that is being overly precise and doesn't apply directly to the question you're asking. So yes, if you do collect(), you will be able to see the results on your local console.

--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen


On Mon, Feb 17, 2014 at 8:54 AM, David Thomas <[email protected]> wrote:

> So if I do a Spark action, say, collect, will I be able to see the result
> on my local console? Or would it be available only on the cluster master?
>
>
> On Mon, Feb 17, 2014 at 9:50 AM, purav aggarwal <[email protected]> wrote:
>
>> Your local machine simply submits your job (in the form of a jar) to the
>> cluster.
>> The master node is where the SparkContext object is created, a DAG of
>> your job is formed, and tasks (stages) are assigned to different workers -
>> which are not aware of anything but the computation of the task assigned
>> to them.
>>
>>
>> On Mon, Feb 17, 2014 at 10:07 PM, David Thomas <[email protected]> wrote:
>>
>>> Where is the SparkContext object created then? On my local machine or on
>>> the master node in the cluster?
>>>
>>>
>>> On Mon, Feb 17, 2014 at 4:17 AM, Nhan Vu Lam Chi <[email protected]> wrote:
>>>
>>>> Your local app will be called the "driver program", which creates jobs
>>>> and submits them to the cluster for running.
>>>>
>>>> On Mon, Feb 17, 2014 at 9:19 AM, David Thomas <[email protected]> wrote:
>>>>
>>>>> From the docs
>>>>> <https://spark.incubator.apache.org/docs/latest/spark-standalone.html>:
>>>>>
>>>>> *Connecting an Application to the Cluster*
>>>>> *To run an application on the Spark cluster, simply pass the
>>>>> spark://IP:PORT URL of the master to the SparkContext constructor.*
>>>>>
>>>>> Could someone enlighten me on what happens if I run the app from, say,
>>>>> Eclipse on my local machine, but use the URL of the master node, which
>>>>> is on the cloud. What role does my local JVM play then?
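To make the thread's conclusion concrete, here is a minimal Scala sketch of the pattern discussed above: the driver program runs in your local JVM (e.g., launched from Eclipse), the SparkContext is constructed there with the remote master's spark://IP:PORT URL, and collect() brings the results back to the driver, so they print on your local console. The master URL and jar path below are placeholders for your own cluster and build output; this cannot run without a live standalone cluster at that address.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CollectDemo {
  def main(args: Array[String]): Unit = {
    // The driver program runs in THIS JVM. The master URL only tells it
    // where to request resources; "spark://master-host:7077" is a placeholder.
    val conf = new SparkConf()
      .setMaster("spark://master-host:7077")
      .setAppName("CollectDemo")
      // Ship the application jar so the workers can run your closures;
      // the path is a placeholder for your actual build artifact.
      .setJars(Seq("target/collect-demo.jar"))

    val sc = new SparkContext(conf)

    val rdd     = sc.parallelize(1 to 100) // partitions are distributed to workers
    val doubled = rdd.map(_ * 2)           // this map runs on the workers

    // collect() pulls every element back to the driver JVM, so this
    // println appears on your local console, not on the cluster.
    val result: Array[Int] = doubled.collect()
    println(result.take(5).mkString(", "))

    sc.stop()
  }
}
```

Replacing the master URL with "local[*]" runs the same program entirely inside your local JVM, which is a handy way to see that the driver-side behavior of collect() is identical either way.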
