OK, so you are using some form of NFS-mounted file system shared among the
nodes, and basically you start the processes through spark-submit.
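
A minimal sketch of such a launch, assuming a standalone master at
spark://master-host:7077 and an application jar on the shared mount (both
names are hypothetical):

  # submit the driver against the standalone master;
  # --total-executor-cores caps how many cores the application claims
  $SPARK_HOME/bin/spark-submit \
    --master spark://master-host:7077 \
    --total-executor-cores 8 \
    --executor-memory 2g \
    /nfs/shared/my-app.jar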

Standalone mode uses a simple cluster manager included with Spark, which
handles resource management itself, so it is not clear to me what you are
referring to as the "worker manager" here.

This is my take on your model:
- The application will grab all the cores in the cluster (see the sketch
after this list).
- You only have one worker, and it lives within the driver JVM process.
- The driver runs on the same host as the cluster manager.
- The driver requests resources from the cluster manager to run tasks. In
this case, is there only one executor for the driver? The executor runs
tasks for the driver.
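
On the first point: in standalone mode an application claims every available
core unless it is capped. A hedged sketch of the two settings involved (the
property names are real Spark configs; the values are only examples):

  # per application, e.g. in conf/spark-defaults.conf on the submitting host
  spark.cores.max 8

  # cluster-wide default for applications that don't set spark.cores.max,
  # set on the master host in conf/spark-env.sh
  export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4"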


HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 19 May 2016 at 20:37, Mathieu Longtin <math...@closetwork.org> wrote:

> No master and no node manager, just the processes that do actual work.
>
> We use the "stand alone" version because we have a shared file system and
> a way of allocating computing resources already (Univa Grid Engine). If an
> executor were to die, we have other ways of restarting it, so we don't need
> the worker manager to deal with it.
>
> On Thu, May 19, 2016 at 3:16 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hi Mathieu
>>
>> What does this approach provide that the norm lacks?
>>
>> So basically each node has its own master in this model.
>>
>> Are these supposed to be individual standalone servers?
>>
>>
>> Thanks
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 19 May 2016 at 18:45, Mathieu Longtin <math...@closetwork.org> wrote:
>>
>>> First a bit of context:
>>> We use Spark on a platform where each user starts workers as needed. This
>>> has the advantage that all permission management is handled by the OS, so
>>> the users can only read files they have permission to.
>>>
>>> To do this, we have some utility that does the following (a sketch of the
>>> first two steps with the stock standalone scripts follows the list):
>>> - start a master
>>> - start worker managers on a number of servers
>>> - "submit" the Spark driver program
>>> - the driver then talks to the master, telling it how many executors it
>>> needs
>>> - the master tells the worker nodes to start executors and talk to the
>>> driver
>>> - the executors are started
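>>>
>>> For reference, the stock standalone scripts for those first two steps would
>>> look roughly like this (master-host is a placeholder; the master logs its
>>> spark:// URL when it starts):
>>>
>>>   # step 1: on one host, start a master
>>>   $SPARK_HOME/sbin/start-master.sh
>>>
>>>   # step 2: on each server, start a worker pointed at that master
>>>   $SPARK_HOME/sbin/start-slave.sh spark://master-host:7077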
>>>
>>> From here on, the master doesn't do much, and neither does the process
>>> manager on the worker nodes.
>>>
>>> What I would like to do is simplify this to:
>>> - Start the driver program
>>> - Start executors on a number of servers, telling them where to find the
>>> driver
>>> - The executors connect directly to the driver
>>>
>>> Is there a way I could do this without the master and worker managers?
>>>
>>> Thanks!
>>>
>>>
>>> --
>>> Mathieu Longtin
>>> 1-514-803-8977
>>>
>>
>> --
> Mathieu Longtin
> 1-514-803-8977
>
