Thanks Mathieu.

So it would be interesting to see what resources are allocated in your case, especially num-executors and executor-cores. I gather every node has enough memory and cores.

    ${SPARK_HOME}/bin/spark-submit \
        --master local[2] \
        --driver-memory 4g \
        --num-executors=1 \
        --executor-memory=4G \
        --executor-cores=2 \

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 19 May 2016 at 21:02, Mathieu Longtin <math...@closetwork.org> wrote:

> The driver (the process started by spark-submit) runs locally. The
> executors run on any of thousands of servers. So far, I haven't tried more
> than 500 executors.
>
> Right now, I run a master on the same server as the driver.
>
> On Thu, May 19, 2016 at 3:49 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> OK, so you are using some form of NFS-mounted file system shared among the
>> nodes, and basically you start the processes through spark-submit.
>>
>> Stand-alone mode is a simple cluster manager included with Spark. It
>> does the management of resources, so it is not clear to me what you are
>> referring to as the worker manager here.
>>
>> This is my take on your model:
>> The application will go and grab all the cores in the cluster.
>> You only have one worker, which lives within the driver JVM process.
>> The Driver node runs on the same host as the cluster manager.
>> The Driver asks the Cluster Manager for resources to run tasks.
>> In this case there is only one executor for the Driver?
>> The Executor runs tasks for the Driver.
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> On 19 May 2016 at 20:37, Mathieu Longtin <math...@closetwork.org> wrote:
>>
>>> No master and no node manager, just the processes that do actual work.
>>>
>>> We use the "stand alone" version because we have a shared file system
>>> and a way of allocating computing resources already (Univa Grid Engine). If
>>> an executor were to die, we have other ways of restarting it; we don't need
>>> the worker manager to deal with it.
>>>
>>> On Thu, May 19, 2016 at 3:16 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Hi Mathieu
>>>>
>>>> What does this approach provide that the norm lacks?
>>>>
>>>> So basically each node has its master in this model.
>>>>
>>>> Are these supposed to be individual stand alone servers?
>>>>
>>>> Thanks
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> On 19 May 2016 at 18:45, Mathieu Longtin <math...@closetwork.org>
>>>> wrote:
>>>>
>>>>> First a bit of context:
>>>>> We use Spark on a platform where each user starts workers as needed.
>>>>> This has the advantage that all permission management is handled by the
>>>>> OS, so the users can only read files they have permission to.
>>>>>
>>>>> To do this, we have some utility that does the following:
>>>>> - start a master
>>>>> - start worker managers on a number of servers
>>>>> - "submit" the Spark driver program
>>>>> - the driver then talks to the master, tells it how many executors it
>>>>>   needs
>>>>> - the master tells the worker nodes to start executors and talk to the
>>>>>   driver
>>>>> - the executors are started
>>>>>
>>>>> From here on, the master doesn't do much, and neither do the process
>>>>> managers on the worker nodes.
>>>>>
>>>>> What I would like to do is simplify this to:
>>>>> - Start the driver program
>>>>> - Start executors on a number of servers, telling them where to find
>>>>>   the driver
>>>>> - The executors connect directly to the driver
>>>>>
>>>>> Is there a way I could do this without the master and worker managers?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> --
>>>>> Mathieu Longtin
>>>>> 1-514-803-8977
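
To make the resource question at the top of this reply concrete, here is a minimal sketch in plain Python (no Spark needed) of how the spark-submit flags shown there map to Spark configuration properties, and how many concurrent task slots they imply (executors times cores per executor). The helper names `flags_to_conf` and `task_slots` are hypothetical, invented for this illustration; only the flag-to-property pairs in the table come from Spark itself.

```python
# Resource flags from the spark-submit command above and the Spark
# configuration property each one sets.
FLAG_TO_PROPERTY = {
    "--master": "spark.master",
    "--driver-memory": "spark.driver.memory",
    "--num-executors": "spark.executor.instances",
    "--executor-memory": "spark.executor.memory",
    "--executor-cores": "spark.executor.cores",
}


def flags_to_conf(argv):
    """Hypothetical helper: turn spark-submit style arguments into a dict
    of configuration properties. Handles both "--flag value" and
    "--flag=value"; anything not in FLAG_TO_PROPERTY is ignored."""
    conf = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if "=" in arg:
            flag, value = arg.split("=", 1)
        elif arg in FLAG_TO_PROPERTY and i + 1 < len(argv):
            flag, value = arg, argv[i + 1]
            i += 1  # the next argument was consumed as this flag's value
        else:
            flag, value = arg, None
        if flag in FLAG_TO_PROPERTY and value is not None:
            conf[FLAG_TO_PROPERTY[flag]] = value
        i += 1
    return conf


def task_slots(conf):
    """Hypothetical helper: concurrent task slots = executors * cores each.
    Missing values default to 1 here purely for the illustration."""
    executors = int(conf.get("spark.executor.instances", 1))
    cores = int(conf.get("spark.executor.cores", 1))
    return executors * cores


# The flags exactly as they appear in the command above.
args = [
    "--master", "local[2]",
    "--driver-memory", "4g",
    "--num-executors=1",
    "--executor-memory=4G",
    "--executor-cores=2",
]
conf = flags_to_conf(args)
for name in sorted(conf):
    print(name, "=", conf[name])
print("task slots:", task_slots(conf))
```

With the flags above this gives one executor with two cores, i.e. two task slots. As far as I know, with a `local[N]` master the driver and the executor share one JVM, so the executor flags mainly matter once the same command is pointed at a real cluster manager.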