Hi Mich,

I have set up the Spark default configuration in the conf directory, in spark-defaults.conf, where I specify the master, hence there is no need to put it on the command line:

spark.master    spark://spark.master:7077

The same applies to the driver memory, which has been increased to 4GB, and likewise spark.executor.memory is set to 12GB, as the machines have 16GB.
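For completeness, the relevant entries in conf/spark-defaults.conf look roughly like this (a sketch reflecting the values above; the hostname spark.master comes from my cluster and the exact whitespace between key and value does not matter):

spark.master            spark://spark.master:7077
spark.driver.memory     4g
spark.executor.memory   12g

The same values could equally be passed to spark-submit on the command line via --driver-memory 4g and --executor-memory 12g.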
Jakub

On 4 July 2016 at 17:44, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Jakub,
>
> In standalone mode Spark does the resource management. Which version of
> Spark are you running?
>
> How do you define your SparkConf() parameters, for example setMaster etc.?
>
> From
>
> spark-submit --driver-class-path spark/sqljdbc4.jar --class DemoApp SparkPOC.jar 10 4.3
>
> I did not see any executor or memory allocation, so I assume you are
> allocating them somewhere else?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising
> from such loss, damage or destruction.
>
> On 4 July 2016 at 16:31, Jakub Stransky <stransky...@gmail.com> wrote:
>
>> Hello,
>>
>> I have a Spark cluster consisting of 4 nodes in standalone mode: a
>> master plus 3 worker nodes, with available memory, CPUs etc. configured.
>>
>> I have a Spark application which is essentially an MLlib pipeline for
>> training a classifier, in this case a RandomForest, but it could be a
>> DecisionTree, just for the sake of simplicity.
>>
>> But when I submit the Spark application to the cluster via spark-submit,
>> it runs out of memory. Even though the executors are "taken"/created in
>> the cluster, they are essentially doing nothing (poor CPU and memory
>> utilization) while the master seems to do all the work, which finally
>> results in an OOM.
>>
>> My submission is the following:
>>
>> spark-submit --driver-class-path spark/sqljdbc4.jar --class DemoApp SparkPOC.jar 10 4.3
>>
>> I am submitting from the master node.
>>
>> By default it is running in client mode, in which the driver process is
>> attached to the spark-shell.
>>
>> Do I need to set up some settings to make the MLlib algorithms
>> parallelized and distributed as well, or is it all driven by the
>> parallelism factor set on the DataFrame with the input data?
>>
>> Essentially it seems that all the work is done on the master and the
>> rest is idle.
>>
>> Any hints on what to check?
>>
>> Thx
>> Jakub

--
Jakub Stransky
cz.linkedin.com/in/jakubstransky