Check how much free memory you have on your host:

/usr/bin/free
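
For example, to see the figures in gigabytes rather than kilobytes (use -m for megabytes):

/usr/bin/free -g

Depending on your procps version, the memory actually left over shows up either in the "available" column or on the "-/+ buffers/cache" line.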

As a heuristic, start with values like these

export SPARK_EXECUTOR_CORES=4    ## Number of cores for the workers (Default: 1)
export SPARK_EXECUTOR_MEMORY=8G  ## Memory per worker (e.g. 1000M, 2G) (Default: 1G)
export SPARK_DRIVER_MEMORY=1G    ## Memory for the driver (e.g. 1000M, 2G) (Default: 512M)

in conf/spark-env.sh, and then add your standalone hostname an extra time to
conf/slaves to start another worker process, so that two worker processes run
on that host.
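
For illustration, with myhost standing in for your actual hostname, conf/slaves
would then list the host twice (whether duplicate entries really start a second
worker can depend on the Spark version; SPARK_WORKER_INSTANCES in
conf/spark-env.sh is the documented knob for running more than one worker per
node):

myhost
myhost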

Run sbin/start-master.sh (if the master is not already running) and then sbin/start-slaves.sh.
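
Once the master and workers are up, point spark-submit at the standalone master
rather than local[*], otherwise the executor settings will not have the
intended effect. A minimal sketch, assuming the default master port 7077 and
with myhost, your.main.Class and your-application.jar as placeholders:

spark-submit --master spark://myhost:7077 \
  --executor-memory 8G \
  --total-executor-cores 4 \
  --class your.main.Class your-application.jar

Note that --total-executor-cores is honoured in standalone mode, whereas
--num-executors is a YARN-only option.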

Check the Spark web UI for the job at hostname:4040/executors/.
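
For reference, the default ports (they may have been changed in your configuration):

http://myhost:8080            -- standalone master UI, lists registered workers and their cores/memory
http://myhost:4040/executors/ -- UI of the running application, shows its executors

The master UI is a quick way to confirm that the workers registered with the
cores and memory you intended.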

Then test your jobs for timing and completion, and adjust the parameters
accordingly. Also make sure that the memory/core ratio is reasonable.
Regardless of the deployment mode you are using, these are general
configuration guidelines for Spark. The important thing for Spark is memory:
without enough of it your application will start spilling to disk and
performance will suffer. Make sure you do not starve the OS of memory and cores.
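
As a rough illustration for the 8-core / 16 GB host described in the question
below (how much headroom to leave for the OS is a judgement call):

16 GB total - ~2 GB for the OS - 1 GB driver  =>  ~13 GB available for executors
one worker with SPARK_EXECUTOR_MEMORY=8G and SPARK_EXECUTOR_CORES=4  =>  2 GB per core
leftover headroom: roughly 5 GB and 3-4 cores

A second worker at 8 GB would oversubscribe this host, so reduce the per-worker
memory if you add one.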

HTH


Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 7 May 2016 at 12:03, kmurph <k.l.mur...@qub.ac.uk> wrote:

>
> Hi,
>
> I'm running Spark 1.6.1 on a single machine, initially a small one
> (8 cores, 16GB RAM), using "--master local[*]" with spark-submit, and I'm
> trying to see scaling with increasing cores, unsuccessfully.
> Initially I'm setting SPARK_EXECUTOR_INSTANCES=1, and increasing cores for
> each executor.  The way I'm setting cores per executor is either with
> "SPARK_EXECUTOR_CORES=1" (up to 4), or with --conf
> "spark.executor.cores=1 spark.executor.memory=9g".
> I'm repartitioning the RDD of the large dataset into 4/8/10 partitions for
> different runs.
>
> Am I setting executors/cores correctly for running Spark 1.6 in
> local/standalone mode?
> The logs show the same overall timings for execution of the key stages
> (within a stage I see the number of tasks match the data partitioning
> value)
> whether I set 1, 4 or 8 cores per executor.  And the process table
> looks like the requested cores aren't being used.
>
> I know e.g. "--num-executors=X" is only an argument to YARN.  I can't find
> specific instructions in one place for setting these params
> (executors/cores) on Spark running on one machine.
>
> An example of my full spark-submit command is:
>
> SPARK_EXECUTOR_INSTANCES=1 SPARK_EXECUTOR_CORES=4 spark-submit --master
> local[*] --conf "spark.executor.cores=4 spark.executor.memory=9g" --class
> asap.examples.mllib.TfIdfExample
>
> /home/ubuntu/spark-1.6.1-bin-hadoop2.6/asap_ml/target/scala-2.10/ml-operators_2.10-1.0.jar
>
> The settings are duplicated here, but it shows the different ways I've been
> setting the parameters.
>
> Thanks
> Karen
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Correct-way-of-setting-executor-numbers-and-executor-cores-in-Spark-1-6-1-for-non-clustered-mode-tp26894.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
