Hello,

I was trying to optimize my Spark cluster, and I managed it to some extent by
making some changes in yarn-site.xml and spark-defaults.conf. Before the
changes, the MapReduce import job ran fine alongside the Thrift server (which
was slow). After the changes, I have to kill the Thrift server in order to run
my Sqoop import job.

Following are the configurations:

*yarn-site.xml*

<property>
  <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
  <value>1.0</value>
</property>

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
</property>

<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
</property>

<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>4</value>
</property>
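
The memory-side NodeManager/scheduler properties that go with these would look
something like the snippet below. The property names are the standard YARN ones,
but the values here are only illustrative for a 12 GB node, not necessarily what
is in my file:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- illustrative: ~10 GB of a 12 GB node given to YARN, rest left for OS/daemons -->
  <value>10240</value>
</property>

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- illustrative: largest single container the scheduler will grant -->
  <value>10240</value>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <!-- illustrative: container size granularity -->
  <value>1024</value>
</property>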


*spark-defaults.conf*

spark.master                       yarn
spark.driver.memory                9g
spark.executor.memory              8570m
spark.yarn.executor.memoryOverhead 646m

spark.executor.instances           11
spark.executor.cores               3
spark.default.parallelism          30

SPARK_WORKER_MEMORY 10g
SPARK_WORKER_INSTANCES 1
SPARK_WORKER_CORES 4

SPARK_DRIVER_MEMORY 9g
SPARK_DRIVER_CORES 3

SPARK_MASTER_PORT 7077

SPARK_EXECUTOR_INSTANCES 11
SPARK_EXECUTOR_CORES 3
SPARK_EXECUTOR_MEMORY 8570m


*Resources in the cluster (9 nodes):*
12 GB RAM and 6 cores on each node.
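
A rough back-of-the-envelope with these settings (my own estimate; it assumes
all 11 executors get containers and the driver also takes a YARN container):

  per executor container : 8570m + 646m overhead ≈ 9 GB
  11 executors           : 11 x 9 GB ≈ 99 GB
  driver                 : ~9 GB
  Spark total            : ~108 GB of the 9 x 12 GB = 108 GB in the cluster

  executor vcores        : 11 x 3 = 33 (+3 for the driver) out of 9 x 4 = 36 given to YARN

So if I am reading it right, the Spark application (the Thrift server) claims
essentially the whole cluster with these settings, leaving no room for the
Sqoop/MapReduce import, which would explain why I have to kill it first.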


Thanks for your time.
