Re: SPARK-streaming app running 10x slower on YARN vs STANDALONE cluster

Mukesh Jha Tue, 30 Dec 2014 01:43:56 -0800

Thanks Sandy, the issue was indeed the number of cores.
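
For reference, a rough sketch of the core arithmetic behind that, using the
figures quoted further down in this thread (5 executors with 2 cores each on
YARN, 3 standalone workers on 8-core boxes). The variable names are
illustrative only, and the calculation assumes one task per core, which is
Spark's default:

```shell
# Figures taken from the quoted thread; variable names are illustrative.
YARN_EXECUTORS=5
YARN_CORES_PER_EXECUTOR=2
STANDALONE_WORKERS=3
STANDALONE_CORES_PER_WORKER=8

# Assuming one task per core, concurrent task slots = executors * cores.
YARN_SLOTS=$((YARN_EXECUTORS * YARN_CORES_PER_EXECUTOR))
STANDALONE_SLOTS=$((STANDALONE_WORKERS * STANDALONE_CORES_PER_WORKER))

echo "YARN task slots:       ${YARN_SLOTS}"
echo "Standalone task slots: ${STANDALONE_SLOTS}"
```

Under those assumptions the YARN job had 10 task slots against 24 in
standalone mode, so the two runs were not comparable.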

Another issue I was facing is that tasks were not being distributed evenly
among the executors and were running at the NODE_LOCAL locality level, i.e.
all the tasks were running on the same executors where my Kafka receiver(s)
were running, even though the other executors were idle.


I configured *spark.locality.wait=50* instead of the default 3000 ms, which
forced the tasks to be rebalanced among the nodes. Let me know if there is a
better way to deal with this.
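
A minimal sketch of how this setting could be passed at submit time, assuming
the installed spark-submit supports the --conf option. The other arguments are
copied from the YARN command quoted below; LOCALITY_WAIT_MS and SUBMIT_CMD are
illustrative names, and the command is only printed here, not run:

```shell
# Locality wait in milliseconds; 50 is the value used in this message.
LOCALITY_WAIT_MS=50

# Assemble the submit command as a single line (backslash-newline inside
# double quotes is a line continuation). --conf is assumed to be available.
SUBMIT_CMD="spark-submit --master yarn-cluster --num-executors 5 \
--driver-memory 1024m --executor-memory 1024m --executor-cores 2 \
--conf spark.locality.wait=${LOCALITY_WAIT_MS} \
--class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar \
vm.cloud.com:2181/kafka spark-yarn avro 1 5000"

# Print the command for review instead of executing it.
echo "$SUBMIT_CMD"
```

The same property can also be set in the application's Spark configuration
instead of on the command line.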


On Tue, Dec 30, 2014 at 12:09 AM, Mukesh Jha <me.mukesh....@gmail.com>
wrote:

> Makes sense. I've also tried it in standalone mode where all 3 workers &
> the driver were running on the same 8-core box, and the results were similar.
>
> Anyway, I will share the results in YARN mode with 8-core YARN containers.
>
> On Mon, Dec 29, 2014 at 11:58 PM, Sandy Ryza <sandy.r...@cloudera.com>
> wrote:
>
>> When running in standalone mode, each executor will be able to use all 8
>> cores on the box.  When running on YARN, each executor will only have
>> access to 2 cores.  So the comparison doesn't seem fair, no?
>>
>> -Sandy
>>
>> On Mon, Dec 29, 2014 at 10:22 AM, Mukesh Jha <me.mukesh....@gmail.com>
>> wrote:
>>
>>> Nope, I am setting 5 executors with 2 cores each. Below is the command
>>> that I'm using to submit in YARN mode. This starts up 5 executor nodes and
>>> a driver, as per the Spark application master UI.
>>>
>>> spark-submit --master yarn-cluster --num-executors 5 --driver-memory
>>> 1024m --executor-memory 1024m --executor-cores 2 --class
>>> com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar vm.cloud.com:2181/kafka
>>>  spark-yarn avro 1 5000
>>>
>>> On Mon, Dec 29, 2014 at 11:45 PM, Sandy Ryza <sandy.r...@cloudera.com>
>>> wrote:
>>>
>>>> *Oops, I meant: are you setting --executor-cores to 8?
>>>>
>>>> On Mon, Dec 29, 2014 at 10:15 AM, Sandy Ryza <sandy.r...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Are you setting --num-executors to 8?
>>>>>
>>>>> On Mon, Dec 29, 2014 at 10:13 AM, Mukesh Jha <me.mukesh....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sorry Sandy, the command is just for reference, but I can confirm that
>>>>>> there are 4 executors and a driver, as shown in the Spark UI page.
>>>>>>
>>>>>> Each of these machines is an 8-core box with ~15G of RAM.
>>>>>>
>>>>>> On Mon, Dec 29, 2014 at 11:23 PM, Sandy Ryza <sandy.r...@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mukesh,
>>>>>>>
>>>>>>> Based on your spark-submit command, it looks like you're only
>>>>>>> running with 2 executors on YARN.  Also, how many cores does each
>>>>>>> machine have?
>>>>>>>
>>>>>>> -Sandy
>>>>>>>
>>>>>>> On Mon, Dec 29, 2014 at 4:36 AM, Mukesh Jha <me.mukesh....@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello Experts,
>>>>>>>> I'm benchmarking Spark on YARN (
>>>>>>>> https://spark.apache.org/docs/latest/running-on-yarn.html) vs. a
>>>>>>>> standalone Spark cluster (
>>>>>>>> https://spark.apache.org/docs/latest/spark-standalone.html).
>>>>>>>> I have a standalone cluster with 3 executors, and a Spark app
>>>>>>>> running on YARN with 4 executors, as shown below.
>>>>>>>>
>>>>>>>> The Spark job running on YARN is 10x slower than the one running
>>>>>>>> on the standalone cluster, even though YARN has more workers. In
>>>>>>>> both cases all the executors are in the same datacenter, so
>>>>>>>> network latency shouldn't be a factor. On YARN, each 5 sec batch
>>>>>>>> reads data from Kafka and takes 5 sec to process; on the
>>>>>>>> standalone cluster, each 5 sec batch is processed in 0.4 sec.
>>>>>>>> Also, in YARN mode the executors are not being used evenly:
>>>>>>>> vm-13 & vm-14 run most of the tasks, whereas in standalone mode
>>>>>>>> all the executors run tasks.
>>>>>>>>
>>>>>>>> Do I need to set some configuration to distribute the tasks
>>>>>>>> evenly? Also, do you have any pointers on why the YARN job is 10x
>>>>>>>> slower than the standalone job?
>>>>>>>> Any suggestions are greatly appreciated. Thanks in advance.
>>>>>>>>
>>>>>>>> YARN(5 workers + driver)
>>>>>>>> ========================
>>>>>>>> Executor ID  Address                RDD Blocks  Memory Used   DU     AT  FT  CT    TT    TT      Input  ShuffleRead  ShuffleWrite  Thread Dump
>>>>>>>> 1            vm-18.cloud.com:51796  0           0.0B/530.3MB  0.0 B  1   0   16    17    634 ms  0.0 B  2047.0 B     1710.0 B      Thread Dump
>>>>>>>> 2            vm-13.cloud.com:57264  0           0.0B/530.3MB  0.0 B  0   0   1427  1427  5.5 m   0.0 B  0.0 B        0.0 B         Thread Dump
>>>>>>>> 3            vm-14.cloud.com:54570  0           0.0B/530.3MB  0.0 B  0   0   1379  1379  5.2 m   0.0 B  1368.0 B     2.8 KB        Thread Dump
>>>>>>>> 4            vm-11.cloud.com:56201  0           0.0B/530.3MB  0.0 B  0   0   10    10    625 ms  0.0 B  1368.0 B     1026.0 B      Thread Dump
>>>>>>>> 5            vm-5.cloud.com:42958   0           0.0B/530.3MB  0.0 B  0   0   22    22    632 ms  0.0 B  1881.0 B     2.8 KB        Thread Dump
>>>>>>>> <driver>     vm.cloud.com:51847     0           0.0B/530.0MB  0.0 B  0   0   0     0     0 ms    0.0 B  0.0 B        0.0 B         Thread Dump
>>>>>>>>
>>>>>>>> /homext/spark/bin/spark-submit
>>>>>>>> --master yarn-cluster --num-executors 2 --driver-memory 512m
>>>>>>>> --executor-memory 512m --executor-cores 2
>>>>>>>> --class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar
>>>>>>>> vm.cloud.com:2181/kafka spark-yarn avro 1 5000
>>>>>>>>
>>>>>>>> STANDALONE(3 workers + driver)
>>>>>>>> ==============================
>>>>>>>> Executor ID  Address                RDD Blocks  Memory Used   DU     AT  FT  CT    TT    TT      Input  ShuffleRead  ShuffleWrite  Thread Dump
>>>>>>>> 0            vm-71.cloud.com:55912  0           0.0B/265.0MB  0.0 B  0   0   1069  1069  6.0 m   0.0 B  1534.0 B     3.0 KB        Thread Dump
>>>>>>>> 1            vm-72.cloud.com:40897  0           0.0B/265.0MB  0.0 B  0   0   1057  1057  5.9 m   0.0 B  1368.0 B     4.0 KB        Thread Dump
>>>>>>>> 2            vm-73.cloud.com:37621  0           0.0B/265.0MB  0.0 B  1   0   1059  1060  5.9 m   0.0 B  2.0 KB       1368.0 B      Thread Dump
>>>>>>>> <driver>     vm.cloud.com:58299     0           0.0B/265.0MB  0.0 B  0   0   0     0     0 ms    0.0 B  0.0 B        0.0 B         Thread Dump
>>>>>>>>
>>>>>>>> /homext/spark/bin/spark-submit
>>>>>>>> --master spark://chsnmvproc71vm3.usdc2.oraclecloud.com:7077
>>>>>>>> --class com.oracle.ci.CmsgK2H /homext/lib/MJ-ci-k2h.jar
>>>>>>>> vm.cloud.com:2181/kafka spark-standalone avro 1 5000
>>>>>>>>
>>>>>>>> PS: I did go through the Spark website and
>>>>>>>> http://www.virdata.com/tuning-spark/, but had no luck.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cheers,
>>>>>>>> Mukesh Jha
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>> Thanks & Regards,
>>>>>>
>>>>>> *Mukesh Jha <me.mukesh....@gmail.com>*
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>> Thanks & Regards,
>>>
>>> *Mukesh Jha <me.mukesh....@gmail.com>*
>>>
>>
>>
>
>
> --
>
>
> Thanks & Regards,
>
> *Mukesh Jha <me.mukesh....@gmail.com>*
>



-- 


Thanks & Regards,

*Mukesh Jha <me.mukesh....@gmail.com>*
