Hi,

Yes, I'm running the executors with 8 cores each. I've also properly
configured executor memory, driver memory, number of executors and so on in
the submit command.
I'm a long-time Spark user, so please let's skip the basic submit-command
configuration questions and dive into the interesting part :)

Another strange thing I've noticed is that this behaviour only shows up
when reading JSON (a rough sketch of the two reads follows below):
- reading JSON from remote HDFS -> uneven executor performance
- reading Parquet from remote HDFS -> even executor performance
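To make the comparison concrete, this is roughly what I'm running (a
minimal sketch using the standard DataFrame reader API; the HDFS paths,
namenode address and app name below are placeholders, not the real job):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.SQLContext

  object ReadComparison {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("read-comparison"))
      val sqlContext = new SQLContext(sc)

      // JSON read from remote HDFS -> input ends up concentrated on a few executors
      val jsonDf = sqlContext.read.json("hdfs://remote-nn:8020/data/events_json/")
      jsonDf.count()

      // Parquet read of the same data from the same cluster -> input is spread evenly
      val parquetDf = sqlContext.read.parquet("hdfs://remote-nn:8020/data/events_parquet/")
      parquetDf.count()

      sc.stop()
    }
  }

Both reads go against the same remote cluster; only the input format
differs, yet only the JSON case shows the skew.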

>>What do you mean by the difference between the nodes is huge ?
When I look at the Input column in the Executors tab of the Spark WebUI,
the gap between the values for the nodes that do the work and all the
others is huge. For example, in the image below the difference is about 4x,
but I've sometimes seen 10x in the same use case.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n26504/inputcol.png> 



