To: 'user@spark.apache.org'
Subject: RE: General configurations on CDH5 to achieve maximum Spark Performance
Essentially, to change the performance yield of a software cluster infrastructure platform like Spark, you play with different permutations of:
- Number of CPU cores used by Spark
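Knobs like these are usually passed as spark-submit flags. A minimal illustrative invocation (the flag values and the job file name here are placeholder assumptions, not tuned recommendations):

```
# Illustrative spark-submit invocation; values are placeholders.
spark-submit \
  --master yarn \
  --num-executors 6 \
  --executor-cores 4 \
  --executor-memory 8g \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  my_job.py
```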
I don't think there's anything specific to CDH that you need to know,
other than it ought to set things up sanely for you.
Sandy did a couple posts about tuning:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
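The second of those posts walks through executor-sizing arithmetic for a YARN cluster. A rough sketch of that heuristic in Python, using an illustrative 6-node cluster with 16 cores and 64 GB per node (all numbers are assumptions for the example, not measurements from this thread):

```python
def size_executors(nodes, cores_per_node, mem_per_node_gb,
                   cores_per_executor=5):
    """Back-of-the-envelope executor sizing for Spark on YARN."""
    # Leave 1 core and 1 GB per node for the OS and Hadoop daemons.
    usable_cores = cores_per_node - 1
    usable_mem_gb = mem_per_node_gb - 1

    executors_per_node = usable_cores // cores_per_executor
    # Reserve one executor slot for the YARN application master.
    total_executors = nodes * executors_per_node - 1

    # YARN adds roughly 7% memory overhead per executor container,
    # so back that out of the per-executor heap.
    mem_per_executor_gb = int(usable_mem_gb / executors_per_node / 1.07)
    return total_executors, cores_per_executor, mem_per_executor_gb

print(size_executors(6, 16, 64))  # → (17, 5, 19)
```

For the example cluster this yields --num-executors 17 --executor-cores 5 --executor-memory 19g; your own numbers will differ.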
From: Manish Gupta 8 [mailto:mgupt...@sapient.com]
Sent: Thursday, April 16, 2015 6:21 PM
To: Evo Eftimov; user@spark.apache.org
Subject: RE: General configurations on CDH5 to achieve maximum Spark Performance
Thanks Evo. Yes, my concern is only regarding the infrastructure.
Thanks,
Manish
From: Evo Eftimov [mailto:evo.efti...@isecc.com]
Sent: Thursday, April 16, 2015 10:38 PM
To: Manish Gupta 8; user@spark.apache.org
Subject: RE: General configurations on CDH5 to achieve maximum Spark Performance
Well, there are a number of performance tuning guidelines in dedicated documentation [...] because all worker instances run in the memory of a single machine.
Regards,
Evo Eftimov
From: Manish Gupta 8 [mailto:mgupt...@sapient.com]
Sent: Thursday, April 16, 2015 6:03 PM
To: user@spark.apache.org
Subject: General configurations on CDH5 to achieve maximum Spark Performance
Hi,
Is there a document/link that describes the general configuration settings to achieve maximum Spark performance while running on CDH5? In our environment, we did a lot of changes (and are still doing them) to get decent performance; otherwise our 6-node dev cluster with default configurations lags.
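For reference, the kind of settings in question live in spark-defaults.conf (on CDH5, typically managed under /etc/spark/conf). A sketch of common Spark 1.x-era overrides; the values below are illustrative assumptions, not recommendations for any particular cluster:

```
# Illustrative spark-defaults.conf overrides; values are assumptions.
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.executor.memory            8g
spark.executor.cores             4
spark.default.parallelism        48
spark.shuffle.consolidateFiles   true
```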