Re: Spark #cores

2017-01-18 Thread Palash Gupta
Yes, spark.sql.shuffle.partitions needs to be changed.
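For illustration, a minimal sketch of setting that property (this assumes the Spark 2.x SparkSession API; the app name and the value 32 are placeholders):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("ShufflePartitionsExample")
      .config("spark.sql.shuffle.partitions", "32") // the default is 200
      .getOrCreate()

    // The setting can also be changed at runtime, before the shuffling query runs:
    spark.conf.set("spark.sql.shuffle.partitions", "32")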

Re: Spark #cores

2017-01-18 Thread Saliya Ekanayake
Thank you, Daniel and Yong!

Re: Spark #cores

2017-01-18 Thread Daniel Siegmann
I am not too familiar with Spark Standalone, so unfortunately I cannot give you any definite answer. I do want to clarify something though. The properties spark.sql.shuffle.partitions and spark.default.parallelism affect how your data is split up, which will determine the *total* number of tasks, not how many of them run at once; that is bounded by the cores your executors actually have.
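To make the distinction concrete, a sketch under the same assumptions (Spark 2.x; all values illustrative): the two properties fix how many tasks a stage has, while concurrency comes from the cores granted to the executors.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("TasksVsCores")
      .config("spark.default.parallelism", "32")    // default partition count for RDD shuffles
      .config("spark.sql.shuffle.partitions", "32") // tasks per Spark SQL shuffle stage
      .getOrCreate()

    // With 8 executors of 1 core each, only 8 of those 32 tasks run at any moment;
    // the remaining tasks wait in the queue until a core frees up.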

Re: Spark #cores

2017-01-18 Thread Yong Zhang
From: Saliya Ekanayake <esal...@gmail.com> Sent: Wednesday, January 18, 2017 3:21 PM To: Yong Zhang Cc: spline_pal...@yahoo.com; jasbir.s...@accenture.com; User Subject: Re: Spark #cores > So, I should be using spark.sql.shuffle.partitions to control the parallelism? Is there a guide to how to tune this?

Re: Spark #cores

2017-01-18 Thread Saliya Ekanayake
So, I should be using spark.sql.shuffle.partitions to control the parallelism? Is there a guide to how to tune this?

Re: Spark #cores

2017-01-18 Thread Yong Zhang
…then you should change "spark.default.parallelism", instead of "spark.sql.shuffle.partitions". Yong From: Saliya Ekanayake <esal...@gmail.com> Sent: Wednesday, January 18, 2017 12:33 PM To: spline_pal...@yahoo.com Cc: jasbir.s...@accenture.com; User Subject: Re: Spark #cores > The Spark version I am
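A sketch of the RDD-side knob being suggested here (assuming the classic SparkConf/SparkContext API; the input path and the count of 32 are placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("RddParallelism")
      .set("spark.default.parallelism", "32") // used when an operation gets no explicit count
    val sc = new SparkContext(conf)

    // Shuffle operations also take an explicit partition count, which overrides the default:
    val counts = sc.textFile("hdfs:///tmp/input")
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .reduceByKey(_ + _, 32) // 32 reduce-side partitions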

Re: Spark #cores

2017-01-18 Thread Saliya Ekanayake
> On Wed, Jan 18, 2017 at 10:33 AM, <jasbir.s...@accenture.com> wrote: >> Are you talking here of Spark SQL? >> If yes, spark.sql.shuffle.partitions needs to be changed.

Re: Spark #cores

2017-01-18 Thread Palash Gupta
> Hi, I am running a Spark application setting the number of executor cores to 1 and a default parallelism of 32 over 8 physical nodes. The web UI shows it's running on 200 cores. I can't relate this number to the parameters I've used.

Re: Spark #cores

2017-01-18 Thread Saliya Ekanayake
> Hi, I am running a Spark application setting the number of executor cores to 1 and a default parallelism of 32 over 8 physical nodes…

RE: Spark #cores

2017-01-18 Thread jasbir.sing
Are you talking here of Spark SQL? If yes, spark.sql.shuffle.partitions needs to be changed.
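A quick sanity check, sketched for a Spark 2.x shell session (where `spark` is the preconfigured SparkSession): read back the value Spark SQL is actually using, which is 200 unless overridden.

    // Prints "200" on a stock configuration, matching the number in the web UI:
    println(spark.conf.get("spark.sql.shuffle.partitions"))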

Spark #cores

2017-01-18 Thread Saliya Ekanayake
Hi, I am running a Spark application setting the number of executor cores to 1 and a default parallelism of 32 over 8 physical nodes. The web UI shows it's running on 200 cores. I can't relate this number to the parameters I've used. How can I control the parallelism in a more deterministic way?
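One way to make the allocation explicit, sketched under the assumption of a Standalone cluster (the property values are illustrative, not recommendations):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("DeterministicParallelism")
      .config("spark.executor.cores", "1")          // cores per executor
      .config("spark.cores.max", "8")               // cap on total cores the app may take (Standalone)
      .config("spark.default.parallelism", "32")    // default partitions for RDD operations
      .config("spark.sql.shuffle.partitions", "32") // Spark SQL shuffle tasks; defaults to 200
      .getOrCreate()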