> by specifying a larger heap size than default on each worker node.
I don't follow. Which heap? Are you specifying a large heap size on the
executors? If so, do you mean you somehow launch the shuffle service when
you launch executors? Or something else?
On Wed, Feb 8, 2017 at 5:50 PM, Sun Rui wrote:
Michael,
No. We launch the external shuffle service directly on each worker node,
giving it a larger heap size than the default. The processes have been quite
stable.
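
A minimal Scala sketch of the application-side settings that go with an externally launched shuffle service; the app name and Mesos master URL below are placeholders, while the property names are standard Spark configuration keys. The shuffle service's own heap is typically raised on the daemon side, e.g. via SPARK_DAEMON_MEMORY in the environment when starting sbin/start-mesos-shuffle-service.sh.

import org.apache.spark.{SparkConf, SparkContext}

object DynAllocAppConfSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("dyn-alloc-sketch")                    // placeholder app name
      .setMaster("mesos://zk://host:2181/mesos")         // placeholder Mesos master URL
      .set("spark.shuffle.service.enabled", "true")      // talk to the shuffle service running on each node
      .set("spark.dynamicAllocation.enabled", "true")    // let executors come and go with the workload
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
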
On Feb 9, 2017, at 05:21, Michael Gummelt wrote:
Sun, are you using marathon to run the shuffle service?
On Tue, Feb 7, 2017 at 7:36 PM, Sun Rui wrote:
Yi Jan,
We have been using Spark on Mesos with dynamic allocation enabled, which works
and improves the overall cluster utilization.
In terms of jobs, do you mean jobs inside a Spark application or jobs among
different applications? Maybe you can read
http://spark.apache.org/docs/latest/job-sch
got it, thanks for clarifying!
On Thu, Feb 2, 2017 at 2:57 PM, Michael Gummelt wrote:
Yes, that's expected. spark.executor.cores sizes a single executor. It
doesn't limit the number of executors. For that, you need spark.cores.max
(--total-executor-cores).
And rdd.parallelize does not specify the number of executors. It specifies
the number of partitions, which relates to the number of tasks, not the
number of executors.
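
To make the distinction concrete, a small Scala sketch; the Mesos master URL and the numbers are placeholders, not recommendations, and the property names are standard Spark settings.

import org.apache.spark.sql.SparkSession

object CoresVsPartitionsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cores-vs-partitions-sketch")
      .master("mesos://zk://host:2181/mesos")   // placeholder Mesos master URL
      .config("spark.executor.cores", "2")      // size of EACH executor
      .config("spark.cores.max", "8")           // cap on TOTAL cores, so at most 4 executors here
      .getOrCreate()
    val sc = spark.sparkContext

    // The second argument controls the number of partitions (and hence tasks),
    // not the number of executors.
    val rdd = sc.parallelize(1 to 1000, numSlices = 16)
    println(rdd.getNumPartitions)               // 16, however many executors were launched

    spark.stop()
  }
}
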
I tried setting spark.executor.cores per executor, but Spark seems to be
spinning up as many executors as possible up to spark.cores.max or however
many CPU cores are available on the cluster. This may be undesirable because
the number of executors specified in rdd.parallelize(collection, # of
partitions) is not what Spark actually ends up launching.
As of Spark 2.0, Mesos mode does support setting cores on the executor
level, but you might need to set the property directly (--conf
spark.executor.cores=). I've written about this here:
https://docs.mesosphere.com/1.8/usage/service-guides/spark/job-scheduling/.
That doc is for DC/OS, but the concepts apply to Spark on Mesos generally.
I was mainly confused about why memory is specified at the per-executor
level but CPU cores are not.
On Thu, Feb 2, 2017 at 1:02 PM, Michael Gummelt wrote:
It sounds like you've answered your own question, right? --executor-memory
means the memory per executor. If no node can offer an executor with 200 GB
of memory, then the driver will accept no offers.
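
A sketch of how the two kinds of settings are satisfied differently; the numbers mirror the 180 GB / 64-core cluster described elsewhere in the thread, and the app name and master URL are placeholders.

import org.apache.spark.{SparkConf, SparkContext}

object MemoryVsCoresSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("memory-vs-cores-sketch")
      .setMaster("mesos://zk://host:2181/mesos")  // placeholder Mesos master URL
      // A cap: if the cluster only offers 64 cores, Spark simply runs with 64.
      .set("spark.cores.max", "1000")
      // A per-executor requirement: every executor must fit inside a single
      // offer, so asking for more memory than any one node has (e.g. 200g on
      // this cluster) means no executor, and therefore no task, ever launches.
      .set("spark.executor.memory", "8g")
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
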
On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan wrote:
Sorry, to clarify: I was using --executor-memory for memory,
and --total-executor-cores for CPU cores.
On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt wrote:
What CLI args are you referring to? I'm aware of spark-submit's arguments
(--executor-memory, --total-executor-cores, and --executor-cores).
On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan wrote:
I have done an experiment on this today. It shows that only CPUs are
tolerant of an insufficient cluster size when a job starts. On my cluster, I
have 180 GB of memory and 64 cores. When I run spark-submit (on Mesos)
with --cpu_cores set to 1000, the job starts up with 64 cores, but when I
set --executor-memory to 200 GB, which no single node can offer, the job
does not start at all.
On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan wrote:
> Tasks begin scheduling as soon as the first executor comes up
Thanks all for the clarification. Is this the default behavior of Spark on
Mesos today? I think this is what we are looking for because sometimes a
job can take up lots of resources and later jobs could not get all the
resources they need.
We've talked about that, but it hasn't become a priority because we haven't
had a driving use case. If anyone has a good argument for "variable"
resource allocation like this, please let me know.
On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin wrote:
> An alternative behavior is to launch the job with the best resource offer
> Mesos is able to give
Michael has just made an excellent explanation about dynamic allocation
support in Mesos. But IIUC, what you want to achieve is something like this
(using RAM as an example): "Launch each executor with as much memory as the
resource offer allows, rather than a fixed amount."
What about Spark on Kubernetes? Is there a way to manage dynamic resource allocation?
Regards,
Mihai Iacob
> The way I understand it is that the Spark job will not run if the CPU/Mem
> requirement is not met.
Spark jobs will still run if they only have a subset of the requested
resources. Tasks begin scheduling as soon as the first executor comes up.
Dynamic allocation yields increased utilization by only keeping executors
allocated while there is work for them to do.
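
For anyone who wants to try this, a Scala sketch of the usual dynamic-allocation knobs; the property names are standard Spark settings, while the bounds, app name, and master URL are placeholders to adjust for your cluster.

import org.apache.spark.{SparkConf, SparkContext}

object DynAllocBoundsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("dyn-alloc-bounds-sketch")
      .setMaster("mesos://zk://host:2181/mesos")                  // placeholder Mesos master URL
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")               // required for dynamic allocation
      .set("spark.dynamicAllocation.minExecutors", "1")           // tasks start on the first executor
      .set("spark.dynamicAllocation.maxExecutors", "20")          // grow only while tasks are pending
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s")  // give idle executors back to the cluster
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
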