Github user dragos commented on a diff in the pull request:
https://github.com/apache/spark/pull/11047#discussion_r51875638
--- Diff: docs/running-on-mesos.md ---
@@ -246,18 +246,13 @@ In either case, HDFS runs separately from Hadoop MapReduce, without being schedu
# Dynamic Resource Allocation with Mesos
-Mesos supports dynamic allocation only with coarse grain mode, which can resize the number of executors based on statistics
-of the application. While dynamic allocation supports both scaling up and scaling down the number of executors, the coarse grain scheduler only supports scaling down
-since it is already designed to run one executor per slave with the configured amount of resources. However, after scaling down the number of executors the coarse grain scheduler
-can scale back up to the same amount of executors when Spark signals more executors are needed.
-
-Users that like to utilize this feature should launch the Mesos Shuffle Service that
-provides shuffle data cleanup functionality on top of the Shuffle Service since Mesos doesn't yet support notifying another framework's
-termination. To launch/stop the Mesos Shuffle Service please use the provided sbin/start-mesos-shuffle-service.sh and sbin/stop-mesos-shuffle-service.sh
-scripts accordingly.
-
-The Shuffle Service is expected to be running on each slave node that will run Spark executors. One way to easily achieve this with Mesos
-is to launch the Shuffle Service with Marathon with a unique host constraint.
+Mesos supports dynamic allocation only with coarse-grain mode, which can resize the number of
+executors based on statistics of the application. For general information,
+see [Dynamic Resource Allocation](job-scheduling.html#dynamic-resource-allocation).
+
+The External Shuffle Service to use is the Mesos Shuffle Service. It provides shuffle data cleanup functionality
+on top of the Shuffle Service, since Mesos doesn't yet support notifying another framework of
+termination. To launch it, run `$SPARK_HOME/sbin/start-mesos-shuffle-service.sh` on all slave nodes, with
+`spark.shuffle.service.enabled` set to true. This can be achieved through Marathon, using a unique host constraint.
--- End diff --
One thing to note, though: Marathon won't be able to launch
`sbin/start-mesos-shuffle-service.sh`, because the script immediately goes to the
background and Marathon thinks it exited. It will keep re-launching it to the end of
days. What you need is to launch the service via `spark-class` instead; for instance,
I'm using `bin/spark-class org.apache.spark.deploy.mesos.MesosExternalShuffleService`.
See [this
discussion](http://mail-archives.apache.org/mod_mbox/mesos-user/201511.mbox/%[email protected]%3E)
on `mesos-user`.
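
To make the suggestion concrete, a Marathon app definition along these lines keeps the service in the foreground so Marathon can supervise it. This is a sketch under my own assumptions — the Spark install path, resource sizes, and instance count are illustrative, not taken from the PR:

```json
{
  "id": "spark-mesos-shuffle-service",
  "cmd": "/opt/spark/bin/spark-class org.apache.spark.deploy.mesos.MesosExternalShuffleService",
  "cpus": 0.5,
  "mem": 1024,
  "instances": 3,
  "constraints": [["hostname", "UNIQUE"]]
}
```

The `["hostname", "UNIQUE"]` constraint is what the docs text means by "a unique host constraint": at most one shuffle-service instance per slave.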
---