Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28365419
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -524,52 +524,25 @@ private[master] class Master(
   }

   /**
-   * Can an app use the given worker? True if the worker has enough memory and we haven't already
-   * launched an executor for the app on it (right now the standalone backend doesn't like having
-   * two executors on the same worker).
+   * Schedule executors to be launched on the workers. There are two modes of launching executors.
+   * The first attempts to spread out an application's executors on as many workers as possible,
+   * while the second does the opposite (i.e. launch them on as few workers as possible). The former
+   * is usually better for data locality purposes and is the default. The number of cores assigned
+   * to each executor is configurable. When this is explicitly set, multiple executors from the same
+   * application may be launched on the same worker if the worker has enough cores and memory.
+   * Otherwise, each executor grabs all the cores available on the worker by default, in which case
+   * only one executor may be launched on each worker.
    */
-  private def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
-    worker.memoryFree >= app.desc.memoryPerSlave && !worker.hasExecutor(app)
-  }
-
-  /**
-   * Schedule the currently available resources among waiting apps. This method will be called
-   * every time a new app joins or resource availability changes.
-   */
-  private def schedule() {
-    if (state != RecoveryState.ALIVE) { return }
-
-    // First schedule drivers, they take strict precedence over applications
-    // Randomization helps balance drivers
-    val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
-    val numWorkersAlive = shuffledAliveWorkers.size
-    var curPos = 0
-
-    for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
-      // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
-      // start from the last worker that was assigned a driver, and continue onwards until we have
-      // explored all alive workers.
-      var launched = false
-      var numWorkersVisited = 0
-      while (numWorkersVisited < numWorkersAlive && !launched) {
-        val worker = shuffledAliveWorkers(curPos)
-        numWorkersVisited += 1
-        if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
-          launchDriver(worker, driver)
-          waitingDrivers -= driver
-          launched = true
-        }
-        curPos = (curPos + 1) % numWorkersAlive
-      }
-    }
-
+  private def startExecutorsOnWorkers(): Unit = {
     // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
     // in the queue, then the second app, etc.
     if (spreadOutApps) {
-      // Try to spread out each app among all the nodes, until it has all its cores
+      // Try to spread out each app among all the workers, until it has all its cores
       for (app <- waitingApps if app.coresLeft > 0) {
         val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
-          .filter(canUse(app, _)).sortBy(_.coresFree).reverse
+          .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
+            worker.coresFree > 0)
--- End diff --
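The scaladoc added in this diff describes two allocation modes. As a rough illustration of the first (spread-out) mode only, here is a toy sketch; the `spreadOut` helper and its flat `Array[Int]` view of free cores are hypothetical simplifications, not the Master's actual data structures:

```scala
// Toy model of the "spread out" mode described in the scaladoc above.
// Everything here is a simplification for illustration, not Spark code.
object SpreadOutSketch extends App {
  // coresFree has one entry per usable worker; the result says how many
  // cores the app would be granted on each of them.
  def spreadOut(coresFree: Array[Int], coresToAssign: Int): Array[Int] = {
    val assigned = Array.fill(coresFree.length)(0)
    var left = coresToAssign
    var pos = 0
    // Hand out one core at a time, round-robin, so the app lands on as
    // many workers as possible (better data locality, per the scaladoc).
    while (left > 0 && assigned.sum < coresFree.sum) {
      if (assigned(pos) < coresFree(pos)) {
        assigned(pos) += 1
        left -= 1
      }
      pos = (pos + 1) % coresFree.length
    }
    assigned
  }

  // Three workers with 4 free cores each, app asking for 6 cores:
  // spread-out yields 2 + 2 + 2 rather than 4 + 2 + 0.
  println(spreadOut(Array(4, 4, 4), 6).toList) // List(2, 2, 2)
}
```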
The new `worker.coresFree > 0` check should be `worker.coresFree >= app.desc.coresPerExecutor.getOrElse(1)`; otherwise a worker with fewer free cores than a single executor needs still passes the filter.
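A minimal, self-contained sketch of the suggested predicate, with `WorkerInfo` and an application-description type stubbed down to just the fields read here (the real classes carry much more state):

```scala
// Standalone sketch of the suggested filter, not the actual Master code:
// WorkerInfo and AppDesc are stubs with only the fields this predicate reads.
object ExecutorFilterSketch extends App {
  case class WorkerInfo(memoryFree: Int, coresFree: Int)
  case class AppDesc(memoryPerExecutorMB: Int, coresPerExecutor: Option[Int])

  // A worker is usable only if it can host at least one whole executor:
  // enough free memory, plus coresPerExecutor free cores (defaulting to 1
  // when unset, since an executor that grabs all cores still needs one).
  def canLaunchExecutor(app: AppDesc, worker: WorkerInfo): Boolean =
    worker.memoryFree >= app.memoryPerExecutorMB &&
      worker.coresFree >= app.coresPerExecutor.getOrElse(1)

  val app = AppDesc(memoryPerExecutorMB = 1024, coresPerExecutor = Some(4))

  // Under the diff's `coresFree > 0` this worker would pass the filter,
  // yet it cannot host a single 4-core executor.
  println(canLaunchExecutor(app, WorkerInfo(memoryFree = 4096, coresFree = 2))) // false
  println(canLaunchExecutor(app, WorkerInfo(memoryFree = 4096, coresFree = 8))) // true
}
```

The `getOrElse(1)` default lines up with the scaladoc's one-executor-per-worker mode: when `coresPerExecutor` is unset, any worker with at least one free core remains usable.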