Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4129#discussion_r28389067
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -556,17 +556,18 @@ private[spark] class Master(
     // in the queue, then the second app, etc.
     if (spreadOutApps) {
       // Try to spread out each app among all the nodes, until it has all its cores
-      for (app <- waitingApps if app.coresLeft > 0) {
+      for (app <- waitingApps if app.coresLeft >= app.desc.coreNumPerTask) {
+        val coreNumPerTask = app.desc.coreNumPerTask
         val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
           .filter(canUse(app, _)).sortBy(_.coresFree).reverse
         val numUsable = usableWorkers.length
         val assigned = new Array[Int](numUsable) // Number of cores to give on each node
         var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
         var pos = 0
-        while (toAssign > 0) {
-          if (usableWorkers(pos).coresFree - assigned(pos) > 0) {
-            toAssign -= 1
-            assigned(pos) += 1
+        while (toAssign >= coreNumPerTask) {
+          if (usableWorkers(pos).coresFree - assigned(pos) >= coreNumPerTask) {
+            toAssign -= coreNumPerTask
+            assigned(pos) += coreNumPerTask
--- End diff --
Is it possible for this to fall into an infinite loop? Say I have 5 workers, each with 1 core left, and the application's `spark.task.cpus` is 2. We will never assign resources on any worker because no single worker has enough free cores for even one task, so the `if` never fires, `toAssign` never decreases, and the `while (toAssign >= coreNumPerTask)` condition stays true forever.
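To make the scenario concrete, here is a minimal standalone sketch, not the Master code itself: the worker count, free-core figures, the remaining-core value of 10, and the iteration cap are all made up for illustration, and it mirrors the patched loop plus the `pos = (pos + 1) % numUsable` step just below this hunk:

```scala
// Hypothetical reproduction of the reviewer's scenario: 5 ALIVE workers,
// each with 1 free core, and spark.task.cpus (coreNumPerTask) = 2.
object SpreadOutLoopSketch {
  def main(args: Array[String]): Unit = {
    val coreNumPerTask = 2                      // assumed app.desc.coreNumPerTask
    val coresFree = Array(1, 1, 1, 1, 1)        // free cores on each usable worker
    val numUsable = coresFree.length
    val assigned = new Array[Int](numUsable)
    var toAssign = math.min(10, coresFree.sum)  // pretend the app still wants 10 cores
    var pos = 0
    var iterations = 0L
    // Same structure as the patched loop; the cap only exists so this demo terminates.
    while (toAssign >= coreNumPerTask && iterations < 1000000L) {
      if (coresFree(pos) - assigned(pos) >= coreNumPerTask) {
        toAssign -= coreNumPerTask
        assigned(pos) += coreNumPerTask
      }
      pos = (pos + 1) % numUsable
      iterations += 1
    }
    // Prints: stopped after 1000000 iterations, toAssign = 5 (nothing was assigned).
    println(s"stopped after $iterations iterations, toAssign = $toAssign")
  }
}
```

Without the artificial iteration cap, the condition `toAssign >= coreNumPerTask` never becomes false here, which is exactly the hang described above.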
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]