Ngone51 commented on a change in pull request #27126: [SPARK-30417][CORE] Task speculation numTaskThreshold should be greater than 0 even if the configuration is not set correctly
URL: https://github.com/apache/spark/pull/27126#discussion_r364096069
##########
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
##########
@@ -87,7 +87,13 @@ private[spark] class TaskSetManager(
   // number of slots on a single executor, would the task manager speculative run the tasks if
   // their duration is longer than the given threshold. In this way, we wouldn't speculate too
   // aggressively but still handle basic cases.
-  val speculationTasksLessEqToSlots = numTasks <= (conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK)
+  // SPARK-30417: #cores per executor might not be set in spark conf for standalone mode, then
+  // the value of the conf would 1 by default. However, the executor would use all the cores on
+  // the worker. Therefore, CPUS_PER_TASK is okay to be greater than 1 without setting #cores.
+  // To handle this case, we assume the minimum number of slots is 1.

Review comment:
Will this case fail this check?
https://github.com/apache/spark/blob/e1ea806b3075d279b5f08a29fe4c1ad6d3c4191a/core/src/main/scala/org/apache/spark/SparkContext.scala#L2737

It seems `checkResourcesPerTask` also doesn't handle this case properly, so a `CPUS_PER_TASK` greater than 1 should fail the check.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
