srowen commented on a change in pull request #25487: [SPARK-28769][CORE] Improve warning message of BarrierExecutionMode when required slots > maximum slots
URL: https://github.com/apache/spark/pull/25487#discussion_r315656228
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
 ##########
 @@ -981,11 +983,20 @@ private[spark] class DAGScheduler(
      finalStage = createResultStage(finalRDD, func, partitions, jobId, callSite)
     } catch {
       case e: BarrierJobSlotsNumberCheckFailed =>
-        logWarning(s"The job $jobId requires to run a barrier stage that requires more slots " +
-          "than the total number of slots in the cluster currently.")
         // If jobId doesn't exist in the map, Scala coverts its value null to 0: Int automatically.
         val numCheckFailures = barrierJobIdToNumTasksCheckFailures.compute(jobId,
           (_: Int, value: Int) => value + 1)
+        val retryCount = numCheckFailures - 1
+        val retryMessage = if (retryCount == 0) {
+          ""
+        } else {
+          s" (Retry ${retryCount}/$maxFailureNumTasksCheck failed)"
+        }
+
+        logWarning(s"The job $jobId requires to run a barrier stage " +
 
 Review comment:
   I think this needs to be rewritten a little, and doesn't need the special case above
   ```
   Barrier stage in job $jobId requires ${...} slots, but only ${...} are available. Failure ${numCheckFailures} / ${maxFailureNumTasksCheck}
   ```
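
A minimal sketch of how the suggested single-line message could be assembled without the special case for the first failure. It is a standalone illustration, not the actual `DAGScheduler` code: the object name, the helper `barrierSlotsWarning`, and the parameters `requiredSlots` and `availableSlots` are hypothetical stand-ins for the values left as `${...}` in the suggestion above.

```scala
// Hypothetical standalone sketch; not taken from DAGScheduler.scala.
object BarrierWarningSketch {

  // requiredSlots and availableSlots are assumed names for the two values
  // that the review comment leaves elided as ${...}.
  def barrierSlotsWarning(
      jobId: Int,
      requiredSlots: Int,
      availableSlots: Int,
      numCheckFailures: Int,
      maxFailureNumTasksCheck: Int): String = {
    // One message covers every attempt, so no separate branch is needed
    // for the first failure.
    s"Barrier stage in job $jobId requires $requiredSlots slots, but only " +
      s"$availableSlots are available. " +
      s"Failure $numCheckFailures / $maxFailureNumTasksCheck"
  }
}
```

Under these assumptions the call site would reduce to a single `logWarning(...)` with the returned string, and the `retryCount` / `retryMessage` branch in the diff would no longer be required.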
