weixiuli commented on code in PR #36162:
URL: https://github.com/apache/spark/pull/36162#discussion_r878639022


##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -863,6 +870,29 @@ private[spark] class TaskSchedulerImpl(
       executorUpdates)
   }
 
+ private def getTaskAccumulableInfosAndProgressRate(
+      updates: Seq[AccumulatorV2[_, _]]): (Seq[AccumulableInfo], Double) = {
+   var records = 0L
+   var runTime = 0L
+   val accInfos = updates.map { acc =>
+     if (calculateTaskProgressRate && acc.name.isDefined) {
+       val name = acc.name.get
+       if (name == shuffleRead.RECORDS_READ || name == input.RECORDS_READ) {
+         records += acc.value.asInstanceOf[Long]
+       } else if (name == InternalAccumulator.EXECUTOR_RUN_TIME) {
+         runTime = acc.value.asInstanceOf[Long]
+       }
+     }
+     acc.toInfo(Some(acc.value), None)
+   }
+   val taskProgressRate = if (calculateTaskProgressRate && runTime > 0) {
+     records / (runTime / 1000.0)
+   } else {
+     0.0D
+   }

Review Comment:
   Setting the accumUpdates to the TaskInfo may be unnecessary, there are two 
reasons:
   1. We only need the records and runTime in accumUpdates
   2. Setting the accumUpdates to the TaskInfo may  take up more storage space, 
and  calculating the rate also should  traverse it.
   
   With your suggestions ,  we may only set  the records and runTime to the 
TaskInfo and  calculate the rate in InefficientTaskCalculator when it's 
required.  But, if we do that , we should make sure that setting and reading 
the records and runTime with lock,  which  may make the logic more complicated 
than that we only set a  taskProgressRate to the TaskInfo, right ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to