mridulm commented on code in PR #36162:
URL: https://github.com/apache/spark/pull/36162#discussion_r866163541
##########
core/src/main/scala/org/apache/spark/SparkStatusTracker.scala:
##########
@@ -120,4 +120,8 @@ class SparkStatusTracker private[spark] (sc: SparkContext,
store: AppStatusStore
exec.memoryMetrics.map(_.totalOnHeapStorageMemory).getOrElse(0L))
}.toArray
}
+
+ def getAppStatusStore: AppStatusStore = {
+ store
+ }
Review Comment:
> I think this essentially means we'll have intermediate accumulables for
> TaskInfo rather than only final accumulables for the completed tasks as what we
> have today
Materializing the subset of required values was an optimization for this:
`_accumulables` is a `Seq`, so without it the scan would be repeated on every
lookup. We only need a small subset of input/shuffle-related metrics to
determine progress, while the full set can be fairly large.
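As a rough illustration of that optimization (not the code in this PR), the sketch below extracts the few progress-related values in a single pass and keeps them in a small materialized record. `Accum`, `ProgressMetrics`, and the metric names are hypothetical stand-ins, not Spark's actual `AccumulableInfo` or its accumulator names.

```scala
// Hypothetical stand-in for one accumulable entry; NOT Spark's AccumulableInfo.
final case class Accum(name: String, value: Long)

// Hypothetical materialized record holding only what progress reporting needs.
final case class ProgressMetrics(inputRecords: Long, shuffleReadRecords: Long)

object ProgressMetrics {
  // Metric names are illustrative assumptions, not confirmed Spark identifiers.
  val InputRecords = "input.recordsRead"
  val ShuffleReadRecords = "shuffle.read.recordsRead"

  // Scan the (potentially large) Seq once and keep only the needed subset,
  // so later progress queries do not have to rescan it.
  def from(accumulables: Seq[Accum]): ProgressMetrics =
    accumulables.foldLeft(ProgressMetrics(0L, 0L)) { (acc, a) =>
      a.name match {
        case InputRecords       => acc.copy(inputRecords = a.value)
        case ShuffleReadRecords => acc.copy(shuffleReadRecords = a.value)
        case _                  => acc // unrelated metric: ignore
      }
    }
}
```

Callers would compute `ProgressMetrics.from(...)` once per update and read the two fields afterwards, instead of searching the full sequence on each progress query.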
> And we'll have to track all tasks since the completed tasks were
> inprogress tasks ever.
For completed tasks, we are already tracking this.
For in-progress tasks, we are not, and that tracking will need to be added.
For tasks that are yet to start, this would be empty.
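The three cases above could be sketched roughly as follows. This is only an assumption about one possible shape: `TaskMetricsTracker` and its methods are hypothetical names, not existing Spark APIs.

```scala
import scala.collection.mutable

// Hypothetical tracker; names and method shapes are assumptions for illustration.
final class TaskMetricsTracker {
  // Latest known records-read value per task id.
  private val recordsByTask = mutable.Map.empty[Long, Long]

  // In-progress tasks: the new tracking that would need to be added
  // (for example, fed from periodic metric updates).
  def onInProgressUpdate(taskId: Long, recordsRead: Long): Unit =
    recordsByTask(taskId) = recordsRead

  // Completed tasks: the final value replaces any intermediate one.
  def onTaskEnd(taskId: Long, finalRecordsRead: Long): Unit =
    recordsByTask(taskId) = finalRecordsRead

  // Tasks that have not started have no entry, so they contribute nothing.
  def recordsFor(taskId: Long): Long =
    recordsByTask.getOrElse(taskId, 0L)
}
```

A task moves from "no entry" to intermediate values to its final value, so a single map covers all three states.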
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]