mridulm commented on code in PR #36162:
URL: https://github.com/apache/spark/pull/36162#discussion_r865591651
##########
core/src/main/scala/org/apache/spark/SparkStatusTracker.scala:
##########
@@ -120,4 +120,8 @@ class SparkStatusTracker private[spark] (sc: SparkContext,
store: AppStatusStore
exec.memoryMetrics.map(_.totalOnHeapStorageMemory).getOrElse(0L))
}.toArray
}
+
+ def getAppStatusStore: AppStatusStore = {
+ store
+ }
Review Comment:
There are a few aspects here:
a) Should we track metrics in the scheduler?
Task metrics are part of `TaskInfo` in the scheduler - a view of this state is
exposed via the listener, and is used by the UI (`AppStatusStore`, etc.): but
the source of truth is the scheduler.
Currently the scheduler maintains this only for completed tasks.
To make the UI more responsive to task progress, metrics are also sent via
executor heartbeat (SPARK-2099, etc.) - so that users can see in-progress
updates to metrics for running tasks.
This is not tracked in the scheduler, since we did not need any of this
information there (we directly fire the `SparkListenerExecutorMetricsUpdate`
event). Now that we do, we can leverage the heartbeat to update the relevant
state.
(Note, I am not proposing to maintain all `accumulables`, just the ones we
need - see (b) below.)
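To make (a) concrete, here is a minimal, self-contained sketch - not Spark's actual classes; `InProgressTaskMetrics` and its method names are hypothetical - of how the scheduler could keep just the metrics it needs for running tasks, refreshed from each executor heartbeat:

```scala
import scala.collection.mutable

// Hypothetical sketch: the scheduler keeps only the metric it needs
// (e.g. executor run time) per running task, updated on each heartbeat
// before the SparkListenerExecutorMetricsUpdate event is fired.
object InProgressTaskMetrics {
  // taskId -> latest executorRunTime reported via heartbeat
  private val runTimeByTask = mutable.Map.empty[Long, Long]

  // Called from the (hypothetical) heartbeat handler in the scheduler.
  def onHeartbeat(updates: Seq[(Long, Long)]): Unit = synchronized {
    updates.foreach { case (taskId, runTime) =>
      runTimeByTask(taskId) = runTime
    }
  }

  // Called when a task finishes; the final value lives in TaskInfo.
  def onTaskEnd(taskId: Long): Unit = synchronized {
    runTimeByTask -= taskId
  }

  // Cheap lookup the speculation check can use under the scheduler lock.
  def runTime(taskId: Long): Option[Long] = synchronized {
    runTimeByTask.get(taskId)
  }
}
```

The point of the sketch is only that the state lives in the scheduler and is updated from the same heartbeat that already carries the metrics, rather than being derived from listener events.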
b) Should we materialize specific metrics?
I would look at this as an optimization - materializing the specific subset
of metrics we need.
For completed tasks, this is already available - but requires a scan through
`accumulables` (so materializing them yields a performance gain).
For in-progress tasks, this is currently not maintained in the scheduler -
so tracking them during heartbeat would allow the scheduler to leverage that
state.
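A sketch of the optimization in (b) - the `TaskState` shape and method names here are illustrative, not Spark's - contrasting a scan through `accumulables` on every lookup with a field materialized once when the update arrives:

```scala
// Illustrative only: a task's accumulables modeled as (name, value)
// pairs, mirroring how completed-task metrics can be looked up.
object MaterializedMetric {
  final case class TaskState(
      accumulables: Seq[(String, Long)], // full list, scanned on demand
      executorRunTime: Long              // same value, materialized once
  )

  // O(n) scan through all accumulables to find one metric.
  def runTimeByScan(t: TaskState): Long =
    t.accumulables
      .collectFirst { case ("internal.metrics.executorRunTime", v) => v }
      .getOrElse(0L)

  // O(1): the value was extracted when the task update was received.
  def runTimeMaterialized(t: TaskState): Long = t.executorRunTime
}
```

With the speculation check running every 100ms over potentially many tasks, turning each per-task lookup from a scan into a field access is where the gain comes from.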
c) Going to `AppStatusStore`, etc.
Two things impact this design choice - cost and reliability.
With speculation enabled, we check for speculative tasks every 100ms. Since
this runs within the scheduler lock, we should ensure it completes quickly
(as other things will be blocked). As the total number of tasks increases,
the performance of `InMemoryStore` degrades (and it also has a bad GC
impact) [1] - which results in the scheduler/driver getting impacted.
From a reliability point of view, the listener subsystem can and does drop
events - so depending on it introduces error into scheduler decisions: this
can be avoided by tracking the state in the scheduler itself.
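The reliability concern in (c) can be shown with a toy model - this is not Spark's `LiveListenerBus`; `BoundedQueue` and `simulate` are made up for illustration. A bounded listener queue drops events under load, so state derived from it can lag behind state the scheduler tracks directly:

```scala
import scala.collection.mutable

// Toy model: a bounded listener queue that drops events when full,
// versus a counter the scheduler updates directly and never loses.
object DroppedEvents {
  final class BoundedQueue(capacity: Int) {
    private val q = mutable.Queue.empty[Long]
    var dropped = 0
    def post(taskId: Long): Unit =
      if (q.size < capacity) q.enqueue(taskId) else dropped += 1
    def drainCount(): Int = { val n = q.size; q.clear(); n }
  }

  // Returns (count seen by the listener, count tracked in the scheduler).
  def simulate(events: Int, capacity: Int): (Int, Int) = {
    val bus = new BoundedQueue(capacity)
    var schedulerCount = 0
    (1 to events).foreach { i =>
      schedulerCount += 1 // scheduler updates its own state first
      bus.post(i.toLong)  // listener event may be dropped
    }
    (bus.drainCount(), schedulerCount)
  }
}
```

When the queue is saturated and nothing drains it, the listener-derived view undercounts while the scheduler-internal count stays exact - which is why basing speculation decisions on the store is risky.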
Hopefully I am not missing anything - please let me know your thoughts
@Ngone51!
Also, please do let me know if the proposal is practical - in case I am
missing cases where this would not work.
[1] For the current use cases for this store, this is fine (it infrequently
powers the Spark UI).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]