mridulm commented on code in PR #36162:
URL: https://github.com/apache/spark/pull/36162#discussion_r865591651
##########
core/src/main/scala/org/apache/spark/SparkStatusTracker.scala:
##########
@@ -120,4 +120,8 @@ class SparkStatusTracker private[spark] (sc: SparkContext,
store: AppStatusStore
exec.memoryMetrics.map(_.totalOnHeapStorageMemory).getOrElse(0L))
}.toArray
}
+
+ def getAppStatusStore: AppStatusStore = {
+ store
+ }
Review Comment:
There are a few aspects here:
a) Should we track metrics in the scheduler?
Task metrics are part of `TaskInfo` in the scheduler - a view of this state is
exposed via the listener, and is used by the UI (`AppStatusStore`, etc.): but
the source of truth is the scheduler.
Currently the scheduler maintains this only for completed tasks.
To make the UI more responsive to task progress, metrics are also sent via
executor heartbeat (SPARK-2099, etc.) - so that users can see in-progress
updates to metrics for running tasks.
This is not tracked in the scheduler, since we did not need any of this
information there (we directly fire the `SparkListenerExecutorMetricsUpdate`
event). Now that we do, we can leverage the heartbeat to update the relevant
state.
(Note, I am not proposing to maintain all `accumulables`, just the ones we
need - see (b) below.)
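To make (a) concrete, here is a minimal, self-contained sketch - not Spark's actual classes; `InProgressTaskMetrics` and its method names are hypothetical - of how the scheduler could keep just the metrics it needs for running tasks, refreshed from each executor heartbeat:

```scala
import scala.collection.mutable

// Hypothetical sketch: the scheduler keeps only the metric it needs
// (e.g. executor run time) per running task, updated on each heartbeat
// before the SparkListenerExecutorMetricsUpdate event is fired.
object InProgressTaskMetrics {
  // taskId -> latest executorRunTime reported via heartbeat
  private val runTimeByTask = mutable.Map.empty[Long, Long]

  // Called from the (hypothetical) heartbeat handler in the scheduler.
  def onHeartbeat(updates: Seq[(Long, Long)]): Unit = synchronized {
    updates.foreach { case (taskId, runTime) =>
      runTimeByTask(taskId) = runTime
    }
  }

  // Called when a task finishes; the final value lives in TaskInfo.
  def onTaskEnd(taskId: Long): Unit = synchronized {
    runTimeByTask -= taskId
  }

  // Cheap lookup the speculation check can use under the scheduler lock.
  def runTime(taskId: Long): Option[Long] = synchronized {
    runTimeByTask.get(taskId)
  }
}
```

The point of the sketch is only that the state lives in the scheduler and is updated from the same heartbeat that already carries the metrics, rather than being derived from listener events.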
b) Should we materialize specific metrics?
I would look at this as an optimization - materializing the specific subset
of metrics we need.
For completed tasks, this is already available - but requires a scan through
`accumulables` (so materializing them yields a performance gain).
For in-progress tasks, this is currently not maintained in the scheduler -
so tracking them during heartbeat would allow the scheduler to leverage that
state.
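A sketch of the optimization in (b) - the `TaskState` shape and method names here are illustrative, not Spark's - contrasting a scan through `accumulables` on every lookup with a field materialized once when the update arrives:

```scala
// Illustrative only: a task's accumulables modeled as (name, value)
// pairs, mirroring how completed-task metrics can be looked up.
object MaterializedMetric {
  final case class TaskState(
      accumulables: Seq[(String, Long)], // full list, scanned on demand
      executorRunTime: Long              // same value, materialized once
  )

  // O(n) scan through all accumulables to find one metric.
  def runTimeByScan(t: TaskState): Long =
    t.accumulables
      .collectFirst { case ("internal.metrics.executorRunTime", v) => v }
      .getOrElse(0L)

  // O(1): the value was extracted when the task update was received.
  def runTimeMaterialized(t: TaskState): Long = t.executorRunTime
}
```

With the speculation check running every 100ms over potentially many tasks, turning each per-task lookup from a scan into a field access is where the gain comes from.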
c) Going to `AppStatusStore`, etc.
Two things impact this design choice - cost and reliability.
With speculation enabled, we check for speculative tasks every 100ms. Since
this runs within the scheduler lock, we should ensure it completes quickly
(as other things will be blocked). As the total number of tasks increases,
the performance of `InMemoryStore` degrades (and it also has a bad GC
impact) [1] - which results in the scheduler/driver getting impacted.
From a reliability point of view, the listener subsystem can and does drop
events - so depending on it introduces error into scheduler decisions: this
can be avoided by tracking the state in the scheduler itself.
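The reliability concern in (c) can be shown with a toy model - this is not Spark's `LiveListenerBus`; `BoundedQueue` and `simulate` are made up for illustration. A bounded listener queue drops events under load, so state derived from it can lag behind state the scheduler tracks directly:

```scala
import scala.collection.mutable

// Toy model: a bounded listener queue that drops events when full,
// versus a counter the scheduler updates directly and never loses.
object DroppedEvents {
  final class BoundedQueue(capacity: Int) {
    private val q = mutable.Queue.empty[Long]
    var dropped = 0
    def post(taskId: Long): Unit =
      if (q.size < capacity) q.enqueue(taskId) else dropped += 1
    def drainCount(): Int = { val n = q.size; q.clear(); n }
  }

  // Returns (count seen by the listener, count tracked in the scheduler).
  def simulate(events: Int, capacity: Int): (Int, Int) = {
    val bus = new BoundedQueue(capacity)
    var schedulerCount = 0
    (1 to events).foreach { i =>
      schedulerCount += 1 // scheduler updates its own state first
      bus.post(i.toLong)  // listener event may be dropped
    }
    (bus.drainCount(), schedulerCount)
  }
}
```

When the queue is saturated and nothing drains it, the listener-derived view undercounts while the scheduler-internal count stays exact - which is why basing speculation decisions on the store is risky.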
Hopefully I am not missing anything - please let me know your thoughts
@Ngone51!
Also, please do let me know if the proposal is practical - in case I am
missing cases where this would not work.
[1] For the current use cases for this store, this is fine (it infrequently
powers the Spark UI).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]