Github user edwinalu commented on a diff in the pull request:
https://github.com/apache/spark/pull/21221#discussion_r195957438
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -169,6 +182,31 @@ private[spark] class EventLoggingListener(
   // Events that trigger a flush
   override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
+    if (shouldLogExecutorMetricsUpdates) {
+      // clear out any previous attempts, that did not have a stage completed event
+      val prevAttemptId = event.stageInfo.attemptNumber() - 1
+      for (attemptId <- 0 to prevAttemptId) {
+        liveStageExecutorMetrics.remove((event.stageInfo.stageId, attemptId))
+      }
+
+      // log the peak executor metrics for the stage, for each live executor,
+      // whether or not the executor is running tasks for the stage
+      val accumUpdates = new ArrayBuffer[(Long, Int, Int, Seq[AccumulableInfo])]()
+      val executorMap = liveStageExecutorMetrics.remove(
+        (event.stageInfo.stageId, event.stageInfo.attemptNumber()))
+      executorMap.foreach {
+        executorEntry => {
+          for ((executorId, peakExecutorMetrics) <- executorEntry) {
+            val executorMetrics = new ExecutorMetrics(-1, peakExecutorMetrics.metrics)
--- End diff --
Using only the last timestamp seems like it wouldn't carry enough information, since peaks for different metrics could occur at different times, and under different combinations of running stages.

Only -1 would be logged. Right now the code writes out SparkListenerExecutorMetricsUpdate events, which contain ExecutorMetrics, which in turn carries a timestamp. Do you think the timestamp should be removed from ExecutorMetrics? It seems useful to record when the metrics were gathered, but that information isn't being exposed at this point.
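As a minimal illustration of why one timestamp isn't enough (all names here are hypothetical, not from the PR): each metric can peak during a different heartbeat, so preserving when the peaks occurred would need one timestamp per metric, something like:

    // Hypothetical sketch: each metric may peak at a different time, so a
    // single timestamp cannot describe all of the peaks.
    class PerMetricPeaks(numMetrics: Int) {
      val peaks = Array.fill(numMetrics)(Long.MinValue)  // peak value per metric
      val peakTimestamps = Array.fill(numMetrics)(-1L)   // when each peak was seen

      def update(timestamp: Long, sample: Array[Long]): Unit = {
        for (i <- 0 until numMetrics if sample(i) > peaks(i)) {
          peaks(i) = sample(i)
          peakTimestamps(i) = timestamp
        }
      }
    }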
For both the history server and the live UI, the goal is to show the peak value for each metric for each executor. For the executors tab, this is the peak value of each metric over the lifetime of the executor. For the stages tab, it is the peak value of each metric for that executor while the stage is running. The executor could be processing tasks for other stages as well, if there are concurrent stages, or no tasks for this stage if it isn't assigned any, but what is recorded are the peak values between the time the stage starts and the time it ends (see the sketch below).
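To make the stage window concrete, here is a rough sketch of the bookkeeping described above. This is a simplification, not the PR's actual code: PeakExecutorMetrics here is a stand-in for the real class, and the handler names are illustrative.

    import scala.collection.mutable

    object StagePeaks {
      // Stand-in for the PR's peak-tracking class: keeps the max seen per metric.
      class PeakExecutorMetrics(numMetrics: Int) {
        val metrics = Array.fill(numMetrics)(0L)
        def compareAndUpdate(sample: Array[Long]): Unit =
          for (i <- sample.indices if sample(i) > metrics(i)) metrics(i) = sample(i)
      }

      // (stageId, stageAttemptId) -> executorId -> peaks seen while the stage ran
      val liveStageExecutorMetrics =
        mutable.HashMap.empty[(Int, Int), mutable.HashMap[String, PeakExecutorMetrics]]

      def onStageSubmitted(stageId: Int, attemptId: Int): Unit = {
        // open the tracking window when the stage starts
        liveStageExecutorMetrics((stageId, attemptId)) = mutable.HashMap.empty
      }

      def onMetricsUpdate(executorId: String, sample: Array[Long]): Unit = {
        // fold each heartbeat into every open stage window, whether or not
        // this executor is running tasks for that stage
        for (stageMap <- liveStageExecutorMetrics.values) {
          stageMap.getOrElseUpdate(executorId, new PeakExecutorMetrics(sample.length))
            .compareAndUpdate(sample)
        }
      }
    }

At stage completion the window's map is removed and its peaks are logged, which is what the quoted diff above does with liveStageExecutorMetrics.remove.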
Can you describe how the stage-level metrics would work with only the last timestamp for any peak metric? Would there be a check to see whether the event is being read from the history log?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]