wypoon commented on a change in pull request #23767: [SPARK-26329][CORE][WIP]
Faster polling of executor memory metrics.
URL: https://github.com/apache/spark/pull/23767#discussion_r262716263
##########
File path:
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala
##########
@@ -267,12 +279,17 @@ private[spark] class EventLoggingListener(
override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = {
if (shouldLogStageExecutorMetrics) {
- // For the active stages, record any new peak values for the memory metrics for the executor
- event.executorUpdates.foreach { executorUpdates =>
- liveStageExecutorMetrics.values.foreach { peakExecutorMetrics =>
- val peakMetrics = peakExecutorMetrics.getOrElseUpdate(
- event.execId, new ExecutorMetrics())
- peakMetrics.compareAndUpdatePeakValues(executorUpdates)
+ event.executorUpdates.foreach { case (k1, peakUpdates) =>
+ liveStageExecutorMetrics.foreach { case (k2, peakExecutorMetrics) =>
+ // If the update came from the driver, the key k1 will be the dummy key (-1, -1),
+ // so record those peaks for all active stages.
+ // Otherwise, record the peaks for the matching stage.
+ val k0 = (-1, -1)
+ if (k1 == k0 || k1 == k2) {
Review comment:
In this case, I do think it is necessary to iterate over
`liveStageExecutorMetrics`, precisely because in the driver case the executor
updates do not match any stage key; there, we want to update the peaks for all
active stages. This is the same behavior as before.
`liveStageExecutorMetrics` is a map from stage key to a value that is itself a
map from id (either an executor id or "driver") to metrics. In the driver case,
for each of these id-to-metrics maps in `liveStageExecutorMetrics`, we get the
metrics for the "driver" id and update their peaks.
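To make the nesting concrete, here is a minimal, self-contained sketch of the update rule described above. It is not the code in this PR: `PeakMetricsSketch`, `Metrics`, `StageKey`, `compareAndUpdatePeaks`, and `recordUpdate` are hypothetical names, metrics are modeled as a plain name-to-value map rather than `ExecutorMetrics`, and the stage key is assumed to be a pair of ints, as the dummy key `(-1, -1)` in the diff suggests.

```scala
import scala.collection.mutable

object PeakMetricsSketch {
  // Hypothetical stand-in for ExecutorMetrics: metric name -> peak value.
  type Metrics = mutable.Map[String, Long]
  // Assumed shape of the stage key (a pair of ints, like the dummy key).
  type StageKey = (Int, Int)

  // Dummy key used for updates that come from the driver.
  val DriverKey: StageKey = (-1, -1)

  // Raise each stored peak to the new value when the new value is larger.
  def compareAndUpdatePeaks(peaks: Metrics, update: Map[String, Long]): Unit =
    update.foreach { case (name, value) =>
      if (value > peaks.getOrElse(name, Long.MinValue)) peaks(name) = value
    }

  // live: stage key -> (id -> peaks), where id is an executor id or "driver".
  def recordUpdate(
      live: mutable.Map[StageKey, mutable.Map[String, Metrics]],
      execId: String,
      updates: Map[StageKey, Map[String, Long]]): Unit =
    updates.foreach { case (updateKey, peakUpdates) =>
      live.foreach { case (stageKey, perIdPeaks) =>
        // A driver update applies to every active stage;
        // an executor update applies only to the matching stage.
        if (updateKey == DriverKey || updateKey == stageKey) {
          val peaks = perIdPeaks.getOrElseUpdate(execId, mutable.Map.empty[String, Long])
          compareAndUpdatePeaks(peaks, peakUpdates)
        }
      }
    }
}
```

With two live stages, calling `recordUpdate` with `DriverKey` adds or updates a "driver" entry under both stage keys, whereas an update keyed by a real stage key touches only that stage's map. That is why the outer iteration over all live stages is still needed.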