Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/20940#discussion_r179795171
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -234,8 +244,22 @@ private[spark] class EventLoggingListener(
}
}
- // No-op because logging every update would be overkill
- override def onExecutorMetricsUpdate(event:
SparkListenerExecutorMetricsUpdate): Unit = { }
+ /**
+ * Log if there is a new peak value for one of the memory metrics for
the given executor.
+ * Metrics are cleared out when a new stage is started in
onStageSubmitted, so this will
+ * log new peak memory metric values per executor per stage.
+ */
+ override def onExecutorMetricsUpdate(event:
SparkListenerExecutorMetricsUpdate): Unit = {
--- End diff --
I wouldn't think you'd want to log for every new peak, as I'd expect it
would be natural for the peak to keep growing, so you'd just end up with a lot
of logs. I'd expect you'd just log the peak when the stage ended, or when the
executor died.
the downside of that approach is that you never log a peak if the driver
dies ... but then you've got to figure out the driver issue anyway.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]