Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21221#discussion_r195290278
  
    --- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -169,6 +182,31 @@ private[spark] class EventLoggingListener(
     
       // Events that trigger a flush
       override def onStageCompleted(event: SparkListenerStageCompleted): Unit 
= {
    +    if (shouldLogExecutorMetricsUpdates) {
    +      // clear out any previous attempts, that did not have a stage 
completed event
    --- End diff --
    
    One potential issue here -- even though there is a stage completed event, 
you can still have tasks running from an earlier stage attempt (when there is a 
fetch failure, all existing tasks keep running).  Those leftover tasks will 
affect the memory usage for other tasks which run on those executors.
    
    That said, I dunno if we can do much better here.  The alternative would be 
to track the task start & end events for each stage attempt.
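
    To make the alternative concrete, here is a minimal, self-contained sketch 
of what tracking task start & end events per stage attempt could look like. 
This is not Spark's actual `SparkListener` API -- the event case classes and the 
`LiveTaskTracker` class below are hypothetical, simplified stand-ins for 
illustration only:

    ```scala
    import scala.collection.mutable

    // Hypothetical, simplified stand-ins for Spark's task lifecycle events.
    case class TaskStart(stageId: Int, stageAttemptId: Int, taskId: Long)
    case class TaskEnd(stageId: Int, stageAttemptId: Int, taskId: Long)

    // Tracks which tasks are still live per (stageId, stageAttemptId), so a
    // stage attempt's state is only dropped once no tasks from it remain --
    // even if a later attempt has already completed after a fetch failure.
    class LiveTaskTracker {
      private val liveTasks = mutable.Map.empty[(Int, Int), mutable.Set[Long]]

      def onTaskStart(e: TaskStart): Unit = {
        liveTasks.getOrElseUpdate((e.stageId, e.stageAttemptId),
          mutable.Set.empty[Long]) += e.taskId
      }

      def onTaskEnd(e: TaskEnd): Unit = {
        liveTasks.get((e.stageId, e.stageAttemptId)).foreach { tasks =>
          tasks -= e.taskId
          if (tasks.isEmpty) {
            liveTasks.remove((e.stageId, e.stageAttemptId))
          }
        }
      }

      // Attempt state is safe to clear only when this returns false.
      def hasLiveTasks(stageId: Int, attemptId: Int): Boolean =
        liveTasks.contains((stageId, attemptId))
    }
    ```

    With this bookkeeping, `onStageCompleted` could skip clearing an earlier 
attempt while `hasLiveTasks` still reports running tasks for it, instead of 
unconditionally clearing previous attempts.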


---
