Josh Rosen created SPARK-8344:
---------------------------------

             Summary: Add internal metrics / logging for DAGScheduler to detect long pauses / blocking
                 Key: SPARK-8344
                 URL: https://issues.apache.org/jira/browse/SPARK-8344
             Project: Spark
          Issue Type: New Feature
          Components: Scheduler, Spark Core
            Reporter: Josh Rosen


It would be useful to log a warning whenever the DAGScheduler event 
processing loop blocks for longer than some threshold, or when its message 
inbox grows too large.  This debug logging (probably disabled by default) 
would be very helpful for finding places where the scheduling loop blocks or 
slows down.
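
To make the idea concrete, here is a rough sketch of the two checks (slow 
event handling and inbox growth) on a generic single-threaded event loop. It 
is written in Python purely for illustration and is not Spark code: the class 
name MonitoredEventLoop, the parameters slow_event_secs and max_backlog, and 
the injectable clock are all hypothetical, and the real DAGScheduler loop 
would need its own thresholds and configuration keys.

```python
import logging
import queue
import time

log = logging.getLogger("event_loop_monitor")


class MonitoredEventLoop:
    """Hypothetical event loop that warns on slow handlers or a large backlog."""

    def __init__(self, handler, slow_event_secs=1.0, max_backlog=1000,
                 clock=time.monotonic):
        self.handler = handler                  # callback invoked for each event
        self.slow_event_secs = slow_event_secs  # assumed threshold for a "long pause"
        self.max_backlog = max_backlog          # assumed limit on the inbox size
        self.clock = clock                      # injectable so tests can fake time
        self.events = queue.Queue()
        self.warnings = []                      # kept so alerting hooks can inspect them

    def post(self, event):
        """Enqueue an event and warn if the inbox has grown past the limit."""
        self.events.put(event)
        backlog = self.events.qsize()
        if backlog > self.max_backlog:
            self._warn("backlog",
                       f"event queue holds {backlog} events "
                       f"(limit {self.max_backlog})")

    def process_one(self):
        """Handle the next event and warn if the handler ran too long."""
        event = self.events.get()
        start = self.clock()
        try:
            self.handler(event)
        finally:
            elapsed = self.clock() - start
            if elapsed > self.slow_event_secs:
                self._warn("slow",
                           f"handling {event!r} took {elapsed:.3f}s "
                           f"(limit {self.slow_event_secs}s)")

    def _warn(self, kind, message):
        self.warnings.append((kind, message))
        log.warning(message)
```

Measuring around the handler call only detects a pause after it ends; catching 
a handler that never returns would need a separate watchdog thread, which 
this sketch leaves out.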

We might be able to infer this information today from the scheduler delays 
shown in the web UI, but those are hard to extract from logs or to use for 
raising monitoring alerts.



