Josh Rosen created SPARK-8344:
---------------------------------
Summary: Add internal metrics / logging for DAGScheduler to detect
long pauses / blocking
Key: SPARK-8344
URL: https://issues.apache.org/jira/browse/SPARK-8344
Project: Spark
Issue Type: New Feature
Components: Scheduler, Spark Core
Reporter: Josh Rosen
It would be useful to log a warning whenever the DAGScheduler event
processing loop blocks for longer than some threshold, or whenever its
message inbox grows too large. This debug logging (probably disabled by
default) would be very helpful for finding places where the scheduling loop
blocks or slows down.
We might be able to infer this information today from the scheduler delays
shown in the web UI, but those are hard to extract from logs and hard to use
for raising monitoring alerts.
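
To make the request concrete, here is a minimal sketch of the kind of
instrumentation described above: a single-consumer event loop that warns when
one event takes too long to handle or when the inbox backlog gets too large.
It is written in Python only for brevity and is not Spark code. The class name
MonitoredEventLoop and the parameters slow_event_secs and max_backlog are
hypothetical names made up for this illustration, not existing Spark classes
or configuration keys.

```python
import logging
import queue
import time

log = logging.getLogger("event_loop_monitor")


class MonitoredEventLoop:
    """Hypothetical single-consumer event loop with slow-event and backlog warnings."""

    def __init__(self, handler, slow_event_secs=1.0, max_backlog=1000,
                 clock=time.monotonic):
        self._handler = handler
        self._slow_event_secs = slow_event_secs  # assumed threshold, not a Spark setting
        self._max_backlog = max_backlog          # assumed threshold, not a Spark setting
        self._clock = clock                      # injectable so tests can fake time
        self._events = queue.Queue()
        self.warnings = []                       # kept in memory for inspection

    def post(self, event):
        """Enqueue an event for the loop to process later."""
        self._events.put(event)

    def process_one(self):
        """Take one event (blocking if the inbox is empty), handle it, and check both thresholds."""
        event = self._events.get()

        # Events still waiting behind the one we just dequeued.
        backlog = self._events.qsize()
        if backlog > self._max_backlog:
            self._warn("inbox backlog is %d events (limit %d)"
                       % (backlog, self._max_backlog))

        start = self._clock()
        try:
            self._handler(event)
        finally:
            # Measure even if the handler raised, since a failing handler
            # can still have blocked the loop.
            elapsed = self._clock() - start
            if elapsed > self._slow_event_secs:
                self._warn("handling %r blocked the loop for %.3fs (limit %.3fs)"
                           % (event, elapsed, self._slow_event_secs))

    def _warn(self, message):
        self.warnings.append(message)
        log.warning(message)
```

A real loop would call process_one() repeatedly from a dedicated thread. The
warnings go through a normal logger, so they can be left off by default and
picked up by log-based alerting when enabled.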