Victor Sunderland created SPARK-53156:
-----------------------------------------
Summary: Better Capture Driver Memory Metrics when the app
proactively terminates due to resource issues
Key: SPARK-53156
URL: https://issues.apache.org/jira/browse/SPARK-53156
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.1.0
Reporter: Victor Sunderland
When the application proactively terminates because of a memory problem at the
driver (SparkOOM, result size too large, etc.), the memory metrics and the
event log often fail to show the resourcing problem. Driver memory metrics are
sampled, and the job is aborted before a sample that reflects the driver's
actual memory usage is captured. We should improve this case.
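
To illustrate the gap, here is a toy simulation of a polled memory metric. It
is a sketch only and does not use Spark code: PeakTracker, run, the trace
values, the limit, and the polling interval are all hypothetical. It assumes
the abort happens between two polling ticks, and contrasts that with forcing
one final sample on the abort path.

```python
from dataclasses import dataclass, field


@dataclass
class PeakTracker:
    """Toy stand-in for a polled driver-memory metric (not a Spark class)."""
    poll_interval: int                      # take a sample every N ticks
    peak: int = 0
    samples: list = field(default_factory=list)

    def maybe_poll(self, tick: int, used: int) -> None:
        # Periodic sampling: only ticks on the polling boundary are observed.
        if tick % self.poll_interval == 0:
            self.record(used)

    def record(self, used: int) -> None:
        self.samples.append(used)
        self.peak = max(self.peak, used)


def run(usage, limit, poll_interval, sample_on_abort):
    """Replay a memory trace and abort as soon as usage exceeds the limit."""
    tracker = PeakTracker(poll_interval)
    for tick, used in enumerate(usage):
        if used > limit:
            if sample_on_abort:
                # Hypothetical improvement: one last sample before terminating.
                tracker.record(used)
            return tracker.peak
        tracker.maybe_poll(tick, used)
    return tracker.peak


# Hypothetical driver memory usage per tick, in MB.
trace = [100, 120, 150, 400, 900, 1500]

missed = run(trace, limit=1000, poll_interval=5, sample_on_abort=False)
captured = run(trace, limit=1000, poll_interval=5, sample_on_abort=True)
print(missed, captured)
```

In this trace only tick 0 is polled before the abort at tick 5, so the
reported peak stays at the startup value unless the abort path records a
final sample. Whether a forced final sample is the right fix for Spark is an
open question for this ticket.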