[
https://issues.apache.org/jira/browse/SPARK-55353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Huan Zheng updated SPARK-55353:
-------------------------------
Summary: Driver OOM regression with complex SQL queries: Add config to
disable SQLAppStatusListener (was: Driver OOM with complex SQL queries: Add
config to disable SQLAppStatusListener)
> Driver OOM regression with complex SQL queries: Add config to disable
> SQLAppStatusListener
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-55353
> URL: https://issues.apache.org/jira/browse/SPARK-55353
> Project: Spark
> Issue Type: Improvement
> Components: SQL, UI
> Affects Versions: 3.5.7
> Reporter: Huan Zheng
> Priority: Major
> Attachments: desensitized_sql_example.txt,
> 企业微信截图_6a9bc73d-dbd5-4013-b517-5c074114b511.png,
> 企业微信截图_742d6985-ae85-4f8d-b33b-d9445a65ba7a.png
>
>
> PROBLEM:
> Severe driver OOM regression from Spark 3.2.2 to 3.5.7 for complex SQL
> queries.
> - Spark 3.2.2: Same query runs with 2GB driver memory
> - Spark 3.5.7 (with an extra patch for SPARK-45439): Same query OOMs with 8GB
> driver memory
> - Memory regression: at least a 4x increase in driver memory requirements
> (8GB is still not enough)
> Query characteristics:
> - ~200 stages
> - Complex feature engineering with multiple UDFs
> - 12 LEFT JOINs with subqueries
> - 100+ output columns
> ================================================================================
> ROOT CAUSE:
> Analysis of a heap dump from a 2GB driver shows 15.6M AccumulatorMetadata
> objects consuming ~1.5GB of memory, held by SQLAppStatusListener.
> The accumulator count per task increased 4x because of new metrics added
> between versions:
> - SPARK-36620 (3.3.0): Push-based shuffle metrics
> - SPARK-40711 (3.4.0): Window spill metrics
> - SPARK-43214 (3.5.0): Driver-side metrics
> SQLAppStatusListener collects all these metrics in driver memory, causing OOM
> for queries with many stages and tasks.
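As a rough sanity check on the heap dump figures above, the numbers can be turned into an average cost per object and a share of the driver heap. This is only a back-of-the-envelope sketch: it assumes "~1.5GB" and "2GB" are binary gigabytes and that the dump came from the 2GB driver, neither of which the report states explicitly.

```python
# Figures quoted in the report; the GiB interpretation is an assumption.
OBJECT_COUNT = 15_600_000            # AccumulatorMetadata instances in the heap dump
RETAINED_BYTES = 1.5 * 1024 ** 3     # "~1.5GB" held via SQLAppStatusListener
DRIVER_HEAP_BYTES = 2 * 1024 ** 3    # heap size of the dumped driver


def bytes_per_object(total_bytes, count):
    """Average number of bytes attributed to each object."""
    return total_bytes / count


def heap_fraction(part_bytes, heap_bytes):
    """Fraction of the heap taken up by part_bytes."""
    return part_bytes / heap_bytes


avg = bytes_per_object(RETAINED_BYTES, OBJECT_COUNT)
share = heap_fraction(RETAINED_BYTES, DRIVER_HEAP_BYTES)

print(f"average per AccumulatorMetadata object: {avg:.0f} bytes")
print(f"share of the driver heap: {share:.0%}")
```

Under these assumptions each object accounts for roughly 100 bytes, and the listener's accumulator metadata alone fills about three quarters of a 2GB heap, so the sheer object count rather than the size of any single object is what drives the OOM.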
> ================================================================================
> SOLUTION:
> Add a new static configuration, spark.sql.ui.appStatusListener.enabled, that
> controls whether SQLAppStatusListener is loaded.
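Because the proposed setting is static, it would have to be supplied when the application is launched rather than changed on a running session. The sketch below only assembles a `spark-submit` command line for illustration. The key is the one proposed in this issue and is not confirmed here to exist in any release; the value `false` (meaning "do not load the listener"), the application name `job.py`, and the helper names are hypothetical.

```python
import shlex

# Key proposed in SPARK-55353; treating "false" as "do not load
# SQLAppStatusListener" is an assumption, not confirmed behavior.
PROPOSED_KEY = "spark.sql.ui.appStatusListener.enabled"


def conf_args(conf):
    """Turn a {key: value} mapping into repeated --conf key=value arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


def submit_command(app, conf):
    """Build a shell-safe spark-submit command line (hypothetical helper)."""
    return shlex.join(["spark-submit", *conf_args(conf), app])


command = submit_command(
    "job.py",  # placeholder application
    {PROPOSED_KEY: "false", "spark.driver.memory": "2g"},
)
print(command)
```

Passing the value at submit time means it is already in place when the session, and with it any SQL listeners, is created.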