ever4Kenny opened a new pull request, #54133: URL: https://github.com/apache/spark/pull/54133
### What changes were proposed in this pull request?

This patch adds a new static configuration `spark.sql.ui.appStatusListener.enabled` to control whether SQLAppStatusListener is loaded during SharedState initialization.

### Why are the changes needed?

Complex SQL queries with many stages and tasks cause driver OOM due to excessive AccumulatorMetadata objects held by SQLAppStatusListener.

**Memory regression from Spark 3.2 to 3.5:**

- Accumulator count per task increased 4x due to new metrics (SPARK-36620, SPARK-40711, SPARK-43214)
- Example: 155 stages × 1000 tasks × 100 accumulators = 15.5M objects (~1.5GB driver memory)
- Same query runs on 3.2.2 with 2GB driver but OOMs on 3.5.7 with 8GB driver

### Does this PR introduce any user-facing change?

Yes. A new static configuration `spark.sql.ui.appStatusListener.enabled` (default: true) is added.

When set to false:

- SQLAppStatusListener is not created, significantly reducing driver memory usage
- Limited SQL execution metrics in live Spark UI
- Event logs are still written for History Server analysis

### How was this patch tested?

- Added unit tests in SharedStateSuite to verify SQLAppStatusListener creation behavior
- Tested with production workload: OOM resolved when config is disabled

### Was this patch authored or co-authored using generative AI tooling?

No.

---

I affirm that the contribution is my original work and that I license the work to the project under the project's open source license.
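The example figure quoted in the description (155 stages × 1000 tasks × 100 accumulators) can be checked with a short back-of-the-envelope sketch. The function names below are hypothetical helpers for illustration only and are not part of Spark. The per-object size is simply what the quoted ~1.5GB total implies, not a measured value, and the pre-regression count of 25 accumulators per task is an assumption derived from the stated 4x increase.

```python
# Rough estimate of accumulator metadata objects retained on the driver.
# Inputs are the example figures from the PR description; nothing here is measured.

def accumulator_objects(stages: int, tasks_per_stage: int, accumulators_per_task: int) -> int:
    """Total accumulator entries if every task in every stage reports every accumulator."""
    return stages * tasks_per_stage * accumulators_per_task


def implied_bytes_per_object(total_bytes: float, objects: int) -> float:
    """Average size per object implied by a quoted total footprint."""
    return total_bytes / objects


# Figures quoted for Spark 3.5 in the description.
objects_35 = accumulator_objects(stages=155, tasks_per_stage=1000, accumulators_per_task=100)

# Assumption: a 4x increase means roughly 25 accumulators per task before the regression.
objects_32 = accumulator_objects(stages=155, tasks_per_stage=1000, accumulators_per_task=25)

# ~1.5GB is the driver memory figure quoted alongside the 15.5M objects.
per_object = implied_bytes_per_object(1.5e9, objects_35)

print(f"3.5 estimate: {objects_35:,} objects")            # 15,500,000
print(f"3.2 estimate (assumed): {objects_32:,} objects")  # 3,875,000
print(f"implied size: ~{per_object:.0f} bytes per object")  # ~97
```

The estimate scales linearly in each factor, so a query with twice as many stages or tasks would be expected to roughly double the retained object count under the same assumptions.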
