Alessandro Bellina created SPARK-48861:
------------------------------------------
Summary: Cleanup shuffle dependencies for all SQL executions
Key: SPARK-48861
URL: https://issues.apache.org/jira/browse/SPARK-48861
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Alessandro Bellina
Fix For: 4.0.0
SPARK-47764 (https://issues.apache.org/jira/browse/SPARK-47764) added a shuffle
cleanup mechanism that applies to a `QueryExecution`, but enabled it only for
Spark Connect.
We'd like to enable this for all query executions via configuration. The
configurations that were added in
https://issues.apache.org/jira/browse/SPARK-47764 are generic (they do not have
"connect" in the name), so we hope we can reuse the same config:
`spark.sql.shuffleDependency.fileCleanup.enabled`. SPARK-47764 also added
`spark.sql.shuffleDependency.skipMigration.enabled`. I don't yet understand in
which cases it is useful, but given the interface as it stands, it could also
be applied to all queries.
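
For illustration only, here is a minimal sketch of how a user might opt in if the two existing keys end up being honored for all SQL executions. The config names are the ones quoted above; the helper `to_submit_args`, the `SHUFFLE_CLEANUP_CONFS` dict, and the chosen values are hypothetical and not part of Spark. It only renders `--conf key=value` arguments for `spark-submit`.

```python
# Config keys quoted in this issue; the values chosen here are illustrative.
SHUFFLE_CLEANUP_CONFS = {
    "spark.sql.shuffleDependency.fileCleanup.enabled": True,
    "spark.sql.shuffleDependency.skipMigration.enabled": False,
}


def to_submit_args(confs):
    """Render a dict of Spark confs as `--conf key=value` arguments.

    Hypothetical helper for this example; not a Spark API.
    """
    args = []
    for key, value in sorted(confs.items()):
        # Render booleans as lowercase true/false, the form used in Spark's docs.
        rendered = str(value).lower() if isinstance(value, bool) else str(value)
        args += ["--conf", f"{key}={rendered}"]
    return args


if __name__ == "__main__":
    # Prints a spark-submit command line carrying both settings.
    print(" ".join(["spark-submit"] + to_submit_args(SHUFFLE_CLEANUP_CONFS)))
```

Whether these settings take effect outside Spark Connect depends on how this issue is resolved; the sketch assumes the keys are reused unchanged.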