Re: [PR] [SPARK-56919][SQL] Move setupJob before materializeAdaptiveSparkPlan to prevent table path loss [spark]

via GitHub Wed, 17 Jun 2026 16:03:35 -0700


cloud-fan commented on code in PR #56126:
URL: https://github.com/apache/spark/pull/56126#discussion_r3431966626



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##########
@@ -195,6 +203,11 @@ object FileFormatWriter extends Logging {
       executeWrite(sparkSession, plan, job, description, committer, outputSpec,
         requiredOrdering, partitionColumns, sortColumns, orderingMatched)
     }
+    } catch {
+      case cause: Throwable =>
+        committer.abortJob(job)

Review Comment:
   `writeAndCommit` already calls `abortJob` and rethrows on a write- or 
commit-failure, and both `executeWrite` calls sit inside this new `try` - so on 
those paths `abortJob` now runs twice (it was once before this PR). It's 
harmless for the built-in committers (their `abortJob` is idempotent), but it's 
a behavior change to the `FileCommitProtocol.abortJob` contract, and a 
non-idempotent custom committer could double-clean. The materialize-failure 
path you're fixing only hits this outer catch, so it's unaffected. Simplest 
single-abort form: drop `writeAndCommit`'s now-redundant `try/catch` and move 
its `logError` here.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##########
@@ -148,6 +148,14 @@ object FileFormatWriter extends Logging {
         writerBucketSpec.map(_.bucketIdExpression) ++ sortColumns
     val writeFilesOpt = V1WritesUtils.getWriteFilesOpt(plan)
 
+    // SPARK-56919: setupJob must run before materializeAdaptiveSparkPlan, 
which can throw.
+    // Otherwise INSERT OVERWRITE permanently loses the table path if AQE 
fails.
+    // A leaked staging dir (_temporary / .spark-staging-*) is acceptable vs. 
losing the table
+    // path — these dirs are dot/underscore-prefixed (filtered from reads) and 
self-heal on the
+    // next overwrite.

Review Comment:
   This rationale is now stale: the `try/catch` you added below calls 
`abortJob` on any failure after `setupJob`, and 
`HadoopMapReduceCommitProtocol.abortJob` deletes `_temporary` / 
`.spark-staging-*` - so the staging dir isn't actually leaked on the 
materialize-failure path. The only leak window is `setupJob` itself throwing 
(it's outside the `try`). The rewrite below also drops the non-ASCII em-dash 
that would otherwise fail scalastyle.
   ```suggestion
       // setupJob is outside the try below because it only initializes the 
job; the try/catch
       // calls abortJob on any later failure (e.g. materialize throwing), 
which cleans up the
       // staging dir (_temporary / .spark-staging-*).
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##########
@@ -148,6 +148,14 @@ object FileFormatWriter extends Logging {
         writerBucketSpec.map(_.bucketIdExpression) ++ sortColumns
     val writeFilesOpt = V1WritesUtils.getWriteFilesOpt(plan)
 
+    // SPARK-56919: setupJob must run before materializeAdaptiveSparkPlan, 
which can throw.
+    // Otherwise INSERT OVERWRITE permanently loses the table path if AQE 
fails.
+    // A leaked staging dir (_temporary / .spark-staging-*) is acceptable vs. 
losing the table
+    // path — these dirs are dot/underscore-prefixed (filtered from reads) and 
self-heal on the
+    // next overwrite.
+    committer.setupJob(job)
+
+    try {

Review Comment:
   Optional: the body here isn't re-indented, so `def 
materializeAdaptiveSparkPlan` and the locals read as method-level rather than 
inside the `try`. Re-indenting ~47 lines bloats the diff, so leaving it is a 
defensible call - your judgment on the project norm.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-56919][SQL] Move setupJob before materializeAdaptiveSparkPlan to prevent table path loss [spark]

Reply via email to