pabloggarc opened a new pull request, #55692:
URL: https://github.com/apache/spark/pull/55692

   ### What changes were proposed in this pull request?
   
   `ReplaceData` and `WriteDelta` now implement the 
`SupportsNonDeterministicExpression` trait (introduced in SPARK-48871). This 
fixes `INVALID_NON_DETERMINISTIC_EXPRESSIONS` errors when running `MERGE INTO` 
on DSv2 tables (e.g. Apache Iceberg) where the source plan contains 
non-deterministic expressions such as `uuid()`, `current_timestamp()`, or 
`input_file_name()`.
   
   ### Why are the changes needed?
   
   In Spark 3.5, `RewriteMergeIntoTable` wraps the source plan in an `Exists` 
subquery stored as `groupFilterCondition` on `ReplaceData`. If the source plan 
contains any non-deterministic expression, the `Exists` becomes 
non-deterministic and `CheckAnalysis` rejects the query. The 
`SupportsNonDeterministicExpression` trait was introduced in SPARK-48871 for 
exactly this situation, but was never applied to `ReplaceData` or `WriteDelta`. 
This is a regression from Spark 3.3, where these queries worked correctly.
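
   The failure mode described above can be sketched with a small toy model. This is not Spark's Scala code: `Expr`, `Operator`, `check_operator`, and the `allows_non_deterministic` flag are hypothetical names that only stand in for the determinism flag on expressions, the `CheckAnalysis` rule, and the opt-in that `SupportsNonDeterministicExpression` provides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Toy expression node; `deterministic` stands in for Spark's flag."""
    name: str
    deterministic: bool = True


@dataclass
class Operator:
    """Toy logical operator holding the expressions it evaluates."""
    name: str
    expressions: List[Expr] = field(default_factory=list)
    # Hypothetical stand-in for mixing in SupportsNonDeterministicExpression.
    allows_non_deterministic: bool = False


class AnalysisError(Exception):
    """Toy equivalent of an analysis-time failure."""


def check_operator(op: Operator) -> None:
    """Reject non-deterministic expressions unless the operator opts in."""
    bad = [e.name for e in op.expressions if not e.deterministic]
    if bad and not op.allows_non_deterministic:
        raise AnalysisError(
            f"INVALID_NON_DETERMINISTIC_EXPRESSIONS in {op.name}: {bad}"
        )


# An Exists subquery over a source plan that calls uuid() is treated as
# non-deterministic, so the group filter condition inherits that property.
group_filter = Expr("exists(source plan with uuid())", deterministic=False)

# Before the change: the operator does not opt in, so the check rejects it.
before = Operator("ReplaceData", [group_filter])

# After the change: the operator opts in, so the same condition is accepted.
after = Operator("ReplaceData", [group_filter], allows_non_deterministic=True)
```

   In this model, `check_operator(before)` raises `AnalysisError`, while `check_operator(after)` returns without error. The same reasoning applies to `WriteDelta`.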
   
   The root cause was analyzed by @DaxterXS in 
[apache/iceberg#14585](https://github.com/apache/iceberg/issues/14585). 
Multiple users are affected across AWS Glue 4/5 with Iceberg 1.7–1.10.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. `MERGE INTO` queries with non-deterministic expressions in the source 
plan will no longer fail with `INVALID_NON_DETERMINISTIC_EXPRESSIONS`.
   
   ### How was this patch tested?
   
   Added unit tests in `AnalysisErrorSuite` verifying that `ReplaceData` and 
`WriteDelta` implement `SupportsNonDeterministicExpression`.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

