Anupam Yadav created SPARK-57685:
------------------------------------

             Summary: Support non-deterministic expressions in MERGE INTO action assignments on DSv2 tables
                 Key: SPARK-57685
                 URL: https://issues.apache.org/jira/browse/SPARK-57685
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Anupam Yadav


Follow-up to SPARK-56729.

SPARK-56729 (PR #55858) allows non-deterministic expressions in the MERGE 
*source* for DataSource V2 row-level operations: ReplaceData and WriteDelta 
implement SupportsNonDeterministicExpression, and a determinism guard in 
RowLevelOperationRuntimeGroupFiltering keeps runtime group pruning safe.

However, a non-deterministic expression written directly into a MERGE *action* 
assignment, e.g.:
{code:sql}
MERGE INTO t USING src ON t.id = src.id
WHEN MATCHED THEN UPDATE SET t.x = uuid()
{code}
still fails analysis with INVALID_NON_DETERMINISTIC_EXPRESSIONS. The assignment 
expression lands on the MergeRows operator, which does not implement 
SupportsNonDeterministicExpression and is not in CheckAnalysis's allowlist of 
operators permitted to carry non-deterministic expressions.
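
To make the failure mode concrete, here is a toy Python model of the kind of check described above. It is only a sketch, not Spark's code: the `Expr` and `Operator` classes, the contents of `ALLOWLIST`, and the `check_analysis` function are hypothetical stand-ins, and the real rule in CheckAnalysis likely has more conditions than shown here.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for illustration only; these are not Spark classes.
@dataclass
class Expr:
    name: str
    deterministic: bool = True

@dataclass
class Operator:
    name: str
    expressions: list = field(default_factory=list)
    # Models an operator opting in, in the spirit of
    # SupportsNonDeterministicExpression.
    allows_non_deterministic: bool = False

# Assumed allowlist; the actual list in CheckAnalysis may differ.
ALLOWLIST = {"Project", "Filter", "Aggregate", "Window"}

def check_analysis(op: Operator) -> None:
    """Raise if op carries a non-deterministic expression it may not carry."""
    offending = [e.name for e in op.expressions if not e.deterministic]
    if offending and op.name not in ALLOWLIST and not op.allows_non_deterministic:
        raise ValueError(
            f"INVALID_NON_DETERMINISTIC_EXPRESSIONS: {op.name} has {offending}"
        )

def passes(op: Operator) -> bool:
    """Return True if the toy check accepts the operator."""
    try:
        check_analysis(op)
        return True
    except ValueError:
        return False

# MergeRows carrying uuid() without opting in: rejected by the toy check.
merge_rows = Operator("MergeRows", [Expr("uuid()", deterministic=False)])
# The same operator after opting in: accepted.
opted_in = Operator("MergeRows", [Expr("uuid()", deterministic=False)],
                    allows_non_deterministic=True)
```

In this model, `passes(merge_rows)` is false while `passes(opted_in)` is true, which mirrors the two ways an operator can be permitted to carry such expressions: being on the allowlist or opting in itself.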

This is a separate, pre-existing limitation, orthogonal to the 
source-non-determinism regression fixed in SPARK-56729. Supporting it requires 
either making MergeRows evaluate/validate such action expressions safely (each 
evaluated exactly once per produced row), or implementing 
SupportsNonDeterministicExpression on MergeRows with execution validation.
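
The "once per produced row" requirement can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not MergeRows itself: `apply_update`, the dictionary-based rows, and the `counting` wrapper are all hypothetical names introduced for this example, and Python's `uuid.uuid4()` stands in for the SQL `uuid()` function.

```python
import uuid

def counting(fn):
    """Wrap fn so that its calls can be counted (helper for this sketch)."""
    def wrapper(row):
        wrapper.calls += 1
        return fn(row)
    wrapper.calls = 0
    return wrapper

def apply_update(target_rows, source_ids, assignments):
    """Toy WHEN MATCHED THEN UPDATE over in-memory rows.

    target_rows: list of dicts, each with an "id" key
    source_ids:  set of ids present in the source
    assignments: dict of column name -> function(row) -> new value
    """
    out = []
    for row in target_rows:
        if row["id"] in source_ids:
            # Evaluate every assignment exactly once and reuse the result,
            # so a non-deterministic value is never recomputed for this row.
            new_values = {col: fn(row) for col, fn in assignments.items()}
            out.append({**row, **new_values})
        else:
            # Unmatched rows are carried over unchanged.
            out.append(dict(row))
    return out

new_x = counting(lambda row: str(uuid.uuid4()))
target = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}, {"id": 3, "x": "c"}]
result = apply_update(target, {1, 3}, {"x": new_x})
```

Here two rows match, so `new_x` is called twice and each updated row gets its own value. An implementation that evaluated the assignment more than once per row (for example once to validate and again to write) could observe two different values for the same row, which is the hazard the parenthetical above refers to.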

Noted by cloud-fan during review of PR #55858.



