Anupam Yadav created SPARK-57685:
------------------------------------
Summary: Support non-deterministic expressions in MERGE INTO action assignments on DSv2 tables
Key: SPARK-57685
URL: https://issues.apache.org/jira/browse/SPARK-57685
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.3.0
Reporter: Anupam Yadav
Follow-up to SPARK-56729.
SPARK-56729 (PR #55858) allows non-deterministic expressions in the MERGE
*source* for DataSource V2 row-level operations: ReplaceData and WriteDelta
implement SupportsNonDeterministicExpression, and a determinism guard in
RowLevelOperationRuntimeGroupFiltering keeps runtime group pruning safe.
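For illustration, the source-side form covered by SPARK-56729 might look like
the sketch below. The alias s and the column name new_x are hypothetical, x is
assumed to be a STRING column, and the statement has not been verified against
a build that includes PR #55858. Note that the value is produced once per
source row here, which is not necessarily the same as once per produced row.
{code:sql}
-- Sketch only: the non-deterministic call lives in the MERGE source,
-- not in the action assignment. Alias s and column new_x are hypothetical.
MERGE INTO t
USING (SELECT id, uuid() AS new_x FROM src) s
ON t.id = s.id
-- The assignment itself only references a source column.
WHEN MATCHED THEN UPDATE SET t.x = s.new_x
{code}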
However, a non-deterministic expression written directly into a MERGE *action*
assignment, e.g.:
{code:sql}
MERGE INTO t USING src ON t.id = src.id
WHEN MATCHED THEN UPDATE SET t.x = uuid()
{code}
still fails analysis with INVALID_NON_DETERMINISTIC_EXPRESSIONS. The assignment
expression lands on the MergeRows operator, which does not implement
SupportsNonDeterministicExpression and is not in CheckAnalysis's allowlist of
operators permitted to carry non-deterministic expressions.
This is a separate, pre-existing limitation, orthogonal to the source
non-determinism regression fixed in SPARK-56729. Supporting it requires either
making MergeRows evaluate/validate such action expressions safely (each
evaluated exactly once per produced row), or implementing
SupportsNonDeterministicExpression on MergeRows with execution validation.
Noted by cloud-fan during review of PR #55858.