This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 53606f21eb1a [SPARK-51695][SQL][DOCS][FOLLOW-UP] Clarify NULL handling semantics in UNIQUE constraint Javadoc
53606f21eb1a is described below

commit 53606f21eb1a4dd47be15a2fc353f1dffa23c58d
Author: Yan Yan <[email protected]>
AuthorDate: Wed Feb 25 17:04:24 2026 -0800

    [SPARK-51695][SQL][DOCS][FOLLOW-UP] Clarify NULL handling semantics in UNIQUE constraint Javadoc
    
    ### What changes were proposed in this pull request?
    
    Document NULLS DISTINCT behavior: NULL values are treated as distinct from
    each other, so rows with NULLs in unique columns never conflict. Also note that
    UNIQUE allows nullable columns (unlike PRIMARY KEY) and that NULLS NOT DISTINCT
    is not currently supported.
    
    ### Why are the changes needed?
    Better javadoc clarity on UNIQUE constraint expectation.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    N/A
    
    ### Was this patch authored or co-authored using generative AI tooling?
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    
    Closes #54357 from yyanyy/unique_definition_clarify.
    
    Authored-by: Yan Yan <[email protected]>
    Signed-off-by: Gengliang Wang <[email protected]>
---
 .../apache/spark/sql/connector/catalog/constraints/Unique.java   | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java
index d983ef656297..d4837932863f 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java
@@ -28,6 +28,15 @@ import org.apache.spark.sql.connector.expressions.NamedReference;
  * <p>
  * A UNIQUE constraint specifies one or more columns as unique columns. Such constraint is satisfied
  * if and only if no two rows in a table have the same non-null values in the unique columns.
+ * Unlike PRIMARY KEY, UNIQUE allows nullable columns.
+ * <p>
+ * NULL values are treated as distinct from each other (NULLS DISTINCT semantics). Two rows
+ * are considered duplicates only when every column in the unique key has a non-null value and
+ * every value matches. If any column in the unique key is NULL, the row is always considered
+ * unique regardless of other values. In other words, multiple rows with NULL in one or more
+ * unique columns are allowed and do not violate the constraint definition.
+ * <p>
+ * The {@code NULLS NOT DISTINCT} modifier is not currently supported.
  * <p>
  * Spark doesn't enforce UNIQUE constraints but leverages them for query optimization. Each
  * constraint is either valid (the existing data is guaranteed to satisfy the constraint), invalid

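The NULLS DISTINCT rule described in the added Javadoc can be illustrated with a small standalone check. This is only a sketch under stated assumptions: the class `UniqueNullsDistinctSketch` and the method `satisfiesUnique` are hypothetical names invented for illustration and are not part of Spark, which (per the Javadoc) does not enforce UNIQUE constraints at all. Each row is modeled as the list of its values in the unique-key columns.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; this class does not exist in Spark.
public class UniqueNullsDistinctSketch {

  /**
   * Returns true if the given unique-key tuples satisfy a UNIQUE constraint
   * under NULLS DISTINCT semantics: two rows conflict only when every key
   * column is non-null in both rows and every value matches.
   */
  public static boolean satisfiesUnique(List<? extends List<?>> keyRows) {
    Set<List<?>> seen = new HashSet<>();
    for (List<?> key : keyRows) {
      // A key containing any NULL is always considered unique, so skip it.
      if (key.stream().anyMatch(Objects::isNull)) {
        continue;
      }
      // Set.add returns false when an equal fully non-null key was already seen.
      if (!seen.add(key)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Two rows with NULL in a key column: allowed.
    System.out.println(satisfiesUnique(Arrays.asList(
        Arrays.asList(1, null),
        Arrays.asList(1, null)))); // prints "true"

    // Two rows with identical, fully non-null keys: violation.
    System.out.println(satisfiesUnique(Arrays.asList(
        Arrays.asList(1, "a"),
        Arrays.asList(1, "a")))); // prints "false"
  }
}
```

Under NULLS NOT DISTINCT, which the Javadoc says is not currently supported, the first call would instead count as a violation because the two `(1, NULL)` keys would compare as equal.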
