(spark) branch branch-4.1 updated: [SPARK-51695][SQL][DOCS][FOLLOW-UP] Clarify NULL handling semantics in UNIQUE constraint Javadoc

gengliang Wed, 25 Feb 2026 17:05:01 -0800

This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-4.1
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-4.1 by this push:
     new e31ed5ddb379 [SPARK-51695][SQL][DOCS][FOLLOW-UP] Clarify NULL handling semantics in UNIQUE constraint Javadoc
e31ed5ddb379 is described below

commit e31ed5ddb379a0da49fd2be9042e9036f5266cee
Author: Yan Yan <[email protected]>
AuthorDate: Wed Feb 25 17:04:24 2026 -0800

    [SPARK-51695][SQL][DOCS][FOLLOW-UP] Clarify NULL handling semantics in UNIQUE constraint Javadoc
    
    ### What changes were proposed in this pull request?
    
    Document NULLS DISTINCT behavior: NULL values are treated as distinct from each other, so rows with NULLs in unique columns never conflict. Also note that UNIQUE allows nullable columns (unlike PRIMARY KEY) and that NULLS NOT DISTINCT is not currently supported.
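
The NULLS DISTINCT rule described above can be illustrated with a small standalone sketch. This is not Spark code: the class `UniqueNullsDistinctSketch` and the method `satisfiesUnique` are hypothetical names introduced only for illustration, and each row is assumed to hold just the values of the unique-key columns, with Java `null` standing in for SQL NULL.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Spark: checks whether a collection of
// unique-key tuples satisfies a UNIQUE constraint under NULLS DISTINCT semantics.
final class UniqueNullsDistinctSketch {

  // Each inner list holds the unique-key column values of one row; null models SQL NULL.
  static boolean satisfiesUnique(List<? extends List<?>> keyRows) {
    Set<List<?>> seen = new HashSet<>();
    for (List<?> key : keyRows) {
      // A key containing any NULL is always considered unique, so skip it.
      if (key.contains(null)) {
        continue;
      }
      // Fully non-null keys must be pairwise different.
      if (!seen.add(key)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Two rows sharing (1, NULL): allowed, because NULLs are distinct from each other.
    System.out.println(satisfiesUnique(Arrays.asList(
        Arrays.asList(1, null), Arrays.asList(1, null)))); // prints "true"

    // Two rows sharing (1, "a"): every column is non-null and matches, so this is a violation.
    System.out.println(satisfiesUnique(Arrays.asList(
        Arrays.asList(1, "a"), Arrays.asList(1, "a")))); // prints "false"
  }
}
```

Since Spark does not enforce UNIQUE constraints, a check like this only models what the constraint definition means; it does not reflect anything Spark runs at write time.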
    
    ### Why are the changes needed?
    The Javadoc for the UNIQUE constraint did not spell out how NULL values are handled; this change makes the expected semantics explicit.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    N/A
    
    ### Was this patch authored or co-authored using generative AI tooling?
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    
    Closes #54357 from yyanyy/unique_definition_clarify.
    
    Authored-by: Yan Yan <[email protected]>
    Signed-off-by: Gengliang Wang <[email protected]>
    (cherry picked from commit 53606f21eb1a4dd47be15a2fc353f1dffa23c58d)
    Signed-off-by: Gengliang Wang <[email protected]>
---
 .../apache/spark/sql/connector/catalog/constraints/Unique.java   | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java
index d983ef656297..d4837932863f 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/constraints/Unique.java
@@ -28,6 +28,15 @@ import org.apache.spark.sql.connector.expressions.NamedReference;
  * <p>
  * A UNIQUE constraint specifies one or more columns as unique columns. Such constraint is satisfied
  * if and only if no two rows in a table have the same non-null values in the unique columns.
+ * Unlike PRIMARY KEY, UNIQUE allows nullable columns.
+ * <p>
+ * NULL values are treated as distinct from each other (NULLS DISTINCT semantics). Two rows
+ * are considered duplicates only when every column in the unique key has a non-null value and
+ * every value matches. If any column in the unique key is NULL, the row is always considered
+ * unique regardless of other values. In other words, multiple rows with NULL in one or more
+ * unique columns are allowed and do not violate the constraint definition.
+ * <p>
+ * The {@code NULLS NOT DISTINCT} modifier is not currently supported.
  * <p>
  * Spark doesn't enforce UNIQUE constraints but leverages them for query optimization. Each
  * constraint is either valid (the existing data is guaranteed to satisfy the constraint), invalid


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
