yadavay-amzn opened a new pull request, #56472:
URL: https://github.com/apache/spark/pull/56472

   ### What changes were proposed in this pull request?
   
   Block references to temporary (session) variables in CHECK constraint 
expressions on both the `CREATE`/`REPLACE TABLE` path and the 
`ALTER TABLE ... ADD CONSTRAINT` path. A CHECK constraint that references a 
session variable is now rejected at analysis time with 
`INVALID_TEMP_OBJ_REFERENCE`.
   
   The check mirrors the existing non-deterministic-constraint validation and 
the sibling fix for generated columns (SPARK-57360): it detects a 
`VariableReference` anywhere in the constraint expression via 
`containsPattern(VARIABLE_REFERENCE)` and raises `INVALID_TEMP_OBJ_REFERENCE` 
(the same error class already used when a persistent view references a 
temporary variable).
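
To illustrate the detection logic described above, here is a minimal, self-contained sketch in Python. It is a toy model only: the `Expr` class, the `is_variable_reference` flag, the `InvalidTempObjReference` exception, and `validate_check_constraint` are hypothetical names invented for this example and are not Spark's Catalyst classes or the code in this PR. The recursive walk merely stands in for what `containsPattern(VARIABLE_REFERENCE)` answers, namely whether a variable reference appears anywhere in the expression tree.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expr:
    """Toy expression node (hypothetical; not Spark's Catalyst Expression)."""
    name: str
    children: Tuple["Expr", ...] = ()
    is_variable_reference: bool = False

    def contains_variable_reference(self) -> bool:
        # Stand-in for containsPattern(VARIABLE_REFERENCE): true if this node
        # or any descendant is a variable reference.
        return self.is_variable_reference or any(
            child.contains_variable_reference() for child in self.children
        )


class InvalidTempObjReference(Exception):
    """Hypothetical stand-in for the INVALID_TEMP_OBJ_REFERENCE error class."""


def validate_check_constraint(constraint_name: str, expr: Expr) -> None:
    # Reject the constraint if its expression mentions a session variable
    # at any depth; otherwise accept it silently.
    if expr.contains_variable_reference():
        raise InvalidTempObjReference(
            f"CHECK constraint {constraint_name} references a temporary variable"
        )


# Example trees, loosely modeled on the constraints discussed in this PR.
my_var = Expr("my_var", is_variable_reference=True)
direct = Expr(">", (Expr("val"), my_var))                   # val > my_var
nested = Expr(">", (Expr("i"), Expr("cast", (my_var,))))    # i > CAST(my_var ...)
plain = Expr(">", (Expr("val"), Expr("0")))                 # val > 0
```

In this sketch, `validate_check_constraint` raises for both `direct` and `nested`, because the walk descends through the cast node, and returns normally for `plain`, which contains no variable reference.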
   
   ### Why are the changes needed?
   
   A CHECK constraint is persisted with the table, but a temporary/session 
variable is session-scoped and will not exist in other sessions, making the 
persisted constraint invalid. Today both paths silently accept it:
   
   ```sql
   DECLARE OR REPLACE VARIABLE my_var INT DEFAULT 5;
   
   -- Both currently succeed and persist `CHECK (val > my_var)`:
   CREATE TABLE t (id INT, val INT, CONSTRAINT c1 CHECK (val > my_var)) USING parquet;
   
   CREATE TABLE t2 (id INT, val INT) USING parquet;
   ALTER TABLE t2 ADD CONSTRAINT c2 CHECK (val > my_var);
   ```
   
   Spark already blocks the analogous case for persistent views via 
`INVALID_TEMP_OBJ_REFERENCE`; this extends the same protection to CHECK 
constraints.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Creating or altering a table with a CHECK constraint that references a 
temporary variable now fails with `INVALID_TEMP_OBJ_REFERENCE` instead of 
silently persisting an invalid constraint. No previously-valid constraint is 
affected.
   
   ### How was this patch tested?
   
   New tests in `CheckConstraintSuite` cover `CREATE TABLE`, 
`REPLACE TABLE`, and `ALTER TABLE ADD CONSTRAINT`, including a nested reference 
(`CHECK (i > CAST(my_var AS BIGINT))`) to confirm that detection is recursive. 
The existing constraint tests serve as a regression guard. Verified that the 
new tests fail without the fix.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Yes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

