cloud-fan opened a new pull request, #55871:
URL: https://github.com/apache/spark/pull/55871

   Followup to https://github.com/apache/spark/pull/54722.
   
   ### What changes were proposed in this pull request?
   
   The grammar for INSERT ... REPLACE WHERE | ON unifies the two variants into 
`#insertIntoReplaceBooleanCond` and accepts a `tableAlias` for both, because 
REPLACE ON's condition can reference the target via the alias (e.g. `t.col`). 
The REPLACE WHERE branch in `AstBuilder` never reads `ctx.tableAlias()`, so an 
alias supplied to REPLACE WHERE is silently ignored. A query like
   
   ```sql
   INSERT INTO t AS s REPLACE WHERE s.a = 1 SELECT * FROM source
   ```
   
   parses successfully, then fails at analysis with a confusing "column s.a not 
found" because the underlying `UnresolvedRelation` was not wrapped with the 
alias.
   
   This PR rejects the alias at parse time so users get a clear error pointing 
at the right place. The grammar stays unified (no rule split); the visitor adds 
a single guard before the WHERE branch's existing logic and throws a new 
`INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED` parse error that suggests 
REPLACE ON when an alias is needed.
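
The guard described above can be modeled in a few lines. This is a minimal, self-contained sketch and not the actual `AstBuilder` change: `ReplaceCondCtx`, `ReplaceKind`, `ParseError`, and `aliasFor` are hypothetical stand-ins for the generated parser context and Spark's parse-exception machinery, and the message text is invented for illustration. Only the error condition name is taken from this PR.

```scala
object ReplaceWhereAliasGuard {
  // Hypothetical stand-ins for the parser context; these are not Spark's classes.
  sealed trait ReplaceKind
  case object ReplaceWhere extends ReplaceKind
  case object ReplaceOn extends ReplaceKind

  final case class ReplaceCondCtx(kind: ReplaceKind, tableAlias: Option[String])

  // Simplified parse error that carries only the error condition name.
  final class ParseError(val condition: String)
    extends Exception(s"[$condition] illustrative message: use REPLACE ON if an alias is needed")

  // Returns the alias to attach to the target relation.
  // For REPLACE WHERE an alias is rejected up front instead of being dropped.
  def aliasFor(ctx: ReplaceCondCtx): Option[String] = (ctx.kind, ctx.tableAlias) match {
    case (ReplaceWhere, Some(_)) =>
      throw new ParseError("INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED")
    case (ReplaceWhere, None) => None
    case (ReplaceOn, alias)   => alias
  }
}
```

In this model the grammar-level shape stays shared between the two variants (one context type), and the only variant-specific behavior is the early rejection, which mirrors the "no rule split, single guard" approach taken here.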
   
   ### Why are the changes needed?
   
   The current behavior — silently ignoring the alias and then failing at 
analysis — is misleading. Either the alias should be wired through (a semantic 
change requiring more invasive plumbing through `OverwriteByExpression`'s write 
resolution path) or it should be rejected. Rejecting it at parse time is the 
smaller, safer fix and matches the natural reading of the grammar (an alias 
only makes sense when the condition references the target via the alias, which 
is REPLACE ON's case, not REPLACE WHERE's).
   
   ### Does this PR introduce *any* user-facing change?
   
   Yes. `INSERT INTO t AS s REPLACE WHERE …` now fails at parse time with 
`INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED`. Previously the alias was 
silently dropped: a query whose WHERE condition referenced the alias failed 
later at analysis, and a query whose condition did not reference it produced 
the same plan as if the alias were absent. The new error message suggests 
using REPLACE ON for cases that need the alias.
   
   ### How was this patch tested?
   
   - Two existing `DDLParserSuite` tests (`insert table: REPLACE WHERE with 
tableAlias [and / without] BY NAME`) documented the silent-ignore behavior; 
they are rewritten to assert the new parse error.
   - Verified the rewritten tests fail without the AstBuilder guard and pass 
with it.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Yes — written with assistance from Claude.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

