hvanhovell opened a new pull request #27562: [SPARK-30811][SQL][BRANCH-2.4] CTE 
should not cause stack overflow when it refers to non-existent table with same 
name
URL: https://github.com/apache/spark/pull/27562
 
 
   ### Why are the changes needed?
   A query with Common Table Expressions can cause a stack overflow when it 
contains a CTE that refers to a non-existent table with the same name. The 
table name needs to have a database qualifier for this to happen. This is 
caused by a couple of things:
   - `CTESubstitution` runs analysis on the CTE, but this does not throw an 
exception because the table has a database qualifier. We don't fail at this 
point because we re-attempt to resolve the relation in a later rule;
   - The `CTESubstitution` replace logic does not check whether the table it 
is replacing has a database qualifier; it shouldn't replace the relation if it 
does. So now we will happily replace `nonexist.t` with `t`;
   - `CTESubstitution` transforms down, which means it will keep replacing `t` 
with itself, creating an infinite recursion.
   
   This PR fixes this by only replacing a relation when its name does not have 
a database qualifier, and it also reverses the transformation order.
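
The interaction described above can be reproduced in a small toy model. This is only a sketch in Python, not Spark's Scala code: `Relation`, `Project`, `transform_down`, `transform_up`, `buggy_rule` and `fixed_rule` are hypothetical names invented for illustration, and the plan tree is reduced to a single-child node plus a table reference. It assumes a query shaped like a CTE named `t` whose body selects from `nonexist.t`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    """Unresolved table reference such as `t` or `nonexist.t` (toy model)."""
    table: str
    database: Optional[str] = None


@dataclass(frozen=True)
class Project:
    """Single-child plan node standing in for `SELECT ... FROM child`."""
    child: object


def transform_down(plan, rule):
    """Apply the rule to a node first, then descend into the result's child."""
    new = rule(plan)
    if isinstance(new, Project):
        return Project(transform_down(new.child, rule))
    return new


def transform_up(plan, rule):
    """Rewrite the child first, then apply the rule to the node itself."""
    if isinstance(plan, Project):
        plan = Project(transform_up(plan.child, rule))
    return rule(plan)


def buggy_rule(name, body):
    """Replace any relation whose table name matches, ignoring the database."""
    def rule(node):
        if isinstance(node, Relation) and node.table == name:
            return body
        return node
    return rule


def fixed_rule(name, body):
    """Replace a relation only when it has no database qualifier."""
    def rule(node):
        if (isinstance(node, Relation)
                and node.database is None
                and node.table == name):
            return body
        return node
    return rule


# Toy equivalent of: WITH t AS (SELECT ... FROM nonexist.t) SELECT ... FROM t
cte_body = Project(Relation("t", database="nonexist"))
query = Project(Relation("t"))

try:
    transform_down(query, buggy_rule("t", cte_body))
    overflowed = False
except RecursionError:
    # The replacement contains `nonexist.t`, which the buggy rule matches
    # again when transform_down descends into it, so it never terminates.
    overflowed = True

# With the qualifier check and bottom-up order, substitution happens once and
# `nonexist.t` is left in place for a later step to report as missing.
result = transform_up(query, fixed_rule("t", cte_body))
```

In this model `overflowed` ends up `True`, while `result` still contains the qualified `nonexist.t` reference, which matches the intent that the query should fail with an ordinary analysis error rather than a stack overflow.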
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   Added a regression test to `DataFrameSuite`.
