[GitHub] [spark] dtenedor opened a new pull request, #38052: [SPARK-40618][SQL] Fix bug in MergeScalarSubqueries rule with nested subqueries

GitBox Thu, 29 Sep 2022 16:46:22 -0700


dtenedor opened a new pull request, #38052:
URL: https://github.com/apache/spark/pull/38052


   ### What changes were proposed in this pull request?
   
   The `MergeScalarSubqueries` rule has a bug that affects queries whose 
subquery expressions are nested inside each other: the rule attempts to merge 
the nested subquery with its enclosing parent subquery. The result is not a 
valid plan, and the optimizer raises an exception. Here is a minimal 
reproducing case:
   
   ```
   sql("create table test(col int) using csv")
   checkAnswer(sql("select(select sum((select sum(col) from test)) from test)"), Row(null))
   ```
   
   As a fix, this PR disables the optimization, for now, for any subquery 
that contains another subquery nested inside it.
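   
   For illustration only, the guard can be pictured on a toy expression tree. 
The types below (`Expr`, `Column`, `Sum`, `ScalarSubquery`, 
`NestedSubqueryGuard`) are hypothetical stand-ins, not Spark's Catalyst 
classes, and this sketch is not the actual change in this PR:
   
   ```scala
   // Toy model: hypothetical stand-ins, NOT Spark's Catalyst classes.
   sealed trait Expr { def children: Seq[Expr] }
   case class Column(name: String) extends Expr { val children: Seq[Expr] = Nil }
   case class Sum(child: Expr) extends Expr { val children: Seq[Expr] = Seq(child) }
   // A scalar subquery, reduced here to the expressions its inner query projects.
   case class ScalarSubquery(projection: Seq[Expr]) extends Expr {
     val children: Seq[Expr] = projection
   }

   object NestedSubqueryGuard {
     // True if the tree rooted at `e` contains a scalar subquery anywhere.
     def containsSubquery(e: Expr): Boolean = e match {
       case _: ScalarSubquery => true
       case other => other.children.exists(containsSubquery)
     }

     // A subquery is a merge candidate only if nothing beneath it is a subquery.
     def isMergeCandidate(s: ScalarSubquery): Boolean =
       !s.projection.exists(containsSubquery)
   }
   ```
   
   In this model, the inner `(select sum(col) from test)` of the reproducing 
case would remain a merge candidate, while the enclosing subquery that wraps 
it would be skipped.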
   
   ### Why are the changes needed?
   
   This fixes a bug in which the optimizer raises an exception for queries 
that contain nested scalar subqueries.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Updated existing unit tests and added the reproducing case as a new test 
case.


