Bruce Robbins created SPARK-50091:
-------------------------------------

             Summary: Query fails when aggregate expression is in left-hand 
operand of IN-subquery
                 Key: SPARK-50091
                 URL: https://issues.apache.org/jira/browse/SPARK-50091
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Bruce Robbins


Consider this query:
{noformat}
create or replace temp view v1(c1, c2) as values
(1, 2), (1, 3), (2, 2), (3, 7), (3, 1);
create or replace temp view v2(col1, col2) as values
(1, 2), (1, 3), (2, 2), (3, 7), (3, 1);

select col1, sum(col2) in (select c2 from v1)
from v2 group by col1;
{noformat}
It fails with the following error:
{noformat}
[INTERNAL_ERROR] Cannot generate code for expression: sum(input[1, int, false]) 
SQLSTATE: XX000
{noformat}
With SPARK_TESTING=1, it fails with
{noformat}
[PLAN_VALIDATION_FAILED_RULE_IN_BATCH] Rule 
org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery in batch 
RewriteSubquery generated an invalid plan: Special expressions are placed in 
the wrong plan: Aggregate [col1#11], [col1#11, first(exists#20, false) AS 
(sum(col2) IN (listquery()))#19]
+- Join ExistenceJoin(exists#20), (sum(col2#12) = c2#18L)
   :- LocalRelation [col1#11, col2#12]
   +- LocalRelation [c2#18L]
{noformat}
The {{Join}} is invalid because it contains an aggregate expression in the join 
condition.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to