Bruce Robbins created SPARK-46779:
-------------------------------------

             Summary: Grouping by subquery with a cached relation can fail
                 Key: SPARK-46779
                 URL: https://issues.apache.org/jira/browse/SPARK-46779
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Bruce Robbins

Example:
{noformat}
create or replace temp view data(c1, c2) as values
(1, 2),
(1, 3),
(3, 7),
(4, 5);

cache table data;

select c1, (select count(*) from data d1 where d1.c1 = d2.c1), count(c2) from data d2 group by all;
{noformat}
It fails with the following error:
{noformat}
[INTERNAL_ERROR] Couldn't find count(1)#163L in [c1#78,_groupingexpression#149L,count(1)#82L] SQLSTATE: XX000
org.apache.spark.SparkException: [INTERNAL_ERROR] Couldn't find count(1)#163L in [c1#78,_groupingexpression#149L,count(1)#82L] SQLSTATE: XX000
{noformat}
If you don't cache the view, the query succeeds.