[ https://issues.apache.org/jira/browse/IMPALA-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908134#comment-16908134 ]
Quanlong Huang commented on IMPALA-8790: ---------------------------------------- We can work around the bug by rewriting the query to avoid putting the analytic functions and the aggregations in the same query block: {code:sql} explain select uid, cid, rank() over (partition by uid order by cnt desc) from (select uid, cid, count(*) as cnt from foo) w; {code} Impala has an optimization for analytic functions if its query block contains group-by expressions of aggregations in the same time. It will trim the exchange slots according to the partition-by and the group-by slots. See this [commit|https://github.com/apache/impala/commit/0b3124ab35402f1ab8141e8ffcca28a5a481c81e] for more details. In IMPALA-110, there's a bug using wrong group-by expressions referencing the non-materialized slots (of the inline view) in this optimization. For the rewritten query, it avoids putting the analytic function rank() and the aggregation (count\(*\) group by ...) in the same query block. So the above optimization is bypassed, which avoids the bug. Patch for review: https://gerrit.cloudera.org/c/14063 > IllegalStateException: Illegal reference to non-materialized slot > ----------------------------------------------------------------- > > Key: IMPALA-8790 > URL: https://issues.apache.org/jira/browse/IMPALA-8790 > Project: IMPALA > Issue Type: Bug > Components: Frontend > Reporter: Quanlong Huang > Assignee: Quanlong Huang > Priority: Major > Attachments: foo.parq > > > Reproduce: > {code:sql} > $ hdfs dfs -put foo.parq hdfs:///tmp > impala> create table foo (uid string, cid string) stored as parquet; > impala> load data inpath 'hdfs:///tmp/foo.parq' into table foo; > {code} > With the stats, the following query hits an IllegalStateException: > {code:sql} > impala> compute stats foo; > impala> explain select uid, cid, > rank() over (partition by uid order by count(*) desc) > from (select uid, cid from foo) w > group by uid, cid; > ERROR: IllegalStateException: Illegal reference to non-materialized slot: > tid=1 sid=2{code} > Without the stats, it runs successfully: > {code:sql} > impala> drop stats foo; > impala> explain select uid, cid, > rank() over (partition by uid order by count(*) desc) > from (select uid, cid from foo) w > group by uid, cid; > +------------------------------------------------------------------------------------+ > | Explain String > | > +------------------------------------------------------------------------------------+ > | Max Per-Host Resource Reservation: Memory=84.02MB Threads=5 > | > | Per-Host Resource Estimates: Memory=304MB > | > | WARNING: The following tables are missing relevant table and/or column > statistics. | > | common_action.foo > | > | > | > | PLAN-ROOT SINK > | > | | > | > | 07:EXCHANGE [UNPARTITIONED] > | > | | > | > | 03:ANALYTIC > | > | | functions: rank() > | > | | partition by: uid > | > | | order by: count(*) DESC > | > | | window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW > | > | | row-size=40B cardinality=1.10K > | > | | > | > | 02:SORT > | > | | order by: uid ASC NULLS FIRST, count(*) DESC > | > | | row-size=32B cardinality=1.10K > | > | | > | > | 06:EXCHANGE [HASH(uid)] > | > | | > | > | 05:AGGREGATE [FINALIZE] > | > | | output: count:merge(*) > | > | | group by: uid, cid > | > | | row-size=32B cardinality=1.10K > | > | | > | > | 04:EXCHANGE [HASH(uid,cid)] > | > | | > | > | 01:AGGREGATE [STREAMING] > | > | | output: count(*) > | > | | group by: uid, cid > | > | | row-size=32B cardinality=1.10K > | > | | > | > | 00:SCAN HDFS [common_action.foo] > | > | HDFS partitions=1/1 files=1 size=5.19KB > | > | row-size=24B cardinality=1.10K > | > +------------------------------------------------------------------------------------+ > Fetched 37 row(s) in 0.03s > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org