[ 
https://issues.apache.org/jira/browse/IMPALA-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909741#comment-16909741
 ] 

ASF subversion and git services commented on IMPALA-8790:
---------------------------------------------------------

Commit c665fc1e06d53c7b70611ad3993c712c7fe35cb2 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c665fc1 ]

IMPALA-8790: fix referencing wrong grouping exprs of MultiAggregateInfo

When creating the single node plan for analytic functions (see
SingleNodePlanner#createQueryPlan), if the query block contains
aggregations, the grouping exprs are carried over for optimizations (see
AnalyticPlanner#computeInputPartitionExprs). This patch fixes a
regression introduced by IMPALA-110 in this phase.

The pre-IMPALA-110 behavior was to carry the groupingExprs of the
AggregateInfo (if any). Those exprs were already substituted in
AggregationNode#init, called from SingleNodePlanner#createSelectPlan.
The current behavior is to carry the groupingExprs of the
MultiAggregateInfo (if any). Those exprs are not substituted in
AggregationNode#init. Instead, MultiAggregateInfo creates substituted
grouping exprs in this step, and those are what we actually need. The
original grouping exprs may reference non-materialized slots, which must
not be referenced in exchanges.
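
The difference between the original and the substituted grouping exprs can
be illustrated with a toy model. This is only a sketch, not Impala code:
slots are modeled as plain integer ids, and the class GroupingExprSketch,
the methods substitute and checkMaterialized, and all slot ids and the
substitution map are hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names exist in the Impala code base.
public class GroupingExprSketch {
    // Rewrites each slot id through the substitution map; ids without a
    // mapping are kept unchanged.
    static List<Integer> substitute(List<Integer> slotIds, Map<Integer, Integer> smap) {
        List<Integer> result = new ArrayList<>();
        for (Integer id : slotIds) {
            result.add(smap.getOrDefault(id, id));
        }
        return result;
    }

    // Mimics the planner-side sanity check: every referenced slot must be
    // materialized, otherwise an IllegalStateException is thrown.
    static void checkMaterialized(List<Integer> slotIds, Set<Integer> materialized) {
        for (Integer id : slotIds) {
            if (!materialized.contains(id)) {
                throw new IllegalStateException(
                    "Illegal reference to non-materialized slot: sid=" + id);
            }
        }
    }

    public static void main(String[] args) {
        // Assumed setup: slots 2 and 3 belong to an inline view and are not
        // materialized; slots 4 and 5 are the aggregation output slots.
        List<Integer> originalGroupingExprs = List.of(2, 3);
        Map<Integer, Integer> smap = Map.of(2, 4, 3, 5);
        Set<Integer> materialized = Set.of(4, 5);

        // Substituted exprs only reference materialized slots, so this passes.
        checkMaterialized(substitute(originalGroupingExprs, smap), materialized);

        // Original exprs still reference slot 2, so this throws.
        try {
            checkMaterialized(originalGroupingExprs, materialized);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, partitioning an exchange on the original exprs fails the
check, while partitioning on the substituted exprs does not, which mirrors
the choice this patch makes.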

Also add some useful TRACE logs for future debugging.

Tests
  - Add planner tests to cover the regression bug.
  - Run CORE tests

Change-Id: I11a80bd6d73ea00ad8c644469558a1885706f596
Reviewed-on: http://gerrit.cloudera.org:8080/14063
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> IllegalStateException: Illegal reference to non-materialized slot
> -----------------------------------------------------------------
>
>                 Key: IMPALA-8790
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8790
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.1.0, Impala 3.2.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>             Fix For: Impala 3.3.0
>
>         Attachments: foo.parq
>
>
> Reproduce:
> {code:sql}
> $ hdfs dfs -put foo.parq hdfs:///tmp
> impala> create table foo (uid string, cid string) stored as parquet;
> impala> load data inpath 'hdfs:///tmp/foo.parq' into table foo;
> {code}
> With the stats, the following query hits an IllegalStateException:
> {code:sql}
> impala> compute stats foo;
> impala> explain select uid, cid,
>    rank() over (partition by uid order by count(*) desc)
> from (select uid, cid from foo) w
> group by uid, cid;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=1 sid=2
> {code}
> Without the stats, it runs successfully:
> {code:sql}
> impala> drop stats foo;
> impala> explain select uid, cid,
>    rank() over (partition by uid order by count(*) desc)
> from (select uid, cid from foo) w
> group by uid, cid;
> +------------------------------------------------------------------------------------+
> | Explain String                                                                     |
> +------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=84.02MB Threads=5                        |
> | Per-Host Resource Estimates: Memory=304MB                                          |
> | WARNING: The following tables are missing relevant table and/or column statistics. |
> | common_action.foo                                                                  |
> |                                                                                    |
> | PLAN-ROOT SINK                                                                     |
> | |                                                                                  |
> | 07:EXCHANGE [UNPARTITIONED]                                                        |
> | |                                                                                  |
> | 03:ANALYTIC                                                                        |
> | |  functions: rank()                                                               |
> | |  partition by: uid                                                               |
> | |  order by: count(*) DESC                                                         |
> | |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW                       |
> | |  row-size=40B cardinality=1.10K                                                  |
> | |                                                                                  |
> | 02:SORT                                                                            |
> | |  order by: uid ASC NULLS FIRST, count(*) DESC                                    |
> | |  row-size=32B cardinality=1.10K                                                  |
> | |                                                                                  |
> | 06:EXCHANGE [HASH(uid)]                                                            |
> | |                                                                                  |
> | 05:AGGREGATE [FINALIZE]                                                            |
> | |  output: count:merge(*)                                                          |
> | |  group by: uid, cid                                                              |
> | |  row-size=32B cardinality=1.10K                                                  |
> | |                                                                                  |
> | 04:EXCHANGE [HASH(uid,cid)]                                                        |
> | |                                                                                  |
> | 01:AGGREGATE [STREAMING]                                                           |
> | |  output: count(*)                                                                |
> | |  group by: uid, cid                                                              |
> | |  row-size=32B cardinality=1.10K                                                  |
> | |                                                                                  |
> | 00:SCAN HDFS [common_action.foo]                                                   |
> |    HDFS partitions=1/1 files=1 size=5.19KB                                         |
> |    row-size=24B cardinality=1.10K                                                  |
> +------------------------------------------------------------------------------------+
> Fetched 37 row(s) in 0.03s
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
