[ 
https://issues.apache.org/jira/browse/IMPALA-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-6006:
-------------------------------------

    Assignee:     (was: Philip Martin)

> Incorrect cardinality estimation when dimension table has inequality predicate
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-6006
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6006
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 2.11.0
>            Reporter: Mostafa Mokhtar
>            Priority: Major
>
> Query 
> {code}
> select count(*)
>        from catalog_sales
>         JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
>        where
>          d_month_seq between 1193 and 1193+11;
> {code}
> Plan
> {code}
> +-------------------------------------------------------------------------------+
> | Explain String                                                                |
> +-------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=1.94MB                              |
> | Per-Host Resource Estimates: Memory=54.94MB                                   |
> |                                                                               |
> | F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                         |
> | |  Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B                |
> | PLAN-ROOT SINK                                                                |
> | |  mem-estimate=0B mem-reservation=0B                                         |
> | |                                                                             |
> | 06:AGGREGATE [FINALIZE]                                                       |
> | |  output: count:merge(*)                                                     |
> | |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB                |
> | |  tuple-ids=2 row-size=8B cardinality=1                                      |
> | |                                                                             |
> | 05:EXCHANGE [UNPARTITIONED]                                                   |
> | |  mem-estimate=0B mem-reservation=0B                                         |
> | |  tuple-ids=2 row-size=8B cardinality=1                                      |
> | |                                                                             |
> | F00:PLAN FRAGMENT [RANDOM] hosts=7 instances=7                                |
> | Per-Host Resources: mem-estimate=12.94MB mem-reservation=1.94MB               |
> | 03:AGGREGATE                                                                  |
> | |  output: count(*)                                                           |
> | |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB                |
> | |  tuple-ids=2 row-size=8B cardinality=1                                      |
> | |                                                                             |
> | 02:HASH JOIN [INNER JOIN, BROADCAST]                                          |
> | |  hash predicates: catalog_sales.cs_sold_date_sk = date_dim.d_date_sk        |
> | |  fk/pk conjuncts: catalog_sales.cs_sold_date_sk = date_dim.d_date_sk        |
> | |  runtime filters: RF000 <- date_dim.d_date_sk                               |
> | |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB            |
> | |  tuple-ids=0,1 row-size=16B cardinality=14399964710                         |
> | |                                                                             |
> | |--04:EXCHANGE [BROADCAST]                                                    |
> | |  |  mem-estimate=0B mem-reservation=0B                                      |
> | |  |  tuple-ids=1 row-size=8B cardinality=7305                                |
> | |  |                                                                          |
> | |  F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                             |
> | |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B                |
> | |  01:SCAN HDFS [tpcds_10000_parquet.date_dim, RANDOM]                        |
> | |     partitions=1/1 files=1 size=2.15MB                                      |
> | |     predicates: d_month_seq <= 1204, d_month_seq >= 1193                    |
> | |     stats-rows=73049 extrapolated-rows=disabled                             |
> | |     table stats: rows=73049 size=unavailable                                |
> | |     column stats: all                                                       |
> | |     parquet statistics predicates: d_month_seq <= 1204, d_month_seq >= 1193 |
> | |     parquet dictionary predicates: d_month_seq <= 1204, d_month_seq >= 1193 |
> | |     mem-estimate=32.00MB mem-reservation=0B                                 |
> | |     tuple-ids=1 row-size=8B cardinality=7305                                |
> | |                                                                             |
> | 00:SCAN HDFS [tpcds_10000_parquet.catalog_sales, RANDOM]                      |
> |    partitions=1837/1837 files=5055 size=971.94GB                              |
> |    runtime filters: RF000 -> catalog_sales.cs_sold_date_sk                    |
> |    stats-rows=14399964710 extrapolated-rows=disabled                          |
> |    table stats: rows=14399964710 size=unavailable                             |
> |    column stats: all                                                          |
> |    mem-estimate=1.00MB mem-reservation=0B                                     |
> |    tuple-ids=0 row-size=8B cardinality=14399964710                            |
> +-------------------------------------------------------------------------------+
> {code}
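
For readers skimming the plan: the date_dim scan is estimated at 7305 of 73049 rows (about 10%), yet 02:HASH JOIN still reports cardinality=14399964710, the full catalog_sales row count. The sketch below shows roughly what a join estimate might look like if the build-side filter were propagated. It assumes a textbook FK/PK heuristic (join output scales with the fraction of dimension rows that survive the filter); that heuristic and the helper name scaled_fk_pk_estimate are illustrative assumptions, not Impala's actual planner code.

```python
# Figures copied from the EXPLAIN output in the issue description.
FACT_ROWS = 14_399_964_710           # catalog_sales stats-rows
DIM_ROWS = 73_049                    # date_dim stats-rows
DIM_ROWS_AFTER_FILTER = 7_305        # date_dim scan cardinality after the predicate
REPORTED_JOIN_ROWS = 14_399_964_710  # cardinality printed on 02:HASH JOIN


def scaled_fk_pk_estimate(fact_rows: int, dim_rows: int, dim_rows_after_filter: int) -> int:
    """Hypothetical helper: scale the fact-table row count by the fraction of
    dimension rows that pass the filter. This is an assumed textbook FK/PK
    heuristic, not a transcription of Impala's planner logic."""
    return fact_rows * dim_rows_after_filter // dim_rows


expected = scaled_fk_pk_estimate(FACT_ROWS, DIM_ROWS, DIM_ROWS_AFTER_FILTER)
overestimate = REPORTED_JOIN_ROWS / expected

print(f"join estimate with filter propagated: {expected:,}")
print(f"reported estimate is about {overestimate:.1f}x larger")
```

Under that assumption the join estimate would land near a tenth of the reported figure, which appears to be the discrepancy the issue title describes.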



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
