[ https://issues.apache.org/jira/browse/IMPALA-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong reassigned IMPALA-6006:
-------------------------------------
Assignee: (was: Philip Martin)
> Incorrect cardinality estimation when dimension table has inequality predicate
> ------------------------------------------------------------------------------
>
> Key: IMPALA-6006
> URL: https://issues.apache.org/jira/browse/IMPALA-6006
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 2.11.0
> Reporter: Mostafa Mokhtar
> Priority: Major
>
> Query
> {code}
> select count(*)
> from catalog_sales
> JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
> where
> d_month_seq between 1193 and 1193+11;
> {code}
> Plan
> {code}
> +-------------------------------------------------------------------------------+
> | Explain String
> +-------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=1.94MB
> | Per-Host Resource Estimates: Memory=54.94MB
> |
> | F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> | | Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B
> | PLAN-ROOT SINK
> | | mem-estimate=0B mem-reservation=0B
> | |
> | 06:AGGREGATE [FINALIZE]
> | | output: count:merge(*)
> | | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB
> | | tuple-ids=2 row-size=8B cardinality=1
> | |
> | 05:EXCHANGE [UNPARTITIONED]
> | | mem-estimate=0B mem-reservation=0B
> | | tuple-ids=2 row-size=8B cardinality=1
> | |
> | F00:PLAN FRAGMENT [RANDOM] hosts=7 instances=7
> | Per-Host Resources: mem-estimate=12.94MB mem-reservation=1.94MB
> | 03:AGGREGATE
> | | output: count(*)
> | | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB
> | | tuple-ids=2 row-size=8B cardinality=1
> | |
> | 02:HASH JOIN [INNER JOIN, BROADCAST]
> | | hash predicates: catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
> | | fk/pk conjuncts: catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
> | | runtime filters: RF000 <- date_dim.d_date_sk
> | | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB
> | | tuple-ids=0,1 row-size=16B cardinality=14399964710
> | |
> | |--04:EXCHANGE [BROADCAST]
> | | | mem-estimate=0B mem-reservation=0B
> | | | tuple-ids=1 row-size=8B cardinality=7305
> | | |
> | | F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
> | | Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B
> | | 01:SCAN HDFS [tpcds_10000_parquet.date_dim, RANDOM]
> | | partitions=1/1 files=1 size=2.15MB
> | | predicates: d_month_seq <= 1204, d_month_seq >= 1193
> | | stats-rows=73049 extrapolated-rows=disabled
> | | table stats: rows=73049 size=unavailable
> | | column stats: all
> | | parquet statistics predicates: d_month_seq <= 1204, d_month_seq >= 1193
> | | parquet dictionary predicates: d_month_seq <= 1204, d_month_seq >= 1193
> | | mem-estimate=32.00MB mem-reservation=0B
> | | tuple-ids=1 row-size=8B cardinality=7305
> | |
> | 00:SCAN HDFS [tpcds_10000_parquet.catalog_sales, RANDOM]
> | partitions=1837/1837 files=5055 size=971.94GB
> | runtime filters: RF000 -> catalog_sales.cs_sold_date_sk
> | stats-rows=14399964710 extrapolated-rows=disabled
> | table stats: rows=14399964710 size=unavailable
> | column stats: all
> | mem-estimate=1.00MB mem-reservation=0B
> | tuple-ids=0 row-size=8B cardinality=14399964710
> +-------------------------------------------------------------------------------+
> {code}
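The report contains no prose, so as a reading aid here is a rough sketch of why the join estimate looks wrong. It takes three numbers from the plan above: the catalog_sales row count, the date_dim row count, and the date_dim scan cardinality after the d_month_seq predicate. It then applies a simple textbook FK/PK estimate that assumes fact rows are spread uniformly over the dimension keys. The function name and the formula are illustrative assumptions, not Impala's planner code.

```python
def fk_pk_join_estimate(fact_rows, dim_rows_total, dim_rows_after_filter):
    """Textbook FK/PK join estimate (an assumption, not Impala's formula).

    Assumes every fact row matches exactly one dimension row and that fact
    rows are uniformly distributed over the dimension keys, so the join keeps
    the same fraction of fact rows that the filter keeps of the dimension.
    """
    if dim_rows_total <= 0:
        raise ValueError("dim_rows_total must be positive")
    selectivity = dim_rows_after_filter / dim_rows_total
    return round(fact_rows * selectivity)


# Values copied from the explain plan in the issue description.
FACT_ROWS = 14_399_964_710           # 00:SCAN HDFS catalog_sales cardinality
DIM_ROWS_TOTAL = 73_049              # date_dim table stats: rows
DIM_ROWS_FILTERED = 7_305            # 01:SCAN HDFS date_dim cardinality
PLAN_JOIN_ESTIMATE = 14_399_964_710  # 02:HASH JOIN cardinality

expected = fk_pk_join_estimate(FACT_ROWS, DIM_ROWS_TOTAL, DIM_ROWS_FILTERED)
ratio = PLAN_JOIN_ESTIMATE / expected

print(f"uniform FK/PK estimate: {expected:,}")
print(f"plan join estimate:     {PLAN_JOIN_ESTIMATE:,}")
print(f"plan / uniform ratio:   {ratio:.2f}")
```

Under these assumptions the join should keep about one tenth of catalog_sales (roughly 1.44 billion rows). The plan instead reports the unfiltered catalog_sales cardinality, about ten times higher.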