[
https://issues.apache.org/jira/browse/IMPALA-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615283#comment-17615283
]
Qifan Chen commented on IMPALA-11647:
-------------------------------------
The scan's output row size is 0B instead of 8B because of this line of code:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/planner/ScanNode.java#L160
Once that restriction is relaxed, we can get a better plan in which the scan's
row size is 8B and its number of rows equals the number of files in the table.
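
As a rough illustration of the estimate described above, the sketch below
computes what each scan would output if the restriction were relaxed. It is
not Impala planner code: the class and method names are hypothetical, and it
assumes exactly one 8-byte count row per file, as stated in this comment. The
file count is taken from the explain output quoted below.

```java
// Hypothetical sketch of the scan estimate discussed in this comment.
// None of these names exist in Impala; they are for illustration only.
final class CountStarScanEstimate {
    // Assumed width of one partial count (a BIGINT), matching the 8B row size.
    static final long COUNT_SLOT_BYTES = 8;

    // Assumption from the comment: with the count star optimization,
    // the scan emits one row per file.
    static long optimizedScanRows(long numFiles) {
        return numFiles;
    }

    // Total bytes the scan would emit under the same assumption.
    static long optimizedScanBytes(long numFiles) {
        return Math.multiplyExact(optimizedScanRows(numFiles), COUNT_SLOT_BYTES);
    }

    public static void main(String[] args) {
        long files = 1824; // files=1824 in the quoted explain output
        System.out.println("rows=" + optimizedScanRows(files)
                + " bytes=" + optimizedScanBytes(files));
    }
}
```

Under these assumptions each scan of store_sales would be estimated at 1824
rows of 8B each, rather than 2.88M rows of 0B.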
> Row size for source tables in a cross join query is set to 0 in query plan
> --------------------------------------------------------------------------
>
> Key: IMPALA-11647
> URL: https://issues.apache.org/jira/browse/IMPALA-11647
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Qifan Chen
> Priority: Major
>
> The row-size for both source tables in the following explain output is set
> to 0B. In principle, the count star optimization could be applied to such
> queries, which would set the row-size correctly.
> {code:java}
> explain select count(*) from store_sales a, store_sales b limit 500
> +--------------------------------------------------------------+
> | Explain String |
> +--------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=256.00KB Threads=5 |
> | Per-Host Resource Estimates: Memory=10MB |
> | |
> | PLAN-ROOT SINK |
> | | |
> | 06:AGGREGATE [FINALIZE] |
> | | output: count:merge(*) |
> | | limit: 500 |
> | | row-size=8B cardinality=1 |
> | | |
> | 05:EXCHANGE [UNPARTITIONED] |
> | | |
> | 03:AGGREGATE |
> | | output: count(*) |
> | | row-size=8B cardinality=1 |
> | | |
> | 02:NESTED LOOP JOIN [CROSS JOIN, BROADCAST] |
> | | row-size=0B cardinality=8.30T |
> | | |
> | |--04:EXCHANGE [BROADCAST] |
> | | | |
> | | 01:SCAN HDFS [tpcds_parquet.store_sales b] |
> | | HDFS partitions=1824/1824 files=1824 size=199.83MB |
> | | row-size=0B cardinality=2.88M |
> | | |
> | 00:SCAN HDFS [tpcds_parquet.store_sales a] |
> | HDFS partitions=1824/1824 files=1824 size=199.83MB |
> | row-size=0B cardinality=2.88M |
> +--------------------------------------------------------------+
> {code}