[
https://issues.apache.org/jira/browse/IMPALA-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-1374:
----------------------------------
Issue Type: Improvement (was: Bug)
> Improve Join Order Planning
> ---------------------------
>
> Key: IMPALA-1374
> URL: https://issues.apache.org/jira/browse/IMPALA-1374
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 1.3.2
> Reporter: Ryan Bosshart
> Priority: Minor
> Labels: performance, planner
> Attachments: consolidatedqueries_fast, consolidatedqueries_slow
>
>
> The join order is determined entirely by total size (#rows * column width).
> This makes sense in general. However, when the fact table (after partition
> pruning) is close in size to the dim table, size alone can pick the wrong
> order: the join key from the fact table is duplicated many times, which
> makes the hash chains very long.
> On two almost identical queries (similar join conditions, tables, and number
> of results), this led to a query time of ~10 seconds for one query and ~3
> minutes for the other (time to first row fetched; queries attached).
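
The hash-chain effect described in the report can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses a plain Python dict of lists as a chained hash table, not Impala's actual hash table, and the table sizes (NUM_KEYS, DUPLICATES) and helper names are hypothetical values chosen for illustration. It compares the chain length when the build side has unique join keys (dim-like) with the chain length when each key is heavily duplicated (fact-like).

```python
from collections import defaultdict


def build_hash_table(keys):
    """Toy chained hash table: join key -> list of build-side row ids."""
    table = defaultdict(list)
    for row_id, key in enumerate(keys):
        table[key].append(row_id)
    return table


def longest_chain(table):
    """Length of the longest per-key chain in the toy table."""
    return max((len(rows) for rows in table.values()), default=0)


# Hypothetical sizes, chosen only for illustration.
NUM_KEYS = 100      # distinct join keys
DUPLICATES = 1000   # fact rows per join key

# Dim-like side: every join key appears exactly once.
dim_keys = list(range(NUM_KEYS))
# Fact-like side: every join key is repeated DUPLICATES times.
fact_keys = [k for k in range(NUM_KEYS) for _ in range(DUPLICATES)]

dim_table = build_hash_table(dim_keys)
fact_table = build_hash_table(fact_keys)

print("dim build side:  distinct keys =", len(dim_table),
      "longest chain =", longest_chain(dim_table))
print("fact build side: distinct keys =", len(fact_table),
      "longest chain =", longest_chain(fact_table))
```

In this model both build sides have the same number of distinct keys, but the chain per key grows with the duplication factor. A size-only estimate (#rows * column width) does not capture that difference.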