[
https://issues.apache.org/jira/browse/IMPALA-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated IMPALA-8220:
--------------------------------
Description:
See IMPALA-8213. To properly account for join cardinalities, we must track the
adjusted NDV of columns as they pass through filters. IMPALA-8014, IMPALA-8015
and IMPALA-8213 suggest work-arounds based on the current code design.
A better longer-term solution is to track the adjusted NDV for each column up
the plan tree.
That is, suppose we have column {{c}} with an original NDV of {{|c|}}. The scan
applies a filter of {{c = 10}}. Clearly, the NDV out of the scan, {{|c'|}}, is
just 1.
By tracking the filtered NDV, calculations up the tree become local. At
present, the join node must reach down through the tree to find filters and
potentially reverse them. This is complex and can be replaced with per-column
NDV tracking.
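
To illustrate the idea, here is a minimal sketch of per-column NDV tracking. It is not Impala's planner code: the names {{NodeStats}}, {{scan}}, {{filter_eq}} and {{join_eq}} are hypothetical, and the estimates assume uniform value distribution and the textbook equi-join formula |L| * |R| / max(NDV). Each node only reads the stats of its direct children, so the join never has to look below its inputs for filters.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NodeStats:
    """Hypothetical per-node stats: row count plus adjusted NDV per column."""
    cardinality: float
    ndv: Dict[str, float] = field(default_factory=dict)


def scan(row_count: float, column_ndvs: Dict[str, float]) -> NodeStats:
    """Unfiltered scan: NDVs are the original table-level values."""
    return NodeStats(row_count, dict(column_ndvs))


def filter_eq(child: NodeStats, column: str) -> NodeStats:
    """Apply `column = constant` (assumes uniformly distributed values)."""
    out_card = child.cardinality / child.ndv[column]
    # No column can have more distinct values than there are rows.
    ndv = {c: min(n, out_card) for c, n in child.ndv.items()}
    # An equality predicate leaves exactly one value of the filtered column.
    ndv[column] = 1.0
    return NodeStats(out_card, ndv)


def join_eq(left: NodeStats, right: NodeStats,
            left_col: str, right_col: str) -> NodeStats:
    """Equi-join estimate that uses only the children's adjusted NDVs."""
    l_ndv, r_ndv = left.ndv[left_col], right.ndv[right_col]
    out_card = left.cardinality * right.cardinality / max(l_ndv, r_ndv)
    ndv = {**left.ndv, **right.ndv}
    # Only key values present on both sides survive the join.
    ndv[left_col] = ndv[right_col] = min(l_ndv, r_ndv)
    ndv = {c: min(n, out_card) for c, n in ndv.items()}
    return NodeStats(out_card, ndv)


# Example: t1 has 1000 rows, t2 has 500 rows; both key columns have NDV 100.
t1 = filter_eq(scan(1000, {"t1.c": 100, "t1.d": 50}), "t1.c")
t2 = scan(500, {"t2.c": 100})
joined = join_eq(t1, t2, "t1.c", "t2.c")
```

In this example the filtered scan carries {{|c'|}} = 1 upward, and the join applies its formula to the values it receives without inspecting or reversing the filter below it.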
was:
See IMPALA-XXXX. To proper account for join cardinalities, we must track the
adjusted NDV of columns as they pass through filters. IMPALA-8014, IMPALA-8015
and IMPALA-8213 suggest work-arounds based on the current code design.
A better longer-term solution is to track the adjusted NDV for each column up
the plan tree.
That is, suppose we have column {{c}} with an original NDV of {{|c|}}. The scan
applies a filter of {{c = 10}}. Clearly, the NDV out of the scan, {{|c'|}} is
just 1.
By tracking the filtered NDV, calculations up the tree become local. At
present, the join node must reach down through the tree to find filters and
potentially reverse them. This is complex and can be replaced with per-column
NDV tracking.
> Track adjusted NDV for each column through plan tree
> ----------------------------------------------------
>
> Key: IMPALA-8220
> URL: https://issues.apache.org/jira/browse/IMPALA-8220
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.1.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> See IMPALA-8213. To properly account for join cardinalities, we must track the
> adjusted NDV of columns as they pass through filters. IMPALA-8014,
> IMPALA-8015 and IMPALA-8213 suggest work-arounds based on the current code
> design.
> A better longer-term solution is to track the adjusted NDV for each column up
> the plan tree.
> That is, suppose we have column {{c}} with an original NDV of {{|c|}}. The
> scan applies a filter of {{c = 10}}. Clearly, the NDV out of the scan,
> {{|c'|}}, is just 1.
> By tracking the filtered NDV, calculations up the tree become local. At
> present, the join node must reach down through the tree to find filters and
> potentially reverse them. This is complex and can be replaced with per-column
> NDV tracking.