Paul Rogers created IMPALA-8220:
-----------------------------------
Summary: Track adjusted NDV for each column through plan tree
Key: IMPALA-8220
URL: https://issues.apache.org/jira/browse/IMPALA-8220
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers
Assignee: Paul Rogers
See IMPALA-XXXX. To properly account for join cardinalities, we must track the
adjusted NDV (number of distinct values) of columns as they pass through
filters. IMPALA-8014, IMPALA-8015 and IMPALA-8213 suggest work-arounds based on
the current code design.
A better, longer-term solution is to track the adjusted NDV for each column up
the plan tree.
That is, suppose we have column {{c}} with an original NDV of {{|c|}}, and the
scan applies a filter of {{c = 10}}. Clearly, the NDV out of the scan,
{{|c'|}}, is at most 1.
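As a minimal sketch of the adjustment described above (not Impala's code; the function name and the predicate kinds are hypothetical, chosen only for illustration), an equality filter caps the column's NDV at 1 and an IN-list caps it at the list length:

```python
def adjusted_ndv_after_filter(ndv, kind, n_values=1):
    """Return an illustrative adjusted NDV for a column after a filter on it.

    ndv      -- original NDV of the column, |c|
    kind     -- "eq" for `c = const`, "in" for `c IN (...)`;
                any other value leaves the NDV unchanged (assumption)
    n_values -- number of literals in the IN list
    """
    if kind == "eq":
        # At most one distinct value can survive `c = const`.
        return min(1, ndv)
    if kind == "in":
        # At most one distinct value per literal, never more than |c|.
        return min(n_values, ndv)
    return ndv


# |c| = 100 and the scan applies c = 10, so |c'| = 1.
print(adjusted_ndv_after_filter(100, "eq"))
```

Range and other predicates would need their own rules; they are left unchanged here only to keep the sketch short.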
By tracking the filtered NDV, calculations up the tree become local. At
present, the join node must reach down through the tree to find filters and
potentially reverse them. This is complex and can be replaced with per-column
NDV tracking.
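The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the Impala planner: {{PlanNode}}, {{scan}} and {{equi_join}} are hypothetical names, the join estimate is the textbook |L| * |R| / max(NDV) formula, and column names are assumed to be table-qualified so they do not collide. Each node carries its own column-to-NDV map, so the join reads only its children's maps and never inspects the filters below them.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    cardinality: float
    ndv: dict = field(default_factory=dict)  # column name -> adjusted NDV


def scan(rows, ndvs, eq_filter_col=None):
    """Hypothetical scan node with an optional `col = const` filter."""
    out = dict(ndvs)
    card = rows
    if eq_filter_col is not None:
        # Assume uniform distribution: one value out of |c| survives.
        card = rows / max(ndvs[eq_filter_col], 1)
        out[eq_filter_col] = min(1, ndvs[eq_filter_col])
    # No column can have more distinct values than there are rows.
    out = {c: min(n, card) for c, n in out.items()}
    return PlanNode(card, out)


def equi_join(left, right, lcol, rcol):
    """Hypothetical equi-join that uses only its children's adjusted NDVs."""
    l_ndv, r_ndv = left.ndv[lcol], right.ndv[rcol]
    card = left.cardinality * right.cardinality / max(l_ndv, r_ndv, 1)
    out = {**left.ndv, **right.ndv}
    # Only key values present on both sides can appear in the output.
    out[lcol] = out[rcol] = min(l_ndv, r_ndv)
    out = {c: min(n, card) for c, n in out.items()}
    return PlanNode(card, out)


orders = scan(1000, {"o.cust_id": 100})
customers = scan(100, {"c.id": 100, "c.region": 10}, eq_filter_col="c.region")
joined = equi_join(orders, customers, "o.cust_id", "c.id")
print(joined.cardinality, joined.ndv)
```

In this made-up example the filter on {{c.region}} reduces the adjusted NDV of {{c.id}} at the scan, and the join picks that up from the child's map without looking at the filter itself.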