Shant Hovsepian created IMPALA-9972:
---------------------------------------
Summary: Use defined referential constraints for join cardinality calculations
Key: IMPALA-9972
URL: https://issues.apache.org/jira/browse/IMPALA-9972
Project: IMPALA
Issue Type: Sub-task
Components: Frontend
Reporter: Shant Hovsepian
Assignee: Shant Hovsepian
Fix For: Impala 4.0

Currently, an estimation technique is used to determine whether the join
predicates constitute a foreign key -> primary key type of functional
dependency. These types of joins are common in "star schemas" and allow for
certain query planning optimizations.

The current technique, however, can produce both false negatives and false
positives because it relies on table stats, which can be out of date or
incorrect due to the statistical methods used to derive them. For example, the
HyperLogLog algorithm used by stats computation to calculate the number of
distinct values for a specific column can show high variability in its error
rate.
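
To illustrate how an approximate distinct-value count can flip the outcome in
either direction, here is a minimal sketch. It assumes the stats-based check
treats a column as a primary key when its estimated number of distinct values
(NDV) is within a small tolerance of the table's row count. The function name,
the 5% tolerance, and the sample numbers are hypothetical and are not taken
from the Impala code base.

```python
def looks_like_primary_key(ndv_estimate, row_count, tolerance=0.05):
    """Stats-based guess (hypothetical): a column is treated as unique when
    its estimated NDV is within `tolerance` of the table's row count."""
    if row_count <= 0:
        return False
    return ndv_estimate >= row_count * (1.0 - tolerance)


row_count = 1_000_000

# A truly unique column whose NDV estimate came out 8% too low:
# the check rejects it, so a real FK -> PK join is missed.
false_negative = not looks_like_primary_key(920_000, row_count)

# A non-unique column (900,000 real distinct values) whose NDV estimate
# came out 6% too high: the check accepts it as a primary key.
false_positive = looks_like_primary_key(954_000, row_count)
```

In both cases the underlying data is unchanged; only the estimation error
decides which side of the threshold the column lands on.
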

In cases where a referential integrity constraint exists and is defined in the
table metadata, this information should be used instead of the stats-based
estimation to determine the type and cardinality of a join.
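
The proposed behaviour can be sketched as a constraint-first lookup with the
stats-based estimate as a fallback. This is an illustration under stated
assumptions, not the Impala implementation: the ForeignKey type, both
functions, the 5% tolerance, and the table names are all hypothetical, and the
sketch assumes that a declared constraint can be trusted to hold in the data.

```python
from dataclasses import dataclass
from typing import Tuple

Column = Tuple[str, str]  # (table name, column name)


@dataclass(frozen=True)
class ForeignKey:
    """A declared constraint: child column references parent column."""
    child: Column
    parent: Column


def classify_join(lhs, rhs, declared_fks, rhs_ndv=None, rhs_rows=None,
                  tolerance=0.05):
    """Return (is_fk_pk, source) for the equi-join lhs = rhs."""
    # Prefer declared metadata: no statistics are consulted at all.
    for fk in declared_fks:
        if fk.child == lhs and fk.parent == rhs:
            return True, "constraint"
    # Fallback: guess uniqueness of the rhs column from its NDV estimate.
    if rhs_ndv is None or not rhs_rows or rhs_rows <= 0:
        return False, "stats"
    return rhs_ndv >= rhs_rows * (1.0 - tolerance), "stats"


def fk_pk_join_cardinality(lhs_rows, rhs_rows_after_filter, rhs_total_rows):
    """Each FK row matches at most one PK row, so the output is the lhs row
    count scaled by the fraction of PK rows that survive filtering."""
    if rhs_total_rows <= 0:
        return 0
    return round(lhs_rows * rhs_rows_after_filter / rhs_total_rows)


fks = [ForeignKey(("sales", "customer_id"), ("customers", "id"))]
lhs, rhs = ("sales", "customer_id"), ("customers", "id")

# The NDV estimate is 8% low, but the declared constraint settles the question.
with_constraint = classify_join(lhs, rhs, fks, rhs_ndv=920_000,
                                rhs_rows=1_000_000)
# Without the constraint, the same stats lead to a missed FK -> PK join.
without_constraint = classify_join(lhs, rhs, [], rhs_ndv=920_000,
                                   rhs_rows=1_000_000)
estimated_rows = fk_pk_join_cardinality(5_000_000, 250_000, 1_000_000)
```

Returning the source of the decision alongside the result makes it easy to see
in a plan whether a join was classified from metadata or from statistics.
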