Shant Hovsepian created IMPALA-9972:
---------------------------------------

             Summary: Use defined referential constraints for join cardinality 
calculations
                 Key: IMPALA-9972
                 URL: https://issues.apache.org/jira/browse/IMPALA-9972
             Project: IMPALA
          Issue Type: Sub-task
          Components: Frontend
            Reporter: Shant Hovsepian
            Assignee: Shant Hovsepian
             Fix For: Impala 4.0


Currently an estimation technique is used to determine whether the join 
predicates constitute a foreign key -> primary key type of functional 
dependency. These types of joins are common in "star schemas" and allow for 
certain query planning optimizations.

The current technique, however, can produce both false negatives and false 
positives because it relies on table stats, which can be out of date or 
inaccurate due to the statistical methods used to derive them. For example, 
the error rate of the HyperLogLog algorithm that stats computation uses to 
estimate the number of distinct values in a column can vary considerably from 
one column to another.
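
To illustrate how estimation error alone can flip the outcome, here is a 
minimal sketch. It assumes a simplified heuristic in which a join column is 
treated as a primary key when its estimated number of distinct values (NDV) is 
close to the table's row count. The function name, the tolerance and the 
numbers are hypothetical and are not taken from the Impala planner.

```python
# Hypothetical, simplified stats-based uniqueness check (an assumption for
# illustration only; this is not Impala's actual planner logic).

def looks_like_primary_key(ndv_estimate: int, row_count: int,
                           tolerance: float = 0.05) -> bool:
    """Treat a column as unique when its estimated NDV is within
    `tolerance` of the table's row count."""
    if row_count <= 0:
        return False
    return ndv_estimate >= row_count * (1.0 - tolerance)


# Case 1: the column really is unique (1,000,000 distinct values), but the
# NDV estimate came out 8% low, so the check misses it (false negative).
unique_column_detected = looks_like_primary_key(
    ndv_estimate=920_000, row_count=1_000_000)

# Case 2: the column has only 900,000 distinct values, but the NDV estimate
# came out 6% high, so the check accepts it (false positive).
non_unique_column_detected = looks_like_primary_key(
    ndv_estimate=954_000, row_count=1_000_000)

print(unique_column_detected)      # False
print(non_unique_column_detected)  # True
```

Stale row counts have the same effect as NDV error in this sketch, since 
the check only compares the two numbers.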

In cases where a referential integrity constraint exists and is defined in the 
table metadata, this information should be used instead of the stats-based 
estimation to determine the type and cardinality of a join.
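
The proposal can be sketched as a planner that consults a declared constraint 
first and falls back to stats only when none exists. The sketch below uses 
textbook cardinality formulas as a stand-in. The JoinInput type, its fields 
and estimate_join_cardinality are hypothetical names, not Impala classes or 
methods.

```python
from dataclasses import dataclass


@dataclass
class JoinInput:
    """Hypothetical container for the inputs of an equi-join estimate."""
    lhs_rows: int             # rows on the foreign key (fact) side
    rhs_rows: int             # rows on the primary key side, after filters
    rhs_rows_unfiltered: int  # rows on the primary key side, before filters
    lhs_ndv: int              # estimated NDV of the left join column
    rhs_ndv: int              # estimated NDV of the right join column
    declared_fk_pk: bool      # True if metadata declares an FK -> PK constraint


def estimate_join_cardinality(j: JoinInput) -> int:
    if j.declared_fk_pk and j.rhs_rows_unfiltered > 0:
        # With a declared constraint, each left row matches at most one right
        # row, so the output is the left side scaled by the fraction of right
        # rows that survive filtering. NDV estimates are not consulted.
        return round(j.lhs_rows * j.rhs_rows / j.rhs_rows_unfiltered)
    # Generic textbook estimate, which depends on NDV stats being accurate.
    return round(j.lhs_rows * j.rhs_rows / max(j.lhs_ndv, j.rhs_ndv, 1))


with_constraint = JoinInput(1_000_000, 100, 1_000, 800, 100, True)
without_constraint = JoinInput(1_000_000, 100, 1_000, 800, 100, False)

print(estimate_join_cardinality(with_constraint))     # 100000
print(estimate_join_cardinality(without_constraint))  # 125000
```

In this example the left NDV of 800 stands for a stale or inaccurate stat. 
It skews the generic estimate, while the constraint-based estimate is 
unaffected by it.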



