Harish Butani created HIVE-7905:
-----------------------------------
Summary: CBO: more cost model changes
Key: HIVE-7905
URL: https://issues.apache.org/jira/browse/HIVE-7905
Project: Hive
Issue Type: Sub-task
Reporter: Harish Butani
Assignee: Harish Butani
1. For composite predicates, smooth the selectivity calculation using
+exponential backoff+. Thanks to [~mmokhtar] for this formula.
{quote}
Can you change the algorithm to use exponential back-off:
ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8)
As opposed to:
ndv(pex) * log(ndv(pe1)) * log(ndv(pe2))
If we assume a selectivity of 0.7 for each store_sales join, the combined join
selectivity can end up being 6.24285E-05, which is too low and eventually
results in a suboptimal plan.
{quote}
See attached picture.
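A minimal sketch of the back-off formula quoted above, in Python rather than Hive's own code. The function name backoff_ndv is hypothetical, and sorting the NDVs in descending order (so the largest gets the full exponent) is an assumption; the quoted formula does not say how pe0..peN are ordered.

```python
def backoff_ndv(ndvs):
    """Combine per-predicate NDVs as ndv0 * ndv1^(1/2) * ndv2^(1/4) * ...

    Hypothetical helper for illustration only, not Hive's implementation.
    """
    combined = 1.0
    # Assumption: the largest NDV comes first and keeps exponent 1;
    # each later NDV has its exponent halved (1/2, 1/4, 1/8, ...).
    for i, ndv in enumerate(sorted(ndvs, reverse=True)):
        combined *= ndv ** (1.0 / (2 ** i))
    return combined


if __name__ == "__main__":
    # 256 * 16^(1/2) * 16^(1/4) = 256 * 4 * 2
    print(backoff_ndv([16, 256, 16]))
```

Each additional predicate contributes less than the one before it, so the combined estimate grows (or, for selectivities, shrinks) far more slowly than a plain product of all the terms.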
2. For Fact-Dim joins on the Dim primary key, we infer the join
cardinality by treating the join as a filter on the Fact table:
{code}
join card = rowCount(Fact table) * selectivity(dim table)
{code}
Whether a column is a key is inferred from either of the following:
* table rowCount = column ndv
* (tbd shortly) table rowCount = (maxVal - minVal)
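The key check and the cardinality rule above could be sketched roughly as follows. This is an illustration in Python, not Hive's code: both function names are hypothetical, selectivity(dim table) is assumed to be the fraction of Dim rows that survive the Dim-side predicates, and only the first key test is shown because the second is still marked tbd.

```python
def looks_like_key(table_row_count, column_ndv):
    """First key test from the issue: table rowCount = column ndv.

    Hypothetical helper; the empty-table guard is an added assumption.
    """
    return table_row_count > 0 and column_ndv == table_row_count


def fact_dim_join_cardinality(fact_row_count, dim_selectivity):
    """join card = rowCount(Fact table) * selectivity(dim table)."""
    return fact_row_count * dim_selectivity


if __name__ == "__main__":
    # Made-up numbers: a Dim table of 1,000 rows whose join column has
    # 1,000 distinct values, with a Dim filter keeping a quarter of the rows.
    if looks_like_key(1000, 1000):
        print(fact_dim_join_cardinality(1_000_000, 0.25))
```

Because every Fact row matches at most one Dim row when the join column is a key, filtering the Dim table scales the Fact row count by the same fraction, which is why the join can be costed like a filter.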