Harish Butani created HIVE-7905:
-----------------------------------

             Summary: CBO: more cost model changes
                 Key: HIVE-7905
                 URL: https://issues.apache.org/jira/browse/HIVE-7905
             Project: Hive
          Issue Type: Sub-task
            Reporter: Harish Butani
            Assignee: Harish Butani


1. For composite predicates, smooth the selectivity calculation using 
+exponential backoff+. Thanks to [~mmokhtar] for this formula.

{quote}
Can you change the algorithm to use exponential back-off:
ndv(pe0) * ndv(pe1) ^(1/2)  * ndv(pe2) ^(1/4)  * ndv(pe3) ^(1/8)

As opposed to:

ndv(pex)*log(ndv(pe1))*log(ndv(pe2))

If we assume a selectivity of 0.7 for each store_sales join, then the join 
selectivity can end up being 6.24285E-05, which is too low and eventually 
results in a suboptimal plan.
{quote}
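
A minimal sketch of the back-off formula quoted above, for illustration only. The function name {{backoff_ndv}} is hypothetical, and sorting the NDVs in descending order (so the largest NDV gets the full weight) is an assumption; the quoted formula does not state an ordering.

```python
def backoff_ndv(ndvs):
    """Combine per-predicate NDVs as ndv0 * ndv1^(1/2) * ndv2^(1/4) * ..."""
    combined = 1.0
    # Assumption: the largest NDV is pe0 and keeps its full weight;
    # each following NDV is damped by a halved exponent.
    for i, ndv in enumerate(sorted(ndvs, reverse=True)):
        combined *= ndv ** (0.5 ** i)
    return combined


if __name__ == "__main__":
    ndvs = [100, 16, 16]
    plain_product = 1.0
    for n in ndvs:
        plain_product *= n
    # The backed-off value grows much more slowly than the plain product,
    # so a selectivity derived from it shrinks less aggressively.
    print("plain product:", plain_product)
    print("with back-off:", backoff_ndv(ndvs))
```

With these made-up inputs the plain product is 25600, while the backed-off value is about 800, which shows how each extra predicate contributes less.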

See attached picture.

2. For Fact-Dim joins on the Dim primary key, we infer the join cardinality 
by treating the join as a filter on the Fact table:
{code}
join card = rowCount(Fact table) * selectivity(dim table)
{code}
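
The formula above can be sketched as follows. This is an illustration, not the patch itself: the function and parameter names are hypothetical, and it assumes that selectivity(dim table) means the fraction of Dim rows that survive the Dim-side predicates.

```python
def fact_dim_join_cardinality(fact_row_count, dim_row_count, dim_rows_after_filter):
    """join card = rowCount(Fact table) * selectivity(dim table)"""
    if dim_row_count <= 0:
        raise ValueError("dim_row_count must be positive")
    # Assumption: Dim selectivity is the surviving fraction of Dim rows.
    dim_selectivity = dim_rows_after_filter / dim_row_count
    # Each Fact row matches at most one Dim row on the primary key, so the
    # join behaves like a filter with this selectivity on the Fact table.
    return fact_row_count * dim_selectivity


if __name__ == "__main__":
    # Made-up numbers: 1M fact rows, 1000 dim rows, 250 of which pass the filter.
    print(fact_dim_join_cardinality(1_000_000, 1000, 250))
```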

Whether a column is a key is inferred from either of the following:
* table rowCount = column ndv
* (tbd shortly) table rowCount = (maxVal - minVal)
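
A rough sketch of the two key-inference rules listed above. The function name and parameters are hypothetical. The second rule is marked tbd in this issue and is taken literally here as rowCount = (maxVal - minVal); whether an off-by-one adjustment or a tolerance for approximate statistics is wanted is left open.

```python
def is_probable_key(row_count, ndv=None, min_val=None, max_val=None):
    """Return True if column statistics suggest the column is a key."""
    # Rule 1: every row carries a distinct value.
    if ndv is not None and ndv == row_count:
        return True
    # Rule 2 (tbd in the issue): the value range matches the row count.
    if min_val is not None and max_val is not None:
        if (max_val - min_val) == row_count:
            return True
    return False


if __name__ == "__main__":
    print(is_probable_key(row_count=365, ndv=365))                  # rule 1 applies
    print(is_probable_key(row_count=365, ndv=12))                   # neither rule applies
    print(is_probable_key(row_count=100, min_val=0, max_val=100))   # rule 2 applies
```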



--
This message was sent by Atlassian JIRA
(v6.2#6252)
