Harish Butani created HIVE-7905:
-----------------------------------

             Summary: CBO: more cost model changes
                 Key: HIVE-7905
                 URL: https://issues.apache.org/jira/browse/HIVE-7905
             Project: Hive
          Issue Type: Sub-task
            Reporter: Harish Butani
            Assignee: Harish Butani

1. For composite predicates, smooth the selectivity calculation using +exponential backoff+. Thanks to [~mmokhtar] for this formula.
{quote}
Can you change the algorithm to use exponential back-off:
ndv(pe0) * ndv(pe1) ^(1/2) * ndv(pe2) ^(1/4) * ndv(pe3) ^(1/8)
As opposed to:
ndv(pex)*log(ndv(pe1))*log(ndv(pe2))
If we assume selectivity of 0.7 for each store_sales join then join selectivity can end up being 6.24285E-05, which is too low and eventually results in a suboptimal plan.
{quote}
See attached picture.

2. In case of Fact - Dim joins on the Dim primary key, we infer the Join cardinality as a filter on the Fact table:
{code}
join card = rowCount(Fact table) * selectivity(dim table)
{code}
Whether a Column is a Key is inferred based on either:
* table rowCount = column ndv
* (tbd shortly) table rowCount = (maxVal - minVal)
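
To make the two rules concrete, here is a minimal Python sketch. It is not the Hive implementation. The function names (backoff_ndv, is_key, fact_dim_join_cardinality) are hypothetical, and it assumes the NDVs are passed in the order pe0, pe1, pe2, ... since the issue does not say how the predicate elements are ordered. Only the first key-inference rule (rowCount = ndv) is sketched; the min/max rule is still tbd.

```python
def backoff_ndv(ndvs):
    """Combine per-predicate-element NDVs with exponential back-off.

    The i-th NDV (0-based, in the order given by the caller) is raised
    to the power 1 / 2**i, i.e.
    ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ...
    The ordering of the inputs is an assumption of this sketch.
    """
    combined = 1.0
    for i, ndv in enumerate(ndvs):
        combined *= ndv ** (1.0 / 2 ** i)
    return combined


def is_key(table_row_count, column_ndv):
    """Infer that a column is a key when its NDV equals the table row count."""
    return table_row_count == column_ndv


def fact_dim_join_cardinality(fact_row_count, dim_selectivity):
    """join card = rowCount(Fact table) * selectivity(dim table)."""
    return fact_row_count * dim_selectivity


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
    print(backoff_ndv([100, 16, 16]))
    print(is_key(table_row_count=1000, column_ndv=1000))
    print(fact_dim_join_cardinality(1_000_000, 0.1))
```

With the hypothetical inputs above, the back-off product is 100 * 4 * 2, about 800, where a plain product of the same NDVs would be 25600. A Dim filter with selectivity 0.1 applied to a 1,000,000-row Fact table gives a join cardinality of about 100,000.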