[ https://issues.apache.org/jira/browse/HIVE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332208#comment-15332208 ]
Ashutosh Chauhan commented on HIVE-14018:
-----------------------------------------
+1
> Make IN clause row selectivity estimation customizable
> ------------------------------------------------------
>
> Key: HIVE-14018
> URL: https://issues.apache.org/jira/browse/HIVE-14018
> Project: Hive
> Issue Type: Improvement
> Components: Statistics
> Affects Versions: 2.1.0, 2.2.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Minor
> Attachments: HIVE-14018.patch
>
>
> After HIVE-13287 went in, we calculate IN clause estimates natively (instead
> of just dividing the incoming number of rows by 2). However, because the
> values of each column are assumed to be uniformly distributed, we might end up
> heavily underestimating or overestimating the resulting number of rows.
> This issue adds a factor that multiplies the IN clause estimate so that we
> can alleviate this problem. The solution is not very elegant, but it is the
> best we can do until we have histograms to improve our estimates.
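
For illustration only, the idea described above might be sketched roughly as
follows. This is not the code from the attached patch: the function name, the
parameter names (including in_factor), and the clamping behaviour are
assumptions made for the example. It assumes that, under a uniform
distribution, each distinct value accounts for 1/NDV of the rows, and that the
factor simply scales the resulting selectivity.

```python
def estimate_in_rows(num_rows, ndv, num_in_values, in_factor=1.0):
    """Estimate rows passing `col IN (...)` assuming uniformly distributed values.

    All names here are hypothetical; in_factor stands in for the
    customizable multiplier proposed in this issue.
    """
    if num_rows <= 0 or ndv <= 0:
        return 0
    # Uniform assumption: each distinct value matches 1/ndv of the rows.
    selectivity = min(1.0, num_in_values / ndv)
    # Scale by the tuning factor, but never estimate more rows than came in.
    scaled = min(1.0, selectivity * in_factor)
    return round(num_rows * scaled)


if __name__ == "__main__":
    # 1000 rows, 100 distinct values, 5 values in the IN list.
    print(estimate_in_rows(1000, 100, 5))                 # factor 1.0
    print(estimate_in_rows(1000, 100, 5, in_factor=2.0))  # compensates for skew
```

With a skewed column, where the listed values are more frequent than average,
a factor above 1.0 raises the estimate. A factor below 1.0 lowers it when the
listed values are rare.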