[ 
https://issues.apache.org/jira/browse/HIVE-13233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186751#comment-15186751
 ] 

Jesus Camacho Rodriguez commented on HIVE-13233:
------------------------------------------------

[~ashutoshc], thanks for checking.

1) <= and < are also taken into account (see line 747 in 
StatsRulesProcFactory), so the _else_ block in {{evaluateComparator}} refers 
to them.

2) Checking nested expressions is already taken care of by 
{{evaluateChildExpr}}. In fact, we call {{evaluateComparator}} only when we 
have found an expression a _CMP_ b, where _CMP_ is >=, >, <=, or <. Previously, 
we were just returning 1/3 of the rows for these cases.
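
To illustrate the idea, here is a minimal sketch of how min/max could tighten the estimate for a _CMP_ b. It is not copied from the patch: the class, enum, and method names are hypothetical, and it assumes a numeric column with no NULLs. When the constant lies outside [min, max] the answer is known exactly (no rows or all rows); otherwise the sketch falls back to the old 1/3 heuristic.

```java
// Hypothetical sketch; names are illustrative and not taken from StatsRulesProcFactory.
final class ComparatorEstimateSketch {

    // The four comparison operators handled for "col CMP value".
    enum Cmp { LT, LE, GT, GE }

    /**
     * Estimates how many of numRows satisfy "col CMP value", given the
     * column's min and max. Assumes a numeric column without NULLs.
     */
    static long estimateRows(Cmp cmp, double value, double min, double max, long numRows) {
        switch (cmp) {
            case GT:
                if (value >= max) return 0;        // nothing is strictly greater than max
                if (value < min) return numRows;   // every row exceeds the constant
                break;
            case GE:
                if (value > max) return 0;
                if (value <= min) return numRows;
                break;
            case LT:
                if (value <= min) return 0;        // nothing is strictly less than min
                if (value > max) return numRows;
                break;
            case LE:
                if (value < min) return 0;
                if (value >= max) return numRows;
                break;
        }
        // Constant falls inside [min, max]: keep the previous 1/3 heuristic.
        return numRows / 3;
    }

    public static void main(String[] args) {
        // Column with min = 0, max = 50, and 900 rows.
        System.out.println(estimateRows(Cmp.GT, 100, 0, 50, 900)); // 0
        System.out.println(estimateRows(Cmp.GT, -1, 0, 50, 900));  // 900
        System.out.println(estimateRows(Cmp.LE, 25, 0, 50, 900));  // 300
    }
}
```

A finer estimate for the in-range case (for example, interpolating under a uniform-distribution assumption) would be possible, but that goes beyond what is described here.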

> Use min and max values to estimate better stats for comparison operators
> ------------------------------------------------------------------------
>
>                 Key: HIVE-13233
>                 URL: https://issues.apache.org/jira/browse/HIVE-13233
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 2.1.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-13233.patch
>
>
> We should use the min/max values of each column to estimate more precisely 
> the number of rows produced by expressions with comparison operators.


