[ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-4561:
---------------------------------------

    Attachment: HIVE-4561.2.patch

Uploaded a new patch.
When all the long values are positive, we now get the correct min. When all 
the values are negative, we now get the correct max.
The unit test "compute_stats_long.q" reads its values from data/files/int.txt, 
where every value is above zero. The original test output recorded a min value 
of "0", but the correct min value is "4". This patch fixes the bug.
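
To illustrate the symptom, here is a minimal sketch in Python. It is not the Hive code: the function names are hypothetical, and the zero-seeded accumulator is only an assumption about the kind of pattern that produces the output reported in this issue. The second function shows one possible remedy, seeding min and max from the first non-null value.

```python
def min_max_zero_seeded(values):
    """Assumed buggy pattern: min and max both start at 0."""
    lo, hi = 0, 0
    for v in values:
        lo = min(lo, v)
        hi = max(hi, v)
    return lo, hi


def min_max_first_value_seeded(values):
    """Possible fix: seed min and max from the first non-null value."""
    lo = hi = None
    for v in values:
        if v is None:
            continue  # nulls do not take part in min/max
        if lo is None:
            lo = hi = v
        else:
            lo = min(lo, v)
            hi = max(hi, v)
    return lo, hi
```

With the values 1.0, 2.0, 3.0 from the report, the zero-seeded version returns a min of 0 while the first-value-seeded version returns 1.0. For an all-negative column the same thing happens to the max.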
                
> Column stats: LOW_VALUE (or HIGH_VALUE) will always be 0.0000 if all the 
> column values are larger than 0.0 (or if all column values are smaller than 0.0)
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-4561
>                 URL: https://issues.apache.org/jira/browse/HIVE-4561
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 0.12.0
>            Reporter: caofangkun
>            Assignee: Zhuoluo (Clark) Yang
>         Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch
>
>
> If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0;
> if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
> hive (default)> create table src_test (price double);
> hive (default)> load data local inpath './test.txt' into table src_test;
> hive (default)> select * from src_test;
> OK
> 1.0
> 2.0
> 3.0
> Time taken: 0.313 seconds, Fetched: 3 row(s)
> hive (default)> analyze table src_test compute statistics for columns price;
> mysql> select * from TAB_COL_STATS \G;
>                  CS_ID: 16
>                DB_NAME: default
>             TABLE_NAME: src_test
>            COLUMN_NAME: price
>            COLUMN_TYPE: double
>                 TBL_ID: 2586
>         LONG_LOW_VALUE: 0
>        LONG_HIGH_VALUE: 0
>       DOUBLE_LOW_VALUE: 0.0000   # Wrong Result ! Expected is 1.0000
>      DOUBLE_HIGH_VALUE: 3.0000
>  BIG_DECIMAL_LOW_VALUE: NULL
> BIG_DECIMAL_HIGH_VALUE: NULL
>              NUM_NULLS: 0
>          NUM_DISTINCTS: 1
>            AVG_COL_LEN: 0.0000
>            MAX_COL_LEN: 0
>              NUM_TRUES: 0
>             NUM_FALSES: 0
>          LAST_ANALYZED: 1368596151
> 2 rows in set (0.00 sec)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira