[
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676611#comment-13676611
]
Zhuoluo (Clark) Yang commented on HIVE-4561:
--------------------------------------------
[~shreepadma] I think I was wrong. Originally, I wanted to "return" early like this:
{code}
@@ -189,6 +187,11 @@
statsObj.setStatsData(statsData);
}
} else {
+ // Any null object, such as min/max value of an empty table,
+ // need not be unpacked.
+ if (o == null) {
+ return;
+ }
// invoke the right unpack method depending on data type of the column
if (statsObj.getStatsData().isSetBooleanStats()) {
unpackBooleanStats(oi, o, fieldName, statsObj);
{code}
However, I've found that LongColumnStatsData.highValue is a required field in the
Thrift definition. Leaving it unset would also require modifying ObjectStore so
that it checks LongColumnStatsData.isSetHighValue(). Any suggestions? Thanks!
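To make the trade-off concrete, here is a minimal sketch of the "skip null, then check isSet" idea. It assumes the bounds could be left unset, which is exactly what the required Thrift field currently prevents. LongBounds, unpack and highValueToPersist are hypothetical stand-ins written for illustration; they are not the Thrift-generated LongColumnStatsData or the real ObjectStore code.

```java
public class OptionalBoundsSketch {

    /** Hypothetical holder that remembers whether each bound was ever assigned. */
    static final class LongBounds {
        private long lowValue;
        private long highValue;
        private boolean lowSet;
        private boolean highSet;

        void setLowValue(long v) { lowValue = v; lowSet = true; }
        void setHighValue(long v) { highValue = v; highSet = true; }
        boolean isSetLowValue() { return lowSet; }
        boolean isSetHighValue() { return highSet; }
        long getLowValue() { return lowValue; }
        long getHighValue() { return highValue; }
    }

    /** Unpacks one aggregated field; a null object leaves the bound unset. */
    static void unpack(String fieldName, Long o, LongBounds stats) {
        if (o == null) {
            return; // e.g. min/max value of an empty table
        }
        if ("min".equals(fieldName)) {
            stats.setLowValue(o);
        } else if ("max".equals(fieldName)) {
            stats.setHighValue(o);
        }
    }

    /** What a persistence layer might store: null rather than a misleading 0. */
    static Long highValueToPersist(LongBounds stats) {
        return stats.isSetHighValue() ? Long.valueOf(stats.getHighValue()) : null;
    }

    public static void main(String[] args) {
        LongBounds stats = new LongBounds();
        unpack("max", null, stats);
        System.out.println(highValueToPersist(stats)); // prints "null"
        unpack("max", 7L, stats);
        System.out.println(highValueToPersist(stats)); // prints "7"
    }
}
```

The point of the sketch is that every reader of the bounds has to consult the isSet flag first, which is why the early return alone is not enough.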
> Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the
> column values larger than 0.0 (or if all column values smaller than 0.0)
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-4561
> URL: https://issues.apache.org/jira/browse/HIVE-4561
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Affects Versions: 0.12.0
> Reporter: caofangkun
> Assignee: Zhuoluo (Clark) Yang
> Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch,
> HIVE-4561.4.patch
>
>
> If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0;
> or, if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
> hive (default)> create table src_test (price double);
> hive (default)> load data local inpath './test.txt' into table src_test;
> hive (default)> select * from src_test;
> OK
> 1.0
> 2.0
> 3.0
> Time taken: 0.313 seconds, Fetched: 3 row(s)
> hive (default)> analyze table src_test compute statistics for columns price;
> mysql> select * from TAB_COL_STATS \G;
> CS_ID: 16
> DB_NAME: default
> TABLE_NAME: src_test
> COLUMN_NAME: price
> COLUMN_TYPE: double
> TBL_ID: 2586
> LONG_LOW_VALUE: 0
> LONG_HIGH_VALUE: 0
> DOUBLE_LOW_VALUE: 0.0000 # Wrong Result ! Expected is 1.0000
> DOUBLE_HIGH_VALUE: 3.0000
> BIG_DECIMAL_LOW_VALUE: NULL
> BIG_DECIMAL_HIGH_VALUE: NULL
> NUM_NULLS: 0
> NUM_DISTINCTS: 1
> AVG_COL_LEN: 0.0000
> MAX_COL_LEN: 0
> NUM_TRUES: 0
> NUM_FALSES: 0
> LAST_ANALYZED: 1368596151
> 2 rows in set (0.00 sec)
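The symptom above (a low value of 0.0000 for the data 1.0, 2.0, 3.0) is what a running minimum produces when it is seeded with 0.0 instead of with the data. The sketch below illustrates that pattern only; it is an assumption about the kind of bug involved, not a copy of Hive's aggregation code, and both method names are hypothetical.

```java
public class MinSeedSketch {

    /** Problematic variant: a minimum seeded with 0.0 can never rise above 0.0. */
    static double minSeededWithZero(double[] values) {
        double min = 0.0;
        for (double v : values) {
            min = Math.min(min, v);
        }
        return min;
    }

    /** Seeds the minimum from the data itself; returns null for empty input. */
    static Double minSeededFromData(double[] values) {
        Double min = null;
        for (double v : values) {
            if (min == null || v < min) {
                min = v;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        double[] prices = {1.0, 2.0, 3.0};
        System.out.println(minSeededWithZero(prices)); // prints "0.0"
        System.out.println(minSeededFromData(prices)); // prints "1.0"
    }
}
```

The same reasoning applies in mirror image to a maximum seeded with 0.0 when every value is negative.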