[
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638659#comment-14638659
]
Simone commented on HIVE-11266:
-------------------------------
Yes, I agree with you; the external table is just my personal use case.
> count(*) wrong result based on table statistics
> -----------------------------------------------
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Simone
> Priority: Critical
>
> Hive returns a wrong count on an external table with table statistics if
> I change the table's data files.
> This is the scenario in detail:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in 'my_location' directory;
> 4) select count(*) from my_table;
> In this case the count query doesn't launch an MR job and returns a result
> based on table statistics. This result is wrong because it is based on the
> statistics stored in the Hive metastore and doesn't take into account the
> modifications made to the data files.
> Obviously, with "hive.compute.query.using.stats" set to FALSE this problem
> doesn't occur, but the default value of this property is TRUE.
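> A minimal sketch of the workaround, assuming the table my_table from the
> steps above (either option alone should be enough; neither is confirmed
> here as the project's recommended fix):
>
> ```sql
> -- Option 1: disable stats-based answers for this session, so that
> -- count(*) reads the data files instead of the metastore statistics.
> set hive.compute.query.using.stats=false;
> select count(*) from my_table;
>
> -- Option 2 (assumption: recomputing is acceptable after each change):
> -- refresh the stored statistics once the files in 'my_location' change.
> analyze table my_table compute statistics;
> select count(*) from my_table;
> ```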
> I think that this post on Stack Overflow, which shows another type of bug
> in the case of multiple inserts, is also related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table