[ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11266:
------------------------------------
            Priority: Blocker  (was: Critical)
    Target Version/s: 3.0.0, 2.4.0, 2.3.1

> count(*) wrong result based on table statistics for external tables
> -------------------------------------------------------------------
>
>                 Key: HIVE-11266
>                 URL: https://issues.apache.org/jira/browse/HIVE-11266
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Simone Battaglia
>            Priority: Blocker
>
> Hive returns wrong count result on an external table with table statistics if 
> I change table data files.
> This is the scenario in details:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate a MR job and returns the result 
> based on table statistics. This result is wrong because is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications introduced on data files.
> Obviously setting "hive.compute.query.using.stats" to FALSE this problem 
> doesn't occur but the default value of this property is TRUE.
> I thinks that also this post on stackoverflow, that shows another type of bug 
> in case of multiple insert, is related to the one that I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to