[ 
https://issues.apache.org/jira/browse/HIVE-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560351#comment-16560351
 ] 

BELUGA BEHR edited comment on HIVE-19489 at 7/27/18 9:23 PM:
-------------------------------------------------------------

There is already such a flag and it is mentioned in [HIVE-18743].

My suggestion would be to use this flag (though renamed, since I dislike the 
"do_not_" prefix).

Users could manually set it at the table properties level, but by default it 
would be set to 'true' for managed tables and 'false' for external tables.
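
As a rough illustration of the default described above, here is a minimal Java sketch. It is not Hive code: the property key "example.stats.autogather" and the class and method names are hypothetical, since the thread does not say what the renamed flag would be called. It only shows an explicit table property taking precedence over a default that depends on the table type.

```java
import java.util.Map;

final class StatsFlagDefault {
    // Hypothetical property key; the thread does not name the renamed flag.
    static final String FLAG = "example.stats.autogather";

    /**
     * Returns the effective flag value for a table.
     * An explicit table property wins; otherwise the default is
     * true for managed tables and false for external tables.
     */
    static boolean effectiveFlag(Map<String, String> tableProps, boolean isExternal) {
        String explicit = tableProps.get(FLAG);
        if (explicit != null) {
            return Boolean.parseBoolean(explicit);
        }
        return !isExternal;
    }

    public static void main(String[] args) {
        // Managed table with no explicit property.
        System.out.println(effectiveFlag(Map.of(), false));
        // External table with no explicit property.
        System.out.println(effectiveFlag(Map.of(), true));
        // External table where the user opted in manually.
        System.out.println(effectiveFlag(Map.of(FLAG, "true"), true));
    }
}
```

Resolving the value per table, rather than from a global setting, is what would let a user opt a single external table back in.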


was (Author: belugabehr):
There is already such a flag and it is mentioned in [HIVE-18743].

My suggestion would be to use this flag (though renamed, since I dislike the 
"do_not_" prefix).

Users could manually set it, but by default it would be set to 'true' for 
managed tables and 'false' for external tables.

> Disable stats autogather for external tables
> --------------------------------------------
>
>                 Key: HIVE-19489
>                 URL: https://issues.apache.org/jira/browse/HIVE-19489
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Statistics
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>
> Hive auto-gather of table statistics can result in incorrect generation of 
> stats (and the stats being marked as accurate) in the case of external tables 
> where the data is being written by external apps.
> To avoid this issue, stats autogather will be disabled on external tables 
> when loading/inserting into a table with existing data, if 
> HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS is enabled. In this situation, 
> users should rely on explicitly calling ANALYZE TABLE on their external 
> tables to make sure the stats are kept up-to-date.
> Autogather of stats will still be allowed to occur on external tables in the 
> case of INSERT OVERWRITE or LOAD DATA OVERWRITE, since the existing data is 
> being removed and so the stats calculated on the inserted/loaded data should 
> be accurate.
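
The rule in the issue description can be read as a small decision function. The Java sketch below is only an illustration under stated assumptions, not the actual Hive implementation: the class, method, and parameter names are hypothetical, and it assumes stats autogather is otherwise enabled.

```java
final class ExternalTableStatsPolicy {
    /**
     * Decides whether stats autogather should run for a load/insert.
     *
     * @param isExternal              the target is an external table
     * @param isOverwrite             the statement is INSERT OVERWRITE or LOAD DATA OVERWRITE
     * @param hasExistingData         the target already contains data
     * @param disableUnsafeExternalOps stands in for HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS
     */
    static boolean shouldAutogather(boolean isExternal, boolean isOverwrite,
                                    boolean hasExistingData, boolean disableUnsafeExternalOps) {
        if (!isExternal) {
            return true;  // the rule only concerns external tables
        }
        if (!disableUnsafeExternalOps) {
            return true;  // the restriction applies only when the setting is enabled
        }
        if (isOverwrite) {
            return true;  // existing data is removed, so the new stats should be accurate
        }
        // Appending to existing data: other apps may have written files Hive never saw.
        return !hasExistingData;
    }

    public static void main(String[] args) {
        // Plain INSERT into an external table that already has data.
        System.out.println(shouldAutogather(true, false, true, true));
        // INSERT OVERWRITE into the same table.
        System.out.println(shouldAutogather(true, true, true, true));
    }
}
```

When the function returns false, the description expects users to run ANALYZE TABLE explicitly to refresh the stats.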



