[
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172752#comment-16172752
]
Alexander Behm commented on HIVE-15653:
---------------------------------------
[~ctang.ma], am I understanding correctly that there is no interest in fixing
this issue on the Metastore side? I understand that all clients can pass
DO_NOT_UPDATE_STATS.
> Some ALTER TABLE commands drop table stats
> ------------------------------------------
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
> Issue Type: Bug
> Components: Metastore, Statistics
> Affects Versions: 1.1.0
> Reporter: Alexander Behm
> Assignee: Chaoyu Tang
> Priority: Critical
> Fix For: 2.3.0
>
> Attachments: HIVE-15653.1.patch, HIVE-15653.2.patch,
> HIVE-15653.3.patch, HIVE-15653.4.patch, HIVE-15653.5.patch,
> HIVE-15653.6.patch, HIVE-15653.patch
>
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some
> ALTER TABLE operations, but certainly not for others. Personally, I I think
> ALTER TABLE should only change what was requested by the user without any
> side effects that may be unclear to users. In particular, collecting stats
> can be an expensive operation so it's rather inconvenient for users if they
> get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_name data_type comment
>
> i int
>
> # Detailed Table Information
> Database: default
> Owner: abehm
> CreateTime: Tue Jan 17 18:13:34 PST 2017
> LastAccessTime: UNKNOWN
> Protect Mode: None
> Retention: 0
> Location: hdfs://localhost:20500/test-warehouse/t
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE false
> last_modified_by abehm
> last_modified_time 1484705748
> numFiles 1
> numRows -1
> rawDataSize -1
> test test
> totalSize 2
> transient_lastDdlTime 1484705748
>
> # Storage Information
> SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed: No
> Num Buckets: -1
> Bucket Columns: []
> Sort Columns: []
> Storage Desc Params:
> serialization.format 1
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)