[
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chaoyu Tang updated HIVE-15653:
-------------------------------
Attachment: HIVE-15653.patch
For most ALTER TABLE operations, such as table rename, add columns, and change
column type (besides set table properties), the table stats status should not
change. For some other operations, such as update statistics and change
location, the basic stats status should change.
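As a rough illustration of the split described above (this is not the code in the attached patch), the rule could be sketched as a lookup over operation kinds. The class name, enum, and method below are hypothetical and are not Hive APIs; the set of operations only mirrors the examples named in this comment.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch only: none of these names come from Hive or the patch.
public class AlterTableStatsPolicy {

    // Illustrative operation kinds, limited to the examples named in the comment.
    public enum AlterOp {
        RENAME,
        ADD_COLUMNS,
        CHANGE_COLUMN_TYPE,
        UPDATE_STATISTICS,
        CHANGE_LOCATION
    }

    // Operations for which the basic stats status is expected to change.
    private static final Set<AlterOp> STATS_AFFECTING =
            EnumSet.of(AlterOp.UPDATE_STATISTICS, AlterOp.CHANGE_LOCATION);

    // Returns true if the operation should be allowed to change the basic
    // stats status; every other operation should leave the stats untouched.
    public static boolean mayChangeBasicStatsStatus(AlterOp op) {
        return STATS_AFFECTING.contains(op);
    }

    public static void main(String[] args) {
        // Print the decision for each operation kind.
        for (AlterOp op : AlterOp.values()) {
            System.out.println(op + " may change stats status: "
                    + mayChangeBasicStatsStatus(op));
        }
    }
}
```

Keeping the stats-affecting operations in one explicit set means any operation not listed preserves the existing stats by default, which matches the expectation in the issue description that ALTER TABLE should have no unrequested side effects.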
[~pxiong] could you review the patch?
> Some ALTER TABLE commands drop table stats
> ------------------------------------------
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 1.1.0
> Reporter: Alexander Behm
> Assignee: Chaoyu Tang
> Priority: Critical
> Attachments: HIVE-15653.patch
>
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some
> ALTER TABLE operations, but certainly not for others. Personally, I think
> ALTER TABLE should only change what was requested by the user without any
> side effects that may be unclear to users. In particular, collecting stats
> can be an expensive operation so it's rather inconvenient for users if they
> get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_name data_type comment
>
> i int
>
> # Detailed Table Information
> Database: default
> Owner: abehm
> CreateTime: Tue Jan 17 18:13:34 PST 2017
> LastAccessTime: UNKNOWN
> Protect Mode: None
> Retention: 0
> Location: hdfs://localhost:20500/test-warehouse/t
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE false
> last_modified_by abehm
> last_modified_time 1484705748
> numFiles 1
> numRows -1
> rawDataSize -1
> test test
> totalSize 2
> transient_lastDdlTime 1484705748
>
> # Storage Information
> SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed: No
> Num Buckets: -1
> Bucket Columns: []
> Sort Columns: []
> Storage Desc Params:
> serialization.format 1
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.