Chaoyu Tang created HIVE-16572:
----------------------------------

             Summary: Rename a partition should not drop its column stats
                 Key: HIVE-16572
                 URL: https://issues.apache.org/jira/browse/HIVE-16572
             Project: Hive
          Issue Type: Bug
          Components: Statistics
            Reporter: Chaoyu Tang
            Assignee: Chaoyu Tang


The column stats for partition (dummy=1) of table sample_pt are as follows:
{code}
hive> describe formatted sample_pt partition (dummy=1) code;
OK
# col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment

code        string               0          303             6.985        7                                   from deserializer
Time taken: 0.259 seconds, Fetched: 3 row(s)
{code}
But when this partition is renamed, for example with
alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
the COLUMN_STATS entries in the partition description still show true, but the
column stats have actually all been deleted:
{code}
hive> describe formatted sample_pt partition (dummy=11);
OK
# col_name              data_type               comment             
                 
code                    string                                      
description             string                                      
salary                  int                                         
total_emp               int                                         
                 
# Partition Information          
# col_name              data_type               comment             
                 
dummy                   int                                         
                 
# Detailed Partition Information                 
Partition Value:        [11]                     
Database:               default                  
Table:                  sample_pt                
CreateTime:             Thu Mar 30 23:03:59 EDT 2017     
LastAccessTime:         UNKNOWN                  
Location:               file:/user/hive/warehouse/apache/sample_pt/dummy=11
Partition Parameters:            
        COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
        numFiles                1                   
        numRows                 200                 
        rawDataSize             10228               
        totalSize               10428               
        transient_lastDdlTime   1490929439          
                 
# Storage Information            
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No                       
Num Buckets:            -1                       
Bucket Columns:         []                       
Sort Columns:           []                       
Storage Desc Params:             
        serialization.format    1                   
Time taken: 6.783 seconds, Fetched: 37 row(s)

===
hive> describe formatted sample_pt partition (dummy=11) code;
OK
# col_name              data_type               comment

code                    string                  from deserializer
Time taken: 9.429 seconds, Fetched: 3 row(s)
{code}
The column stats should not be dropped when a partition is renamed.
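
A minimal sequence that should reproduce this is sketched below. The table definition is inferred from the describe output above, the inserted row is made-up sample data, and the analyze step is an assumption about how the column stats were originally computed; none of these statements appear in the original report.
{code:sql}
-- Schema inferred from the describe output (assumption).
create table sample_pt (code string, description string, salary int, total_emp int)
partitioned by (dummy int);

-- Made-up sample row; any data would do.
insert into table sample_pt partition (dummy=1) values ('a', 'b', 1, 2);

-- Assumed step: compute column stats for the partition.
analyze table sample_pt partition (dummy=1) compute statistics for columns;

-- Column stats for "code" are shown here.
describe formatted sample_pt partition (dummy=1) code;

-- Rename the partition, as in the report.
alter table sample_pt partition (dummy=1) rename to partition (dummy=11);

-- Per the report, the column stats no longer appear here,
-- while COLUMN_STATS_ACCURATE still lists every column as true.
describe formatted sample_pt partition (dummy=11) code;
describe formatted sample_pt partition (dummy=11);
{code}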



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
