[ 
https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999650#comment-15999650
 ] 

Chaoyu Tang commented on HIVE-16572:
------------------------------------

The test failure is not related to the patch.

> Rename a partition should not drop its column stats
> ---------------------------------------------------
>
>                 Key: HIVE-16572
>                 URL: https://issues.apache.org/jira/browse/HIVE-16572
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Chaoyu Tang
>            Assignee: Chaoyu Tang
>         Attachments: HIVE-16572.1.patch, HIVE-16572.patch
>
>
> The column stats for partition (dummy=1) of table sample_pt are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
>
> code        string               0          303             6.985        7                                   from deserializer
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, for example with
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> the COLUMN_STATS flags in the partition description are still true, even
> though the column stats have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_name                    data_type               comment             
>                
> code                  string                                      
> description           string                                      
> salary                int                                         
> total_emp             int                                         
>                
> # Partition Information                
> # col_name                    data_type               comment             
>                
> dummy                 int                                         
>                
> # Detailed Partition Information               
> Partition Value:      [11]                     
> Database:             default                  
> Table:                sample_pt                
> CreateTime:           Thu Mar 30 23:03:59 EDT 2017     
> LastAccessTime:       UNKNOWN                  
> Location:             file:/user/hive/warehouse/apache/sample_pt/dummy=11
> Partition Parameters:          
>       COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>       numFiles                1                   
>       numRows                 200                 
>       rawDataSize             10228               
>       totalSize               10428               
>       transient_lastDdlTime   1490929439          
>                
> # Storage Information          
> SerDe Library:        org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:          org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:         org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:           No                       
> Num Buckets:          -1                       
> Bucket Columns:       []                       
> Sort Columns:         []                       
> Storage Desc Params:           
>       serialization.format    1                   
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_name                    data_type               comment
>
> code                  string                  from deserializer
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.
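
The inconsistency described above (COLUMN_STATS flags still "true" while the stats themselves are gone) can be illustrated with a small sketch. This is not Hive code: the helper names flagged_columns and stale_flags are hypothetical, the parameter value is the one shown for partition (dummy=11) with the display escaping removed, and how the set of columns that really have stats is collected (for example, by running describe formatted per column) is assumed to happen elsewhere.

```python
import json


def flagged_columns(column_stats_accurate):
    """Return the columns that COLUMN_STATS_ACCURATE marks as having accurate stats."""
    state = json.loads(column_stats_accurate)
    column_flags = state.get("COLUMN_STATS", {})
    return {col for col, flag in column_flags.items() if flag == "true"}


def stale_flags(column_stats_accurate, columns_with_stats):
    """Return columns flagged as accurate that have no stats behind the flag."""
    return flagged_columns(column_stats_accurate) - set(columns_with_stats)


# Parameter value reported for partition (dummy=11), unescaped for JSON parsing.
param = (
    '{"BASIC_STATS":"true","COLUMN_STATS":{"code":"true",'
    '"description":"true","salary":"true","total_emp":"true"}}'
)

# Per the report, no column has stats after the rename, so every flag is stale.
print(sorted(stale_flags(param, set())))
```

With the values from the report, every flagged column is returned as stale, which is the state the rename should not leave behind.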



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)