[ 
https://issues.apache.org/jira/browse/HIVE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6968:
-----------------------------

    Status: Patch Available  (was: Open)

> list bucketing feature does not update the location map for unpartitioned 
> tables
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-6968
>                 URL: https://issues.apache.org/jira/browse/HIVE-6968
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>         Attachments: HIVE-6968.1.patch
>
>
> The list bucketing feature maintains a map from skewed column values to 
> their storage locations in the metastore. This map is not updated for 
> unpartitioned tables; for partitioned tables it is updated properly. To 
> reproduce the issue:
> {code}
> hive>set hive.mapred.supports.subdirectories=true;
> hive>set mapred.input.dir.recursive=true;
> hive>create table t(col1 string, col2 string);
> hive>load  data local inpath '/home/hadoop/a.txt' into table t; 
> hive>select * from t;
> OK
> 1     a
> 2     b
> 3     c
> 4     a
> 5     b
> 6     a
> hive>create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as 
> directories;
> hive>insert into table t1 select * from t;
> hive>desc extended t1;
> OK
> r1                    string                                      
> r2                    string                                      
>                
> Detailed Table Information    Table(tableName:t1, dbName:default, 
> owner:pjayachandran, createTime:1398295903, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:r1, type:string, comment:null), 
> FieldSchema(name:r2, type:string, comment:null)], 
> location:file:/app/warehouse/t1, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[r2], 
> skewedColValues:[[a]], skewedColValueLocationMaps:{}), 
> storedAsSubDirectories:true), partitionKeys:[], parameters:{numFiles=6, 
> COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1398297887, numRows=6, 
> totalSize=72, rawDataSize=18}, viewOriginalText:null, viewExpandedText:null, 
> tableType:MANAGED_TABLE)     
> Time taken: 0.119 seconds, Fetched: 4 row(s)
> {code}
> As the describe output shows, *skewedColValueLocationMaps* is empty, although 
> t1 now contains rows with the skewed value 'a'.
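>
> For comparison, a minimal sketch of the partitioned case, which is described 
> above as updating the location map properly. The table name t2 and the 
> partition column p are hypothetical and not part of the original report; the 
> source table t is the one loaded in the steps above. The describe output is 
> not reproduced here.
> {code}
> hive>set hive.mapred.supports.subdirectories=true;
> hive>set mapred.input.dir.recursive=true;
> -- hypothetical partitioned counterpart of t1, skewed on the same value
> hive>create table t2(r1 string, r2 string) partitioned by (p string) skewed by 
> (r2) on ('a') stored as directories;
> hive>insert into table t2 partition (p='1') select * from t;
> -- for a partitioned table the skewed info is inspected at partition level
> hive>desc extended t2 partition (p='1');
> {code}
> If the behavior is as described, *skewedColValueLocationMaps* in this output 
> should contain an entry for the skewed value 'a', unlike the unpartitioned 
> table t1.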



--
This message was sent by Atlassian JIRA
(v6.2#6252)
