[ 
https://issues.apache.org/jira/browse/HIVE-18582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345491#comment-16345491
 ] 

Vihang Karajgaonkar commented on HIVE-18582:
--------------------------------------------

This is interesting. It looks like when {{hive.msck.path.validation}} is set to 
{{skip}} or {{throw}}, msck throws an exception if there are empty partition 
directories. Moving "AbstractList<String> vals = null" after "while 
(iter.hasNext())", as suggested above, may not help: if {{vals}} is null, 
{{Warehouse.makeValsFromName}} will initialize it.
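
To make the message in the stack trace easier to follow, here is a minimal, 
self-contained sketch of the reuse pattern. {{makeVals}} is a hypothetical 
stand-in written for illustration, not the actual 
{{Warehouse.makeValsFromName}} source; it only assumes the two behaviors 
visible in this thread: a null list gets initialized, and a reused list whose 
size does not match the partition depth produces "Expected N components, got M".

```java
import java.util.ArrayList;
import java.util.List;

class ValsReuseSketch {
    // Hypothetical stand-in (assumption, not Hive code): allocate the list
    // when it is null, otherwise require it to match the name's depth.
    static List<String> makeVals(String name, List<String> result) {
        String[] parts = name.split("/");
        if (result == null) {
            result = new ArrayList<>();
            for (int i = 0; i < parts.length; i++) {
                result.add(null);
            }
        } else if (parts.length != result.size()) {
            throw new IllegalStateException("Expected " + result.size()
                + " components, got " + parts.length + " (" + name + ")");
        }
        for (int i = 0; i < parts.length; i++) {
            // Keep only the value part of each "key=value" component.
            int eq = parts[i].indexOf('=');
            result.set(i, parts[i].substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> vals = null;
        // First name has one level, so the list is created with size 1.
        vals = makeVals("log_date=2015063023", vals);
        System.out.println(vals); // prints [2015063023]
        try {
            // Second name has two levels but reuses the size-1 list.
            makeVals("log_date=2015121309/vgameid=lyjt", vals);
        } catch (IllegalStateException e) {
            // prints Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt)
            System.out.println(e.getMessage());
        }
    }
}
```

The two partition names are the ones from the HDFS listing in the description.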

This behavior may be by design, although I am not 100% sure. [~sershe] Do you 
know if this is intended behavior or a bug? Based on the description of the 
config, it should check only for "invalid" characters in the partition names, 
but it looks like it is throwing for empty partitions as well.
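
For reference, the three values of the setting map to behavior roughly as 
follows. This is a sketch that only mirrors the {{doValidate}} / {{doSkip}} 
derivation in the code quoted below; the class name and the {{describe}} 
helper are made up for illustration and are not part of Hive.

```java
class MsckValidationModeSketch {
    // Same flag derivation as the quoted msck() excerpt; the returned
    // strings are illustrative labels only.
    static String describe(String setting) {
        boolean doValidate = !"ignore".equals(setting);
        boolean doSkip = doValidate && "skip".equals(setting);
        if (!doValidate) {
            return "no validation";
        }
        // Anything other than "ignore" or "skip" is treated as "throw".
        return doSkip ? "skip invalid partitions" : "throw on invalid partitions";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ignore", "skip", "throw"}) {
            System.out.println(s + " -> " + describe(s));
        }
    }
}
```

Note that in the quoted code the {{makeValsFromName}} call sits inside the 
{{doValidate}} branch, so it runs for both {{skip}} and {{throw}} but not for 
{{ignore}}.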

>  MSCK REPAIR TABLE Throw MetaException
> --------------------------------------
>
>                 Key: HIVE-18582
>                 URL: https://issues.apache.org/jira/browse/HIVE-18582
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.1
>            Reporter: liubangchen
>            Priority: Major
>
> While executing the query MSCK REPAIR TABLE tablename I got this exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt))
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1847)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:402)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> --
> Caused by: MetaException(message:Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt))
> at org.apache.hadoop.hive.metastore.Warehouse.makeValsFromName(Warehouse.java:385)
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1845)
> {code}
> The table is PARTITIONED BY (log_date, vgameid).
> The directories on HDFS are:
>  
> {code:java}
> /usr/hive/warehouse/a.db/tablename/log_date=2015063023
> drwxr-xr-x - root supergroup 0 2018-01-26 09:41 /usr/hive/warehouse/a.db/tablename/log_date=2015121309/vgameid=lyjt
> {code}
> The log_date=2015063023 directory is empty.
> If I set hive.msck.path.validation=ignore, then MSCK REPAIR TABLE executes 
> OK.
> Then I found code like this:
> {code:java}
> private int msck(Hive db, MsckDesc msckDesc) {
>   CheckResult result = new CheckResult();
>   List<String> repairOutput = new ArrayList<String>();
>   try {
>     HiveMetaStoreChecker checker = new HiveMetaStoreChecker(db);
>     String[] names = Utilities.getDbTableName(msckDesc.getTableName());
>     checker.checkMetastore(names[0], names[1], msckDesc.getPartSpecs(), result);
>     List<CheckResult.PartitionResult> partsNotInMs = result.getPartitionsNotInMs();
>     if (msckDesc.isRepairPartitions() && !partsNotInMs.isEmpty()) {
>       // I think the bug is here
>       AbstractList<String> vals = null;
>       String settingStr = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_MSCK_PATH_VALIDATION);
>       boolean doValidate = !("ignore".equals(settingStr));
>       boolean doSkip = doValidate && "skip".equals(settingStr);
>       // The default setting is "throw"; assume doValidate && !doSkip means throw.
>       if (doValidate) {
>         // Validate that we can add partition without escaping. Escaping was originally intended
>         // to avoid creating invalid HDFS paths; however, if we escape the HDFS path (that we
>         // deem invalid but HDFS actually supports - it is possible to create HDFS paths with
>         // unprintable characters like ASCII 7), metastore will create another directory instead
>         // of the one we are trying to "repair" here.
>         Iterator<CheckResult.PartitionResult> iter = partsNotInMs.iterator();
>         while (iter.hasNext()) {
>           CheckResult.PartitionResult part = iter.next();
>           try {
>             vals = Warehouse.makeValsFromName(part.getPartitionName(), vals);
>           } catch (MetaException ex) {
>             throw new HiveException(ex);
>           }
>           for (String val : vals) {
>             String escapedPath = FileUtils.escapePathName(val);
>             assert escapedPath != null;
>             if (escapedPath.equals(val)) continue;
>             String errorMsg = "Repair: Cannot add partition " + msckDesc.getTableName()
>                 + ':' + part.getPartitionName() + " due to invalid characters in the name";
>             if (doSkip) {
>               repairOutput.add(errorMsg);
>               iter.remove();
>             } else {
>               throw new HiveException(errorMsg);
>             }
>           }
>         }
>       }
> {code}
> I think placing "AbstractList<String> vals = null;" after "while 
> (iter.hasNext()) {" would make this work.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
