[ 
https://issues.apache.org/jira/browse/HIVE-18582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347487#comment-16347487
 ] 

Sergey Shelukhin commented on HIVE-18582:
-----------------------------------------

This code should still respect the mode as specified - i.e. in throw case, it 
should throw on bad directories, not skip them.

Partition directories actually do not have to be under a table directory, so 
the code that does the sub-directory may be incorrect, unless the caller 
ensures they are.

Can you post on review board? There are some minor comments (typo in Valid, 
etc) that are easier to leave there.

>  MSCK REPAIR TABLE Throw MetaException
> --------------------------------------
>
>                 Key: HIVE-18582
>                 URL: https://issues.apache.org/jira/browse/HIVE-18582
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.1
>            Reporter: liubangchen
>            Assignee: liubangchen
>            Priority: Major
>         Attachments: HIVE-18582.patch
>
>
> while executing query MSCK REPAIR TABLE tablename I got Exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:Expected 1 components, got 2 
> (log_date=2015121309/vgameid=lyjt))
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1847)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:402)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> --
> Caused by: MetaException(message:Expected 1 components, got 2 
> (log_date=2015121309/vgameid=lyjt))
> at 
> org.apache.hadoop.hive.metastore.Warehouse.makeValsFromName(Warehouse.java:385)
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1845)
> {code}
> table PARTITIONED by (log_date,vgameid)
> The data file on HDFS is:
>  
> {code:java}
> /usr/hive/warehouse/a.db/tablename/log_date=2015063023
> drwxr-xr-x - root supergroup 0 2018-01-26 09:41 
> /usr/hive/warehouse/a.db/tablename/log_date=2015121309/vgameid=lyjt
> {code}
> The subdir of log_data=2015063023 is empty
> If i set  hive.msck.path.validation=ignore Then msck repair table will 
> executed ok.
> Then I found code like this:
> {code:java}
> private int msck(Hive db, MsckDesc msckDesc) {
>   CheckResult result = new CheckResult();
>   List<String> repairOutput = new ArrayList<String>();
>   try {
>     HiveMetaStoreChecker checker = new HiveMetaStoreChecker(db);
>     String[] names = Utilities.getDbTableName(msckDesc.getTableName());
>     checker.checkMetastore(names[0], names[1], msckDesc.getPartSpecs(), 
> result);
>     List<CheckResult.PartitionResult> partsNotInMs = 
> result.getPartitionsNotInMs();
>     if (msckDesc.isRepairPartitions() && !partsNotInMs.isEmpty()) {
>      //I think bug is here
>       AbstractList<String> vals = null;
>       String settingStr = HiveConf.getVar(conf, 
> HiveConf.ConfVars.HIVE_MSCK_PATH_VALIDATION);
>       boolean doValidate = !("ignore".equals(settingStr));
>       boolean doSkip = doValidate && "skip".equals(settingStr);
>       // The default setting is "throw"; assume doValidate && !doSkip means 
> throw.
>       if (doValidate) {
>         // Validate that we can add partition without escaping. Escaping was 
> originally intended
>         // to avoid creating invalid HDFS paths; however, if we escape the 
> HDFS path (that we
>         // deem invalid but HDFS actually supports - it is possible to create 
> HDFS paths with
>         // unprintable characters like ASCII 7), metastore will create 
> another directory instead
>         // of the one we are trying to "repair" here.
>         Iterator<CheckResult.PartitionResult> iter = partsNotInMs.iterator();
>         while (iter.hasNext()) {
>           CheckResult.PartitionResult part = iter.next();
>           try {
>             vals = Warehouse.makeValsFromName(part.getPartitionName(), vals);
>           } catch (MetaException ex) {
>             throw new HiveException(ex);
>           }
>           for (String val : vals) {
>             String escapedPath = FileUtils.escapePathName(val);
>             assert escapedPath != null;
>             if (escapedPath.equals(val)) continue;
>             String errorMsg = "Repair: Cannot add partition " + 
> msckDesc.getTableName()
>                 + ':' + part.getPartitionName() + " due to invalid characters 
> in the name";
>             if (doSkip) {
>               repairOutput.add(errorMsg);
>               iter.remove();
>             } else {
>               throw new HiveException(errorMsg);
>             }
>           }
>         }
>       }
> {code}
> I think  AbstractList<String> vals = null; must placed after  "while 
> (iter.hasNext()) {" will work ok.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to