[
https://issues.apache.org/jira/browse/HIVE-18582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345757#comment-16345757
]
Sergey Shelukhin commented on HIVE-18582:
-----------------------------------------
Throwing for a missing partition directory in throw mode is by design.
However, the change itself makes sense to me: vals should not survive
between partitions, so its scope should be reduced.
Patches welcome ;)
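
For anyone picking this up, here is a rough, self-contained sketch of the failure and of the reduced scope. It is not Hive code: MsckValsSketch and both validate* methods are hypothetical names, and the stand-in makeValsFromName only assumes, based on the "Expected 1 components, got 2" message in the report, that a non-null list passed in is reused and that its size is checked against the next partition name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of this is taken from the Hive source tree.
class MsckValsSketch {

  // Stand-in for Warehouse.makeValsFromName. ASSUMPTION: a non-null `result` is reused,
  // and a name with a different number of components than result.size() is rejected.
  static List<String> makeValsFromName(String name, List<String> result) {
    String[] parts = name.split("/");
    if (result == null) {
      result = new ArrayList<>();
      for (int i = 0; i < parts.length; i++) {
        result.add(null);
      }
    } else if (parts.length != result.size()) {
      throw new IllegalArgumentException(
          "Expected " + result.size() + " components, got " + parts.length + " (" + name + ")");
    }
    for (int i = 0; i < parts.length; i++) {
      // Keep only the value part of each key=value component.
      result.set(i, parts[i].substring(parts[i].indexOf('=') + 1));
    }
    return result;
  }

  // Mirrors the quoted loop: vals is declared outside the loop and survives
  // from one partition to the next.
  static void validateWithSharedVals(List<String> partNames) {
    List<String> vals = null;
    for (String name : partNames) {
      vals = makeValsFromName(name, vals);
    }
  }

  // Reduced scope: every partition name gets a fresh list.
  static List<List<String>> validateWithScopedVals(List<String> partNames) {
    List<List<String>> all = new ArrayList<>();
    for (String name : partNames) {
      List<String> vals = makeValsFromName(name, null);
      all.add(vals);
    }
    return all;
  }

  public static void main(String[] args) {
    // The two directories from the report, in the order listed there.
    List<String> names =
        Arrays.asList("log_date=2015063023", "log_date=2015121309/vgameid=lyjt");
    try {
      validateWithSharedVals(names);
    } catch (IllegalArgumentException ex) {
      // prints: Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt)
      System.out.println(ex.getMessage());
    }
    // prints: [[2015063023], [2015121309, lyjt]]
    System.out.println(validateWithScopedVals(names));
  }
}
```

With the shared list, the one-component name fixes the expected size at 1, so the following two-component name is rejected even though it matches the table's partition columns. With the per-iteration list, each name is parsed on its own.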
> MSCK REPAIR TABLE Throw MetaException
> --------------------------------------
>
> Key: HIVE-18582
> URL: https://issues.apache.org/jira/browse/HIVE-18582
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.1.1
> Reporter: liubangchen
> Priority: Major
>
> While executing the query MSCK REPAIR TABLE tablename, I got this exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt))
>     at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1847)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:402)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> --
> Caused by: MetaException(message:Expected 1 components, got 2 (log_date=2015121309/vgameid=lyjt))
>     at org.apache.hadoop.hive.metastore.Warehouse.makeValsFromName(Warehouse.java:385)
>     at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1845)
> {code}
> The table is PARTITIONED BY (log_date, vgameid).
> The data directories on HDFS are:
>
> {code:java}
> /usr/hive/warehouse/a.db/tablename/log_date=2015063023
> drwxr-xr-x - root supergroup 0 2018-01-26 09:41
> /usr/hive/warehouse/a.db/tablename/log_date=2015121309/vgameid=lyjt
> {code}
> The log_date=2015063023 directory is empty, so it has no vgameid subdirectory.
> If I set hive.msck.path.validation=ignore, then MSCK REPAIR TABLE
> executes successfully.
> I then found this code:
> {code:java}
> private int msck(Hive db, MsckDesc msckDesc) {
>   CheckResult result = new CheckResult();
>   List<String> repairOutput = new ArrayList<String>();
>   try {
>     HiveMetaStoreChecker checker = new HiveMetaStoreChecker(db);
>     String[] names = Utilities.getDbTableName(msckDesc.getTableName());
>     checker.checkMetastore(names[0], names[1], msckDesc.getPartSpecs(), result);
>     List<CheckResult.PartitionResult> partsNotInMs = result.getPartitionsNotInMs();
>     if (msckDesc.isRepairPartitions() && !partsNotInMs.isEmpty()) {
>       // I think bug is here
>       AbstractList<String> vals = null;
>       String settingStr = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_MSCK_PATH_VALIDATION);
>       boolean doValidate = !("ignore".equals(settingStr));
>       boolean doSkip = doValidate && "skip".equals(settingStr);
>       // The default setting is "throw"; assume doValidate && !doSkip means throw.
>       if (doValidate) {
>         // Validate that we can add partition without escaping. Escaping was originally intended
>         // to avoid creating invalid HDFS paths; however, if we escape the HDFS path (that we
>         // deem invalid but HDFS actually supports - it is possible to create HDFS paths with
>         // unprintable characters like ASCII 7), metastore will create another directory instead
>         // of the one we are trying to "repair" here.
>         Iterator<CheckResult.PartitionResult> iter = partsNotInMs.iterator();
>         while (iter.hasNext()) {
>           CheckResult.PartitionResult part = iter.next();
>           try {
>             vals = Warehouse.makeValsFromName(part.getPartitionName(), vals);
>           } catch (MetaException ex) {
>             throw new HiveException(ex);
>           }
>           for (String val : vals) {
>             String escapedPath = FileUtils.escapePathName(val);
>             assert escapedPath != null;
>             if (escapedPath.equals(val)) continue;
>             String errorMsg = "Repair: Cannot add partition " + msckDesc.getTableName()
>                 + ':' + part.getPartitionName() + " due to invalid characters in the name";
>             if (doSkip) {
>               repairOutput.add(errorMsg);
>               iter.remove();
>             } else {
>               throw new HiveException(errorMsg);
>             }
>           }
>         }
>       }
> {code}
> I think moving "AbstractList<String> vals = null;" to just after "while
> (iter.hasNext()) {" would fix this.
>
>