[
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098894#comment-17098894
]
Syed Shameerur Rahman commented on HIVE-23347:
----------------------------------------------
[~adeshrao]
https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java#L386
We remove all known partition paths (fetched from the metastore) from the
partition paths listed from the file system. A partition path fetched from the
metastore always has lower-case partition column names, while a path listed
from the file system might have upper-case column names, so we might end up
not removing a partition path that is already present.
Eg:
partition from metastore: <tablepath>/year=2020/month=3/day=2;
partition from fileSystem: <tablepath>/Year=2020/Month=3/Day=2;
Both of these paths should be considered the same and hence removed from
*allPartDirs*. I guess HIVE-23347.3.patch doesn't handle that case.
So I guess it is better to tackle this issue at the place where the partition
paths are fetched from the file system.
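
To illustrate the idea, here is a minimal sketch (not the actual
HiveMetaStoreChecker code, and not any of the attached patches) of normalizing
the partition column names as the paths come back from the file system, so that
the later removal of known partitions matches. The class and method names are
hypothetical, and paths are assumed to be plain strings relative to the table
path:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only.
public class PartitionPathNormalizer {

    // Lower-cases only the column name of each "name=value" component.
    // Partition values and components without '=' are left untouched.
    static String normalize(String relativePartPath) {
        String[] parts = relativePartPath.split("/", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append('/');
            }
            String part = parts[i];
            int eq = part.indexOf('=');
            if (eq > 0) {
                sb.append(part.substring(0, eq).toLowerCase(Locale.ROOT));
                sb.append(part.substring(eq));
            } else {
                sb.append(part);
            }
        }
        return sb.toString();
    }

    // Returns the file system partition paths (normalized) that are not
    // already known to the metastore.
    static Set<String> unknownPartitions(List<String> fsPaths,
                                         List<String> metastorePaths) {
        Set<String> remaining = new HashSet<>();
        for (String p : fsPaths) {
            remaining.add(normalize(p));
        }
        for (String p : metastorePaths) {
            remaining.remove(normalize(p));
        }
        return remaining;
    }
}
```

With this, Year=2020/Month=3/Day=2 from the file system and
year=2020/month=3/day=2 from the metastore compare equal. Note that the sketch
only covers the comparison: two directories that differ only in case would
collapse into one entry, and the real directory name would still have to be
kept somewhere for the partition location.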
> MSCK REPAIR cannot discover partitions with upper case directory names.
> -----------------------------------------------------------------------
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
> Issue Type: Bug
> Components: Standalone Metastore
> Affects Versions: 3.1.0
> Reporter: Sankar Hariappan
> Assignee: Adesh Kumar Rao
> Priority: Minor
> Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch,
> HIVE-23347.3.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but
> it does not.
> 1. Have partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int,
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11