[GitHub] [hudi] aditiwari01 opened a new issue #2801: Issues with Hive querying on MOR tables with no partitions

GitBox Sat, 10 Apr 2021 03:32:33 -0700


aditiwari01 opened a new issue #2801:
URL: https://github.com/apache/hudi/issues/2801



   I am unable to read data via Hive from either the _ro or the _rt table when 
my data is not partitioned.
   Reading through the Spark API works fine.
   
   Relevant write configs used:
   
   ```
   PARTITIONPATH_FIELD_OPT_KEY -> "",
   "hoodie.datasource.hive_sync.enable" -> "true",
   "hoodie.datasource.hive_sync.partition_fields" -> "",
   HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY -> classOf[NonPartitionedExtractor].getCanonicalName
   ```
   
   Issue faced: NullPointerException in `getTableMetaClientForBasePath` of class `HoodieInputFormatUtils`.
   
   My take:
   
   ```
     public static HoodieTableMetaClient getTableMetaClientForBasePath(FileSystem fs, Path dataPath)
         throws IOException {
       LOG.info("Getting Table Meta Client from path: " + dataPath.toString());
       int levels = HoodieHiveUtils.DEFAULT_LEVELS_TO_BASEPATH;
       if (HoodiePartitionMetadata.hasPartitionMetadata(fs, dataPath)) {
         HoodiePartitionMetadata metadata = new HoodiePartitionMetadata(fs, dataPath);
         metadata.readFromFS();
         levels = metadata.getPartitionDepth();
       }
       Path baseDir = HoodieHiveUtils.getNthParent(dataPath, levels);
       LOG.info("Reading hoodie metadata from path " + baseDir.toString());
       return HoodieTableMetaClient.builder()
           .setConf(fs.getConf())
           .setBasePath(baseDir.toString())
           .build();
     }
   ```
   
   Here, if partition metadata is not available (as is the case for a 
non-partitioned table), `levels` stays at its default of 3, so the base path 
computed by `getNthParent` is wrong.
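
To illustrate the path arithmetic, here is a minimal, self-contained sketch. It does not use Hudi or Hadoop classes: `BasePathSketch` and `nthParent` are hypothetical stand-ins, `java.nio.file.Path` replaces Hadoop's `Path`, the example table locations are made up, and I am assuming that `getNthParent` simply calls `getParent()` `n` times. Under those assumptions, going up 3 levels from a non-partitioned table's data path either overshoots the table root or runs past the filesystem root and yields `null`, which would be consistent with the NullPointerException at `baseDir.toString()`.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; not Hudi code.
final class BasePathSketch {
    // Default named in the issue text ("default of 3").
    static final int DEFAULT_LEVELS_TO_BASEPATH = 3;

    // Assumed behavior of HoodieHiveUtils.getNthParent: walk up n parents.
    // The null check exists only so this sketch returns null instead of throwing.
    public static Path nthParent(Path path, int n) {
        Path parent = path;
        for (int i = 0; i < n && parent != null; i++) {
            parent = parent.getParent();
        }
        return parent;
    }

    public static void main(String[] args) {
        // Partitioned layout (yyyy/MM/dd): 3 levels up is the table root.
        Path partitioned = Paths.get("/warehouse/tbl/2021/04/10");
        System.out.println(nthParent(partitioned, DEFAULT_LEVELS_TO_BASEPATH));

        // Non-partitioned, shallow location: 3 levels up goes past the root.
        Path shallow = Paths.get("/warehouse/tbl");
        System.out.println(nthParent(shallow, DEFAULT_LEVELS_TO_BASEPATH));

        // Non-partitioned, deeper location: 3 levels up is an unrelated ancestor.
        Path deep = Paths.get("/data/lake/warehouse/tbl");
        System.out.println(nthParent(deep, DEFAULT_LEVELS_TO_BASEPATH));

        // With 0 levels, the data path itself is the base path.
        System.out.println(nthParent(shallow, 0));
    }
}
```

For a non-partitioned table the data files sit directly under the base path, so the number of levels needed in this sketch is 0 rather than 3.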


