BalaMahesh commented on issue #2251:
URL: https://github.com/apache/hudi/issues/2251#issuecomment-726732563


   Update 1: After adding additional log statements to the 
HoodieParquetInputFormat and InputPathHandler classes, I found the following: 
   
   1) [InputInitializer {Map 1} #0] |hadoop.InputPathHandler|: Got the input 
paths : 
[s3a://xxx/test/hudi/data/xxx/xxx/dt=2020-11-13/.hoodie_partition_metadata, 
s3a://xxx/test/hudi/data/xxx/xxx/dt=2020-11-13/4e5582b0-ceb4-4d7c-ab98-bb9dfb0962e6-0_0-17038-5024094_20201113170011.parquet]conf
 : Configuration: incrementalTables : []
   
   The query job received the files inside the partition directory as its 
input paths, instead of the partition directory itself. The Hudi MR bundle then 
tries to append the metadata file name to these base file paths and fails to 
find the partition metadata file. 
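
   To illustrate the failure mode described above, here is a minimal sketch 
in Python. It is not Hudi's actual code: the function names are hypothetical, 
and the rule "a path ending in .parquet is a base file" is an assumption made 
only for this example. It shows why appending the metadata file name to a base 
file path yields a path that cannot exist, while the same append on the 
partition directory works.

```python
import posixpath

# Marker file name as it appears in the log output above.
PARTITION_METADATA_FILE = ".hoodie_partition_metadata"


def metadata_path_naive(input_path: str) -> str:
    """Append the marker file name to whatever input path was received."""
    return posixpath.join(input_path, PARTITION_METADATA_FILE)


def metadata_path_normalized(input_path: str) -> str:
    """Hypothetical workaround: if the input path looks like a file inside
    the partition (assumption: it ends in .parquet or is the marker file
    itself), step up to its parent directory before appending."""
    name = posixpath.basename(input_path)
    if name.endswith(".parquet") or name == PARTITION_METADATA_FILE:
        input_path = posixpath.dirname(input_path)
    return posixpath.join(input_path, PARTITION_METADATA_FILE)


partition_dir = "s3a://xxx/test/hudi/data/xxx/xxx/dt=2020-11-13"
base_file = partition_dir + "/example-base-file.parquet"  # placeholder name

# Appending to a base file path produces a child of a file, which cannot exist.
print(metadata_path_naive(base_file))
# Appending to the partition directory produces the expected location.
print(metadata_path_naive(partition_dir))
# The normalized variant resolves both inputs to the same location.
print(metadata_path_normalized(base_file))
```

   This only models the path arithmetic. It does not explain why the job 
receives file paths in the first place, which is the open question here.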
   
   In the same Hive session, a query on a different Hudi table produces the 
logs below: 
   
   hadoop.InputPathHandler|: Got the input paths : 
[s3a://xxxx/test/hudi/data/xxx/xxx/dt=2020-11-13]conf : Configuration: 
incrementalTables : [] 
   
   Here the input path goes only up to the partition directory, unlike the 
base file paths above. In this case the partition metadata file is accessible 
and the query finishes. 
   
   I need help figuring out where the job is getting the base files as input 
paths instead of the directory. I ran describe formatted table partition(val) 
on both tables, and they have the same directory structure. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

