garyli1019 commented on pull request #3703:
URL: https://github.com/apache/hudi/pull/3703#issuecomment-974789158


   @mincwang I think I found the cause of this behavior.
   The Hive RT query code path goes through
https://github.com/apache/hudi/blob/0fb8556b0d9274aef650a46bb82a8cf495d4450b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieHiveUtils.java#L158-L169
   You could set the config HOODIE_CONSUME_PENDING_COMMITS to true and try 
again.
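
   A minimal sketch of building that setting, assuming the constant is a per-table key pattern of the form `hoodie.%s.consume.pending.commits` (an assumption that should be checked against the HoodieHiveUtils source linked above). The class name, helper names, and the table name `trips` are hypothetical and only for illustration:

```java
import java.util.Properties;

final class PendingCommitsConfig {
    // Assumed key pattern; verify against HOODIE_CONSUME_PENDING_COMMITS in HoodieHiveUtils.
    static final String KEY_PATTERN = "hoodie.%s.consume.pending.commits";

    // Substitutes the table name into the pattern to get the concrete config key.
    static String keyFor(String tableName) {
        return String.format(KEY_PATTERN, tableName);
    }

    // Returns properties with the pending-commits flag switched on for one table.
    static Properties enableFor(String tableName) {
        Properties props = new Properties();
        props.setProperty(keyFor(tableName), "true");
        return props;
    }

    public static void main(String[] args) {
        // "trips" is a hypothetical table name.
        Properties props = enableFor("trips");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

   The resulting key/value pair would then be applied to the Hive session or job configuration before re-running the RT query.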
   
   The Spark MOR snapshot read code path goes through
   
https://github.com/apache/hudi/blob/a0dae41409a4f2d509aae1b16a4b509ec774c454/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java#L238-L240
   We should include the compaction request instant here as well.
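
   The idea can be pictured with a toy model. This is not Hudi's timeline API: the `Instant` class, the `State` enum, and the action strings below are hypothetical stand-ins, and the filter only illustrates "completed instants plus requested compactions":

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class TimelineSketch {
    enum State { REQUESTED, INFLIGHT, COMPLETED }

    // Hypothetical, simplified stand-in for a timeline instant.
    static final class Instant {
        final String timestamp;
        final String action;
        final State state;

        Instant(String timestamp, String action, State state) {
            this.timestamp = timestamp;
            this.action = action;
            this.state = state;
        }

        @Override
        public String toString() {
            return timestamp + "." + action + "." + state;
        }
    }

    // Keeps every completed instant and, in addition, compaction instants
    // that have only been requested (scheduled but not yet executed).
    static List<Instant> instantsForSnapshotRead(List<Instant> timeline) {
        return timeline.stream()
            .filter(i -> i.state == State.COMPLETED
                || ("compaction".equals(i.action) && i.state == State.REQUESTED))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instant> timeline = Arrays.asList(
            new Instant("001", "deltacommit", State.COMPLETED),
            new Instant("002", "compaction", State.REQUESTED),
            new Instant("003", "deltacommit", State.INFLIGHT));
        instantsForSnapshotRead(timeline).forEach(System.out::println);
    }
}
```

   In this sketch the in-flight delta commit is still excluded; only the requested compaction is added to what the reader sees.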
   
   Would you mind trying this fix?
   
   The file listing code paths of Spark, Hive, and Flink currently differ, 
which leads to this issue. We need to unify the file listing as a 
high-priority task.

