[GitHub] [hudi] garyli1019 commented on pull request #1817: [HUDI-651] Fix incremental queries in MOR tables

GitBox Wed, 22 Jul 2020 23:13:34 -0700


garyli1019 commented on pull request #1817:
URL: https://github.com/apache/hudi/pull/1817#issuecomment-662836052



   > @garyli1019 are you talking about corner cases not handled in this PR? can
   > you review the PR once for intended functionality? I am trying to see if this
   > can help MOR/Incremental query on spark SQL in some form.
   
   @vinothchandar I think the Spark Datasource will use a different approach.
IIUC, this PR addresses the case where an incremental query starts from an
uncompacted delta commit: some file groups have no base file at that point, so
their log records are missed. For the Spark Datasource, we can create a
`HoodieFileSplit` without a `baseFile` and read only the logs. I am not sure
whether the same could be done in `HoodieRealtimeFileSplit`, without the extra
handling for `HoodieMORIncrementalFileSplit`.
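
To make the log-only idea concrete, here is a rough sketch in plain Python rather than Hudi code. `FileSplit`, `read_split`, and the dict-based record stores are hypothetical names made up for illustration and are not Hudi APIs; the only point is that the base file is optional and the reader falls back to the logs alone.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FileSplit:
    """Hypothetical stand-in for a split over one file group (not a Hudi class)."""
    file_group_id: str
    log_files: List[str]
    # None models a file group that has not been compacted yet,
    # i.e. it has log files but no base file.
    base_file: Optional[str] = None


def read_split(
    split: FileSplit,
    base_records: Dict[str, Dict[str, str]],
    log_records: Dict[str, Dict[str, str]],
) -> Dict[str, str]:
    """Return key -> value for one split, with log records overriding base records."""
    merged: Dict[str, str] = {}
    if split.base_file is not None:
        # Start from the base file only when the file group has one.
        merged.update(base_records.get(split.base_file, {}))
    for log_file in split.log_files:
        # Apply log files in order; later entries win on key collisions.
        merged.update(log_records.get(log_file, {}))
    return merged
```

With this shape, a split built as `FileSplit("fg1", ["log1", "log2"])` is read from its logs alone, while a split that does carry a base file goes through the same code path and simply has the logs applied on top.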


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
