codope commented on PR #11975:
URL: https://github.com/apache/hudi/pull/11975#issuecomment-2365138352

   
   > Can we just skip the file filtering if we find that the file in the index item 
is a log file when doing an RO query, so that we can eliminate the new API on the 
file index? Then an RO query on a fully compacted table can also utilize the RLI 
index.
   
   @danny0405 This is a good point. However, the index item in RLI contains only 
the fileId and not the file name, so we cannot tell from it whether the file is 
a log file. Moreover, I still think it makes sense to have this new API. 
Consider a time travel query: let's say there is an instant `t` in the past 
which is a compaction instant, and the table was fully compacted at that 
instant. If a user runs a query with a record key predicate and 
`as of instant t`, RLI would still return candidate files as per the latest 
snapshot rather than as of `t`. And RO queries are nothing but time travel as 
of the latest compaction time.
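
   To make the time travel concern concrete, here is a toy sketch in plain 
Python. It is not Hudi code: the dictionaries, the file group ids, and the 
`candidate_file_groups` helper are all made up for illustration, and it assumes 
a record that existed at instant `t` was deleted afterwards, so the latest RLI 
view no longer has an entry for it.

```python
# Toy model only: every name here is hypothetical, none of this is a Hudi API.

# Record key -> file group id, as of the past compaction instant t.
locations_at_t = {"key-1": "fg-1", "key-2": "fg-2"}

# Record key -> file group id in the latest snapshot.
# Assumption: key-2 was deleted after t, so its entry is gone.
# This latest view is the only one the index can serve today.
rli_latest = {"key-1": "fg-1"}


def candidate_file_groups(index, record_key):
    """Return the set of file groups a record key predicate is pruned to."""
    file_id = index.get(record_key)
    return {file_id} if file_id is not None else set()


# What a query `as of instant t` should scan for key-2.
correct = candidate_file_groups(locations_at_t, "key-2")

# What pruning with the latest index would scan instead.
pruned = candidate_file_groups(rli_latest, "key-2")

print(correct)  # {'fg-2'}
print(pruned)   # set()
```

   In this toy model, pruning with the latest index skips `fg-2` entirely, so 
the query as of `t` would miss a record that was present at that instant.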
   
   I think it's much cleaner to simply not use RLI for any query type 
other than snapshot queries. Later, when we add support for time travel on 
the metadata table, we can easily change the condition in the implementation of 
this API and use the index.
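
   A minimal sketch of the kind of condition I mean, written in Python only 
for brevity. The `QueryType` enum, the `can_use_record_level_index` function, 
and the `as_of_instant` parameter are hypothetical names for illustration, not 
the actual API in this PR. Treating a time travel query as a snapshot query 
with an explicit instant is also an assumption of the sketch.

```python
from enum import Enum


class QueryType(Enum):
    """Hypothetical stand-in for the query types discussed here."""
    SNAPSHOT = "snapshot"
    READ_OPTIMIZED = "read_optimized"
    INCREMENTAL = "incremental"


def can_use_record_level_index(query_type, as_of_instant=None):
    """Hypothetical guard: allow RLI pruning only for snapshot queries
    that read the latest state of the table.

    If the metadata table gains time travel support, only this condition
    would need to be relaxed.
    """
    return query_type is QueryType.SNAPSHOT and as_of_instant is None


print(can_use_record_level_index(QueryType.SNAPSHOT))
print(can_use_record_level_index(QueryType.READ_OPTIMIZED))
print(can_use_record_level_index(QueryType.SNAPSHOT, as_of_instant="t"))
```

   Keeping the decision in one place like this means callers do not need to 
know why the index is skipped for a given query.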


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
