boneanxs commented on issue #6938:
URL: https://github.com/apache/hudi/issues/6938#issuecomment-1280359415

   @yihua Yes, identifying replaced file groups might be time-consuming, since 
we have to list the affected partitions to build a `FileSystemView` and get the 
replaced file groups. I'm thinking that if we eventually use 
`HoodieMetadataFileSystemView`, the cost of the listing operation can be reduced 
a lot. Besides, one replace operation usually doesn't touch many partitions, so 
the time spent here may be acceptable (we can also run this step in parallel if 
many partitions are affected).
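
   As a rough illustration of the parallel per-partition scan mentioned above, 
here is a minimal language-neutral sketch in Python. It does not use Hudi's 
API: `find_replaced_file_groups`, `list_file_groups`, and `replaced_ids` are 
hypothetical names, and `list_file_groups` stands in for whatever lookup the 
file system view would provide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Set


def find_replaced_file_groups(
    partitions: Iterable[str],
    list_file_groups: Callable[[str], List[str]],  # hypothetical per-partition lookup
    replaced_ids: Set[str],                        # hypothetical set of replaced file group IDs
    max_workers: int = 8,
) -> Dict[str, List[str]]:
    """Return, for each affected partition, the file group IDs that were replaced."""
    partitions = list(partitions)
    if not partitions:
        return {}

    def scan(partition: str) -> List[str]:
        # One listing call per partition; this is the expensive step.
        return [fg for fg in list_file_groups(partition) if fg in replaced_ids]

    # Listing is I/O-bound, so a thread pool is enough to overlap the calls.
    workers = min(max_workers, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(scan, partitions))  # map preserves input order
    return dict(zip(partitions, results))
```

   The cost scales with the number of affected partitions rather than the 
table size, which is why a replace operation touching few partitions should 
stay cheap.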
   
   By the way, maybe we can provide a basic fix that at least addresses the 
issue (duplicates are a critical problem), and then improve this logic in the 
long term.
   
   Do you think it's worth a try? I'd really appreciate your suggestions!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
