jackye1995 commented on issue #3118: URL: https://github.com/apache/iceberg/issues/3118#issuecomment-920552114
Just to confirm, @Reo-LEI, are you mostly doing this through Flink? I am asking because I think we currently face the following dilemma: delete files are mostly generated by CDC pipelines in Flink, but the rewrite functionality is not yet ready for delete files, and even once it is ready, it will be in Spark. #2867 is another PR that tries to tackle the same root issue in a different way. We definitely need to speed up progress on delete compaction and make it the highest priority. On the other hand, I think we should start considering developing some actions in Flink to run compaction natively. The approach of running a second compactor in the streaming pipeline may be unavoidable, even though it is a bit complex.
