jackye1995 commented on pull request #2841: URL: https://github.com/apache/iceberg/pull/2841#issuecomment-887999783
Thank you! I reviewed the PR. I will paste my major comment also here: After reading `ConvertEqDeletesStrategy.rewriteDeletes`, I think we can make the `RewriteDeleteStrategy` interface closer to `RewriteStrategy` interface. What we have there is basically the equivalent of `planFileGroups` plus `rewriteFiles` in `RewriteStrategy`. So I would propose we have the following methods in `RewriteDeleteStrategy` to be more aligned: ``` Iterable<DeleteFile> selectDeletesToRewrite(Iterable<FileScanTask> dataFiles); Iterable<List<FileScanTask>> planDeleteGroups(Iterable<DeleteFile> deleteFiles); Set<DeleteFile> rewriteDeletes(List<DeleteFile> deleteFilesToRewrite); ``` And we can get the partition `StructLike` directly from the list of scan tasks instead of passing it through the task pair in `EqualityDeleteRewriter`. In this way, we can also enable partial progress for commits. Any thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
