Guosmilesmile opened a new pull request, #13998: URL: https://github.com/apache/iceberg/pull/13998
Currently, when Flink searches the file system for files while deleting orphan files, the search cannot run concurrently if the Hadoop library is used. This PR enables parallel file search by splitting the search into two layers. The first layer runs with a parallelism of 1 and collects as many folders as possible. The collected folders are then sent downstream, where they are searched in a distributed manner.
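
The two-layer idea can be illustrated with a minimal sketch. This is not the code from the PR: it uses plain java.nio on a local file system and a thread pool in place of Flink operators and the Hadoop library, and the names TwoLayerListing, collectFolders, listFiles, listAll, maxDepth and parallelism are all hypothetical. The first layer walks the tree single-threaded down to a depth limit and records each folder it finds; the second layer lists those folders concurrently.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: a local-file-system stand-in for the two-layer search.
final class TwoLayerListing {

  // Layer 1 (parallelism 1): collect folders down to maxDepth.
  // The map value says whether layer 2 must list that folder recursively,
  // which is the case for folders sitting at the depth limit.
  static Map<Path, Boolean> collectFolders(Path root, int maxDepth) throws IOException {
    Map<Path, Boolean> folders = new LinkedHashMap<>();
    collect(root, 0, maxDepth, folders);
    return folders;
  }

  private static void collect(Path dir, int depth, int maxDepth, Map<Path, Boolean> out)
      throws IOException {
    boolean atLimit = depth >= maxDepth;
    out.put(dir, atLimit);
    if (atLimit) {
      return; // deeper folders are left to layer 2
    }
    try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
      for (Path child : children) {
        if (Files.isDirectory(child)) {
          collect(child, depth + 1, maxDepth, out);
        }
      }
    }
  }

  // Layer 2 worker: list the regular files of one folder.
  static List<Path> listFiles(Path folder, boolean recursive) throws IOException {
    try (Stream<Path> entries = recursive ? Files.walk(folder) : Files.list(folder)) {
      return entries.filter(Files::isRegularFile).collect(Collectors.toList());
    }
  }

  // Runs layer 1 on the calling thread, then fans layer 2 out over a thread pool.
  static List<Path> listAll(Path root, int maxDepth, int parallelism) throws Exception {
    Map<Path, Boolean> folders = collectFolders(root, maxDepth);
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<Future<List<Path>>> futures = new ArrayList<>();
      for (Map.Entry<Path, Boolean> folder : folders.entrySet()) {
        Callable<List<Path>> task = () -> listFiles(folder.getKey(), folder.getValue());
        futures.add(pool.submit(task));
      }
      List<Path> files = new ArrayList<>();
      for (Future<List<Path>> future : futures) {
        files.addAll(future.get());
      }
      return files;
    } finally {
      pool.shutdown();
    }
  }
}
```

In this sketch, calling TwoLayerListing.listAll(root, 1, 4) would list the files directly under root in one task and each first-level folder recursively in its own task. Folders above the depth limit are listed non-recursively so that no file is reported twice.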