zhoulii opened a new issue, #6957: URL: https://github.com/apache/paimon/issues/6957
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.

### Motivation

The FlinkOrphanFilesClean job suffered from two issues:

1. Sequential file discovery:
   - It listed all directories first, then scanned them sequentially.
   - This caused significant performance degradation on large tables with multiple branches and thousands of partitions.
2. Lack of observability: all operators in the orphan files cleanup pipeline had generic names or no names, making it difficult to identify specific stages in the Flink Web UI.

### Solution

1. Decoupled directory listing and file scanning.
2. Added descriptive names to all operators.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
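
The issue does not show how the listing and scanning stages were decoupled. As a rough illustration only, the idea can be sketched in plain Java with no Flink or Paimon APIs: a discovery loop hands each directory to a worker pool as soon as it is found, so file scanning overlaps with directory listing instead of waiting for the full directory list. The class `PipelinedScanSketch`, its methods, and the `parallelism` parameter are hypothetical names invented for this sketch and are not taken from the Paimon code base.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the FlinkOrphanFilesClean implementation.
class PipelinedScanSketch {

    /** Scan stage: returns the regular files directly inside one directory. */
    static List<Path> scanFiles(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        }
        return files;
    }

    /**
     * Listing stage: discovers directories under root and submits each one to
     * the pool immediately, so scanning starts before discovery has finished.
     */
    static List<Path> listAllFiles(Path root, int parallelism)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<List<Path>>> pending = new ArrayList<>();
            Deque<Path> toVisit = new ArrayDeque<>();
            toVisit.push(root);
            while (!toVisit.isEmpty()) {
                final Path dir = toVisit.pop();
                // Hand the directory to a worker right away.
                Callable<List<Path>> task = () -> scanFiles(dir);
                pending.add(pool.submit(task));
                // Keep discovering subdirectories while workers scan.
                try (DirectoryStream<Path> subDirs =
                        Files.newDirectoryStream(dir, entry -> Files.isDirectory(entry))) {
                    for (Path sub : subDirs) {
                        toVisit.push(sub);
                    }
                }
            }
            // Collect the results of all scan tasks.
            List<Path> result = new ArrayList<>();
            for (Future<List<Path>> future : pending) {
                result.addAll(future.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `PipelinedScanSketch.listAllFiles(tableRoot, 8)` on a local directory returns every regular file beneath it. In a Flink job the two stages would presumably be separate operators rather than a thread pool, and this sketch makes no claim about how the actual change is structured.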
