hydrogenlee opened a new issue, #14079: URL: https://github.com/apache/iceberg/issues/14079
### Apache Iceberg version

1.8.1

### Query engine

Flink

### Please describe the bug 🐞

Problem:

When running a Flink job with **RANGE** distribution mode, after several cycles of running the job for a while, stopping it with a savepoint, and restarting from that savepoint, the range shuffle operator gets stuck in the "**INITIALIZING**" status without any error or warning logs, while all other operators successfully transition to the RUNNING state.

Steps to Reproduce:

1. Run a job in range distribution mode.
2. Stop with savepoint.
3. Restart from the savepoint.
4. Repeat steps 1–3 multiple times.
5. Eventually, after a restart, the job gets stuck in the INITIALIZING stage.

The hot thread of the range shuffle operator is:

<img width="1346" height="807" alt="Image" src="https://github.com/user-attachments/assets/80ab4a1e-7f94-4fc0-87fa-1e50a5f42afe" />

### Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
