chenjunjiedada opened a new pull request, #4943: URL: https://github.com/apache/iceberg/pull/4943
This adds an option to control how many snapshots the monitor processes at once when an Iceberg table is used as a Flink source. Currently, the monitor operator generates file splits for everything between the last consumed snapshot and the latest snapshot, which may lead to backpressure when the consumer lags behind, as the image in the pull request shows. We can reduce the checkpoint lock scope (https://github.com/apache/iceberg/pull/4911) or increase the network buffer to mitigate the situation, but the problem cannot be completely avoided because the number of splits is unknown in advance, especially when a consumer starts for the first time. With this option, the user can tune the monitoring flow according to the backpressure and busy metrics.
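To illustrate the idea of bounding each monitoring cycle, here is a minimal sketch in plain Java. It is not the code in this pull request: the class `SnapshotBatcher`, the method `nextBatch`, and the parameter `maxSnapshotsPerCycle` are hypothetical names chosen for illustration, and snapshots are modeled as a simple list of IDs ordered from oldest to newest rather than through the Iceberg API.

```java
import java.util.List;

// Hypothetical helper, not part of Iceberg: selects the snapshots that one
// monitoring cycle should turn into splits.
final class SnapshotBatcher {
    private SnapshotBatcher() {}

    /**
     * @param snapshotIds          snapshot IDs ordered oldest to newest (assumed model)
     * @param lastConsumedIndex    index of the last consumed snapshot, or -1 if none
     * @param maxSnapshotsPerCycle upper bound on snapshots handled in one cycle
     * @return the next snapshots to process; empty if the consumer is caught up
     */
    static List<Long> nextBatch(
            List<Long> snapshotIds, int lastConsumedIndex, int maxSnapshotsPerCycle) {
        if (maxSnapshotsPerCycle <= 0) {
            throw new IllegalArgumentException("maxSnapshotsPerCycle must be positive");
        }
        int from = lastConsumedIndex + 1;
        if (from >= snapshotIds.size()) {
            return List.of(); // nothing new since the last cycle
        }
        // Without the bound, the range would always extend to the latest snapshot.
        int to = Math.min(snapshotIds.size(), from + maxSnapshotsPerCycle);
        return snapshotIds.subList(from, to);
    }
}
```

With a bound of 2 and five pending snapshots, a lagging consumer would receive the backlog over three cycles instead of all at once, so the number of splits emitted per cycle stays limited even when the consumer starts for the first time.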
