Hello, I am running into checkpoint timeouts and am looking for guidance on troubleshooting. What should I be looking at? Which configuration parameters would affect this? I am afraid I am a Flink newbie, so I am still picking up the concepts. Additional notes are below; is there anything else I can provide? Thanks.
- The checkpoint size is small (less than 100 kB)
- Multiple Flink apps are running on a cluster; only one is running into checkpoint timeouts
- Timeout is set to 10 minutes
- Tried aligned and unaligned checkpoints
- Tried clearing checkpoints to start fresh
- Plenty of disk space
- Dataflow: Kafka source -> Flink app -> Kafka sink
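
In case it helps, here is a rough sketch of the checkpoint settings described above, written as `flink-conf.yaml` style keys. Only the 10 minute timeout comes from my actual setup. The interval is a placeholder, and the exact key names are an assumption that may vary by Flink version (for example, newer releases use `execution.checkpointing.unaligned.enabled`).

```yaml
# Sketch only: the interval is a placeholder, not my real value.
execution.checkpointing.interval: 1 min

# From my setup: checkpoints are declared failed after 10 minutes.
execution.checkpointing.timeout: 10 min

# I have tried both false (aligned) and true (unaligned).
# Key name is version dependent; treat it as an assumption.
execution.checkpointing.unaligned: true
```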