I don’t think it’s a KafkaIO issue since checkpoints are handled by runner.
Could it be similar to this issue? https://lists.apache.org/thread.html/r4a454a40197f2a59280ffeccfe44837ec072237aea56d50599f12184%40%3Cuser.beam.apache.org%3E Could you try a workaround with sliding windows proposed there? > On 22 Oct 2020, at 05:18, Eleanore Jin <eleanore....@gmail.com> wrote: > > Hi all, > > I am using beam 2.23 (java), and flink 1.10.2, my pipeline is quite simple > read from a kafka topic and write to another kafka topic. > > When I enabled checkpoint, I see the memory usage of the flink job manager > keeps on growing > <image.png> > > The Flink cluster is running on kubernetes, with 1 job manager, and 12 task > managers each with 4 slots, kafka input topic has 96 partitions. The > checkpoint is stored in azure blob storage. > > Checkpoint happens every 3 seconds, with timeout 10 seconds, with minimum > pause of 1 second. > > Any ideas why this happens? > Thanks a lot! > Eleanore