Hi Ivan,

Regarding the completedCheckpointxxxx files that keep growing: do you mean that too many of them accumulate in the S3 bucket?

AFAIK, if the K8s HA services are working normally, only one completedCheckpointxxxx file will be retained. Once a new one is generated, the old one is deleted.
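For reference, a minimal set of options for the ZooKeeper-less (Kubernetes) HA services looks roughly like the sketch below; the cluster-id value is only a placeholder. The number of completedCheckpointxxxx files kept in the HA storage dir should follow the number of retained checkpoints, which is controlled by state.checkpoints.num-retained (default 1):

    # placeholder name for your deployment
    kubernetes.cluster-id: my-flink-cluster
    # enable the Kubernetes HA services
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    # same storage dir as in your setup
    high-availability.storageDir: s3://some-bucket/recovery
    # how many completed checkpoints (and thus completedCheckpoint files) to retain
    state.checkpoints.num-retained: 1

If older files stay behind even with this default, it would suggest that the cleanup is failing rather than a missing config, and the JobManager logs would be the place to look.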
Best,
Yang

Ivan Yang <ivanygy...@gmail.com> wrote on Wed, Jun 23, 2021 at 12:31 AM:

> Hi Dear Flink users,
>
> We recently enabled the ZooKeeper-less HA in our Kubernetes Flink
> deployment. The setup has
>
> high-availability.storageDir: s3://some-bucket/recovery
>
> Since we have a relatively short (7-day) retention policy on the S3
> bucket, HA will fail once the submittedJobGraphxxxxxx file is deleted by
> S3. If we remove the retention policy, the completedCheckpointxxxx files
> will keep growing. The only way I can think of is a pattern-based file
> retention policy in S3. Before I do that, are there any config keys
> available in Flink I can tune to not keep all the completedCheckpoint*
> files in HA?
>
> Thanks,
> Ivan