One concrete question: under the HA folder I also see sample entries like
these:

- job_name/blob/job_uuid/blob_...
- job_name/submittedJobGraphX
- job_name/submittedJobGraphY

Is it safe to clean these up when the job is in a healthy state?
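For reference, the kind of simple lifecycle rule I described below would
look roughly like this; the rule name, the "flink-ha/" prefix, and the
7-day window are only placeholders, and one would apply it with something
like "az storage account management-policy create --policy @policy.json":

    {
      "rules": [
        {
          "name": "expire-stale-flink-ha-files",
          "enabled": true,
          "type": "Lifecycle",
          "definition": {
            "filters": {
              "blobTypes": ["blockBlob"],
              "prefixMatch": ["flink-ha/"]
            },
            "actions": {
              "baseBlob": {
                "delete": { "daysAfterModificationGreaterThan": 7 }
              }
            }
          }
        }
      ]
    }

Of course, such a rule is only safe if every blob that is still relevant
gets modified more often than the cutoff, which is exactly the assumption
I'm not sure holds for the HA files.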
Regards,
Alexis.

On Mon, 5 Dec 2022 at 20:09, Alexis Sarda-Espinosa
<sarda.espin...@gmail.com> wrote:

> Hi Gyula,
>
> That certainly helps, but to set up automatic cleanup (in my case, of
> Azure blob storage), the ideal option would be a simple policy that
> deletes blobs that haven't been updated in some time. However, that
> would assume that everything relevant for the latest state is "touched"
> by the JM on every checkpoint, and since I also see blobs referencing
> "submitted job graphs", I imagine that might not be a safe assumption.
>
> I understand the life cycle of those blobs isn't directly managed by
> the operator, but in that regard it could make things more cumbersome.
>
> Ideally, Flink itself would guarantee this sort of allowable TTL for HA
> files, but I'm sure that's not trivial.
>
> Regards,
> Alexis.
>
> On Mon, 5 Dec 2022, 19:19 Gyula Fóra <gyula.f...@gmail.com> wrote:
>
>> Hi!
>>
>> There are some files in the HA dir that are not cleaned up over time
>> and need to be removed by the user:
>>
>> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/#jobresultstore-resource-leak
>>
>> Hope this helps,
>> Gyula
>>
>> On Mon, 5 Dec 2022 at 11:56, Alexis Sarda-Espinosa
>> <sarda.espin...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I see that the number of entries in the directory configured for HA
>>> increases over time, particularly in the context of job upgrades in a
>>> Kubernetes environment managed by the operator. Would it be safe to
>>> assume that any files that haven't been updated in a while can be
>>> deleted, provided the checkpointing interval is much smaller than the
>>> period used to decide whether files are too old?
>>>
>>> Regards,
>>> Alexis.