Hi Barisa,

from what you've described, I believe it could work, but I have never tried it myself. It would be interesting to hear about your experience with this approach, so please report back once you have tried it.
One thing to note is that the approach hinges on the fact that the older JobManager is still running. If for whatever reason the old JobManager fails shortly before the new one comes up, then you might not execute the job you want to upgrade. You could mitigate the problem by using externalized checkpoints [1], but then you would fall back to an earlier point.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#retained-checkpoints

Cheers,
Till

On Thu, Apr 30, 2020 at 3:38 PM Alexander Fedulov <alexan...@ververica.com> wrote:

> Hi Barisa,
>
> it seems that there is no immediate answer to your concrete question here,
> so I wanted to ask you back a more general question: did you consider using
> the Community Edition of Ververica Platform for your purposes [1]?
> It comes with a complete lifecycle management for Flink jobs on K8S. It
> also exposes a full REST API for integrating into CI/CD workflows, so if
> you do not need the UI, you can just ignore it. Community Edition is
> permanently free for commercial use at any scale.
>
> I see that you are already using Helm, so installation could be very
> straightforward [2].
> Here is the documentation with a bit more comprehensive "Getting started"
> guide [3].
>
> [1] https://www.ververica.com/blog/announcing-ververica-platform-community-edition
> [2] https://www.ververica.com/getting-started
> [3] https://docs.ververica.com/getting_started/index.html
>
> Best regards,
>
> --
> Alexander Fedulov | Solutions Architect
>
> On Wed, Apr 29, 2020 at 5:32 PM Barisa Obradovic <bbaj...@gmail.com> wrote:
>
>> Hi, we are attempting to migrate our Flink cluster to K8s and are looking
>> into options for automating job upgrades. Has anyone here done it with an
>> init container? Or is there a simpler way?
>>
>> 0: Let's assume we have a job manager with a few task managers running in
>> a stateful set, managed with Helm.
>>
>> 1: A new Helm chart is published, and Helm attempts the upgrade. Since it's
>> a stateful set, the new version of the job manager and task managers is
>> started even while the old one is still running.
>> 2: In the job manager pod, there is an init container whose purpose is to
>> find the currently running job manager with the previous version of the job
>> (either via ZooKeeper or via a Kubernetes service that points to the
>> currently running job manager). After it finds it, it runs cancel with
>> savepoint using the Flink CLI and passes the savepoint URL via a volume to
>> the main container.
>> 3: The job manager container starts, finds the savepoint, and restores the
>> new version of the job with the state from the savepoint.
>> 4: The new pods pass their health checks, so the old pods are destroyed by
>> Kubernetes.
>>
>> What happens if there is no previous job manager running? The init container
>> sees that and just exits without doing any other work.
>>
>> Caveat:
>> Most of the solutions I noticed were using operators, which feel quite a bit
>> more complex. Yet since I haven't found any solution using an init container,
>> I'm guessing I'm missing something; I just can't figure out what.
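
For illustration, the init-container step (step 2 in the quoted message) could look roughly like the Python sketch below. It is untested against a real cluster and rests on several assumptions: the helper names, the handoff file, and the way the job ID is obtained are all hypothetical, and the "Savepoint stored in" text it parses is only what I would expect the CLI to print on success, so please check it against the output of your Flink version. The only CLI call used is `flink cancel -m <jobmanager> -s <targetDirectory> <jobId>`.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

# Assumed wording of the CLI success message; verify for your Flink version.
SAVEPOINT_MARKER = "Savepoint stored in "


def build_cancel_command(jobmanager: str, job_id: str, savepoint_dir: str) -> List[str]:
    """Build the `flink cancel` call that cancels the job and takes a savepoint."""
    return ["flink", "cancel", "-m", jobmanager, "-s", savepoint_dir, job_id]


def extract_savepoint_path(cli_output: str) -> Optional[str]:
    """Return the savepoint path found in the CLI output, or None if absent."""
    for line in cli_output.splitlines():
        idx = line.find(SAVEPOINT_MARKER)
        if idx != -1:
            # Drop surrounding whitespace and the sentence-ending period.
            return line[idx + len(SAVEPOINT_MARKER):].strip().rstrip(".")
    return None


def run_init_step(jobmanager: str, job_id: Optional[str],
                  savepoint_dir: str, handoff_file: Path) -> Optional[str]:
    """Cancel the old job with a savepoint and hand the path to the main container.

    job_id is None when no previous job is running; in that case nothing is done.
    """
    if job_id is None:
        return None
    result = subprocess.run(
        build_cancel_command(jobmanager, job_id, savepoint_dir),
        capture_output=True, text=True, check=True,
    )
    savepoint_path = extract_savepoint_path(result.stdout)
    if savepoint_path is None:
        raise RuntimeError("no savepoint path found in the flink CLI output")
    # The file lives on the volume shared with the main container (hypothetical path).
    handoff_file.write_text(savepoint_path)
    return savepoint_path
```

The main container would then read the handoff file, if it exists, and start the new job version from that path, for example with `flink run -s <savepointPath>`. Raising on a missing path makes the init container fail, so the upgrade stops instead of starting the new job without state.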