Yep, the cleanup_pods script now runs as an optional Kubernetes
CronJob (
https://github.com/astronomer/airflow-chart/blob/master/templates/cleanup/cleanup-cronjob.yaml)
that periodically cleans up failed pods, so it could stay
separate.

The wait_for_migrations script could definitely be pulled into Airflow. For
context, we deploy an initContainer on the scheduler (
https://github.com/astronomer/airflow-chart/blob/master/templates/scheduler/scheduler-deployment.yaml#L77-L84)
that runs the upgradedb command before booting the scheduler. This new
wait_for_migrations script runs in an initContainer on the webserver and
workers (
https://github.com/astronomer/airflow-chart/blob/master/templates/webserver/webserver-deployment.yaml#L58-L65)
so that they don't boot up ahead of a potentially long-running migration
and attempt to operate on new or missing columns/tables before the
migrations have finished. This keeps those pods from entering a CrashLoop.
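
To make the idea concrete, here is a minimal sketch of the polling logic such
a wait step could use. It is not the code from astronomer-airflow-scripts;
the names wait_for_migrations, get_applied_heads and expected_heads are
hypothetical, and the database lookup is passed in as a callable so the
sketch stays independent of any particular migration library.

```python
import time
from typing import Callable, Set


def wait_for_migrations(
    get_applied_heads: Callable[[], Set[str]],
    expected_heads: Set[str],
    timeout_seconds: float = 60.0,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Block until the database reports the expected migration heads.

    get_applied_heads is a hypothetical callable that returns the revision
    ids currently recorded in the metadata database. Returns the number of
    polls performed; raises TimeoutError once the deadline has passed.
    """
    deadline = clock() + timeout_seconds
    polls = 0
    while True:
        polls += 1
        # Done as soon as the applied revisions match what this image expects.
        if get_applied_heads() == expected_heads:
            return polls
        # Give up once the deadline has passed so the initContainer fails
        # visibly instead of hanging forever.
        if clock() >= deadline:
            raise TimeoutError(
                f"migrations not applied after {timeout_seconds} seconds"
            )
        sleep(poll_interval)


if __name__ == "__main__":
    # Simulated database: the expected revision shows up on the third poll.
    states = iter([set(), {"rev_a"}, {"rev_b"}])
    polls = wait_for_migrations(
        lambda: next(states), {"rev_b"}, poll_interval=0.0
    )
    print(f"migrations applied after {polls} polls")
```

Run from an initContainer, a non-zero exit on timeout would stop the webserver
or worker container from starting against a schema that is not ready yet.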

On Tue, Mar 24, 2020 at 11:48 AM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> >
> > @Tomasz great question. Our images are currently generated from
> Dockerfiles
> > in this repo https://github.com/astronomer/ap-airflow and get published
> to
> > DockerHub
> > https://hub.docker.com/repository/docker/astronomerinc/ap-airflow.
> >
> > For the most part those are typical Airflow images. There's an entrypoint
> > script that we include in the image that handles waiting for the database
> > and redis (if used) to come up, which is pretty generic.
>
>
> I already added waiting for the database (both metadata and celery URL) in
> the PR:
>
> https://github.com/apache/airflow/pull/7832/files#diff-3759f40d4e8ba0c0e82e82b66d376741
> .
> It's functionally the same but more generic.
>
> The only other
> > thing that I think the Helm Chart uses would be the scripts in this repo
> > https://github.com/astronomer/astronomer-airflow-scripts. Our
> Dockerfiles
> > pull this package in. These scripts are used to coordinate running
> > migrations and cleaning up failed pods.
> >
>
> I see two scripts:
>
> * cleanup_pods -> this is (I believe) not needed to run in airflow - this
> could be run  as a separate pod/container?
> * waiting for migrations  -> I think this is a good candidate to add
> *airflow
> db wait_for_migration* command and make it part of airflow itself.
>
> I think we also have to agree on the Airflow version supported by the
> official helm chart. I'd suggest we support 1.10.10+ and we incorporate all
> the changes needed to airflow (like the "db wait_for_migration")  into 2.0
> and 1.10 and we support both - image and helm chart for those versions
> only. That would help with people migrating to the latest version.
>
> WDYT?
>
>
> > On Tue, Mar 24, 2020 at 10:49 AM Daniel Imberman <
> > daniel.imber...@gmail.com>
> > wrote:
> >
> > > @jarek I agree completely. I think that pairing an official helm chart
> > > with the official image would make for a REALLY powerful “up and
> running
> > > with airflow” story :). Tomek and I have also been looking into
> > > operator-sdk which has the ability to create custom controllers from
> helm
> > > charts. We might even be able to get a 1-2 punch from the same code base
> :).
> > >
> > > @kaxil @jarek @aizhamal @ash if there are no issues, can we please start
> > the
> > > process of donation?
> > >
> > > +1 on my part, of course :)
> > >
> > >
> > >
> > > Daniel
> > > On Mar 24, 2020, 7:40 AM -0700, Jarek Potiuk <jarek.pot...@polidea.com
> >,
> > > wrote:
> > > > +1. And it should be paired with the official image we have work in
> > > > progress on. I looked a lot at the Astronomer's image while preparing
> > my
> > > > draft and we can make any adjustments needed to make it work with
> the
> > > helm
> > > > chart - and I am super happy to collaborate on that.
> > > >
> > > > PR here: https://github.com/apache/airflow/pull/7832
> > > >
> > > > J.
> > > >
> > > >
> > > > On Tue, Mar 24, 2020 at 3:15 PM Kaxil Naik <kaxiln...@gmail.com>
> > wrote:
> > > >
> > > > > @Tomasz Urbaszek <tomasz.urbas...@polidea.com> :
> > > > > Helm Chart Link: https://github.com/astronomer/airflow-chart
> > > > >
> > > > > On Tue, Mar 24, 2020 at 2:13 PM Tomasz Urbaszek <
> > turbas...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > An official helm chart is something our community needs! Using
> your
> > > > > > chart as the official one makes a lot of sense to me because as you
> > > > > > mentioned - it's battle tested.
> > > > > >
> > > > > > One question: what Airflow image do you use? Also, would you mind
> > > > > > sharing a link to the chart?
> > > > > >
> > > > > > Tomek
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 24, 2020 at 2:07 PM Greg Neiheisel
> > > > > > <g...@astronomer.io.invalid> wrote:
> > > > > > >
> > > > > > > Hey everyone,
> > > > > > >
> > > > > > > Over the past few years at Astronomer, we’ve created, managed,
> > and
> > > > > > hardened
> > > > > > > a production-ready Helm Chart for Airflow (
> > > > > > > https://github.com/astronomer/airflow-chart) that is being
> used
> > by
> > > > > both
> > > > > > our
> > > > > > > SaaS and Enterprise customers. This chart is battle-tested and
> > > running
> > > > > > > hundreds of Airflow deployments of varying sizes and runtime
> > > > > > environments.
> > > > > > > It’s been built up to encapsulate the issues that Airflow users
> > run
> > > > > into
> > > > > > in
> > > > > > > the real world.
> > > > > > >
> > > > > > > While this chart was originally developed internally for our
> > > Astronomer
> > > > > > > Platform, we’ve recently decoupled the chart from the rest of
> our
> > > > > > platform
> > > > > > > to make it usable by the greater Airflow community. With these
> > > changes
> > > > > in
> > > > > > > mind, we want to start a conversation about donating this chart
> > to
> > > the
> > > > > > > Airflow community.
> > > > > > >
> > > > > > > Some of the main features of the chart are:
> > > > > > >
> > > > > > > - It works out of the box. With zero configuration, a user will
> > get
> > > > > a
> > > > > > > postgres database, a default user and the KubernetesExecutor
> > ready
> > > > > to
> > > > > > run
> > > > > > > DAGs.
> > > > > > > - Support for Local, Celery (w/ optional KEDA autoscaling) and
> > > > > > > Kubernetes executors.
> > > > > > >
> > > > > > > Support for optional pgbouncer. We use this to share a
> > configurable
> > > > > > > connection pool size per deployment. Useful for limiting
> > > connections to
> > > > > > the
> > > > > > > metadata database.
> > > > > > >
> > > > > > > - Airflow migration support. A user can push a newer version of
> > > > > > Airflow
> > > > > > > into an existing release and migrations will automatically run
> > > > > > cleanly.
> > > > > > > - Prometheus support. Optionally install and configure a
> > > > > > statsd-exporter
> > > > > > > to ingest Airflow metrics and expose them to Prometheus
> > > > > automatically.
> > > > > > > - Resource control. Optionally control the ResourceQuotas and
> > > > > > > LimitRanges for each deployment so that no deployment can
> > overload
> > > a
> > > > > > > cluster.
> > > > > > > - Simple optional Elasticsearch support.
> > > > > > > - Optional namespace cleanup. Sometimes KubernetesExecutor and
> > > > > > > KubernetesPodOperator pods fail for reasons other than the
> actual
> > > > > > task.
> > > > > > > This feature helps keep things clean in Kubernetes.
> > > > > > > - Support for running locally in KIND (Kubernetes in Docker).
> > > > > > > - Automatically tested across many Kubernetes versions with
> Helm
> > 2
> > > > > > and 3
> > > > > > > support.
> > > > > > >
> > > > > > > We’ve found that the cleanest and most reliable way to deploy
> > DAGs
> > > to
> > > > > > > Kubernetes and manage them at scale is to package them into the
> > > actual
> > > > > > > docker image, so we have geared this chart towards that method
> of
> > > > > > > operation, though adding other methods should be
> straightforward.
> > > > > > >
> > > > > > > We would love thoughts from the community and would love to see
> > > this
> > > > > > chart
> > > > > > > help others to get up and running on Kubernetes!
> > > > > > >
> > > > > > > --
> > > > > > > *Greg Neiheisel* / Chief Architect Astronomer.io
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129 <+48660796129>
> > > > [image: Polidea] <https://www.polidea.com/>
> > >
> >
> >
> > --
> > *Greg Neiheisel* / Chief Architect Astronomer.io
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>


-- 
*Greg Neiheisel* / Chief Architect Astronomer.io
