We're actively following the Airflow/Kubernetes integration work <https://issues.apache.org/jira/browse/AIRFLOW-1314>, and we eventually plan to run everything on k8s and use KubernetesExecutors for many things, but we've deployed Airflow to ECS from day one. It works mostly fine, and we use a tool we open-sourced called Broadside <https://github.com/lumoslabs/broadside> to simplify configuration and deployment. Our deployment is split into one scheduler, one Flower instance, a few web servers, and a number of workers, using CeleryExecutor backed by Redis/ElastiCache (and RDS Postgres, as you're suggesting), all running in ECS from the same private Docker image.
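To make that layout a bit more concrete, here is a rough Python sketch of how one image can serve all four roles, with only the container command differing. Everything specific in it is an assumption for illustration: the hostnames, credentials, image name, and replica counts are placeholders, putting the Celery result backend in Postgres is my guess at a reasonable setup, and the command names and the `CELERY_RESULT_BACKEND` key follow the Airflow 1.x conventions, so check them against the version you run.

```python
# Hypothetical values throughout: hostnames, credentials, image name, and
# replica counts are placeholders, not details of a real deployment.
SHARED_ENV = {
    "AIRFLOW__CORE__EXECUTOR": "CeleryExecutor",
    # Metadata database (RDS Postgres).
    "AIRFLOW__CORE__SQL_ALCHEMY_CONN":
        "postgresql+psycopg2://airflow:CHANGE_ME@rds.example.internal:5432/airflow",
    # Celery broker (Redis on ElastiCache).
    "AIRFLOW__CELERY__BROKER_URL": "redis://redis.example.internal:6379/0",
    # Assumption: results stored in Postgres. Key name as in Airflow 1.8/1.9;
    # other releases may name this option differently.
    "AIRFLOW__CELERY__CELERY_RESULT_BACKEND":
        "db+postgresql://airflow:CHANGE_ME@rds.example.internal:5432/airflow",
}

# One entry per role; every role runs the same image with a different command.
# Commands use the Airflow 1.x CLI names.
COMPONENTS = {
    "scheduler": {"command": ["airflow", "scheduler"], "replicas": 1},
    "flower": {"command": ["airflow", "flower"], "replicas": 1},
    "webserver": {"command": ["airflow", "webserver"], "replicas": 2},
    "worker": {"command": ["airflow", "worker"], "replicas": 4},
}


def container_definition(name, image):
    """Build an ECS-style container definition fragment for one role."""
    spec = COMPONENTS[name]
    return {
        "name": name,
        "image": image,
        "command": list(spec["command"]),
        # ECS expects environment variables as a list of name/value pairs.
        "environment": [
            {"name": key, "value": value}
            for key, value in sorted(SHARED_ENV.items())
        ],
    }


if __name__ == "__main__":
    for role in COMPONENTS:
        print(container_definition(role, "example/airflow:latest"))
```

The point of the sketch is only that configuration is shared and the role is picked by the command, which is what lets a single private image back every component.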
Tacking on to what Bolke is saying, in our experience it is also somewhat tricky to get deploys right in ECS with CeleryExecutor. Our first impulse was to bake the DAG directory/repo into the Docker image and run an ECS deploy every time we added or updated DAGs, bouncing all of the components and killing the workers. Where we wound up is that our CI system still bakes the DAG directory into the images when we merge to master, but for a "short" deploy we only bounce the web server and scheduler; the worker containers all just execute `git pull` and pull down the new/updated DAGs. Others may have different approaches that work, I'm sure, possibly moving the DAG directory to a shared EFS mount.

On Thu, Nov 2, 2017 at 11:06 AM, Bolke de Bruin <[email protected]> wrote:

> Please remember that with the LocalExecutor your tasks run in
> process(group) with the scheduler. If you want to restart the scheduler, it
> will need to wait until all tasks have finished that are currently running.
> In addition, if your tasks are resource intensive (cpu, memory) this can also
> affect the scheduler. In 1.9.0 we are a little bit more robust in this
> respect, but guarding against OOM errors is very hard.
>
> Furthermore, the new logging framework in 1.9.0 will allow you to have
> logs centrally, which might be convenient. However, documentation is not up
> to date so you will have to tune it yourself.
>
> My 2 cents,
>
> Bolke.
>
> > On 2 Nov 2017, at 18:55, Shoumitra Srivastava <[email protected]>
> > wrote:
> >
> > Hi guys,
> >
> > So far we have had a lot of success testing out Airflow and we are now
> > going for a full scale deployment. To that end, we are considering
> > dockerizing airflow and deploying it on one of our ECS clusters. We are
> > planning on separating out the web server and the scheduler to separate
> > tasks and using local executor with an RDS postgres and redis backend. Does
> > anyone else have any suggestions regarding the setup? Any design patterns
> > or good practices and gotchas would be welcome.
> >
> > -Shoumitra
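
P.S. In case it helps, the "short" versus full deploy split I described at the top can be sketched roughly like this in Python. This is an illustration under assumptions, not our actual tooling: the function names and the DAG path are hypothetical, `--ff-only` is a choice I'm making here so a worker never ends up with a surprise merge, and how the pull gets triggered on each worker container is left out entirely.

```python
import subprocess

# Roles in the deployment described above.
ALL_COMPONENTS = ("scheduler", "flower", "webserver", "worker")


def components_to_restart(short_deploy):
    """Return the roles to bounce for a given kind of deploy.

    A short deploy leaves workers running (they refresh DAGs in place);
    a full deploy bounces every component from the freshly built image.
    """
    if short_deploy:
        return ("webserver", "scheduler")
    return ALL_COMPONENTS


def refresh_dags(dag_repo_dir, dry_run=False):
    """Pull new/updated DAGs into an existing checkout on a worker.

    dag_repo_dir is a hypothetical path to the DAG repo inside the container.
    With dry_run=True the command is returned without being executed.
    """
    cmd = ["git", "-C", dag_repo_dir, "pull", "--ff-only"]
    if not dry_run:
        # check=True raises CalledProcessError if the pull fails.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    print(components_to_restart(short_deploy=True))
    print(refresh_dags("/usr/local/airflow/dags", dry_run=True))
```

The trade-off is that the running worker containers drift from the image CI built, so the image still has to carry the DAGs for any container that starts fresh.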
