Hey,

As Bolke said, with the LocalExecutor (LE) and tasks consuming variable amounts of memory, you can run into memory issues on a container. I'd reconsider running in a containerized environment at all: with the LE, the scheduler and all task processes run in the same place, so you'd have to provision one very large container for it to work. You're probably better off on a plain EC2 instance for that. With the LE you don't need Redis at all; Redis serves as the broker/back-end for the CeleryExecutor, not the LocalExecutor.
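To make that concrete, here's a rough airflow.cfg sketch (host names and credentials are placeholders, and exact key names can vary a bit between Airflow versions). With the LocalExecutor you only point Airflow at the metadata database; the [celery] block only matters if you later switch to the CeleryExecutor, and [core] parallelism is the main knob for capping how many task processes the LocalExecutor runs at once on a single box:

    [core]
    executor = LocalExecutor
    # metadata database, e.g. your RDS postgres instance
    sql_alchemy_conn = postgresql+psycopg2://airflow:PASSWORD@your-rds-host:5432/airflow
    # upper bound on concurrently running task processes
    parallelism = 8

    # only needed if you move to the CeleryExecutor
    [celery]
    broker_url = redis://your-redis-host:6379/0
    celery_result_backend = db+postgresql://airflow:PASSWORD@your-rds-host:5432/airflow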
We used the CeleryExecutor with Redis in a spike on ECS. Indeed, logging is the biggest issue here. We used static IPs and hostnames for the containers we started (which doesn't necessarily make them "cattle"). We closed that off and used Splunk to get all logging output into a centralized location. I didn't spend enough time to think through all the implications there, though: the web UI is handy for seeing the log output of a specific task run, for example, and through Splunk you lose that view.

There were issues with memory usage and OOM kills. Memory gets reserved by the container, so if anything restarts or gets unstable, look at that first (rough sketch of the relevant ECS memory settings at the bottom of this mail).

To synchronize DAGs across all VMs, we experimented with EFS (works like NFS); the idea was to let CI deploy onto it as the single writer.

Rgds,
Gerard

On Thu, Nov 2, 2017 at 6:55 PM, Shoumitra Srivastava <[email protected]>
wrote:

> Hi guys,
>
> So far we have had a lot of success testing out Airflow and we are now
> going for a full scale deployment. To that end, we are considering
> dockerizing airflow and deploying it on one of our ECS clusters. We are
> planning on separating out the web server and the scheduler to separate
> tasks and using local executor with an RDS postgres and redis backend. Does
> anyone else have any suggestions regarding the setup? Any design patterns
> or good practices and gotchas would be welcome.
>
> -Shoumitra
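PS on the OOM point: below is a hedged sketch of the memory settings in an ECS container definition (the name, image, and numbers are made-up placeholders). "memory" is the hard limit at which ECS will OOM-kill the container, and "memoryReservation" is the soft reservation used for placement; with the LocalExecutor the scheduler container needs headroom for the scheduler plus every concurrently running task process.

    {
      "name": "airflow-scheduler",
      "image": "your-registry/airflow:latest",
      "command": ["airflow", "scheduler"],
      "memoryReservation": 2048,
      "memory": 6144
    }

If task memory can spike past the hard limit, you'll see exactly the restarts described above, so either raise "memory" or cap [core] parallelism in airflow.cfg.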
