Hi,

We have a requirement to scale to thousands of concurrent DAGs. With the CeleryExecutor, we observed that the Airflow worker sometimes gets stuck if the connection to Redis/MySQL breaks (https://github.com/celery/celery/issues/3932, https://github.com/celery/celery/issues/4457). We are currently using Airflow 1.9 with the LocalExecutor, but we are planning to switch to Airflow 1.10 with the Kubernetes executor.

Thanks,
Raman Gupta

On 2018/09/05 12:56:38, Deng Xiaodong <xd.den...@gmail.com> wrote:
> Hi folks,
>
> May you kindly share how your organization is setting up Airflow and using
> it? Especially in terms of architecture. For example,
>
> - *Setting-Up*: Do you install Airflow in a "one-time" fashion, or
>   containerization fashion?
> - *Executor:* Which executor are you using (*LocalExecutor*,
>   *CeleryExecutor*, etc)? I believe most production environments are using
>   *CeleryExecutor*?
> - *Scale*: If using Celery, normally how many worker nodes do you add? (for
>   sure this is up to workloads and performance of your worker nodes).
> - *Queue*: if Queue feature
>   <https://airflow.apache.org/concepts.html#queues> is used in your
>   architecture? For what advantage? (for example, explicitly assign
>   network-bound tasks to a worker node whose parallelism can be much higher
>   than its # of cores)
> - *SLA*: do you have any SLA for your scheduling? (this is inspired by
>   @yrqls21's PR 3830 <https://github.com/apache/incubator-airflow/pull/3830>)
> - etc.
>
> Airflow's setting-up can be quite flexible, but I believe there is some
> sort of best practice, especially in the organisations where scalability is
> essential.
>
> Thanks for sharing in advance!
>
>
> Best regards,
> XD