Recommended backend metastore for Airflow

2018-12-10 Thread ramandumcs
Hi All, It seems that Airflow supports MySQL, PostgreSQL and MSSQL as backend stores. Is there any recommendation for using one over the others? We expect to run thousands of concurrent DAGs, which would put heavy load on the backend store. Any pointers on this would be useful. Thanks, Raman Gupta

Re: Recommended backend metastore for Airflow

2018-12-10 Thread Ash Berlin-Taylor
Postgres. Friends don't let friends use MySQL is my personal rule. (I can get into the reasons if you'd like, but the short version is I find Postgres has behaviour that is more compliant with the SQL standard, and a much better query planner.) -ash > On 10 Dec 2018, at 15:10, ramandu...@gmail.com wrote

Re: [RESULT] Graduate Apache Airflow as a TLP

2018-12-10 Thread Kevin Dasilva
Congratulations! Such a huge accomplishment for Airflow, and the community supporting it. Cheers, from sunny south Florida! On Sun, Dec 9, 2018 at 3:03 PM Kaxil Naik wrote: > Awesome, thanks @Jakob, that is great news. A good Christmas present ;) > > On Sun, Dec 9, 2018, 19:18 Sid Anand > >

Re: Recommended backend metastore for Airflow

2018-12-10 Thread airflowuser
Definitely PostgreSQL. https://www.2ndquadrant.com/en/postgresql/postgresql-vs-mysql/ ‐‐‐ Original Message ‐‐‐ On Monday, December 10, 2018 5:10 PM, ramandu...@gmail.com wrote: > Hi All, > > It seems that Airflow supports mysql, postgresql and mssql as backend store. > Any recommend

Re: Recommended backend metastore for Airflow

2018-12-10 Thread ramandumcs
Thanks Ash, We are trying to run 1000 concurrent DAGs and are facing scalability issues with MySQL, so we are exploring other backend stores, PostgreSQL and MSSQL. Any recommendations on Airflow config such as heartbeat interval, pool size, etc. to support this much workload? Thanks, Raman Gupta On 2018/

run tasks parallel in airflow

2018-12-10 Thread lkeswar
Hi, Can you please guide us on how to run tasks in parallel in Airflow? I need to run almost 14 tasks in parallel. In the config file the executor is set to 'celery'. Is there anything else I need to pay attention to in order to run the tasks this way? Thanks, Lokesh

Re: run tasks parallel in airflow

2018-12-10 Thread Sai Phanindhra
Hi, Along with the Celery worker, you might have to configure DAG concurrency and the number of Celery workers to get maximum parallelism in Airflow. As long as the scheduler has enough resources to schedule tasks and there are enough workers to run the jobs, you can run a lot of tasks in parallel in Airflow. On Tue
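
For illustration only, the settings mentioned in this reply could be sketched as an airflow.cfg fragment generated from Python. This is a rough sketch, not a recommended configuration: the option names (executor, parallelism, dag_concurrency, worker_concurrency) are assumed from Airflow 1.10-era configuration and may differ in other versions, the helper build_airflow_cfg is hypothetical, and the numbers are placeholders sized for roughly 14 parallel tasks.

```python
import configparser
import io


def build_airflow_cfg(task_slots: int = 16) -> str:
    """Render a minimal airflow.cfg fragment for parallel task execution.

    Option names are assumed from Airflow 1.10-era config files; check them
    against the configuration reference of the installed version.
    """
    cfg = configparser.ConfigParser()
    cfg["core"] = {
        # Hand tasks to Celery workers instead of running them locally.
        "executor": "CeleryExecutor",
        # Upper bound on task instances running across the whole installation.
        "parallelism": str(task_slots * 2),
        # Upper bound on task instances running concurrently within one DAG.
        "dag_concurrency": str(task_slots),
    }
    cfg["celery"] = {
        # Number of task slots offered by each Celery worker process.
        "worker_concurrency": str(task_slots),
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder sizing: 16 slots leaves headroom above 14 parallel tasks.
    print(build_airflow_cfg(16))
```

The generated text is only a fragment to compare against an existing airflow.cfg; whether 14 tasks actually run in parallel also depends on the scheduler having enough resources and on enough workers being started, as noted above.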