Re: Benchmarking of Airflow Scheduler with Celery Executor

2018-04-18 Thread yrqls21
On 2018/04/13 17:00:36, Maxime Beauchemin wrote:
> If you're concerned about scheduler scalability I'd go with a bigger box.
> The scheduler uses multiprocessing so more CPU power means more throughput.
>
> Also you may want to provision a beefy MySQL box to make

Re: Benchmarking of Airflow Scheduler with Celery Executor

2018-04-13 Thread Maxime Beauchemin
If you're concerned about scheduler scalability I'd go with a bigger box. The scheduler uses multiprocessing so more CPU power means more throughput. Also you may want to provision a beefy MySQL box to make sure that doesn't become the bottleneck. 10k tasks heartbeating to the DB every 30 seconds
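
For a rough sense of the database load behind the "10k tasks heartbeating every 30 seconds" figure, here is a back-of-the-envelope sketch. It assumes each running task issues exactly one DB write per heartbeat and that heartbeats are spread evenly over the interval; neither assumption comes from the thread, and the function name is purely illustrative.

```python
def heartbeat_writes_per_second(running_tasks: int, heartbeat_interval_s: float) -> float:
    """Average heartbeat writes per second hitting the metadata DB.

    Assumption (not from the thread): one write per task per heartbeat,
    evenly distributed across the interval.
    """
    if heartbeat_interval_s <= 0:
        raise ValueError("heartbeat_interval_s must be positive")
    return running_tasks / heartbeat_interval_s


# Figures quoted in the message: 10k tasks, 30-second heartbeat.
rate = heartbeat_writes_per_second(10_000, 30)
print(f"~{rate:.0f} heartbeat writes per second")
```

This only counts heartbeats; scheduler queries and task state updates would add to the load on the same database.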

Re: Benchmarking of Airflow Scheduler with Celery Executor

2018-04-13 Thread ramandumcs
Thanks Ry. Just wondering if there is an approximate number for how many concurrent tasks a scheduler can run on, say, a machine with 16 GB of RAM and 8 cores. If that has already been measured, it would be useful. We did some benchmarking with the local executor and observed that each TaskInstance was taking ~100 MB of memory so
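
As a rough illustration of the memory figure above, the sketch below estimates how many task instances fit in RAM if each takes about 100 MB. The `reserved_gb` headroom for the OS and scheduler is a hypothetical parameter, not a number from the thread, and the estimate ignores CPU limits entirely.

```python
def max_tasks_by_memory(total_ram_gb: float, per_task_mb: float, reserved_gb: float = 0.0) -> int:
    """Upper bound on concurrent task instances, limited by memory only.

    reserved_gb is a hypothetical allowance for the OS and the scheduler
    itself; it is an assumption, not a measured value.
    """
    usable_mb = (total_ram_gb - reserved_gb) * 1024
    return int(usable_mb // per_task_mb)


# 16 GB box, ~100 MB per TaskInstance (the observed figure in the message).
print(max_tasks_by_memory(16, 100))                  # no headroom reserved
print(max_tasks_by_memory(16, 100, reserved_gb=2))   # with an assumed 2 GB reserved
```

On these assumptions memory alone caps a 16 GB machine at well under 200 concurrent task instances, which is why the per-task footprint matters for the benchmarking question.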

Re: Benchmarking of Airflow Scheduler with Celery Executor

2018-04-12 Thread Ry Walker
Hi Raman - First, we’d be happy to help you test this out with Airflow. Or you could do it yourself by using http://open.astronomer.io/airflow/ (w/ Docker Engine + Docker Compose) to quickly spin up a test environment. Everything is hooked to Prometheus/Grafana to monitor how the system reacts to