ashb commented on issue #5615: [AIRFLOW-5035] Remove multiprocessing.Manager in-favour of Pipes
URL: https://github.com/apache/airflow/pull/5615#issuecomment-515565720

I haven't got nice graphs, but the "scheduler overhead" (which I've defined as the time from dag run start to dag run end, minus time spent in executors) seems relatively stable and small.

Both of these are for the first 5 dags activated (not all 20). This isn't conclu

**1.10.4rc3**: 09.022235 (stddev 2.731696s, 41 runs)
**With this branch**: 8.931782 (3.937999s, 30 dag runs)

This is not the most exhaustive benchmark, but it indicates that for light-to-medium loads the change doesn't affect things very much.

This is the query I used to get the data:

```sql
WITH summary AS (
  SELECT dag_run.dag_id,
         dag_run.execution_date,
         dag_run.state,
         dag_run.end_date - dag_run.start_date AS duration,
         dag_run.start_date - (dag_run.execution_date + interval '10 minutes') AS schedule_delay,
         max(task_instance.end_date) - min(task_instance.start_date) AS total_ti_exec_time,
         avg(task_instance.start_date - task_instance.queued_dttm) AS avg_queued_time
  FROM dag_run
  JOIN task_instance USING (dag_id, execution_date)
  GROUP BY dag_run.dag_id, dag_run.execution_date, dag_run.state, dag_run.end_date, dag_run.start_date
  ORDER BY execution_date
),
data AS (
  SELECT *, duration - total_ti_exec_time AS scheduler_overhead
  FROM summary
)
SELECT avg(scheduler_overhead),
       (stddev(extract('epoch' from scheduler_overhead)) || ' seconds')::interval AS stddev,
       count(*) AS num_runs
FROM data
```
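
For readers who want to see the arithmetic behind the query, here is a minimal Python sketch of the same "scheduler overhead" calculation: dag run duration minus the span from the earliest task instance start to the latest task instance end. The function name `scheduler_overhead`, the tuple layout, and the three sample runs are all made up for illustration; they are not Airflow APIs or real benchmark data.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def scheduler_overhead(run_start, run_end, task_windows):
    """Dag run duration minus the span covered by its task instances.

    task_windows is a list of (start_date, end_date) pairs, one per task
    instance (hypothetical layout, mirroring the columns used in the query).
    """
    duration = run_end - run_start
    first_start = min(start for start, _ in task_windows)
    last_end = max(end for _, end in task_windows)
    return duration - (last_end - first_start)


def at(seconds):
    """Helper: a timestamp `seconds` after an arbitrary reference time."""
    return datetime(2019, 7, 26, 12, 0, 0) + timedelta(seconds=seconds)


# Made-up dag runs: (run start, run end, [(task start, task end), ...])
runs = [
    (at(0), at(30), [(at(4), at(25))]),
    (at(0), at(40), [(at(3), at(20)), (at(20), at(33))]),
    (at(0), at(20), [(at(5), at(17))]),
]

overheads = [scheduler_overhead(*run).total_seconds() for run in runs]

print("avg overhead (s):", mean(overheads))
print("stddev (s):", stdev(overheads))  # sample stddev, like PostgreSQL's stddev()
print("num_runs:", len(overheads))
```

Like the query, this treats everything outside the first-start to last-end window as overhead, so gaps between tasks inside that window are not counted.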
