[ 
https://issues.apache.org/jira/browse/AIRFLOW-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936931#comment-16936931
 ] 

ASF subversion and git services commented on AIRFLOW-774:
---------------------------------------------------------

Commit 885ed13b8294268826e133ba2ecd3e4523a2f496 in airflow's branch 
refs/heads/v1-10-test from Ash Berlin-Taylor
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=885ed13 ]

[AIRFLOW-774] Fix long-broken DAG parsing Statsd metrics (#6157)

Since we switched to using sub-processes to parse the DAG files sometime
back in 2016(!), the metrics we have been emitting about dag bag size and
parsing have been incorrect.

We have also been emitting metrics from the webserver, which is going to
become wrong as we move towards a stateless webserver.

To fix both of these issues I have stopped emitting the metrics from
models.DagBag and only emit them from inside the
DagFileProcessorManager.
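
As a rough illustration of the idea (not the code from this commit), the
sketch below sums per-file parsing results in one place and emits each gauge
once, rather than letting every parsing sub-process report its own partial
view. The `FileStat` shape, the `RecordingStats` stand-in client and the
`emit_aggregated_metrics` helper are all hypothetical; the metric names are
borrowed from the issue title and may not match what Airflow emits.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FileStat:
    """Result of parsing one DAG file in a sub-process (hypothetical shape)."""
    num_dags: int
    import_errors: int
    duration_seconds: float


class RecordingStats:
    """Stand-in for a Statsd client: records gauges instead of sending them."""

    def __init__(self) -> None:
        self.gauges: Dict[str, float] = {}

    def gauge(self, name: str, value: float) -> None:
        self.gauges[name] = value


def emit_aggregated_metrics(file_stats: List[FileStat], stats: RecordingStats) -> None:
    """Emit one value per metric, summed over every parsed file.

    Assumption: this runs in the single managing process, which has
    collected the results reported back by the parsing sub-processes.
    """
    stats.gauge("dagbag_size", sum(s.num_dags for s in file_stats))
    stats.gauge("dagbag_import_errors", sum(s.import_errors for s in file_stats))
    stats.gauge("collect_dags", sum(s.duration_seconds for s in file_stats))


# Usage: two files parsed by separate sub-processes, reported once in total.
stats = RecordingStats()
emit_aggregated_metrics([FileStat(3, 0, 1.5), FileStat(2, 1, 0.5)], stats)
```

Emitting from a single aggregation point means the reported DAG count covers
all files, instead of whichever sub-process happened to report last.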

(There was also a bug in the `dag.loading-duration.*` metric we were
emitting from the DagBag code: the "dag_file" part of the metric name was
empty. I have fixed that even though I have now deprecated that metric. The
webserver was emitting the right metric, though, so many people wouldn't
have noticed.)

(cherry picked from commit 5f9ab7a1d5cd4540e953005d43898be08ed56d60)


> dagbag_size/collect_dags/dagbag_import_errors stats incorrect
> -------------------------------------------------------------
>
>                 Key: AIRFLOW-774
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-774
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging
>            Reporter: Dan Davydov
>            Assignee: Ash Berlin-Taylor
>            Priority: Major
>             Fix For: 1.10.6
>
>
> After the multiprocessor change was made (dag folders are processed in 
> parallel), the number of dags reported by airflow is the count for each of 
> these subprocesses, which is inaccurate and potentially orders of magnitude 
> less than the actual number of dags. These individual process stats should 
> be aggregated. The collect_dags (time it takes to parse the dags) and 
> dagbag_import_errors stats should also be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
