vandonr-amz opened a new pull request, #30290: URL: https://github.com/apache/airflow/pull/30290
I think the doc is misleading about this specific metric. It says this is the time it takes to parse ALL DAG files, but that is not the case. There are several reasons why a DAG file can be excluded from a run: https://github.com/apache/airflow/blob/42db6ab4327da1a4bfd215194aee37a3d36c92d3/airflow/dag_processing/manager.py#L1179-L1182 As a result, the metric can be very disconcerting to look at in some circumstances: the number of files parsed varies from run to run, so the values are not directly comparable.

There is a metric for the number of files processed, but it is not mentioned in the doc. I think `total_parse_time` should always be looked at in conjunction with `file_path_queue_size`. They are emitted at different times, though (the queue size is sent before parsing and the time, obviously, after), which might make them hard to combine depending on the visualization. We could eventually move that metric so that both are emitted at the same time.
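
To illustrate reading `total_parse_time` together with `file_path_queue_size`, here is a minimal sketch that pairs each parse-time sample with the most recent queue-size sample emitted before it and derives a rough per-file time. It assumes the samples have already been exported from a metrics backend as (timestamp, value) pairs and that the queue size approximates the number of files parsed in that run. `Sample` and `per_file_parse_time` are hypothetical names used for illustration only; they are not part of Airflow.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One exported metric data point (hypothetical shape, not an Airflow type)."""
    timestamp: float  # seconds, any monotonically comparable clock
    value: float


def per_file_parse_time(
    queue_sizes: List[Sample], parse_times: List[Sample]
) -> List[Tuple[float, float]]:
    """Pair each total_parse_time sample with the latest preceding
    file_path_queue_size sample and return (timestamp, seconds per file)."""
    queue_sizes = sorted(queue_sizes)
    stamps = [s.timestamp for s in queue_sizes]
    results = []
    for parse in sorted(parse_times):
        # Index of the last queue-size sample emitted at or before this parse time.
        idx = bisect_right(stamps, parse.timestamp) - 1
        if idx < 0:
            continue  # no queue-size sample precedes this parse-time sample
        files = queue_sizes[idx].value
        if files <= 0:
            continue  # nothing was queued, so a per-file time is undefined
        results.append((parse.timestamp, parse.value / files))
    return results


if __name__ == "__main__":
    queue = [Sample(0, 10), Sample(100, 4)]
    times = [Sample(30, 20.0), Sample(120, 6.0)]
    for ts, seconds in per_file_parse_time(queue, times):
        print(f"t={ts}: {seconds:.2f} s/file")
```

Pairing by "latest preceding sample" reflects the emission order described above (queue size before parsing, time after). If both metrics were emitted together, this alignment step would no longer be needed.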
