1fanwang commented on code in PR #67572:
URL: https://github.com/apache/airflow/pull/67572#discussion_r3358893848


##########
airflow-core/src/airflow/dag_processing/manager.py:
##########
@@ -1568,6 +1573,16 @@ def emit_metrics(*, parse_time: float, dag_file_stats: 
Sequence[DagFileStat]):
     stats.gauge("dag_processing.total_parse_time", parse_time)
     stats.gauge("dagbag_size", sum(stat.num_dags for stat in dag_file_stats))
     stats.gauge("dag_processing.import_errors", sum(stat.import_errors for 
stat in dag_file_stats))
+    # Gated by config so large deployments can opt out of the extra DB 
round-trip.
+    # On failure an error counter is incremented so dashboards can alert on
+    # missing samples rather than silently showing a stale last value.
+    if conf.getboolean("scheduler", "emit_serialized_dag_count_metric", 
fallback=True):

Review Comment:
   Flagging something on scope before you put more time in.
   
   Current `main` doesn't actually have an existing 
`SerializedDagModel.get_count()` or the `or 0` / `serialized_dag.count` emit 
that #66796 describes, so this PR is really *adding* a new metric (this gauge, 
plus the config and error counter) rather than the one-line None-safety fix the 
issue framed. That's on me — #66796 read like a small fix when the groundwork 
it references isn't in `main`.
   
   Which raises the question worth settling first: is the 
`serialized_dag.count` gauge itself wanted? It runs a DB `COUNT` every parse 
loop, and nothing emits it today — so the None-masking the issue worried about 
only appears once we add this. If the goal is just None-safe counting, the 
footprint could be much smaller.
   
   (Also fyi the branch has a merge conflict and a failing static check, so it 
isn't mergeable as-is — I'd settle "is this needed" before chasing those.)
   
   If a leaner version turns out worth having, it could be a separate PR to 
compare here. Either way, appreciate you picking it up.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to