BasPH commented on code in PR #28907:
URL: https://github.com/apache/airflow/pull/28907#discussion_r1070241161


##########
docs/apache-airflow/administration-and-deployment/logging-monitoring/metrics.rst:
##########
@@ -121,6 +121,10 @@ Name                                                       
            Descripti
                                                                        fully 
asynchronous)
 ``triggers.failed``                                                    Number 
of triggers that errored before they could fire an event
 ``triggers.succeeded``                                                 Number 
of triggers that have fired at least one event
+``dataset.updates``                                                    Number 
of updated datasets
+``dataset.orphaned``                                                   Number 
of datasets marked as orphans because they are no longer referenced in DAG
+                                                                       
schedule parameters or task outlets
+``dataset.triggered_runs``                                             Number 
of runs triggered by a dataset update

Review Comment:
   ```suggestion
   ``dataset.triggered_dagruns``                                             
Number of DAG runs triggered by a dataset update
   ```



##########
airflow/jobs/scheduler_job.py:
##########
@@ -1198,6 +1198,7 @@ def _create_dag_runs_dataset_triggered(
                     dag_hash=dag_hash,
                     creating_job_id=self.id,
                 )
+                Stats.incr("dataset.triggered_runs")

Review Comment:
   ```suggestion
                   Stats.incr("dataset.triggered_dagruns")
   ```
   
   Would rename for clarity



##########
airflow/jobs/scheduler_job.py:
##########
@@ -1633,6 +1634,9 @@ def _orphan_unreferenced_datasets(self, session: Session 
= NEW_SESSION) -> None:
                 )
             )
         )
+        orphaned_datasets_count = 0
         for dataset in orphaned_dataset_query:
             self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
             dataset.is_orphaned = expression.true()
+            orphaned_datasets_count += 1
+        Stats.incr("dataset.orphaned", count=orphaned_datasets_count)

Review Comment:
   I think 
   ```suggestion
           for dataset in orphaned_dataset_query:
               self.log.info("Orphaning unreferenced dataset '%s'", dataset.uri)
               dataset.is_orphaned = expression.true()
           Stats.gauge("dataset.orphaned", len(orphaned_dataset_query))
   ```
   
   This should be a gauge instead of a counter, because datasets can also be 
"un-orphaned".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to