Re: [PR] Implement the dataset_dags API endpoint for retrieving DAGs related to a specific dataset [airflow]

via GitHub Thu, 21 Mar 2024 06:21:20 -0700


pankajkoti commented on code in PR #38369:
URL: https://github.com/apache/airflow/pull/38369#discussion_r1533885409



##########
airflow/www/views.py:
##########
@@ -1082,6 +1088,54 @@ def datasets(self):
             state_color_mapping=state_color_mapping,
         )
 
+    @expose("/dataset_dags/<int:dataset_id>")
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DATASET),
+        ]
+    )
+    @action_logging
+    def retrieve_related_dags_for_dataset(self, dataset_id: int):
+        """
+        Retrieve related DAGs for a specific dataset.
+
+        :param dataset_id: The dataset id
+        """
+        with create_session() as session:
+            dataset = session.query(DatasetModel).filter_by(id=dataset_id).one_or_none()
+            if not dataset:
+                return flask.json.jsonify({"error": "Dataset not found"}), 404
+
+            # Fetch all consuming DAGs (downstream)
+            consuming_dags_refs = (
+                session.query(DagScheduleDatasetReference).filter_by(dataset_id=dataset_id).all()
+            )
+            consuming_dags = [
+                {
+                    "dag_id": dag_ref.dag_id,
+                    "is_paused": session.query(DagModel).filter_by(dag_id=dag_ref.dag_id).one().is_paused,

Review Comment:
   Could we reduce the number of DB calls here?
   Instead of issuing a separate query for each item while iterating over
   `consuming_dags_refs`, can we first collect all the dag_ids from
   `consuming_dags_refs` and then make a single query to DagModel with those
   DAG IDs to get their is_paused values?
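
To illustrate the batching idea, here is a rough sketch rather than a drop-in patch. It uses the standard-library `sqlite3` module with simplified stand-in tables (the table layout, the sample rows, and the helper name `consuming_dags_for` are assumptions made for this example, not Airflow's actual schema or code). In the view itself the same pattern would presumably be a single SQLAlchemy query filtered with `DagModel.dag_id.in_(dag_ids)`.

```python
import sqlite3

# Simplified stand-in schema (an assumption for this sketch, not Airflow's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER);
    CREATE TABLE dag_schedule_dataset_reference (dataset_id INTEGER, dag_id TEXT);
    INSERT INTO dag VALUES ('consumer_a', 0), ('consumer_b', 1), ('unrelated', 0);
    INSERT INTO dag_schedule_dataset_reference VALUES (1, 'consumer_a'), (1, 'consumer_b');
    """
)


def consuming_dags_for(conn, dataset_id):
    """Hypothetical helper: build the consuming-DAG list with two queries in total."""
    # Query 1: collect every dag_id that is scheduled on this dataset.
    dag_ids = [
        row[0]
        for row in conn.execute(
            "SELECT dag_id FROM dag_schedule_dataset_reference "
            "WHERE dataset_id = ? ORDER BY dag_id",
            (dataset_id,),
        )
    ]
    if not dag_ids:
        return []

    # Query 2: fetch is_paused for all of those DAGs at once with an IN clause,
    # instead of running one query per reference inside the loop.
    placeholders = ", ".join("?" for _ in dag_ids)
    paused_by_dag_id = dict(
        conn.execute(
            f"SELECT dag_id, is_paused FROM dag WHERE dag_id IN ({placeholders})",
            dag_ids,
        )
    )

    # Like .one() in the PR, the lookup below raises if a referenced DAG row is missing.
    return [
        {"dag_id": dag_id, "is_paused": bool(paused_by_dag_id[dag_id])}
        for dag_id in dag_ids
    ]


result = consuming_dags_for(conn, 1)
```

This keeps the number of queries at two no matter how many DAGs consume the dataset, whereas the per-item lookup grows with the number of references.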


