ashb commented on code in PR #24743:
URL: https://github.com/apache/airflow/pull/24743#discussion_r912821200


##########
airflow/models/dagrun.py:
##########
@@ -631,6 +631,32 @@ def update_state(
         session.merge(self)
         # We do not flush here for performance reasons(It increases queries count by +20)
 
+        from airflow.models import Dataset
+        from airflow.models.dataset_dag_run_event import DatasetDagRunEvent as DDRE
+        from airflow.models.serialized_dag import SerializedDagModel
+
+        datasets = []
+        for task in self.dag.tasks:
+            for outlet in getattr(task, '_outlets', []):
+                if isinstance(outlet, Dataset):
+                    datasets.append(outlet)
+        dataset_ids = [x.get_dataset_id(session=session) for x in datasets]
+        events_to_process = session.query(DDRE).filter(DDRE.dataset_id.in_(dataset_ids)).all()

Review Comment:
   > Make sure create_dagrun essentially does an INSERT IGNORE or ON CONFLICT
   > DO NOTHING, so that if the dag run has already been created it just does
   > nothing. I believe some version of this is supported on all our databases.
   
   
   The other option is SAVEPOINT/ROLLBACK TO, as I'm not sure MySQL 5.7 (which
we still do support, sadly) has an equivalent of this.
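
A minimal sketch of the SAVEPOINT/ROLLBACK TO idea, written against the standard-library `sqlite3` module so it runs on its own. It is not Airflow code: the `dag_run` and `audit` tables and the `create_dag_run_if_absent` helper are hypothetical stand-ins. With a SQLAlchemy session, the equivalent would presumably be wrapping the insert in `session.begin_nested()`.

```python
import sqlite3


def create_dag_run_if_absent(conn, dag_id, run_id):
    """Insert a dag_run row inside a savepoint (hypothetical helper).

    Returns True if the row was inserted, False if a row with the same
    (dag_id, run_id) already existed. On conflict, only the work done since
    the savepoint is undone; the enclosing transaction stays usable.
    """
    conn.execute("SAVEPOINT create_dag_run")
    try:
        conn.execute(
            "INSERT INTO dag_run (dag_id, run_id) VALUES (?, ?)",
            (dag_id, run_id),
        )
    except sqlite3.IntegrityError:
        # Undo just the failed insert, then discard the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT create_dag_run")
        conn.execute("RELEASE SAVEPOINT create_dag_run")
        return False
    conn.execute("RELEASE SAVEPOINT create_dag_run")
    return True


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN/COMMIT below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, UNIQUE (dag_id, run_id))"
)
conn.execute("CREATE TABLE audit (note TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO audit (note) VALUES ('earlier work in the same transaction')")
first = create_dag_run_if_absent(conn, "consumer_dag", "dataset_run_1")
second = create_dag_run_if_absent(conn, "consumer_dag", "dataset_run_1")
conn.execute("COMMIT")

row_count = conn.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]
audit_count = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
```

The point of the savepoint is that the duplicate insert can fail without throwing away the rest of the transaction. On PostgreSQL, for example, a failed statement otherwise leaves the whole transaction aborted.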



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
