michaelmicheal commented on code in PR #29441:
URL: https://github.com/apache/airflow/pull/29441#discussion_r1102819882
##########
airflow/www/views.py:
##########
@@ -3715,7 +3715,6 @@ def next_run_datasets(self, dag_id):
DatasetEvent,
and_(
DatasetEvent.dataset_id == DatasetModel.id,
- DatasetEvent.timestamp > DatasetDagRunQueue.created_at,
Review Comment:
> However, I would also remove the and_ around it since then there would only be one filter condition in that join:

Yes, you're right, the `and_` becomes unnecessary.
I think there might be some confusion around DDRQ (`DatasetDagRunQueue`). My
understanding is that when a `DatasetEvent` is created, a DDRQ record is
created for each consuming DAG. Then, once a DAG has an associated DDRQ record
for every `Dataset` it depends on, a dag_run is created and all DDRQ records
associated with that DAG are deleted.
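That lifecycle can be sketched in-memory like this (a rough illustration only;
the DAG ids, dataset ids, and the `record_dataset_event` helper are
hypothetical, not Airflow's actual scheduler code):

```python
# DAG -> set of datasets it consumes (hypothetical example data).
dag_dependencies = {
    "daily_report": {"ds_a", "ds_b"},
    "hourly_sync": {"ds_a"},
}

ddrq = set()         # pending (dag_id, dataset_id) queue records
triggered_runs = []  # dag_runs created so far


def record_dataset_event(dataset_id):
    # 1) A DDRQ record is created per consuming DAG.
    for dag_id, deps in dag_dependencies.items():
        if dataset_id in deps:
            ddrq.add((dag_id, dataset_id))
    # 2) Any DAG that now has a DDRQ record for every dataset it depends
    #    on gets a dag_run, and its DDRQ records are deleted.
    for dag_id, deps in dag_dependencies.items():
        if all((dag_id, d) in ddrq for d in deps):
            triggered_runs.append(dag_id)
            for d in deps:
                ddrq.discard((dag_id, d))


record_dataset_event("ds_a")
# hourly_sync depends only on ds_a, so it triggers immediately;
# daily_report still waits on ds_b, so its DDRQ record stays queued.
```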
> If you go for option 2, I think you should be able to compare the existence and creation time of the DDRQ with the DatasetEvent timestamp to figure out whether or not the last update time has already triggered a DDRQ/DagRun or if it has partially satisfied the conditions of a future DagRun.

As I understand it, if there are DDRQ records for a DAG, we can assume that
there hasn't been a DagRun triggered since the last `DatasetEvent` (because we
delete DDRQ records on the creation of a DagRun).
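In other words, DDRQ existence alone is enough for that inference. A minimal
sketch, assuming hypothetical in-memory DDRQ rows as `(dag_id, dataset_id)`
pairs and an illustrative helper name (not Airflow's actual query):

```python
def run_pending_since_last_event(ddrq_rows, dag_id):
    """True if any DDRQ record exists for dag_id, i.e. the latest
    DatasetEvent(s) have not yet been consumed by a DagRun (rows are
    deleted when the run is created)."""
    return any(row_dag == dag_id for row_dag, _ in ddrq_rows)


ddrq_rows = {("daily_report", "ds_a")}
run_pending_since_last_event(ddrq_rows, "daily_report")  # True
run_pending_since_last_event(ddrq_rows, "hourly_sync")   # False
```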
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]