jx2lee commented on code in PR #45175:
URL: https://github.com/apache/airflow/pull/45175#discussion_r1897421753
##########
airflow/timetables/simple.py:
##########
@@ -137,14 +137,18 @@ def next_dagrun_info(
) -> DagRunInfo | None:
if restriction.earliest is None: # No start date, won't run.
return None
+
+ current_time = timezone.coerce_datetime(timezone.utcnow())
+
if last_automated_data_interval is not None: # has already run once
start = last_automated_data_interval.end
- end = timezone.coerce_datetime(timezone.utcnow())
+ end = current_time
+
+ if start > end: # Skip scheduling if the last run ended in the future
+ return None
Review Comment:
I missed that `return None` prevents the DAG from ever running again. So,
instead of returning `None`, how about setting `start` and `end` as shown
below when the start date is in the future?
```python
if last_automated_data_interval is not None:  # has already run once
    if last_automated_data_interval.end > current_time:  # start date is in the future
        start = restriction.earliest
        # length of the previous data interval
        elapsed = last_automated_data_interval.end - last_automated_data_interval.start
        end = start + elapsed.as_timedelta()
    else:
        start = last_automated_data_interval.end
        end = current_time
```
`start` is set to `restriction.earliest`, and `end` is calculated by adding
the length of the previous data interval to `start`. This way, I expect the
DAG to still run even if the start date is set in the future. Could this
approach cause any problems? (Also, since date calculations are involved,
would it be better to move this into a separate function?)
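
For reference, here is a rough sketch of what such a separate function might look like. It uses only the standard library, so plain `datetime` values stand in for the pendulum objects Airflow uses, and the name `next_interval` and its parameters are hypothetical, not existing Airflow API.

```python
from __future__ import annotations

from datetime import datetime, timezone


def next_interval(
    last_start: datetime,
    last_end: datetime,
    earliest: datetime,
    now: datetime,
) -> tuple[datetime, datetime]:
    """Return (start, end) for the next run. Hypothetical helper, not Airflow API."""
    if last_end > now:
        # The previous interval ends in the future: restart from the earliest
        # allowed date and reuse the length of the previous interval.
        start = earliest
        end = start + (last_end - last_start)
    else:
        # Normal case: continue from where the previous interval stopped.
        start = last_end
        end = now
    return start, end


# Illustrative values only.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
earliest = datetime(2025, 1, 1, tzinfo=timezone.utc)

# Previous interval lies in the future relative to `now`.
future_case = next_interval(
    last_start=datetime(2025, 1, 11, tzinfo=timezone.utc),
    last_end=datetime(2025, 1, 12, tzinfo=timezone.utc),
    earliest=earliest,
    now=now,
)

# Previous interval already ended before `now`.
past_case = next_interval(
    last_start=datetime(2025, 1, 8, tzinfo=timezone.utc),
    last_end=datetime(2025, 1, 9, tzinfo=timezone.utc),
    earliest=earliest,
    now=now,
)
```

Keeping the calculation in a pure function like this would make both branches easy to unit test. One thing the sketch makes visible: in the first branch nothing guarantees that `end` is not later than `now` (for example, if `earliest` itself is in the future), so that case may need separate handling.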