Copilot commented on code in PR #64162:
URL: https://github.com/apache/airflow/pull/64162#discussion_r3025333325


##########
airflow-core/docs/authoring-and-scheduling/timetable.rst:
##########
@@ -151,6 +151,63 @@ must be a :class:`datetime.timedelta` or ``dateutil.relativedelta.relativedelta``
     def example_dag():
         pass
 
+.. versionadded:: 3.0.0
+    The ``run_immediately`` argument was introduced in Airflow 3.
+
+The optional ``run_immediately`` argument controls which cron point is scheduled when a Dag is first
+enabled or re-enabled after a pause. It has no effect when ``catchup=True`` or when prior Dag runs
+already exist (in those cases the scheduler always continues from where it left off).
+

Review Comment:
   The docs say `run_immediately` affects a DAG when it is “first enabled or re-enabled after a pause”, but the implementation (and the tests added in this PR) show that `run_immediately` only affects the very first run; after a pause/resume with prior runs, the most recent past boundary is always scheduled. Please reword this paragraph so it does not imply `run_immediately` changes pause/resume behavior.



##########
airflow-core/src/airflow/timetables/trigger.py:
##########
@@ -399,7 +395,7 @@ def __init__(
         *,
         timezone: str | Timezone | FixedTimezone,
         run_offset: int | datetime.timedelta | relativedelta | None = None,
-        run_immediately: bool | datetime.timedelta = False,
+        run_immediately: bool | datetime.timedelta = True,
        # todo: AIP-76 we can't infer partition date from this, so we need to store it separately.
         key_format: str = r"%Y-%m-%dT%H:%M:%S",

Review Comment:
   `CronPartitionTimetable`’s docstring description of `run_immediately` still 
documents the old semantics (including `None` and the auto-buffer window). 
Since `CronPartitionTimetable` inherits the updated `_calc_first_run()` 
behavior and now defaults `run_immediately=True`, the docstring should be 
updated to match the new three-case semantics and default.
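An updated docstring fragment might read as follows (the wording is a suggestion only, held in a standalone string here for illustration):

```python
# Suggested replacement text for the run_immediately parameter docs;
# the exact phrasing is up to the PR author.
RUN_IMMEDIATELY_DOC = """\
:param run_immediately: Controls which cron point is scheduled for the very
    first run. ``True`` (default): schedule the most recent past cron point
    immediately. ``False``: wait for the next future cron point. A
    ``datetime.timedelta``: schedule the past cron point only if it fired
    within the given window; otherwise wait for the next future cron point.
"""
```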



##########
airflow-core/docs/authoring-and-scheduling/timetable.rst:
##########
@@ -151,6 +151,63 @@ must be a :class:`datetime.timedelta` or ``dateutil.relativedelta.relativedelta``
     def example_dag():
         pass
 
+.. versionadded:: 3.0.0
+    The ``run_immediately`` argument was introduced in Airflow 3.
+
+The optional ``run_immediately`` argument controls which cron point is scheduled when a Dag is first
+enabled or re-enabled after a pause. It has no effect when ``catchup=True`` or when prior Dag runs
+already exist (in those cases the scheduler always continues from where it left off).
+
+* ``run_immediately=True`` *(default)* — schedule the **most recent past** cron point immediately.
+* ``run_immediately=False`` — skip the past cron point and wait for the **next future** cron point.
+* ``run_immediately=timedelta(...)`` — schedule the most recent past cron point only if it fired
+  within the given window; otherwise wait for the **next future** cron point.
+
+.. code-block:: python
+
+    from datetime import timedelta
+
+    from airflow.timetables.trigger import CronTriggerTimetable
+
+
+    @dag(
+        # Runs every 10 minutes.
+        # run_immediately=False: on first enable, skip any past slot that is more than
+        # 5 minutes old (the minimum buffer) and wait for the next 10-minute boundary.
+        schedule=CronTriggerTimetable(
+            "*/10 * * * *",
+            timezone="UTC",
+            run_immediately=False,
+        ),
+        start_date=datetime(2024, 1, 1),
+        catchup=False,

Review Comment:
   This example comment still refers to the removed “minimum buffer” behavior 
(“skip any past slot that is more than 5 minutes old”). With the simplified 
semantics, `run_immediately=False` always skips the past cron point regardless 
of how recently it fired. Also, the snippet uses `datetime(...)` in 
`start_date` but doesn’t import `datetime` (only `timedelta`).



##########
airflow-core/src/airflow/timetables/trigger.py:
##########
@@ -186,7 +187,7 @@ def __init__(
         *,
         timezone: str | Timezone | FixedTimezone,
         interval: datetime.timedelta | relativedelta = datetime.timedelta(),
-        run_immediately: bool | datetime.timedelta = False,
+        run_immediately: bool | datetime.timedelta = True,
     ) -> None:
         super().__init__(cron, timezone)
         self._interval = interval

Review Comment:
   Constructor default for `run_immediately` is now `True`, but 
`CronTriggerTimetable.deserialize()` still uses `data.get("run_immediately", 
False)` as the fallback when the key is missing. If older serialized payloads 
omit the field, deserialization will set `run_immediately=False` and (with the 
bug fixed) change scheduling behavior unexpectedly. Consider switching the 
fallback to `True` (or handling legacy payload versions explicitly) and adding 
a regression test for deserializing without the field.
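A regression test along these lines could pin the fallback (sketched against a stand-in `deserialize` helper, since the real method lives on `CronTriggerTimetable`):

```python
def deserialize_run_immediately(data: dict) -> bool:
    # Stand-in for the fallback inside CronTriggerTimetable.deserialize():
    # the default for a missing key must match the new constructor
    # default (True), not the old one (False).
    return data.get("run_immediately", True)


def test_missing_field_uses_new_default():
    # Legacy serialized payloads may omit the field entirely.
    assert deserialize_run_immediately({}) is True


def test_explicit_value_is_preserved():
    assert deserialize_run_immediately({"run_immediately": False}) is False
```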



##########
airflow-core/src/airflow/timetables/trigger.py:
##########
@@ -399,7 +395,7 @@ def __init__(
         *,
         timezone: str | Timezone | FixedTimezone,
         run_offset: int | datetime.timedelta | relativedelta | None = None,
-        run_immediately: bool | datetime.timedelta = False,
+        run_immediately: bool | datetime.timedelta = True,
        # todo: AIP-76 we can't infer partition date from this, so we need to store it separately.
         key_format: str = r"%Y-%m-%dT%H:%M:%S",
     ) -> None:

Review Comment:
   `CronPartitionTimetable` now defaults `run_immediately=True`, but its 
`deserialize()` implementation still falls back to `False` when 
`run_immediately` is missing (`data.get("run_immediately", False)`). For legacy 
serialized DAGs without the field, this can change behavior after upgrade. 
Align the deserialize fallback with the new default (or version-gate it).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
