romanzdk opened a new issue, #68315:
URL: https://github.com/apache/airflow/issues/68315

   ### Under which category would you file this issue?
   
   Airflow Core
   
   ### Apache Airflow version
   
   3.2.1
   
   ### What happened and how to reproduce it?
   
   After upgrading from Airflow **3.1.8** to **3.2.1**, DAGs using a custom 
plugin timetable stopped scheduling with **no visible error** in the UI. 
Cron-based DAGs on the same deployment continued to work normally.
   
   Affected DAGs show an empty **Next Run** column. The scheduler and webserver 
appear healthy. The failure is easy to miss unless dag-processor logs are 
inspected.
   
   ### Environment
   
   - **Airflow:** 3.2.1 (also reproducible on 3.2.x generally)
   - **Executor:** KubernetesExecutor
   - **Provider:** `apache-airflow-providers-cncf-kubernetes` 10.17.0 (we also 
hit separate scheduler issues on 10.17.1 — see #67813)
   - **Deployment:** Custom Docker image, DAGs synced to dag-processor; plugins 
in `/opt/airflow/plugins/`
   - **Python:** 3.12
   
   ### Custom timetable
   
   We use an `AfterWorkdayTimetable` plugin, following the [official custom 
timetable 
how-to](https://airflow.apache.org/docs/apache-airflow/stable/howto/timetable.html)
 pattern:
   
   - Registered via `AirflowPlugin.timetables`
   - Implements `serialize` / `deserialize`
   - Implements `next_dagrun_info` and `infer_manual_data_interval`
   - Used by many production DAGs
   
   On **3.1.8** this worked correctly. On **3.2.1** all these DAGs show no next 
run date and create no scheduled runs.
   
   ### Root cause (our analysis)
   
   Airflow **3.2** extended `DagRunInfo` with two additional required fields 
for partition-oriented scheduling (AIP-76 / #61167):
   
   ```python
   class DagRunInfo(NamedTuple):
       run_after: DateTime
       data_interval: DataInterval | None
       partition_date: DateTime | None   # new in 3.2
       partition_key: str | None         # new in 3.2
   ```
   
   Both our custom timetable and the official how-to example still construct 
`DagRunInfo` the 3.1.x way:
   
   ```python
   return DagRunInfo(
       run_after=run_after,
       data_interval=DataInterval(start=next_start, end=next_end),
   )
   ```
   
   On 3.2 this raises:
   
   ```
   TypeError: DagRunInfo.__new__() missing 2 required positional arguments: 
'partition_date' and 'partition_key'
   ```
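
The error can be reproduced without an Airflow installation. The sketch below
uses stand-in `NamedTuple` classes that only mirror the field list quoted in
this report; they are assumptions for illustration, not Airflow's actual
classes.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import NamedTuple


class DataInterval(NamedTuple):
    """Stand-in for Airflow's DataInterval (assumed: a start and an end)."""

    start: datetime
    end: datetime


class DagRunInfo(NamedTuple):
    """Stand-in with the four 3.2 fields quoted above, none with a default."""

    run_after: datetime
    data_interval: DataInterval | None
    partition_date: datetime | None
    partition_key: str | None


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 2, tzinfo=timezone.utc)

try:
    # The 3.1.x-style call: only the two original fields are supplied.
    DagRunInfo(run_after=end, data_interval=DataInterval(start, end))
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

Because a `NamedTuple` field without a default is a required argument, any
caller that omits the new fields fails at construction time.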
   
   That exception appears to be caught by the exception handling added in 
#18729 around `DAG.next_dagrun_info()`. The result is:
   
   - `DagModel.next_dagrun` / `next_dagrun_create_after` stay `NULL`
   - UI shows empty Next Run
   - No import error or dag-processor failure surfaced to the user
   
   Built-in timetables (e.g. `CronDataIntervalTimetable`) use helpers like 
`DagRunInfo.interval(...)` that pass `partition_date=None, partition_key=None` 
internally — so only **custom** timetables are affected.
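
The silent part of the failure can be illustrated with a generic pattern: a
broad `except` that logs the exception and returns "no next run". The function
names below are hypothetical and this is not Airflow's actual code; it only
shows why the caller cannot tell a broken timetable from a finished schedule.

```python
import logging

log = logging.getLogger("scheduling_sketch")


def broken_next_dagrun_info():
    """Hypothetical timetable hook failing like the 2-argument call does."""
    raise TypeError("missing 2 required positional arguments")


def compute_next_run(timetable_hook):
    """Hypothetical wrapper: log any failure, then report 'no next run'."""
    try:
        return timetable_hook()
    except Exception:
        # The traceback reaches the log, but the caller only ever sees None.
        log.exception("Failed to fetch run info from timetable")
        return None


next_run = compute_next_run(broken_next_dagrun_info)
print(next_run)  # None
```

With this pattern the only trace of the problem is the log entry, which matches
the symptoms listed above.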
   
   ### Steps to reproduce
   
   1. Create a plugin timetable following the official how-to (with 
`serialize`/`deserialize` and `next_dagrun_info`).
   2. In `next_dagrun_info`, construct `DagRunInfo` with only the two pre-3.2 fields, as shown in the docs:
   
      ```python
      return DagRunInfo(
          data_interval=DataInterval(start=start, end=end),
          run_after=run_after,
      )
      ```
   
   3. Deploy on Airflow **3.2.1** with dag-processor + scheduler running.
   4. Create a DAG: `schedule=MyCustomTimetable(...)`, `start_date` in the 
past, `catchup=True`.
   5. Observe:
      - DAG parses successfully (no import error in UI)
      - **Next Run** column is empty
      - No scheduled dag runs are created
   6. Check the dag-processor logs, which likely contain the swallowed 
`TypeError` raised by the `DagRunInfo` construction.
   
   **Working fix on our side:**
   
   ```python
   return DagRunInfo(
       run_after=run_after,
       data_interval=data_interval,
       partition_date=None,
       partition_key=None,
   )
   ```
   
   Alternatively, use `DagRunInfo.interval(start, end)` and override `run_after` if needed.
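
For plugins that must run on both 3.1.x and 3.2.x, one option is to pass the
partition fields only when the class declares them. This is a minimal sketch:
`build_dagrun_info`, `LegacyInfo`, and `PartitionedInfo` are hypothetical
names, and it assumes `DagRunInfo` remains a `NamedTuple` (so that `_fields`
exists), as in the definition quoted in this report.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any, NamedTuple

# Field names taken from the 3.2 definition quoted in this report.
PARTITION_FIELDS = ("partition_date", "partition_key")


def build_dagrun_info(info_cls: type, *, run_after: Any, data_interval: Any) -> Any:
    """Hypothetical helper: pass None for partition fields only if declared."""
    known_fields = getattr(info_cls, "_fields", ())  # set on NamedTuple classes
    extra = {name: None for name in PARTITION_FIELDS if name in known_fields}
    return info_cls(run_after=run_after, data_interval=data_interval, **extra)


class LegacyInfo(NamedTuple):
    """Stand-in for the 3.1.x shape (two fields)."""

    run_after: datetime
    data_interval: tuple[datetime, datetime] | None


class PartitionedInfo(NamedTuple):
    """Stand-in for the 3.2 shape quoted above (four required fields)."""

    run_after: datetime
    data_interval: tuple[datetime, datetime] | None
    partition_date: datetime | None
    partition_key: str | None


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 2, tzinfo=timezone.utc)

legacy = build_dagrun_info(LegacyInfo, run_after=end, data_interval=(start, end))
partitioned = build_dagrun_info(
    PartitionedInfo, run_after=end, data_interval=(start, end)
)
```

In a real plugin, `info_cls` would be the `DagRunInfo` class the timetable
already imports, so the same plugin code works before and after the upgrade.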
   
   ### Expected behavior
   
   At least one of:
   
   1. **Backward compatibility:** `DagRunInfo` accepts the 3.1.x 2-argument 
form with `partition_date=None, partition_key=None` as defaults.
   2. **Visible failure:** If `next_dagrun_info` raises, surface it in the UI 
(similar to DAG import errors) instead of silently clearing next-run fields.
   3. **Documentation:** Update the [custom timetable 
how-to](https://airflow.apache.org/docs/apache-airflow/stable/howto/timetable.html)
 and 3.2 release notes to show the 4-field `DagRunInfo` constructor. The 
current docs still show the pre-3.2 API.
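
Option 1 could look roughly like the sketch below, again with stand-in classes
rather than Airflow's real ones. Giving the two new fields a default of `None`
is an assumption about how backward compatibility might be restored, not a
description of any merged change.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import NamedTuple


class DataInterval(NamedTuple):
    """Stand-in for Airflow's DataInterval (assumed: a start and an end)."""

    start: datetime
    end: datetime


class DagRunInfo(NamedTuple):
    """Stand-in: the two new fields default to None."""

    run_after: datetime
    data_interval: DataInterval | None
    partition_date: datetime | None = None
    partition_key: str | None = None


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 2, tzinfo=timezone.utc)

# The 3.1.x-style call now succeeds and leaves the partition fields unset.
info = DagRunInfo(run_after=end, data_interval=DataInterval(start, end))
print(info.partition_date, info.partition_key)  # None None
```

Callers that need partitions can still pass both fields explicitly, while
existing custom timetables keep working unchanged.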
   
   ### Actual behavior
   
   - Custom timetable DAGs silently stop scheduling on 3.2.
   - No UI indication of failure.
   - Built-in cron DAGs on the same cluster continue to schedule normally.
   
   ### Impact
   
   High for teams with custom timetables upgrading from 3.1 to 3.2. The symptom 
looks like "Airflow is broken", but no actionable error is shown. We initially 
attempted a full downgrade back to 3.1.8, which caused additional metadata DB 
problems (see Issue 2 below).
   
   ### Related issues / PRs
   
   - [#61167](https://github.com/apache/airflow/issues/61167) — partition 
fields added to `DagRunInfo`
   - [#18729](https://github.com/apache/airflow/pull/18729) — exception 
handling in `DAG.next_dagrun_info` (may contribute to silent failure)
   - [#19304](https://github.com/apache/airflow/issues/19304) — related 
silent/incorrect handling when `next_dagrun_info` returns `None`
   - [#63962](https://github.com/apache/airflow/pull/63962) — partitioned 
timetables and null `next_dagrun` fields
   
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

