Dev-iL commented on PR #58175:
URL: https://github.com/apache/airflow/pull/58175#issuecomment-3932308315

   Feedback from Opus:
   
   ---
   
   ## PR Review: Make DagRun `start_date` Nullable in Execution API
   
   **PR Goal:** Allow `DagRun.start_date` to be `None` across the execution API 
and Task SDK so that deadline alert callbacks can build a `Context` for DagRuns 
that are still queued (never started).
   
   **AIP Alignment:** Deadline Alerts (Airflow 3.1)
   
   ---
   
   ### Verdict: Approve with minor suggestions
   
   This is a clean, well-scoped change that correctly prepares the ground for 
full-context deadline callbacks running on the executor/triggerer. The Cadwyn 
versioning is done properly, tests are meaningful, and the fallback logic is 
sound.
   
   ---
   
   ### What the PR does well
   
   1. **Aligns API with the database.** The DB column `dagrun.start_date` is 
already `nullable=True` (`mapped_column(UtcDateTime, nullable=True)`). The 
execution API and Task SDK were more restrictive than the storage layer — this 
PR removes that mismatch.
   
   2. **Safe backward-compatible fallback.** For older API clients, 
`start_date` falls back to `run_after`, which is guaranteed non-null 
(`nullable=False` in DB, has a `_default_run_after` that always resolves to a 
value). This means old Task SDK versions will never receive `null` for a field 
they expect to be present.
   
   3. **Dual converter coverage.** The `MakeDagRunStartDateNullable` version 
change covers both:
      - `TIRunContext` (where `DagRun` is nested under 
`response.body["dag_run"]`) — used by `/execution/task-instances/{id}/run`
      - Direct `DagRun` responses — used by `get_dag_run` in `dag_runs.py:68`
   
      This matches the two distinct return paths that expose `DagRun` to 
clients.
   
   4. **Test coverage hits the right scenarios:** old-version fallback, 
head-version allows null, old-version preserves real values. The task-sdk-side 
tests verify model construction and `TIRunContext` compatibility.
   
   5. **Correct AIP alignment.** The AIP explicitly calls out that deadlines 
can fire before a DagRun starts (e.g. a run queued for a very long time). This 
change is a prerequisite for executing deadline callbacks with full context in 
that scenario.
   
   ---
   
   ### Issues and suggestions
   
   #### 1. `_generated.py` is codegen'd — verify regeneration (Low risk)
   
   `task-sdk/src/airflow/sdk/api/datamodels/_generated.py` has the header:
   ```
   # generated by datamodel-codegen:
   #   filename:  http://0.0.0.0:8080/execution/openapi.json
   ```
   
   The change to this file looks correct and consistent with the source model 
change in `taskinstance.py`, but confirm the file was **regenerated** (not 
hand-edited) after updating the source model. A hand-edit will be silently 
overwritten the next time someone runs the codegen.
   
   #### 2. `make_ti_context` fixture type hint is now slightly inaccurate (Nit)
   
   In `task-sdk/tests/conftest.py:270`, the fixture signature is:
   ```python
   start_date: str | datetime = "2024-12-01T01:00:00Z",
   ```
   
   The test calls `make_ti_context(start_date=None)`, which works at runtime 
(Pydantic accepts it after the model change, and the `# type: ignore` 
suppresses static checking), but the fixture's own type annotation doesn't 
include `None`. Consider updating to `str | datetime | None = 
"2024-12-01T01:00:00Z"` for accuracy. Not blocking.
   
   #### 3. `RuntimeTaskInstance.start_date` is still non-nullable 
(Informational)
   
   In `task-sdk/src/airflow/sdk/execution_time/task_runner.py:157`:
   ```python
   start_date: AwareDatetime  # TaskInstance start_date, not DagRun's
   ```
   
   This is the **TaskInstance** start date (set when the TI begins running), 
not the **DagRun** start date. It's correct that this remains non-nullable — by 
the time a TI is running, it has a start date. No action needed, but worth 
noting to avoid confusion during review.
   
   #### 4. Consider a defensive comment on the `run_after` fallback (Nit)
   
   In the converter:
   ```python
   dag_run["start_date"] = dag_run.get("run_after")
   ```
   
   While `run_after` is guaranteed non-null in practice (DB constraint + 
default), a brief inline comment explaining *why* `run_after` is a safe 
fallback would help future readers:
   ```python
   # run_after is always non-null (DB NOT NULL + default), safe as a sentinel
   dag_run["start_date"] = dag_run.get("run_after")
   ```
   
   #### 5. No test for the direct `DagRun` endpoint backward compat (Minor gap)
   
   The `ensure_start_date_in_dag_run` converter handles the 
`/execution/dag-runs/{dag_id}/{run_id}` endpoint (`get_dag_run`), but the tests 
only exercise the `TIRunContext` path (via 
`/execution/task-instances/{id}/run`). Consider adding a test that hits the 
DagRun detail endpoint with an old API version and a queued (null start_date) 
DagRun to validate the second converter.
   
   #### 6. `context["start_date"]` is currently not populated in the template 
context (Observation)
   
   Looking at `get_template_context()` in `task_runner.py`, the `start_date` 
key is never explicitly set in the context dict. The `Context` TypedDict has 
`total=False`, so all keys are optional. The PR description correctly notes 
this change is forward-looking — it won't affect current behavior. But when the 
"full context" is eventually wired into deadline callbacks, whoever adds 
`start_date` to the context dict should add a null-safety guard or document the 
semantics (is `None` expected to propagate to Jinja templates as `{{ start_date 
}}`?).
   
   ---
   
   ### Version placement assessment
   
   Adding `MakeDagRunStartDateNullable` to the existing `Version("2026-03-31", 
...)` is correct. This is a backward-incompatible schema change (nullable field 
that was non-nullable), and grouping it with the current development version 
ensures old clients get the fallback via Cadwyn.
   
   The alphabetical ordering of version changes within the `Version()` 
constructor (`MakeDagRunStartDateNullable` before 
`ModifyDeferredTaskKwargsToJsonValue`) follows the existing convention.
   
   ---
   
   ### Risk assessment
   
   | Area | Risk | Rationale |
   |------|------|-----------|
   | Breaking old Task SDK clients | **None** | Cadwyn converter provides 
`run_after` fallback |
   | Breaking current deadline alerts | **None** | Current callbacks don't use 
full `Context` |
   | Future deadline callbacks | **Low** | Type change is in place; just needs 
context population |
   | Codegen drift | **Low** | If `_generated.py` was regenerated, no risk; if 
hand-edited, will be overwritten |
   
   ---
   
   ### Summary
   
   Solid preparatory change for the Deadline Alerts feature. The core logic is 
correct, backward compatibility is handled properly, and the test coverage is 
good. The only actionable suggestion is adding a test for the direct `DagRun` 
endpoint's backward compat path. Everything else is nit-level.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to