nathadfield opened a new issue, #67239:
URL: https://github.com/apache/airflow/issues/67239

   ### Description
   
   Propagate `partition_date` (datetime) from the producer's DagRun to the 
consumer DagRun and Context for partitioned asset events (AIP-76), so that 
templates can use the existing Airflow filter idiom `{{ partition_date | ds }}` 
/ `{{ partition_date | ds_nodash }}` instead of parsing `partition_key` by hand.
   
   ### Use case/motivation
   
   After #65340 / #65359, consumers can access `partition_key` (a string) 
directly from Context. But the partition's underlying datetime, 
`partition_date`, is `None` on consumer DagRuns. The column exists on `DagRun` 
(added via #61167) and is populated on the producer side, but nothing carries 
it across the asset event boundary.
   
   This forces every consumer of a partitioned asset to parse the key string 
manually. With the default `CronPartitionTimetable` `key_format` of 
`%Y-%m-%dT%H:%M:%S`, the workarounds look like:
   
   ```sql
   -- Inline string slicing: opaque, and depends on key_format
   WHERE dt = "{{ partition_key[:10] }}"
   
   -- Or a userland plugin macro
   WHERE dt = "{{ macros.partition_ds(dag_run) }}"
   ```
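
   For concreteness, the userland macro in the second workaround could be 
sketched as below. The name `partition_ds` comes from the example above, but 
its signature here is an assumption (it takes the key string rather than the 
`dag_run`), and the default format is the `key_format` quoted in this issue. It 
illustrates the boilerplate each consumer currently carries; it is not an 
Airflow API.

```python
from datetime import datetime

# Default key_format quoted in this issue; a real DAG may configure another one.
DEFAULT_KEY_FORMAT = "%Y-%m-%dT%H:%M:%S"


def partition_ds(partition_key: str, key_format: str = DEFAULT_KEY_FORMAT) -> str:
    """Hypothetical userland helper: parse a partition key, return YYYY-MM-DD."""
    parsed = datetime.strptime(partition_key, key_format)
    return parsed.strftime("%Y-%m-%d")


print(partition_ds("2025-01-31T00:00:00"))  # 2025-01-31
```

   Each consumer has to keep `key_format` in sync with the producer by hand, 
which is the coupling this proposal aims to remove.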
   
   Both push date-parsing back to userland, which is at odds with the project's 
canonical filter-based templating (per the [templates 
ref](https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html)):
   
   ```sql
   -- Non-partitioned DAG today
   WHERE dt = "{{ logical_date | ds }}"
   
   -- Partitioned-asset consumer should be able to write
   WHERE dt = "{{ partition_date | ds }}"
   ```
   
   AIP-76 is the headline scheduling feature of 3.2; the templating story 
should match the rest of Airflow's datetime ergonomics.
   
   ### Proposal
   
   When materialising a consumer DagRun from a partitioned asset event, 
populate `DagRun.partition_date` from the event's partition (the producer's 
`partition_date`, or re-derived from `partition_key` + `key_format`). Surface 
it in Context alongside `partition_key`, the same way #65359 surfaced the 
string:
   
   ```python
   from airflow.sdk import get_current_context

   ctx = get_current_context()
   ctx["partition_date"]  # datetime (proposed; None for non-date keys)
   ctx["partition_key"]   # str (already in Context)
   ```
   
   Templates then use the filter form that already works everywhere else:
   
   ```sql
   WHERE dt = "{{ partition_date | ds }}"
   WHERE dt_nodash = "{{ partition_date | ds_nodash }}"
   WHERE ts = "{{ partition_date | ts_nodash }}"
   ```
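
   As a rough illustration of what those three templates would render to, the 
snippet below reproduces the documented output shapes of `ds`, `ds_nodash` and 
`ts_nodash` with plain `strftime`. The helper functions are stand-ins written 
for this example, not Airflow's filter implementations, and the sample datetime 
is made up.

```python
from datetime import datetime, timezone

# Made-up partition_date for a consumer DagRun.
partition_date = datetime(2025, 1, 31, 6, 30, 0, tzinfo=timezone.utc)


def ds(value: datetime) -> str:
    """Stand-in for the `ds` filter: YYYY-MM-DD."""
    return value.strftime("%Y-%m-%d")


def ds_nodash(value: datetime) -> str:
    """Stand-in for the `ds_nodash` filter: YYYYMMDD."""
    return value.strftime("%Y%m%d")


def ts_nodash(value: datetime) -> str:
    """Stand-in for the `ts_nodash` filter: YYYYMMDDTHHMMSS."""
    return value.strftime("%Y%m%dT%H%M%S")


print(ds(partition_date))         # 2025-01-31
print(ds_nodash(partition_date))  # 20250131
print(ts_nodash(partition_date))  # 20250131T063000
```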
   
   No new macros, no new naming: this only extends the existing 
`partition_date` column's reach to the consumer side and reuses the 
Context-exposure pattern that #65359 established for `partition_key`.
   
   ### Counterpoint
   
   `CronPartitionTimetable.key_format` is configurable; not every partition key 
is a date. The existing `DagRun.partition_date` column is already 
`nullable=True` for this reason. The same gate should apply on the consumer 
side: `partition_date` is populated only when the producer's partition is 
date-shaped. For non-date partition keys (e.g. region codes, run IDs), 
`partition_date` stays `None` and users fall back to `partition_key`. This 
matches the existing producer-side behaviour.
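
   A minimal sketch of that gate, assuming the consumer side either copies the 
producer's value or re-derives it from the key. `resolve_partition_date` is a 
hypothetical helper name, not an existing Airflow function, and treating a 
`strptime` failure as "not date-shaped" is one possible rule rather than an 
agreed design.

```python
from datetime import datetime
from typing import Optional


def resolve_partition_date(
    partition_key: str,
    key_format: Optional[str],
    producer_partition_date: Optional[datetime] = None,
) -> Optional[datetime]:
    """Hypothetical helper: pick the consumer-side partition_date, or None."""
    # Prefer the value the producer already computed.
    if producer_partition_date is not None:
        return producer_partition_date
    # Without a format there is nothing to re-derive from.
    if key_format is None:
        return None
    try:
        return datetime.strptime(partition_key, key_format)
    except ValueError:
        # Key does not match the format (e.g. a region code): stay None.
        return None


print(resolve_partition_date("2025-01-31T00:00:00", "%Y-%m-%dT%H:%M:%S"))
print(resolve_partition_date("eu-west-1", "%Y-%m-%dT%H:%M:%S"))
```

   The first call yields a `datetime`; the second yields `None`, so a template 
for that consumer would fall back to `partition_key`.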
   
   ### Related issues
   
   - #61167 - added `partition_date` column on `DagRun` (producer only)
   - #65340 / #65359 - added `partition_key` to Context
   - #65339 - `DagRunProtocol.partition_key`
   - AIP-76
   
   ### Are you willing to submit a PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

