Lee-W commented on code in PR #64571:
URL: https://github.com/apache/airflow/pull/64571#discussion_r3354559782
##########
airflow-core/newsfragments/64571.significant.rst:
##########
@@ -29,8 +29,33 @@ an hourly partition).
Mappers can be set globally on a ``PartitionedAssetTimetable`` or overridden
per upstream asset via ``partition_mapper_config``.
+**Rollup (one downstream run per window of upstream partitions)**:
+
+- ``RollupMapper`` — wraps a ``upstream_mapper`` (which normalises the upstream key to the
+ downstream granularity) with a ``window`` describing how many upstream partitions form one
+ downstream run. The downstream Dag run is held until every expected upstream partition for
+ the window has arrived (e.g. all 24 hourly partitions before firing a daily summary).
+- Window types: ``HourWindow``, ``DayWindow``, ``WeekWindow``, ``MonthWindow``,
+ ``QuarterWindow``, ``YearWindow`` — enumerate the upstream partition keys that compose one
+ downstream window. ``MonthWindow``/``QuarterWindow``/``YearWindow`` iterate from the
+ ``upstream_mapper``'s emitted period start, so fiscal calendars are handled transparently when
+ the upstream mapper emits non-1st period starts.
+- Typical use: ``default_partition_mapper=RollupMapper(upstream_mapper=StartOfDayMapper(),
+ window=DayWindow())`` on a ``PartitionedAssetTimetable`` whose ``assets`` are the hourly
+ upstream.
+
Within the task context, the ``partition_key`` is available as
``dag_run.partition_key``. It can also be provided when manually triggering a
Dag run via the REST API (``POST /dags/{dag_id}/dagRuns``).
+**Known limitations**:
+
+- ``DayWindow`` with a local-timezone upstream mapper is unsatisfiable on spring-forward days:
+ the DST gap (e.g. 02:00 ET skips to 03:00) means one of the 24 expected upstream keys is
+ never emitted by producers, so the rollup window can never be fully satisfied.
Review Comment:
Yep, this limitation is mentioned in the docs, and I opened an issue for it:
https://github.com/apache/airflow/issues/68004
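
For anyone who wants to see the spring-forward gap concretely, here is a minimal standard-library sketch. It does not use the Airflow API: the key format (`YYYY-MM-DDTHH`), the helper names `expected_keys` and `emitted_keys`, and the choice of `America/New_York` on 2025-03-09 are all assumptions made for illustration. It compares the 24 wall-clock hours a day window would naively expect against the local hours that actually occur.

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: an Eastern-time producer; any zone with DST behaves similarly.
TZ = ZoneInfo("America/New_York")


def expected_keys(day: date) -> set[str]:
    """The 24 wall-clock hourly keys a naive day window would wait for."""
    return {f"{day.isoformat()}T{hour:02d}" for hour in range(24)}


def emitted_keys(day: date, tz: ZoneInfo) -> set[str]:
    """Hourly keys that really occur on `day` in the local timezone.

    Steps through real elapsed hours in UTC and converts each instant back
    to local time, so wall-clock hours skipped by DST never show up.
    """
    start_utc = datetime(day.year, day.month, day.day, tzinfo=tz).astimezone(timezone.utc)
    keys = set()
    for offset in range(26):  # covers 23-, 24-, and 25-hour local days
        local = (start_utc + timedelta(hours=offset)).astimezone(tz)
        if local.date() == day:
            keys.add(local.strftime("%Y-%m-%dT%H"))
    return keys


spring_forward = date(2025, 3, 9)
missing = expected_keys(spring_forward) - emitted_keys(spring_forward, TZ)
print(sorted(missing))  # ['2025-03-09T02']
```

On an ordinary day the difference is empty. On the spring-forward day the 02:00 key is never produced, which is why a window that waits for all 24 keys cannot complete.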