[DISCUSSION] "Freshness" in Multi-Asset Dag Triggering Conditions

Dev iL Mon, 21 Jul 2025 02:14:16 -0700

Below is a more organized write-up of a suggestion I once shared on Slack.
I figured it should be documented and discussed here as well. I am
currently developing something equivalent using a short-circuit operator,
but it would be nice to have this as built-in functionality. An
illustration of the proposed behavior, compared with ANY and ALL, is
provided as code as well as a rendered image.


Please let me know: 1) what you think; 2) whether this fits into the bigger
vision of Asset-based triggering; and 3) whether you would be willing to
participate in the implementation of this feature.

So, without further ado...

--------------------------------------------------
A "freshness" check is a condition applied to updates of multiple assets
(URIs) to determine whether they are sufficiently new relative to each
other. Freshness can be used to define temporal constraints between Assets
that are updated at different frequencies. As a result, a freshness check
produces behavior that is intermediate between AssetAny and AssetAll:

   - The check is triggered when ANY of the assets is updated.
   - The check is satisfied when ALL of the assets are *sufficiently* new
   ("fresh").
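
As a rough sketch of the "satisfied" half of the check, the predicate below
treats a set of assets as fresh when their latest updates lie within some
maximum skew of each other. This is only one possible reading of
"sufficiently new"; the names `is_fresh`, `last_updated` and `max_skew` are
hypothetical and not part of any Airflow API. Something like it could serve
as the callable behind a short-circuit step.

```python
from datetime import datetime, timedelta


def is_fresh(last_updated: dict[str, datetime], max_skew: timedelta) -> bool:
    """Return True when all latest updates lie within max_skew of each other.

    last_updated maps an asset URI to the time of its most recent update.
    Both the mapping and max_skew are assumptions made for illustration.
    """
    if not last_updated:
        # With no recorded updates there is nothing to compare.
        return False
    newest = max(last_updated.values())
    oldest = min(last_updated.values())
    return newest - oldest <= max_skew


# Example: two assets updated 30 minutes apart, with a 1-hour tolerance.
updates = {
    "s3://bucket/predictions": datetime(2025, 7, 21, 9, 0),
    "s3://bucket/labels": datetime(2025, 7, 21, 9, 30),
}
print(is_fresh(updates, timedelta(hours=1)))     # True
print(is_fresh(updates, timedelta(minutes=10)))  # False
```

The check would be evaluated whenever any one of the assets is updated, which
covers the "triggered" half of the definition.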


```none
Illustration (+ represents a trigger, | represents an interval boundary):

            Intrvl 1  Intrvl 2  Intrvl 3  Intrvl 4
Asset A: |----1----|--2------|------3--|----------|
Asset B: |---a--b--|---c--d--|--e-f----|---------g|
All:     |----+-------+-------------+-------------|  Times triggered: 3
Any:     |---++-+-----++--+-----+-+-+------------+|  Times triggered: 10
Fresh:   |----+-+--|---+--+--|------+--|----------|  Times triggered: 5
```
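
The trigger counts in the illustration can be reproduced with a small
simulation. This is plain Python rather than Airflow code: the update times
are column positions read off the diagram, the interval length of 10 is an
assumption matching the diagram's spacing, and the function names
(`all_triggers`, `fresh_triggers`) are hypothetical.

```python
from collections import defaultdict

# Update times read off the diagram (column positions, arbitrary units).
EVENTS = sorted(
    [(5, "A"), (13, "A"), (27, "A")]
    + [(4, "B"), (7, "B"), (14, "B"), (17, "B"), (23, "B"), (25, "B"), (40, "B")]
)
ASSETS = {"A", "B"}
INTERVAL = 10  # assumed interval length, same units as the update times


def all_triggers(events, assets):
    """ALL: fire once every asset has updated since the previous firing."""
    seen, fired = set(), []
    for t, asset in events:
        seen.add(asset)
        if seen == assets:
            fired.append(t)
            seen.clear()  # start waiting for a full set again
    return fired


def fresh_triggers(events, assets, interval):
    """Fresh: fire on any update once every asset has updated in this interval."""
    seen_by_interval = defaultdict(set)
    fired = []
    for t, asset in events:
        bucket = t // interval  # index of the fixed interval containing t
        seen_by_interval[bucket].add(asset)
        if seen_by_interval[bucket] == assets:
            fired.append(t)
    return fired


print("ANY:  ", len(EVENTS))                                       # 10
print("ALL:  ", len(all_triggers(EVENTS, ASSETS)))                 # 3
print("FRESH:", len(fresh_triggers(EVENTS, ASSETS, INTERVAL)))     # 5
```

Note that `fresh_triggers` never clears its per-interval set, which is what
lets every later update in the same interval fire again.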

Thus, within each interval, freshness initially behaves like ALL, but
"becomes" ANY after the condition is met.

Claude was easily able to suggest potential use cases:

   - Data Drift Detection: Comparing week-old predictions against fresh
   labels (or vice versa) produces misleading drift calculations.
   - A/B Test Analysis: Update analysis only when both groups have data
   from the same time window, since temporal misalignment between groups can
   create false positive/negative results due to seasonality or external
   events.
   - Real-time vs Batch Consistency Monitoring: Flag inconsistencies only
   when both pipelines have processed the same time range.
   - Data Quality Monitoring: Calculate data quality metrics only when
   the processing lag is under an acceptable threshold.

[image: image.png]
