Re: [DISCUSSION] "Freshness" in Multi-Asset Dag Triggering Conditions

Jarek Potiuk Sun, 03 Aug 2025 13:27:34 -0700

I think there was some prior discussion on that - certainly Constance and
TP, who worked on Assets, had some thoughts there. I think it's an important
subject, and possibly we could have a discussion on future
Assets features - we know not all the planned features have been
implemented yet - certainly the ones around partitioning, and possibly some
relations between partitions with overlapping ranges. I feel that
"freshness", "time" and "time partitions" have a lot to do with it,
and maybe that's a good start for a discussion?


J.


On Mon, Jul 21, 2025 at 11:14 AM Dev iL <gid....@gmail.com> wrote:

> The below is a more organized writeup of a suggestion I once shared on
> Slack. I figured it should be documented and discussed here as well. I am
> currently developing something equivalent using a short-circuit operator,
> but it would be nice for this to be a built-in functionality. An
> illustration of the proposed behavior compared to ANY and ALL is provided
> as code as well as a rendered image.
>
> Please let me know: 1) what you think; 2) whether this fits into the
> bigger vision of Asset-based triggering; and 3) whether you would be
> willing to participate in the implementation of this feature.
>
> So without further ado....
>
> --------------------------------------------------
> A "freshness" check is a condition applied to asset updates of multiple
> URIs to determine whether they're sufficiently new relative to each other.
> Freshness can be used to define temporal constraints between Assets that
> are updated at different frequencies. As a result, a freshness check
> behaves as an intermediate between AssetAny and AssetAll:
>
>    - The check is triggered when ANY of the assets is updated.
>    - The check is satisfied when ALL of the assets are *sufficiently* new
>    ("fresh").
>
>
> ```none
> Illustration (+ represents a trigger, | represents an interval boundary):
>
>             Intrvl 1  Intrvl 2  Intrvl 3  Intrvl 4
> Asset A: |----1----|--2------|------3--|----------|
> Asset B: |---a--b--|---c--d--|--e-f----|---------g|
> All:     |----+-------+-------------+-------------|  Times triggered: 3
> Any:     |---++-+-----++--+-----+-+-+------------+|  Times triggered: 10
> Fresh:   |----+-+--|---+--+--|------+--|----------|  Times triggered: 5
> ```
>
> Thus, within each interval, freshness initially behaves like ALL, but
> "becomes" ANY after the condition is met.
>
> Claude was easily able to suggest potential use cases:
>
>    - Data Drift Detection: Comparing week-old predictions against fresh
>    labels (or vice versa) produces misleading drift calculations.
>    - A/B Test Analysis: Update analysis only when both groups have data
>    from the same time window, since temporal misalignment between groups can
>    create false positive/negative results due to seasonality or external
>    events.
>    - Real-time vs Batch Consistency Monitoring: Flag inconsistencies only
>    when both pipelines have processed the same time range.
>    - Data Quality Monitoring: Calculate data quality metrics only when
>    processing lag is under an acceptable threshold.
>
> [image: image.png]
>
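
For reference, the ANY / ALL / Fresh comparison quoted above can be reproduced
with a small standalone simulation. This is only a sketch of the semantics as
read from the illustration, not Airflow code: the function names, the
fixed-size window, and the timestamps (picked to roughly match the diagram)
are all assumptions made for the example.

```python
# Standalone sketch; no Airflow imports. All names here are hypothetical.

def any_triggers(events):
    """ANY: every asset update fires."""
    return [t for t, _ in events]


def all_triggers(events, assets):
    """ALL (as drawn in the illustration): fire once every asset has
    updated since the previous fire, then start collecting again."""
    seen, fired = set(), []
    for t, name in events:
        seen.add(name)
        if seen == assets:
            fired.append(t)
            seen = set()
    return fired


def fresh_triggers(events, assets, window):
    """Fresh: evaluated on every update, satisfied only when every asset
    has already updated inside the current fixed-size window."""
    current, seen, fired = None, set(), []
    for t, name in events:
        idx = t // window          # which interval this update falls in
        if idx != current:         # new interval: forget older updates
            current, seen = idx, set()
        seen.add(name)
        if seen == assets:
            fired.append(t)
    return fired


# Assumed timestamps approximating the diagram: four intervals of width 10,
# asset A updating 3 times and asset B updating 7 times.
EVENTS = [(3, "B"), (4, "A"), (6, "B"),
          (12, "A"), (13, "B"), (16, "B"),
          (22, "B"), (24, "B"), (26, "A"),
          (39, "B")]
ASSETS = {"A", "B"}

print(len(all_triggers(EVENTS, ASSETS)))        # 3
print(len(any_triggers(EVENTS)))                # 10
print(len(fresh_triggers(EVENTS, ASSETS, 10)))  # 5
```

The counts match the "Times triggered" column of the illustration. A real
implementation would presumably compare update timestamps against a
configurable freshness threshold rather than fixed interval indices; the
fixed window is used here only because it mirrors the diagram.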
