Jarek - The Deadlines stuff got voted out of 3.0, I'm hoping to have this for 
3.1 now.


Ash - Valid comments.  "need_by" was mostly just used there for demonstrative 
purposes, it won't be a user-facing variable.  I can certainly change it if you 
feel one of the other names is easier for Future Us though.


Daniel - You made a lot of points and I had a very long reply typed out and 
Outlook ate it on me. Computers.... who needs 'em, right?  I'll try to remember 
everything I replied to.


>  `landing_time = DeadlineReference.DAGRUN_QUEUED + timedelta(hours=1)`
>
> does this mean that the expectation will be missed if the *dag run does not
> complete* within 1 hr of dag run queued time?

Yeah, that would (tentatively, this may change, etc.) be defined as:

```
with DAG(
    dag_id='dag_with_deadline',
    deadline=DeadlineAlert(
        trigger=DeadlineReference.DAGRUN_QUEUED,
        interval=timedelta(hours=1),
        callback=missed_deadline,
    ),
):
    ...  # tasks would be defined here
```

Which would translate to English as "If the dag run has not reached a 
terminal state within one hour of being queued, then execute 
missed_deadline()".  I have a working POC for DAGRUN_LOGICAL_DATE, which is 
straightforward; the DAGRUN_QUEUED one is trickier to sort out, but I'll get 
back to working on it Soon :tm:
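
To make that rule concrete, here is a minimal sketch of the check in plain Python. It is not the Airflow implementation: `deadline_missed`, `TERMINAL_STATES`, and the state strings are hypothetical names chosen for illustration, and the only assumption taken from above is "deadline = reference time + interval".

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: these two strings stand in for "terminal" states.
TERMINAL_STATES = {"success", "failed"}


def deadline_missed(reference_time, interval, state, now):
    """True if the run is still non-terminal once reference_time + interval has passed."""
    deadline = reference_time + interval
    return state not in TERMINAL_STATES and now >= deadline


queued_at = datetime(2025, 2, 25, 6, 0, tzinfo=timezone.utc)
one_hour = timedelta(hours=1)

# 30 minutes after queueing, still running: the deadline has not passed yet.
early = deadline_missed(queued_at, one_hour, "running", queued_at + timedelta(minutes=30))
# 61 minutes after queueing, still running: this is where the callback would fire.
late = deadline_missed(queued_at, one_hour, "running", queued_at + timedelta(minutes=61))
# 61 minutes after queueing, but the run already finished: nothing to do.
done = deadline_missed(queued_at, one_hour, "success", queued_at + timedelta(minutes=61))
```

The reference time is passed in explicitly, so the same check would work for a logical date or any other timestamp.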



Your idea of "expectations" is actually not far from my plan.  Here is a more 
complicated example of what I envision:

```
with DAG(
    dag_id='dag_with_deadline',
    deadline=[
        DeadlineAlert(
            trigger=DeadlineReference.BY_TIME(hour=8, minute=30),
            callback=morning_meeting_datat_unavailable,
            callback_kwargs={"notify": "marketing team"}
        ),
        DeadlineAlert(
            trigger=DeadlineReference.DAGRUN_LOGICAL_DATE,
            interval=timedelta(hours=1),
            callback=deadline_running_long,
            callback_kwargs={"notify": "it on call"}
        ),
        DeadlineAlert(
            trigger=DeadlineReference.DAGRUN_LOGICAL_DATE,
            interval=timedelta(hours=2),
            callback=missed_deadline,
            callback_kwargs={
                "notify": "the boss",
                "alert_level": "red",
                "cancel_three_ring_circus": True,
            },
        ),
        DeadlineAlert(
            trigger=DeadlineReference.DAG_LAST_RUN_END,
            interval=timedelta(days=1),
            callback=daily_run_failed,
        ),
        DeadlineAlert(
            trigger=DeadlineReference.DATASET_UPDATED_DATE,
            interval=timedelta(hours=1),
            callback=missed_dataset_deadline,
        )
    ]
):
    ...  # tasks would be defined here
```

Obviously all of those DeadlineAlert objects could be defined elsewhere and/or 
imported, and that would clean this up a lot, but for the sake of simplified 
discussion, I'm leaving it all in one blob.

That would translate to:

0) If the dag run has not reached a terminal state by the next 08:30 (so either 
later today or tomorrow, depending on when this is run), then execute 
morning_meeting_datat_unavailable(notify="marketing team")  # I'll likely have 
some built-in static date stuff like this for "tomorrow by...", "every day 
by...", etc
1) If the dag run has not reached a terminal state one hour after it was 
started then execute deadline_running_long(notify="IT on call")
2) If the dag run has not reached a terminal state two hours after it was 
started then execute missed_deadline
3) If it has been more than 1 day since the last time the dag finished, execute 
daily_run_failed()  # Future work; there is not currently an easy way to get 
that, so we'd need to add something to the dag model to track the last 
execution date
4) If the dag run has not reached a terminal state one hour after the 
associated dataset is updated, execute missed_dataset_deadline  # Future work; 
datasets aren't part of the initial launch, this is just an example of where I 
see this going later.
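
As a rough illustration of items 0 to 2, the sketch below shows one way "the next 08:30" and a tiered escalation could be computed. It uses naive datetimes for brevity, and `next_occurrence`, `due_callbacks`, and the `tiers` list are hypothetical helpers for this example only, not part of the proposed API.

```python
from datetime import datetime, time, timedelta


def next_occurrence(now, at):
    """Return the next datetime with clock time `at` that is strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Already past that time today, so the deadline rolls over to tomorrow.
        candidate += timedelta(days=1)
    return candidate


def due_callbacks(reference, tiers, now):
    """Names of every tier whose deadline (reference + interval) has passed."""
    return [name for interval, name in tiers if now >= reference + interval]


# Tiered escalation relative to the same reference time (items 1 and 2).
tiers = [
    (timedelta(hours=1), "deadline_running_long"),
    (timedelta(hours=2), "missed_deadline"),
]
logical_date = datetime(2025, 2, 25, 6, 0)

before_cutoff = next_occurrence(datetime(2025, 2, 25, 7, 0), time(8, 30))
after_cutoff = next_occurrence(datetime(2025, 2, 25, 9, 0), time(8, 30))
at_90_minutes = due_callbacks(logical_date, tiers, logical_date + timedelta(minutes=90))
at_150_minutes = due_callbacks(logical_date, tiers, logical_date + timedelta(minutes=150))
```

A real implementation would also skip the callbacks once the run reaches a terminal state and would need timezone-aware datetimes; both are left out here to keep the sketch short.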


So in all, I think we're on a similar page.  One of the things that folks at 
the meetup last month said was they really wanted to be able to set a list of 
deadlines so they could do tiered escalations or use it to monitor multiple 
different points of failure, so that is definitely going to be part of the 
implementation.  I also have DeadlineAlert set up (in my POC, at least) as an 
interface so users can add their own: as long as it returns a datetime and a 
callback, they can conceivably make it monitor whatever they want.  But so far 
my plans have been entirely time-based, either dynamic ("one hour after the 
last run finished") or static ("every day before the morning standup").
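
For anyone wondering what "returns a datetime and a callback" could look like, here is a small sketch using `typing.Protocol`. The names `DeadlineLike`, `FixedOffsetDeadline`, the `evaluate` method, and the `context` dict are all assumptions made up for this example; the POC interface may look quite different.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, Protocol, Tuple


class DeadlineLike(Protocol):
    """Hypothetical contract: produce a deadline and the callback to run if it is missed."""

    def evaluate(self, context: Dict[str, datetime]) -> Tuple[datetime, Callable[[], None]]:
        ...


@dataclass
class FixedOffsetDeadline:
    """User-defined alert: a timestamp looked up from the context plus a fixed interval."""

    context_key: str
    interval: timedelta
    callback: Callable[[], None]

    def evaluate(self, context: Dict[str, datetime]) -> Tuple[datetime, Callable[[], None]]:
        return context[self.context_key] + self.interval, self.callback


fired = []
alert: DeadlineLike = FixedOffsetDeadline(
    context_key="last_run_end",
    interval=timedelta(days=1),
    callback=lambda: fired.append("daily_run_failed"),
)

deadline, callback = alert.evaluate({"last_run_end": datetime(2025, 2, 24, 8, 0)})
# Pretend "now" is 09:00 the next day, one hour past the deadline.
if datetime(2025, 2, 25, 9, 0) >= deadline:
    callback()
```

Because the protocol only cares about the return value, a user-supplied class could derive its datetime from anything it likes, as long as the scheduler can compare it against the current time.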



 - ferruzzi


________________________________
From: Daniel Standish <daniel.stand...@astronomer.io.INVALID>
Sent: Tuesday, February 25, 2025 10:13 AM
To: dev@airflow.apache.org
Subject: RE: [EXT] Two Hard Things: Deadline Alerts Edition

I thought it was planned for 3.1, not necessarily 3.0.

In any case, it's just an idea.  Let's see how Ferruzzi feels about it.
Even if the generalization is not taken up, perhaps you might look at the
suggestion re target event vs reference event.  I think it's helpful to
think about it that way but, that's me.



On Tue, Feb 25, 2025 at 7:20 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Re: generalization: Nice idea Daniel, but are you sure we can come up (and
> agree) with a generic enough approach and have it still done for Airflow 3?
> For me this is a bit YAGNI - and rather than generalising now, it would be
> better to hash it out, vote and see all the considerations in a new AIP -
> we are complicating the simple "deadline" feature to a more generic
> expectations that we might or might not need or not agree to and we don't
> even know what those would be. Any generalization makes it more complex to
> grasp by the users, so if we generalize - I think we should have a very
> good idea how to.
>
> I'd say we stick with the deadline as agreed and approved in AIP-86 and we
> can have a more generic "expectations" discussion later. It's very easy
> later to change it and deprecate this one, otherwise we are risking that
> whatever we come up with now will be half-baked and we will have to change
> it later anyway, because we miss an important "generalization" feature.
>
> My 3 cents.
>
> J.
>
>
> On Tue, Feb 25, 2025 at 3:56 PM Daniel Standish
> <daniel.stand...@astronomer.io.invalid> wrote:
>
> > Thinking about this interface a little bit I have some thoughts to share.
> >
> > I think it might be good to generalize it very slightly to "expectations"
> > so that we can set multiple different kinds of expectations, and not just
> > deadlines.
> >
> > I also thought it might be helpful to make clear what is the reference
> > event and what is the target event, and to think of these as events.  It
> > might make it a little easier to understand.
> >
> > Something like this
> >
> > with DAG(
> >     dag_id="something",
> >     expectations=[
> >         # this is about the interval between queued and finished
> >         DagRunEventDeadline(
> >             reference_event=DagRunEvent.QUEUED,
> >             target_event=DagRunEvent.END,
> >             max_interval=timedelta(hours=1),
> >         ),
> >         # this is about the interval between last run and this one
> >         DagRunDateDeadline(
> >             reference_event=DagRunEvent.LAST_START,
> >             target_event=DagRunEvent.START,
> >             max_interval=timedelta(hours=1),
> >         ),
> >         # this is about how many runs happened in the lookback interval
> >         RunFrequencyExpectation(
> >             expected_runs=5,
> >             lookback_interval=timedelta(days=3),
> >         ),
> >     ],
> > ) as dag:
> >     ...
> >
> > My "run frequency expectation" is completely made up but it's just there
> > to show, maybe there would be interesting things we could do re allowing
> > users to set expectations that are not specifically limited to deadlines.
> > In this example, the expectation that there should be 5 runs in last N
> > time.  I'm sure we can all think of others.  We could just implement some
> > deadline expectations, and maybe others could implement others.  And maybe
> > we could allow users to define their own expectations, but these, if not
> > known by airflow, would just be stored as json, and users could roll their
> > own system to evaluate them or leave it to 3rd party service etc.
> >
> > WDYT!?!?
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Feb 25, 2025 at 6:43 AM Daniel Standish <
> > daniel.stand...@astronomer.io> wrote:
> >
> > > also... curious can you confirm my understanding of what this means
> > >
> > > `landing_time = DeadlineReference.DAGRUN_QUEUED + timedelta(hours=1)`
> > >
> > > does this mean that the expectation will be missed if the *dag run does
> > > not complete* within 1 hr of dag run queued time?
> > >
> > > i.e. is it correct the event of interest is dag run completion?  and
> > > the "anchor" is, dag run queued time?
> > >
> > >
> > >
> > >
> > > On Tue, Feb 25, 2025 at 6:33 AM Daniel Standish <
> > > daniel.stand...@astronomer.io> wrote:
> > >
> > >> or "expected_by" i like expected_by over need_by
> > >>
> > >> On Tue, Feb 25, 2025 at 6:26 AM Ash Berlin-Taylor <a...@apache.org> wrote:
> > >>
> > >>> Breaking with the herd/answering a question you didn’t even ask
> > >>> (:evil-grin:)
> > >>>
> > >>> How about changing need_by to landing_time, i.e.
> > >>>
> > >>> `landing_time = DeadlineReference.DAGRUN_QUEUED + timedelta(hours=1)`
> > >>>
> > >>> Or some variation thereof - `landed_by` etc.
> > >>> -ash
> > >>>
> > >>>
> > >>>
> > >>> > On 25 Feb 2025, at 05:15, Ankit Chaurasia <sunank...@gmail.com> wrote:
> > >>> >
> > >>> > DeadlineReference +1.
> > >>> >
> > >>> > Regards,
> > >>> > *Ankit Chaurasia*
> > >>> >
> > >>> >
> > >>> >
> > >>> >
> > >>> >
> > >>> >
> > >>> >
> > >>> > On Tue, 25 Feb 2025 at 10:05, Pavankumar Gopidesu <gopidesupa...@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> >> +1  `DeadlineReference`
> > >>> >>
> > >>> >> Regards
> > >>> >> Pavan Kumar
> > >>> >>
> > >>> >> On Tue, Feb 25, 2025, 04:11 Amogh Desai <amoghdesai....@gmail.com> wrote:
> > >>> >>
> > >>> >>> Late to the party, but I'd vote for `DeadlineReference` too.
> > >>> >>> Concise and does the job well.
> > >>> >>>
> > >>> >>> Thanks & Regards,
> > >>> >>> Amogh Desai
> > >>> >>>
> > >>> >>>
> > >>> >>> On Tue, Feb 25, 2025 at 7:56 AM Wei Lee <weilee...@gmail.com> wrote:
> > >>> >>>
> > >>> >>>> A bit late to the party, but DeadlineReference +1.
> > >>> >>>>
> > >>> >>>> Best,
> > >>> >>>> Wei
> > >>> >>>>
> > >>> >>>>> On Feb 25, 2025, at 4:02 AM, Daniel Standish
> > >>> >>>> <daniel.stand...@astronomer.io.invalid> wrote:
> > >>> >>>>>
> > >>> >>>>> Stewing presently thank you :)
> > >>> >>>>>
> > >>> >>>>> On Mon, Feb 24, 2025 at 11:53 AM Ferruzzi, Dennis
> > >>> >>>>> <ferru...@amazon.com.invalid> wrote:
> > >>> >>>>>
> > >>> >>>>>> Alright then.  DeadlineReference gets the green light for now!
> > >>> >>>>>> If anyone has a suggestion they like more, please feel free to
> > >>> >>>>>> drop it in here.  I'm working on some Airflow 3.0 stuff before I
> > >>> >>>>>> get back to this, so there's plenty of time before this is set
> > >>> >>>>>> down if anyone wants to stew on it a bit.
> > >>> >>>>>>
> > >>> >>>>>> Thanks for the thoughts, all!
> > >>> >>>>>>
> > >>> >>>>>>
> > >>> >>>>>> - ferruzzi
> > >>> >>>>>>
> > >>> >>>>>>
> > >>> >>>>>> ________________________________
> > >>> >>>>>> From: Jarek Potiuk <ja...@potiuk.com>
> > >>> >>>>>> Sent: Monday, February 24, 2025 11:11 AM
> > >>> >>>>>> To: dev@airflow.apache.org
> > >>> >>>>>> Subject: RE: [EXT] Two Hard Things: Deadline Alerts Edition
> > >>> >>>>>>
> > >>> >>>>>> Yep. DeadlineReference is good.
> > >>> >>>>>>
> > >>> >>>>>>
> > >>> >>>>>> On Mon, Feb 24, 2025 at 8:07 PM Mehta, Shubham
> > >>> >>>> <shu...@amazon.com.invalid>
> > >>> >>>>>> wrote:
> > >>> >>>>>>
> > >>> >>>>>>> +1 to DeadlineReference. It is clear and allows flexibility to
> > >>> >>>>>>> choose any reference point.
> > >>> >>>>>>>
> > >>> >>>>>>> shubham
> > >>> >>>>>>>
> > >>> >>>>>>> On 2025-02-24, 10:00 AM, "Ferruzzi, Dennis"
> > >>> >>>>>>> <ferru...@amazon.com.INVALID> wrote:
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>> Hey folks. I need to narrow down the name for one of the
> > >>> >>>>>>> parameters on the Deadline Alerts work and I'm fishing for
> > >>> >>>>>>> suggestions.
> > >>> >>>>>>>
> > >>> >>>>>>> TLDR: need_by_date = {some existing timestamp} + {a user-defined timedelta}
> > >>> >>>>>>>
> > >>> >>>>>>> The existing timestamp could be dynamic (like when the dagrun is
> > >>> >>>>>>> queued or when a specific task starts, etc), or a fixed datetime.
> > >>> >>>>>>>
> > >>> >>>>>>> So in practice this will look something like:
> > >>> >>>>>>>
> > >>> >>>>>>> need_by = DeadlineTrigger.DAGRUN_QUEUED + timedelta(hours=1)
> > >>> >>>>>>>
> > >>> >>>>>>> where the DeadlineTrigger is the part we are trying to rename.
> > >>> >>>>>>>
> > >>> >>>>>>> I initially used Anchor in the AIP for lack of a better name and
> > >>> >>>>>>> it was universally hated. I like Trigger better, but that name is
> > >>> >>>>>>> already overloaded in Airflow so I don't want to reuse it. Maybe
> > >>> >>>>>>> Reference?
> > >>> >>>>>>>
> > >>> >>>>>>> need_by = DeadlineReference.DAGRUN_STARTED + timedelta(minutes=30)
> > >>> >>>>>>>
> > >>> >>>>>>> Please throw some other suggestions or naming thoughts on the
> > >>> >>>>>>> pile and maybe we can come up with something good.
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>> - ferruzzi
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>>
> > >>> >>>>>>
> > >>> >>>>
> > >>> >>>>
> > >>> >>>
> > >>> >>
> > >>>
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>
> > >>>
> >
>
