Re: [DISCUSS] simplifying try_number handling

Amogh Desai Thu, 02 May 2024 21:00:33 -0700

Looks good to me.

Personally I never ran into any issues with this so far but I agree with
the issues it solves.
Thanks & Regards,
Amogh Desai



On Fri, May 3, 2024 at 2:50 AM Vincent Beck <[email protected]> wrote:

> I am all +1 on this one. This thing gave me headaches when working on
> AIP-44 and I could not understand the difference between the private
> "_try_number" and the public "try_number". Thanks for simplifying it!
>
> This is obviously assuming it does not break anything I am not aware of :)
>
> On 2024/05/02 19:37:32 Daniel Standish wrote:
> > TLDR
> > * changing handling of try_number in
> > https://github.com/apache/airflow/pull/39336
> > * no more private attr
> > * no more getter that changes value based on state of task
> > * no more decrementing
> > * try number now only handled by scheduler
> > * hope that sounds good to all of you
> >
> > For more detail read on...
> >
> > In https://github.com/apache/airflow/pull/39336 I am doing some work to
> > resolve some longstanding pain and frustration caused by try_number.
> >
> > The way we handle try_number has for quite some time been messy and
> > problematic.
> >
> > For example, if you access `ti.try_number` and then change the state to
> or
> > from RUNNING, you will get a different value if you access it again!
> >
> > And the responsibility for managing this number has been distributed
> > throughout the codebase.  For example the task itself always increments
> > when it starts running.  But then if it defers or reschedules itself, it
> > decrements it back down so that when it runs again and naively
> increments,
> > then it will be right again.
> >
> > Recently more issues have become visible as I have worked with AIP-44
> > because for example pydantic does not like private attrs and it's just
> > awkward to know *what value to use* when serializing it when the TI will
> > give you a different answer depending on the state of the task!
> >
> > And there's yet another edge case being solved in this community PR
> > <https://github.com/apache/airflow/pull/38984#issuecomment-2090944403>.
> >  And then when we start looking at try history and AIP-64, it also
> forces a
> > look at this.
> >
> > So it all sounds bad and indeed it is bad but I think I have a solution.
> >
> > What I do is, just have the scheduler increment try_number at the moment
> > when it schedules the task.  It alone will have the responsibility for
> > incrementing try_number.  And no where will it ever be decremented.  It
> > will not be incremented when resuming after deferral or reschedule.  And
> > that's about all there is to it.
> >
> > I've tested it out and it works.  But I'm working through many test
> > failures that need to be resolved (there's lots of asserts re
> try_number).
> >
> > One small thing I just want to point out is that if a user were
> previously
> > to be doing `task.run()` sort of manually without the task having been
> > scheduled by the scheduler, well now their try_number won't be
> > automatically incremented.  Same if they just do `airflow tasks run` --
> > because now the responsibility is going to be solely with the scheduler.
> > But airflow was never designed to assume that tasks will be run without
> > having been scheduled, so I do not think that counts as a breaking
> change.
> > So I don't think that's a blocker for this.
> >
> > Thanks for the consideration.  Let me know if you have any concerns.
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

Re: [DISCUSS] simplifying try_number handling

Reply via email to