Also another good case that just happened -
https://github.com/apache/airflow/discussions/44211  a user added a very
valid question on overloading triggerers (i.e. trigger scalability). If we
are able to answer the question of the user now - this would be great (I
can't).

But I think the answer won't be pretty (i.e. we can't help with that now) -
so my question is how confident is that we can support such user's
questions and how confident current scalability promises for example can be
met ?

J.


On Wed, Nov 20, 2024 at 11:11 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Actually that's an extremely good point and I agree with Elad here.
>
> If we commit to something as first priority, we should treat issues we
> have with it with priority.
> There is another - very similar - issue with on_kill handling for
> deferrables that I just recalled.
>
> The issue is here: https://github.com/apache/airflow/issues/36090 that
> Matthew Wicks has been following on multiple times, and proposed a way to
> approach it in general way - but the only solution we were able to come up
> with was to implement workaround in a few deferrable operators.
>
> I am here with Elad that until we solve this and prioritise other issues
> there, making deferrable a default is a bad idea.
>
> J.
>
>
>
>
> On Wed, Nov 20, 2024 at 6:56 PM Elad Kalif <elad...@apache.org> wrote:
>
>> I don't think deferrable can act as a replacement at this point.
>> Removing/changing defaults of poke and reschedule in favor of deferrable
>> means that deferrable becomes a critical feature. This means that any bug
>> related to it must be triaged and fixed ASAP.
>> We are not there yet. I don't think many maintainers know this area of the
>> code.
>>
>> In my perspective as a user I have a functionality that is working fine. I
>> learned to trust it. I know how it behaves and am able to write code
>> efficiently with it. The suggested replacement is a feature that has 30ish
>> bug reports with little community focus. That is a lot of risk with small
>> profit to gain. For me this is an indication of -1 to this proposal.
>>
>> For example issue https://github.com/apache/airflow/issues/36734 with
>> suggested PR: https://github.com/apache/airflow/pull/33718
>> I admit I am not into the details but if we accept this proposal this bug
>> becomes top priority problem and this is open since 2023.
>>
>> So my position is -1 for any change about poke and reschedule (even for
>> changing defaults).
>>
>>
>>
>> On Thu, Nov 14, 2024 at 10:28 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > > I think that if we have implemented async for an operator, we should
>> > remove
>> > the non-async version.
>> >
>> > I think we all agree here. The question is only "when". I see two
>> options:
>> >
>> > 1) When Airflow 3 is released
>> > 2) When Airflow 2 stops being supported
>> >
>> > I'd be for 2) .
>> >
>> > Why? Because it has the least impact on our users while it does not make
>> > our life much harder, while we might commit to clean-up eventually.
>> >
>> > I personally (with my ADHD personality twist) often have an urge to fix
>> > things "now". It's very rewarding and has great "feedback" of feeling
>> > better. But I think the next best thing is to have a clear timeline to
>> > remove things like that and relentlessly follow it through. And it has
>> the
>> > added benefits of avoiding surprises for users who follow our decisions
>> and
>> > policies.
>> >
>> > I think the biggest problem I have with things that are deprecated are
>> not
>> > that they are lying around, but with the feeling that they will stay
>> there
>> > forever (aka - we will not clean our technical debt). I think having a
>> > policy when we remove things and following that, avoids that problem (as
>> > long as we do follow).
>> >
>> > See for example this PR for Google Provider that Max opened today:
>> > https://github.com/apache/airflow/pull/43953.= ~ -8000 lines of code we
>> > will not have to maintain. and whoever followed it, they had 5 months
>> or so
>> > to adjust as Google team set some rules for that and in most cases set
>> > the deprecation date. This is cool. Most of those warnings have a
>> > "planned_removal_date" set.
>> >
>> > And I think it also shows our maturity - that we do not "jump" on
>> changing
>> > something, but we plan it, we have a reason behind it, whe know why we
>> are
>> > doing it. what are consequences to our users, and follow through, giving
>> > enough warning to the users (those who care to listen).
>> >
>> > And the more we do it, the more we plan and announce in advance and
>> > relentlessly follow it through, the more our users will listen to those
>> > things.
>> >
>> > J.
>> >
>> >
>> > On Thu, Nov 14, 2024 at 4:12 PM Daniel Standish
>> > <daniel.stand...@astronomer.io.invalid> wrote:
>> >
>> > > To me, I don't really mind reschedule mode so much.  But I *really
>> *don't
>> > > like that we have *both* styles implemented in operators (talking
>> about
>> > > providers here).
>> > >
>> > > I think that if we have implemented async for an operator, we should
>> > remove
>> > > the non-async version.
>> > >
>> > > And we should remove the `deferrable` param for 3.0 -- this is
>> something
>> > > that to me makes no sense, since it implies that sensors should have
>> both
>> > > ways implemented.
>> > >
>> > > But users should still be allowed to write rescheduling sensors.
>> > >
>> >
>>
>

Reply via email to