Damian,

Thank you for that response. That's an interesting use case, and I really
appreciate you sharing that perspective.
Just to make sure I understand, how are you "separating
Airflow Workers and setting Airflow Tasks to one server or the other"
today?

Vikram

On Wed, Nov 13, 2024 at 8:13 AM Damian Shaw <ds...@striketechnologies.com>
wrote:

> One use case that does not work with defer is when you're checking
> resources that are only available on a subset of Airflow workers.
>
> While my team produces reports for the entire company, we must segregate
> different business teams' information, and to ensure this there is a policy
> that no server has access to information from both teams. We separate our
> workers and set certain Airflow tasks to one server or the other. In this
> case we cannot set sensors to deferrable, because the Airflow defer process
> does not support choosing which triggerer to send to.
>
> I am sure that we aren't the only company that has this problem and uses
> Airflow workers in this way.
>
> Damian
>
> -----Original Message-----
> From: ambika garg <ambikagarg1...@gmail.com>
> Sent: Wednesday, November 13, 2024 11:01 AM
> To: dev@airflow.apache.org
> Subject: Re: [DISCUSSION] Replace Poke & Reschedule mode from Sensors for
> Airflow 3 in favor of Deferrable
>
> I would vote for option 3 as well. Making deferrable operators the default
> mode ensures that users benefit from the most efficient, async-driven
> solution without requiring any additional configuration changes. Also,
> keeping Poke and Reschedule modes ensures backward compatibility with
> existing operators and users who rely on these modes.
>
> On Wed, Nov 13, 2024 at 9:58 AM Vincent Beck <vincb...@apache.org> wrote:
>
> > I am definitely in favor of considering the deferrable mode as the
> > default one. Between 1 and 3, even though I am a big fan of removing
> > and simplifying things in general, I feel like (no real data here) we
> > are not ready for 1 yet. So my vote would go to 3. I feel like
> > removing the poke mode would require too much work on the operators.
> >
> > On 2024/11/13 13:59:35 Jarek Potiuk wrote:
> > > > I am torn between 1) and 3). While 1) is tempting and it would
> > > > simplify our state management, I think 3) is the safer choice. I
> > > > think if we were not able to convert all our operators to be
> > > > deferrable yet, there are probably many thousands of custom ones
> > > > that will stop working if we remove that feature.
> > >
> > > > If we go with 1), then any operator that would use "poke &
> > > > reschedule" would just become a "normal sensor" and simply start
> > > > taking resources while waiting. Technically it's not "breaking" the
> > > > flow, but it's likely to break installations which rely heavily on
> > > > rescheduling, and it would dramatically increase resource usage. And
> > > > there is no easy way out short of rewriting all such operators to
> > > > support deferrable.
> > >
> > > > I think personally that in making such a decision we should
> > > > consider two things:
> > > >
> > > > 1) will this stop some people from migrating to Airflow 3 because it
> > > > will be a "heavy operation"?
> > > > 2) how likely do we think that is to happen - will we lose big,
> > > > important users who might be "success" stories for Airflow 3?
> > >
> > > > I have no data to back it up (maybe some people here could have it),
> > > > but my intuition tells me that:
> > >
> > > 1) yes it will stop some users from migrating to Airflow 3 (because
> > > they will either have to accept increased resource usage or find
> > > engineering time to rewrite their custom operators)
> > > > 2) yeah, I think it's quite likely and quite likely big users that
> > > > could be "Airflow 3 success story" might be affected
> > >
> > > > But I am guessing. If someone could provide some data showing that
> > > > either assumption 1) or 2) is false, I am happy to support option
> > > > 1). For now it's 3).
> > >
> > >
> > >
> > > On Wed, Nov 13, 2024 at 2:43 PM Abhishek Bhakat
> > > <abhishek.bha...@astronomer.io.invalid> wrote:
> > >
> > > > > +1 to 3. For cases where absolute minimal latency is critical, and
> > > > > worker resources aren't constrained, poke mode could still be the
> > > > > optimal choice. I don't see any value in reschedule mode anymore,
> > > > > deferrable should be the default.
> > > >
> > > > > On Wed, Nov 13, 2024 at 12:21 PM Kaxil Naik <kaxiln...@gmail.com>
> > > > > wrote:
> > > >
> > > > > > There is a 4th option to keep things as-is too :)
> > > > >
> > > > > > On Wed, 13 Nov 2024 at 12:19, Kaxil Naik <kaxiln...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > > Following up on the Dev call discussions last Thursday, I am
> > > > > > > opening this up for discussion.
> > > > > >
> > > > > > > Reschedule mode was introduced to improve efficiency over poke
> > > > > > > mode by allowing tasks to wait without holding a worker slot.
> > > > > > > Since the introduction of deferrable operators in Airflow 2.2,
> > > > > > > however, we now have an even more optimal, async-driven
> > > > > > > solution. The adoption of deferrable operators has been really
> > > > > > > good, and since we are already chopping things off with
> > > > > > > Airflow 3 it might be time to consider making them the default
> > > > > > > mode.
> > > > > >
> > > > > > > This will ensure that our users always have the most optimal
> > > > > > > way of running sensors by default and that we, the maintainers
> > > > > > > or folks supporting Airflow deployments in companies, do not
> > > > > > > need to know different approaches with Reschedule mode, either.
> > > > > >
> > > > > > > However, not all sensors can be async, either due to
> > > > > > > limitations in underlying libraries or a lack of unique ids for
> > > > > > > async polling.
> > > > > >
> > > > > > > Knowing that, we have a few options:
> > > > > >
> > > > > > 1) *Remove Poke & Reschedule modes*
> > > > > >
> > > > > > > This is aggressive and it means we will have to remove things
> > > > > > > like PostgresSensor that do not support async.
> > > > > >
> > > > > > > 2) *Remove Reschedule mode*
> > > > > >
> > > > > > > Make deferrable the primary mode, falling back to poke where
> > > > > > > async isn’t supported.
> > > > > >
> > > > > > 3) *Make Deferrable the default, keep Poke & Reschedule*
> > > > > >
> > > > > > > This is a defensive option that maintains current behaviour
> > > > > > > but ensures that we have the most performant option by default.
> > > > > > > It could be as simple as making
> > > > > > > AIRFLOW__OPERATORS__DEFAULT_DEFERRABLE default to True.
> > > > > >
> > > > > > > I’d love to hear feedback, especially from users who rely on
> > > > > > > reschedule mode today!
> > > > > >
> > > > > > Regards,
> > > > > > Kaxil
> > > > > >
> > > > >
> > > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>
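
A rough way to see the resource trade-off discussed in this thread is to count how long each sensor mode occupies a worker slot. The sketch below is a toy model, not Airflow code: the function name `slot_seconds`, the one-second cost per slot use, and the example numbers are all assumptions chosen for illustration. It assumes poke mode holds a slot for the whole wait, reschedule mode holds one only while each check runs, and deferrable mode uses a slot once to start and once to resume while the triggerer does the waiting.

```python
import math


def slot_seconds(mode: str, wait_s: float, poke_interval_s: float,
                 slot_use_s: float = 1.0) -> float:
    """Approximate worker-slot seconds used by one sensor run (toy model).

    slot_use_s is an assumed cost of one short visit to a worker slot.
    """
    if mode == "poke":
        # The task sits in its worker slot for the entire wait.
        return float(wait_s)
    if mode == "reschedule":
        # The slot is held only while each check runs, then released.
        checks = math.ceil(wait_s / poke_interval_s)
        return checks * slot_use_s
    if mode == "deferrable":
        # One visit to start and defer, one to resume after the trigger
        # fires; the waiting itself happens in the triggerer.
        return 2 * slot_use_s
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Example: a one-hour wait checked every 60 seconds.
    for mode in ("poke", "reschedule", "deferrable"):
        print(mode, slot_seconds(mode, wait_s=3600, poke_interval_s=60))
```

Under these assumptions a one-hour wait with a 60-second interval costs about 3600 slot-seconds in poke mode, 60 in reschedule mode, and 2 in deferrable mode. That gap is why falling back from reschedule to poke for sensors that cannot defer would raise resource usage. The model ignores scheduler and triggerer overhead, and it says nothing about which host runs the trigger, which is the constraint Damian raises.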
