Yeah

On Wed, 13 Nov 2024 at 22:26, Jarek Potiuk <ja...@potiuk.com> wrote:

> Re: Damian's point.
>
> Just to separate this to another thread:
>
> > While my team produces reports for the entire company, we must segregate
> > different business teams' information, and to ensure this there is a policy
> > that no server has access to information from both teams. We separate our
> > workers and assign certain Airflow tasks to one server or the other. In this
> > case we cannot set sensors to deferrable, because the Airflow defer process
> > does not support choosing which triggerer to send to.
>
> > We use the celery executor and set the queue name on the relevant tasks,
> > that queue name corresponds to a celery queue, and then the airflow workers
> > are assigned a specific celery queue to read from.
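
To make the constraint concrete, here is a toy model in plain Python. It is
not Airflow or Celery code, and every class, function, host, and queue name in
it is hypothetical. Tasks carry a queue name and only the worker reading that
queue may run them, whereas a deferred sensor in this model has no routing key,
so its wait lands on whichever triggerer is available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    teams: frozenset  # teams whose data this host is allowed to reach


@dataclass(frozen=True)
class Task:
    task_id: str
    queue: str  # stands in for the Celery queue name set on the task
    team: str   # team whose data the task needs


def pick_worker(task, workers_by_queue):
    """Route a task to the worker that reads from its queue."""
    return workers_by_queue[task.queue]


def pick_triggerer(task, triggerers):
    """No routing key for deferral in this model: take the first triggerer."""
    return triggerers[0]


def can_run(task, host):
    """Policy check: the host must be allowed to reach the task's team data."""
    return task.team in host.teams


# Hypothetical deployment: one worker per team, one triggerer tied to team A.
worker_a = Host("worker-a", frozenset({"team_a"}))
worker_b = Host("worker-b", frozenset({"team_b"}))
workers_by_queue = {"team_a_queue": worker_a, "team_b_queue": worker_b}
shared_triggerer = Host("triggerer-1", frozenset({"team_a"}))

sensor_b = Task("wait_for_team_b_file", queue="team_b_queue", team="team_b")

worker_ok = can_run(sensor_b, pick_worker(sensor_b, workers_by_queue))
triggerer_ok = can_run(sensor_b, pick_triggerer(sensor_b, [shared_triggerer]))
print(worker_ok, triggerer_ok)
```

The queue-routed run satisfies the policy while the deferred wait does not,
which is the gap described above.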
>
> Multi-team Airflow (AIP-67):
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
>
> should serve this case very well. Each team can have its own
> environment, including dependencies and its own triggerer. It is currently
> planned for 3.1, so I guess this means that once it is available, you
> should be able to migrate to a deferrable-only approach for all your
> sensors.
>
> Do you agree, Damian?
>
> J.
>
>
>
>
> On Wed, Nov 13, 2024 at 8:07 PM Damian Shaw <ds...@striketechnologies.com>
> wrote:
>
> > We use the celery executor and set the queue name on the relevant tasks,
> > that queue name corresponds to a celery queue, and then the airflow
> > workers are assigned a specific celery queue to read from.
> >
> > Damian
> >
> > -----Original Message-----
> > From: Vikram Koka <vik...@astronomer.io.INVALID>
> > Sent: Wednesday, November 13, 2024 12:59 PM
> > To: dev@airflow.apache.org
> > Subject: Re: [DISCUSSION] Replace Poke & Reschedule mode from Sensors for
> > Airflow 3 in favor of Deferrable
> >
> > Damian,
> >
> > Thank you for that response. That's an interesting use case and I really
> > appreciate you sharing that perspective.
> > Just to make sure that I have clarity on this, how are you "separating
> > Airflow Workers and setting Airflow Tasks to one server or the other"
> > today?
> >
> > Vikram
> >
> > On Wed, Nov 13, 2024 at 8:13 AM Damian Shaw <ds...@striketechnologies.com>
> > wrote:
> >
> > > One use case that does not work with defer is when you're checking
> > > resources that are only available on a subset of Airflow workers.
> > >
> > > While my team produces reports for the entire company, we must
> > > segregate different business teams' information, and to ensure this
> > > there is a policy that no server has access to information from both
> > > teams. We separate our workers and assign certain Airflow tasks to one
> > > server or the other. In this case we cannot set sensors to
> > > deferrable, because the Airflow defer process does not support
> > > choosing which triggerer to send to.
> > >
> > > I am sure that we aren't the only company that has this problem and
> > > uses Airflow workers in this way.
> > >
> > > Damian
> > >
> > > -----Original Message-----
> > > From: ambika garg <ambikagarg1...@gmail.com>
> > > Sent: Wednesday, November 13, 2024 11:01 AM
> > > To: dev@airflow.apache.org
> > > Subject: Re: [DISCUSSION] Replace Poke & Reschedule mode from Sensors
> > > for Airflow 3 in favor of Deferrable
> > >
> > > I would vote for option 3 as well. Making deferrable operators the
> > > default mode ensures that users benefit from the most efficient,
> > > async-driven solution without requiring any additional configuration
> > > changes. Also, keeping Poke and Reschedule modes ensures backward
> > > compatibility with existing operators and users who rely on these
> > > modes.
> > >
> > > > On Wed, Nov 13, 2024 at 9:58 AM Vincent Beck <vincb...@apache.org>
> > > > wrote:
> > >
> > > > I am definitely in favor of considering the deferrable mode as the
> > > > default one. Between 1 and 3, even though I am a big fan of removing
> > > > and simplifying things in general, I feel like (no real data here)
> > > > we are not ready for 1 yet. So my vote would go to 3. I feel like
> > > > removing the poke mode would require too much work on the operators.
> > > >
> > > > On 2024/11/13 13:59:35 Jarek Potiuk wrote:
> > > > > > I am torn between 1) and 3). While 1) is tempting and would
> > > > > > simplify our state management, I think 3) is the safer choice.
> > > > > > Given that we have not yet been able to convert all our own
> > > > > > operators to be deferrable, there are probably many thousands of
> > > > > > custom ones that will stop working if we remove that feature.
> > > > > >
> > > > > > If we go with 1), then any operator that would have used "poke &
> > > > > > reschedule" should just become a "normal sensor" and simply start
> > > > > > taking resources while waiting. Technically that is not "breaking"
> > > > > > the flow, but it is likely to break installations that rely
> > > > > > heavily on rescheduling, and it would dramatically increase
> > > > > > resource usage. And there is no easy way out short of
> > > > > > rewriting all such operators to support deferrable.
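
A back-of-the-envelope sketch of that resource increase, in plain Python. The
wait time, poke interval, and per-check cost are made-up assumptions rather
than measurements, and scheduler overhead is ignored. In poke mode the sensor
occupies a worker slot for the entire wait; in reschedule mode it occupies one
only while each check runs.

```python
import math


def slot_seconds_poke(wait_s):
    # Poke mode: the worker slot is held for the entire wait.
    return wait_s


def slot_seconds_reschedule(wait_s, poke_interval_s, check_cost_s):
    # Reschedule mode: the slot is held only while each check runs.
    checks = math.ceil(wait_s / poke_interval_s)
    return checks * check_cost_s


# All three numbers below are assumptions chosen for illustration.
wait_s = 2 * 60 * 60       # the condition becomes true after 2 hours
poke_interval_s = 5 * 60   # one check every 5 minutes
check_cost_s = 10          # each check occupies a slot for 10 seconds

poke = slot_seconds_poke(wait_s)
resched = slot_seconds_reschedule(wait_s, poke_interval_s, check_cost_s)
ratio = poke / resched
print(poke, resched, ratio)
```

With these numbers a sensor that silently falls back from reschedule to poke
holds a slot for 7200 seconds instead of 240, a 30x increase per sensor.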
> > > > >
> > > > > > I personally think that in making such a decision we should
> > > > > > consider two things:
> > > > > >
> > > > > > 1) will this stop some people from migrating to Airflow 3 because
> > > > > > it will be a "heavy operation"?
> > > > > > 2) how likely do we think that is to happen, and will we lose
> > > > > > big, important users who might be "success stories" for Airflow 3?
> > > > > >
> > > > > > I have no data to back it up (maybe some people here have it),
> > > > > > but my intuition tells me that:
> > > > > >
> > > > > > 1) yes, it will stop some users from migrating to Airflow 3
> > > > > > (because they will either have to accept increased resource usage
> > > > > > or find engineering time to rewrite their custom operators)
> > > > > > 2) yes, I think it's quite likely, and big users that could be an
> > > > > > "Airflow 3 success story" might well be affected
> > > > > >
> > > > > > But I am guessing. If someone could provide some data showing
> > > > > > that either assumption 1) or 2) is false, I am happy to support
> > > > > > option 1). For now it's 3).
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Nov 13, 2024 at 2:43 PM Abhishek Bhakat
> > > > > <abhishek.bha...@astronomer.io.invalid> wrote:
> > > > >
> > > > > > > +1 to 3. For cases where absolute minimal latency is critical
> > > > > > > and worker resources aren't constrained, poke mode could still
> > > > > > > be the optimal choice. I don't see any value in reschedule mode
> > > > > > > anymore; deferrable should be the default.
> > > > > >
> > > > > > > On Wed, Nov 13, 2024 at 12:21 PM Kaxil Naik <kaxiln...@gmail.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > > There is a 4th option too: keep things as-is :)
> > > > > > >
> > > > > > > > On Wed, 13 Nov 2024 at 12:19, Kaxil Naik <kaxiln...@gmail.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > > Following up on the Dev call discussions last Thursday, I am
> > > > > > > > > opening this up for discussion.
> > > > > > > >
> > > > > > > > > Reschedule mode was introduced to improve efficiency over
> > > > > > > > > poke mode by allowing tasks to wait without holding a worker
> > > > > > > > > slot. Since the introduction of deferrable operators in
> > > > > > > > > Airflow 2.2, however, we now have an even more optimal,
> > > > > > > > > async-driven solution. The adoption of deferrable operators
> > > > > > > > > has been really good, and since we are already chopping
> > > > > > > > > things off with Airflow 3 it might be time to consider
> > > > > > > > > making them the default
> > > > > > > > > mode.
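
A small sketch of why the async-driven approach scales, using plain asyncio
rather than Airflow's triggerer. The function names, the number of sensors,
and the 0.2 second wait are all assumptions for illustration. Each wait yields
control while sleeping, so a single process can hold many pending waits at
once instead of dedicating one worker slot to each.

```python
import asyncio
import time


async def wait_for_condition(name, ready_after_s):
    # Stand-in for an async trigger: sleeping yields to the event loop,
    # so other waits make progress in the same process.
    await asyncio.sleep(ready_after_s)
    return name


async def run_triggers(n, ready_after_s):
    # Start n waits concurrently and collect their results in order.
    waits = [wait_for_condition(f"sensor_{i}", ready_after_s) for i in range(n)]
    return await asyncio.gather(*waits)


start = time.monotonic()
done = asyncio.run(run_triggers(100, 0.2))
elapsed = time.monotonic() - start
print(len(done), f"{elapsed:.2f}s")
```

One hundred waits finish in roughly the time of a single wait, whereas one
hundred blocking 0.2 second waits run back to back would take about 20 seconds.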
> > > > > > > >
> > > > > > > > > This will ensure that our users always have the most optimal
> > > > > > > > > way of running sensors by default, and that we, the
> > > > > > > > > maintainers or folks supporting Airflow deployments in
> > > > > > > > > companies, do not need to know the different approaches with
> > > > > > > > > Reschedule mode either.
> > > > > > > > >
> > > > > > > > > However, not all sensors can be async, either due to
> > > > > > > > > limitations in underlying libraries or a lack of unique ids
> > > > > > > > > for async polling.
> > > > > > > >
> > > > > > > > > Knowing that, we have a few options:
> > > > > > > > >
> > > > > > > > > 1) *Remove Poke & Reschedule modes*
> > > > > > > > >
> > > > > > > > > This is aggressive and it means we will have to remove
> > > > > > > > > things like PostgresSensor that do not support async.
> > > > > > > > >
> > > > > > > > > 2) *Remove Reschedule mode*
> > > > > > > > >
> > > > > > > > > Make deferrable the primary mode, falling back to poke where
> > > > > > > > > async isn’t supported.
> > > > > > > > >
> > > > > > > > > 3) *Make Deferrable the default, keep Poke & Reschedule*
> > > > > > > > >
> > > > > > > > > This is a defensive option that maintains current behaviour
> > > > > > > > > but ensures that we have the most performant option by
> > > > > > > > > default. It could be as simple as making
> > > > > > > > > AIRFLOW__OPERATORS__DEFAULT_DEFERRABLE default to
> > > > > > > > > True.
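
A minimal sketch of how a flipped default could behave under option 3, in
plain Python rather than Airflow's own configuration code. The environment
variable name is taken from the message above; the function name, the
precedence order, and the accepted spellings of "true" are assumptions made
for illustration.

```python
import os

ENV_VAR = "AIRFLOW__OPERATORS__DEFAULT_DEFERRABLE"


def resolve_deferrable(explicit=None, env=None, builtin_default=True):
    """Decide whether a sensor defers.

    Precedence in this sketch: an explicit argument on the sensor, then the
    environment variable, then the built-in default (True under option 3).
    """
    if explicit is not None:
        return explicit
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if raw is None:
        return builtin_default
    return raw.strip().lower() in {"true", "t", "1", "yes"}


# Nothing configured: the sensor defers by default.
print(resolve_deferrable(env={}))
# A deployment opts out globally and keeps today's behaviour.
print(resolve_deferrable(env={ENV_VAR: "False"}))
# A single sensor opts out even though the deployment default is True.
print(resolve_deferrable(explicit=False, env={ENV_VAR: "True"}))
```

The point of the sketch is that existing poke and reschedule sensors keep
working: they only need an explicit opt-out, per sensor or per deployment.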
> > > > > > > >
> > > > > > > > > I’d love to hear feedback, especially from users who rely on
> > > > > > > > > reschedule mode today!
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Kaxil
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > > --------------------------------------------------------------------
> > > > - To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > > >
> > > >
> > > ________________________________
> > >  Strike Technologies, LLC (“Strike”) is part of the GTS family of
> > > companies. Strike is a technology solutions provider, and is not a
> > > broker or dealer and does not transact any securities related business
> > > directly whatsoever. This communication is the property of Strike and
> > > its affiliates, and does not constitute an offer to sell or the
> > > solicitation of an offer to buy any security in any jurisdiction. It
> > > is intended only for the person to whom it is addressed and may
> > > contain information that is privileged, confidential, or otherwise
> > protected from disclosure.
> > > Distribution or copying of this communication, or the information
> > > contained herein, by anyone other than the intended recipient is
> > > prohibited. If you have received this communication in error, please
> > > immediately notify Strike at i...@striketechnologies.com, and delete
> > and destroy any copies hereof.
> > > ________________________________
> > >
> > > CONFIDENTIALITY / PRIVILEGE NOTICE: This transmission and any
> > > attachments are intended solely for the addressee. This transmission
> > > is covered by the Electronic Communications Privacy Act, 18 U.S.C
> > > ''2510-2521. The information contained in this transmission is
> > > confidential in nature and protected from further use or disclosure
> > > under U.S. Pub. L. 106-102, 113 U.S. Stat. 1338 (1999), and may be
> > > subject to attorney-client or other legal privilege. Your use or
> > > disclosure of this information for any purpose other than that
> > > intended by its transmittal is strictly prohibited, and may subject
> > > you to fines and/or penalties under federal and state law. If you are
> > > not the intended recipient of this transmission, please DESTROY ALL
> > > COPIES RECEIVED and confirm destruction to the sender via return
> > transmittal.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > >
> > ________________________________
> >  Strike Technologies, LLC (“Strike”) is part of the GTS family of
> > companies. Strike is a technology solutions provider, and is not a broker
> > or dealer and does not transact any securities related business directly
> > whatsoever. This communication is the property of Strike and its
> > affiliates, and does not constitute an offer to sell or the solicitation
> of
> > an offer to buy any security in any jurisdiction. It is intended only for
> > the person to whom it is addressed and may contain information that is
> > privileged, confidential, or otherwise protected from disclosure.
> > Distribution or copying of this communication, or the information
> contained
> > herein, by anyone other than the intended recipient is prohibited. If you
> > have received this communication in error, please immediately notify
> Strike
> > at i...@striketechnologies.com, and delete and destroy any copies
> hereof.
> > ________________________________
> >
> > CONFIDENTIALITY / PRIVILEGE NOTICE: This transmission and any attachments
> > are intended solely for the addressee. This transmission is covered by
> the
> > Electronic Communications Privacy Act, 18 U.S.C ''2510-2521. The
> > information contained in this transmission is confidential in nature and
> > protected from further use or disclosure under U.S. Pub. L. 106-102, 113
> > U.S. Stat. 1338 (1999), and may be subject to attorney-client or other
> > legal privilege. Your use or disclosure of this information for any
> purpose
> > other than that intended by its transmittal is strictly prohibited, and
> may
> > subject you to fines and/or penalties under federal and state law. If you
> > are not the intended recipient of this transmission, please DESTROY ALL
> > COPIES RECEIVED and confirm destruction to the sender via return
> > transmittal.
> >
>
