+100 to what Jarek and Michal said. Changing DAG code for this will seriously 
impact Airflow 3 adoption.

We can move the code to a provider without having to change the DAG Author 
interface can’t we? For instance we can change the DAG parser to convert these 
options in to an SMTP notifier can’t we?

-ash 

> On 28 Jan 2025, at 16:50, Jarek Potiuk <ja...@potiuk.com> wrote:
> 
> I agree with Michał - it's not how simple it is, but that you do it at all.
> Chance is that if you have 1000s of DAGs in your company and some of them
> use email notification, it means that vast majority of them will. And I
> already know of companies where there are 20+ teams working on those and
> having tens of changes in all of those dags happening a day, and wha we
> basically ask the "deployment managers" of those teams to make a migration
> where all of those teams will have to - at the day of migration modify
> their DAGs to account for this change.
> 
> And in many places it's difficult to keep consistency - some people will
> use one way and some will use other ways.
> 
> Great that we see some code examples and what would happen, because it
> opens up my imagination and now I know that what I felt intuitively, is
> pretty much justified.
> I can imagine variations of the code, ALL at the same time implemented in a
> single Airflow installation:
> 
> COMMON CODE:
> common.get_emails_to_send_from(group):
>   if group == "group1":
>      return ["a@b.c"}
>   if group. == "group2"
>      return ["a@c.d"]
> 
> with DAG(
>    default_args={
>        "email": get_emails_to_send_from("my_group"),
>        "email_on_failure": True,
>        "email_on_retry": True,
>    },
> 
> with DAG(
>    default_args={
>        "email": [Variable.get("email_for_my_customer_x_dags")],
>        "email_on_failure": True,
>        "email_on_retry": True,
>    },
> 
> team = os.environ.get("TEAM_MANAGING_THE_MACHINE")
> 
> with DAG(
>    default_args={
>        "email": [f"{team}@mycompany_com"],
>        "email_on_failure": True,
>        "email_on_retry": True,
>    },
> 
> with DAG(
>    default_args={
>        "email_on_failure": True,
>        "email_on_retry": True
>    ):
>    op1 = FirstOperator(
>        "email" = ["firstopera...@company.com"]
>    }
> 
>    op2 = SecondOperator)
>        "email" = ["secondopera...@company.com"]
>        }
> 
> 
> And this is only the beginning, I can imagine 100s of other ways you could
> set up notification schemes - and there are so many ways possible that I
> hardly imagine automation that could handle such a move.  This would be
> pretty much a nightmare for such deployment managers who would have to
> track down and make sure that all such cases and usages will be converted
> "properly" - and DAG authors won't care. They will just want them to "work"
> after migration.
> 
> I don't think we can do more than deprecation and possibly converting it to
> the common Notifiers automatically under the hood - to be honest.
> 
> On Tue, Jan 28, 2025 at 2:23 PM Michał Modras
> <michalmod...@google.com.invalid> wrote:
> 
>> I am concerned simply because it is a physical code change, and one that
>> would affect a good percentage of DAGs out there. No matter how complex the
>> change is, it forces the users to modify their code, which is huge
>> operational overhead in bigger organizations. Imagine - there could be a
>> central platform team responsible for migrating to Airflow 3. But then,
>> before Airflow 3 deployments can be used, each team using particular
>> Airflow deployment would need to modify the code. At the same time, the
>> platform team might not be permitted to touch code of these teams -
>> these are different personas. It can easily become a very complex migration
>> procedure to expedite across the organization, even if the code change
>> itself is simple.
>> 
>> On Tue, Jan 28, 2025 at 3:05 PM Elad Kalif <elad...@apache.org> wrote:
>> 
>>> I'd like to provide example:
>>> 
>>> with DAG(
>>>    default_args={
>>>        "email": ["airf...@airflow.com"],
>>>        "email_on_failure": True,
>>>        "email_on_retry": True,
>>>    },
>>>    ...
>>> ) as dag:
>>> 
>>> becomes:
>>> 
>>> from airflow.providers.smtp.notifications.smtp import SmtpNotifier
>>> 
>>> with DAG(
>>>    default_args={
>>>        on_failure_callback=SmtpNotifier(to="airf...@airflow.com"),
>>>        on_retry_callback=SmtpNotifier(to="airf...@airflow.com"),
>>>    },
>>>    ...
>>> ) as dag:
>>> 
>>> 
>>> That is it. There are no "moving configurations". There are no judgment
>>> calls.
>>> I do not follow on why this is risky nor why it is expensive.
>>> Is the issue you are worried about that this change can not be automated
>> by
>>> the migration tool we plan to have?
>>> 
>>> Am I missing something?
>>> 
>>> 
>>> On Tue, Jan 28, 2025 at 3:48 PM Michał Modras
>>> <michalmod...@google.com.invalid> wrote:
>>> 
>>>> +1 to what Jarek proposes.
>>>> 
>>>> IMO, for Airflow to become not only a leading solution in its market,
>> but
>>>> also a de facto standard, we need to be super conscious about the
>> effort
>>>> required for users to maintain it and keep it up to date. It's a super
>>>> common concern raised by the users, especially those, who don't only
>> run
>>>> Airflow on their laptops, but manage deployments of hundreds or
>> thousands
>>>> Airflow environments.
>>>> 
>>>> If we lean on doing things that are right from a purely engineering
>> point
>>>> of view, but put cost on the users, we'll go in the opposite direction.
>>> It
>>>> might be seen as a sad reality from the perspective of engineering
>>> purity,
>>>> but *user* experience should come first, Airflow maintainer second.
>>>> 
>>>> On Tue, Jan 28, 2025 at 12:01 AM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>>>> 
>>>>> I'd say - if it is a very used feature, and likely a lot of people
>> use
>>>> it,
>>>>> the only way we can justify that change is if we can fully automate
>> it
>>>> with
>>>>> near 100% certainty. This is one of the reasons we did not want to do
>>>>> Templating change as breaking in 3.0  - because it was unlikely to be
>>>> fully
>>>>> automatically "patchable" (i.e. behave in the same way).
>>>>> 
>>>>> I think we all agreed at the very beginning of Airflow 3 migration
>> that
>>>>> while it's perfectly fine to make a change that requires
>> reconfiguring
>>>>> airflow, any change that REQUIRES a big number of DAGs to be
>> rewritten
>>> is
>>>>> no go. And at the very least we should only do that when there is a
>>>> massive
>>>>> gain for the users and us. I am not sure if we are breaking the DAGs,
>>> the
>>>>> "massive pain" is in this case justified by "massive gain". I simply
>>>> don't
>>>>> see the gain clearly.
>>>>> 
>>>>> It's fine if the configuration has to be adapted by the admin person
>>> for
>>>>> everyone, but it's not fine if each user has to go and fix their
>> DAGs.
>>> In
>>>>> many cases this will introduce months of  delays with migration to
>>>> Airflow
>>>>> 3 for little reason - because the people who manage  Airflow will
>> have
>>> to
>>>>> get everyone who writes DAG (sometimes many people, some of them busy
>>> or
>>>>> unavailable at times) to apply and modify their changes, and
>> migration
>>> in
>>>>> one "go" will be difficult.
>>>>> 
>>>>> I'd be for option 2  but where email_on_failure would remain but will
>>> be
>>>>> deprecated and will be really the same as "notify_on_failure" - while
>>> the
>>>>> admin will have to configure default notifier in this case. Yes it
>> will
>>>>> make no sense if someone will set "email_on_failure" and notifier
>> will
>>> be
>>>>> different from email but there will be a deprecation warning (and we
>>> can
>>>>> also raise it every time notification is done). I see no other
>> problem
>>>> with
>>>>> that.
>>>>> 
>>>>> J.
>>>>> 
>>>>> On Mon, Jan 27, 2025 at 11:56 PM Hussein Awala <huss...@awala.fr>
>>> wrote:
>>>>> 
>>>>>>> But the cost is forcing all users of this popular feature to
>> modify
>>>>>> source code of their DAGs, putting another hurdle for Airflow 3
>>>> adoption.
>>>>>> 
>>>>>> We're all Airflow users before being maintainers and contributors,
>>> so I
>>>>>> understand how difficult it can be to update code for a library
>>>> upgrade.
>>>>>> However, breaking changes are a normal part of major releases, and
>>>> users
>>>>>> should be prepared for them. We’re not asking users to update all
>>> their
>>>>>> code at once - any option we choose will be backported to 2.11,
>>>> allowing
>>>>>> users to clean up their code and resolve incompatibilities smoothly
>>>>> before
>>>>>> upgrading to Airflow 3.0.
>>>>>> 
>>>>>> On Mon, Jan 27, 2025 at 11:36 PM Michał Modras
>>>>>> <michalmod...@google.com.invalid> wrote:
>>>>>> 
>>>>>>>> Actually, modifying DAGs code is the easiest part of the
>> migration
>>>>>>> 
>>>>>>> I strongly disagree. Modifying source code of DAGs might be easy
>>> for
>>>>>>> individual devs, but it is what blocks customers who have
>>> significant
>>>>>> size
>>>>>>> deployments from perfoming upgrades and migrations. Even if the
>>> code
>>>>>> change
>>>>>>> is seemingly simple and automated, it becomes a massive
>> operational
>>>>>>> overhead.
>>>>>>> 
>>>>>>> To be honest, I don't think we're net positive on this cleanup -
>>>> sure,
>>>>>> the
>>>>>>> code get cleaner, there are no potentially overlapping
>> parameters,
>>> we
>>>>>>> separate email which is not core Airflow feature, etc. All great
>>>> stuff.
>>>>>> But
>>>>>>> the cost is forcing all users of this popular feature to modify
>>>> source
>>>>>> code
>>>>>>> of their DAGs, putting another hurdle for Airflow 3 adoption.
>>>>>>> 
>>>>>>> pon., 27 sty 2025, 23:21 użytkownik Elad Kalif <
>> elad...@apache.org
>>>> 
>>>>>>> napisał:
>>>>>>> 
>>>>>>>> I vote for option 1. complete removal.
>>>>>>>> 
>>>>>>>> Email is just another integration. We should not favor it.
>>>>>>>> 
>>>>>>>> I agree with Michal that it's massively used but I do not agree
>>> on
>>>>> the
>>>>>>>> risk.
>>>>>>>> 
>>>>>>>> From my point of view, the SmtpNotifier (Which has templates!
>>>> thanks
>>>>> to
>>>>>>>> https://github.com/apache/airflow/pull/36226) has existed for
>> a
>>>> long
>>>>>>> time.
>>>>>>>> It provides almost all the needed functionality (except the
>>>>>>>> default_email_on_failure) and the migration to it is very
>> simple.
>>>> We
>>>>>> can
>>>>>>>> catch all the cases with the upgrade tool we plan and alert
>> users
>>>>> where
>>>>>>>> they need to make changes.
>>>>>>>> 
>>>>>>>> Given that:
>>>>>>>> a. We can prevent broken dags by catching all cases in
>> migration
>>>> tool
>>>>>>>> b. Migration is easy. No thinking required. It's just a copy -
>>>> paste
>>>>>>>> replacement.
>>>>>>>> c. There are no edge cases
>>>>>>>> 
>>>>>>>> I think it's considered safe to continue with complete removal.
>>>>>>>> Now, the question left is: Is it worth it? What value do we get
>>>> from
>>>>>>> doing
>>>>>>>> that?
>>>>>>>> I don't know about other people's experience but I am tired of
>>>>>> explaining
>>>>>>>> to users about the difference between on_failure_call_back and
>>>>>>>> email_on_failure. What happens when someone use both, etc...
>>>>>>>> It confuses so many people. We have a chance to get it simple.
>>>> Let's
>>>>> do
>>>>>>>> that.
>>>>>>>> 
>>>>>>>> However, My answer did not cover the default_email_on_failure.
>>>>>>> Personally,
>>>>>>>> I don't know anyone who actually uses it.
>>>>>>>> In most Airflow envs I saw there is more than 1 person / 1 team
>>> so
>>>> it
>>>>>>> makes
>>>>>>>> little sense to have a global notification address.
>>>>>>>> For those kind of deployments it's no issue because they are
>> not
>>>>> using
>>>>>>>> it, for other, smaller deployments that do use it - I can argue
>>>> that
>>>>>> the
>>>>>>>> change is very low impact as it's very easy to fix, but we can
>>> also
>>>>>>> modify
>>>>>>>> the SMTP connection to hold default_to_email if we want to
>>> mitigate
>>>>>> that
>>>>>>>> risk.
>>>>>>>> 
>>>>>>>> On Tue, Jan 28, 2025 at 12:19 AM Hussein Awala <
>> huss...@awala.fr
>>>> 
>>>>>> wrote:
>>>>>>>> 
>>>>>>>>>> avoiding forcing users to modify their DAGs code
>>>>>>>>> 
>>>>>>>>> Actually, modifying DAGs code is the easiest part of the
>>>> migration
>>>>>>>>> for users, because we will provide migration rules via ruff.
>>> For
>>>>> more
>>>>>>>>> details, please see
>>>> https://github.com/apache/airflow/issues/41641
>>>>> .
>>>>>>>>> 
>>>>>>>>> On Mon, Jan 27, 2025 at 11:01 PM Michał Modras
>>>>>>>>> <michalmod...@google.com.invalid> wrote:
>>>>>>>>> 
>>>>>>>>>> Email notifications are a *massively* used Airflow
>> feature. I
>>>>>> suggest
>>>>>>>>> that
>>>>>>>>>> any change does not break the current syntax of
>> BaseOperator,
>>>>>>> avoiding
>>>>>>>>>> forcing users to modify their DAGs code. Changes on the
>>> config
>>>>> side
>>>>>>> and
>>>>>>>>>> provider separation should be fine though (seems separating
>>>>>> provider
>>>>>>>>> while
>>>>>>>>>> keeping the parameters of BaseOperator would require some
>>> safe
>>>>>>>> defaults).
>>>>>>>>>> 
>>>>>>>>>> On Mon, Jan 27, 2025 at 10:51 PM Hussein Awala <
>>>> huss...@awala.fr
>>>>>> 
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hello everyone!
>>>>>>>>>>> 
>>>>>>>>>>> We had an implicit consensus about removing the
>> email/SMTP
>>>>>>>> integration
>>>>>>>>>> from
>>>>>>>>>>> Airflow Core and replacing it with the SMTP provider and
>>>>>>>> SmtpNotifier.
>>>>>>>>>> This
>>>>>>>>>>> change will simplify Airflow's core, improve
>>> maintainability,
>>>>> and
>>>>>>>>>> provide a
>>>>>>>>>>> more consistent experience with other providers. Now is
>> the
>>>>> best
>>>>>>>> moment
>>>>>>>>>> to
>>>>>>>>>>> do this before Airflow 3.0.
>>>>>>>>>>> 
>>>>>>>>>>> All the Airflow configurations in the *smtp* section are
>>>>>> supported
>>>>>>> in
>>>>>>>>> the
>>>>>>>>>>> SMTP connection, so removing the whole section should
>> have
>>>>>> minimal
>>>>>>>>> impact
>>>>>>>>>>> on users. The migration effort is expected to be very
>> low.
>>>>>>>>>>> 
>>>>>>>>>>> This thread focuses on how best to handle the
>>> configurations
>>>>>>>> currently
>>>>>>>>> in
>>>>>>>>>>> the email section, mainly:
>>>>>>>>>>> 
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>   *default_email_on_failure*
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>   *default_email_on_retry*
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>   *email_backend*
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>   *email_conn_id*
>>>>>>>>>>> 
>>>>>>>>>>> The other configurations could be replaced by connection
>>>>> extras.
>>>>>>>>>>> 
>>>>>>>>>>> Options for handling email configuration:
>>>>>>>>>>> 
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>   Option 1: Complete Removal
>>>>>>>>>>> 
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>      Remove the entire email section from the Airflow
>>>>>>> configuration.
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>      Remove the *email_on_failure* and *email_on_retry*
>>>>>> arguments
>>>>>>>> from
>>>>>>>>>>>      BaseOperator.
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>      Implications:
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>         Users will need to explicitly define notifiers
>>>> (using
>>>>>>>>>>>         on_*_callbacks) for any task that requires email
>>>>>>>>> notifications.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         There will be no built-in mechanism for default
>>>> email
>>>>>>>>>>>         notifications.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         This offers the most significant simplification
>> of
>>>>>>> Airflow's
>>>>>>>>>> core
>>>>>>>>>>>         but may increase the effort required to
>> configure
>>>>> email
>>>>>>>>> alerts.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>   Option 2: Replacement Configurations
>>>>>>>>>>> 
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>      Introduce new configuration options:
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>         *default_notification_on_failure*: (boolean)
>>> Enables
>>>>>>> default
>>>>>>>>>>>         notifications on task failure.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         *default_notification_on_retry*: (boolean)
>> Enables
>>>>>> default
>>>>>>>>>>>         notifications on task retry.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         *default_notifier*: (string) Specifies a method
>>> that
>>>>>>>>> initializes
>>>>>>>>>>>         and returns the default notifier.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>      Replace *email_on_failure* and *email_on_retry* in
>>>>>>> BaseOperator
>>>>>>>>>> with
>>>>>>>>>>>      *notify_on_failure* and *notify_on_retry*, which
>>> would
>>>>>>> default
>>>>>>>> to
>>>>>>>>>> the
>>>>>>>>>>>      values set in the new configuration options.
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>      Replace the old email handling behavior by simply
>>>> adding
>>>>>> the
>>>>>>>>>> notifier
>>>>>>>>>>>      to the corresponding callback.
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>      Implications:
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>         Preserve the concept of default notifications
>>> while
>>>>>> moving
>>>>>>>>> away
>>>>>>>>>>>         from the mail-specific configuration.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         Support other notification types and send
>>>>> notifications
>>>>>> by
>>>>>>>>>>> default.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         Requires introducing new configuration options
>> and
>>>>>>> modifying
>>>>>>>>>>>         BaseOperator.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>   Option 3: Dedicated Notification Connection
>>>>>>>>>>> 
>>>>>>>>>>>   -
>>>>>>>>>>> 
>>>>>>>>>>>      Create a new connection (of any type) with a
>> specific
>>>> and
>>>>>>>>> reserved
>>>>>>>>>>>      name for notifications (e.g.,
>>>>> *notifications_connection*).
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>      This connection would have extras to configure the
>>>>> notifier
>>>>>>> and
>>>>>>>>>>>      specify whether to send notifications by default.
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>      Implications:
>>>>>>>>>>>      -
>>>>>>>>>>> 
>>>>>>>>>>>         Provides a centralized and extensible way to
>>> manage
>>>>> all
>>>>>>>>>>>         notification settings without Airflow
>>> configuration.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         Allows users to easily switch between different
>>>>>>> notification
>>>>>>>>>>>         methods and send notifications by default.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         May require more initial setup for users but
>>> offers
>>>>>>> greater
>>>>>>>>>>>         flexibility in the long term.
>>>>>>>>>>>         -
>>>>>>>>>>> 
>>>>>>>>>>>         Require more dev on our side.
>>>>>>>>>>> 
>>>>>>>>>>> Regarding migration, a fully automated migration might
>> not
>>> be
>>>>>>>> possible
>>>>>>>>> in
>>>>>>>>>>> all cases because of the need to migrate Airflow
>>>>> configurations,
>>>>>> we
>>>>>>>>> will
>>>>>>>>>>> provide comprehensive documentation and examples to
>> assist
>>>>> users
>>>>>> in
>>>>>>>>>>> transitioning their email configurations.
>>>>>>>>>>> 
>>>>>>>>>>> I'm interested in hearing your thoughts and any other
>>>>> suggestions
>>>>>>> you
>>>>>>>>> may
>>>>>>>>>>> have.
>>>>>>>>>>> 
>>>>>>>>>>> P.S: I have created an almost ready-to-merge PR for
>> option
>>> 1
>>>> (
>>>>>>>>>>> https://github.com/apache/airflow/pull/46041)
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org

Reply via email to