+100 to what Jarek and Michal said. Changing DAG code for this will seriously impact Airflow 3 adoption.
We can move the code to a provider without having to change the DAG Author interface can’t we? For instance we can change the DAG parser to convert these options in to an SMTP notifier can’t we? -ash > On 28 Jan 2025, at 16:50, Jarek Potiuk <ja...@potiuk.com> wrote: > > I agree with Michał - it's not how simple it is, but that you do it at all. > Chance is that if you have 1000s of DAGs in your company and some of them > use email notification, it means that vast majority of them will. And I > already know of companies where there are 20+ teams working on those and > having tens of changes in all of those dags happening a day, and wha we > basically ask the "deployment managers" of those teams to make a migration > where all of those teams will have to - at the day of migration modify > their DAGs to account for this change. > > And in many places it's difficult to keep consistency - some people will > use one way and some will use other ways. > > Great that we see some code examples and what would happen, because it > opens up my imagination and now I know that what I felt intuitively, is > pretty much justified. > I can imagine variations of the code, ALL at the same time implemented in a > single Airflow installation: > > COMMON CODE: > common.get_emails_to_send_from(group): > if group == "group1": > return ["a@b.c"} > if group. == "group2" > return ["a@c.d"] > > with DAG( > default_args={ > "email": get_emails_to_send_from("my_group"), > "email_on_failure": True, > "email_on_retry": True, > }, > > with DAG( > default_args={ > "email": [Variable.get("email_for_my_customer_x_dags")], > "email_on_failure": True, > "email_on_retry": True, > }, > > team = os.environ.get("TEAM_MANAGING_THE_MACHINE") > > with DAG( > default_args={ > "email": [f"{team}@mycompany_com"], > "email_on_failure": True, > "email_on_retry": True, > }, > > with DAG( > default_args={ > "email_on_failure": True, > "email_on_retry": True > ): > op1 = FirstOperator( > "email" = ["firstopera...@company.com"] > } > > op2 = SecondOperator) > "email" = ["secondopera...@company.com"] > } > > > And this is only the beginning, I can imagine 100s of other ways you could > set up notification schemes - and there are so many ways possible that I > hardly imagine automation that could handle such a move. This would be > pretty much a nightmare for such deployment managers who would have to > track down and make sure that all such cases and usages will be converted > "properly" - and DAG authors won't care. They will just want them to "work" > after migration. > > I don't think we can do more than deprecation and possibly converting it to > the common Notifiers automatically under the hood - to be honest. > > On Tue, Jan 28, 2025 at 2:23 PM Michał Modras > <michalmod...@google.com.invalid> wrote: > >> I am concerned simply because it is a physical code change, and one that >> would affect a good percentage of DAGs out there. No matter how complex the >> change is, it forces the users to modify their code, which is huge >> operational overhead in bigger organizations. Imagine - there could be a >> central platform team responsible for migrating to Airflow 3. But then, >> before Airflow 3 deployments can be used, each team using particular >> Airflow deployment would need to modify the code. At the same time, the >> platform team might not be permitted to touch code of these teams - >> these are different personas. It can easily become a very complex migration >> procedure to expedite across the organization, even if the code change >> itself is simple. >> >> On Tue, Jan 28, 2025 at 3:05 PM Elad Kalif <elad...@apache.org> wrote: >> >>> I'd like to provide example: >>> >>> with DAG( >>> default_args={ >>> "email": ["airf...@airflow.com"], >>> "email_on_failure": True, >>> "email_on_retry": True, >>> }, >>> ... >>> ) as dag: >>> >>> becomes: >>> >>> from airflow.providers.smtp.notifications.smtp import SmtpNotifier >>> >>> with DAG( >>> default_args={ >>> on_failure_callback=SmtpNotifier(to="airf...@airflow.com"), >>> on_retry_callback=SmtpNotifier(to="airf...@airflow.com"), >>> }, >>> ... >>> ) as dag: >>> >>> >>> That is it. There are no "moving configurations". There are no judgment >>> calls. >>> I do not follow on why this is risky nor why it is expensive. >>> Is the issue you are worried about that this change can not be automated >> by >>> the migration tool we plan to have? >>> >>> Am I missing something? >>> >>> >>> On Tue, Jan 28, 2025 at 3:48 PM Michał Modras >>> <michalmod...@google.com.invalid> wrote: >>> >>>> +1 to what Jarek proposes. >>>> >>>> IMO, for Airflow to become not only a leading solution in its market, >> but >>>> also a de facto standard, we need to be super conscious about the >> effort >>>> required for users to maintain it and keep it up to date. It's a super >>>> common concern raised by the users, especially those, who don't only >> run >>>> Airflow on their laptops, but manage deployments of hundreds or >> thousands >>>> Airflow environments. >>>> >>>> If we lean on doing things that are right from a purely engineering >> point >>>> of view, but put cost on the users, we'll go in the opposite direction. >>> It >>>> might be seen as a sad reality from the perspective of engineering >>> purity, >>>> but *user* experience should come first, Airflow maintainer second. >>>> >>>> On Tue, Jan 28, 2025 at 12:01 AM Jarek Potiuk <ja...@potiuk.com> >> wrote: >>>> >>>>> I'd say - if it is a very used feature, and likely a lot of people >> use >>>> it, >>>>> the only way we can justify that change is if we can fully automate >> it >>>> with >>>>> near 100% certainty. This is one of the reasons we did not want to do >>>>> Templating change as breaking in 3.0 - because it was unlikely to be >>>> fully >>>>> automatically "patchable" (i.e. behave in the same way). >>>>> >>>>> I think we all agreed at the very beginning of Airflow 3 migration >> that >>>>> while it's perfectly fine to make a change that requires >> reconfiguring >>>>> airflow, any change that REQUIRES a big number of DAGs to be >> rewritten >>> is >>>>> no go. And at the very least we should only do that when there is a >>>> massive >>>>> gain for the users and us. I am not sure if we are breaking the DAGs, >>> the >>>>> "massive pain" is in this case justified by "massive gain". I simply >>>> don't >>>>> see the gain clearly. >>>>> >>>>> It's fine if the configuration has to be adapted by the admin person >>> for >>>>> everyone, but it's not fine if each user has to go and fix their >> DAGs. >>> In >>>>> many cases this will introduce months of delays with migration to >>>> Airflow >>>>> 3 for little reason - because the people who manage Airflow will >> have >>> to >>>>> get everyone who writes DAG (sometimes many people, some of them busy >>> or >>>>> unavailable at times) to apply and modify their changes, and >> migration >>> in >>>>> one "go" will be difficult. >>>>> >>>>> I'd be for option 2 but where email_on_failure would remain but will >>> be >>>>> deprecated and will be really the same as "notify_on_failure" - while >>> the >>>>> admin will have to configure default notifier in this case. Yes it >> will >>>>> make no sense if someone will set "email_on_failure" and notifier >> will >>> be >>>>> different from email but there will be a deprecation warning (and we >>> can >>>>> also raise it every time notification is done). I see no other >> problem >>>> with >>>>> that. >>>>> >>>>> J. >>>>> >>>>> On Mon, Jan 27, 2025 at 11:56 PM Hussein Awala <huss...@awala.fr> >>> wrote: >>>>> >>>>>>> But the cost is forcing all users of this popular feature to >> modify >>>>>> source code of their DAGs, putting another hurdle for Airflow 3 >>>> adoption. >>>>>> >>>>>> We're all Airflow users before being maintainers and contributors, >>> so I >>>>>> understand how difficult it can be to update code for a library >>>> upgrade. >>>>>> However, breaking changes are a normal part of major releases, and >>>> users >>>>>> should be prepared for them. We’re not asking users to update all >>> their >>>>>> code at once - any option we choose will be backported to 2.11, >>>> allowing >>>>>> users to clean up their code and resolve incompatibilities smoothly >>>>> before >>>>>> upgrading to Airflow 3.0. >>>>>> >>>>>> On Mon, Jan 27, 2025 at 11:36 PM Michał Modras >>>>>> <michalmod...@google.com.invalid> wrote: >>>>>> >>>>>>>> Actually, modifying DAGs code is the easiest part of the >> migration >>>>>>> >>>>>>> I strongly disagree. Modifying source code of DAGs might be easy >>> for >>>>>>> individual devs, but it is what blocks customers who have >>> significant >>>>>> size >>>>>>> deployments from perfoming upgrades and migrations. Even if the >>> code >>>>>> change >>>>>>> is seemingly simple and automated, it becomes a massive >> operational >>>>>>> overhead. >>>>>>> >>>>>>> To be honest, I don't think we're net positive on this cleanup - >>>> sure, >>>>>> the >>>>>>> code get cleaner, there are no potentially overlapping >> parameters, >>> we >>>>>>> separate email which is not core Airflow feature, etc. All great >>>> stuff. >>>>>> But >>>>>>> the cost is forcing all users of this popular feature to modify >>>> source >>>>>> code >>>>>>> of their DAGs, putting another hurdle for Airflow 3 adoption. >>>>>>> >>>>>>> pon., 27 sty 2025, 23:21 użytkownik Elad Kalif < >> elad...@apache.org >>>> >>>>>>> napisał: >>>>>>> >>>>>>>> I vote for option 1. complete removal. >>>>>>>> >>>>>>>> Email is just another integration. We should not favor it. >>>>>>>> >>>>>>>> I agree with Michal that it's massively used but I do not agree >>> on >>>>> the >>>>>>>> risk. >>>>>>>> >>>>>>>> From my point of view, the SmtpNotifier (Which has templates! >>>> thanks >>>>> to >>>>>>>> https://github.com/apache/airflow/pull/36226) has existed for >> a >>>> long >>>>>>> time. >>>>>>>> It provides almost all the needed functionality (except the >>>>>>>> default_email_on_failure) and the migration to it is very >> simple. >>>> We >>>>>> can >>>>>>>> catch all the cases with the upgrade tool we plan and alert >> users >>>>> where >>>>>>>> they need to make changes. >>>>>>>> >>>>>>>> Given that: >>>>>>>> a. We can prevent broken dags by catching all cases in >> migration >>>> tool >>>>>>>> b. Migration is easy. No thinking required. It's just a copy - >>>> paste >>>>>>>> replacement. >>>>>>>> c. There are no edge cases >>>>>>>> >>>>>>>> I think it's considered safe to continue with complete removal. >>>>>>>> Now, the question left is: Is it worth it? What value do we get >>>> from >>>>>>> doing >>>>>>>> that? >>>>>>>> I don't know about other people's experience but I am tired of >>>>>> explaining >>>>>>>> to users about the difference between on_failure_call_back and >>>>>>>> email_on_failure. What happens when someone use both, etc... >>>>>>>> It confuses so many people. We have a chance to get it simple. >>>> Let's >>>>> do >>>>>>>> that. >>>>>>>> >>>>>>>> However, My answer did not cover the default_email_on_failure. >>>>>>> Personally, >>>>>>>> I don't know anyone who actually uses it. >>>>>>>> In most Airflow envs I saw there is more than 1 person / 1 team >>> so >>>> it >>>>>>> makes >>>>>>>> little sense to have a global notification address. >>>>>>>> For those kind of deployments it's no issue because they are >> not >>>>> using >>>>>>>> it, for other, smaller deployments that do use it - I can argue >>>> that >>>>>> the >>>>>>>> change is very low impact as it's very easy to fix, but we can >>> also >>>>>>> modify >>>>>>>> the SMTP connection to hold default_to_email if we want to >>> mitigate >>>>>> that >>>>>>>> risk. >>>>>>>> >>>>>>>> On Tue, Jan 28, 2025 at 12:19 AM Hussein Awala < >> huss...@awala.fr >>>> >>>>>> wrote: >>>>>>>> >>>>>>>>>> avoiding forcing users to modify their DAGs code >>>>>>>>> >>>>>>>>> Actually, modifying DAGs code is the easiest part of the >>>> migration >>>>>>>>> for users, because we will provide migration rules via ruff. >>> For >>>>> more >>>>>>>>> details, please see >>>> https://github.com/apache/airflow/issues/41641 >>>>> . >>>>>>>>> >>>>>>>>> On Mon, Jan 27, 2025 at 11:01 PM Michał Modras >>>>>>>>> <michalmod...@google.com.invalid> wrote: >>>>>>>>> >>>>>>>>>> Email notifications are a *massively* used Airflow >> feature. I >>>>>> suggest >>>>>>>>> that >>>>>>>>>> any change does not break the current syntax of >> BaseOperator, >>>>>>> avoiding >>>>>>>>>> forcing users to modify their DAGs code. Changes on the >>> config >>>>> side >>>>>>> and >>>>>>>>>> provider separation should be fine though (seems separating >>>>>> provider >>>>>>>>> while >>>>>>>>>> keeping the parameters of BaseOperator would require some >>> safe >>>>>>>> defaults). >>>>>>>>>> >>>>>>>>>> On Mon, Jan 27, 2025 at 10:51 PM Hussein Awala < >>>> huss...@awala.fr >>>>>> >>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hello everyone! >>>>>>>>>>> >>>>>>>>>>> We had an implicit consensus about removing the >> email/SMTP >>>>>>>> integration >>>>>>>>>> from >>>>>>>>>>> Airflow Core and replacing it with the SMTP provider and >>>>>>>> SmtpNotifier. >>>>>>>>>> This >>>>>>>>>>> change will simplify Airflow's core, improve >>> maintainability, >>>>> and >>>>>>>>>> provide a >>>>>>>>>>> more consistent experience with other providers. Now is >> the >>>>> best >>>>>>>> moment >>>>>>>>>> to >>>>>>>>>>> do this before Airflow 3.0. >>>>>>>>>>> >>>>>>>>>>> All the Airflow configurations in the *smtp* section are >>>>>> supported >>>>>>> in >>>>>>>>> the >>>>>>>>>>> SMTP connection, so removing the whole section should >> have >>>>>> minimal >>>>>>>>> impact >>>>>>>>>>> on users. The migration effort is expected to be very >> low. >>>>>>>>>>> >>>>>>>>>>> This thread focuses on how best to handle the >>> configurations >>>>>>>> currently >>>>>>>>> in >>>>>>>>>>> the email section, mainly: >>>>>>>>>>> >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *default_email_on_failure* >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *default_email_on_retry* >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *email_backend* >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *email_conn_id* >>>>>>>>>>> >>>>>>>>>>> The other configurations could be replaced by connection >>>>> extras. >>>>>>>>>>> >>>>>>>>>>> Options for handling email configuration: >>>>>>>>>>> >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Option 1: Complete Removal >>>>>>>>>>> >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Remove the entire email section from the Airflow >>>>>>> configuration. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Remove the *email_on_failure* and *email_on_retry* >>>>>> arguments >>>>>>>> from >>>>>>>>>>> BaseOperator. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Implications: >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Users will need to explicitly define notifiers >>>> (using >>>>>>>>>>> on_*_callbacks) for any task that requires email >>>>>>>>> notifications. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> There will be no built-in mechanism for default >>>> email >>>>>>>>>>> notifications. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> This offers the most significant simplification >> of >>>>>>> Airflow's >>>>>>>>>> core >>>>>>>>>>> but may increase the effort required to >> configure >>>>> email >>>>>>>>> alerts. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Option 2: Replacement Configurations >>>>>>>>>>> >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Introduce new configuration options: >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *default_notification_on_failure*: (boolean) >>> Enables >>>>>>> default >>>>>>>>>>> notifications on task failure. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *default_notification_on_retry*: (boolean) >> Enables >>>>>> default >>>>>>>>>>> notifications on task retry. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> *default_notifier*: (string) Specifies a method >>> that >>>>>>>>> initializes >>>>>>>>>>> and returns the default notifier. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Replace *email_on_failure* and *email_on_retry* in >>>>>>> BaseOperator >>>>>>>>>> with >>>>>>>>>>> *notify_on_failure* and *notify_on_retry*, which >>> would >>>>>>> default >>>>>>>> to >>>>>>>>>> the >>>>>>>>>>> values set in the new configuration options. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Replace the old email handling behavior by simply >>>> adding >>>>>> the >>>>>>>>>> notifier >>>>>>>>>>> to the corresponding callback. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Implications: >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Preserve the concept of default notifications >>> while >>>>>> moving >>>>>>>>> away >>>>>>>>>>> from the mail-specific configuration. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Support other notification types and send >>>>> notifications >>>>>> by >>>>>>>>>>> default. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Requires introducing new configuration options >> and >>>>>>> modifying >>>>>>>>>>> BaseOperator. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Option 3: Dedicated Notification Connection >>>>>>>>>>> >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Create a new connection (of any type) with a >> specific >>>> and >>>>>>>>> reserved >>>>>>>>>>> name for notifications (e.g., >>>>> *notifications_connection*). >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> This connection would have extras to configure the >>>>> notifier >>>>>>> and >>>>>>>>>>> specify whether to send notifications by default. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Implications: >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Provides a centralized and extensible way to >>> manage >>>>> all >>>>>>>>>>> notification settings without Airflow >>> configuration. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Allows users to easily switch between different >>>>>>> notification >>>>>>>>>>> methods and send notifications by default. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> May require more initial setup for users but >>> offers >>>>>>> greater >>>>>>>>>>> flexibility in the long term. >>>>>>>>>>> - >>>>>>>>>>> >>>>>>>>>>> Require more dev on our side. >>>>>>>>>>> >>>>>>>>>>> Regarding migration, a fully automated migration might >> not >>> be >>>>>>>> possible >>>>>>>>> in >>>>>>>>>>> all cases because of the need to migrate Airflow >>>>> configurations, >>>>>> we >>>>>>>>> will >>>>>>>>>>> provide comprehensive documentation and examples to >> assist >>>>> users >>>>>> in >>>>>>>>>>> transitioning their email configurations. >>>>>>>>>>> >>>>>>>>>>> I'm interested in hearing your thoughts and any other >>>>> suggestions >>>>>>> you >>>>>>>>> may >>>>>>>>>>> have. >>>>>>>>>>> >>>>>>>>>>> P.S: I have created an almost ready-to-merge PR for >> option >>> 1 >>>> ( >>>>>>>>>>> https://github.com/apache/airflow/pull/46041) >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org For additional commands, e-mail: dev-h...@airflow.apache.org