OK so this is the first case (I think?) of item that we want to deprecate in Airflow 3 rather than remove. I raised https://github.com/apache/airflow/pull/46594 to introduce RemovedInAirflow4Warning so we can use it to indicate the parameters are deprecated but still supported.
On Fri, Feb 7, 2025 at 11:20 AM Ash Berlin-Taylor <a...@apache.org> wrote: > +100 to what Jarek and Michal said. Changing DAG code for this will > seriously impact Airflow 3 adoption. > > We can move the code to a provider without having to change the DAG Author > interface can’t we? For instance we can change the DAG parser to convert > these options in to an SMTP notifier can’t we? > > -ash > > > On 28 Jan 2025, at 16:50, Jarek Potiuk <ja...@potiuk.com> wrote: > > > > I agree with Michał - it's not how simple it is, but that you do it at > all. > > Chance is that if you have 1000s of DAGs in your company and some of them > > use email notification, it means that vast majority of them will. And I > > already know of companies where there are 20+ teams working on those and > > having tens of changes in all of those dags happening a day, and wha we > > basically ask the "deployment managers" of those teams to make a > migration > > where all of those teams will have to - at the day of migration modify > > their DAGs to account for this change. > > > > And in many places it's difficult to keep consistency - some people will > > use one way and some will use other ways. > > > > Great that we see some code examples and what would happen, because it > > opens up my imagination and now I know that what I felt intuitively, is > > pretty much justified. > > I can imagine variations of the code, ALL at the same time implemented > in a > > single Airflow installation: > > > > COMMON CODE: > > common.get_emails_to_send_from(group): > > if group == "group1": > > return ["a@b.c"} > > if group. == "group2" > > return ["a@c.d"] > > > > with DAG( > > default_args={ > > "email": get_emails_to_send_from("my_group"), > > "email_on_failure": True, > > "email_on_retry": True, > > }, > > > > with DAG( > > default_args={ > > "email": [Variable.get("email_for_my_customer_x_dags")], > > "email_on_failure": True, > > "email_on_retry": True, > > }, > > > > team = os.environ.get("TEAM_MANAGING_THE_MACHINE") > > > > with DAG( > > default_args={ > > "email": [f"{team}@mycompany_com"], > > "email_on_failure": True, > > "email_on_retry": True, > > }, > > > > with DAG( > > default_args={ > > "email_on_failure": True, > > "email_on_retry": True > > ): > > op1 = FirstOperator( > > "email" = ["firstopera...@company.com"] > > } > > > > op2 = SecondOperator) > > "email" = ["secondopera...@company.com"] > > } > > > > > > And this is only the beginning, I can imagine 100s of other ways you > could > > set up notification schemes - and there are so many ways possible that I > > hardly imagine automation that could handle such a move. This would be > > pretty much a nightmare for such deployment managers who would have to > > track down and make sure that all such cases and usages will be converted > > "properly" - and DAG authors won't care. They will just want them to > "work" > > after migration. > > > > I don't think we can do more than deprecation and possibly converting it > to > > the common Notifiers automatically under the hood - to be honest. > > > > On Tue, Jan 28, 2025 at 2:23 PM Michał Modras > > <michalmod...@google.com.invalid> wrote: > > > >> I am concerned simply because it is a physical code change, and one that > >> would affect a good percentage of DAGs out there. No matter how complex > the > >> change is, it forces the users to modify their code, which is huge > >> operational overhead in bigger organizations. Imagine - there could be a > >> central platform team responsible for migrating to Airflow 3. But then, > >> before Airflow 3 deployments can be used, each team using particular > >> Airflow deployment would need to modify the code. At the same time, the > >> platform team might not be permitted to touch code of these teams - > >> these are different personas. It can easily become a very complex > migration > >> procedure to expedite across the organization, even if the code change > >> itself is simple. > >> > >> On Tue, Jan 28, 2025 at 3:05 PM Elad Kalif <elad...@apache.org> wrote: > >> > >>> I'd like to provide example: > >>> > >>> with DAG( > >>> default_args={ > >>> "email": ["airf...@airflow.com"], > >>> "email_on_failure": True, > >>> "email_on_retry": True, > >>> }, > >>> ... > >>> ) as dag: > >>> > >>> becomes: > >>> > >>> from airflow.providers.smtp.notifications.smtp import SmtpNotifier > >>> > >>> with DAG( > >>> default_args={ > >>> on_failure_callback=SmtpNotifier(to="airf...@airflow.com"), > >>> on_retry_callback=SmtpNotifier(to="airf...@airflow.com"), > >>> }, > >>> ... > >>> ) as dag: > >>> > >>> > >>> That is it. There are no "moving configurations". There are no judgment > >>> calls. > >>> I do not follow on why this is risky nor why it is expensive. > >>> Is the issue you are worried about that this change can not be > automated > >> by > >>> the migration tool we plan to have? > >>> > >>> Am I missing something? > >>> > >>> > >>> On Tue, Jan 28, 2025 at 3:48 PM Michał Modras > >>> <michalmod...@google.com.invalid> wrote: > >>> > >>>> +1 to what Jarek proposes. > >>>> > >>>> IMO, for Airflow to become not only a leading solution in its market, > >> but > >>>> also a de facto standard, we need to be super conscious about the > >> effort > >>>> required for users to maintain it and keep it up to date. It's a super > >>>> common concern raised by the users, especially those, who don't only > >> run > >>>> Airflow on their laptops, but manage deployments of hundreds or > >> thousands > >>>> Airflow environments. > >>>> > >>>> If we lean on doing things that are right from a purely engineering > >> point > >>>> of view, but put cost on the users, we'll go in the opposite > direction. > >>> It > >>>> might be seen as a sad reality from the perspective of engineering > >>> purity, > >>>> but *user* experience should come first, Airflow maintainer second. > >>>> > >>>> On Tue, Jan 28, 2025 at 12:01 AM Jarek Potiuk <ja...@potiuk.com> > >> wrote: > >>>> > >>>>> I'd say - if it is a very used feature, and likely a lot of people > >> use > >>>> it, > >>>>> the only way we can justify that change is if we can fully automate > >> it > >>>> with > >>>>> near 100% certainty. This is one of the reasons we did not want to do > >>>>> Templating change as breaking in 3.0 - because it was unlikely to be > >>>> fully > >>>>> automatically "patchable" (i.e. behave in the same way). > >>>>> > >>>>> I think we all agreed at the very beginning of Airflow 3 migration > >> that > >>>>> while it's perfectly fine to make a change that requires > >> reconfiguring > >>>>> airflow, any change that REQUIRES a big number of DAGs to be > >> rewritten > >>> is > >>>>> no go. And at the very least we should only do that when there is a > >>>> massive > >>>>> gain for the users and us. I am not sure if we are breaking the DAGs, > >>> the > >>>>> "massive pain" is in this case justified by "massive gain". I simply > >>>> don't > >>>>> see the gain clearly. > >>>>> > >>>>> It's fine if the configuration has to be adapted by the admin person > >>> for > >>>>> everyone, but it's not fine if each user has to go and fix their > >> DAGs. > >>> In > >>>>> many cases this will introduce months of delays with migration to > >>>> Airflow > >>>>> 3 for little reason - because the people who manage Airflow will > >> have > >>> to > >>>>> get everyone who writes DAG (sometimes many people, some of them busy > >>> or > >>>>> unavailable at times) to apply and modify their changes, and > >> migration > >>> in > >>>>> one "go" will be difficult. > >>>>> > >>>>> I'd be for option 2 but where email_on_failure would remain but will > >>> be > >>>>> deprecated and will be really the same as "notify_on_failure" - while > >>> the > >>>>> admin will have to configure default notifier in this case. Yes it > >> will > >>>>> make no sense if someone will set "email_on_failure" and notifier > >> will > >>> be > >>>>> different from email but there will be a deprecation warning (and we > >>> can > >>>>> also raise it every time notification is done). I see no other > >> problem > >>>> with > >>>>> that. > >>>>> > >>>>> J. > >>>>> > >>>>> On Mon, Jan 27, 2025 at 11:56 PM Hussein Awala <huss...@awala.fr> > >>> wrote: > >>>>> > >>>>>>> But the cost is forcing all users of this popular feature to > >> modify > >>>>>> source code of their DAGs, putting another hurdle for Airflow 3 > >>>> adoption. > >>>>>> > >>>>>> We're all Airflow users before being maintainers and contributors, > >>> so I > >>>>>> understand how difficult it can be to update code for a library > >>>> upgrade. > >>>>>> However, breaking changes are a normal part of major releases, and > >>>> users > >>>>>> should be prepared for them. We’re not asking users to update all > >>> their > >>>>>> code at once - any option we choose will be backported to 2.11, > >>>> allowing > >>>>>> users to clean up their code and resolve incompatibilities smoothly > >>>>> before > >>>>>> upgrading to Airflow 3.0. > >>>>>> > >>>>>> On Mon, Jan 27, 2025 at 11:36 PM Michał Modras > >>>>>> <michalmod...@google.com.invalid> wrote: > >>>>>> > >>>>>>>> Actually, modifying DAGs code is the easiest part of the > >> migration > >>>>>>> > >>>>>>> I strongly disagree. Modifying source code of DAGs might be easy > >>> for > >>>>>>> individual devs, but it is what blocks customers who have > >>> significant > >>>>>> size > >>>>>>> deployments from perfoming upgrades and migrations. Even if the > >>> code > >>>>>> change > >>>>>>> is seemingly simple and automated, it becomes a massive > >> operational > >>>>>>> overhead. > >>>>>>> > >>>>>>> To be honest, I don't think we're net positive on this cleanup - > >>>> sure, > >>>>>> the > >>>>>>> code get cleaner, there are no potentially overlapping > >> parameters, > >>> we > >>>>>>> separate email which is not core Airflow feature, etc. All great > >>>> stuff. > >>>>>> But > >>>>>>> the cost is forcing all users of this popular feature to modify > >>>> source > >>>>>> code > >>>>>>> of their DAGs, putting another hurdle for Airflow 3 adoption. > >>>>>>> > >>>>>>> pon., 27 sty 2025, 23:21 użytkownik Elad Kalif < > >> elad...@apache.org > >>>> > >>>>>>> napisał: > >>>>>>> > >>>>>>>> I vote for option 1. complete removal. > >>>>>>>> > >>>>>>>> Email is just another integration. We should not favor it. > >>>>>>>> > >>>>>>>> I agree with Michal that it's massively used but I do not agree > >>> on > >>>>> the > >>>>>>>> risk. > >>>>>>>> > >>>>>>>> From my point of view, the SmtpNotifier (Which has templates! > >>>> thanks > >>>>> to > >>>>>>>> https://github.com/apache/airflow/pull/36226) has existed for > >> a > >>>> long > >>>>>>> time. > >>>>>>>> It provides almost all the needed functionality (except the > >>>>>>>> default_email_on_failure) and the migration to it is very > >> simple. > >>>> We > >>>>>> can > >>>>>>>> catch all the cases with the upgrade tool we plan and alert > >> users > >>>>> where > >>>>>>>> they need to make changes. > >>>>>>>> > >>>>>>>> Given that: > >>>>>>>> a. We can prevent broken dags by catching all cases in > >> migration > >>>> tool > >>>>>>>> b. Migration is easy. No thinking required. It's just a copy - > >>>> paste > >>>>>>>> replacement. > >>>>>>>> c. There are no edge cases > >>>>>>>> > >>>>>>>> I think it's considered safe to continue with complete removal. > >>>>>>>> Now, the question left is: Is it worth it? What value do we get > >>>> from > >>>>>>> doing > >>>>>>>> that? > >>>>>>>> I don't know about other people's experience but I am tired of > >>>>>> explaining > >>>>>>>> to users about the difference between on_failure_call_back and > >>>>>>>> email_on_failure. What happens when someone use both, etc... > >>>>>>>> It confuses so many people. We have a chance to get it simple. > >>>> Let's > >>>>> do > >>>>>>>> that. > >>>>>>>> > >>>>>>>> However, My answer did not cover the default_email_on_failure. > >>>>>>> Personally, > >>>>>>>> I don't know anyone who actually uses it. > >>>>>>>> In most Airflow envs I saw there is more than 1 person / 1 team > >>> so > >>>> it > >>>>>>> makes > >>>>>>>> little sense to have a global notification address. > >>>>>>>> For those kind of deployments it's no issue because they are > >> not > >>>>> using > >>>>>>>> it, for other, smaller deployments that do use it - I can argue > >>>> that > >>>>>> the > >>>>>>>> change is very low impact as it's very easy to fix, but we can > >>> also > >>>>>>> modify > >>>>>>>> the SMTP connection to hold default_to_email if we want to > >>> mitigate > >>>>>> that > >>>>>>>> risk. > >>>>>>>> > >>>>>>>> On Tue, Jan 28, 2025 at 12:19 AM Hussein Awala < > >> huss...@awala.fr > >>>> > >>>>>> wrote: > >>>>>>>> > >>>>>>>>>> avoiding forcing users to modify their DAGs code > >>>>>>>>> > >>>>>>>>> Actually, modifying DAGs code is the easiest part of the > >>>> migration > >>>>>>>>> for users, because we will provide migration rules via ruff. > >>> For > >>>>> more > >>>>>>>>> details, please see > >>>> https://github.com/apache/airflow/issues/41641 > >>>>> . > >>>>>>>>> > >>>>>>>>> On Mon, Jan 27, 2025 at 11:01 PM Michał Modras > >>>>>>>>> <michalmod...@google.com.invalid> wrote: > >>>>>>>>> > >>>>>>>>>> Email notifications are a *massively* used Airflow > >> feature. I > >>>>>> suggest > >>>>>>>>> that > >>>>>>>>>> any change does not break the current syntax of > >> BaseOperator, > >>>>>>> avoiding > >>>>>>>>>> forcing users to modify their DAGs code. Changes on the > >>> config > >>>>> side > >>>>>>> and > >>>>>>>>>> provider separation should be fine though (seems separating > >>>>>> provider > >>>>>>>>> while > >>>>>>>>>> keeping the parameters of BaseOperator would require some > >>> safe > >>>>>>>> defaults). > >>>>>>>>>> > >>>>>>>>>> On Mon, Jan 27, 2025 at 10:51 PM Hussein Awala < > >>>> huss...@awala.fr > >>>>>> > >>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hello everyone! > >>>>>>>>>>> > >>>>>>>>>>> We had an implicit consensus about removing the > >> email/SMTP > >>>>>>>> integration > >>>>>>>>>> from > >>>>>>>>>>> Airflow Core and replacing it with the SMTP provider and > >>>>>>>> SmtpNotifier. > >>>>>>>>>> This > >>>>>>>>>>> change will simplify Airflow's core, improve > >>> maintainability, > >>>>> and > >>>>>>>>>> provide a > >>>>>>>>>>> more consistent experience with other providers. Now is > >> the > >>>>> best > >>>>>>>> moment > >>>>>>>>>> to > >>>>>>>>>>> do this before Airflow 3.0. > >>>>>>>>>>> > >>>>>>>>>>> All the Airflow configurations in the *smtp* section are > >>>>>> supported > >>>>>>> in > >>>>>>>>> the > >>>>>>>>>>> SMTP connection, so removing the whole section should > >> have > >>>>>> minimal > >>>>>>>>> impact > >>>>>>>>>>> on users. The migration effort is expected to be very > >> low. > >>>>>>>>>>> > >>>>>>>>>>> This thread focuses on how best to handle the > >>> configurations > >>>>>>>> currently > >>>>>>>>> in > >>>>>>>>>>> the email section, mainly: > >>>>>>>>>>> > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *default_email_on_failure* > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *default_email_on_retry* > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *email_backend* > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *email_conn_id* > >>>>>>>>>>> > >>>>>>>>>>> The other configurations could be replaced by connection > >>>>> extras. > >>>>>>>>>>> > >>>>>>>>>>> Options for handling email configuration: > >>>>>>>>>>> > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Option 1: Complete Removal > >>>>>>>>>>> > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Remove the entire email section from the Airflow > >>>>>>> configuration. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Remove the *email_on_failure* and *email_on_retry* > >>>>>> arguments > >>>>>>>> from > >>>>>>>>>>> BaseOperator. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Implications: > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Users will need to explicitly define notifiers > >>>> (using > >>>>>>>>>>> on_*_callbacks) for any task that requires email > >>>>>>>>> notifications. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> There will be no built-in mechanism for default > >>>> email > >>>>>>>>>>> notifications. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> This offers the most significant simplification > >> of > >>>>>>> Airflow's > >>>>>>>>>> core > >>>>>>>>>>> but may increase the effort required to > >> configure > >>>>> email > >>>>>>>>> alerts. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Option 2: Replacement Configurations > >>>>>>>>>>> > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Introduce new configuration options: > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *default_notification_on_failure*: (boolean) > >>> Enables > >>>>>>> default > >>>>>>>>>>> notifications on task failure. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *default_notification_on_retry*: (boolean) > >> Enables > >>>>>> default > >>>>>>>>>>> notifications on task retry. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> *default_notifier*: (string) Specifies a method > >>> that > >>>>>>>>> initializes > >>>>>>>>>>> and returns the default notifier. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Replace *email_on_failure* and *email_on_retry* in > >>>>>>> BaseOperator > >>>>>>>>>> with > >>>>>>>>>>> *notify_on_failure* and *notify_on_retry*, which > >>> would > >>>>>>> default > >>>>>>>> to > >>>>>>>>>> the > >>>>>>>>>>> values set in the new configuration options. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Replace the old email handling behavior by simply > >>>> adding > >>>>>> the > >>>>>>>>>> notifier > >>>>>>>>>>> to the corresponding callback. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Implications: > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Preserve the concept of default notifications > >>> while > >>>>>> moving > >>>>>>>>> away > >>>>>>>>>>> from the mail-specific configuration. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Support other notification types and send > >>>>> notifications > >>>>>> by > >>>>>>>>>>> default. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Requires introducing new configuration options > >> and > >>>>>>> modifying > >>>>>>>>>>> BaseOperator. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Option 3: Dedicated Notification Connection > >>>>>>>>>>> > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Create a new connection (of any type) with a > >> specific > >>>> and > >>>>>>>>> reserved > >>>>>>>>>>> name for notifications (e.g., > >>>>> *notifications_connection*). > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> This connection would have extras to configure the > >>>>> notifier > >>>>>>> and > >>>>>>>>>>> specify whether to send notifications by default. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Implications: > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Provides a centralized and extensible way to > >>> manage > >>>>> all > >>>>>>>>>>> notification settings without Airflow > >>> configuration. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Allows users to easily switch between different > >>>>>>> notification > >>>>>>>>>>> methods and send notifications by default. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> May require more initial setup for users but > >>> offers > >>>>>>> greater > >>>>>>>>>>> flexibility in the long term. > >>>>>>>>>>> - > >>>>>>>>>>> > >>>>>>>>>>> Require more dev on our side. > >>>>>>>>>>> > >>>>>>>>>>> Regarding migration, a fully automated migration might > >> not > >>> be > >>>>>>>> possible > >>>>>>>>> in > >>>>>>>>>>> all cases because of the need to migrate Airflow > >>>>> configurations, > >>>>>> we > >>>>>>>>> will > >>>>>>>>>>> provide comprehensive documentation and examples to > >> assist > >>>>> users > >>>>>> in > >>>>>>>>>>> transitioning their email configurations. > >>>>>>>>>>> > >>>>>>>>>>> I'm interested in hearing your thoughts and any other > >>>>> suggestions > >>>>>>> you > >>>>>>>>> may > >>>>>>>>>>> have. > >>>>>>>>>>> > >>>>>>>>>>> P.S: I have created an almost ready-to-merge PR for > >> option > >>> 1 > >>>> ( > >>>>>>>>>>> https://github.com/apache/airflow/pull/46041) > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>> > >>> > >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org > For additional commands, e-mail: dev-h...@airflow.apache.org > >