PR https://github.com/apache/airflow/pull/47146 to depreciate the parameters in Airflow 3 I believe this closes the discussion on that topic for now
On Sun, Feb 9, 2025 at 8:34 AM Elad Kalif <elad...@apache.org> wrote: > OK so this is the first case (I think?) of item that we want to deprecate > in Airflow 3 rather than remove. > I raised https://github.com/apache/airflow/pull/46594 to > introduce RemovedInAirflow4Warning so we can use it to indicate the > parameters are deprecated but still supported. > > On Fri, Feb 7, 2025 at 11:20 AM Ash Berlin-Taylor <a...@apache.org> wrote: > >> +100 to what Jarek and Michal said. Changing DAG code for this will >> seriously impact Airflow 3 adoption. >> >> We can move the code to a provider without having to change the DAG >> Author interface can’t we? For instance we can change the DAG parser to >> convert these options in to an SMTP notifier can’t we? >> >> -ash >> >> > On 28 Jan 2025, at 16:50, Jarek Potiuk <ja...@potiuk.com> wrote: >> > >> > I agree with Michał - it's not how simple it is, but that you do it at >> all. >> > Chance is that if you have 1000s of DAGs in your company and some of >> them >> > use email notification, it means that vast majority of them will. And I >> > already know of companies where there are 20+ teams working on those and >> > having tens of changes in all of those dags happening a day, and wha we >> > basically ask the "deployment managers" of those teams to make a >> migration >> > where all of those teams will have to - at the day of migration modify >> > their DAGs to account for this change. >> > >> > And in many places it's difficult to keep consistency - some people will >> > use one way and some will use other ways. >> > >> > Great that we see some code examples and what would happen, because it >> > opens up my imagination and now I know that what I felt intuitively, is >> > pretty much justified. >> > I can imagine variations of the code, ALL at the same time implemented >> in a >> > single Airflow installation: >> > >> > COMMON CODE: >> > common.get_emails_to_send_from(group): >> > if group == "group1": >> > return ["a@b.c"} >> > if group. == "group2" >> > return ["a@c.d"] >> > >> > with DAG( >> > default_args={ >> > "email": get_emails_to_send_from("my_group"), >> > "email_on_failure": True, >> > "email_on_retry": True, >> > }, >> > >> > with DAG( >> > default_args={ >> > "email": [Variable.get("email_for_my_customer_x_dags")], >> > "email_on_failure": True, >> > "email_on_retry": True, >> > }, >> > >> > team = os.environ.get("TEAM_MANAGING_THE_MACHINE") >> > >> > with DAG( >> > default_args={ >> > "email": [f"{team}@mycompany_com"], >> > "email_on_failure": True, >> > "email_on_retry": True, >> > }, >> > >> > with DAG( >> > default_args={ >> > "email_on_failure": True, >> > "email_on_retry": True >> > ): >> > op1 = FirstOperator( >> > "email" = ["firstopera...@company.com"] >> > } >> > >> > op2 = SecondOperator) >> > "email" = ["secondopera...@company.com"] >> > } >> > >> > >> > And this is only the beginning, I can imagine 100s of other ways you >> could >> > set up notification schemes - and there are so many ways possible that I >> > hardly imagine automation that could handle such a move. This would be >> > pretty much a nightmare for such deployment managers who would have to >> > track down and make sure that all such cases and usages will be >> converted >> > "properly" - and DAG authors won't care. They will just want them to >> "work" >> > after migration. >> > >> > I don't think we can do more than deprecation and possibly converting >> it to >> > the common Notifiers automatically under the hood - to be honest. >> > >> > On Tue, Jan 28, 2025 at 2:23 PM Michał Modras >> > <michalmod...@google.com.invalid> wrote: >> > >> >> I am concerned simply because it is a physical code change, and one >> that >> >> would affect a good percentage of DAGs out there. No matter how >> complex the >> >> change is, it forces the users to modify their code, which is huge >> >> operational overhead in bigger organizations. Imagine - there could be >> a >> >> central platform team responsible for migrating to Airflow 3. But then, >> >> before Airflow 3 deployments can be used, each team using particular >> >> Airflow deployment would need to modify the code. At the same time, the >> >> platform team might not be permitted to touch code of these teams - >> >> these are different personas. It can easily become a very complex >> migration >> >> procedure to expedite across the organization, even if the code change >> >> itself is simple. >> >> >> >> On Tue, Jan 28, 2025 at 3:05 PM Elad Kalif <elad...@apache.org> wrote: >> >> >> >>> I'd like to provide example: >> >>> >> >>> with DAG( >> >>> default_args={ >> >>> "email": ["airf...@airflow.com"], >> >>> "email_on_failure": True, >> >>> "email_on_retry": True, >> >>> }, >> >>> ... >> >>> ) as dag: >> >>> >> >>> becomes: >> >>> >> >>> from airflow.providers.smtp.notifications.smtp import SmtpNotifier >> >>> >> >>> with DAG( >> >>> default_args={ >> >>> on_failure_callback=SmtpNotifier(to="airf...@airflow.com"), >> >>> on_retry_callback=SmtpNotifier(to="airf...@airflow.com"), >> >>> }, >> >>> ... >> >>> ) as dag: >> >>> >> >>> >> >>> That is it. There are no "moving configurations". There are no >> judgment >> >>> calls. >> >>> I do not follow on why this is risky nor why it is expensive. >> >>> Is the issue you are worried about that this change can not be >> automated >> >> by >> >>> the migration tool we plan to have? >> >>> >> >>> Am I missing something? >> >>> >> >>> >> >>> On Tue, Jan 28, 2025 at 3:48 PM Michał Modras >> >>> <michalmod...@google.com.invalid> wrote: >> >>> >> >>>> +1 to what Jarek proposes. >> >>>> >> >>>> IMO, for Airflow to become not only a leading solution in its market, >> >> but >> >>>> also a de facto standard, we need to be super conscious about the >> >> effort >> >>>> required for users to maintain it and keep it up to date. It's a >> super >> >>>> common concern raised by the users, especially those, who don't only >> >> run >> >>>> Airflow on their laptops, but manage deployments of hundreds or >> >> thousands >> >>>> Airflow environments. >> >>>> >> >>>> If we lean on doing things that are right from a purely engineering >> >> point >> >>>> of view, but put cost on the users, we'll go in the opposite >> direction. >> >>> It >> >>>> might be seen as a sad reality from the perspective of engineering >> >>> purity, >> >>>> but *user* experience should come first, Airflow maintainer second. >> >>>> >> >>>> On Tue, Jan 28, 2025 at 12:01 AM Jarek Potiuk <ja...@potiuk.com> >> >> wrote: >> >>>> >> >>>>> I'd say - if it is a very used feature, and likely a lot of people >> >> use >> >>>> it, >> >>>>> the only way we can justify that change is if we can fully automate >> >> it >> >>>> with >> >>>>> near 100% certainty. This is one of the reasons we did not want to >> do >> >>>>> Templating change as breaking in 3.0 - because it was unlikely to >> be >> >>>> fully >> >>>>> automatically "patchable" (i.e. behave in the same way). >> >>>>> >> >>>>> I think we all agreed at the very beginning of Airflow 3 migration >> >> that >> >>>>> while it's perfectly fine to make a change that requires >> >> reconfiguring >> >>>>> airflow, any change that REQUIRES a big number of DAGs to be >> >> rewritten >> >>> is >> >>>>> no go. And at the very least we should only do that when there is a >> >>>> massive >> >>>>> gain for the users and us. I am not sure if we are breaking the >> DAGs, >> >>> the >> >>>>> "massive pain" is in this case justified by "massive gain". I simply >> >>>> don't >> >>>>> see the gain clearly. >> >>>>> >> >>>>> It's fine if the configuration has to be adapted by the admin person >> >>> for >> >>>>> everyone, but it's not fine if each user has to go and fix their >> >> DAGs. >> >>> In >> >>>>> many cases this will introduce months of delays with migration to >> >>>> Airflow >> >>>>> 3 for little reason - because the people who manage Airflow will >> >> have >> >>> to >> >>>>> get everyone who writes DAG (sometimes many people, some of them >> busy >> >>> or >> >>>>> unavailable at times) to apply and modify their changes, and >> >> migration >> >>> in >> >>>>> one "go" will be difficult. >> >>>>> >> >>>>> I'd be for option 2 but where email_on_failure would remain but >> will >> >>> be >> >>>>> deprecated and will be really the same as "notify_on_failure" - >> while >> >>> the >> >>>>> admin will have to configure default notifier in this case. Yes it >> >> will >> >>>>> make no sense if someone will set "email_on_failure" and notifier >> >> will >> >>> be >> >>>>> different from email but there will be a deprecation warning (and we >> >>> can >> >>>>> also raise it every time notification is done). I see no other >> >> problem >> >>>> with >> >>>>> that. >> >>>>> >> >>>>> J. >> >>>>> >> >>>>> On Mon, Jan 27, 2025 at 11:56 PM Hussein Awala <huss...@awala.fr> >> >>> wrote: >> >>>>> >> >>>>>>> But the cost is forcing all users of this popular feature to >> >> modify >> >>>>>> source code of their DAGs, putting another hurdle for Airflow 3 >> >>>> adoption. >> >>>>>> >> >>>>>> We're all Airflow users before being maintainers and contributors, >> >>> so I >> >>>>>> understand how difficult it can be to update code for a library >> >>>> upgrade. >> >>>>>> However, breaking changes are a normal part of major releases, and >> >>>> users >> >>>>>> should be prepared for them. We’re not asking users to update all >> >>> their >> >>>>>> code at once - any option we choose will be backported to 2.11, >> >>>> allowing >> >>>>>> users to clean up their code and resolve incompatibilities smoothly >> >>>>> before >> >>>>>> upgrading to Airflow 3.0. >> >>>>>> >> >>>>>> On Mon, Jan 27, 2025 at 11:36 PM Michał Modras >> >>>>>> <michalmod...@google.com.invalid> wrote: >> >>>>>> >> >>>>>>>> Actually, modifying DAGs code is the easiest part of the >> >> migration >> >>>>>>> >> >>>>>>> I strongly disagree. Modifying source code of DAGs might be easy >> >>> for >> >>>>>>> individual devs, but it is what blocks customers who have >> >>> significant >> >>>>>> size >> >>>>>>> deployments from perfoming upgrades and migrations. Even if the >> >>> code >> >>>>>> change >> >>>>>>> is seemingly simple and automated, it becomes a massive >> >> operational >> >>>>>>> overhead. >> >>>>>>> >> >>>>>>> To be honest, I don't think we're net positive on this cleanup - >> >>>> sure, >> >>>>>> the >> >>>>>>> code get cleaner, there are no potentially overlapping >> >> parameters, >> >>> we >> >>>>>>> separate email which is not core Airflow feature, etc. All great >> >>>> stuff. >> >>>>>> But >> >>>>>>> the cost is forcing all users of this popular feature to modify >> >>>> source >> >>>>>> code >> >>>>>>> of their DAGs, putting another hurdle for Airflow 3 adoption. >> >>>>>>> >> >>>>>>> pon., 27 sty 2025, 23:21 użytkownik Elad Kalif < >> >> elad...@apache.org >> >>>> >> >>>>>>> napisał: >> >>>>>>> >> >>>>>>>> I vote for option 1. complete removal. >> >>>>>>>> >> >>>>>>>> Email is just another integration. We should not favor it. >> >>>>>>>> >> >>>>>>>> I agree with Michal that it's massively used but I do not agree >> >>> on >> >>>>> the >> >>>>>>>> risk. >> >>>>>>>> >> >>>>>>>> From my point of view, the SmtpNotifier (Which has templates! >> >>>> thanks >> >>>>> to >> >>>>>>>> https://github.com/apache/airflow/pull/36226) has existed for >> >> a >> >>>> long >> >>>>>>> time. >> >>>>>>>> It provides almost all the needed functionality (except the >> >>>>>>>> default_email_on_failure) and the migration to it is very >> >> simple. >> >>>> We >> >>>>>> can >> >>>>>>>> catch all the cases with the upgrade tool we plan and alert >> >> users >> >>>>> where >> >>>>>>>> they need to make changes. >> >>>>>>>> >> >>>>>>>> Given that: >> >>>>>>>> a. We can prevent broken dags by catching all cases in >> >> migration >> >>>> tool >> >>>>>>>> b. Migration is easy. No thinking required. It's just a copy - >> >>>> paste >> >>>>>>>> replacement. >> >>>>>>>> c. There are no edge cases >> >>>>>>>> >> >>>>>>>> I think it's considered safe to continue with complete removal. >> >>>>>>>> Now, the question left is: Is it worth it? What value do we get >> >>>> from >> >>>>>>> doing >> >>>>>>>> that? >> >>>>>>>> I don't know about other people's experience but I am tired of >> >>>>>> explaining >> >>>>>>>> to users about the difference between on_failure_call_back and >> >>>>>>>> email_on_failure. What happens when someone use both, etc... >> >>>>>>>> It confuses so many people. We have a chance to get it simple. >> >>>> Let's >> >>>>> do >> >>>>>>>> that. >> >>>>>>>> >> >>>>>>>> However, My answer did not cover the default_email_on_failure. >> >>>>>>> Personally, >> >>>>>>>> I don't know anyone who actually uses it. >> >>>>>>>> In most Airflow envs I saw there is more than 1 person / 1 team >> >>> so >> >>>> it >> >>>>>>> makes >> >>>>>>>> little sense to have a global notification address. >> >>>>>>>> For those kind of deployments it's no issue because they are >> >> not >> >>>>> using >> >>>>>>>> it, for other, smaller deployments that do use it - I can argue >> >>>> that >> >>>>>> the >> >>>>>>>> change is very low impact as it's very easy to fix, but we can >> >>> also >> >>>>>>> modify >> >>>>>>>> the SMTP connection to hold default_to_email if we want to >> >>> mitigate >> >>>>>> that >> >>>>>>>> risk. >> >>>>>>>> >> >>>>>>>> On Tue, Jan 28, 2025 at 12:19 AM Hussein Awala < >> >> huss...@awala.fr >> >>>> >> >>>>>> wrote: >> >>>>>>>> >> >>>>>>>>>> avoiding forcing users to modify their DAGs code >> >>>>>>>>> >> >>>>>>>>> Actually, modifying DAGs code is the easiest part of the >> >>>> migration >> >>>>>>>>> for users, because we will provide migration rules via ruff. >> >>> For >> >>>>> more >> >>>>>>>>> details, please see >> >>>> https://github.com/apache/airflow/issues/41641 >> >>>>> . >> >>>>>>>>> >> >>>>>>>>> On Mon, Jan 27, 2025 at 11:01 PM Michał Modras >> >>>>>>>>> <michalmod...@google.com.invalid> wrote: >> >>>>>>>>> >> >>>>>>>>>> Email notifications are a *massively* used Airflow >> >> feature. I >> >>>>>> suggest >> >>>>>>>>> that >> >>>>>>>>>> any change does not break the current syntax of >> >> BaseOperator, >> >>>>>>> avoiding >> >>>>>>>>>> forcing users to modify their DAGs code. Changes on the >> >>> config >> >>>>> side >> >>>>>>> and >> >>>>>>>>>> provider separation should be fine though (seems separating >> >>>>>> provider >> >>>>>>>>> while >> >>>>>>>>>> keeping the parameters of BaseOperator would require some >> >>> safe >> >>>>>>>> defaults). >> >>>>>>>>>> >> >>>>>>>>>> On Mon, Jan 27, 2025 at 10:51 PM Hussein Awala < >> >>>> huss...@awala.fr >> >>>>>> >> >>>>>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>>> Hello everyone! >> >>>>>>>>>>> >> >>>>>>>>>>> We had an implicit consensus about removing the >> >> email/SMTP >> >>>>>>>> integration >> >>>>>>>>>> from >> >>>>>>>>>>> Airflow Core and replacing it with the SMTP provider and >> >>>>>>>> SmtpNotifier. >> >>>>>>>>>> This >> >>>>>>>>>>> change will simplify Airflow's core, improve >> >>> maintainability, >> >>>>> and >> >>>>>>>>>> provide a >> >>>>>>>>>>> more consistent experience with other providers. Now is >> >> the >> >>>>> best >> >>>>>>>> moment >> >>>>>>>>>> to >> >>>>>>>>>>> do this before Airflow 3.0. >> >>>>>>>>>>> >> >>>>>>>>>>> All the Airflow configurations in the *smtp* section are >> >>>>>> supported >> >>>>>>> in >> >>>>>>>>> the >> >>>>>>>>>>> SMTP connection, so removing the whole section should >> >> have >> >>>>>> minimal >> >>>>>>>>> impact >> >>>>>>>>>>> on users. The migration effort is expected to be very >> >> low. >> >>>>>>>>>>> >> >>>>>>>>>>> This thread focuses on how best to handle the >> >>> configurations >> >>>>>>>> currently >> >>>>>>>>> in >> >>>>>>>>>>> the email section, mainly: >> >>>>>>>>>>> >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *default_email_on_failure* >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *default_email_on_retry* >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *email_backend* >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *email_conn_id* >> >>>>>>>>>>> >> >>>>>>>>>>> The other configurations could be replaced by connection >> >>>>> extras. >> >>>>>>>>>>> >> >>>>>>>>>>> Options for handling email configuration: >> >>>>>>>>>>> >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Option 1: Complete Removal >> >>>>>>>>>>> >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Remove the entire email section from the Airflow >> >>>>>>> configuration. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Remove the *email_on_failure* and *email_on_retry* >> >>>>>> arguments >> >>>>>>>> from >> >>>>>>>>>>> BaseOperator. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Implications: >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Users will need to explicitly define notifiers >> >>>> (using >> >>>>>>>>>>> on_*_callbacks) for any task that requires email >> >>>>>>>>> notifications. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> There will be no built-in mechanism for default >> >>>> email >> >>>>>>>>>>> notifications. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> This offers the most significant simplification >> >> of >> >>>>>>> Airflow's >> >>>>>>>>>> core >> >>>>>>>>>>> but may increase the effort required to >> >> configure >> >>>>> email >> >>>>>>>>> alerts. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Option 2: Replacement Configurations >> >>>>>>>>>>> >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Introduce new configuration options: >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *default_notification_on_failure*: (boolean) >> >>> Enables >> >>>>>>> default >> >>>>>>>>>>> notifications on task failure. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *default_notification_on_retry*: (boolean) >> >> Enables >> >>>>>> default >> >>>>>>>>>>> notifications on task retry. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> *default_notifier*: (string) Specifies a method >> >>> that >> >>>>>>>>> initializes >> >>>>>>>>>>> and returns the default notifier. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Replace *email_on_failure* and *email_on_retry* in >> >>>>>>> BaseOperator >> >>>>>>>>>> with >> >>>>>>>>>>> *notify_on_failure* and *notify_on_retry*, which >> >>> would >> >>>>>>> default >> >>>>>>>> to >> >>>>>>>>>> the >> >>>>>>>>>>> values set in the new configuration options. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Replace the old email handling behavior by simply >> >>>> adding >> >>>>>> the >> >>>>>>>>>> notifier >> >>>>>>>>>>> to the corresponding callback. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Implications: >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Preserve the concept of default notifications >> >>> while >> >>>>>> moving >> >>>>>>>>> away >> >>>>>>>>>>> from the mail-specific configuration. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Support other notification types and send >> >>>>> notifications >> >>>>>> by >> >>>>>>>>>>> default. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Requires introducing new configuration options >> >> and >> >>>>>>> modifying >> >>>>>>>>>>> BaseOperator. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Option 3: Dedicated Notification Connection >> >>>>>>>>>>> >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Create a new connection (of any type) with a >> >> specific >> >>>> and >> >>>>>>>>> reserved >> >>>>>>>>>>> name for notifications (e.g., >> >>>>> *notifications_connection*). >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> This connection would have extras to configure the >> >>>>> notifier >> >>>>>>> and >> >>>>>>>>>>> specify whether to send notifications by default. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Implications: >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Provides a centralized and extensible way to >> >>> manage >> >>>>> all >> >>>>>>>>>>> notification settings without Airflow >> >>> configuration. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Allows users to easily switch between different >> >>>>>>> notification >> >>>>>>>>>>> methods and send notifications by default. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> May require more initial setup for users but >> >>> offers >> >>>>>>> greater >> >>>>>>>>>>> flexibility in the long term. >> >>>>>>>>>>> - >> >>>>>>>>>>> >> >>>>>>>>>>> Require more dev on our side. >> >>>>>>>>>>> >> >>>>>>>>>>> Regarding migration, a fully automated migration might >> >> not >> >>> be >> >>>>>>>> possible >> >>>>>>>>> in >> >>>>>>>>>>> all cases because of the need to migrate Airflow >> >>>>> configurations, >> >>>>>> we >> >>>>>>>>> will >> >>>>>>>>>>> provide comprehensive documentation and examples to >> >> assist >> >>>>> users >> >>>>>> in >> >>>>>>>>>>> transitioning their email configurations. >> >>>>>>>>>>> >> >>>>>>>>>>> I'm interested in hearing your thoughts and any other >> >>>>> suggestions >> >>>>>>> you >> >>>>>>>>> may >> >>>>>>>>>>> have. >> >>>>>>>>>>> >> >>>>>>>>>>> P.S: I have created an almost ready-to-merge PR for >> >> option >> >>> 1 >> >>>> ( >> >>>>>>>>>>> https://github.com/apache/airflow/pull/46041) >> >>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>>>> >> >>>>>> >> >>>>> >> >>>> >> >>> >> >> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> For additional commands, e-mail: dev-h...@airflow.apache.org >> >>