I agree with Jarek and Niko regarding the importance of SemVer for Airflow and how it aids in maintaining user trust.
However, I am not a fan of applying SemVer strictly, especially when we treat a small change in default values as a breaking change. IMHO, an alternative solution for making deprecated features pluggable would be to isolate them within the Airflow core and introduce a new configuration option for enabling/disabling these features, with them being disabled by default. The commitment we make to our users should be "all features will remain supported across minor versions," instead of "you won't need to change any configuration to upgrade Airflow to the next major version." Naturally, we should provide documentation after each minor version release. Additionally, we could consider implementing a new CLI command similar to `airflow db upgrade` that analyzes the current configuration, alerts the user about changed default values, and suggests changes to avoid breakage.

I have also observed that only unstable features highly likely to be removed are added as experimental. In my opinion, Airflow 2 introduced several potential candidates, such as deferrable tasks, data-aware scheduling, dynamic mapped tasks, and setup/teardown tasks. Marking big new features as experimental for two minor versions would give us the time and flexibility to make significant (and potentially breaking) changes. This would help correct any wrong decisions we might have made while discussing/designing the new feature, particularly after receiving user feedback.

On Wed, Aug 30, 2023 at 4:35 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Just to add to my points about responding to our user.
>
> This is - of course - anecdotal - but this is a transcript from today's Slack conversation I got with one of the users, and this is not the first conversation I had of this kind:
>
> It's only because of Strict Semver policies I can very plainly say what I said. And IMHO users do not expect "I want 2.3 to be supported for X".
> They really, really want predictability of what to expect. And they are ... happy when we tell them "we expect you to upgrade" asap.
> I think Airflow 1.10 did a lot of harm to the perception of what Airflow upgrades are supposed to be. And we are YET to see all the positive effects of the SemVer change we've made, once users grasp what it means to them.
>
> (see https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1693404058623619)
>
> User: Hello, when does the community or apache support for 2.2.x or 2.3.x expires. Can we get some insights for us to plan upgrades?
>
> Jarek:
>
> Airflow comes without any guarantees of any sort
> This is the licence
> We do have promise about airflow SemVer: https://airflow.apache.org/docs/apache-airflow/stable/release-process.html
> Means that we aim for all 2.* releases to be backwards compatible
> And we only release new code and security fixes in latest 2.* minor release ONLY
> https://airflow.apache.org/docs/apache-airflow/stable/security/releasing_security_patches.html
> So as long as you keep upgrading to latest 2.* (currently 2.7.0) you are fine
> There had never been a single bugfix or security patch released for previous minor branch
> Which means that the only way to get fixes (and security patches) is to upgrade to latest airflow 2 (edited)
> Which currently is Airflow 2.7.0
> And you should follow this pattern
> People who create airflow are mostly volunteers and do their best effort to help people - but if you come with some errors in 2.2 or 2.3 you will likely get answer "upgrade to latest airflow"
> There are however companies that might provide you with paid support for 2.2 or 2.3
> Service Providers (MWAA, Cloud Composer) have their own schedule for supporting their version and you can get paid support there if you wish
>
> User:
>
> Great!
> thanks for all the insights @Jarek Potiuk
> Last time when we did an upgrade was due to some vulnerabilities found with 1.10.x version.
> Reason for checking this is we don't want to get into such situation again. Since we maintain a lot.
> And agree that upgrade is the best way to have us on par. But project scope and efforts determines that.
> We do have astro and plans to move there gradually. Until then wanted to make sure current versions that we have doesnt fall under any priority scanners. :slightly_smiling_face:
>
> Jarek:
>
> Yes. 1.10 did not have those promises
> It was not even following SemVer
>
> J.
>
> On Wed, Aug 30, 2023 at 8:58 AM Pierre Jeambrun <pierrejb...@gmail.com> wrote:
>
> > Same, I was very tempted by this at first but Jarek and Niko changed my mind. I think sticking to semver will be more beneficial in the long run.
> >
> > On Wed 30 Aug 2023 at 04:09, Mehta, Shubham <shu...@amazon.com.invalid> wrote:
> >
> > > I couldn’t agree more with Jarek and Niko's perspective on the importance of maintaining SemVer for Apache Airflow.
> > >
> > > I've had conversations with dozens of customers, and it was a lot easier to convince them to upgrade for a more stable and secure Airflow environment. The key selling point was that Airflow strictly follows SemVer, so users don't have to worry about upgrades breaking their environment. Security is the most important aspect of this. And with the recent inflow of CVEs being addressed with every version release, I can't imagine how difficult it would have been for customers without SemVer promise to ensure that their Airflow deployments are secure.
> > >
> > > > Well, if we had a different policy that allowed for introducing behavior changes on a regular basis, then we would not have to save them all up for the major release, and unleash them on the users all at once.
> > > > So then you would not have that big painful major release upgrade to deal with -- you'd have done it a little bit at a time. So the "carrots" become less important perhaps. Perhaps the fact that behavior changes would come out in dribs and drabs over time would make it more likely for users to upgrade sooner, because staying current would be less painful than getting too far
> > >
> > > Regarding introducing behavior changes on a regular basis, I recently analyzed improvements and new features in Airflow. I noticed that Airflow did not strictly follow SemVer for the 1.10.x releases. As a result, there were many users stuck on versions like "1.10.12," and these users are hesitant even to upgrade to later 1.10 versions. Now, I see users happily migrating to newer versions of Airflow and trying out new features. Granted, it's not perfect due to potential breaking changes in the provider packages, but it's far better than what Airflow experienced with the 1.10.x series.
> > >
> > > To be honest, Airflow already faces challenges in improving the adoption of new features, in my personal opinion. For example, it took about a year for deferrable operators to gain awareness and interest. I also haven't seen much excitement around data-driven scheduling among Airflow users (which I was secretly hoping), similar to Airflow contributors. Moving away from SemVer would likely make this situation worse.
> > >
> > > > I'm sure many good ideas have emerged, but been ruled out solely based on backcompat.
> > >
> > > Until we have a list of data points / ideas that were discarded, it is hard to justify a major release for this reason. Maybe we should maintain an active list in GitHub discussions?
> > >
> > > In conclusion, SemVer is easy to understand for regular Airflow users who might not read every line in the release notes or follow every mailing list discussion. Personally, I don't think it has made adding new features difficult as I see a lot of new features coming in lately. In fact, I strongly believe that SemVer helps keep Airflow contributors focused on customer needs and encourages them to think creatively about ensuring backward compatibility.
> > >
> > > Shubham
> > >
> > > On 2023-08-27, 9:05 AM, "Daniel Standish" <daniel.stand...@astronomer.io.invalid> wrote:
> > >
> > > > Since I was called :)
> > >
> > > As though you needed to be called to chime in ;)
> > >
> > > Yeah and the other thing that your comments made me think of was... how it could make provider management more challenging. Because though currently we have min_airflow_version set in providers and we can use that to control behavior (and assumptions about what's in core), presently it's just about future compat and just addition of new features. But with a change like this, it would expand that burden to some extent, by requiring consideration of what's changed and what's removed, in a way that is not a practical issue presently.
> > >
> > > > I see no particular reason for removing features if they do not slow us down
> > >
> > > Yeah so wholesale removal of features is one thing, like with the subdags you mentioned. But the prospect of the infinitely distant 3.0 also has a more diffuse impact on development.
> > > I'm sure many good ideas have emerged but been ruled out solely based on backcompat. Sometimes probably on a narrow backcompat concern where it's maybe like... is anybody really relying on this aspect of behavior?
> > >
> > > Maybe that's simply what we must deal with. But the thought occurred to me, maybe there's some other way.
> > >
> > > And yeah i shouldn't say it's "not working for us"... that's just me writing an email 2 minutes before bedtime when an idea popped in my head.... obviously it's working ok for us, and doing a lot of work *for* us.
> > >
> > > On Sun, Aug 27, 2023 at 1:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > > Since I was called :).
> > > >
> > > > Yes. I would be very, very careful here. You might think that we use "SemVer" as a "cult". Finally it's just a versioning scheme we adopted, right? But for me - this is way more. It's all about communication with our users, making promises to our users and design decisions that impact our security policies.
> > > >
> > > > I think Semver has this nice property that we can promise our users "if you are using the public interface of Airflow, you can upgrade without too much of a fear that things will break - if they will be broken, this will be accidental and will get fixed". BTW we already have, very nicely defined
> > > >
> > > > https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html
> > > >
> > > > so it's pretty clear what we promise to our users. And it also has certain "security" properties - but I will get to that.
> > > >
> > > > I would love to hear what others think, but I have 3 important aspects that should be considered in this discussion
> > > >
> > > > 1. Promises we make to our users and what it means for us answering their issues.
> > > >
> > > > Surely we could make other promises. CalVer promises (We release regularly) - but it does not give the user any indication that whatever worked before will work in the foreseeable future and will get maintained. It makes maintainer life easier, yes. It however makes the user's life harder, because they cannot rely on something being available and their upgrades might and will be more difficult. And yes - for Snowflake it matters a lot, because they actually get paid for supporting old versions and they have no choice but to respond to users who claim the "old supported version does not work". They cannot (as we can, and often do currently) tell the users "upgrade to the latest version - it should be easy because of SemVer promise - if you follow our 'use public interface only' of course". We (community/maintainer) can very easily say that and since we give no support, no guarantees, no-one pays for solving problems, this "upgrade to latest version" is actually a very good answer - many, many times.
> > > >
> > > > For maintainers that rarely respond to user questions, yes, Semver makes it harder to add new things. But for maintainers who actually respond a lot to users' questions, life without SemVer is suddenly way harder - because they cannot answer "upgrade to latest version" - because immediately the user will answer "but I cannot - because I am using this and that feature. tell me how to solve the problem without upgrading". And they will be in full right to say that.
> > > > I recommend that any maintainer sometimes spend a few weeks looking at and actually responding to users' questions. That is quite a good lesson and this aspect should also be considered in this discussion.
> > > >
> > > > 2. Why do we want to introduce breaking changes?
> > > >
> > > > I believe the only reason for removing features (I don't really like softening "breaking changes" with "behaviour changes" BTW. This attempts to hide the fact that those changes are - in fact - breaking changes) is when they are slowing us down (us - people who develop airflow). So I propose to keep the name in further discussion as it tells much more about the nature of "behaviour changes". I see no particular reason for removing features if they do not slow us down.
> > > >
> > > > Let me ask this way - would Semver disturb you if we had a way of removing features from core airflow (i.e. making them not slow down development) without breaking Semver? Seems contradictory - but it is not. We've already done that and continue doing it. Move Kubernetes and Celery out of core -> we did it. They are not part of Semver any more. They never were, actually (but a number of people could assume they were). Now, they are completely out of the way. I remember how much time Daniel, you spent on back-compatibility of K8S code - but... Not any more. People will be able to keep current K8S and Celery provider implementation basically forever now. No matter how many Airflow versions we have.
> > > > By introducing a very clear Executor API and making sure we decouple it from the Core - we actually made the impossible happen:
> > > >
> > > > * Airflow core still keeps the SemVer Promises
> > > > * Users can stick to the "old" behaviour of K8S and Celery executors as long as they want
> > > > * We can introduce breaking changes in K8S and Celery providers - absolutely without looking back
> > > >
> > > > Seems like magic - but just by clear specification of what is/is not a public API, decoupling things and having mechanisms of providers where you downgrade and upgrade them independently and where the old versions can be used for as long as you want - even after you upgrade Airflow, we seemingly made the impossible - possible. And .. my assumption is - we can do the same with other features. Surely some are more difficult than others (SubDAG - yes I am talking about you). But maybe - instead of breaking SemVer we can figure out how to move subdag to a separately installable provider? Why not introduce some stable API and decouple SubDAG functionality similar to Celery/K8S Executors? It will be a lot harder, and likely performance will suffer - but hey, performance is not something promised by SemVer. We already - apparently - in 2.5 increased Airflow's resource requirements by ~ 30% (looks like from an anecdotal user's report). And while users complain, performance / resource usage is not promised by SemVer (and by us). And while users might complain, increasing resources nowadays is just a matter of cost, it's generally easy to increase memory for your Airflow installation.
> > > > Yes you will pay more, but usually Airflow's cost is rather low and there are other factors that might decrease the cost (such as deferrables) so this is not a big problem (and it does not matter in this discussion).
> > > >
> > > > So my question is - do we have a really good reason to break up with SemVer and remove some features? Or maybe there are ways we can separate them out of the way of core maintainers without breaking SemVer? I believe more and more decoupling is a way better approach to achieve faster development than breaking SemVer.
> > > >
> > > > 3. Security patches
> > > >
> > > > This is, I think, one of the things that will only get more important over the next few years. And we need to be ready for what's coming. I am not sure about others but I am not only following, but also actively participating in discussions of the Apache Software Foundation. For those who don't - I recommend reading this blog post at the ASF Blog
> > > >
> > > > https://news.apache.org/foundation/entry/save-open-source-the-impending-tragedy-of-the-cyber-resilience-act
> > > >
> > > > We are facing - in the next few years - increased pressure from governments (EU and US mainly in our case) to force everyone to follow security practices they deem essential. We are at a pivotal moment where the Software Development industry is starting to be regulated. It happened in the past for many industries. And it is happening now - we are facing regulations that we've never experienced in software development history. Those laws are about to be finalized (some of them in a few months).
> > > > The actual scope of it is yet to be finalized but among them there is a STRICT requirement of releasing security patches for all major versions of the software for 5 years (!) after it's been released. This will be a strict requirement in the EU and companies and organisations not following it will be forbidden to do business in the EU (similar in the US). How it will impact ASF - we do not know yet, our processes are sound. But there is a path that both - our users and stakeholders will expect that there are security patches that are released for all the software that is out there and used for years.
> > > >
> > > > If we use SemVer - this is the very nice side of it - by definition we only need to release patches for all the MAJOR versions we have. This is what we do effectively today. We only release security patches for the latest MINOR release of the ONLY major release (Airflow 2). If we start deliberately releasing breaking changes - then such a breaking release becomes automatically equivalent to a MAJOR release - because our users will not be able to upgrade and apply security fixes without - sometimes - majorly breaking their installation. This is 100% against the spirit and idea of the regulations. The regulations aim to force those who produce software to make it easy and possible for the users to upgrade immediately after security fixes are released.
> > > >
> > > > In a way - using SemVer and being able to tell the users "We only release security patches in the latest release because you can always easily upgrade to this version due to SemVer".
> > > >
> > > > If we are looking to speed up our development and not get into the way of maintainers - CalVer in this respect is way worse IMHO. The regulations might make us actually slower if we follow it.
> > > >
> > > > J.
> > > >
> > > > On Sun, Aug 27, 2023 at 8:46 AM Daniel Standish <daniel.stand...@astronomer.io.invalid> wrote:
> > > >
> > > > > And to clarify, when I talk about putting pressure on major releases, what I mean is that there's this notion that a major release has to have some very special or big feature change. One reason offered is marketing. Major release is an opportunity to market airflow, so take advantage of that. Another offered is, "well folks won't upgrade if there's not some special carrots in it", especially given that major releases are where we introduce a bunch of breaking changes all at once.
> > > > >
> > > > > Well, if we had a different policy that allowed for introducing behavior changes on a regular basis, then we would not have to save them all up for the major release, and unleash them on the users all at once. So then you would not have that big painful major release upgrade to deal with -- you'd have done it a little bit at a time. So the "carrots" become less important perhaps. Perhaps the fact that behavior changes would come out in dribs and drabs over time would make it more likely for users to upgrade sooner, because staying current would be less painful than getting too far behind -- though that's just a thought.
> > > > >
> > > > > But anyway, the way it is now, the major release seems to be too many things: big marketing push, tons of new features, *and* the only opportunity to make breaking changes. A policy like Snowflake's seems so much healthier, methodical, and relaxed, allowing us to be selective about when and how to release behavior changes, without it having to be anything more than that.
> > > > >
> > > > > CalVer <https://calver.org/> may be a good option.
> > > > >
> > > > > On Sat, Aug 26, 2023 at 11:22 PM Daniel Standish <daniel.stand...@astronomer.io> wrote:
> > > > >
> > > > > > For whatever reason, I was reminded recently of Snowflake's "behavior change" policy
> > > > > >
> > > > > > See here: https://docs.snowflake.com/en/release-notes/behavior-change-policy
> > > > > >
> > > > > > I think semver is problematic for core because basically you cannot implement behavior changes until the "mythical" major release. The next major always seems very far off. Indeed some have suggested that 3.0 might never happen (looking at you @potiuk :) ).
> > > > > >
> > > > > > Given the fact that it is so incredibly uncertain when exactly 3.0 will happen (and after that, subsequent major releases), it really makes the developer's job a lot harder. Because you feel like you may never (or practically never) be able to make a change, or fix something, etc.
> > > > > >
> > > > > > What Snowflake does is they release "behavior changes" (a.k.a. breaks with backward compatibility) in phases.
> > > > > > First is testing phase (users can enable optionally). Next is opt-out period (enabled by default but you can disable). Final phase is general availability, when it's enabled and you can't disable it.
> > > > > >
> > > > > > Moving to something like this would give us a little more flexibility to evolve airflow incrementally over time, without putting so much pressure on those mythical major releases. And it would not put so much pressure on individual feature development, because you would know that there's a reasonable path to changing things down the road.
> > > > > >
> > > > > > Because of the infrequency of our major releases, I really think for core, semver is just not working for us. That's perhaps a bold claim, but that's how I feel.
> > > > > >
> > > > > > Discuss!