Ah right okay. By "policy for policy's sake" this is exactly what I mean - we have so much work down already it feels nigh on impossible to know everything.
Well, min Airflow version aside: we agreed we're going to follow SemVer, so I say we stick to that. X+1.0.0 (whenever we choose to release it) is when we remove the deprecated code. My vote is: let's keep it simple. SemVer, and we delete the deprecated code whenever we remember. I don't think we need a policy beyond "follow SemVer", as a major version bump says a breaking change was made and users should read the changelog.

-a

On 21 May 2022 18:57:51 BST, Jarek Potiuk <[email protected]> wrote:
>> I'd also say we should keep the min Airflow version as low as possible for
>> providers, and only update it when we actually want to require a new feature
>> in a specific provider. Are you saying we should periodically just up the
>> min core requirement even if it's not required?
>
>This is a policy we already discussed and agreed to before actually. I
>think if you would like to discuss a proposal to change the policy,
>you should open another discussion, because the bump of all providers
>to 2.2+ is already scheduled to happen at the end of May, following the
>policy.
>
>It is documented here:
>https://github.com/potiuk/airflow/#support-for-providers
>
>We discussed it and agreed to it here:
>* Discussion here:
>https://lists.apache.org/thread/csczm7xmnntdz9wtjbod8pqgt13zoggo -
>where the apparent consensus (not my proposal) was that 12 months for
>2.1 is good
>* Lazy consensus on proposed PR (including TL;DR):
>https://lists.apache.org/thread/43v1dnww37t0o924mzo365ot2p58k864
>
>> Putting any kind of policy in place here seems like policy for policy's sake.
>
>Not really. There were a few arguments (you can see them in the
>discussion) why it makes sense to bump all of them. But I think if we
>want to re-discuss it, a new discussion should be opened.
>
>J.
>
>
>On Sat, May 21, 2022 at 7:04 PM Ash Berlin-Taylor <[email protected]> wrote:
>>
>> Oh okay.
>>
>> Well then I disagree with your proposal quite strongly: providers should be
>> run like separate projects with next to nothing tied to core.
>>
>> So long as X.*.* issues a warning then X+1.0.0 can delete that code, even if
>> it's released the next week - it's not like we do this very often anyway
>>
>> I'd also say we should keep the min Airflow version as low as possible for
>> providers, and only update it when we actually want to require a new feature
>> in a specific provider. Are you saying we should periodically just up the
>> min core requirement even if it's not required?
>>
>> Putting any kind of policy in place here seems like policy for policy's sake.
>>
>> This is yet another reason to split the providers - if Corp A wants to have
>> longer deprecation then they can, so it's not even something we need to
>> worry about in the core project.
>>
>> -a
>>
>> On 21 May 2022 17:50:06 BST, Jarek Potiuk <[email protected]> wrote:
>>>
>>> Yes. It is in providers. And it's about Google not Salesforce (that
>>> one has not yet been released).
>>>
>>> Both have nothing to do with Providers. But my proposal is to couple
>>> deprecation removal with the "min-version" of Airflow. The
>>> "min-version" is the moment where people will have to upgrade Airflow
>>> so having deprecation policy tied with that might make sense from the
>>> "operational" point of view for our users - even if it has nothing to
>>> do with core vs. providers.
>>>
>>> J.
>>>
>>> On Sat, May 21, 2022 at 6:46 PM Ash Berlin-Taylor <[email protected]> wrote:
>>>>
>>>>
>>>> I think you are talking about a different case to Elad.
>>>>
>>>> The impetus for this discussion was the deprecated operator for Tableau in
>>>> the Salesforce provider.
>>>>
>>>> So nothing there is to do with Airflow core version, just purely on the
>>>> providers side.
>>>>
>>>> -ash
>>>>
>>>> On 21 May 2022 17:01:01 BST, Jarek Potiuk <[email protected]> wrote:
>>>>>
>>>>>
>>>>> TL;DR: My proposal for policy is very simple: let's remove
>>>>> deprecations, if they were introduced in the version of Airflow that
>>>>> we do not support any more in providers. The next round of providers
>>>>> will have "Airflow >= 2.2". So if we added a deprecation in 2.1, it
>>>>> means that it is safe to remove that deprecation - because the user
>>>>> will have to make a significant upgrade anyway to use the
>>>>> providers. Also I think we should split Google Provider. But this is a
>>>>> bit on the side and we should have another discussion as a spin-off
>>>>> from that one.
>>>>>
>>>>> More context below.
>>>>>
>>>>> Beware, it's long and it summarizes about 2 weeks of discussing those
>>>>> with multiple people and 2 weeks thinking about the best way we can
>>>>> approach it.
>>>>> So brace yourself if you want to know what led to this proposal.
>>>>>
>>>>> I think I see two sides of it and I am a bit torn. The Google example
>>>>> was really interesting and "complex" - and I think we should learn
>>>>> from it and implement something that will be good for everyone (which
>>>>> I think is possible) :).
>>>>>
>>>>> 1) On one side indeed providers can be easily upgraded or downgraded
>>>>> separately.
>>>>> 2) On the other hand the fixes in some providers (google/aws) are
>>>>> "bundled" together - this was the case that Rafał mentioned - this
>>>>> means that yeah - it's easy to upgrade/downgrade, but also such
>>>>> upgrade/downgrade might either require you to update a lot of code, or
>>>>> you might miss an important fix/new feature. Really what we have is that
>>>>> bug fixes/features/deprecations are all coupled together.
>>>>>
>>>>> What the Composer team did was really decoupling deprecations and
>>>>> bugfixes.
>>>>> The Google Ads API v10 was a bit of a special case - because it
>>>>> was all-in-one: bug fix (old API stopped working), new feature (the
>>>>> new API has new features), and breaking change (as it was breaking).
>>>>> And it was coupled together with a number of deprecation removals. So
>>>>> even if it is "easy" to upgrade/downgrade google provider, you were
>>>>> really left with two choices:
>>>>> * either you get both - you have ADS working AND you have to remove deprecations
>>>>> * or you can still keep deprecations, BUT your ads will not work
>>>>>
>>>>> We can of course discuss various questions in this particular case:
>>>>> * Should the ads team disable the old API when they could check
>>>>> that an important "Airflow" client is not ready?
>>>>> * Should we know earlier that the old Google Ads API will be disabled?
>>>>> * Should Google Provider support v10 earlier?
>>>>> * What should the notification chain look like?
>>>>> * Should we have some monitoring in place to warn us before?
>>>>> * Should we avoid removing deprecations when we also added the Ads v10
>>>>> API support?
>>>>>
>>>>> I think answering those questions might help us to "avoid" similar
>>>>> problems, but I think this is not a unique "Ads" problem and it's much
>>>>> harder to change for all the operators we have :).
>>>>>
>>>>> So let's focus on what we can do. I think we have two problems to solve:
>>>>>
>>>>> * we want to avoid keeping deprecations for a long period
>>>>> * but also we want to avoid our stakeholders having to "fork" the
>>>>> community code.
>>>>> I don't like that we will have "different" providers
>>>>> in Composer - this is temporary of course, but it sets a precedent
>>>>> that will make life difficult for both the community and the Composer team
>>>>> (when people report errors for one, they might refer to a version
>>>>> of provider that is only available for the Composer team (or even if
>>>>> published, it will be an extra "variable" that we want to avoid when
>>>>> investigating issues of our users)).
>>>>>
>>>>> Also, I think Google Provider is a bit of a "special" case. We've been
>>>>> discussing for quite a while before that the Google Provider
>>>>> **should** be split.
>>>>>
>>>>> I believe we have two things to do:
>>>>>
>>>>> * I think we should split Google Provider finally. We can split it to
>>>>> Ads/Workspace/Cloud or even further - separate providers for each
>>>>> service. There is some complexity with shared GoogleHook and similar,
>>>>> but I think we can work out a good solution for that. We can discuss
>>>>> separately how exactly the split should be. If we had it in place,
>>>>> this problem would not exist - Google Ads would be upgraded
>>>>> separately and people could keep deprecations from other services
>>>>> easily.
>>>>>
>>>>> * I think it is important to have a "predictable" deprecation time of
>>>>> life as discussed above. I.e. the users (and other stakeholders)
>>>>> should have at least a chance to know when the deprecations will be
>>>>> removed and have enough lead time. We have not done it before, so this
>>>>> could be a surprise that the deprecations were removed. Now - the
>>>>> question is - what the rule should be.
>>>>>
>>>>> I am kinda ok with fixed 3/6 months. Both are acceptable. However, I
>>>>> personally think they are a) artificial without any business
>>>>> connection b) much too fast.
>>>>>
>>>>> First of all the question is WHY we want to remove deprecations at all
>>>>> - I think we want to remove them, because they introduce non-zero
>>>>> maintenance cost and prevent/make difficult some future changes. I
>>>>> think this is the only reason for deprecation removal. Simply
>>>>> "removing deprecation" without having a reason for that is not "good
>>>>> enough" if it potentially causes trouble to our users, and makes them
>>>>> open issues that we will have to handle. It's just the matter of
>>>>> weighing what is more costly for us - keeping the deprecations, or
>>>>> handling the cost of our users dealing with broken dags of theirs (if
>>>>> they don't proactively remove deprecations). We can "believe" there
>>>>> will not be many such users, but the reality is very different. Our
>>>>> users generally remove deprecations when a) there is a significant
>>>>> migration b) when they have no other choice but then they will raise
>>>>> issues and we will also be hit back by this.
>>>>>
>>>>> I also think most of our deprecations are really easy to maintain for
>>>>> a while. And 3/6 months is the minimum time for an enterprise to even
>>>>> "plan" any upgrade. We have to remember our users are not ONLY
>>>>> maintaining Airflow. Airflow is just one of the many systems they have
>>>>> to keep up and running. We might **think** that Airflow is important
>>>>> enough to force our users to remove deprecations immediately when they
>>>>> appear, but this is a bit megalomaniac thinking. Most of our users
>>>>> have a number of systems to keep running and Airflow is just one of
>>>>> the things they keep running.
>>>>>
>>>>> Summarizing - I think that having a short (3-6 months is short IMHO)
>>>>> deprecation policy brings very little benefit to us as maintainers
>>>>> (slightly easier code to maintain), while it might put big pressure
>>>>> on our users to remove deprecations much more often than they have
>>>>> capacity for.
>>>>>
>>>>> But actually I think we could be much smarter in actually engineering
>>>>> the most "smooth" upgrade and deprecation removal approach for our
>>>>> users. I think we all have a common goal to make the users upgrade
>>>>> AIRFLOW minor version relatively quickly. We already have the policy
>>>>> that providers will have Airflow 2.N min version for 12 months. One
>>>>> of the reasons was to make our users upgrade relatively fast, but
>>>>> slowly enough to be able to plan and execute it without haste;
>>>>> another reason for adding the policy was to add an "incentive" for them
>>>>> to migrate during those 12 months. Simply - if they don't upgrade,
>>>>> they will stop receiving fixes to the providers they use. Also the
>>>>> reason was to give our users some kind of predictability and enough
>>>>> time to adapt to newer airflow releases.
>>>>>
>>>>> I think my proposal would be to "couple" the two policies together -
>>>>> i.e. connect the deprecation removal with "min-airflow" policy for
>>>>> providers. Let me repeat what I wrote in the TL;DR above:
>>>>>
>>>>> My proposal for policy is very simple: let's remove deprecations, if
>>>>> they were introduced in the version of Airflow that we do not support
>>>>> any more in providers. The next round of providers will have "Airflow
>>>>> >= 2.2". So if we added a deprecation in 2.1, it means that it is safe
>>>>> to remove that deprecation - because the user will have to make a
>>>>> significant upgrade anyway to use the providers.
>>>>>
>>>>> J.
>>>>>
>>>>> On Sat, May 21, 2022 at 4:52 PM Rafal Biegacz
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Elad - thank you for bringing up this topic!
>>>>>>
>>>>>> In general, it seems that we have two big groups of users:
>>>>>> a) those who are on top of all the changes in Airflow and Airflow
>>>>>> Providers - these users are eager to change their DAGs to adjust to
>>>>>> newer versions quickly.
>>>>>> b) users who want to implement some DAGs and they want to run them for
>>>>>> as long as it is possible (it's especially true in case of Enterprise
>>>>>> users)
>>>>>>
>>>>>> One consequence of very short deprecation notice periods will be that
>>>>>> users will be less eager to upgrade to newer versions of providers.
>>>>>> On the other hand, users are recommended to use up-to-date/maintained
>>>>>> providers - which is good for them.
>>>>>>
>>>>>> I'm inclining towards:
>>>>>>
>>>>>> * 3 or 6-month deprecation notice periods (I'm evenly split between
>>>>>> these two options); definitely, it should not be shorter than 3 months.
>>>>>> * a deprecation could be announced in providers' version X, should not
>>>>>> be removed in X+1 version and deprecation could be removed in X+2
>>>>>> version assuming "deprecation notice period" passes.
>>>>>>
>>>>>> Regards, Rafal.
>>>>>>
>>>>>> PS. Deprecations also introduce challenges with delivering fixes and
>>>>>> maintaining already released versions.
>>>>>> In the case of Google providers 6.8.0 we had a challenge with
>>>>>> two items: problem with CloudSQL Proxy Runner and lack of support for
>>>>>> Google Ads API v10.
>>>>>>
>>>>>> Both items were fixed in providers 7.0.0 but the fixes were coming
>>>>>> with a price - several deprecated Google operators and operator
>>>>>> parameters were removed so to get the fixes users would need to modify
>>>>>> lots of their other DAGs.
>>>>>> In case of Composer, we ended up producing a custom version of
>>>>>> Google providers - built based on 6.8 with required fixes.
>>>>>> IMHO, we should produce a community version of Google providers 6.8.1
>>>>>> or 6.9 to fix it.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, May 21, 2022 at 1:33 AM Kaxil Naik <[email protected]> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 6 months and even 3 months sounds too long for releasing a major
>>>>>>> version of provider to be honest.
>>>>>>> Providers follow strict sem-ver,
>>>>>>> effort to downgrade to previous version is much less compared to core
>>>>>>> Airflow. Similarly effort to upgrade is less too.
>>>>>>>
>>>>>>> So I would vote for a guideline for deprecation for 2 releases with
>>>>>>> an exception where it is not possible to provide a deprecation before
>>>>>>> breaking change path.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Kaxil
>>>>>>>
>>>>>>> On Fri, 20 May 2022 at 23:03, Ash Berlin-Taylor <[email protected]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> My vote is to remove it as soon as the major ver of the provider
>>>>>>>> is released, or as soon as anyone remembers anyway :)
>>>>>>>>
>>>>>>>> On Fri, May 20 2022 at 22:58:54 +0300, Elad Kalif
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Providers follow semver just like airflow core.
>>>>>>>>
>>>>>>>> If you upgrade to a major release it means that there are breaking
>>>>>>>> changes and you should read the release notes to know what they are.
>>>>>>>> Breaking changes can happen regardless of removing deprecated
>>>>>>>> features.
>>>>>>>>
>>>>>>>> Google provider for example had several breaking-change releases
>>>>>>>> (2.0.0, 3.0.0, etc.); only in 7.0.0 did we remove deprecated features.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 20 May 2022, 22:50, Mateusz Henc
>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>> Right, nobody forces you to upgrade, but sometimes you wait for an
>>>>>>>>> important bug fix/new feature that is coming in the new version and
>>>>>>>>> you are surprised by the breaking change there.
>>>>>>>>>
>>>>>>>>> Isn't the problem with deprecations more about their visibility?
>>>>>>>>> How can users learn today that they use a deprecated feature? I think
>>>>>>>>> it's only from logs.
>>>>>>>>> But if dags are running fine, there is no need to check logs.
>>>>>>>>>
>>>>>>>>> Shouldn't information about new deprecations be included in release
>>>>>>>>> notes for the package?
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Mateusz Henc
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, May 20, 2022 at 5:39 PM Daniel Standish
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> It means, in a 3 months period, a developer needs to [do lots of
>>>>>>>>>>> things...]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> When removal is released (say after a min of 3 months since
>>>>>>>>>> deprecation), as a user nothing forces you to upgrade to the latest
>>>>>>>>>> major immediately.
>>>>>>>>>>
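
For comparison, two of the concrete removal rules proposed in this thread can be written down as a small Python sketch. This is only an illustration, not anything the project has adopted: the function names, the plain integer counter for provider releases, the (major, minor) tuples for Airflow versions and the 90-day default notice period are all assumptions made for the example.

```python
from datetime import date, timedelta


def removable_by_notice_period(deprecated_in, candidate, deprecated_on,
                               release_on, notice_days=90):
    """Rafal's suggestion (hypothetical helper): a deprecation announced in
    provider version X stays in X+1 and may be removed from X+2 onwards,
    provided the notice period (3 or 6 months in the thread) has passed.
    Versions are modelled as simple integers counting releases.
    """
    enough_releases = candidate - deprecated_in >= 2
    enough_time = release_on - deprecated_on >= timedelta(days=notice_days)
    return enough_releases and enough_time


def removable_by_min_airflow(deprecated_in_airflow, min_airflow):
    """Jarek's proposal (hypothetical helper): a deprecation may be removed
    once it was introduced in an Airflow version that the providers no
    longer support, i.e. one below the providers' min Airflow version.
    Versions are (major, minor) tuples, compared element by element.
    """
    return deprecated_in_airflow < min_airflow


# Jarek's example: deprecation added in 2.1, next providers need Airflow >= 2.2.
print(removable_by_min_airflow((2, 1), (2, 2)))

# Rafal's rule: deprecated in release 6, candidate releases 7 and 8.
print(removable_by_notice_period(6, 7, date(2022, 1, 1), date(2022, 5, 21)))
print(removable_by_notice_period(6, 8, date(2022, 1, 1), date(2022, 5, 21)))
```

Under the plain SemVer position also argued in the thread, neither check is needed: any next major version of a provider may drop code that issued a deprecation warning in the previous major.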
