>
> Since I was called :)

As though you needed to be called to chime in ;)

Yeah, and the other thing your comments made me think of was... how it
could make provider management more challenging.  Currently we have
min_airflow_version set in providers, and we can use that to control
behavior (and assumptions about what's in core), but presently it's just
about future compat and the addition of new features.  A change like this
would expand that burden to some extent, by requiring consideration of
what's changed and what's removed, in a way that is not a practical issue
presently.
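
To make the provider-side concern concrete, here is a minimal sketch of the
extra check a provider might need if core could drop things between
releases. Everything in it is hypothetical (CoreFeature, feature_available,
the feature name and the version numbers are illustrative, not Airflow
APIs); it only shows that a lower bound like min_airflow_version stops
being enough once removals are possible.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'X.Y.Z' into a tuple, ignoring suffixes such as 'rc1'."""
    numbers = []
    for part in text.split(".")[:3]:
        digits = ""
        for ch in part:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits or 0))
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


@dataclass(frozen=True)
class CoreFeature:
    """Hypothetical record of a core feature a provider relies on."""

    name: str
    added_in: str                      # today's concern: the lower bound
    removed_in: Optional[str] = None   # new concern if core may remove things


def feature_available(feature: CoreFeature, core_version: str) -> bool:
    """Return True if the feature exists in the given core version."""
    core = parse_version(core_version)
    if core < parse_version(feature.added_in):
        return False
    if feature.removed_in is not None and core >= parse_version(feature.removed_in):
        return False
    return True


# Illustrative only: made-up feature name and versions.
helper = CoreFeature("some_core_helper", added_in="2.4.0", removed_in="2.9.0")
print(feature_available(helper, "2.7.0"))  # inside the supported window
print(feature_available(helper, "2.9.1"))  # after the hypothetical removal
```

With additions only, the removed_in field is never needed and a single
minimum version covers everything; once removals are allowed, each provider
has to track a window per feature.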

> I see no particular reason for removing features if they do not slow us
> down


Yeah so wholesale removal of features is one thing, like with the subdags
you mentioned.  But the prospect of the infinitely distant 3.0 also has a
more diffuse impact on development. I'm sure many good ideas have emerged
but been ruled out solely based on backcompat.  Sometimes probably on a
narrow backcompat concern where it's maybe like... is anybody really
relying on this aspect of behavior?

Maybe that's simply what we must deal with.  But the thought occurred to
me, maybe there's some other way.

And yeah, I shouldn't say it's "not working for us"... that's just me
writing an email 2 minutes before bedtime when an idea popped into my
head.... obviously it's working ok for us, and doing a lot of work *for* us.




On Sun, Aug 27, 2023 at 1:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Since I was called :).
>
> Yes. I would be very, very careful here. You might think that we use
> "SemVer" as a "cult". After all, it's just a versioning scheme we adopted,
> right? But for me, this is way more. It's all about communication with
> our users, making promises to our users, and design decisions that impact
> our security policies.
>
> I think SemVer has this nice property that we can promise our users "if you
> are using the public interface of Airflow, you can upgrade without too much
> fear that things will break - if they do break, this will be
> accidental and will get fixed".  BTW we already have a very nicely defined
> public interface:
> https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html
> so it's pretty clear what we promise to our users. And it also has certain
> "security" properties - but I will get to that.
>
> I would love to hear what others think, but I have 3 important aspects that
> should be considered in this discussion:
>
> 1. Promises we make to our users and what they mean for us when answering
> their issues.
>
> Surely we could make other promises. CalVer promises "we release regularly",
> but it does not give the user any indication that whatever worked before
> will work in the foreseeable future and will get maintained. It makes the
> maintainer's life easier, yes. It however makes the user's life harder,
> because they cannot rely on something being available, and their
> upgrades might and will be more difficult. And yes - for Snowflake it
> matters a lot, because they actually get paid for supporting old versions
> and they have no choice but to respond to users who claim the "old
> supported version does not work". They cannot (as we can, and often do
> currently) tell the users "upgrade to the latest version - it should be
> easy because of the SemVer promise - if you follow our 'use the public
> interface only' rule, of course".  We (community/maintainers) can very
> easily say that, and since we give no support, no guarantees, and no-one
> pays for solving problems, this "upgrade to the latest version" is actually
> a very good answer - many, many times.
>
> For maintainers that rarely respond to user questions, yes, SemVer makes it
> harder to add new things. But without SemVer, life is suddenly way harder
> for maintainers who actually respond a lot to users' questions - because
> they cannot answer "upgrade to the latest version" - because immediately
> the user will answer "but I cannot - because I am using this and that
> feature. Tell me how to solve the problem without upgrading". And they will
> be fully within their rights to say that. I recommend that every maintainer
> spend a few weeks, from time to time, looking at and actually responding to
> users' questions. That is quite a good lesson and this aspect should also
> be considered in this discussion.
>
> 2. Why do we want to introduce breaking changes?
>
> I believe the only reason for removing features is when they are slowing
> us down (us - people who develop Airflow). BTW, I don't really like
> softening "breaking changes" into "behaviour changes". This attempts to
> hide the fact that those changes are - in fact - breaking changes. So I
> propose to keep the name "breaking changes" in further discussion, as it
> tells much more about the nature of "behaviour changes".  I see no
> particular reason for removing features if they do not slow us down.
>
> Let me ask it this way - would SemVer disturb you if we had a way of
> removing features from core Airflow (i.e. making them not slow down
> development) without breaking SemVer? Seems contradictory - but it is not.
> We've already done that and continue doing it. Moving Kubernetes and
> Celery out of core -> we did it. They are not part of SemVer any more.
> They never were, actually (but a number of people could assume they
> were). Now, they are completely out of the way. I remember how much time
> you, Daniel, spent on back-compatibility of K8S code - but... not any more.
> People will be able to keep the current K8S and Celery provider
> implementations basically forever now, no matter how many Airflow versions
> we have. By introducing a very clear Executor API and making sure we
> decouple it from the core, we actually made the impossible happen:
>
> * Airflow core still keeps the SemVer Promises
> * Users can stick to the "old" behaviour of K8S and Celery executors as
> long as they want
> * We can introduce breaking changes in K8S and Celery providers - without
> looking back at all
>
> Seems like magic - but just by clearly specifying what is/is not a
> public API, decoupling things, and having a providers mechanism where you
> can downgrade and upgrade them independently and where the old versions
> can be used for as long as you want - even after you upgrade Airflow - we
> made the seemingly impossible possible. And... my assumption is that we can
> do the same with other features. Surely some are more difficult than others
> (SubDAG - yes I am talking about you). But maybe - instead of breaking
> SemVer - we can figure out how to move SubDAG to a separately installable
> provider? Why not introduce some stable API and decouple SubDAG
> functionality, similar to the Celery/K8S executors? It will be a lot
> harder, and likely performance will suffer - but hey, performance is not
> something promised by SemVer. We already - apparently - increased
> Airflow's resource requirements by ~30% in 2.5 (judging from an anecdotal
> user's report). And while users might complain, performance / resource
> usage is not promised by SemVer (or by us), and increasing resources
> nowadays is just a matter of cost: it's generally easy to increase memory
> for your Airflow installation. Yes, you will pay more, but usually
> Airflow's cost is rather low and there are other factors that might
> decrease the cost (such as deferrables), so this is not a big problem
> (and it does not matter in this discussion).
>
> So my question is - do we have a really good reason to break with SemVer
> and remove some features? Or maybe there are ways we can move them out
> of the way of core maintainers without breaking SemVer? I believe more and
> more decoupling is a way better approach to achieving faster development
> than breaking SemVer.
>
> 3. Security patches
>
> This is, I think, one of the things that will only get more important over
> the next few years. And we need to be ready for what's coming. I am not
> sure about others, but I not only follow but also actively participate in
> the discussions of the Apache Software Foundation. For those who don't - I
> recommend reading this blog post on the ASF Blog:
>
> https://news.apache.org/foundation/entry/save-open-source-the-impending-tragedy-of-the-cyber-resilience-act
>
> We are facing - in the next few years - increased pressure from governments
> (EU and US mainly in our case) to force everyone to follow security
> practices they deem essential. We are at a pivotal moment where the
> Software Development industry is starting to be regulated. It happened in
> the past for many industries. And it is happening now - we are facing
> regulations that we've never experienced in software development history.
> Those laws are about to be finalized (some of them in a few months).  The
> actual scope of it is yet to be finalized but among them there is a STRICT
> requirement of releasing security patches for all major versions of the
> software for 5 years (!) after it's been released. This will be a strict
> requirement in the EU and companies and organisations not following it will
> be forbidden to do business in the EU (similar in the US). How it will
> impact ASF - we do not know yet, our processes are sound. But there is a
> path where both our users and stakeholders will expect that security
> patches are released for all the software that is out there and used for
> years.
>
> If we use SemVer - this is the very nice side of it - by definition we only
> need to release patches for all the MAJOR versions we have. This is what we
> do effectively today. We only release security patches for the latest MINOR
> release of the ONLY major release (Airflow 2). If we start deliberately
> releasing breaking changes - then such a breaking release becomes
> automatically equivalent to a MAJOR release - because our users will not be
> able to upgrade and apply security fixes without - sometimes - majorly
> breaking their installation. This is 100% against the spirit and idea of
> the regulations. The regulations aim to force those who produce software to
> make it easy and possible for the users to upgrade immediately after
> security fixes are released.
>
> In a way, using SemVer is what lets us tell the users "We only release
> security patches in the latest release because you can always easily
> upgrade to this version due to SemVer".
>
> If we are looking to speed up our development and not get into the way of
> maintainers - CalVer in this respect is way worse IMHO. The regulations
> might make us actually slower if we follow it.
>
> J.
>
>
>
>
>
> On Sun, Aug 27, 2023 at 8:46 AM Daniel Standish
> <daniel.stand...@astronomer.io.invalid> wrote:
>
> > And to clarify, when I talk about putting pressure on major releases,
> > what I mean is that there's this notion that a major release has to have
> > some very special or big feature change.  One reason offered is
> > marketing.  Major release is an opportunity to market airflow, so take
> > advantage of that.  Another offered is, "well folks won't upgrade if
> > there's not some special carrots in it", especially given that major
> > releases are where we introduce a bunch of breaking changes all at once.
> >
> > Well, if we had a different policy that allowed for introducing behavior
> > changes on a regular basis, then we would not have to save them all up
> > for the major release, and unleash them on the users all at once.  So
> > then you would not have that big painful major release upgrade to deal
> > with -- you'd have done it a little bit at a time.  So the "carrots"
> > become less important perhaps.  Perhaps the fact that behavior changes
> > would come out in dribs and drabs over time would make it more likely
> > for users to upgrade sooner, because staying current would be less
> > painful than getting too far behind -- though that's just a thought.
> >
> > But anyway, the way it is now, the major release seems to be too many
> > things: big marketing push, tons of new features, *and* the only
> > opportunity to make breaking changes.  A policy like snowflake's seems
> > so much healthier, methodical, and relaxed, allowing us to be selective
> > about when and how to release behavior changes, without it having to be
> > anything more than that.
> >
> > CalVer <https://calver.org/> may be a good option.
> >
> >
> > On Sat, Aug 26, 2023 at 11:22 PM Daniel Standish <
> > daniel.stand...@astronomer.io> wrote:
> >
> > > For whatever reason, I was reminded recently of snowflake's "behavior
> > > change" policy
> > >
> > > See here:
> > > https://docs.snowflake.com/en/release-notes/behavior-change-policy
> > >
> > > I think semver is problematic for core because basically you cannot
> > > implement behavior changes until the "mythical" major release.  The
> > > next major always seems very far off.  Indeed some have suggested that
> > > 3.0 might never happen (looking at you @potiuk :) ).
> > >
> > > Given the fact that it is so incredibly uncertain when exactly 3.0 will
> > > happen (and after that, subsequent major releases), it really makes the
> > > developer's job a lot harder.  Because you feel like you may never (or
> > > practically never) be able to make a change, or fix something, etc.
> > >
> > > What snowflake does is they release "behavior changes" (a.k.a. breaks
> > > with backward compatibility) in phases.  First is testing phase (users
> > > can enable optionally). Next is opt-out period (enabled by default but
> > > you can disable).  Final phase is general availability, when it's
> > > enabled and you can't disable it.
> > >
> > > Moving to something like this would give us a little more flexibility
> > > to evolve airflow incrementally over time, without putting so much
> > > pressure on those mythical major releases.  And it would not put so
> > > much pressure on individual feature development, because you would
> > > know that there's a reasonable path to changing things down the road.
> > >
> > > Because of the infrequency of our major releases, I really think for
> > > core, semver is just not working for us.  That's perhaps a bold claim,
> > > but that's how I feel.
> > >
> > > Discuss!
> > >
> > >
> > >
> > >
> > >
> >
>
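
The three-phase rollout described in the original message (testing, then
opt-out, then general availability) can be sketched as a small state check.
This is only an illustration under assumed names: Phase and is_enabled are
hypothetical and do not correspond to anything in Airflow or Snowflake.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    TESTING = "testing"   # off by default, users may opt in
    OPT_OUT = "opt_out"   # on by default, users may opt out
    GENERAL = "general"   # always on, the user setting is ignored


def is_enabled(phase: Phase, user_setting: Optional[bool] = None) -> bool:
    """Decide whether a behavior change is active for one user.

    user_setting is None when the user has expressed no preference.
    """
    if phase is Phase.GENERAL:
        return True
    default = phase is Phase.OPT_OUT
    return default if user_setting is None else user_setting


# The same user preference gives different results as the phase advances.
for phase in Phase:
    print(phase.value, is_enabled(phase, user_setting=False))
```

The point of the sketch is that the breaking change ships once but becomes
mandatory only in the last phase, so users get two releases' worth of
warning instead of one big major-version jump.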
