On 2/16/26 11:58, Matt Coster wrote:
> On 16/02/2026 10:11, Thorsten Leemhuis wrote:
> 
> We're currently trying to force this issue to reproduce on hardware we
> have on hand; we'd like to see it fixed properly as much as anyone.

Yeah, no worries, I never doubted that. But getting things properly fixed
can mean "revert, fix, reapply" when it comes to regressions in Linux --
which should not be seen as something bad, as Linus said himself (see
below)!

> From our side at least, I don't believe this is a regression at all.

In the end what matters is: some change afaics caused systems that used to
work to stop working -- that makes it a regression by the Linux kernel's
standards. And by the same standards, regressions must be fixed, ideally
quickly. Below are a few quotes from Linus that explain this better.

Ciao, Thorsten
---


On how quickly regressions should be fixed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* From `2026-01-22 
<https://lore.kernel.org/all/CAHk-=wheqniw_wthgo7bkkt7uib-p+ai2jp9m+z+fycz6ca...@mail.gmail.com/>`_::

    But a user complaining should basically result in an immediate fix -
    possibly a "revert and rethink".

  With a later clarification on `2026-01-28 
<https://lore.kernel.org/all/cahk-%3dwi86aosxs66-yi54%2bmpqjpu0upxb8zafg%[email protected]/>`_::

    It's also worth noting that "immediate" obviously doesn't mean "right
    this *second* when the problem has been reported".

    But if it's a regression with a known commit that caused it, I think
    the rule of thumb should generally be "within a week", preferably
    before the next rc.

* From `2023-04-21 
<https://lore.kernel.org/all/CAHk-=wgD98pmSK3ZyHk_d9kZ2bhgN6DuNZMAJaV0WTtbkf=r...@mail.gmail.com/>`_::

    Known-broken commits either
     (a) get a timely fix that doesn't have other questions
    or
     (b) get reverted

* From `2021-09-20(2) 
<https://lore.kernel.org/all/CAHk-=wgovmtrw1tnbmc1rn5yqytkyn0hz+sc4k0dgnn++u9...@mail.gmail.com/>`_::

    [...] review shouldn't hold up reported regressions of existing code. That's
    just basic _testing_ - either the fix should be applied, or - if the fix is
    too invasive or too ugly - the problematic source of the regression should
    be reverted.

    Review should be about new code, it shouldn't be holding up "there's a
    bug report, here's the obvious fix".

* From `2023-05-08 
<https://lore.kernel.org/all/CAHk-=wgzU8_dGn0Yg+DyX7ammTkDUCyEJ4C=nvnhrhxkwc7...@mail.gmail.com/>`_::

    If something doesn't even build, it should damn well be fixed ASAP.


On how fixing regressions with reverts can help prevent maintainer burnout
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* From `2026-01-28 
<https://lore.kernel.org/all/cahk-%3dwi86aosxs66-yi54%2bmpqjpu0upxb8zafg%[email protected]/>`_::

    > So how can I/we make "immediate fixes" happen more often without
    > contributing to maintainer burnout?

    [...] the "revert and rethink" model [...] often a good idea in general
    unless there's just an obvious fix for an obvious bug [...]

    Exactly so that maintainers don't get stressed out over having a pending
    problem report that people keep pestering them about.

    I think people are sometimes a bit too bought into whatever changes
    they made, and reverting is seen as "too drastic", but I think it's
    often the quick and easy solution for when there isn't some obvious
    response to a regression report.


On why the "no regressions" rule exists
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* From `2026-01-22 
<https://lore.kernel.org/all/CAHk-=wheqniw_wthgo7bkkt7uib-p+ai2jp9m+z+fycz6ca...@mail.gmail.com/>`_::

    But the basic rule is: be so good about backwards compatibility that
    users never have to worry about upgrading. They should absolutely feel
    confident that any kernel-reported problem will either be solved, or
    have an easy solution that is appropriate for *them* (ie a
    non-technical user shouldn't be expected to be able to do a lot).

    Because the last thing we want is people holding back from trying new
    kernels.

* From `2024-05-28 
<https://lore.kernel.org/all/CAHk-=wgtb7y-beh7tpdvdwru7zkq8-kmjz53tsk37zsppdw...@mail.gmail.com/>`_::

    I introduced that "no regressions" rule something like two decades
    ago, because people need to be able to update their kernel without
    fear of something they relied on suddenly stopping to work.

* From `2018-08-03 
<https://lore.kernel.org/all/CA+55aFwWZX=cxmwdtkdgb36kf12xmtehmqjbimpcqcrg2hi...@mail.gmail.com/>`_::

    The whole point of "we do not regress" is so that people can upgrade
    the kernel and never have to worry about it.

    [...]

    Because the only thing that matters IS THE USER.

* From `2017-10-26(1) 
<https://lore.kernel.org/lkml/ca+55afxw7nmamvyhkvz1upbutujewrt6yb51qax5rtrwowj...@mail.gmail.com/>`_::

    If the kernel used to work for you, the rule is that it continues to work
    for you.

    [...]

    People should basically always feel like they can update their kernel
    and simply not have to worry about it.

    I refuse to introduce "you can only update the kernel if you also
    update that other program" kind of limitations. If the kernel used to
    work for you, the rule is that it continues to work for you.


On exceptions to the "no regressions" rule
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* From `2026-01-22 
<https://lore.kernel.org/all/CAHk-=wheqniw_wthgo7bkkt7uib-p+ai2jp9m+z+fycz6ca...@mail.gmail.com/>`_::

    There are _very_ few exceptions to that rule, the main one being "the
    problem was a fundamental huge and gaping security issue and we *had* to
    make that change, and we couldn't even make your limited use-case just
    continue to work".

    The other exception is "the problem was reported years after it was
    introduced, and now most people rely on the new behavior".

    [...]

    Now, if it's one or two users and you can just get them to recompile,
    that's one thing. Niche hardware and odd use-cases can sometimes be
    solved that way, and regressions can sometimes be fixed by handholding
    every single reporter if the reporter is willing and able to change
    his or her workflow.

* From `2023-04-20 
<https://lore.kernel.org/all/CAHk-=wis_qqy4odnynnki5b7qhosmxtoj1jxo5wmb6sruwq...@mail.gmail.com/>`_::

    And yes, I do consider "regression in an earlier release" to be a
    regression that needs fixing.

    There's obviously a time limit: if that "regression in an earlier
    release" was a year or more ago, and just took forever for people to
    notice, and it had semantic changes that now mean that fixing the
    regression could cause a _new_ regression, then that can cause me to
    go "Oh, now the new semantics are what we have to live with".

* From `2021-09-20(3) 
<https://lore.kernel.org/all/CAHk-=wi7db2sj-wngvvsj7ak2cm556q8437soxo4ejt2bwp...@mail.gmail.com/>`_::

    Yes, we have situations where even regressions don't matter - like
    major security issues that simply cannot be fixed other ways, because
    the regression _was_ the security hole.

* From `2017-10-26(2) 
<https://lore.kernel.org/lkml/ca+55afxw7nmamvyhkvz1upbutujewrt6yb51qax5rtrwowj...@mail.gmail.com/>`_::

    There have been exceptions, but they are few and far between, and they
    generally have some major and fundamental reasons for having happened,
    that were basically entirely unavoidable, and people _tried_hard_ to
    avoid them. Maybe we can't practically support the hardware any more
    after it is decades old and nobody uses it with modern kernels any
    more. Maybe there's a serious security issue with how we did things,
    and people actually depended on that fundamentally broken model. Maybe
    there was some fundamental other breakage that just _had_ to have a
    flag day for very core and fundamental reasons.


On accepting when a regression occurred
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* From `2026-01-22 
<https://lore.kernel.org/all/CAHk-=wheqniw_wthgo7bkkt7uib-p+ai2jp9m+z+fycz6ca...@mail.gmail.com/>`_::

    But starting to argue about users reporting breaking changes is
    basically the final line for me. I have a couple of people that I have
    in my spam block-list and refuse to have anything to do with, and they
    have generally been about exactly that.

    Note how it's not about making mistakes and _causing_ the regression.
    That's normal. That's development. But then arguing about it is a
    no-no.

* From `2024-06-23 
<https://lore.kernel.org/all/CAHk-=wi_KMO_rJ6OCr8mAWBRg-irziM=t9wxgc+j1vvoqb3...@mail.gmail.com/>`_::

    We don't introduce regressions and then blame others.

    There's a very clear rule in kernel development: things that break
    other things ARE NOT FIXES.

    EVER.

    They get reverted, or the thing they broke gets fixed.

* From `2021-06-05 
<https://lore.kernel.org/all/CAHk-=wiuvqhn76yuwhkjzzwtdjmmjf_zn4+u7vejjmegh3r...@mail.gmail.com/>`_::

    THERE ARE NO VALID ARGUMENTS FOR REGRESSIONS.

    Honestly, security people need to understand that "not working" is not
    a success case of security. It's a failure case.

    Yes, "not working" may be secure. But security in that case is *pointless*.

* From `2017-10-26(5) 
<https://lore.kernel.org/lkml/CA+55aFwiiQYJ+YoLKCXjN_beDVfu38mg=ggg5lfocqhe8qi...@mail.gmail.com/>`_::

    [...] when regressions *do* occur, we admit to them and fix them, instead of
    blaming user space.

    The fact that you have apparently been denying the regression now for
    three weeks means that I will revert, and I will stop pulling apparmor
    requests until the people involved understand how kernel development
    is done.


On back-and-forth
~~~~~~~~~~~~~~~~~

* From `2024-05-28 
<https://lore.kernel.org/all/CAHk-=wgtb7y-beh7tpdvdwru7zkq8-kmjz53tsk37zsppdw...@mail.gmail.com/>`_::

    The "no regressions" rule is that we do not introduce NEW bugs.

    It *literally* came about because we had an endless dance of "fix two
    bugs, introduce one new one", and that then resulted in a system that
    you cannot TRUST.

* From `2021-09-20(1) 
<https://lore.kernel.org/all/CAHk-=wi7db2sj-wngvvsj7ak2cm556q8437soxo4ejt2bwp...@mail.gmail.com/>`_::


    And the thing that makes regressions special is that back when I
    wasn't so strict about these things, we'd end up in endless "seesaw
    situations" where somebody would fix something, it would break
    something else, then that something else would break, and it would
    never actually converge on anything reliable at all.

* From `2015-08-13 
<https://lore.kernel.org/all/ca+55afxk8-bsikwr_s-c+4g6wihkpqvmle34h9wozpeua6w...@mail.gmail.com/>`_::

    The strict policy of no regressions actually originally started mainly wrt
    suspend/resume issues, where the "fix one machine, break another" kind of
    back-and-forth caused endless problems, and meant that we didn't actually
    necessarily make any forward progress, just moving a problem around.


On regressions caused by bugfixes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* From `2018-08-03 
<https://lore.kernel.org/all/CA+55aFwWZX=cxmwdtkdgb36kf12xmtehmqjbimpcqcrg2hi...@mail.gmail.com/>`_::

    > Kernel had a bug which has been fixed

    That is *ENTIRELY* immaterial.

    Guys, whether something was buggy or not DOES NOT MATTER.

    [...]

    It's basically saying "I took something that worked, and I broke it,
    but now it's better". Do you not see how f*cking insane that statement
    is?
