Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
On Mon, 2017-12-18 at 18:40, Thomas Patrick Downes wrote:
> Is there going to be any kind of post-mortem analysis of how this
> happened?

I think it would be good to do this, but it doesn't seem to be something that Debian does as a matter of course, and I'm unlikely to be the right person to lead such an analysis. It might be worth proposing this to the release team (debian-release mailing list).

[...]
> It would also be helpful if you clearly stated whether this (a)
> affects all multi-socket systems and (b) whether it affected any
> single-socket systems. Between the sample bias of bug reports
> themselves and the “fog of war” neither conclusion is clear.

I'm afraid I still don't have a deep enough understanding of the bug to say for sure. Given that 'numa=off' appears to be a workaround, I suspect that it is triggered by multiple NUMA nodes. That would imply that older multi-socket systems with a shared memory controller would not be affected, while some single-socket systems with multiple memory controllers would be affected.

Ben.

-- 
Ben Hutchings
I say we take off; nuke the site from orbit. It's the only way to be sure.
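If the suspicion above is right, the number of NUMA nodes the kernel sees is a better predictor than the socket count. The following is a minimal sketch, not something posted in this thread, for counting those nodes on a Linux system. It assumes the kernel exposes one `nodeN` directory per node under `/sys/devices/system/node`; the function name `count_numa_nodes` is made up for illustration. Run it under a kernel that boots normally and without `numa=off`, since that option is expected to hide the extra nodes.

```python
import re
from pathlib import Path


def count_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Return how many NUMA nodes the running kernel exposes in sysfs.

    Assumption: each node appears as a directory named node0, node1, ...
    Returns 0 if the directory is missing (no NUMA support, or not Linux).
    """
    base = Path(sysfs_node_dir)
    if not base.is_dir():
        return 0
    return sum(
        1
        for entry in base.iterdir()
        if entry.is_dir() and re.fullmatch(r"node\d+", entry.name)
    )


if __name__ == "__main__":
    nodes = count_numa_nodes()
    if nodes > 1:
        print(f"{nodes} NUMA nodes: possibly affected, if the bug needs multiple nodes")
    else:
        print(f"{nodes} NUMA node(s) visible")
```

A result above 1 only means the system matches the suspected trigger; it does not confirm that the system is affected.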
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
Is there going to be any kind of post-mortem analysis of how this happened? The changelogs indicate that this entered oldstable-proposed-updates around 4 December. I’m not quite sure when it entered oldstable-updates or if it ever formally entered oldstable-updates prior to being incorporated into the 8.10 release on 9 December. I would not call 5 days an “extended testing period” as promised by the proposed-updates mechanism.

https://www.debian.org/releases/proposed-updates.html

It would also be helpful if you clearly stated whether this (a) affects all multi-socket systems and (b) whether it affected any single-socket systems. Between the sample bias of bug reports themselves and the “fog of war” neither conclusion is clear.

Yours,

-- 
Tom Downes
Senior Scientist
Center for Gravitation, Cosmology and Astrophysics
414.229.2678
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
Thank you. The update has just been applied and my systems are now up and running again with no UUID error.
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
Dear Ben,

Thank you, the fix works for me: both Sun Fires (X2200M2 and X4200M2) boot with 3.16.51-3~a.test (2017-12-11).

Best wishes,
Elemér
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
Hi,

* Ben Hutchings [171214 11:37]:
> Apologies for this regression. Salvatore Bonaccorso has tracked down
> which change in 3.16-stable triggers the crash, and I identified some
> related upstream changes which appear to fix it. An updated package is
> available at:
>
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb

We just ran into this same issue inside Proxmox VE 5.1-38 on a KVM guest with 2 sockets and NUMA enabled. I can confirm that the test kernel makes the guest boot again.

Many thanks,
Chris
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
On Tue, 12 Dec 2017 01:57:48, Ben Hutchings wrote:
> [This message is bcc'd to all bug reporters.]
>
> Apologies for this regression. Salvatore Bonaccorso has tracked down
> which change in 3.16-stable triggers the crash, and I identified some
> related upstream changes which appear to fix it. An updated package is
> available at:
>
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb
>
> There is a signed .changes file in the same directory that you can use
> to authenticate it.
>
> Please report back (to the bug address) whether this fixes the
> regression for you.
>
> If you need i386 packages, let me know and I will upload them too.
>
> Ben.
>
> --
> Ben Hutchings
> Unix is many things to many people,
> but it's never been everything to anybody.

The fix worked for me, thanks!
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
On 12/12/17 02:57, Ben Hutchings wrote:
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb
>
> Please report back (to the bug address) whether this fixes the
> regression for you.

Fixes the problem on our servers. Thanks!

Mike.
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
On 12.12.2017 at 02:57, Ben Hutchings wrote:

Hi Ben,

> Apologies for this regression. Salvatore Bonaccorso has tracked down
> which change in 3.16-stable triggers the crash, and I identified some
> related upstream changes which appear to fix it. An updated package is
> available at:
>
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb
>
> There is a signed .changes file in the same directory that you can use
> to authenticate it.
>
> Please report back (to the bug address) whether this fixes the
> regression for you.

Fixes the regression on an HP DL380 Gen9. Thanks for following up.

Bernhard
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
Hi Ben,

Ben Hutchings wrote:
> An updated package is available at:
>
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb

I can also confirm that this build works fine on my problematic machines.

Thanks for the fix!

Karsten
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
3.16.51-3~a.test also works on my previously problematic box.
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
On Tue, 12 Dec 2017 01:57:48, Ben Hutchings wrote:
> [This message is bcc'd to all bug reporters.]
>
> Apologies for this regression. Salvatore Bonaccorso has tracked down
> which change in 3.16-stable triggers the crash, and I identified some
> related upstream changes which appear to fix it. An updated package is
> available at:
>
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb
>
> There is a signed .changes file in the same directory that you can use
> to authenticate it.
>
> Please report back (to the bug address) whether this fixes the
> regression for you.
>
> If you need i386 packages, let me know and I will upload them too.
>
> Ben.
>
> --
> Ben Hutchings
> Unix is many things to many people,
> but it's never been everything to anybody.

It worked for me (on a Dell PowerEdge R630); I'm now able to boot without using maxcpus=1, nosmp or numa=off.

Thanks for everyone's work, by the way!

Thomas
Bug#883938: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
[This message is bcc'd to all bug reporters.]

Apologies for this regression. Salvatore Bonaccorso has tracked down which change in 3.16-stable triggers the crash, and I identified some related upstream changes which appear to fix it. An updated package is available at:

https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb

There is a signed .changes file in the same directory that you can use to authenticate it.

Please report back (to the bug address) whether this fixes the regression for you.

If you need i386 packages, let me know and I will upload them too.

Ben.

-- 
Ben Hutchings
Unix is many things to many people,
but it's never been everything to anybody.
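For anyone unsure how a .changes file authenticates a package, here is a minimal sketch of the checksum half of that check. It assumes the usual Debian layout, where the .changes file carries a `Checksums-Sha256` field whose indented lines read `<sha256> <size> <filename>`. The function names `sha256_entries` and `file_matches_changes` are made up for illustration. The comparison means nothing until the signature on the .changes file has been verified against a key you trust, for example with `gpg --verify`; this sketch does not do that step.

```python
import hashlib
from pathlib import Path


def sha256_entries(changes_text):
    """Map file name -> expected SHA-256 from the Checksums-Sha256 field."""
    entries = {}
    in_field = False
    for line in changes_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
            continue
        if in_field:
            if line[:1] not in (" ", "\t"):
                # Continuation lines are indented; anything else ends the field.
                in_field = False
                continue
            parts = line.split()
            if len(parts) == 3:
                digest, _size, name = parts
                entries[name] = digest
    return entries


def file_matches_changes(deb_path, changes_path):
    """Return True if the file's SHA-256 equals the one listed in .changes.

    Assumption: the signature on the .changes file was already verified
    separately; this function only compares hashes.
    """
    expected = sha256_entries(Path(changes_path).read_text())
    name = Path(deb_path).name
    if name not in expected:
        return False
    actual = hashlib.sha256(Path(deb_path).read_bytes()).hexdigest()
    return actual == expected[name]
```

Call `file_matches_changes()` with the local paths of the downloaded .deb and .changes files. It returns False both when the hash differs and when the .deb is not listed at all.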