Bug#884069: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems
Tested on the most important system (IBM BladeCenter HS22). It boots OK, but InfiniBand doesn't work (the interface is up, but RDMA is not):

[ 147.690952] ko2iblnd: disagrees about version of symbol rdma_resolve_addr
[ 147.690956] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
[ 147.691009] ko2iblnd: disagrees about version of symbol rdma_reject
[ 147.691010] ko2iblnd: Unknown symbol rdma_reject (err -22)
[ 147.691021] ko2iblnd: disagrees about version of symbol rdma_disconnect
[ 147.691022] ko2iblnd: Unknown symbol rdma_disconnect (err -22)
[ 147.691061] ko2iblnd: disagrees about version of symbol rdma_resolve_route
[ 147.691062] ko2iblnd: Unknown symbol rdma_resolve_route (err -22)
[ 147.691071] ko2iblnd: disagrees about version of symbol rdma_bind_addr
[ 147.691072] ko2iblnd: Unknown symbol rdma_bind_addr (err -22)
[ 147.691079] ko2iblnd: disagrees about version of symbol rdma_create_qp
[ 147.691080] ko2iblnd: Unknown symbol rdma_create_qp (err -22)
[ 147.691098] ko2iblnd: disagrees about version of symbol rdma_create_id
[ 147.691099] ko2iblnd: Unknown symbol rdma_create_id (err -22)
[ 147.691113] ko2iblnd: disagrees about version of symbol rdma_notify
[ 147.691114] ko2iblnd: Unknown symbol rdma_notify (err -22)
[ 147.691125] ko2iblnd: disagrees about version of symbol rdma_listen
[ 147.691126] ko2iblnd: Unknown symbol rdma_listen (err -22)
[ 147.691132] ko2iblnd: disagrees about version of symbol rdma_destroy_qp
[ 147.691133] ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
[ 147.691181] ko2iblnd: disagrees about version of symbol rdma_set_reuseaddr
[ 147.691182] ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
[ 147.691189] ko2iblnd: disagrees about version of symbol rdma_connect
[ 147.691190] ko2iblnd: Unknown symbol rdma_connect (err -22)
[ 147.691225] ko2iblnd: disagrees about version of symbol rdma_destroy_id
[ 147.691226] ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
[ 147.691244] ko2iblnd: disagrees about version of symbol rdma_accept
[ 147.691245] ko2iblnd: Unknown symbol rdma_accept (err -22)

Regards
Rolandas Naujikas

On 2017.12.12 03:57, Ben Hutchings wrote:
> [This message is bcc'd to all bug reporters.]
>
> Apologies for this regression. Salvatore Bonaccorso has tracked down
> which change in 3.16-stable triggers the crash, and I identified some
> related upstream changes which appear to fix it. An updated package is
> available at:
>
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb
>
> There is a signed .changes file in the same directory that you can use
> to authenticate it.
>
> Please report back (to the bug address) whether this fixes the
> regression for you.
>
> If you need i386 packages, let me know and I will upload them too.
>
> Ben.
>
> --
> Ben Hutchings
> Unix is many things to many people,
> but it's never been everything to anybody.
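
For anyone else testing the package: the checksum half of authenticating the download against the .changes file could be sketched roughly as below. This is only an illustration, not something from the quoted message. It assumes the .changes file carries the usual Checksums-Sha256 field, and that its OpenPGP signature has already been verified separately (for example with gpg --verify); without that step a matching checksum proves nothing. The function names are hypothetical.

```python
import hashlib
import re


def parse_sha256_entries(changes_text):
    """Return {filename: sha256} from the Checksums-Sha256 field of a .changes file."""
    entries = {}
    in_field = False
    for line in changes_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
            continue
        if in_field:
            # Continuation lines of a control-file field start with a space.
            if not line.startswith(" "):
                in_field = False
                continue
            parts = line.split()
            if len(parts) == 3 and re.fullmatch(r"[0-9a-f]{64}", parts[0]):
                digest, _size, name = parts
                entries[name] = digest
    return entries


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large .deb is not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_changes(deb_path, deb_name, changes_text):
    """True only if deb_name is listed in the .changes text and the hashes agree."""
    expected = parse_sha256_entries(changes_text).get(deb_name)
    return expected is not None and sha256_of_file(deb_path) == expected
```

Here deb_name is the file name as listed in the .changes file, and deb_path is wherever the download was saved locally.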
Bug#884069: workaround is to use nosmp
As a temporary workaround, you can boot the server with the nosmp kernel parameter, which gets the system up far enough to downgrade the kernel package.
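
Before downgrading, it may be worth confirming that the running system really was booted with the parameter. A minimal sketch, assuming a Linux system where the boot command line is readable from /proc/cmdline; the helper names are hypothetical:

```python
def has_kernel_param(cmdline, name):
    """True if `name` appears as a bare flag or as name=value on the command line."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def booted_with_nosmp(path="/proc/cmdline"):
    """Check the command line of the currently running kernel for nosmp."""
    with open(path) as f:
        return has_kernel_param(f.read(), "nosmp")
```

The match is done per whitespace-separated token, so a parameter that merely begins with the same letters is not mistaken for nosmp.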
Bug#884069: the same problem on different hardware
Loading Linux 3.16.0-4-amd64 ...
Loading initial ramdisk ...
[0.349165] general protection fault: [#1] SMP
[0.352000] Modules linked in:
[0.352000] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 3.16.0-4-amd64 #1 Debian 3.16.51-2
[0.352000] Hardware name: Sun Microsystems Sun Fire X2200 M2 /S39 , BIOS S39_3B27 02/04/2009
[0.352000] task: 88021b4dd2d0 ti: 88021b4f task.ti: 88021b4f
[0.352000] RIP: 0010:[] [] build_sched_domains+0x72d/0xcf0
[0.352000] RSP: :88021b4f3df8 EFLAGS: 00010216
[0.352000] RAX: RBX: RCX: 0002
[0.352000] RDX: 00016cc8 RSI: RDI: 0200
[0.352000] RBP: 88021b5d1d98 R08: 88021b5d0760 R09: 00f4
[0.352000] R10: R11: 88021b4f3b06 R12: 88021b5d0740
[0.352000] R13: 0200 R14: 88021b58cac0 R15: 0200
[0.352000] FS: () GS:880223c0() knlGS:
[0.352000] CS: 0010 DS: ES: CR0: 8005003b
[0.352000] CR2: 880323fff000 CR3: 01813000 CR4: 07f0
[0.352000] Stack:
[0.352000] 8802 88021b5d0758 88021b5d1d00 88021b58cac0
[0.352000] 880323418e00
[0.352000] f1c8
[0.352000] Call Trace:
[0.352000] [] ? sched_init_smp+0x398/0x452
[0.352000] [] ? mutex_lock+0xe/0x2a
[0.352000] [] ? put_online_cpus+0x23/0x80
[0.352000] [] ? stop_machine+0x2c/0x40
[0.352000] [] ? kernel_init_freeable+0xdd/0x1e1
[0.352000] [] ? rest_init+0x80/0x80
[0.352000] [] ? kernel_init+0xa/0xf0
[0.352000] [] ? ret_from_fork+0x58/0x90
[0.352000] [] ? rest_init+0x80/0x80
[0.352000] Code: c0 0f 85 46 05 00 00 48 8b 74 24 08 48 c7 c2 00 dd a6 81 bf ff ff ff ff e8 91 78 21 00 48 98 49 8b 56 10 48 8b 04 c5 a0 1e 8e 81 <48> 8b 14 10 b8 01 00 00 00 49 89 54 24 10 f0 0f c1 02 85 c0 75
[0.352000] RIP [] build_sched_domains+0x72d/0xcf0
[0.352000] RSP
[0.352006] ---[ end trace 68d23c2290c77ca9 ]---
[0.356017] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.356017]
[0.36] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.36]
Bug#884069: system cannot boot after upgrade to Debian 8.10
Severity: critical
Bug#884069: Kernel crash on boot on IBM BladeCenter HS22
Package: linux-image-3.16.0-4-amd64
Version: 3.16.51-2

Loading Linux 3.16.0-4-amd64 ...
Loading initial ramdisk ...
[0.604128] general protection fault: [#1] SMP
[0.609303] Modules linked in:
[0.612493] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 3.16.0-4-amd64 #1 Debian 3.16.51-2
[0.621685] Hardware name: IBM BladeCenter HS22 -[7870SGX]-/68Y8077, BIOS -[P9E163AUS-1.24]- 09/17/2014
[0.631141] task: 8803731392d0 ti: 88037313c000 task.ti: 88037313c000
[0.638684] RIP: 0010:[] [] build_sched_domains+0x72d/0xcf0
[0.647609] RSP: :88037313fdf8 EFLAGS: 00010216
[0.652982] RAX: RBX: RCX: 0012
[0.660175] RDX: 00016f48 RSI: RDI: 0200
[0.667372] RBP: 880372f5d698 R08: 880372f5e0e0 R09: 0139
[0.674566] R10: R11: 88037313fb06 R12: 880372f5e0c0
[0.681761] R13: 0200 R14: 880672e640c0 R15: 0200
[0.688958] FS: () GS:88037fc0() knlGS:
[0.697110] CS: 0010 DS: ES: CR0: 8005003b
[0.702914] CR2: 88067000 CR3: 01813000 CR4: 07f0
[0.710109] Stack:
[0.712183] 8806 880372f5e0d8 880372f5d600 880672e640c0
[0.719940] 880372f53e00
[0.727715] f1c8
[0.735501] Call Trace:
[0.738022] [] ? sched_init_smp+0x398/0x452
[0.743930] [] ? mutex_lock+0xe/0x2a
[0.749229] [] ? put_online_cpus+0x23/0x80
[0.755050] [] ? stop_machine+0x2c/0x40
[0.760618] [] ? kernel_init_freeable+0xdd/0x1e1
[0.766963] [] ? rest_init+0x80/0x80
[0.772264] [] ? kernel_init+0xa/0xf0
[0.777652] [] ? ret_from_fork+0x58/0x90
[0.783298] [] ? rest_init+0x80/0x80
[0.788595] Code: c0 0f 85 46 05 00 00 48 8b 74 24 08 48 c7 c2 00 dd a6 81 bf ff ff ff ff e8 91 78 21 00 48 98 49 8b 56 10 48 8b 04 c5 a0 1e 8e 81 <48> 8b 14 10 b8 01 00 00 00 49 89 54 24 10 f0 0f c1 02 85 c0 75
[0.812501] RIP [] build_sched_domains+0x72d/0xcf0
[0.819101] RSP
[0.822668] ---[ end trace b6ea7a8f78a6ba93 ]---
[0.827375] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.827375]
[0.836621] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.836621]