Bug#884069: RFT: Candidate fix for boot failure of Debian 8.10 on various x86 systems

2017-12-11 Thread Rolandas Naujikas
Tested on our most important system (IBM BladeCenter HS22).
It boots OK, but InfiniBand doesn't work (the interface is up, but RDMA is not):

[  147.690952] ko2iblnd: disagrees about version of symbol rdma_resolve_addr
[  147.690956] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
[  147.691009] ko2iblnd: disagrees about version of symbol rdma_reject
[  147.691010] ko2iblnd: Unknown symbol rdma_reject (err -22)
[  147.691021] ko2iblnd: disagrees about version of symbol rdma_disconnect
[  147.691022] ko2iblnd: Unknown symbol rdma_disconnect (err -22)
[  147.691061] ko2iblnd: disagrees about version of symbol rdma_resolve_route
[  147.691062] ko2iblnd: Unknown symbol rdma_resolve_route (err -22)
[  147.691071] ko2iblnd: disagrees about version of symbol rdma_bind_addr
[  147.691072] ko2iblnd: Unknown symbol rdma_bind_addr (err -22)
[  147.691079] ko2iblnd: disagrees about version of symbol rdma_create_qp
[  147.691080] ko2iblnd: Unknown symbol rdma_create_qp (err -22)
[  147.691098] ko2iblnd: disagrees about version of symbol rdma_create_id
[  147.691099] ko2iblnd: Unknown symbol rdma_create_id (err -22)
[  147.691113] ko2iblnd: disagrees about version of symbol rdma_notify
[  147.691114] ko2iblnd: Unknown symbol rdma_notify (err -22)
[  147.691125] ko2iblnd: disagrees about version of symbol rdma_listen
[  147.691126] ko2iblnd: Unknown symbol rdma_listen (err -22)
[  147.691132] ko2iblnd: disagrees about version of symbol rdma_destroy_qp
[  147.691133] ko2iblnd: Unknown symbol rdma_destroy_qp (err -22)
[  147.691181] ko2iblnd: disagrees about version of symbol rdma_set_reuseaddr
[  147.691182] ko2iblnd: Unknown symbol rdma_set_reuseaddr (err -22)
[  147.691189] ko2iblnd: disagrees about version of symbol rdma_connect
[  147.691190] ko2iblnd: Unknown symbol rdma_connect (err -22)
[  147.691225] ko2iblnd: disagrees about version of symbol rdma_destroy_id
[  147.691226] ko2iblnd: Unknown symbol rdma_destroy_id (err -22)
[  147.691244] ko2iblnd: disagrees about version of symbol rdma_accept
[  147.691245] ko2iblnd: Unknown symbol rdma_accept (err -22)
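
The "disagrees about version of symbol" messages above are what the kernel prints when a module was built against symbol versions that differ from those the running kernel exports. To see at a glance which symbols are affected, the log can be reduced to a list. The following is only a minimal sketch: the function name `mismatched_symbols` and the sample text are illustrative, and it assumes the log may contain lines wrapped by the mail client.

```python
import re

# Matches "<module>: disagrees about version of symbol <symbol>".
MISMATCH = re.compile(r"(\S+): disagrees about version of symbol\s+(\S+)")

def mismatched_symbols(log_text):
    """Return {module: sorted list of symbols} for symbol version mismatches."""
    # Collapse all whitespace so a symbol name that was wrapped onto the
    # next line by the mailer is still matched.
    flat = " ".join(log_text.split())
    found = {}
    for module, symbol in MISMATCH.findall(flat):
        found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}

# Illustrative sample taken from the log above (one wrapped line included).
sample = """\
[  147.690952] ko2iblnd: disagrees about version of symbol rdma_resolve_addr
[  147.690956] ko2iblnd: Unknown symbol rdma_resolve_addr (err -22)
[  147.691061] ko2iblnd: disagrees about version of symbol
rdma_resolve_route
"""
print(mismatched_symbols(sample))
```

Running it over the full `dmesg` output would show whether any module other than ko2iblnd is affected.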

Regards
Rolandas Naujikas

On 2017.12.12 03:57, Ben Hutchings wrote:
> [This message is bcc'd to all bug reporters.]
> 
> Apologies for this regression.  Salvatore Bonaccorso has tracked down
> which change in 3.16-stable triggers the crash, and I identified some
> related upstream changes which appear to fix it.  An updated package is
> available at:
> 
> https://people.debian.org/~benh/packages/jessie-pu/linux-image-3.16.0-4-amd64_3.16.51-3~a.test_amd64.deb
> 
> There is a signed .changes file in the same directory that you can use
> to authenticate it.
> 
> Please report back (to the bug address) whether this fixes the
> regression for you.
> 
> If you need i386 packages, let me know and I will upload them too.
> 
> Ben.
> 
> -- 
> Ben Hutchings
> Unix is many things to many people,
> but it's never been everything to anybody.
> 
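
The quoted message mentions a signed .changes file that can be used to authenticate the package. A Debian .changes file lists a SHA-256 digest for each file in its `Checksums-Sha256` field, so once the signature on the .changes file itself has been verified (for example with `gpg --verify`), the downloaded .deb can be compared against that digest. The sketch below covers only the checksum comparison; the helper names are hypothetical, and it is not a substitute for the signature check.

```python
import hashlib
import re

def sha256_entries(changes_text):
    """Parse the Checksums-Sha256 field into {filename: hex digest}."""
    entries = {}
    in_field = False
    for line in changes_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
            continue
        if in_field:
            # Continuation lines look like " <sha256> <size> <filename>".
            m = re.match(r"^ ([0-9a-f]{64}) (\d+) (\S+)$", line)
            if not m:
                in_field = False  # the next field has started
                continue
            entries[m.group(3)] = m.group(1)
    return entries

def file_sha256(path):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_changes(deb_path, deb_name, changes_text):
    """True if the file at deb_path has the digest listed for deb_name."""
    expected = sha256_entries(changes_text).get(deb_name)
    return expected is not None and expected == file_sha256(deb_path)
```

A mismatch or a missing entry returns `False`. A match is only meaningful if the signature on the .changes file was checked first.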





Bug#884069: workaround is to use nosmp

2017-12-11 Thread Rolandas Naujikas
As a workaround, you can boot with the nosmp kernel parameter to start the server temporarily, so that the kernel can be downgraded.
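
On a system booted through GRUB, one common way to pass the parameter for a single boot is to edit the menu entry and append `nosmp` to the `linux` line; that detail is an assumption about the setup and is not stated in the report. After booting, the presence of the parameter can be confirmed from `/proc/cmdline`. A minimal sketch, with hypothetical helper names:

```python
def has_kernel_param(cmdline, name):
    """True if `name` appears as a bare flag or as name=value."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False

def running_with(name, path="/proc/cmdline"):
    """Check the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return has_kernel_param(f.read(), name)
```

For example, `running_with("nosmp")` should return `True` on a server started with the workaround.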



Bug#884069: the same problem on different hardware

2017-12-10 Thread Rolandas Naujikas
Loading Linux 3.16.0-4-amd64 ...
Loading initial ramdisk ...
[0.349165] general protection fault:  [#1] SMP
[0.352000] Modules linked in:
[0.352000] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 3.16.0-4-amd64 #1 Debian 3.16.51-2
[0.352000] Hardware name: Sun Microsystems Sun Fire X2200 M2 /S39  , BIOS S39_3B27 02/04/2009
[0.352000] task: 88021b4dd2d0 ti: 88021b4f task.ti: 88021b4f
[0.352000] RIP: 0010:[]  [] build_sched_domains+0x72d/0xcf0
[0.352000] RSP: :88021b4f3df8  EFLAGS: 00010216
[0.352000] RAX:  RBX:  RCX: 0002
[0.352000] RDX: 00016cc8 RSI:  RDI: 0200
[0.352000] RBP: 88021b5d1d98 R08: 88021b5d0760 R09: 00f4
[0.352000] R10:  R11: 88021b4f3b06 R12: 88021b5d0740
[0.352000] R13: 0200 R14: 88021b58cac0 R15: 0200
[0.352000] FS:  () GS:880223c0() knlGS:
[0.352000] CS:  0010 DS:  ES:  CR0: 8005003b
[0.352000] CR2: 880323fff000 CR3: 01813000 CR4: 07f0
[0.352000] Stack:
[0.352000]  8802 88021b5d0758 88021b5d1d00 88021b58cac0
[0.352000]     880323418e00
[0.352000]   f1c8
[0.352000] Call Trace:
[0.352000]  [] ? sched_init_smp+0x398/0x452
[0.352000]  [] ? mutex_lock+0xe/0x2a
[0.352000]  [] ? put_online_cpus+0x23/0x80
[0.352000]  [] ? stop_machine+0x2c/0x40
[0.352000]  [] ? kernel_init_freeable+0xdd/0x1e1
[0.352000]  [] ? rest_init+0x80/0x80
[0.352000]  [] ? kernel_init+0xa/0xf0
[0.352000]  [] ? ret_from_fork+0x58/0x90
[0.352000]  [] ? rest_init+0x80/0x80
[0.352000] Code: c0 0f 85 46 05 00 00 48 8b 74 24 08 48 c7 c2 00 dd a6 81 bf ff ff ff ff e8 91 78 21 00 48 98 49 8b 56 10 48 8b 04 c5 a0 1e 8e 81 <48> 8b 14 10 b8 01 00 00 00 49 89 54 24 10 f0 0f c1 02 85 c0 75
[0.352000] RIP  [] build_sched_domains+0x72d/0xcf0
[0.352000]  RSP
[0.352006] ---[ end trace 68d23c2290c77ca9 ]---
[0.356017] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.356017]
[0.36] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.36]



Bug#884069: system cannot boot after upgrade to Debian 8.10

2017-12-10 Thread Rolandas Naujikas
Severity: critical



Bug#884069: Kernel crash on boot on IBM BladeCenter HS22

2017-12-10 Thread Rolandas Naujikas
Package: linux-image-3.16.0-4-amd64
Version: 3.16.51-2

Loading Linux 3.16.0-4-amd64 ...
Loading initial ramdisk ...
[0.604128] general protection fault:  [#1] SMP
[0.609303] Modules linked in:
[0.612493] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 3.16.0-4-amd64 #1 Debian 3.16.51-2
[0.621685] Hardware name: IBM BladeCenter HS22 -[7870SGX]-/68Y8077, BIOS -[P9E163AUS-1.24]- 09/17/2014
[0.631141] task: 8803731392d0 ti: 88037313c000 task.ti: 88037313c000
[0.638684] RIP: 0010:[]  [] build_sched_domains+0x72d/0xcf0
[0.647609] RSP: :88037313fdf8  EFLAGS: 00010216
[0.652982] RAX:  RBX:  RCX: 0012
[0.660175] RDX: 00016f48 RSI:  RDI: 0200
[0.667372] RBP: 880372f5d698 R08: 880372f5e0e0 R09: 0139
[0.674566] R10:  R11: 88037313fb06 R12: 880372f5e0c0
[0.681761] R13: 0200 R14: 880672e640c0 R15: 0200
[0.688958] FS:  () GS:88037fc0() knlGS:
[0.697110] CS:  0010 DS:  ES:  CR0: 8005003b
[0.702914] CR2: 88067000 CR3: 01813000 CR4: 07f0
[0.710109] Stack:
[0.712183]  8806 880372f5e0d8 880372f5d600 880672e640c0
[0.719940]     880372f53e00
[0.727715]   f1c8
[0.735501] Call Trace:
[0.738022]  [] ? sched_init_smp+0x398/0x452
[0.743930]  [] ? mutex_lock+0xe/0x2a
[0.749229]  [] ? put_online_cpus+0x23/0x80
[0.755050]  [] ? stop_machine+0x2c/0x40
[0.760618]  [] ? kernel_init_freeable+0xdd/0x1e1
[0.766963]  [] ? rest_init+0x80/0x80
[0.772264]  [] ? kernel_init+0xa/0xf0
[0.777652]  [] ? ret_from_fork+0x58/0x90
[0.783298]  [] ? rest_init+0x80/0x80
[0.788595] Code: c0 0f 85 46 05 00 00 48 8b 74 24 08 48 c7 c2 00 dd a6 81 bf ff ff ff ff e8 91 78 21 00 48 98 49 8b 56 10 48 8b 04 c5 a0 1e 8e 81 <48> 8b 14 10 b8 01 00 00 00 49 89 54 24 10 f0 0f c1 02 85 c0 75
[0.812501] RIP  [] build_sched_domains+0x72d/0xcf0
[0.819101]  RSP
[0.822668] ---[ end trace b6ea7a8f78a6ba93 ]---
[0.827375] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.827375]
[0.836621] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
[0.836621]
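
Both crash logs in this thread fault at the same location, `build_sched_domains+0x72d/0xcf0`, reached from `sched_init_smp`. When comparing reports from different hardware, it can help to extract just the faulting function and the call trace. The following is a rough sketch with hypothetical helper names; it assumes the `symbol+0xoffset/0xsize` notation seen in the logs above and tolerates lines wrapped by the mailer.

```python
import re

# Matches "symbol+0xoffset/0xsize" as printed in kernel oops output.
SYM = re.compile(r"([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

def _flatten(text):
    # Collapse whitespace so wrapped lines are rejoined.
    return " ".join(text.split())

def faulting_function(text):
    """Return (symbol, offset, size) for the first symbol after 'RIP', or None."""
    flat = _flatten(text)
    start = flat.find("RIP")
    if start < 0:
        return None
    m = SYM.search(flat, start)
    if not m:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

def call_trace(text):
    """Return the symbol names listed between 'Call Trace:' and 'Code:'."""
    flat = _flatten(text)
    start = flat.find("Call Trace:")
    if start < 0:
        return []
    end = flat.find("Code:", start)
    segment = flat[start:end] if end >= 0 else flat[start:]
    return [m.group(1) for m in SYM.finditer(segment)]
```

Two reports that yield the same `faulting_function` result, as the two logs here do, are likely to share a root cause, although that alone does not prove it.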