** Description changed:

+ [Impact]
+
+ At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk';
+ a snippet of the trace is below, and the full panic dump is attached. The
+ panic dump was collected via serial console, as the kernel panics so
+ early that we cannot kdump it.
+
+ [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
+ [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
+ [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
+ [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270
+
+ [Test Case]
+
+ At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk';
+ a snippet of the trace is below, and the full panic dump is attached.
+
+ [Regression Potential]
+
+ * Fix implemented upstream starting with v4.6-rc1
+
+ * The fix is fairly straightforward given the stack trace and the
+ upstream fix.
+
+ * The fix is hard to verify, but user "Proton" was able to confirm
+ that the test kernel including the fix solves this particular problem.
+
+
+ [Other Info]
+
+ * https://lkml.org/lkml/2016/3/16/40
+ * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9
+ * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7
+
+ [Original Description]
We discovered a pretty serious regression introduced in 4.4.0-18.

At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk';
a snippet of the trace is below, and the full panic dump is attached. The
panic dump was collected via serial console, as the kernel panics so
early that we cannot kdump it.

[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
[ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
[ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
[ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270

This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but
the trace is not identical.

We discovered this issue when we were experimenting with
linux-generic-lts-xenial from trusty-updates on a 14.04 installation.
When we installed it, 4.4.0-15 was the current package, and it worked
fine and provided a large number of improvements for us.

Background security updates then installed 4.4.0-18, which updated grub
and became the default kernel. On a reboot, the node panics about 2
seconds in, leaving the machine in a dead state.

We were able to boot a rescue image and roll back to 4.4.0-15, which
works nicely. We currently pin 4.4.0-15 to prevent this problem from
coming back, but would prefer to see the problem fixed.

I'll attach lspci, lshw, and dmidecode for our hardware as well, but
this is happening on pretty vanilla Supermicro nodes. We are able to
consistently reproduce it on our hardware. It is not reproducible in
EC2, only on metal.
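
For anyone comparing this trace against the one in the lkml thread, the frame lines quoted above can be split into their fields mechanically. The following is only a minimal sketch: it assumes the "[ time] [<address>] symbol+0xoffset/0xsize" layout seen in the snippet (other kernel versions print frames differently), and the names Frame, FRAME_RE and parse_frame are hypothetical helpers, not part of any kernel tooling.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    """One call-trace frame (hypothetical helper type for this sketch)."""
    timestamp: float  # seconds since boot, from the leading "[ 2.650512]"
    address: int      # return address inside the function
    symbol: str       # function name
    offset: int       # offset of the address from the start of the function
    size: int         # total size of the function in bytes


# Assumed layout: "[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160"
FRAME_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a frame line."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(
        timestamp=float(m["ts"]),
        address=int(m["addr"], 16),
        symbol=m["sym"],
        offset=int(m["off"], 16),
        size=int(m["size"], 16),
    )


# The four frames quoted in this bug report.
TRACE = """\
[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
[ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
[ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
[ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270
"""

frames = [f for f in map(parse_frame, TRACE.splitlines()) if f is not None]

for f in frames:
    # address - offset gives the start address of the function
    print(f"{f.symbol}: start={f.address - f.offset:#x} offset={f.offset:#x}/{f.size:#x}")
```

Comparing only the symbol names in order (innermost frame first) is usually enough to tell whether two traces follow the same call path, since addresses and offsets differ between kernel builds.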

** Description changed:

[Impact]

At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk';
a snippet of the trace is below, and the full panic dump is attached. The
panic dump was collected via serial console, as the kernel panics so
early that we cannot kdump it.

[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
[ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
[ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
[ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270

[Test Case]

At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk';
a snippet of the trace is below, and the full panic dump is attached.

[Regression Potential]

- * Fix implemented upstream starting with v4.6-rc1
+ * Fix implemented upstream starting with v4.6-rc1

- * The fix is fairly straightforward given the stack trace and the
+ * The fix is fairly straightforward given the stack trace and the
upstream fix.

- * The fix is hard to verify, but user "Proton" was able to confirm
- that the test kernel including the fix solves this particular problem.
-
+ * The fix is hard to verify, but user "Proton" was able to confirm that the test kernel including the fix solves this particular problem:
+ https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23

[Other Info]

- * https://lkml.org/lkml/2016/3/16/40
- * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9
- * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7
+ * https://lkml.org/lkml/2016/3/16/40
+ * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9
+ * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7

[Original Description]
We discovered a pretty serious regression introduced in 4.4.0-18.

At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk';
a snippet of the trace is below, and the full panic dump is attached.
The panic dump was collected via serial console, as the kernel panics so
early that we cannot kdump it.

[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
[ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
[ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
[ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270

This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but
the trace is not identical.

We discovered this issue when we were experimenting with
linux-generic-lts-xenial from trusty-updates on a 14.04 installation.
When we installed it, 4.4.0-15 was the current package, and it worked
fine and provided a large number of improvements for us.

Background security updates then installed 4.4.0-18, which updated grub
and became the default kernel. On a reboot, the node panics about 2
seconds in, leaving the machine in a dead state.

We were able to boot a rescue image and roll back to 4.4.0-15, which
works nicely. We currently pin 4.4.0-15 to prevent this problem from
coming back, but would prefer to see the problem fixed.

I'll attach lspci, lshw, and dmidecode for our hardware as well, but
this is happening on pretty vanilla Supermicro nodes. We are able to
consistently reproduce it on our hardware. It is not reproducible in
EC2, only on metal.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1572630

Title:
  boot-time kernel panic introduced in 4.4.0-18, not present in 4.4.0-15
