Public bug reported:

Hello,

Every few days I get a kernel panic on my Ubuntu Server 20.10 box, which was recently upgraded to a Ryzen 3700X. I have 7 WD Red Pro HDDs in a RAID 6 array with Linux MD, all attached to an LSI 9211-8ik PCIe card. The motherboard is currently a Gigabyte B550M Aorus Pro. My Ubuntu install is running the latest 5.8.0-53 kernel.

This is the 2nd hardware configuration that produces the exact same kernel panic text. Previously I had these HDDs directly connected to the SATA controller of an ASRock X570 Pro4 ATX mobo with the same 3700X. I was also previously using Ubuntu Server 20.04 LTS -- I upgraded to 20.10 in the hope that the newer kernel would fix it, which it did not. I posted the whole story on Super User, if you're interested:
https://superuser.com/questions/1615400/md-raid-6-periodic-kernel-panic-possible-kernel-bug

However, I am now convinced this is a Linux kernel bug in the MD driver.

Example 1 kernel panic:
[406005.583315] BUG: stack guard page was hit at 000000007cbff150 (stack is 000000003b7072a2..00000000dac5ed08)
[406005.583315] kernel stack overflow (double-fault): 0000 [#1] SMP NOPTI
[406005.583315] CPU: 15 PID: 514 Comm: md0_raid6 Tainted: P OE 5.8.0-36-generic #40-Ubuntu
[406005.583316] Hardware name: Gigabyte Technology Co., Ltd. B550M AORUS PRO/B550M AORUS PRO, BIOS F1 05/19/2020
[406005.583316] RIP: 0010:slab_free_freelist_hook+0x35/0x120
[406005.583316] Code: 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 08 48 8b 02 4c 8b 36 48 c7 06 00 00 00 00 48 c7 02 00 00 00 00 48 85 c0 49 0f 44 c6 <48> 89 45 d0 eb 06 4c 3b 7d d0 74 5d 8b 53 20 4d 89 f7 49 8d 34 16
[406005.583316] RSP: 0018:ffffa620c06e3ff8 EFLAGS: 00010246
[406005.583317] RAX: ffff9aaf36f54720 RBX: ffff9ab34b407800 RCX: 0000000000000001
[406005.583317] RDX: ffffa620c06e4040 RSI: ffffa620c06e4038 RDI: ffff9ab34b407800
[406005.583317] RBP: ffffa620c06e4028 R08: 0000000000000001 R09: ffffffffb9c54500
[406005.583318] R10: ffff9aaf36f54fe0 R11: 0000000000000001 R12: ffffa620c06e4038
[406005.583318] R13: ffffa620c06e4040 R14: ffff9aaf36f54720 R15: ffff9ab2925cbd10
[406005.583318] FS: 0000000000000000(0000) GS:ffff9ab34edc0000(0000) knlGS:0000000000000000
[406005.583318] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[406005.583318] CR2: ffffa620c06e3fe8 CR3: 00000005d52ac000 CR4: 0000000000340ee0
[406005.583319] Call Trace:
[406005.583319] ? mempool_kfree+0xe/0x10
[406005.583319] ? kfree+0xb8/0x220
[406005.583319] ? mempool_kfree+0xe/0x10
[406005.583319] ? mempool_free+0x2f/0x80
[406005.583319] ? md_end_io+0x4b/0x70
[406005.583319] ? bio_endio+0xe6/0x150

Example 2 kernel panic with old mobo:
[161342.301305] BUG: stack guard page was hit at 00000000fc60f228 (stack is 00000000875efe77..000000003f38a379)
[161342.301306] kernel stack overflow (double-fault): 0000 [#1] SMP NOPTI
[161342.301306] CPU: 10 PID: 465 Comm: md0_raid6 Tainted: P OE 5.8.0-33-generic #36-Ubuntu
[161342.301307] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Pro4, BIOS P3.60 12/01/2020
[161342.301307] RIP: 0010:slab_free_freelist_hook+0x35/0x120
[161342.301308] Code: 89 d5 41 54 49 89 f4 53 48 89 fb 48 83 ec 08 48 8b 02 4c 8b 36 48 c7 06 00 00 00 00 48 c7 02 00 00 00 00 48 85 c0 49 0f 44 c6 <48> 89 45 d0 eb 06 4c 3b 7d d0 74 5d 8b 53 20 4d 89 f7 49 8d 34 16
[161342.301308] RSP: 0018:ffffa86b00c6fff8 EFLAGS: 00010246
[161342.301309] RAX: ffff98edc21cac40 RBX: ffff98ef0b407800 RCX: 0000000000000001
[161342.301310] RDX: ffffa86b00c70040 RSI: ffffa86b00c70038 RDI: ffff98ef0b407800
[161342.301310] RBP: ffffa86b00c70028 R08: 0000000000000001 R09: ffffffff85854500
[161342.301311] R10: ffff98edc21ca100 R11: 0000000000000001 R12: ffffa86b00c70038
[161342.301311] R13: ffffa86b00c70040 R14: ffff98edc21cac40 R15: ffff98e9b53d74d8
[161342.301311] FS: 0000000000000000(0000) GS:ffff98ef0ec80000(0000) knlGS:0000000000000000
[161342.301312] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[161342.301312] CR2: ffffa86b00c6ffe8 CR3: 00000007fa766000 CR4: 0000000000340ee0
[161342.301312] Call Trace:
[161342.301313] ? mempool_kfree+0xe/0x10
[161342.301313] ? kfree+0xb8/0x220
[161342.301313] ? mempool_kfree+0xe/0x10
[161342.301313] ? mempool_free+0x2f/0x80
[161342.301314] ? md_end_io+0x4b/0x70
[161342.301314] ? bio_endio+0xe6/0x150
[161342.301314] ? bio_chain_endio+0x2d/0x40
[161342.301315] ? md_end_io+0x5d/0x70
[161342.301315] ? bio_endio+0xe6/0x150
[161342.301315] ? bio_chain_endio+0x2d/0x40
[161342.301315] ? md_end_io+0x5d/0x70
[161342.301316] ? bio_endio+0xe6/0x150
[161342.301316] ? bio_chain_endio+0x2d/0x40
[161342.301316] ? md_end_io+0x5d/0x70
[161342.301316] ? bio_endio+0xe6/0x150
[161342.301317] ? bio_chain_endio+0x2d/0x40
[161342.301317] ? md_end_io+0x5d/0x70
[161342.301317] ? bio_endio+0xe6/0x150
[161342.301317] ? bio_chain_endio+0x2d/0x40
...
[161342.301379] ? md_end_io+0x5d/0x70
[161342.301379] ? bio_endio+0xe6/0x150
[161342.301380] ? bio_chain_endio+0x2d/0x40
[161342.301380] ? md_end_io+0x5d/0x70
[161342.301380] ? bio_endio+0xe6/0x150
[161342.301380] ? bio_ch
[161342.301381] Lost 296 message(s)!
[ 0.000000] Linux version 5.8.0-33-generic (buildd@lgw01-amd64-036) (gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0, GNU ld (GNU Binutils for Ubuntu) 2.35.1) #36-Ubuntu SMP Wed Dec 9 09:14:40 UTC 2020 (Ubuntu 5.8.0-33.36-generic 5.8.17)

I can provide newer kernel panics or other info if needed.

Thanks!

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: mdadm 4.1-5ubuntu5
ProcVersionSignature: Ubuntu 5.8.0-53.60-generic 5.8.18
Uname: Linux 5.8.0-53-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu50.5
Architecture: amd64
CasperMD5CheckResult: pass
Date: Tue May 25 12:11:44 2021
InstallationDate: Installed on 2020-11-23 (182 days ago)
InstallationMedia: Ubuntu-Server 20.10 "Groovy Gorilla" - Release amd64 (20201022)
MachineType: Gigabyte Technology Co., Ltd. B550M AORUS PRO
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.8.0-53-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro console=tty1 console=ttyS0,115200 processor.max_cstate=5 rcu_nocbs=0-15
SourcePackage: mdadm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/19/2020
dmi.bios.release: 5.17
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F1
dmi.board.asset.tag: Default string
dmi.board.name: B550M AORUS PRO
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF1:bd05/19/2020:br5.17:svnGigabyteTechnologyCo.,Ltd.:pnB550MAORUSPRO:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnB550MAORUSPRO:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: B550M AORUS PRO
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Gigabyte Technology Co., Ltd.
etc.blkid.tab: Error: [Errno 2] No such file or directory: '/etc/blkid.tab'
mtime.conffile..etc.apport.crashdb.conf: 2020-11-24T13:52:10.563946

** Affects: mdadm (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug groovy uec-images

-- 
https://bugs.launchpad.net/bugs/1929591

Title:
  MD RAID 6 Periodic Kernel Panic Stack Overflow Double-Fault
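
For anyone triaging: the call traces in this report repeat the same three frames (md_end_io, bio_endio, bio_chain_endio) until the stack guard page is hit. Below is a rough sketch of how the repetition could be tallied from a pasted dmesg excerpt. The function name count_frames and the regular expression are illustrative assumptions based only on the trace format shown in this report; they are not part of any kernel or mdadm tooling.

```python
import re
from collections import Counter

# Matches one call-trace entry such as "? md_end_io+0x5d/0x70" and captures
# the symbol name. The pattern is an assumption derived from the traces
# pasted in this report, not a general dmesg parser.
FRAME_RE = re.compile(r"\?\s+([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def count_frames(trace_text):
    """Return a Counter mapping symbol name -> number of trace entries."""
    return Counter(FRAME_RE.findall(trace_text))


# Excerpt copied from the start of the Example 2 call trace in this report.
SAMPLE = """
[161342.301312] Call Trace:
[161342.301313] ? mempool_kfree+0xe/0x10
[161342.301313] ? kfree+0xb8/0x220
[161342.301313] ? mempool_kfree+0xe/0x10
[161342.301313] ? mempool_free+0x2f/0x80
[161342.301314] ? md_end_io+0x4b/0x70
[161342.301314] ? bio_endio+0xe6/0x150
[161342.301314] ? bio_chain_endio+0x2d/0x40
[161342.301315] ? md_end_io+0x5d/0x70
[161342.301315] ? bio_endio+0xe6/0x150
[161342.301315] ? bio_chain_endio+0x2d/0x40
"""

if __name__ == "__main__":
    # Print the most frequent frames first.
    for symbol, hits in count_frames(SAMPLE).most_common():
        print(f"{hits:4d}  {symbol}")
```

Run against a complete trace, a count dominated by those three symbols would be consistent with the completion handlers calling each other recursively, though the count alone does not identify the root cause.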
