Public bug reported:

Found during boot testing of Noble linux-lowlatency-hwe-6.11 (6.11.0-1012.13~24.04.1) on TF amd-server.

Sample kernel warning message:

  WARNING: CPU: 0 PID: 1 at kernel/time/timer_migration.c:543 tmigr_requires_handle_remote+0x123/0x130
  Modules linked in:
  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-1012-lowlatency #13~24.04.1-Ubuntu
  Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 06/07/2018
  RIP: 0010:tmigr_requires_handle_remote+0x123/0x130
  Code: 65 48 2b 14 25 28 00 00 00 75 23 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d 31 d2 31 c9 31 f6 31 ff e9 c1 84 07 01 0f 0b eb ba <0f> 0b eb a9 e8 44 5d 06 01 0f 1f 40 00 90 90 90 90 90 90 90 90 90
  RSP: 0018:ffffa6f9c0003f30 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff8c899f026200 RCX: 7fffffffffffffff
  RDX: ffff8c8240100e00 RSI: 0000000000000002 RDI: 0000000000000000
  RBP: ffffa6f9c0003f68 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
  FS: 0000000000000000(0000) GS:ffff8c899f000000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff8cc1bfdff000 CR3: 0000002dcb83e000 CR4: 00000000003506f0
  Call Trace:
   ? show_regs+0x6c/0x80
   ? __warn+0x88/0x140
   ? tmigr_requires_handle_remote+0x123/0x130
   ? report_bug+0x182/0x1b0
   ? handle_bug+0x6e/0xb0
   ? exc_invalid_op+0x18/0x80
   ? asm_exc_invalid_op+0x1b/0x20
   ? tmigr_requires_handle_remote+0x123/0x130
   update_process_times+0x63/0xb0
   tick_periodic+0x2d/0x90
   tick_handle_periodic+0x25/0x80
   __sysvec_apic_timer_interrupt+0x59/0x130
   sysvec_apic_timer_interrupt+0x9b/0xc0
   asm_sysvec_apic_timer_interrupt+0x1b/0x20
  RIP: 0010:delay_halt_mwaitx+0x3c/0x50
  Code: 05 91 3f 60 64 48 05 00 60 00 00 0f 01 fa b8 ff ff ff ff b9 02 00 00 00 48 39 c6 48 0f 46 c6 48 89 c3 b8 f0 00 00 00 0f 01 fb <48> 8b 5d f8 c9 31 c0 31 d2 31 c9 31 f6 e9 22 53 09 00 66 90 90 90
  RSP: 0018:ffffa6f9c007bbf8 EFLAGS: 00000293
  RAX: 00000000000000f0 RBX: 0000000000005d93 RCX: 0000000000000002
  RDX: 0000000000000000 RSI: 0000000000005d93 RDI: 00000035e3527498
  RBP: ffffa6f9c007bc00 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000005d93
  R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000020
   ? srso_return_thunk+0x5/0x5f
   delay_halt.part.0+0x3e/0x70
   delay_halt+0x13/0x30
   __const_udelay+0x3d/0x50
   wakeup_secondary_cpu_via_init+0xed/0x2e0
   do_boot_cpu+0x1d1/0x200
   native_kick_ap+0x111/0x1d0
   arch_cpuhp_kick_ap_alive+0x15/0x20
   cpuhp_kick_ap_alive+0x55/0x90
   ? __pfx_cpuhp_kick_ap_alive+0x10/0x10
   cpuhp_invoke_callback+0x340/0x520
   __cpuhp_invoke_callback_range+0x80/0x100
   _cpu_up+0x10b/0x280
   cpu_up+0xe3/0x120
   cpuhp_bringup_mask+0x71/0xd0
   cpuhp_bringup_cpus_parallel+0x116/0x150
   ? __pfx_kernel_init+0x10/0x10
   bringup_nonboot_cpus+0x22/0x50
   smp_init+0x2a/0x90
   kernel_init_freeable+0x10b/0x210
   kernel_init+0x1b/0x200
   ret_from_fork+0x47/0x70
   ? __pfx_kernel_init+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
  ---[ end trace 0000000000000000 ]---

This issue can be reproduced with oracular/linux, at least with the same tmigr_group hierarchy, so it is likely to be observed on any Oracular derivatives or backports.

The kernel logs related to the topology of TF amd-server (and the resulting group hierarchy), where the issue was observed, are as follows:

  CPU topo: Max. logical packages: 2
  CPU topo: Max. logical dies: 2
  CPU topo: Max. dies per package: 1
  CPU topo: Max. threads per core: 2
  CPU topo: Num. cores per package: 16
  CPU topo: Num. threads per package: 32
  CPU topo: Allowing 64 present CPUs plus 0 hotplug CPUs

  smpboot: x86: Booting SMP configuration:
  .... node #0, CPUs: #1 #2 #3
  .... node #1, CPUs: #4 #5 #6 #7
  .... node #2, CPUs: #8 #9 #10 #11
  .... node #3, CPUs: #12 #13 #14 #15
  .... node #4, CPUs: #16 #17 #18 #19
  .... node #5, CPUs: #20 #21 #22 #23
  .... node #6, CPUs: #24 #25 #26 #27
  .... node #7, CPUs: #28 #29 #30 #31
  .... node #0, CPUs: #32 #33 #34 #35
  .... node #1, CPUs: #36 #37 #38 #39
  .... node #2, CPUs: #40 #41 #42 #43
  .... node #3, CPUs: #44 #45 #46 #47
  .... node #4, CPUs: #48 #49 #50 #51
  .... node #5, CPUs: #52 #53 #54 #55
  .... node #6, CPUs: #56 #57 #58 #59
  .... node #7, CPUs: #60 #61 #62 #63

  Timer migration: 2 hierarchy levels; 8 children per group; 1 crossnode level

The 2025.03.17 Oracular kernels (including derivatives and backports) include commit b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit") via the upstream stable patchset LP: #2100328, while commit 868c9037df62 ("timers/migration: Fix off-by-one root mis-connection") is not included. I've verified locally that with the fix-the-fix commit 868c9037df62, the issue disappears.

** Affects: ubuntu-kernel-tests
     Importance: Undecided
         Status: New

** Tags: oracular sru-20250317 ubuntu-boot

** Description changed:

  Found during boot testing of Noble linux-lowlatency-hwe-6.11 (6.11.0-1012.13~24.04.1) on TF amd-server.

** Tags added: sru-20250317

--
You received this bug notification because you are a member of Canonical Platform QA Team, which is subscribed to ubuntu-kernel-tests.
https://bugs.launchpad.net/bugs/2106022

Title:
  log_check/kernel_tainted failed with kernel warnings at kernel/time/timer_migration.c:543 on Oracular

Status in ubuntu-kernel-tests:
  New
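
For reference, the "Timer migration: 2 hierarchy levels; 8 children per group; 1 crossnode level" line in the report is consistent with the topology logs (64 CPUs spread over NUMA nodes #0 to #7). The Python sketch below reproduces that arithmetic. It assumes the level calculation performed by upstream tmigr_init() in kernel/time/timer_migration.c: the rounded-up log2 of the CPUs per node and of the node count, each divided (rounding up) by log2 of the children-per-group constant. That formula is recalled from upstream and has not been checked against the Ubuntu tree; the helper names are illustrative and are not kernel API.

```python
# Assumption: mirrors the level arithmetic of upstream tmigr_init();
# not verified against the Ubuntu 6.11 sources.
TMIGR_CHILDREN_PER_GROUP = 8  # "8 children per group" in the boot log


def order_base_2(n: int) -> int:
    """Smallest k such that 2**k >= n, for n >= 1."""
    return (n - 1).bit_length()


def div_round_up(a: int, b: int) -> int:
    """Integer division rounding towards positive infinity."""
    return -(-a // b)


def tmigr_levels(ncpus: int, nnodes: int) -> tuple[int, int]:
    """Return (hierarchy_levels, crossnode_level) for a given topology."""
    cpus_per_node = div_round_up(ncpus, nnodes)
    # ilog2(8) == 3: each level fans out by 8, i.e. consumes 3 bits.
    bits_per_level = TMIGR_CHILDREN_PER_GROUP.bit_length() - 1
    # Levels needed to hold the CPUs of one node.
    cpulvl = div_round_up(order_base_2(cpus_per_node), bits_per_level)
    # Extra levels needed to connect all nodes.
    nodelvl = div_round_up(order_base_2(nnodes), bits_per_level)
    return cpulvl + nodelvl, cpulvl


if __name__ == "__main__":
    # 64 present CPUs on 8 NUMA nodes, as on TF amd-server.
    print(tmigr_levels(64, 8))  # (2, 1)
```

With 64 CPUs on 8 nodes this gives one CPU level plus one node level, so the top (root) group sits at the crossnode level. A machine with a matching CPU and node count should end up with the same hierarchy shape, which may help when looking for other hosts that can reproduce the warning.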