On Thu, Feb 16, 2023 at 2:22 PM HAGIO KAZUHITO(萩尾 一仁) <k-hagio...@nec.com> wrote:
> On 2023/02/16 13:05, lijiang wrote: > >>> On Thu, Feb 16, 2023 at 8:48 AM HAGIO KAZUHITO(萩尾 一仁) < > k-hagio...@nec.com <mailto:k-hagio...@nec.com>> wrote: > >>> > >>> On 2023/02/15 17:24, Lianbo Jiang wrote: > >>> > Kernel commit 7d65f4a65532 ("irq: Consolidate do_softirq() arch > overriden > >>> > implementations") renamed the call_softirq to > do_softirq_own_stack. > >>> > Crash may incorrectly output the eframe stack with a warning as > below: > >>> > > >>> > crash> foreach bt > >>> > ... > >>> > PID: 0 TASK: ffff914f820a8000 CPU: 25 COMMAND: > "swapper/25" > >>> > #0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763 > >>> > #1 [fffffe0000504e50] nmi_handle at ffffffffa662a423 > >>> > #2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9 > >>> > #3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f > >>> > #4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8 > >>> > [exception RIP: clone_endio+172] > >>> > RIP: ffffffffc005c1ec RSP: ffffa1d403d08e98 RFLAGS: > 00000246 > >>> > RAX: 0000000000000000 RBX: ffff915326fba230 RCX: > 0000000000000018 > >>> > RDX: ffffffffc0075400 RSI: 0000000000000000 RDI: > ffff915326fba230 > >>> > RBP: ffff915326fba1c0 R8: 0000000000001000 R9: > ffff915308d6d2a0 > >>> > R10: 000000a97dfe5e10 R11: ffffa1d40038fe98 R12: > ffff915302babc40 > >>> > R13: ffff914f94360000 R14: 0000000000000000 R15: > 0000000000000000 > >>> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > >>> > --- <NMI exception stack> --- > >>> > #5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec > [dm_mod] > >>> > #6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954 > >>> > #7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968 > >>> > #8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e > >>> > #9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95 > >>> > #10 [ffffa1d403d08fa0] __softirqentry_text_start at > ffffffffa72000dc > >>> > #11 [ffffa1d403d08ff0] do_softirq_own_stack at > ffffffffa7000f9a > >>> > --- <IRQ stack> --- > >>> > #12 [ffffa1d40038fe70] do_softirq_own_stack at > ffffffffa7000f9a > >>> > [exception RIP: unknown or invalid address] > >>> > RIP: 0000000000000000 RSP: 0000000000000000 RFLAGS: > 00000000 > >>> > RAX: ffffffffa672eae5 RBX: ffffffffa83b34e0 RCX: > ffffffffa672eb12 > >>> > RDX: 0000000000000010 RSI: 8b7d6c8869010c00 RDI: > 0000000000000085 > >>> > RBP: 0000000000000286 R8: ffff914f820a8000 R9: > ffffffffa67a94e0 > >>> > R10: 0000000000000286 R11: ffffffffa66fb4c5 R12: > ffffffffa67a898b > >>> > R13: 0000000000000000 R14: fffffffffffffff8 R15: > ffffffffa67a1e68 > >>> > ORIG_RAX: 0000000000000000 CS: 0000 SS: ffffffffa672edff > >>> > bt: WARNING: possibly bogus exception frame > >>> > #13 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c > >>> > #14 [ffffa1d40038ff50] secondary_startup_64_no_verify at > ffffffffa6600116 > >>> > ... > >>> > >>> Thank you for the patch. Could I have "bt -c 25" output with the > patch > >>> and "bt -r -c 25 | tail -n 40" output for checking? > >>> > >>> Sure. > > Thank you, looks better. Applied with Georges' suggestion and a few > commit log changes. > > https://github.com/crash-utility/crash/commit/e0e6e4a7ee03b3d00b50a9e4db2f2ea6f7da0da3 > > Looks good. > The irq_eframe_link calculated on RHEL8 also may be wrong, but > Those kernel commits are not introduced: [1] v5.8: 931b94145981 ("x86/entry: Provide helpers for executing on the irqstack") [2] v5.8: fa5e5c409213 ("x86/entry: Use idtentry for interrupts") [3] v5.12: 52d743f3b712 ("x86/softirq: Remove indirection in do_softirq_own_stack()") I tested it(based on 452) and didn't get any errors: crash> bt PID: 0 TASK: ffff9ed24128a800 CPU: 3 COMMAND: "swapper/3" #0 [ffffb7c300db8b80] machine_kexec at ffffffff8f46b063 #1 [ffffb7c300db8bd8] __crash_kexec at ffffffff8f5b357a #2 [ffffb7c300db8c98] panic at ffffffff8f4f50ff #3 [ffffb7c300db8d18] sysrq_handle_crash at ffffffff8f9f75e1 #4 [ffffb7c300db8d20] __handle_sysrq.cold.13 at ffffffff8f9f7f04 #5 [ffffb7c300db8d50] sysrq_filter at ffffffff8f9f7cbd #6 [ffffb7c300db8d88] input_to_handler at ffffffff8fb3780f #7 [ffffb7c300db8dc0] input_pass_values at ffffffff8fb388ab #8 [ffffb7c300db8de8] input_handle_event at ffffffff8fb3a9e8 #9 [ffffb7c300db8e18] input_event at ffffffff8fb3ae9f #10 [ffffb7c300db8e50] atkbd_interrupt at ffffffff8fb413a3 #11 [ffffb7c300db8ea8] serio_interrupt at ffffffff8fb33a86 #12 [ffffb7c300db8ed8] i8042_interrupt at ffffffff8fb35134 #13 [ffffb7c300db8f18] __handle_irq_event_percpu at ffffffff8f569d70 #14 [ffffb7c300db8f58] handle_irq_event_percpu at ffffffff8f569ef0 #15 [ffffb7c300db8f80] handle_irq_event at ffffffff8f569f76 #16 [ffffb7c300db8fa0] handle_edge_irq at ffffffff8f56e3d2 #17 [ffffb7c300db8fb8] handle_irq at ffffffff8f42924c #18 [ffffb7c300db8fc0] do_IRQ at ffffffff8fe01e89 --- <IRQ stack> --- #19 [ffffb7c300cf3de8] ret_from_intr at ffffffff8fe00a8f [exception RIP: native_safe_halt+14] RIP: ffffffff8fdf7eae RSP: ffffb7c300cf3e98 RFLAGS: 00000206 RAX: ffffffff8fdf7d40 RBX: 0000000000000003 RCX: 0000000000000001 RDX: 0000000000929b82 RSI: 0000000000000083 RDI: 0000000000000003 RBP: 0000000000000003 R8: 0000b7ac3793def2 R9: 000000000004f515 R10: 0000000000000000 R11: 0000000000000019 R12: 0000000000000000 R13: 0000000000000000 R14: ffffffffffffffff R15: ffff9ed24128a800 ORIG_RAX: ffffffffffffffdb CS: 0010 SS: 0018 #20 [ffffb7c300cf3e98] default_idle at ffffffff8fdf7d4a #21 [ffffb7c300cf3ea0] default_idle_call at ffffffff8fdf8034 #22 [ffffb7c300cf3ec0] do_idle at ffffffff8f52eb63 #23 [ffffb7c300cf3f10] cpu_startup_entry at ffffffff8f52edff #24 [ffffb7c300cf3f30] start_secondary at ffffffff8f45fa2c #25 [ffffb7c300cf3f50] secondary_startup_64_no_verify at ffffffff8f400116 crash> log ... [161537.153436] Call Trace: [161537.161011] <IRQ> [161537.162223] dump_stack+0x41/0x60 [161537.166883] panic+0xe7/0x2ac [161537.168950] ? printk+0x58/0x73 [161537.170718] sysrq_handle_crash+0x11/0x20 [161537.177654] __handle_sysrq.cold.13+0x48/0xff [161537.180096] sysrq_filter+0x32d/0x3f0 [161537.182079] input_to_handler+0x4f/0xf0 [161537.185589] input_pass_values.part.7+0x11b/0x140 [161537.188160] input_handle_event+0x138/0x5a0 [161537.190729] input_event+0x4f/0x80 [161537.192653] atkbd_interrupt+0x623/0x6b0 [161537.195040] serio_interrupt+0x46/0x90 [161537.201579] i8042_interrupt+0x154/0x240 [161537.203750] __handle_irq_event_percpu+0x40/0x190 [161537.208421] handle_irq_event_percpu+0x30/0x80 [161537.210869] handle_irq_event+0x36/0x57 [161537.212946] handle_edge_irq+0x82/0x190 [161537.215345] handle_irq+0x1c/0x30 [161537.217128] do_IRQ+0x49/0xd0 [161537.219876] common_interrupt+0xf/0xf [161537.221828] </IRQ> [161537.222944] RIP: 0010:native_safe_halt+0xe/0x20 [161537.225239] Code: 00 f0 80 48 02 20 48 8b 00 a8 08 75 c0 e9 79 ff ff ff 90 9 0 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 96 22 41 00 fb f4 <e9> 8d 95 2 0 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 e9 07 00 00 [161537.234417] RSP: 0018:ffffb7c300cf3e98 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffdb [161537.238201] RAX: ffffffff8fdf7d40 RBX: 0000000000000003 RCX: 0000000000000001 [161537.241756] RDX: 0000000000929b82 RSI: 0000000000000083 RDI: 0000000000000003 [161537.245308] RBP: 0000000000000003 R08: 0000b7ac3793def2 R09: 000000000004f515 [161537.248868] R10: 0000000000000000 R11: 0000000000000019 R12: 0000000000000000 [161537.252420] R13: 0000000000000000 R14: ffffffffffffffff R15: ffff9ed24128a800 [161537.255983] ? tdx_safe_halt+0x80/0x80 [161537.259858] default_idle+0xa/0x20 [161537.261677] default_idle_call+0x44/0xf0 [161537.263719] do_idle+0x213/0x2d0 [161537.265694] cpu_startup_entry+0x6f/0x80 [161537.267750] start_secondary+0x18c/0x1d0 [161537.270058] secondary_startup_64_no_verify+0xd1/0xdb ... crash> help -m |grep link irq_eframe_link: 40 I have a plan of changing that part and will revisit with it. > > Sounds good. Thanks. Lianbo > Thanks, > Kazu > > >>> > >>> crash> bt -c 25 > >>> PID: 0 TASK: ffff914f820a8000 CPU: 25 COMMAND: "swapper/25" > >>> #0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763 > >>> #1 [fffffe0000504e50] nmi_handle at ffffffffa662a423 > >>> #2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9 > >>> #3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f > >>> #4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8 > >>> [exception RIP: clone_endio+172] > >>> RIP: ffffffffc005c1ec RSP: ffffa1d403d08e98 RFLAGS: 00000246 > >>> RAX: 0000000000000000 RBX: ffff915326fba230 RCX: 0000000000000018 > >>> RDX: ffffffffc0075400 RSI: 0000000000000000 RDI: ffff915326fba230 > >>> RBP: ffff915326fba1c0 R8: 0000000000001000 R9: ffff915308d6d2a0 > >>> R10: 000000a97dfe5e10 R11: ffffa1d40038fe98 R12: ffff915302babc40 > >>> R13: ffff914f94360000 R14: 0000000000000000 R15: 0000000000000000 > >>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > >>> --- <NMI exception stack> --- > >>> #5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec [dm_mod] > >>> #6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954 > >>> #7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968 > >>> #8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e > >>> #9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95 > >>> #10 [ffffa1d403d08fa0] __softirqentry_text_start at ffffffffa72000dc > >>> #11 [ffffa1d403d08ff0] do_softirq_own_stack at ffffffffa7000f9a > >>> --- <IRQ stack> --- > >>> #12 [ffffa1d40038fe70] update_ts_time_stats at ffffffffa67a1e68 > >>> #13 [ffffa1d40038fea0] do_softirq at ffffffffa66fb4c5 > >>> #14 [ffffa1d40038feb0] flush_smp_call_function_queue at > ffffffffa67a94e0 > >>> #15 [ffffa1d40038fec0] do_idle at ffffffffa672eae5 > >>> #16 [ffffa1d40038ff10] cpu_startup_entry at ffffffffa672edff > >>> #17 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c > >>> #18 [ffffa1d40038ff50] secondary_startup_64_no_verify at > ffffffffa6600116 > >>> crash> > >>> crash> bt -r -c 25 | tail -n 40 > >>> ffffa1d40038fd80: ffff9156ff26afc0 0000002027c3e7a1 > >>> ffffa1d40038fd90: __next_timer_interrupt+166 ffff9156ff25aa68 > >>> ffffa1d40038fda0: 00000000fffd85da 0000000000000082 > >>> ffffa1d40038fdb0: 0000000000000082 hrtimer_get_next_event+78 > >>> ffffa1d40038fdc0: ffff9156ff25aa40 000000200cd15980 > >>> ffffa1d40038fdd0: __hrtimer_next_event_base+192 ffff9156ff25cf40 > >>> ffffa1d40038fde0: 7fffffffffffffff ffff9156ff25d580 > >>> ffffa1d40038fdf0: 0000000000000082 ffffffffffffffff > >>> ffffa1d40038fe00: 0000000000000082 hrtimer_next_event_without+96 > >>> ffffa1d40038fe10: ffff9156ff25d580 000000200cff2040 > >>> ffffa1d40038fe20: 000000200cdb406e ffff9153052f8c00 > >>> ffffa1d40038fe30: sched_clock+5 sched_clock_cpu+12 > >>> ffffa1d40038fe40: acpi_idle_driver+136 ffff9153052f8c00 > >>> ffffa1d40038fe50: cpuidle_enter_state+181 000000203501afdb > >>> ffffa1d40038fe60: 00ffffffa82c44c0 read_tsc > >>> ffffa1d40038fe70: update_ts_time_stats+88 fffffffffffffff8 > >>> ffffa1d40038fe80: 0000000000000000 __flush_smp_call_function_queue+219 > >>> ffffa1d40038fe90: 0000000000000286 __cpu_online_mask > >>> ffffa1d40038fea0: do_softirq+69 0000000000000286 > >>> ffffa1d40038feb0: flush_smp_call_function_queue+96 ffff914f820a8000 > >>> ffffa1d40038fec0: do_idle+405 do_idle+450 > >>> ffffa1d40038fed0: 0000000000000010 8b7d6c8869010c00 > >>> ffffa1d40038fee0: 0000000000000085 0000000000000000 > >>> ffffa1d40038fef0: 0000000000000000 0000000000000000 > >>> ffffa1d40038ff00: 0000000000000000 0000000000000000 > >>> ffffa1d40038ff10: cpu_startup_entry+111 8b7d6c8869010c00 > >>> ffffa1d40038ff20: 534e004fc6e00600 ffffa1d40038ff38 > >>> ffffa1d40038ff30: start_secondary+396 534e004fc6e00600 > >>> ffffa1d40038ff40: 0000000000000000 0000000040000000 > >>> ffffa1d40038ff50: secondary_startup_64_no_verify+209 0000000000000000 > >>> ffffa1d40038ff60: 0000000000000000 0000000000000000 > >>> ffffa1d40038ff70: 0000000000000000 0000000000000000 > >>> ffffa1d40038ff80: 0000000000000000 0000000000000000 > >>> ffffa1d40038ff90: 0000000000000000 0000000000000000 > >>> ffffa1d40038ffa0: 0000000000000000 0000000000000000 > >>> ffffa1d40038ffb0: 0000000000000000 0000000000000000 > >>> ffffa1d40038ffc0: 0000000000000000 0000000000000000 > >>> ffffa1d40038ffd0: 0000000000000000 0000000000000000 > >>> ffffa1d40038ffe0: 0000000000000000 0000000000000000 > >>> ffffa1d40038fff0: 0000000000000000 0000000000000000 > >>> crash> > >>> Thanks. > >>> Lianbo > >>> > >>> Thanks, > >>> Kazu > >>> > >>> > > >>> > Also no exception frame when coming from do_softirq_own_stack. > >>> > > >>> > Reported-by: Marco Patalano <mpata...@redhat.com <mailto: > mpata...@redhat.com>> > >>> > Signed-off-by: Lianbo Jiang <liji...@redhat.com <mailto: > liji...@redhat.com>> > >>> > --- > >>> > x86_64.c | 5 +++-- > >>> > 1 file changed, 3 insertions(+), 2 deletions(-) > >>> > > >>> > diff --git a/x86_64.c b/x86_64.c > >>> > index 5b671bd97775..3428bed417df 100644 > >>> > --- a/x86_64.c > >>> > +++ b/x86_64.c > >>> > @@ -3825,10 +3825,11 @@ in_exception_stack: > >>> > up -= 1; > >>> > bt->instptr = *up; > >>> > /* > >>> > - * No exception frame when coming from > call_softirq. > >>> > + * No exception frame when coming from > call_softirq > >>> > + * or do_softirq_own_stack. > >>> > */ > >>> > if ((sp = value_search(bt->instptr, &offset)) && > >>> > - STREQ(sp->name, "call_softirq")) > >>> > + (STREQ(sp->name, "call_softirq") || > STREQ(sp->name, "do_softirq_own_stack"))) > >>> > irq_eframe = 0; > >>> > bt->frameptr = 0; > >>> > done = FALSE; > >>>
-- Crash-utility mailing list Crash-utility@redhat.com https://listman.redhat.com/mailman/listinfo/crash-utility Contribution Guidelines: https://github.com/crash-utility/crash/wiki