[Crash-utility] Re: [PATCH] x86_64: rhel 9.3: "crash" doesn't handle all IRQ exception properly

Tao Liu Mon, 15 Apr 2024 18:21:03 -0700

Hi Aureau,

On Mon, Apr 15, 2024 at 11:37 PM Aureau, Georges (Kernel Tools ERT)
<georges.aur...@hpe.com> wrote:
>
> Hello Tao Liu, and Kazu,
>
> Thanks Kazu for your ack.
>
> Indeed, Tao Liu, you are right, we could factor the case 0 (most likely case) 
> into the try[] loop but this would
> make the code not as readable. I would rather keep my original patch, where 
> we leave case 0 as an explicit call
> (most likely case) outside of the try[] loop.


OK, thanks for the explanation, I agree with your thoughts. One more
question which I still not very clear:

-               machdep->machspec->irq_eframe_link -= 8;

Why don't we update irq_eframe_link after the code change?

Thanks,
Tao Liu


>
> Thanks,
> Georges
>
> -----Original Message-----
> From: Tao Liu <l...@redhat.com>
> Sent: Monday, April 15, 2024 2:33 PM
> To: Aureau, Georges (Kernel Tools ERT) <georges.aur...@hpe.com>
> Cc: Lianbo Jiang <liji...@redhat.com>; HAGIO KAZUHITO(萩尾 一仁) 
> <k-hagio...@nec.com>; devel@lists.crash-utility.osci.io
> Subject: Re: [Crash-utility] Re: [PATCH] x86_64: rhel 9.3: "crash" doesn't 
> handle all IRQ exception properly
>
> Hi Aureau,
>
> On Mon, Apr 15, 2024 at 4:48 PM HAGIO KAZUHITO(萩尾　一仁)
> <k-hagio...@nec.com> wrote:
> >
> > On 2024/04/10 22:47, Aureau, Georges (Kernel Tools ERT) wrote:
> > > Hello Lianbo,
> > >
> > > In my case, the fix you suggested works properly as I've got a single CPU 
> > > with an IRQ exception in a SYSVEC handler.
> > > But it we have multiple CPUs with different IRQ exceptions, then
> > > your suggestion does not work as
> > > x86_64_irq_eframe_link() is changing irq_eframe_link, for example:
> > >
> > > x86_64_irq_eframe_link_init():
> > >      set default irq_eframe_link=-24 since we have
> > > sysvec_apic_timer_interrupt
> > >
> > > Then if we do "bt -c cpu" on several cpu X, Y, X, being of their 
> > > irq_stack:
> > >
> > > x86_64_irq_eframe_link()/cpu X/rip:sysvec_apic_timer_interrupt:
> > >      default irq_eframe_link=-24 is matching 
> > > -framesize(rip:sysvec_apic_timer_interrupt)
> > >      so we are ok
> > >
> > > x86_64_irq_eframe_link()/cpu Y/rip:common_interrupt:
> > >      -24 does not work as framesize(common_interrupt)=32
> > >      we try -24-8 = -32, it works, and we set irq_eframe_link=-32
> > >
> > > x86_64_irq_eframe_link()/cpu Z/rip:sysvec_apic_timer_interrupt:
> > >      -32 does not work as framesize(rip:sysvec_apic_timer_interrupt)=24
> > >      we try -32-8 = -40, it does not work, and we are back to square one !
> > >      here we should try  -32+8 = -24
> > >
> > > We would need to have "bt -c cpu" doing runtime per cpu selection between 
> > > -32 and -24 depending on framesize(*(pcpu_hot.hardirq_stack_ptr-8)).
> > > So we can't really see "irq_eframe_link" as being a value valid for all 
> > > cpu's / IRQ.
> > > The "irq_eframe_link" should be per cpu, depending of what each cpu is 
> > > executing (more below).
> > >
> > > Currently x86_64_irq_eframe_link() tries "stkref", then it tries
> > > "stkref-irq_eframe_link" (with irq_eframe_link set in 
> > > irq_eframe_link_init()), and then it tries "stkref-irq_eframe_link+8":
> > >
> > > static ulong
> > > x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE *ofp)
> > > {
> > >          ulong irq_eframe;
> > >
> > >          if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt, ofp))
> > >                  return stkref;
> > >
> > >          irq_eframe = stkref - machdep->machspec->irq_eframe_link;
> > >
> > >          if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt, 
> > > ofp))
> > >                  return irq_eframe;
> > >
> > >          if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt, 
> > > ofp)) {
> > >                  machdep->machspec->irq_eframe_link -= 8;
> > >                  return (irq_eframe + 8);
> > >          }
> > >
> > >          return irq_eframe;
> > > }
> > >
> > >
> > > I don't have the history as why we try "stkref-irq_eframe_link+8" ? Any 
> > > idea ?
> > >
> > > But this current design with defaulting irq_eframe_link to -32 (or -24 as 
> > > per your suggestion) does not always work.
> > >
> > > We would need to handle both irq_stack rip's from IRQ functions  (eg. 
> > > common_interrupt):
> > >
> > > #define DEFINE_IDTENTRY_IRQ(func)                                       \
> > > static void __##func(struct pt_regs *regs, u32 vector);                 \
> > >                                                                          \
> > > __visible noinstr void func(struct pt_regs *regs,                       \
> > >                              unsigned long error_code)                   \
> > > {                                                                       \
> > >          irqentry_state_t state = irqentry_enter(regs);                  \
> > >          u32 vector = (u32)(u8)error_code;                               \
> > >                                                                          \
> > >          instrumentation_begin();                                        \
> > >          kvm_set_cpu_l1tf_flush_l1d();                                   \
> > >          run_irq_on_irqstack_cond(__##func, regs, vector);               \
> > >          instrumentation_end();                                          \
> > >          irqentry_exit(regs, state);                                     \
> > > }                                                                       \
> > >
> > > \ static noinline void __##func(struct pt_regs *regs, u32 vector)
> > >
> > >
> > > And also to handle irq_stack rip's from SYSVEC functions  (eg. 
> > > sysvec_apic_timer_interrupt):
> > >
> > > #define DEFINE_IDTENTRY_SYSVEC(func)                                    \
> > > static void __##func(struct pt_regs *regs);                             \
> > >                                                                          \
> > > __visible noinstr void func(struct pt_regs *regs)                       \
> > > {                                                                       \
> > >          irqentry_state_t state = irqentry_enter(regs);                  \
> > >                                                                          \
> > >          instrumentation_begin();                                        \
> > >          kvm_set_cpu_l1tf_flush_l1d();                                   \
> > >          run_sysvec_on_irqstack_cond(__##func, regs);                    \
> > >          instrumentation_end();                                          \
> > >          irqentry_exit(regs, state);                                     \
> > > }                                                                       \
> > >
> > > \ static noinline void __##func(struct pt_regs *regs) Other changes are 
> > > required.
> > >
> > >
> > > Clearly we should be adjusting "irq_eframe_link" based on framesize for 
> > > either DEFINE_IDTENTRY_IRQ() functions or DEFINE_IDTENTRY_SYSVEC() 
> > > functions.
> > > Maybe we should have  "irq_eframe_link" for DEFINE_IDTENTRY_IRQ() 
> > > functions, and another "sysvec_eframe_link" for DEFINE_IDTENTRY_SYSVEC() 
> > > functions, then try both links.
> > > Anyhow, the current x86_64_irq_eframe_link_init() code is making 
> > > assumption on a single common framesize , and actually there is a couple 
> > > of assumptions:
> > > - assuming framesizes with hardcoded values
> > > - assuming a single common framesize
> > >
> > > To keep backward compatibility with the current code, which is
> > > trying +8 (not sure why), the patch I proposed was keeping the existing 
> > > +8, but it is also trying -8, and also +16 and -16, as to try having more 
> > > fallbacks in case we didn't guess the proper framesizes.
> >
> > Thank you for the fix, I also have a vmcore printing "bogus exception
> > frame" but there has been no good idea to fix.  The patch fixed this
> > so
> >
> > Acked-by: Kazuhito Hagio <k-hagio...@nec.com>
> >
> > $ diff -u before.log after.log
> >
> > @@ -577,14 +577,21 @@
> >    #9 [ffff954a4683cfe0] __irq_exit_rcu at ffffffff86b0948f
> >   #10 [ffff954a4683cff0] sysvec_apic_timer_interrupt at ffffffff875fe8f2
> >   --- <IRQ stack> ---
> > -    RIP: 0000000000000010  RSP: 0000000000000018  RFLAGS: ffff954a4437fe80
> > -    RAX: 000000000000001f  RBX: 000000000000a3e7  RCX: 000000000000000e
> > -    RDX: 000000003a2e8f1a  RSI: 0000000000000000  RDI: ffffffffffffffff
> > -    RBP: ffffb53a3f580000   R8: ffff8a51bf5b0a80   R9: 0000001a95a9b7b8
> > -    R10: 0000000000000000  R11: 00000000000022ef  R12: 0000000000000004
> > -    R13: ffffffff884acd60  R14: 0000001a95a9b7b8  R15: 0000000000000004
> > -    ORIG_RAX: ffffffff872fe528  CS: 0202  SS: 1a92a013c6
> > -WARNING: possibly bogus exception frame
> > +#11 [ffff954a4437fdd8] asm_sysvec_apic_timer_interrupt at ffffffff87800d86
> > +    [exception RIP: cpuidle_enter_state+216]
> > +    RIP: ffffffff872fe528  RSP: ffff954a4437fe80  RFLAGS: 00000202
> > +    RAX: ffff8a51bf5b0a80  RBX: ffffb53a3f580000  RCX: 000000000000001f
> > +    RDX: 000000000000000e  RSI: 000000003a2e8f1a  RDI: 0000000000000000
> > +    RBP: 0000000000000004   R8: 0000001a95a9b7b8   R9: 0000000000000000
> > +    R10: 00000000000022ef  R11: 000000000000a3e7  R12: ffffffff884acd60
> > +    R13: 0000001a95a9b7b8  R14: 0000000000000004  R15: 0000000000000000
> > +    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > +#12 [ffff954a4437feb8] cpuidle_enter at ffffffff872fe8b9
> > +#13 [ffff954a4437fed8] cpuidle_idle_call at ffffffff86b624dc
> > +#14 [ffff954a4437ff10] do_idle at ffffffff86b625fb
> > +#15 [ffff954a4437ff28] cpu_startup_entry at ffffffff86b62849
> > +#16 [ffff954a4437ff38] start_secondary at ffffffff86a62a0f
> > +#17 [ffff954a4437ff50] secondary_startup_64_no_verify at
> > +ffffffff86a0015a
> >
> > Thanks,
> > Kazu
> >
> >
> > >
> > > Regards,
> > > Georges
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Lianbo Jiang <liji...@redhat.com>
> > > Sent: Wednesday, April 10, 2024 6:11 AM
> > > To: devel@lists.crash-utility.osci.io
> > > Cc: Aureau, Georges (Kernel Tools ERT) <georges.aur...@hpe.com>
> > > Subject: Re: [PATCH] x86_64: rhel 9.3: "crash" doesn't handle all
> > > IRQ exception properly
> > >
> > > Hi, Aureau
> > >
> > > Thank you for the fix.
> > >
> > > On 4/5/24 08:38, devel-requ...@lists.crash-utility.osci.io wrote:
> > >> Date: Thu, 4 Apr 2024 14:39:06 +0000
> > >> From: "Aureau, Georges (Kernel Tools ERT)"<georges.aur...@hpe.com>
> > >> Subject: [Crash-utility] [PATCH] x86_64: rhel 9.3: "crash" doesn't
> > >>      handle all IRQ exception properly.
> > >> To:"devel@lists.crash-utility.osci.io"
> > >>      <devel@lists.crash-utility.osci.io>
> > >> Message-ID: <sj0pr84mb14821a83d9631672b4f0f9449f...@sj0pr84mb1482.namp
> > >>      RD84.PROD.OUTLOOK.COM>
> > >> Content-Type: text/plain; charset="us-ascii"
> > >>
> > >>
> > >> Hello,
> > >>
> > >> On x86_64, with rhel 9.3, there are cases where "crash" can't handle IRQ 
> > >> exception frames properly.
> > >>
> > >> For example, "crash" fails with "WARNING: possibly bogus exception 
> > >> frame":
> > >>
> > >> crash> bt -c 30
> > >> PID: 2898241  TASK: ff4cb0ce0da0c680  CPU: 30   COMMAND: "star-ccm+"
> > >>    #0 [fffffe4658d88e58] crash_nmi_callback at ffffffffa00675e8
> > >>    #1 [fffffe4658d88e68] nmi_handle at ffffffffa002ebab
> > >>    #2 [fffffe4658d88eb0] default_do_nmi at ffffffffa0c56cd0
> > >>    #3 [fffffe4658d88ed0] exc_nmi at ffffffffa0c56ee1
> > >>    #4 [fffffe4658d88ef0] end_repeat_nmi at ffffffffa0e01639
> > >>       [exception RIP: native_queued_spin_lock_slowpath+636]
> > >>       RIP: ffffffffa0c6b80c  RSP: ff5eba269937ce10  RFLAGS: 00000046
> > >>       RAX: 0000000000000000  RBX: ff4cb12bcfbb2940  RCX: 00000000007c0000
> > >>       RDX: 0000000000000057  RSI: 0000000001600000  RDI: ff4cb12d67205608
> > >>       RBP: ff4cb12d67205608   R8: ff90b9c682475940   R9: 0000000000000000
> > >>       R10: ff4cb0d70ee4e100  R11: 0000000000000000  R12: 0000000000000000
> > >>       R13: 000000000000001e  R14: ff90b9c682474918  R15: ff90b9c682477120
> > >>       ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > >> --- <NMI exception stack> ---
> > >>    #5 [ff5eba269937ce10] native_queued_spin_lock_slowpath at 
> > >> ffffffffa0c6b80c
> > >>    #6 [ff5eba269937ce30] _raw_spin_lock_irqsave at ffffffffa0c6ad90
> > >>    #7 [ff5eba269937ce40] free_iova at ffffffffa07df716
> > >>    #8 [ff5eba269937ce68] fq_ring_free at ffffffffa07dba15
> > >>    #9 [ff5eba269937ce98] fq_flush_timeout at ffffffffa07dc8fe
> > >> #10 [ff5eba269937ced0] call_timer_fn at ffffffffa01c00a4
> > >> #11 [ff5eba269937cef0] __run_timers at ffffffffa01c03ae
> > >> #12 [ff5eba269937cf88] run_timer_softirq at ffffffffa01c0476
> > >> #13 [ff5eba269937cf90] __do_softirq at ffffffffa0c6c007
> > >> #14 [ff5eba269937cfe0] __irq_exit_rcu at ffffffffa010ef61
> > >> #15 [ff5eba269937cff0] sysvec_apic_timer_interrupt at
> > >> ffffffffa0c58ca2
> > >> --- <IRQ stack> ---
> > >>       RIP: 0000000000000010  RSP: 0000000000000018  RFLAGS: 
> > >> ff5eba26ddc9f7e8
> > >>       RAX: 0000000000000a20  RBX: ff5eba26ddc9f940  RCX: 0000000000001000
> > >>       RDX: ffffffb559980000  RSI: ff4cb12d67207400  RDI: ffffffffffffffff
> > >>       RBP: 0000000000001000   R8: ff5eba26ddc9f940   R9: ff5eba26ddc9f8af
> > >>       R10: 0000000000000003  R11: 0000000000000a20  R12: ff5eba26ddc9f8b0
> > >>       R13: 000000283c07f000  R14: ff4cb0f5a29a1c00  R15: 0000000000000001
> > >>       ORIG_RAX: ffffffffa07c4e60  CS: 0206  SS: 7000001cf0380001
> > >> bt: WARNING: possibly bogus exception frame
> > >>
> > >>
> > >> Running "crash" with "--machdep irq_eframe_link=0xffffffffffffffe8" (ie. 
> > >> thus irq_eframe_link = -24) works properly:
> > >>
> > >> PID: 2898241  TASK: ff4cb0ce0da0c680  CPU: 30   COMMAND: "star-ccm+"
> > >>    #0 [fffffe4658d88e58] crash_nmi_callback at ffffffffa00675e8
> > >>    #1 [fffffe4658d88e68] nmi_handle at ffffffffa002ebab
> > >>    #2 [fffffe4658d88eb0] default_do_nmi at ffffffffa0c56cd0
> > >>    #3 [fffffe4658d88ed0] exc_nmi at ffffffffa0c56ee1
> > >>    #4 [fffffe4658d88ef0] end_repeat_nmi at ffffffffa0e01639
> > >>       [exception RIP: native_queued_spin_lock_slowpath+636]
> > >>       RIP: ffffffffa0c6b80c  RSP: ff5eba269937ce10  RFLAGS: 00000046
> > >>       RAX: 0000000000000000  RBX: ff4cb12bcfbb2940  RCX: 00000000007c0000
> > >>       RDX: 0000000000000057  RSI: 0000000001600000  RDI: ff4cb12d67205608
> > >>       RBP: ff4cb12d67205608   R8: ff90b9c682475940   R9: 0000000000000000
> > >>       R10: ff4cb0d70ee4e100  R11: 0000000000000000  R12: 0000000000000000
> > >>       R13: 000000000000001e  R14: ff90b9c682474918  R15: ff90b9c682477120
> > >>       ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > >> --- <NMI exception stack> ---
> > >>    #5 [ff5eba269937ce10] native_queued_spin_lock_slowpath at 
> > >> ffffffffa0c6b80c
> > >>    #6 [ff5eba269937ce30] _raw_spin_lock_irqsave at ffffffffa0c6ad90
> > >>    #7 [ff5eba269937ce40] free_iova at ffffffffa07df716
> > >>    #8 [ff5eba269937ce68] fq_ring_free at ffffffffa07dba15
> > >>    #9 [ff5eba269937ce98] fq_flush_timeout at ffffffffa07dc8fe
> > >> #10 [ff5eba269937ced0] call_timer_fn at ffffffffa01c00a4
> > >> #11 [ff5eba269937cef0] __run_timers at ffffffffa01c03ae
> > >> #12 [ff5eba269937cf88] run_timer_softirq at ffffffffa01c0476
> > >> #13 [ff5eba269937cf90] __do_softirq at ffffffffa0c6c007
> > >> #14 [ff5eba269937cfe0] __irq_exit_rcu at ffffffffa010ef61
> > >> #15 [ff5eba269937cff0] sysvec_apic_timer_interrupt at
> > >> ffffffffa0c58ca2
> > >> --- <IRQ stack> ---
> > >> #16 [ff5eba26ddc9f738] asm_sysvec_apic_timer_interrupt at 
> > >> ffffffffa0e00e06
> > >>       [exception RIP: alloc_pte.constprop.0+32]
> > >>       RIP: ffffffffa07c4e60  RSP: ff5eba26ddc9f7e8  RFLAGS: 00000206
> > >>       RAX: ff5eba26ddc9f940  RBX: 0000000000001000  RCX: 0000000000000a20
> > >>       RDX: 0000000000001000  RSI: ffffffb559980000  RDI: ff4cb12d67207400
> > >>       RBP: ff5eba26ddc9f8b0   R8: ff5eba26ddc9f8af   R9: 0000000000000003
> > >>       R10: 0000000000000a20  R11: ff5eba26ddc9f940  R12: 000000283c07f000
> > >>       R13: ff4cb0f5a29a1c00  R14: 0000000000000001  R15: ff4cb0f5a29a1bf8
> > >>       ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > >> #17 [ff5eba26ddc9f830] iommu_v1_map_pages at ffffffffa07c5648
> > >> #18 [ff5eba26ddc9f8f8] __iommu_map at ffffffffa07d7803
> > >> #19 [ff5eba26ddc9f990] iommu_map_sg at ffffffffa07d7b71
> > >> #20 [ff5eba26ddc9f9f0] iommu_dma_map_sg at ffffffffa07ddcc9
> > >> #21 [ff5eba26ddc9fa90] __dma_map_sg_attrs at ffffffffa01b5205
> > >> #22 [ff5eba26ddc9fa98] dma_map_sgtable at ffffffffa01b526d
> > >> #23 [ff5eba26ddc9faa8] ib_umem_get at ffffffffc112f308 [ib_uverbs]
> > >> #24 [ff5eba26ddc9fb00] mlx5_ib_reg_user_mr at ffffffffc11eb991
> > >> [mlx5_ib]
> > >> #25 [ff5eba26ddc9fb40] ib_uverbs_reg_mr at ffffffffc1123bc4
> > >> [ib_uverbs]
> > >> #26 [ff5eba26ddc9fbb0] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE
> > >> at
> > >> ffffffffc112d064 [ib_uverbs]
> > >> #27 [ff5eba26ddc9fbe0] ib_uverbs_run_method at ffffffffc112a121
> > >> [ib_uverbs]
> > >> #28 [ff5eba26ddc9fc30] ib_uverbs_cmd_verbs at ffffffffc112a332
> > >> [ib_uverbs]
> > >> #29 [ff5eba26ddc9fe68] ib_uverbs_ioctl at ffffffffc112a494
> > >> [ib_uverbs]
> > >> #30 [ff5eba26ddc9fea8] __x64_sys_ioctl at ffffffffa0438f67
> > >> #31 [ff5eba26ddc9fed8] do_syscall_64 at ffffffffa0c55189
> > >> #32 [ff5eba26ddc9ff50] entry_SYSCALL_64_after_hwframe at ffffffffa0e000ea
> > >>       RIP: 0000146ef3a3ec6b  RSP: 00007ffe3b7dc5b8  RFLAGS: 00000246
> > >>       RAX: ffffffffffffffda  RBX: 00007ffe3b7dc6a8  RCX: 0000146ef3a3ec6b
> > >>       RDX: 00007ffe3b7dc690  RSI: 00000000c0181b01  RDI: 0000000000000011
> > >>       RBP: 00007ffe3b7dc670   R8: 0000146ed9549010   R9: 00007ffe3b7dc7c4
> > >>       R10: 0000000000000009  R11: 0000000000000246  R12: 00007ffe3b7dc7c4
> > >>       R13: 000000000000000c  R14: 000000000000000c  R15: 0000146ed9549150
> > >>       ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
> > >>
> > >>
> > >> Some background:
> > >>
> > >> asm_common_interrupt:
> > >>         callq  error_entry
> > >>         movq   %rax,%rsp
> > >>         movq   %rsp,%rdi
> > >>         movq   0x78(%rsp),%rsi
> > >>         movq   $-0x1,0x78(%rsp)
> > >>         call common_interrupt    # rsp pointing to regs
> > >>
> > >> common_interrupt
> > >>         pushq  %r12
> > >>         pushq  %rbp
> > >>         pushq  %rbx
> > >>         [...]
> > >>         movq   hardirq_stack_ptr,%r11
> > >>         movq   %rsp,(%r11)
> > >>         movq   %r11,%rsp
> > >>         [...]
> > >>         call __common_interrupt  # rip:common_interrupt
> > >>
> > >> So frame_size(rip:common_interrupt) = 32 (3 push + ret).
> > >>
> > >> Hence "machdep->machspec->irq_eframe_link = -32;"  (see 
> > >> x86_64_irq_eframe_link_init()).
> > >>
> > >> Now:
> > >>
> > >> asm_sysvec_apic_timer_interrupt:
> > >>         pushq  $-0x1
> > >>         callq  error_entry
> > >>         movq   %rax,%rsp
> > >>         movq   %rsp,%rdi
> > >>         callq  sysvec_apic_timer_interrupt
> > >>
> > >> sysvec_apic_timer_interrupt
> > >>         pushq  %r12
> > >>         pushq  %rbp
> > >>         [...]
> > >>         movq   hardirq_stack_ptr,%r11
> > >>         movq   %rsp,(%r11)
> > >>         movq   %r11,%rsp
> > >>         [...]
> > >>         call __sysvec_apic_timer_interrupt  #
> > >> rip:sysvec_apic_timer_interrupt
> > >>
> > >> Here frame_size(rip:sysvec_apic_timer_interrupt) = 24 (2 push +
> > >> ret)
> > >>
> > >> We should also notice that:
> > >>
> > >> rip  = *(hardirq_stack_ptr - 8)
> > >> rsp  = *(hardirq_stack_ptr)
> > >> regs = rsp - frame_size(rip)
> > >>
> > >> But x86_64_get_framesize() does not work with IRQ handlers (returns 0).
> > >> So not many options other than hardcoding the most likely value and 
> > >> looking around it.
> > >> Actually x86_64_irq_eframe_link() was trying -32 (default), and then 
> > >> -40,  but not -24 !
> > >>
> > >> Signed-off-by: Georges Aureau<georges.aur...@hpe.com>
> > >> --
> > >> diff --git a/x86_64.c b/x86_64.c
> > >> index aec82b0..90c2a91 100644
> > >> --- a/x86_64.c
> > >> +++ b/x86_64.c
> > >> @@ -6623,13 +6623,14 @@ x86_64_irq_eframe_link_init(void)
> > >>
> > >>    /*
> > >>     *  Calculate and verify the IRQ exception frame location from
> > >> the
> > >> - *  stack reference at the top of the IRQ stack, possibly
> > >> adjusting
> > >> - *  the ms->irq_eframe_link value.
> > >> + *  stack reference at the top of the IRQ stack, keep
> > >> + ms->irq_eframe_link
> > >> + *  as the most likely value, and try a few sizes around it.
> > >>     */
> > >>    static ulong
> > >>    x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE *ofp)
> > >>    {
> > >>           ulong irq_eframe;
> > >> +       int i, try[] = { 8, -8, 16, -16 };
> > >>
> > >>           if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt, ofp))
> > >>                   return stkref;
> > >> @@ -6639,9 +6640,10 @@ x86_64_irq_eframe_link(ulong stkref, struct 
> > >> bt_info *bt, FILE *ofp)
> > >>           if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt, 
> > >> ofp))
> > >>                   return irq_eframe;
> > >>
> > >> -       if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt, 
> > >> ofp)) {
> > >> -               machdep->machspec->irq_eframe_link -= 8;
>
> Why don't we update "machdep->machspec->irq_eframe_link" in the following 
> added code?
>
> > >> -               return (irq_eframe + 8);
> > >> +       for (i=0; i<sizeof(try)/sizeof(int); i++) {
> > >> +               if (x86_64_exception_frame(EFRAME_VERIFY, 
> > >> irq_eframe+try[i], 0, bt, ofp)) {
> > >> +                       return (irq_eframe + try[i]);
> > >> +               }
> > >>           }
> > >>
> > >>           return irq_eframe;
>
> I guess if we don't need to update
> "machdep->machspec->irq_eframe_link", maybe we can merge the code as follows?
>
>  x86_64_irq_eframe_link(ulong stkref, struct bt_info *bt, FILE *ofp)  {
>         ulong irq_eframe;
> +       int i, try[] = { 0, 8, -8, 16, -16 };
>
>         if (x86_64_exception_frame(EFRAME_VERIFY, stkref, 0, bt, ofp))
>                 return stkref;
>
>         irq_eframe = stkref - machdep->machspec->irq_eframe_link;
>
> -       if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe, 0, bt, ofp))
> -               return irq_eframe;
> -
> -       if (x86_64_exception_frame(EFRAME_VERIFY, irq_eframe+8, 0, bt, ofp)) {
> -               machdep->machspec->irq_eframe_link -= 8;
> -               return (irq_eframe + 8);
> +       for (i=0; i<sizeof(try)/sizeof(int); i++) {
> +               if (x86_64_exception_frame(EFRAME_VERIFY,
> irq_eframe+try[i], 0, bt, ofp)) {
> +                       return (irq_eframe + try[i]);
> +               }
>         }
>
>         return irq_eframe;
>
> Thanks,
> Tao Liu
>
> > >
> > >
> > > Can you help to try the following changes? It could be good to check the 
> > > relevant symbol, but I'm not sure if this can work well for you and cover 
> > > some corner cases.
> > >
> > > diff --git a/x86_64.c b/x86_64.c
> > > index aec82b03b62e..251e224c013b 100644
> > > --- a/x86_64.c
> > > +++ b/x86_64.c
> > > @@ -6574,6 +6574,8 @@ x86_64_irq_eframe_link_init(void)
> > >        if (symbol_exists("asm_common_interrupt")) {
> > >            if (symbol_exists("asm_call_on_stack"))
> > >                machdep->machspec->irq_eframe_link = -64;
> > > +        else if (symbol_exists("sysvec_apic_timer_interrupt"))
> > > +            machdep->machspec->irq_eframe_link = -24;
> > >            else
> > >                machdep->machspec->irq_eframe_link = -32;
> > >            return;
> > >
> > >
> > > Thanks
> > >
> > > Lianbo
> > >
> > > --
> > > Crash-utility mailing list -- devel@lists.crash-utility.osci.io To
> > > unsubscribe send an email to devel-le...@lists.crash-utility.osci.io
> > > https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io
> > > / Contribution Guidelines:
> > > https://github.com/crash-utility/crash/wiki
> > --
> > Crash-utility mailing list -- devel@lists.crash-utility.osci.io To
> > unsubscribe send an email to devel-le...@lists.crash-utility.osci.io
> > https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
> > Contribution Guidelines: https://github.com/crash-utility/crash/wiki
>
--
Crash-utility mailing list -- devel@lists.crash-utility.osci.io
To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki

[Crash-utility] Re: [PATCH] x86_64: rhel 9.3: "crash" doesn't handle all IRQ exception properly

Reply via email to