Hi,

On Mon, Feb 4, 2019 at 5:12 AM Mark Rutland <mark.rutl...@arm.com> wrote:
>
> On Fri, Feb 01, 2019 at 01:38:05PM -0800, Doug Anderson wrote:
> > Hi,
>
> Hi Doug,
>
> > I was wondering if anyone out there has given any thought to
> > annotating the ARM64 IRQ handling in such a way that we could stack
> > crawl past el1_irq() when in gdb.
> >
> > I spent a bit of time on this a few months ago and documented all my
> > findings in:
> >
> > https://bugs.chromium.org/p/chromium/issues/detail?id=908721
>
> There, the error from GDB is:
>
>   Backtrace stopped: previous frame identical to this frame (corrupt
>   stack?)
>
> ... is that misleading?
>
> ... or do we have some duplicate stack frame that we somehow skip in
> the kernel unwinder?

If I had to guess, I'd say that when gdb doesn't see a frame it
recognizes, it just returns the previous one, which causes it to stop.
I don't think gdb falls back to just looking at the link register
because it needs more.

> > I can copy and paste all the discussion from that bug here, but since
> > it's public hopefully folks can read the discussion / investigation
> > there.  To put it briefly, though: I can stack crawl past "el1_irq"
> > with the normal linux stack crawl (which is what kdb uses) but I can't
> > crawl past "el1_irq" in gdb().  After talking to some of our tools
> > guys here I'm fairly certain that we could solve this with the right
> > CFI directives, but when I poked at it I wasn't able to figure out the
> > magic.
>
> AFAICT, we don't know why GDB is terminating early. Could we please
> figure that out first? e.g. by looking for the above message in the GDB
> sources.
>
> If we do need CFI annotations, I'd rather move that entry code to C
> first, to minimize how painful that is. I have an ongoing project [1] to
> do just that...
>
> Thanks,
> Mark.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/entry-deasm

OK, I tried this.  It _changes_ the behavior but doesn't magically get
me a full crawl.  If something like this is likely to merge to mainline
before too long, then it makes sense to spend the time debugging it
instead of the old code...

--- Vanilla v5.0-rc6 on kevin:

#13 0xffffff801013e08c in generic_handle_irq_desc (desc=0x1) at .../include/linux/irqdesc.h:154
#14 generic_handle_irq (irq=<optimized out>) at .../kernel/irq/irqdesc.c:628
#15 0xffffff801013e110 in __handle_domain_irq (domain=0xffffffc000211880, hwirq=<optimized out>, lookup=<optimized out>, regs=0xffffff8011003ce0) at .../kernel/irq/irqdesc.c:665
#16 0xffffff8010081124 in handle_domain_irq (domain=0x1, hwirq=<optimized out>, regs=<optimized out>) at .../include/linux/irqdesc.h:172
#17 gic_handle_irq (regs=0xffffff8011003ce0) at .../drivers/irqchip/irq-gic-v3.c:367
#18 0xffffff8010082bf4 in el1_irq () at .../arch/arm64/kernel/entry.S:609
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

--- Vanilla v5.0-rc6 + your patches on kevin:

#13 0xffffff801013e3cc in generic_handle_irq_desc (desc=0x1) at .../include/linux/irqdesc.h:154
#14 generic_handle_irq (irq=<optimized out>) at .../kernel/irq/irqdesc.c:628
#15 0xffffff801013e450 in __handle_domain_irq (domain=0xffffffc000211880, hwirq=<optimized out>, lookup=<optimized out>, regs=0xffffff8011003ce0) at .../kernel/irq/irqdesc.c:665
#16 0xffffff80100810c4 in handle_domain_irq (domain=0x1, hwirq=<optimized out>, regs=<optimized out>) at .../include/linux/irqdesc.h:172
#17 gic_handle_irq (regs=0xffffff8011003ce0) at .../drivers/irqchip/irq-gic-v3.c:367
#18 0xffffff8010084fd0 in call_on_stack () at .../arch/arm64/kernel/entry.S:718
Backtrace stopped: Cannot access memory at address 0xffffff8010004008

-Doug
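
As an aside on the two "Backtrace stopped" messages above: here is a toy
Python model of a frame-record walk, the style of unwinding the normal
linux stack crawl relies on when it follows arm64 frame pointers.  It is
only a sketch and is not gdb's or the kernel's actual code.  The name
walk_frame_records, the dictionary standing in for memory, and every
address in it are made up for illustration.  The only layout assumed is
the AAPCS64 frame record: the caller's frame pointer stored at fp and
the saved link register at fp + 8.  The walk can stop in two ways that
loosely mirror the messages: the next record is unreadable, or the next
frame is identical to the current one.

```python
def walk_frame_records(memory, fp, pc, max_frames=64):
    """Follow {caller fp, saved lr} records and return the (fp, pc) frames seen.

    memory is a hypothetical dict mapping an address to the 64-bit value
    stored there; it stands in for target memory in this toy model.
    """
    frames = []
    while fp and len(frames) < max_frames:
        frames.append((fp, pc))
        if fp not in memory or fp + 8 not in memory:
            # Record is unreadable: comparable to "Cannot access memory".
            break
        next_fp, next_pc = memory[fp], memory[fp + 8]
        if (next_fp, next_pc) == (fp, pc):
            # No progress: comparable to "previous frame identical to this frame".
            break
        fp, pc = next_fp, next_pc
    return frames


# Made-up three-frame chain; a caller fp of 0 marks the outermost frame.
chain = {
    0x1000: 0x1040, 0x1008: 0xA0,
    0x1040: 0x1080, 0x1048: 0xB0,
    0x1080: 0x0,    0x1088: 0xC0,
}
good = walk_frame_records(chain, fp=0x1000, pc=0x90)

# Made-up record that points back at itself, so the walk cannot advance.
loop = {0x2000: 0x2000, 0x2008: 0x50}
stuck = walk_frame_records(loop, fp=0x2000, pc=0x50)
```

gdb normally wants more than this (CFI or prologue analysis) to build a
frame, so the model only illustrates the two stop conditions; it does not
explain why gdb hits them at el1_irq or call_on_stack.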