On Tue, Jun 02, 2026 at 01:26:48PM +0530, Shrikanth Hegde wrote:
>
>
> On 6/1/26 3:26 PM, Peter Zijlstra wrote:
> > On Mon, Jun 01, 2026 at 02:46:24PM +0530, Shrikanth Hegde wrote:
> >
> > > Ritesh, Mukesh, is the scenario below possible?
> > >
> > > do_page_fault seems to enable IRQs in the interrupt handler.
> > > Is that expected? If so, one might see:
> > >
> > > -- do_page_fault (enter kernel mode)
> > > -- enables interrupts
> > > -- gets interrupt - Sets need_resched.
> > > -- irqentry_exit - Sees it is kernel mode. Just checks preempt count
> > >    and calls preempt_schedule_irq, which catches both
> > >    preempt_count and !irqs_disabled. Hence the panic?
> > >
> > > Should do_page_fault do preempt_disable when it enables the interrupts?
> >
> > No, it is expected for page-fault to be able to schedule. Specifically,
> > it must be able to sleep to support loading pages from disk.
>
> Oh yes. Ok. Thanks for taking a look.
>
> >
> > Please check the value of preempt_count() (does it perchance have
> > HARDIRQ_OFFSET?). Also, if the fault handler does enable IRQs, it must
> > also disable them again once done.
>
> Will check it.
>
> >
> > Notably, I see ___do_page_fault() do interrupt_cond_local_irq_enable(),
> > but I'm not seeing a local_irq_disable() to match!
>
> Yes, that's likely the culprit. It is possible that ___do_page_fault runs for
> longer and need_resched gets set. If it was in kernel mode, then interrupts
> may not be disabled again, and the subsequent irqentry_exit panics.
>
> BTW I was able to consistently repro this on P9 with hackbench as below.
>
> for i in {0..10}; do ./hackbench 10 process 10000 loops; done;
> for i in {0..10}; do ./hackbench 20 process 10000 loops; done;
> for i in {0..10}; do ./hackbench 30 process 10000 loops; done;
> for i in {0..10}; do ./hackbench 40 process 10000 loops; done; << usually panics here.
> for i in {0..10}; do ./hackbench 10 thread 10000 loops; done;
> for i in {0..10}; do ./hackbench 20 thread 10000 loops; done;
> for i in {0..10}; do ./hackbench -pipe 10 process 10000 loops; done;
> for i in {0..10}; do ./hackbench -pipe 20 process 10000 loops; done;
> for i in {0..10}; do ./hackbench -pipe 30 process 10000 loops; done;
> for i in {0..10}; do ./hackbench -pipe 40 process 10000 loops; done;
> for i in {0..10}; do ./hackbench -pipe 10 thread 10000 loops; done;
> for i in {0..10}; do ./hackbench -pipe 20 thread 10000 loops; done;
>
> Note, if I run ./hackbench 40 process 10000 loops alone, it doesn't panic.
> Likely some continuous stressing is needed to get into this case.
>
> The diff below helps to fix it; with it the test passes. Hackbench numbers
> aren't super happy about it: it regresses a bit compared to baseline, but at
> least there is no panic. I have also changed the BUG_ON to WARN_ON, since
> IRQs are disabled right after. We could still fix the call sites if the
> warning is seen.
>
> diff --git a/arch/powerpc/include/asm/entry-common.h b/arch/powerpc/include/asm/entry-common.h
> index de5601282755..7da373a56813 100644
> --- a/arch/powerpc/include/asm/entry-common.h
> +++ b/arch/powerpc/include/asm/entry-common.h
> @@ -253,16 +253,17 @@ static inline void arch_interrupt_enter_prepare(struct pt_regs *regs)
>  static inline void arch_interrupt_exit_prepare(struct pt_regs *regs)
>  {
>  	if (user_mode(regs)) {
> -		BUG_ON(regs_is_unrecoverable(regs));
> -		BUG_ON(regs_irqs_disabled(regs));
> +		WARN_ON(regs_is_unrecoverable(regs));
> +		WARN_ON(regs_irqs_disabled(regs));
>  		/*
>  		 * We don't need to restore AMR on the way back to userspace for KUAP.
>  		 * AMR can only have been unlocked if we interrupted the kernel.
>  		 */
>  		kuap_assert_locked();
> -
> -		local_irq_disable();
>  	}
> +
> +	/* irqentry_exit expects to be called with interrupts disabled */
> +	local_irq_disable();
>  }
>  
>  static inline void arch_interrupt_async_enter_prepare(struct pt_regs *regs)
>

I would suggest trying something a little more focussed like so:

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 806c74e0d5ab..b002c179415c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -589,6 +589,7 @@ static __always_inline void __do_page_fault(struct pt_regs *regs)
 	err = ___do_page_fault(regs, regs->dar, regs->dsisr);
 	if (unlikely(err))
 		bad_page_fault(regs, err);
+	local_irq_disable();
 }
 
 DEFINE_INTERRUPT_HANDLER(do_page_fault)

Since only ___do_page_fault() will enable interrupts, you only need to
disable them again on its return path.
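
To make the pairing rule concrete, here is a tiny userspace model of it. It
is only a sketch and not kernel code: every model_* name is made up for
illustration, the IRQ state is a plain bool, and the "exit check" merely
stands in for the expectation that the exit path runs with interrupts
disabled. It shows that a handler which may enable IRQs leaves them enabled
at exit unless the caller disables them again on return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
static bool model_irqs_enabled;

static void model_local_irq_enable(void)  { model_irqs_enabled = true; }
static void model_local_irq_disable(void) { model_irqs_enabled = false; }

/* Models ___do_page_fault(): may enable IRQs and never disables them. */
static void model_inner_fault(bool enables_irqs)
{
	if (enables_irqs)
		model_local_irq_enable();
	/* Fault handling that may sleep would happen here. */
}

/* Models __do_page_fault(); with_fix selects the added disable on return. */
static void model_outer_fault(bool enables_irqs, bool with_fix)
{
	model_inner_fault(enables_irqs);
	if (with_fix)
		model_local_irq_disable();
}

/* Run one fault from an IRQs-off entry; true if exit sees IRQs disabled. */
bool model_run_fault(bool enables_irqs, bool with_fix)
{
	model_irqs_enabled = false;	/* entry runs with IRQs disabled */
	model_outer_fault(enables_irqs, with_fix);
	return !model_irqs_enabled;	/* what the exit path expects */
}

/* Usage example: the unpaired enable breaks the expectation, the fix restores it. */
void model_selfcheck(void)
{
	assert(!model_run_fault(true, false));
	assert(model_run_fault(true, true));
}
```

The disable is unconditional in the model, as in the diff above: disabling
already-disabled interrupts is harmless, so the caller does not need to know
whether the inner function actually enabled them.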