On Wed, Dec 02, 2020 at 10:21:16AM +0100, Peter Zijlstra wrote:
> On Tue, Dec 01, 2020 at 08:18:56PM +0100, Heiko Carstens wrote:
> OK, so with a little help from s390/PoO and Sven, the code removed skips
> the TRACE_IRQS_OFF when IRQs were disabled in the old PSW (the previous
> context).
> 
> That sounds entirely the right thing. Irrespective of what the previous
> IRQ state was, the current state is off.
> 
> > diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
> > index 2b85096964f8..5bd8c1044d09 100644
> > --- a/arch/s390/kernel/idle.c
> > +++ b/arch/s390/kernel/idle.c
> > @@ -123,7 +123,6 @@ void arch_cpu_idle_enter(void)
> >  void arch_cpu_idle(void)
> >  {
> >     enabled_wait();
> > -   raw_local_irq_enable();
> >  }
> 
> Currently arch_cpu_idle() is defined to return with IRQs enabled,
> however, the very first thing we do when we return is
> raw_local_irq_disable(), so this change is harmless.
> 
> It is also the direction I've been arguing for elsewhere in this thread.
> So I'm certainly not complaining.

So I left that raw_local_irq_enable() in to be consistent with other
architectures. enabled_wait() now returns with irqs disabled, but with
a lockdep state that says irqs are on. See patch below.
It works and hopefully makes sense ;)

In addition (but not for rc7) I want to get rid of our complex udelay
implementation. I don't think we need it anymore, so the idle code would
be the only place left where we have to play tricks.

From 7bd86fb3eb039a4163281472ca79b9158e726526 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <h...@linux.ibm.com>
Date: Wed, 2 Dec 2020 11:46:01 +0100
Subject: [PATCH] s390: fix irq state tracing

With commit 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs
tracing") common code calls arch_cpu_idle() with a lockdep state that
tells irqs are on.

This doesn't work very well for s390: psw_idle() will enable interrupts
to wait for an interrupt. As soon as an interrupt occurs the interrupt
handler will verify if the old context was psw_idle(). If that is the
case the interrupt enablement bits in the old program status word will
be cleared.

A subsequent test in both the external as well as the io interrupt
handler checks if in the old context interrupts were enabled. Due to
the above patching of the old program status word it is assumed the
old context had interrupts disabled, and therefore a call to
TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in
turn makes lockdep incorrectly "think" that interrupts are enabled
within the interrupt handler.
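
The sequence above can be modelled in a few lines of plain C. This is
only an illustrative userspace sketch, not kernel code: struct model_cpu,
MODEL_PSW_IRQ_BITS and all model_*() helpers are hypothetical names made
up for this example, and the mask value does not match the real PSW
layout. It assumes lockdep's view can be reduced to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the I/O and external mask bits of the PSW. */
#define MODEL_PSW_IRQ_BITS 0x3u

struct model_cpu {
	unsigned int old_psw_mask;	/* mask bits of the interrupted context */
	bool lockdep_irqs_on;		/* what lockdep believes */
};

/* Models the handler clearing the enablement bits when the old context
 * was psw_idle(). */
static void model_patch_idle_psw(struct model_cpu *cpu)
{
	cpu->old_psw_mask &= ~MODEL_PSW_IRQ_BITS;
}

/* Old behaviour: TRACE_IRQS_OFF only if the old PSW had irqs enabled. */
static void model_entry_old(struct model_cpu *cpu)
{
	if (cpu->old_psw_mask & MODEL_PSW_IRQ_BITS)
		cpu->lockdep_irqs_on = false;
}

/* New behaviour: TRACE_IRQS_OFF unconditionally. */
static void model_entry_new(struct model_cpu *cpu)
{
	cpu->lockdep_irqs_on = false;
}

/* Returns lockdep's view inside the handler for an interrupt that ends
 * psw_idle(); idle is entered with an "irqs on" lockdep state. */
static bool model_idle_wakeup(bool use_old_entry)
{
	struct model_cpu cpu = {
		.old_psw_mask = MODEL_PSW_IRQ_BITS,
		.lockdep_irqs_on = true,
	};

	model_patch_idle_psw(&cpu);
	if (use_old_entry)
		model_entry_old(&cpu);
	else
		model_entry_new(&cpu);
	return cpu.lockdep_irqs_on;
}
```

In this model, model_idle_wakeup(true) leaves the lockdep state at
"irqs on" inside the handler, which is the incorrect state described
above, while model_idle_wakeup(false) reports "irqs off".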

Fix this by unconditionally calling TRACE_IRQS_OFF when entering
interrupt handlers. Also unconditionally call TRACE_IRQS_ON when
leaving interrupt handlers.

This leaves the special psw_idle() case, which now returns with
interrupts disabled, but has an "irqs on" lockdep state. So callers of
psw_idle() must adjust the state on their own, if required. This is
currently only __udelay_disabled().

Fixes: 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing")
Signed-off-by: Heiko Carstens <h...@linux.ibm.com>
---
 arch/s390/kernel/entry.S | 15 ---------------
 arch/s390/lib/delay.c    |  5 ++---
 2 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 26bb0603c5a1..92beb1444644 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -763,12 +763,7 @@ ENTRY(io_int_handler)
        xc      __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
        TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
        jo      .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-       tmhh    %r8,0x300
-       jz      1f
        TRACE_IRQS_OFF
-1:
-#endif
        xc      __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
 .Lio_loop:
        lgr     %r2,%r11                # pass pointer to pt_regs
@@ -791,12 +786,7 @@ ENTRY(io_int_handler)
        TSTMSK  __LC_CPU_FLAGS,_CIF_WORK
        jnz     .Lio_work
 .Lio_restore:
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-       tm      __PT_PSW(%r11),3
-       jno     0f
        TRACE_IRQS_ON
-0:
-#endif
        mvc     __LC_RETURN_PSW(16),__PT_PSW(%r11)
        tm      __PT_PSW+1(%r11),0x01   # returning to user ?
        jno     .Lio_exit_kernel
@@ -976,12 +966,7 @@ ENTRY(ext_int_handler)
        xc      __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
        TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
        jo      .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-       tmhh    %r8,0x300
-       jz      1f
        TRACE_IRQS_OFF
-1:
-#endif
        xc      __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
        lgr     %r2,%r11                # pass pointer to pt_regs
        lghi    %r3,EXT_INTERRUPT
diff --git a/arch/s390/lib/delay.c b/arch/s390/lib/delay.c
index daca7bad66de..8c0c68e7770e 100644
--- a/arch/s390/lib/delay.c
+++ b/arch/s390/lib/delay.c
@@ -33,7 +33,7 @@ EXPORT_SYMBOL(__delay);
 
 static void __udelay_disabled(unsigned long long usecs)
 {
-       unsigned long cr0, cr0_new, psw_mask, flags;
+       unsigned long cr0, cr0_new, psw_mask;
        struct s390_idle_data idle;
        u64 end;
 
@@ -45,9 +45,8 @@ static void __udelay_disabled(unsigned long long usecs)
        psw_mask = __extract_psw() | PSW_MASK_EXT | PSW_MASK_WAIT;
        set_clock_comparator(end);
        set_cpu_flag(CIF_IGNORE_IRQ);
-       local_irq_save(flags);
        psw_idle(&idle, psw_mask);
-       local_irq_restore(flags);
+       trace_hardirqs_off();
        clear_cpu_flag(CIF_IGNORE_IRQ);
        set_clock_comparator(S390_lowcore.clock_comparator);
        __ctl_load(cr0, 0, 0);
-- 
2.17.1
