On Wed, Dec 02, 2020 at 10:21:16AM +0100, Peter Zijlstra wrote: > On Tue, Dec 01, 2020 at 08:18:56PM +0100, Heiko Carstens wrote: > OK, so with a little help from s390/PoO and Sven, the code removed skips > the TRACE_IRQS_OFF when IRQs were enabled in the old PSW (the previous > context). > > That sounds entirely the right thing. Irrespective of what the previous > IRQ state was, the current state is off. > > > diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c > > index 2b85096964f8..5bd8c1044d09 100644 > > --- a/arch/s390/kernel/idle.c > > +++ b/arch/s390/kernel/idle.c > > @@ -123,7 +123,6 @@ void arch_cpu_idle_enter(void) > > void arch_cpu_idle(void) > > { > > enabled_wait(); > > - raw_local_irq_enable(); > > } > > Currently arch_cpu_idle() is defined as to return with IRQs enabled, > however, the very first thing we do when we return is > raw_local_irq_disable(), so this change is harmless. > > It is also the direction I've been arguing for elsewhere in this thread. > So I'm certainly not complaining.
So I left that raw_local_irq_enable() in to be consistent with other architectures. enabled_wait() now returns with irqs disabled, but with a lockdep state that tells irqs are on... See patch below. Works and hopefully makes sense ;) In addition (but not for rc7) I want to get rid of our complex udelay implementation. I think we don't need that anymore.. so there would be only the idle code left where we have to play tricks. >From 7bd86fb3eb039a4163281472ca79b9158e726526 Mon Sep 17 00:00:00 2001 From: Heiko Carstens <h...@linux.ibm.com> Date: Wed, 2 Dec 2020 11:46:01 +0100 Subject: [PATCH] s390: fix irq state tracing With commit 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing") common code calls arch_cpu_idle() with a lockdep state that tells irqs are on. This doesn't work very well for s390: psw_idle() will enable interrupts to wait for an interrupt. As soon as an interrupt occurs the interrupt handler will verify if the old context was psw_idle(). If that is the case the interrupt enablement bits in the old program status word will be cleared. A subsequent test in both the external as well as the io interrupt handler checks if in the old context interrupts were enabled. Due to the above patching of the old program status word it is assumed the old context had interrupts disabled, and therefore a call to TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in turn makes lockdep incorrectly "think" that interrupts are enabled within the interrupt handler. Fix this by unconditionally calling TRACE_IRQS_OFF when entering interrupt handlers. Also call unconditionally TRACE_IRQS_ON when leaving interrupts handlers. This leaves the special psw_idle() case, which now returns with interrupts disabled, but has an "irqs on" lockdep state. So callers of psw_idle() must adjust the state on their own, if required. This is currently only __udelay_disabled(). Fixes: 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing") Signed-off-by: Heiko Carstens <h...@linux.ibm.com> --- arch/s390/kernel/entry.S | 15 --------------- arch/s390/lib/delay.c | 5 ++--- 2 files changed, 2 insertions(+), 18 deletions(-) diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 26bb0603c5a1..92beb1444644 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -763,12 +763,7 @@ ENTRY(io_int_handler) xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11) TSTMSK __LC_CPU_FLAGS,_CIF_IGNORE_IRQ jo .Lio_restore -#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS) - tmhh %r8,0x300 - jz 1f TRACE_IRQS_OFF -1: -#endif xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15) .Lio_loop: lgr %r2,%r11 # pass pointer to pt_regs @@ -791,12 +786,7 @@ ENTRY(io_int_handler) TSTMSK __LC_CPU_FLAGS,_CIF_WORK jnz .Lio_work .Lio_restore: -#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS) - tm __PT_PSW(%r11),3 - jno 0f TRACE_IRQS_ON -0: -#endif mvc __LC_RETURN_PSW(16),__PT_PSW(%r11) tm __PT_PSW+1(%r11),0x01 # returning to user ? jno .Lio_exit_kernel @@ -976,12 +966,7 @@ ENTRY(ext_int_handler) xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11) TSTMSK __LC_CPU_FLAGS,_CIF_IGNORE_IRQ jo .Lio_restore -#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS) - tmhh %r8,0x300 - jz 1f TRACE_IRQS_OFF -1: -#endif xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15) lgr %r2,%r11 # pass pointer to pt_regs lghi %r3,EXT_INTERRUPT diff --git a/arch/s390/lib/delay.c b/arch/s390/lib/delay.c index daca7bad66de..8c0c68e7770e 100644 --- a/arch/s390/lib/delay.c +++ b/arch/s390/lib/delay.c @@ -33,7 +33,7 @@ EXPORT_SYMBOL(__delay); static void __udelay_disabled(unsigned long long usecs) { - unsigned long cr0, cr0_new, psw_mask, flags; + unsigned long cr0, cr0_new, psw_mask; struct s390_idle_data idle; u64 end; @@ -45,9 +45,8 @@ static void __udelay_disabled(unsigned long long usecs) psw_mask = __extract_psw() | PSW_MASK_EXT | PSW_MASK_WAIT; set_clock_comparator(end); set_cpu_flag(CIF_IGNORE_IRQ); - local_irq_save(flags); psw_idle(&idle, psw_mask); - local_irq_restore(flags); + trace_hardirqs_off(); clear_cpu_flag(CIF_IGNORE_IRQ); set_clock_comparator(S390_lowcore.clock_comparator); __ctl_load(cr0, 0, 0); -- 2.17.1