Re: [PATCH v2 05/12] x86: rework arch_local_irq_restore() to not use popf

2020-11-22 Thread Jürgen Groß via Virtualization

On 22.11.20 22:44, Andy Lutomirski wrote:

On Sat, Nov 21, 2020 at 10:55 PM Jürgen Groß  wrote:


On 20.11.20 12:59, Peter Zijlstra wrote:

On Fri, Nov 20, 2020 at 12:46:23PM +0100, Juergen Gross wrote:

+static __always_inline void arch_local_irq_restore(unsigned long flags)
+{
+if (!arch_irqs_disabled_flags(flags))
+arch_local_irq_enable();
+}


If someone were to write horrible code like:

   local_irq_disable();
   local_irq_save(flags);
   local_irq_enable();
   local_irq_restore(flags);

we'd be up some creek without a paddle... now I don't _think_ we have
genius code like that, but I'd feel saver if we can haz an assertion in
there somewhere...

Maybe something like:

#ifdef CONFIG_DEBUG_ENTRY // for lack of something saner
   WARN_ON_ONCE((arch_local_save_flags() ^ flags) & X86_EFLAGS_IF);
#endif

At the end?


I'd like to, but using WARN_ON_ONCE() in include/asm/irqflags.h sounds
like a perfect receipt for include dependency hell.

We could use a plain asm("ud2") instead.


How about out-of-lining it:

#ifdef CONFIG_DEBUG_ENTRY
extern void warn_bogus_irqrestore();
#endif

static __always_inline void arch_local_irq_restore(unsigned long flags)
{
if (!arch_irqs_disabled_flags(flags)) {
arch_local_irq_enable();
} else {
#ifdef CONFIG_DEBUG_ENTRY
if (unlikely(arch_local_irq_save() & X86_EFLAGS_IF))
 warn_bogus_irqrestore();
#endif
}



This couldn't be a WARN_ON_ONCE() then (or it would be a catch all).
Another approach might be to open-code the WARN_ON_ONCE(), like:

#ifdef CONFIG_DEBUG_ENTRY
extern void warn_bogus_irqrestore(bool *once);
#endif

static __always_inline void arch_local_irq_restore(unsigned long flags)
{
if (!arch_irqs_disabled_flags(flags))
arch_local_irq_enable();
#ifdef CONFIG_DEBUG_ENTRY
{
static bool once;

if (unlikely(arch_local_irq_save() & X86_EFLAGS_IF))
warn_bogus_irqrestore();
}
#endif
}


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: application/pgp-keys


OpenPGP_signature
Description: OpenPGP digital signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-22 Thread Sam Ravnborg
Hi James.

> > > If none of the 140 patches here fix a real bug, and there is no
> > > change to machine code then it sounds to me like a W=2 kind of a
> > > warning.
> > 
> > FWIW, this series has found at least one bug so far:
> > https://lore.kernel.org/lkml/CAFCwf11izHF=g1mGry1fE5kvFFFrxzhPSM6qKAO8gxSp=kr...@mail.gmail.com/
> 
> 
> Well, it's a problem in an error leg, sure, but it's not a really
> compelling reason for a 141 patch series, is it?  All that fixing this
> error will do is get the driver to print "oh dear there's a problem"
> under four more conditions than it previously did.

You are asking the wrong question here.

Yuo should ask  how many hours could have been saved by all the bugs
people have been fighting with and then fixed *before* the code
hit the kernel at all.

My personal experience is that I, more than once, have had errors
related to a missing break in my code. So this warnings is IMO a win.

And if we are only ~100 patches to have it globally enabled then it is a
no-brainer in my book.

Sam
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 05/12] x86: rework arch_local_irq_restore() to not use popf

2020-11-22 Thread Andy Lutomirski
On Sat, Nov 21, 2020 at 10:55 PM Jürgen Groß  wrote:
>
> On 20.11.20 12:59, Peter Zijlstra wrote:
> > On Fri, Nov 20, 2020 at 12:46:23PM +0100, Juergen Gross wrote:
> >> +static __always_inline void arch_local_irq_restore(unsigned long flags)
> >> +{
> >> +if (!arch_irqs_disabled_flags(flags))
> >> +arch_local_irq_enable();
> >> +}
> >
> > If someone were to write horrible code like:
> >
> >   local_irq_disable();
> >   local_irq_save(flags);
> >   local_irq_enable();
> >   local_irq_restore(flags);
> >
> > we'd be up some creek without a paddle... now I don't _think_ we have
> > genius code like that, but I'd feel saver if we can haz an assertion in
> > there somewhere...
> >
> > Maybe something like:
> >
> > #ifdef CONFIG_DEBUG_ENTRY // for lack of something saner
> >   WARN_ON_ONCE((arch_local_save_flags() ^ flags) & X86_EFLAGS_IF);
> > #endif
> >
> > At the end?
>
> I'd like to, but using WARN_ON_ONCE() in include/asm/irqflags.h sounds
> like a perfect receipt for include dependency hell.
>
> We could use a plain asm("ud2") instead.

How about out-of-lining it:

#ifdef CONFIG_DEBUG_ENTRY
extern void warn_bogus_irqrestore();
#endif

static __always_inline void arch_local_irq_restore(unsigned long flags)
{
   if (!arch_irqs_disabled_flags(flags)) {
   arch_local_irq_enable();
   } else {
#ifdef CONFIG_DEBUG_ENTRY
   if (unlikely(arch_local_irq_save() & X86_EFLAGS_IF))
warn_bogus_irqrestore();
#endif
}
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-22 Thread Joe Perches
On Sun, 2020-11-22 at 10:21 -0800, James Bottomley wrote:
> Please tell me
> our reward for all this effort isn't a single missing error print.

There were quite literally dozens of logical defects found
by the fallthrough additions.  Very few were logging only.



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-22 Thread Miguel Ojeda
On Sun, Nov 22, 2020 at 7:22 PM James Bottomley
 wrote:
>
> Well, it's a problem in an error leg, sure, but it's not a really
> compelling reason for a 141 patch series, is it?  All that fixing this
> error will do is get the driver to print "oh dear there's a problem"
> under four more conditions than it previously did.
>
> We've been at this for three years now with nearly a thousand patches,
> firstly marking all the fall throughs with /* fall through */ and later
> changing it to fallthrough.  At some point we do have to ask if the
> effort is commensurate with the protection afforded.  Please tell me
> our reward for all this effort isn't a single missing error print.

It isn't that much effort, isn't it? Plus we need to take into account
the future mistakes that it might prevent, too. So even if there were
zero problems found so far, it is still a positive change.

I would agree if these changes were high risk, though; but they are
almost trivial.

Cheers,
Miguel
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-22 Thread Joe Perches
On Sun, 2020-11-22 at 11:12 -0800, James Bottomley wrote:
> On Sun, 2020-11-22 at 10:25 -0800, Joe Perches wrote:
> > On Sun, 2020-11-22 at 10:21 -0800, James Bottomley wrote:
> > > Please tell me our reward for all this effort isn't a single
> > > missing error print.
> > 
> > There were quite literally dozens of logical defects found
> > by the fallthrough additions.  Very few were logging only.
> 
> So can you give us the best examples (or indeed all of them if someone
> is keeping score)?  hopefully this isn't a US election situation ...

Gustavo?  Are you running for congress now?

https://lwn.net/Articles/794944/


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-22 Thread Kees Cook
On Fri, Nov 20, 2020 at 11:51:42AM -0800, Jakub Kicinski wrote:
> On Fri, 20 Nov 2020 11:30:40 -0800 Kees Cook wrote:
> > On Fri, Nov 20, 2020 at 10:53:44AM -0800, Jakub Kicinski wrote:
> > > On Fri, 20 Nov 2020 12:21:39 -0600 Gustavo A. R. Silva wrote:  
> > > > This series aims to fix almost all remaining fall-through warnings in
> > > > order to enable -Wimplicit-fallthrough for Clang.
> > > > 
> > > > In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
> > > > add multiple break/goto/return/fallthrough statements instead of just
> > > > letting the code fall through to the next case.
> > > > 
> > > > Notice that in order to enable -Wimplicit-fallthrough for Clang, this
> > > > change[1] is meant to be reverted at some point. So, this patch helps
> > > > to move in that direction.
> > > > 
> > > > Something important to mention is that there is currently a discrepancy
> > > > between GCC and Clang when dealing with switch fall-through to empty 
> > > > case
> > > > statements or to cases that only contain a break/continue/return
> > > > statement[2][3][4].  
> > > 
> > > Are we sure we want to make this change? Was it discussed before?
> > > 
> > > Are there any bugs Clangs puritanical definition of fallthrough helped
> > > find?
> > > 
> > > IMVHO compiler warnings are supposed to warn about issues that could
> > > be bugs. Falling through to default: break; can hardly be a bug?!  
> > 
> > It's certainly a place where the intent is not always clear. I think
> > this makes all the cases unambiguous, and doesn't impact the machine
> > code, since the compiler will happily optimize away any behavioral
> > redundancy.
> 
> If none of the 140 patches here fix a real bug, and there is no change
> to machine code then it sounds to me like a W=2 kind of a warning.

FWIW, this series has found at least one bug so far:
https://lore.kernel.org/lkml/CAFCwf11izHF=g1mGry1fE5kvFFFrxzhPSM6qKAO8gxSp=kr...@mail.gmail.com/

-- 
Kees Cook
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: netconsole deadlock with virtnet

2020-11-22 Thread Leon Romanovsky
On Thu, Nov 19, 2020 at 01:55:53PM +0100, Petr Mladek wrote:
> On Tue 2020-11-17 09:33:25, Steven Rostedt wrote:
> > On Tue, 17 Nov 2020 12:23:41 +0200
> > Leon Romanovsky  wrote:
> >
> > > Hi,
> > >
> > > Approximately two weeks ago, our regression team started to experience 
> > > those
> > > netconsole splats. The tested code is Linus's master (-rc4) + netdev 
> > > net-next
> > > + netdev net-rc.
> > >
> > > Such splats are random and we can't bisect because there is no stable 
> > > reproducer.
> > >
> > > Any idea, what is the root cause?
> > >
> > > [   21.149739]   __do_sys_finit_module+0xbc/0x12c
> > > [   21.149740]   __arm64_sys_finit_module+0x28/0x34
> > > [   21.149741]   el0_svc_common.constprop.0+0x84/0x200
> > > [   21.149742]   do_el0_svc+0x2c/0x90
> > > [   21.149743]   el0_svc+0x18/0x50
> > > [   21.149744]   el0_sync_handler+0xe0/0x350
> > > [   21.149745]   el0_sync+0x158/0x180
> > > [   21.149746]  }
> > > [   21.149747]  ... key  at: [] 
> > > target_list_lock+0x18/0xf000 [netconsole]
> > > [   21.149748]  ..
> > > [   21.149750] Lost 190 message(s)!
> >
> > It really sucks that we lose 190 messages that would help to decipher this
> > more. :-p
>
> The message commes from the printk_safe code. The size can be
> increased by CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT.
>
> > Because I'm not sure where the xmit_lock is taken while holding the
> > target_list_lock. But the above does show that printk() calls write_msg()
> > while holding the console_lock, and write_msg() takes the target_list_lock.
> >
> > Thus, the fix would ether require disabling interrupts every time the
> > xmit_lock is taken, or to get it from being taken while holding the
> > target_list_lock.
>
> It seems that the missing messages might help to find the root of
> the problem.

Sorry for not being very responsive, I was in internet-free zone :).

I'll increase CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT from 13 to be 26, let's
see what night run will give us.

Thanks

>
> Best Regards,
> Petr
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization