Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2016-03-12 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> On Thu, Oct 1, 2015 at 12:15 AM, Ingo Molnar  wrote:
> >
> > * Andy Lutomirski  wrote:
> >
> >> > These could still be open coded in an inlined fashion, like the 
> >> > scheduler usage.
> >>
> >> We could have a raw_rdmsr for those.
> >>
> >> OTOH, I'm still not 100% convinced that this warn-but-don't-die behavior is
> >> worth the effort.  This isn't a frequent source of bugs to my knowledge, 
> >> and we
> >> don't try to recover from incorrect cr writes, out-of-bounds MMIO, etc, so 
> >> do we
> >> really gain much by rigging a recovery mechanism for rdmsr and wrmsr 
> >> failures
> >> for code that doesn't use the _safe variants?
> >
> > It's just the general principle really: don't crash the kernel on bootup. 
> > There's
> > few things more user hostile than that.
> >
> > Also, this would maintain the status quo: since we now (accidentally) don't 
> > crash
> > the kernel on distro kernels (but silently and unsafely ignore the faulting
> > instruction), we should not regress that behavior (by adding the chance to 
> > crash
> > again), but improve upon it.
> 
> Just a heads up: the extable improvements in tip:ras/core make it
> straightforward to get the best of all worlds: explicit failure
> handling (written in C!), no fast path overhead whatsoever, and no new
> garbage in the exception handlers.

I _knew_ I should have merged them into tip:x86/mm, not tip:ras/core ;-)

I had a quick look at your new MSR series and I'm very happy with that 
direction!

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2016-03-11 Thread Andy Lutomirski
On Thu, Oct 1, 2015 at 12:15 AM, Ingo Molnar  wrote:
>
> * Andy Lutomirski  wrote:
>
>> > These could still be open coded in an inlined fashion, like the scheduler 
>> > usage.
>>
>> We could have a raw_rdmsr for those.
>>
>> OTOH, I'm still not 100% convinced that this warn-but-don't-die behavior is
>> worth the effort.  This isn't a frequent source of bugs to my knowledge, and 
>> we
>> don't try to recover from incorrect cr writes, out-of-bounds MMIO, etc, so 
>> do we
>> really gain much by rigging a recovery mechanism for rdmsr and wrmsr failures
>> for code that doesn't use the _safe variants?
>
> It's just the general principle really: don't crash the kernel on bootup. 
> There's
> few things more user hostile than that.
>
> Also, this would maintain the status quo: since we now (accidentally) don't 
> crash
> the kernel on distro kernels (but silently and unsafely ignore the faulting
> instruction), we should not regress that behavior (by adding the chance to 
> crash
> again), but improve upon it.

Just a heads up: the extable improvements in tip:ras/core make it
straightforward to get the best of all worlds: explicit failure
handling (written in C!), no fast path overhead whatsoever, and no new
garbage in the exception handlers.

Patches coming once I test them.

>
> Thanks,
>
> Ingo



-- 
Andy Lutomirski
AMA Capital Management, LLC

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-10-01 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> > These could still be open coded in an inlined fashion, like the scheduler 
> > usage.
> 
> We could have a raw_rdmsr for those.
> 
> OTOH, I'm still not 100% convinced that this warn-but-don't-die behavior is 
> worth the effort.  This isn't a frequent source of bugs to my knowledge, and 
> we 
> don't try to recover from incorrect cr writes, out-of-bounds MMIO, etc, so do 
> we 
> really gain much by rigging a recovery mechanism for rdmsr and wrmsr failures 
> for code that doesn't use the _safe variants?

It's just the general principle really: don't crash the kernel on bootup. 
There's 
few things more user hostile than that.

Also, this would maintain the status quo: since we now (accidentally) don't 
crash 
the kernel on distro kernels (but silently and unsafely ignore the faulting 
instruction), we should not regress that behavior (by adding the chance to 
crash 
again), but improve upon it.

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-30 Thread Andy Lutomirski
On Wed, Sep 30, 2015 at 7:01 AM, Ingo Molnar  wrote:
>
> * Peter Zijlstra  wrote:
>
>> On Mon, Sep 21, 2015 at 09:36:15AM -0700, Linus Torvalds wrote:
>> > On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
>> > >
>> > > Linus, what's your preference?
>> >
>> > So quite frankly, is there any reason we don't just implement
>> > native_read_msr() as just
>> >
>> >unsigned long long native_read_msr(unsigned int msr)
>> >{
>> >   int err;
>> >   unsigned long long val;
>> >
>> >   val = native_read_msr_safe(msr, );
>> >   WARN_ON_ONCE(err);
>> >   return val;
>> >}
>> >
>> > Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
>> > done with it. I don't see the downside.
>> >
>> > How many msr reads are so critical that the function call
>> > overhead would matter? Get rid of the inline version of the _safe()
>> > thing too, and put that thing there too.
>>
>> There are a few in the perf code, and esp. on cores without a stack engine 
>> the
>> call overhead is noticeable. Also note that the perf MSRs are generally
>> optimized MSRs and less slow (we cannot say fast, they're still MSRs) than
>> regular MSRs.
>
> These could still be open coded in an inlined fashion, like the scheduler 
> usage.
>

We could have a raw_rdmsr for those.

OTOH, I'm still not 100% convinced that this warn-but-don't-die
behavior is worth the effort.  This isn't a frequent source of bugs to
my knowledge, and we don't try to recover from incorrect cr writes,
out-of-bounds MMIO, etc, so do we really gain much by rigging a
recovery mechanism for rdmsr and wrmsr failures for code that doesn't
use the _safe variants?

--Andy

> Thanks,
>
> Ingo



-- 
Andy Lutomirski
AMA Capital Management, LLC

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-30 Thread H. Peter Anvin
On 09/21/2015 09:36 AM, Linus Torvalds wrote:
> 
> How many msr reads are so critical that the function call
> overhead would matter? Get rid of the inline version of the _safe()
> thing too, and put that thing there too.
> 

Probably only the ones that may go in the context switch path.

-hpa



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-30 Thread Peter Zijlstra
On Mon, Sep 21, 2015 at 09:36:15AM -0700, Linus Torvalds wrote:
> On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
> >
> > Linus, what's your preference?
> 
> So quite frankly, is there any reason we don't just implement
> native_read_msr() as just
> 
>unsigned long long native_read_msr(unsigned int msr)
>{
>   int err;
>   unsigned long long val;
> 
>   val = native_read_msr_safe(msr, );
>   WARN_ON_ONCE(err);
>   return val;
>}
> 
> Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
> done with it. I don't see the downside.
> 
> How many msr reads are so critical that the function call
> overhead would matter? Get rid of the inline version of the _safe()
> thing too, and put that thing there too.

There are a few in the perf code, and esp. on cores without a stack
engine the call overhead is noticeable. Also note that the perf MSRs are
generally optimized MSRs and less slow (we cannot say fast, they're
still MSRs) than regular MSRs.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-30 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Mon, Sep 21, 2015 at 09:36:15AM -0700, Linus Torvalds wrote:
> > On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
> > >
> > > Linus, what's your preference?
> > 
> > So quite frankly, is there any reason we don't just implement
> > native_read_msr() as just
> > 
> >unsigned long long native_read_msr(unsigned int msr)
> >{
> >   int err;
> >   unsigned long long val;
> > 
> >   val = native_read_msr_safe(msr, );
> >   WARN_ON_ONCE(err);
> >   return val;
> >}
> > 
> > Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
> > done with it. I don't see the downside.
> > 
> > How many msr reads are so critical that the function call
> > overhead would matter? Get rid of the inline version of the _safe()
> > thing too, and put that thing there too.
> 
> There are a few in the perf code, and esp. on cores without a stack engine 
> the 
> call overhead is noticeable. Also note that the perf MSRs are generally 
> optimized MSRs and less slow (we cannot say fast, they're still MSRs) than 
> regular MSRs.

These could still be open coded in an inlined fashion, like the scheduler usage.

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-22 Thread Ingo Molnar

* Linus Torvalds  wrote:

> On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
> >
> > Linus, what's your preference?
> 
> So quite frankly, is there any reason we don't just implement
> native_read_msr() as just
> 
>unsigned long long native_read_msr(unsigned int msr)
>{
>   int err;
>   unsigned long long val;
> 
>   val = native_read_msr_safe(msr, );
>   WARN_ON_ONCE(err);
>   return val;
>}
> 
> Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
> done with it. I don't see the downside.

Absolutely!

> How many msr reads are so critical that the function call overhead 
> would 
> matter? Get rid of the inline version of the _safe() thing too, and put that 
> thing there too.

Only a very low number of them is performance critical (because even 
hw-accelerated MSR accesses are generally slow so we try to avoid MSR accesses 
in 
fast paths as much as possible, via shadowing, etc.) - and in the few cases 
where 
we have to access an MSR in a fast path we can do those separately.

I'm only worried about the 'default' APIs, i.e. rdmsr() that is used throughout 
arch/x86/ over a hundred times, not about performance critical code paths that 
get 
enough testing and enough attention in general.

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-22 Thread Paolo Bonzini


On 21/09/2015 19:43, Andy Lutomirski wrote:
> And maybe the KVM user return notifier.

No, not really.  If anything, the place in KVM where it makes a
difference is vmx_save_host_state, which is also only using
always-present MSRs.  But don't care about KVM.

First clean it up, then we can add back inline versions like __rdmsr or
rdmsr_fault or rdmsr_unsafe or whatever.

Paolo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Paolo Bonzini


On 21/09/2015 10:46, Ingo Molnar wrote:
> Or we could extend exception table entry encoding to include a 'warning bit', 
> to 
> not bloat the kernel. If the exception handler code encounters such an 
> exception 
> it would generate a one-time warning for that entry, but otherwise not crash 
> the 
> kernel and continue execution with an all-zeroes result for the MSR read.

The 'warning bit' already exists, it is the opcode that caused the fault. :)

The concern about bloat is a good one.  However, why is it necessary to
keep native_*_msr* inline?  If they are moved out-of-line, using the
exception table becomes the obvious solution and doesn't cause bloat
anymore.

Paolo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Linus Torvalds
On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
>
> Linus, what's your preference?

So quite frankly, is there any reason we don't just implement
native_read_msr() as just

   unsigned long long native_read_msr(unsigned int msr)
   {
  int err;
  unsigned long long val;

  val = native_read_msr_safe(msr, );
  WARN_ON_ONCE(err);
  return val;
   }

Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
done with it. I don't see the downside.

How many msr reads are so critical that the function call
overhead would matter? Get rid of the inline version of the _safe()
thing too, and put that thing there too.

  Linus

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Linus Torvalds
On Mon, Sep 21, 2015 at 9:49 AM, Arjan van de Ven  wrote:
>>
>> How many msr reads are so critical that the function call
>> overhead would matter?
>
> if anything qualifies it'd be switch_to() and friends.

Is there anything else than the FS/GS_BASE thing (possibly hidden
behind inlines etc that I didn't get from a quick grep)? And why is
that sometimes using the "safe" version (in do_arch_prctl()), and
sometimes not (switch_to())?

I'm not convinced that mess is a good argument for the status quo ;)

> note that I'm not entirely happy about the notion of "safe" MSRs.
> They're safe as in "won't fault".

I wouldn't object to renaming them.

Linus

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Andy Lutomirski
On Mon, Sep 21, 2015 at 9:36 AM, Linus Torvalds
 wrote:
> On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
>>
>> Linus, what's your preference?
>
> So quite frankly, is there any reason we don't just implement
> native_read_msr() as just
>
>unsigned long long native_read_msr(unsigned int msr)
>{
>   int err;
>   unsigned long long val;
>
>   val = native_read_msr_safe(msr, );
>   WARN_ON_ONCE(err);
>   return val;
>}
>
> Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
> done with it. I don't see the downside.

In the interest of sanity, I want to drop the "native_", too, since
there appear to be few or no good use cases for native_read_msr as
such.  I'm tempted to add new functions read_msr and write_msr that
forward to rdmsrl_safe and wrmsrl_safe.

It looks like the msr helpers are every bit as bad as the TSC helpers
used to be :(

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Andy Lutomirski
On Mon, Sep 21, 2015 at 9:49 AM, Arjan van de Ven  wrote:
> On 9/21/2015 9:36 AM, Linus Torvalds wrote:
>>
>> On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:
>>>
>>>
>>> Linus, what's your preference?
>>
>>
>> So quite frankly, is there any reason we don't just implement
>> native_read_msr() as just
>>
>> unsigned long long native_read_msr(unsigned int msr)
>> {
>>int err;
>>unsigned long long val;
>>
>>val = native_read_msr_safe(msr, );
>>WARN_ON_ONCE(err);
>>return val;
>> }
>>
>> Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
>> done with it. I don't see the downside.
>>
>> How many msr reads are so critical that the function call
>> overhead would matter?
>
>
> if anything qualifies it'd be switch_to() and friends.

And maybe the KVM user return notifier.  Unfortunately, switch_to
might gain another two MSR accesses at some point if we decide to fix
the bugs in there.  Sigh.

>
> note that I'm not entirely happy about the notion of "safe" MSRs.
> They're safe as in "won't fault".
> Reading random MSRs isn't a generic safe operation though, but the name sort
> of gives people
> the impression that it is. Even with _safe variants, you still need to KNOW
> the MSR exists (by means
> of CPUID or similar) unfortunately.
>

I tend to agree.

Anyway, the fully out-of-line approach isn't obviously a bad idea, and
it simplifies the whole mess (we can drop most of the paravirt
patches, too).  I'll give it a try and see what happens.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Arjan van de Ven

On 9/21/2015 9:36 AM, Linus Torvalds wrote:

On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar  wrote:


Linus, what's your preference?


So quite frankly, is there any reason we don't just implement
native_read_msr() as just

unsigned long long native_read_msr(unsigned int msr)
{
   int err;
   unsigned long long val;

   val = native_read_msr_safe(msr, );
   WARN_ON_ONCE(err);
   return val;
}

Note: no inline, no nothing. Just put it in arch/x86/lib/msr.c, and be
done with it. I don't see the downside.

How many msr reads are so critical that the function call
overhead would matter?


if anything qualifies it'd be switch_to() and friends.

note that I'm not entirely happy about the notion of "safe" MSRs.
They're safe as in "won't fault".
Reading random MSRs isn't a generic safe operation though, but the name sort of 
gives people
the impression that it is. Even with _safe variants, you still need to KNOW the 
MSR exists (by means
of CPUID or similar) unfortunately.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Linus Torvalds
On Mon, Sep 21, 2015 at 11:16 AM, Andy Lutomirski  wrote:
>
> In the interest of sanity, I want to drop the "native_", too

Yes. I think the only reason it exists is to have that wrapper layer
for PV. And that argument just goes away if you just make the
non-inline helper function do all the PV logic directly.

I really suspect we should do this for a *lot* of the PV ops. Yeah,
some are so performance-critical that we probably do have a good
reason for the inline indirections etc (historical example: native
spin-unlock, which traditionally could be done as a single store
instruction), but I suspect a lot of the PV indirection is for this
kind of "historical wrapper model" reason, and it often makes it
really hard to see what is going on because you have to go through
several layers of indirection, often in different files.

  Linus

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Borislav Petkov
On Mon, Sep 21, 2015 at 11:16:30AM -0700, Andy Lutomirski wrote:
> In the interest of sanity, I want to drop the "native_", too, since
> there appear to be few or no good use cases for native_read_msr as
> such.  I'm tempted to add new functions read_msr and write_msr that
> forward to rdmsrl_safe and wrmsrl_safe.

Just change the msr_read/msr_write() ones in arch/x86/lib/msr.c to take
a u64 and you're there.

> It looks like the msr helpers are every bit as bad as the TSC helpers
> used to be :(

Yap.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> On Sep 20, 2015 5:15 PM, "Linus Torvalds"  
> wrote:
> >
> > On Sun, Sep 20, 2015 at 5:02 PM, Andy Lutomirski  wrote:
> > > This demotes an OOPS and likely panic due to a failed non-"safe" MSR
> > > access to a WARN_ON_ONCE and a return of zero (in the RDMSR case).
> > > We still write a pr_info entry unconditionally for debugging.
> >
> > No, this is wrong.
> >
> > If you really want to do something like this, then just make all MSR reads 
> > safe. So the only difference between "safe" and "unsafe" is that the unsafe 
> > version just doesn't check the return value, and silently just returns zero 
> > for reads (or writes nothing).
> >
> > To quote Obi-Wan: "Use the exception table, Luke".
> >
> > Because decoding instructions is just too ugly. We'll do it for CPU errata 
> > where we might have to do it for user space code too (ie the AMD prefetch 
> > mess), but for code that _we_ control? Hell no.
> >
> > So NAK on this.
> 
> My personal preference is to just not do this at all.  A couple people 
> disagree.  
> If we make the unsafe variants not oops, then I think we want to have the 
> nice 
> loud warning, since these issues are bugs if they happen.
> 
> We could certainly use the exception table for this, but it'll result in 
> bigger 
> core, since each MSR access will need an exception table entry and an 
> associated 
> fixup to call some helper that warns and sets the result to zero.
> 
> I'd be happy to implement that, but only if it'll be applied. Otherwise I'd 
> rather just drop this patch and keep the rest of the series.

Linus, what's your preference?

Due to the bug mentioned earlier in this thread all MSR reads are currently 
'safe' 
on all the major Linux distros (which all have CONFIG_PARAVIRT=y), i.e. by 
'fixing' them we'd reintroduce random crashes into various fragile pieces of 
code...

To add insult to injury, the current 'silently safe by accident' MSR code isn't 
so 
safe: because it leaves the result of the read uninitialized...

To fix this all I'd really like to have:

 - safe MSR reads by default (i.e. never boot crash the kernel on some rare 
   condition - which to most users is either a silent boot hang or an instant 
   restart). Historicaly we had a stream of 'silly boot crashes' due to MSR 
reads 
   that generate a #GPF. They make Linux less usable around the edges, 
especially 
   in the x86 non-server (desktop) space where most hardware vendors are either 
   openly Linux hostile, or, at best, Linux oblivious.

 - proper result-zeroing behavior on exceptions

 - and we should also generate _some_ sort of warning when MSR exceptions happen
   in an 'unintended' fashion.

Maybe the warning could be put under a (default-enabled) config option for the 
size conscious.

Or we could extend exception table entry encoding to include a 'warning bit', 
to 
not bloat the kernel. If the exception handler code encounters such an 
exception 
it would generate a one-time warning for that entry, but otherwise not crash 
the 
kernel and continue execution with an all-zeroes result for the MSR read.

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-20 Thread Andy Lutomirski
This demotes an OOPS and likely panic due to a failed non-"safe" MSR
access to a WARN_ON_ONCE and a return of zero (in the RDMSR case).
We still write a pr_info entry unconditionally for debugging.

To be clear, this type of failure should *not* happen.  This patch
exists to minimize the chance of nasty undebuggable failures due on
systems that used to work due to a now-fixed CONFIG_PARAVIRT=y bug.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/traps.c | 55 +
 1 file changed, 55 insertions(+)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 346eec73f7db..f82987643e32 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -437,6 +437,58 @@ exit_trap:
do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, NULL);
 }
 
+static bool paper_over_kernel_gpf(struct pt_regs *regs)
+{
+   /*
+* Try to decode the opcode that failed.  So far, we only care
+* about boring two-byte unprefixed opcodes, so we don't need
+* the full instruction decoder machinery.
+*/
+   u16 opcode;
+
+   if (probe_kernel_read(, (const void *)regs->ip, sizeof(opcode)))
+   return false;
+
+   if (opcode == 0x320f) {
+   /* RDMSR */
+   pr_info("bad kernel RDMSR from non-existent MSR 0x%x",
+   (unsigned int)regs->cx);
+   if (!panic_on_oops) {
+   WARN_ON_ONCE(true);
+
+   /*
+* Pretend that RDMSR worked and returned zero.  We
+* chose zero because zero seems less likely to
+* cause further malfunctions than any other value.
+*/
+   regs->ax = 0;
+   regs->dx = 0;
+   regs->ip += 2;
+   return true;
+   } else {
+   /* Don't fix it up. */
+   return false;
+   }
+   } else if (opcode == 0x300f) {
+   /* WRMSR */
+   pr_info("bad kernel WRMSR writing 0x%08x%08x to MSR 0x%x",
+   (unsigned int)regs->dx, (unsigned int)regs->ax,
+   (unsigned int)regs->cx);
+   if (!panic_on_oops) {
+   WARN_ON_ONCE(true);
+
+   /* Pretend it worked and carry on. */
+   regs->ip += 2;
+   return true;
+   } else {
+   /* Don't fix it up. */
+   return false;
+   }
+   }
+
+   return false;
+}
+
 dotraplinkage void
 do_general_protection(struct pt_regs *regs, long error_code)
 {
@@ -456,6 +508,9 @@ do_general_protection(struct pt_regs *regs, long error_code)
if (fixup_exception(regs))
return;
 
+   if (paper_over_kernel_gpf(regs))
+   return;
+
tsk->thread.error_code = error_code;
tsk->thread.trap_nr = X86_TRAP_GP;
if (notify_die(DIE_GPF, "general protection fault", regs, 
error_code,
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-20 Thread Andy Lutomirski
On Sep 20, 2015 5:15 PM, "Linus Torvalds"  wrote:
>
> On Sun, Sep 20, 2015 at 5:02 PM, Andy Lutomirski  wrote:
> > This demotes an OOPS and likely panic due to a failed non-"safe" MSR
> > access to a WARN_ON_ONCE and a return of zero (in the RDMSR case).
> > We still write a pr_info entry unconditionally for debugging.
>
> No, this is wrong.
>
> If you really want to do something like this, then just make all MSR
> reads safe. So the only difference between "safe" and "unsafe" is that
> the unsafe version just doesn't check the return value, and silently
> just returns zero for reads (or writes nothing).
>
> To quote Obi-Wan: "Use the exception table, Luke".
>
> Because decoding instructions is just too ugly. We'll do it for CPU
> errata where we might have to do it for user space code too (ie the
> AMD prefetch mess), but for code that _we_ control? Hell no.
>
> So NAK on this.

My personal preference is to just not do this at all.  A couple people
disagree.  If we make the unsafe variants not oops, then I think we
want to have the nice loud warning, since these issues are bugs if
they happen.

We could certainly use the exception table for this, but it'll result
in bigger core, since each MSR access will need an exception table
entry and an associated fixup to call some helper that warns and sets
the result to zero.

I'd be happy to implement that, but only if it'll be applied.
Otherwise I'd rather just drop this patch and keep the rest of the
series.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-20 Thread Linus Torvalds
On Sun, Sep 20, 2015 at 5:02 PM, Andy Lutomirski  wrote:
> This demotes an OOPS and likely panic due to a failed non-"safe" MSR
> access to a WARN_ON_ONCE and a return of zero (in the RDMSR case).
> We still write a pr_info entry unconditionally for debugging.

No, this is wrong.

If you really want to do something like this, then just make all MSR
reads safe. So the only difference between "safe" and "unsafe" is that
the unsafe version just doesn't check the return value, and silently
just returns zero for reads (or writes nothing).

To quote Obi-Wan: "Use the exception table, Luke".

Because decoding instructions is just too ugly. We'll do it for CPU
errata where we might have to do it for user space code too (ie the
AMD prefetch mess), but for code that _we_ control? Hell no.

So NAK on this.

   Linus

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel