RE: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

2014-11-18 Thread Luck, Tony

> Your test case is presumably doing something that involves setting
> undocumented registers* to program the CPU or memory controller to
> generate a machine check on access to some address.  Presumably this
> is done by broadcasting an SMI and programming the registers in SMM.

Good theory - but not quite how it works.  The ACPI/EINJ table does trigger
an SMI so the BIOS can do the injection.  What BIOS actually does is to play
with the memory controller so that the next write to the target address will
flip some ECC bits in an unnatural way (to either plant a correctable error
with just one bit flipped, or a UC error with two bits flipped).  Then the SMI
returns.

Then my application reads the target address, and we see CMCI or MCE
when the ECC check fails.

Hopefully this keeps the SMI path decoupled from the MCE ... I even sleep
a little after injection and before consumption just in case there are any
stragglers late returning from the (broadcast) SMI that planted the error.
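
A rough sketch of that inject / wait / consume sequence from user space, in
case it helps anyone reproduce it.  Assumptions, none of which come from my
actual test program: the kernel's EINJ debugfs files (normally under
/sys/kernel/debug/apei/einj) are used, with the error type values and the
param2 mask taken from the kernel's EINJ documentation; the helper names are
made up; and mapping the physical address into the reader is left out.

```c
#include <stdio.h>
#include <unistd.h>

/* Error type values as listed in the kernel's EINJ documentation. */
#define EINJ_MEM_CORRECTABLE   0x8ULL   /* single-bit, corrected by ECC */
#define EINJ_MEM_UNCORRECTABLE 0x10ULL  /* uncorrectable, non-fatal */

/* Write one hex value to <dir>/<name>.  Returns 0 on success, -1 on error. */
int einj_write(const char *dir, const char *name, unsigned long long val)
{
    char path[256];
    snprintf(path, sizeof(path), "%s/%s", dir, name);
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fprintf(f, "0x%llx\n", val) > 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

/* Ask the platform to plant an error of the given type at paddr.
 * dir is a parameter so the sequence can be dry-run against a scratch
 * directory instead of the real debugfs mount. */
int einj_plant(const char *dir, unsigned long long type,
               unsigned long long paddr)
{
    if (einj_write(dir, "error_type", type))
        return -1;
    if (einj_write(dir, "param1", paddr))                  /* target address */
        return -1;
    if (einj_write(dir, "param2", 0xfffffffffffff000ULL))  /* address mask */
        return -1;
    return einj_write(dir, "error_inject", 1);
}

/* Sleep so stragglers can return from the SMI, then read the poisoned
 * location.  The read is what would raise the CMCI or MCE. */
unsigned char consume_after_delay(const volatile unsigned char *target,
                                  unsigned int delay_sec)
{
    sleep(delay_sec);
    return *target;
}
```

On a real system this would be something like
einj_plant("/sys/kernel/debug/apei/einj", EINJ_MEM_UNCORRECTABLE, paddr)
as root, followed by consume_after_delay() on a mapping of the same page.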

-Tony

Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

2014-11-18 Thread Andy Lutomirski

On Tue, Nov 18, 2014 at 10:30 AM, Luck, Tony <tony.l...@intel.com> wrote:
>>> The lost cpu is *really* lost.  Warm reset doesn't fix the machine, I
>>> usually have to do a full power cycle.
>
>> How is it even possible that I did that with a few lines of asm?
>
> Probably not directly your fault - some cascade of errors may have
> occurred.

I went and read the manual.  Here's a hypothesis:

Your test case is presumably doing something that involves setting
undocumented registers* to program the CPU or memory controller to
generate a machine check on access to some address.  Presumably this
is done by broadcasting an SMI and programming the registers in SMM.

Now SMM is rather strange.  The docs list a large set of interrupt
sources that are disabled on SMM entry, and this list does not include
#MC.  So presumably #MC is actually left enabled on entry to SMM.
That means that, unless SMRAM has an interrupt table that has a
working machine check handler (which seems highly unlikely), then
there is at least some window in which a #MC delivered in SMM will
cause some kind of failure.  This could really happen: a broadcast #MC
could easily race a broadcast SMI and do this.

If you crash your SMM code, then I wouldn't be at all surprised if the
CPU wedges hard enough that even your remote management thing can't
reset it.

* These are probably the registers that are supposed to be documented
in volume 2 section 4.4.9 of the Xeon E5 1600/2600 datasheet,
reference 326509-003, but the docs are extremely incomplete.

--Andy

>
>> Could this be a hardware bug?  Is there some condition that causes #MC
>> delivery to wedge hard enough that even INIT/RESET stops working?  Or
>> possibly some CPU got stuck in SMM -- I have no idea what warm reset
>> does these days.
>
> I'm not even sure what kind of reset the remote management i/f I used
> actually applied.
>
>> Here's the patch to improve the timeout messages, but given the degree
>> of wedgedness, I can guess what it'll say:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/paranoid&id=e5cbd9d141bde651ecb20f0b65ad13bcef2468d0
>
> Heh - I'd already put in some hacky printk()s to do similar. Mine aren't
> upstream quality, but do print the value of mce_callin/mce_executing
> as appropriate.  But I got some confusing results - reporter complained
> that only 142 of 144 had shown up. So two threads missing, maybe means
> one core went into h/w shutdown.  Need to dig further to see if the
> missing duo really are from the same core.
>
> -Tony

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

2014-11-18 Thread Luck, Tony

>> The lost cpu is *really* lost.  Warm reset doesn't fix the machine, I usually
>> have to do a full power cycle.

> How is it even possible that I did that with a few lines of asm?

Probably not directly your fault - some cascade of errors may have
occurred.

> Could this be a hardware bug?  Is there some condition that causes #MC
> delivery to wedge hard enough that even INIT/RESET stops working?  Or
> possibly some CPU got stuck in SMM -- I have no idea what warm reset
> does these days.

I'm not even sure what kind of reset the remote management i/f I used
actually applied.

> Here's the patch to improve the timeout messages, but given the degree
> of wedgedness, I can guess what it'll say:
>
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/paranoid&id=e5cbd9d141bde651ecb20f0b65ad13bcef2468d0

Heh - I'd already put in some hacky printk()s to do similar. Mine aren't
upstream quality, but do print the value of mce_callin/mce_executing
as appropriate.  But I got some confusing results - reporter complained
that only 142 of 144 had shown up. So two threads missing, maybe means
one core went into h/w shutdown.  Need to dig further to see if the
missing duo really are from the same core.
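
For the "same core" question, one way to check from user space might look
like the sketch below.  It assumes the usual sysfs topology files
(physical_package_id and core_id under cpuN/topology) are present; the
function names are invented for illustration, and the root directory is a
parameter only so the check can be tried against a copy of the tree.

```c
#include <stdio.h>

/* Read one integer topology attribute (e.g. "core_id") for a logical CPU.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
int read_topology(const char *root, int cpu, const char *attr, int *out)
{
    char path[256];
    snprintf(path, sizeof(path), "%s/cpu%d/topology/%s", root, cpu, attr);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int ok = fscanf(f, "%d", out) == 1;
    fclose(f);
    return ok ? 0 : -1;
}

/* Returns 1 if logical CPUs a and b sit on the same package and core
 * (i.e. they are hyperthread siblings), 0 if not, -1 on read error. */
int same_core(const char *root, int a, int b)
{
    int pkg_a, pkg_b, core_a, core_b;

    if (read_topology(root, a, "physical_package_id", &pkg_a) ||
        read_topology(root, b, "physical_package_id", &pkg_b) ||
        read_topology(root, a, "core_id", &core_a) ||
        read_topology(root, b, "core_id", &core_b))
        return -1;

    return pkg_a == pkg_b && core_a == core_b;
}
```

With the two missing CPU numbers in hand, a call such as
same_core("/sys/devices/system/cpu", a, b) returning 1 would fit the
"one core went into shutdown" theory.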

-Tony

Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace

2014-11-18 Thread Borislav Petkov

On Mon, Nov 17, 2014 at 12:05:59PM -0800, Andy Lutomirski wrote:
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=x86/paranoid
> 
> I'm not quite ready to send v3.  I want to do two things first:
> 
> 1. Consider disabling the stack switch for double_fault.

Sounds conservatively nice :)

> 2. Clean up the macros.  I'll validate this by ensuring that the
> generated code is identical to the current version.
> 
> IOW, I don't expect the asm for machine_check to change.

Right, so I don't see anything wrong with the patch but entry_64.S is
nasty so don't take my word for it.

I guess handlers will now need to do "if (user_mode_vm(regs))" after your
change, in order to know what to do.
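
Something like the following is what I have in mind, as a stand-alone
user-space sketch rather than kernel code.  The struct, enum and function
names are made up; the check only approximates what user_mode_vm() does on
64-bit (nonzero privilege bits in the saved CS); and what each branch should
actually do is not settled in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's struct pt_regs; only the saved code-segment
 * selector matters here. */
struct entry_regs {
    uint64_t cs;
};

/* Approximation of the 64-bit user_mode_vm() test: the low two bits of CS
 * hold the privilege level, and anything above ring 0 is user mode. */
bool entered_from_user(const struct entry_regs *regs)
{
    return (regs->cs & 3) != 0;
}

enum entry_context {
    ENTRY_FROM_KERNEL,  /* assumed: handler still runs on the IST stack */
    ENTRY_FROM_USER     /* assumed: handler runs after the stack switch */
};

/* A handler would branch on this to decide how much it is allowed to do. */
enum entry_context classify_entry(const struct entry_regs *regs)
{
    return entered_from_user(regs) ? ENTRY_FROM_USER : ENTRY_FROM_KERNEL;
}
```

For example, a saved CS of 0x33 (the usual 64-bit user code selector)
classifies as ENTRY_FROM_USER, and 0x10 (kernel code) as ENTRY_FROM_KERNEL.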

I'll try to do some error injection here too, on my boxes to see how
this behaves.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
