Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Jan Beulich
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.

That's not all that unlikely - remember that the change was prompted
by the XSA-110 fix. There CS pieces being in a bad state would get
corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.

That's not even needed afaict (and if it really is, it can't be all that
difficult/intrusive): Did you observe what you attempt to fix here in
practice, or is this just from theoretical considerations? I ask because
I don't think it can actually happen, as the second time we get here
the guest ought to be in kernel mode (due to the exception injection)
and hence would get crashed anyway.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Tim Deegan
At 10:08 + on 25 Nov (1416906538), Andrew Cooper wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
 As a result, injecting a fault and retrying the the vmentry is likely to fail
 in the same way.

In particular, the guest's privilege level won't change until _after_
the next vm entry has succeeded.

 While crashing a guest because userspace tickled a hypervisor bug to get up
 invalid VMCS state is bad (and usually warrants an XSA), it is better than the
 infinite loop caused by this change, and the non-ratelimited console output it
 would cause.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Reviewed-by: Tim Deegan t...@xen.org

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Andrew Cooper
On 25/11/14 10:42, Jan Beulich wrote:
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.
 That's not all that unlikely - remember that the change was prompted
 by the XSA-110 fix. There CS pieces being in a bad state would get
 corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.
 That's not even needed afaict (and if it really is, it can't be all that
 difficult/intrusive): Did you observe what you attempt to fix here in
 practice, or is this just from theoretical considerations? I ask because
 I don't think it can actually happen, as the second time we get here
 the guest ought to be in kernel mode (due to the exception injection)
 and hence would get crashed anyway.

Only from theoretical considerations.  A bad CS (and possibly SS) would
be fixed by this, but there are many others which wouldn't

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Jan Beulich
 On 25.11.14 at 11:46, andrew.coop...@citrix.com wrote:
 On 25/11/14 10:42, Jan Beulich wrote:
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
 state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.
 That's not all that unlikely - remember that the change was prompted
 by the XSA-110 fix. There CS pieces being in a bad state would get
 corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.
 That's not even needed afaict (and if it really is, it can't be all that
 difficult/intrusive): Did you observe what you attempt to fix here in
 practice, or is this just from theoretical considerations? I ask because
 I don't think it can actually happen, as the second time we get here
 the guest ought to be in kernel mode (due to the exception injection)
 and hence would get crashed anyway.
 
 Only from theoretical considerations.  A bad CS (and possibly SS) would
 be fixed by this, but there are many others which wouldn't

But that doesn't eliminate the fact that in the second pass we'd find
the guest in kernel mode, and hence crash it. Yet your reply sounds
as if you still think your patch is needed.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Andrew Cooper
On 25/11/14 10:46, Andrew Cooper wrote:
 On 25/11/14 10:42, Jan Beulich wrote:
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
 state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.
 That's not all that unlikely - remember that the change was prompted
 by the XSA-110 fix. There CS pieces being in a bad state would get
 corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.
 That's not even needed afaict (and if it really is, it can't be all that
 difficult/intrusive): Did you observe what you attempt to fix here in
 practice, or is this just from theoretical considerations? I ask because
 I don't think it can actually happen, as the second time we get here
 the guest ought to be in kernel mode (due to the exception injection)
 and hence would get crashed anyway.
 Only from theoretical considerations.  A bad CS (and possibly SS) would
 be fixed by this, but there are many others which wouldn't

Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
this without emulating the event injection.  Per the XSA-106 followup,
we only ever emulate enough of event injection to cover the dpl checks
on software events for older generation SVM.  We never actually emulate
the context switch itself.

~Andrew


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Jan Beulich
 On 25.11.14 at 11:58, andrew.coop...@citrix.com wrote:
 On 25/11/14 10:46, Andrew Cooper wrote:
 On 25/11/14 10:42, Jan Beulich wrote:
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
 state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.
 That's not all that unlikely - remember that the change was prompted
 by the XSA-110 fix. There CS pieces being in a bad state would get
 corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.
 That's not even needed afaict (and if it really is, it can't be all that
 difficult/intrusive): Did you observe what you attempt to fix here in
 practice, or is this just from theoretical considerations? I ask because
 I don't think it can actually happen, as the second time we get here
 the guest ought to be in kernel mode (due to the exception injection)
 and hence would get crashed anyway.
 Only from theoretical considerations.  A bad CS (and possibly SS) would
 be fixed by this, but there are many others which wouldn't
 
 Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
 this without emulating the event injection.  Per the XSA-106 followup,
 we only ever emulate enough of event injection to cover the dpl checks
 on software events for older generation SVM.  We never actually emulate
 the context switch itself.

Which suggests that rather than doing the partial revert as you
propose we might better extend the check to become kernel mode
or event injection pending.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Andrew Cooper
On 25/11/14 11:31, Jan Beulich wrote:
 On 25.11.14 at 11:58, andrew.coop...@citrix.com wrote:
 On 25/11/14 10:46, Andrew Cooper wrote:
 On 25/11/14 10:42, Jan Beulich wrote:
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
 state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.
 That's not all that unlikely - remember that the change was prompted
 by the XSA-110 fix. There CS pieces being in a bad state would get
 corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.
 That's not even needed afaict (and if it really is, it can't be all that
 difficult/intrusive): Did you observe what you attempt to fix here in
 practice, or is this just from theoretical considerations? I ask because
 I don't think it can actually happen, as the second time we get here
 the guest ought to be in kernel mode (due to the exception injection)
 and hence would get crashed anyway.
 Only from theoretical considerations.  A bad CS (and possibly SS) would
 be fixed by this, but there are many others which wouldn't
 Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
 this without emulating the event injection.  Per the XSA-106 followup,
 we only ever emulate enough of event injection to cover the dpl checks
 on software events for older generation SVM.  We never actually emulate
 the context switch itself.
 Which suggests that rather than doing the partial revert as you
 propose we might better extend the check to become kernel mode
 or event injection pending.

At that point, it is safer just to unconditionally crash on a repeated
vmentry failure, rather than gain a list of conditions which we hope
wont leave us spinning in a loop.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-4.5] x86/HVM: Partial revert of 28b4baacd5

2014-11-25 Thread Jan Beulich
 On 25.11.14 at 13:10, andrew.coop...@citrix.com wrote:
 On 25/11/14 11:31, Jan Beulich wrote:
 On 25.11.14 at 11:58, andrew.coop...@citrix.com wrote:
 On 25/11/14 10:46, Andrew Cooper wrote:
 On 25/11/14 10:42, Jan Beulich wrote:
 On 25.11.14 at 11:08, andrew.coop...@citrix.com wrote:
 A failed vmentry is overwhelmingly likely to be caused by corrupt VMCS 
 state.
 As a result, injecting a fault and retrying the the vmentry is likely to 
 fail
 in the same way.
 That's not all that unlikely - remember that the change was prompted
 by the XSA-110 fix. There CS pieces being in a bad state would get
 corrected by the exception injection.

 One other alternative, which I would pursue if we were not already in 
 -rc2
 would be to add some extra logic to detect repeated vmentry failure and 
 allow
 one attempt to shoot userspace before giving up and crashing the domain.
 That's not even needed afaict (and if it really is, it can't be all that
 difficult/intrusive): Did you observe what you attempt to fix here in
 practice, or is this just from theoretical considerations? I ask because
 I don't think it can actually happen, as the second time we get here
 the guest ought to be in kernel mode (due to the exception injection)
 and hence would get crashed anyway.
 Only from theoretical considerations.  A bad CS (and possibly SS) would
 be fixed by this, but there are many others which wouldn't
 Actually, as Tim correctly points out, a bad CS/SS won't be fixed by
 this without emulating the event injection.  Per the XSA-106 followup,
 we only ever emulate enough of event injection to cover the dpl checks
 on software events for older generation SVM.  We never actually emulate
 the context switch itself.
 Which suggests that rather than doing the partial revert as you
 propose we might better extend the check to become kernel mode
 or event injection pending.
 
 At that point, it is safer just to unconditionally crash on a repeated
 vmentry failure, rather than gain a list of conditions which we hope
 wont leave us spinning in a loop.

Crashing on _repeated_ VM entry failure is certainly fine with me.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel