Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-13 Thread Wei Liu
On Fri, May 13, 2016 at 09:30:23AM -0600, Jan Beulich wrote:
> >>> On 13.05.16 at 17:21,  wrote:
> > On Tue, May 03, 2016 at 07:58:58AM -0600, Jan Beulich wrote:
> >> >>> On 17.03.16 at 09:03,  wrote:
> >> > Since such guests' kernel code runs in ring 1, their memory accesses,
> >> > at the paging layer, are supervisor mode ones, and hence subject to
> >> > SMAP/SMEP checks. Such guests cannot be expected to be aware of those
> >> > two features though (and so far we also don't expose the respective
> >> > feature flags), and hence may suffer page faults they cannot deal with.
> >> > 
> >> > While the placement of the re-enabling slightly weakens the intended
> >> > protection, it was selected such that 64-bit paths would remain
> >> > unaffected where possible. At the expense of a further performance hit
> >> > the re-enabling could be put right next to the CLACs.
> >> > 
> >> > Note that this introduces a number of extra TLB flushes - CR4.SMEP
> >> > transitioning from 0 to 1 always causes a flush, and it transitioning
> >> > from 1 to 0 may also do.
> >> > 
> >> > Signed-off-by: Jan Beulich 
> >> 
> >> So I think we need to take some decision here, and I'm afraid
> >> Andrew and I won't be able to settle this between us. He's
> >> validly concerned about the performance impact this got proven
> >> to have (for 32-bit PV guests), yet I continue to think that correct
> >> behavior is more relevant than performance. Hence I think we
> >> should bite the bullet and take the change. For those who value
> >> performance more than security, they can always disable the use
> >> of SMEP and SMAP via command line option.
> >> 
> >> Of course I'm also concerned that Intel, who did introduce the
> >> functional regression in the first place, so far didn't participate at
> >> all in finding an acceptable solution to the problem at hand...
> >> 
> > 
> > So this thread has not produced a conclusion. What do we need to do
> > about this issue?
> 
> Andrew privately agreed to no longer object, but of course
> subject to him doing (another) proper review.
> 
> > Do we have a set of patches that make things behave correctly
> > (regardless of its performance impact)?
> 
> Yes, this one.
> 

OK. I will wait for him to review this series.

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-13 Thread Jan Beulich
>>> On 13.05.16 at 17:21,  wrote:
> On Tue, May 03, 2016 at 07:58:58AM -0600, Jan Beulich wrote:
>> >>> On 17.03.16 at 09:03,  wrote:
>> > Since such guests' kernel code runs in ring 1, their memory accesses,
>> > at the paging layer, are supervisor mode ones, and hence subject to
>> > SMAP/SMEP checks. Such guests cannot be expected to be aware of those
>> > two features though (and so far we also don't expose the respective
>> > feature flags), and hence may suffer page faults they cannot deal with.
>> > 
>> > While the placement of the re-enabling slightly weakens the intended
>> > protection, it was selected such that 64-bit paths would remain
>> > unaffected where possible. At the expense of a further performance hit
>> > the re-enabling could be put right next to the CLACs.
>> > 
>> > Note that this introduces a number of extra TLB flushes - CR4.SMEP
>> > transitioning from 0 to 1 always causes a flush, and it transitioning
>> > from 1 to 0 may also do.
>> > 
>> > Signed-off-by: Jan Beulich 
>> 
>> So I think we need to take some decision here, and I'm afraid
>> Andrew and I won't be able to settle this between us. He's
>> validly concerned about the performance impact this got proven
>> to have (for 32-bit PV guests), yet I continue to think that correct
>> behavior is more relevant than performance. Hence I think we
>> should bite the bullet and take the change. For those who value
>> performance more than security, they can always disable the use
>> of SMEP and SMAP via command line option.
>> 
>> Of course I'm also concerned that Intel, who did introduce the
>> functional regression in the first place, so far didn't participate at
>> all in finding an acceptable solution to the problem at hand...
>> 
> 
> So this thread has not produced a conclusion. What do we need to do
> about this issue?

Andrew privately agreed to no longer object, but of course
subject to him doing (another) proper review.

> Do we have a set of patches that make things behave correctly
> (regardless of its performance impact)?

Yes, this one.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-13 Thread Wei Liu
On Tue, May 03, 2016 at 07:58:58AM -0600, Jan Beulich wrote:
> >>> On 17.03.16 at 09:03,  wrote:
> > Since such guests' kernel code runs in ring 1, their memory accesses,
> > at the paging layer, are supervisor mode ones, and hence subject to
> > SMAP/SMEP checks. Such guests cannot be expected to be aware of those
> > two features though (and so far we also don't expose the respective
> > feature flags), and hence may suffer page faults they cannot deal with.
> > 
> > While the placement of the re-enabling slightly weakens the intended
> > protection, it was selected such that 64-bit paths would remain
> > unaffected where possible. At the expense of a further performance hit
> > the re-enabling could be put right next to the CLACs.
> > 
> > Note that this introduces a number of extra TLB flushes - CR4.SMEP
> > transitioning from 0 to 1 always causes a flush, and it transitioning
> > from 1 to 0 may also do.
> > 
> > Signed-off-by: Jan Beulich 
> 
> So I think we need to take some decision here, and I'm afraid
> Andrew and I won't be able to settle this between us. He's
> validly concerned about the performance impact this got proven
> to have (for 32-bit PV guests), yet I continue to think that correct
> behavior is more relevant than performance. Hence I think we
> should bite the bullet and take the change. For those who value
> performance more than security, they can always disable the use
> of SMEP and SMAP via command line option.
> 
> Of course I'm also concerned that Intel, who did introduce the
> functional regression in the first place, so far didn't participate at
> all in finding an acceptable solution to the problem at hand...
> 

So this thread has not produced a conclusion. What do we need to do
about this issue?

Do we have a set of patches that make things behave correctly
(regardless of its performance impact)?

Wei.

> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-04 Thread Jan Beulich
>>> On 04.05.16 at 12:03,  wrote:
> v3 has logic intended to omit cr4 changes while in the content of 32bit
> PV guest userspace.  This in turn was intended to increase performance
> by reducing the quantity of TLB flushes.
> 
> Even if hardware has logic to speed up CR4 writes when following reads,
> the complete omission of the writes in the first place should
> necessarily be faster still.
> 
> v3 performs ~20% worse compared to v2, which indicates that one of the
> logical steps is wrong.

Yet no-one can spot it?

May I suggest to re-measure v2 and v3 on an otherwise identical
baseline (which I'm relatively sure was not the case for the previous
measurements)? Furthermore the delta between v2 and v3 was
larger than the exit-to-guest-user-mode adjustment. Hence another
thing to consider is kind of bisecting between the two versions, at
least at the granularity of the three distinct changes done.

And then, looking at the interdiff, I note that it is quite important to
know whether you measured a debug or a release build. If it was a
debug one, the change to the debugging portion of cr4_pv32_restore
makes it so that the CR4 read is not only not avoided there, but that
read now happens in close succession to a prior write, which afaik
may incur further stalls.

>> And finally, I'm not tied to this patch set. I'd be fine with some
>> other fix to restore correct behavior. What I'm not fine with is the
>> limbo state this is being left in, with me all the time having to revive
>> the discussion.
> 
> I agree that context switching cr4 is the only way of allowing 32bit PV
> guests to function without impacting other functionality.

See below - what makes you assume this would come at a lower
price?

>  After all,
> the PV guest is doing precisely what SMAP is intended to catch.

I don't understand this part, especially when - from the previous
sentence - the context is "32-bit PV guest".

> I also accept that this will come with a performance overhead, but
> frankly, the performance regression on these specific patches is
> massive.

Yet it has to be assumed that any other approach (e.g. the
context switching one you mention) wouldn't be any better.

>  I am very concerned that the first thing I will have to do in
> XenServer is revert them.

That shouldn't be a primary reason to not make forward progress
in the community project. Or else I could equally well say it has to
go in because otherwise we will have to carry the patch locally in
our product trees.

> I don't like this situation, but blindly ploughing ahead given these
> issues is also a problem.

Re-emphasizing again that there is no known or apparent functional
bug with this patch, figuring out the performance aspects (beyond
what I've suggested above) is where I think we would need Intel's
help with. Yet you've seen Feng's reply, which was a result of us
pushing for their input. And once again - as long as there's no
functional bug, I think we either need to make progress in identifying
the reason (which I don't see much I can do for), or - also taking into
account for how long the thing has been broken in the tree - simply
commit what is there and deal with the performance fallout later.
Leaving functionally broken code in the tree for such an extended
(and otherwise unbounded) period of time is a no-go imo.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-04 Thread Andrew Cooper
On 03/05/16 15:25, Jan Beulich wrote:
 On 03.05.16 at 16:10,  wrote:
>> On 03/05/16 14:58, Jan Beulich wrote:
>> On 17.03.16 at 09:03,  wrote:
 Since such guests' kernel code runs in ring 1, their memory accesses,
 at the paging layer, are supervisor mode ones, and hence subject to
 SMAP/SMEP checks. Such guests cannot be expected to be aware of those
 two features though (and so far we also don't expose the respective
 feature flags), and hence may suffer page faults they cannot deal with.

 While the placement of the re-enabling slightly weakens the intended
 protection, it was selected such that 64-bit paths would remain
 unaffected where possible. At the expense of a further performance hit
 the re-enabling could be put right next to the CLACs.

 Note that this introduces a number of extra TLB flushes - CR4.SMEP
 transitioning from 0 to 1 always causes a flush, and it transitioning
 from 1 to 0 may also do.

 Signed-off-by: Jan Beulich 
>>> So I think we need to take some decision here, and I'm afraid
>>> Andrew and I won't be able to settle this between us. He's
>>> validly concerned about the performance impact this got proven
>>> to have (for 32-bit PV guests), yet I continue to think that correct
>>> behavior is more relevant than performance. Hence I think we
>>> should bite the bullet and take the change. For those who value
>>> performance more than security, they can always disable the use
>>> of SMEP and SMAP via command line option.
>>>
>>> Of course I'm also concerned that Intel, who did introduce the
>>> functional regression in the first place, so far didn't participate at
>>> all in finding an acceptable solution to the problem at hand...
>> What I am even more concerned with is that the supposedly-optimised
>> version which tried to omit cr4 writes whenever possible caused a higher
>> performance overhead than the version which rewrite cr4 all the time.
>>
>> As is stands, v3 of the series is even worse for performance than v2. 
>> That alone means that v3 is not in a suitable state to be accepted, as
>> there is some hidden bug in it.
> For one, we could take v2 (albeit that would feel odd).
>
> And then, from v3 showing worse performance (which only you
> have seen the numbers for, and [hopefully] know what deviations
> in them really mean)

The highlights for both v2 and v3 are on xen-devel.

>  it does not follow that v3 is buggy.

It very much does follow.

>  That's one
> possibility, sure (albeit hard to explain, as functionally everything
> is working fine for both you and me, afaik), but not the only one.
> Another may be that hardware processes CR4 writes faster when
> they happen in close succession to CR4 reads. And I'm sure more
> could be come up with.

v3 has logic intended to omit cr4 changes while in the content of 32bit
PV guest userspace.  This in turn was intended to increase performance
by reducing the quantity of TLB flushes.

Even if hardware has logic to speed up CR4 writes when following reads,
the complete omission of the writes in the first place should
necessarily be faster still.

v3 performs ~20% worse compared to v2, which indicates that one of the
logical steps is wrong.

>
> And finally, I'm not tied to this patch set. I'd be fine with some
> other fix to restore correct behavior. What I'm not fine with is the
> limbo state this is being left in, with me all the time having to revive
> the discussion.

I agree that context switching cr4 is the only way of allowing 32bit PV
guests to function without impacting other functionality.  After all,
the PV guest is doing precisely what SMAP is intended to catch.

I also accept that this will come with a performance overhead, but
frankly, the performance regression on these specific patches is
massive.  I am very concerned that the first thing I will have to do in
XenServer is revert them.

I don't like this situation, but blindly ploughing ahead given these
issues is also a problem.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-03 Thread Wu, Feng


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Tuesday, May 3, 2016 9:59 PM
> Cc: Andrew Cooper ; Wei Liu
> ; George Dunlap ; Ian
> Jackson ; Wu, Feng ; Stefano
> Stabellini ; xen-devel 
> ;
> Konrad Rzeszutek Wilk ; Tim Deegan 
> Subject: Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit
> PV guest code
> 
> >>> On 17.03.16 at 09:03,  wrote:
> > Since such guests' kernel code runs in ring 1, their memory accesses,
> > at the paging layer, are supervisor mode ones, and hence subject to
> > SMAP/SMEP checks. Such guests cannot be expected to be aware of those
> > two features though (and so far we also don't expose the respective
> > feature flags), and hence may suffer page faults they cannot deal with.
> >
> > While the placement of the re-enabling slightly weakens the intended
> > protection, it was selected such that 64-bit paths would remain
> > unaffected where possible. At the expense of a further performance hit
> > the re-enabling could be put right next to the CLACs.
> >
> > Note that this introduces a number of extra TLB flushes - CR4.SMEP
> > transitioning from 0 to 1 always causes a flush, and it transitioning
> > from 1 to 0 may also do.
> >
> > Signed-off-by: Jan Beulich 
> 
> So I think we need to take some decision here, and I'm afraid
> Andrew and I won't be able to settle this between us. He's
> validly concerned about the performance impact this got proven
> to have (for 32-bit PV guests), yet I continue to think that correct
> behavior is more relevant than performance. Hence I think we
> should bite the bullet and take the change. For those who value
> performance more than security, they can always disable the use
> of SMEP and SMAP via command line option.

I think this is a proper solution for those who cares more about
performance. And BTW, Andrew, is there any detailed data which can
show the exact performance overhead introduced by this patch?

Thanks,
Feng

> 
> Of course I'm also concerned that Intel, who did introduce the
> functional regression in the first place, so far didn't participate at
> all in finding an acceptable solution to the problem at hand...
> 
> Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-03 Thread Jan Beulich
>>> On 03.05.16 at 16:10,  wrote:
> On 03/05/16 14:58, Jan Beulich wrote:
> On 17.03.16 at 09:03,  wrote:
>>> Since such guests' kernel code runs in ring 1, their memory accesses,
>>> at the paging layer, are supervisor mode ones, and hence subject to
>>> SMAP/SMEP checks. Such guests cannot be expected to be aware of those
>>> two features though (and so far we also don't expose the respective
>>> feature flags), and hence may suffer page faults they cannot deal with.
>>>
>>> While the placement of the re-enabling slightly weakens the intended
>>> protection, it was selected such that 64-bit paths would remain
>>> unaffected where possible. At the expense of a further performance hit
>>> the re-enabling could be put right next to the CLACs.
>>>
>>> Note that this introduces a number of extra TLB flushes - CR4.SMEP
>>> transitioning from 0 to 1 always causes a flush, and it transitioning
>>> from 1 to 0 may also do.
>>>
>>> Signed-off-by: Jan Beulich 
>> So I think we need to take some decision here, and I'm afraid
>> Andrew and I won't be able to settle this between us. He's
>> validly concerned about the performance impact this got proven
>> to have (for 32-bit PV guests), yet I continue to think that correct
>> behavior is more relevant than performance. Hence I think we
>> should bite the bullet and take the change. For those who value
>> performance more than security, they can always disable the use
>> of SMEP and SMAP via command line option.
>>
>> Of course I'm also concerned that Intel, who did introduce the
>> functional regression in the first place, so far didn't participate at
>> all in finding an acceptable solution to the problem at hand...
> 
> What I am even more concerned with is that the supposedly-optimised
> version which tried to omit cr4 writes whenever possible caused a higher
> performance overhead than the version which rewrite cr4 all the time.
> 
> As is stands, v3 of the series is even worse for performance than v2. 
> That alone means that v3 is not in a suitable state to be accepted, as
> there is some hidden bug in it.

For one, we could take v2 (albeit that would feel odd).

And then, from v3 showing worse performance (which only you
have seen the numbers for, and [hopefully] know what deviations
in them really mean) it does not follow that v3 is buggy. That's one
possibility, sure (albeit hard to explain, as functionally everything
is working fine for both you and me, afaik), but not the only one.
Another may be that hardware processes CR4 writes faster when
they happen in close succession to CR4 reads. And I'm sure more
could be come up with.

And finally, I'm not tied to this patch set. I'd be fine with some
other fix to restore correct behavior. What I'm not fine with is the
limbo state this is being left in, with me all the time having to revive
the discussion.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-03 Thread Andrew Cooper
On 03/05/16 14:58, Jan Beulich wrote:
 On 17.03.16 at 09:03,  wrote:
>> Since such guests' kernel code runs in ring 1, their memory accesses,
>> at the paging layer, are supervisor mode ones, and hence subject to
>> SMAP/SMEP checks. Such guests cannot be expected to be aware of those
>> two features though (and so far we also don't expose the respective
>> feature flags), and hence may suffer page faults they cannot deal with.
>>
>> While the placement of the re-enabling slightly weakens the intended
>> protection, it was selected such that 64-bit paths would remain
>> unaffected where possible. At the expense of a further performance hit
>> the re-enabling could be put right next to the CLACs.
>>
>> Note that this introduces a number of extra TLB flushes - CR4.SMEP
>> transitioning from 0 to 1 always causes a flush, and it transitioning
>> from 1 to 0 may also do.
>>
>> Signed-off-by: Jan Beulich 
> So I think we need to take some decision here, and I'm afraid
> Andrew and I won't be able to settle this between us. He's
> validly concerned about the performance impact this got proven
> to have (for 32-bit PV guests), yet I continue to think that correct
> behavior is more relevant than performance. Hence I think we
> should bite the bullet and take the change. For those who value
> performance more than security, they can always disable the use
> of SMEP and SMAP via command line option.
>
> Of course I'm also concerned that Intel, who did introduce the
> functional regression in the first place, so far didn't participate at
> all in finding an acceptable solution to the problem at hand...

What I am even more concerned with is that the supposedly-optimised
version which tried to omit cr4 writes whenever possible caused a higher
performance overhead than the version which rewrite cr4 all the time.

As is stands, v3 of the series is even worse for performance than v2. 
That alone means that v3 is not in a suitable state to be accepted, as
there is some hidden bug in it.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] Ping: [PATCH v3 2/4] x86: suppress SMEP and SMAP while running 32-bit PV guest code

2016-05-03 Thread Jan Beulich
>>> On 17.03.16 at 09:03,  wrote:
> Since such guests' kernel code runs in ring 1, their memory accesses,
> at the paging layer, are supervisor mode ones, and hence subject to
> SMAP/SMEP checks. Such guests cannot be expected to be aware of those
> two features though (and so far we also don't expose the respective
> feature flags), and hence may suffer page faults they cannot deal with.
> 
> While the placement of the re-enabling slightly weakens the intended
> protection, it was selected such that 64-bit paths would remain
> unaffected where possible. At the expense of a further performance hit
> the re-enabling could be put right next to the CLACs.
> 
> Note that this introduces a number of extra TLB flushes - CR4.SMEP
> transitioning from 0 to 1 always causes a flush, and it transitioning
> from 1 to 0 may also do.
> 
> Signed-off-by: Jan Beulich 

So I think we need to take some decision here, and I'm afraid
Andrew and I won't be able to settle this between us. He's
validly concerned about the performance impact this got proven
to have (for 32-bit PV guests), yet I continue to think that correct
behavior is more relevant than performance. Hence I think we
should bite the bullet and take the change. For those who value
performance more than security, they can always disable the use
of SMEP and SMAP via command line option.

Of course I'm also concerned that Intel, who did introduce the
functional regression in the first place, so far didn't participate at
all in finding an acceptable solution to the problem at hand...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel