Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-15 Thread Steven Rostedt
On Fri, 11 Aug 2017 14:07:14 +0200
Peter Zijlstra wrote:

> It goes like:
> 
>   CPU0                        CPU1
> 
>   unhook page
>                               cli
>                               traverse page tables
>   TLB invalidate --->
>                               sti
> 
>                               TLB invalidate
>         <--  complete

I guess the important part here is the above "complete". CPU0 doesn't
proceed until it receives it. Thus it does act like
cli~rcu_read_lock(), sti~rcu_read_unlock(), and "TLB invalidate" is
equivalent to synchronize_rcu().
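
In kernel terms the pattern on each side looks roughly like this (an
illustrative sketch only, not actual mm code; walk_page_tables(),
unhook_page() and free_page_tables() are made-up names):

    /* CPU1: the software page-table walker */
    local_irq_disable();            /* ~ rcu_read_lock() */
    walk_page_tables(addr);         /* page-tables can't be freed here */
    local_irq_enable();             /* ~ rcu_read_unlock() */

    /* CPU0: the freeing side */
    unhook_page(pte);
    flush_tlb_others(cpumask, ...); /* ~ synchronize_rcu(): returns only
                                     * after every targeted CPU has acked
                                     * the IPI, i.e. has left its
                                     * IF-cleared section */
    free_page_tables(page);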

[ this response is for clarification for the casual observer of this
  thread ;-) ]

-- Steve

>   
>   free page
> 
> So the CPU1 page-table walker gets an existence guarantee of the
> page-tables by clearing IF.




Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Peter Zijlstra
On Fri, Aug 11, 2017 at 03:07:29PM +0200, Juergen Gross wrote:
> On 11/08/17 14:54, Peter Zijlstra wrote:
> > On Fri, Aug 11, 2017 at 02:46:41PM +0200, Juergen Gross wrote:
> >> Aah, okay. Now I understand the problem. The TLB isn't the issue but the
> >> IPI is serving two purposes here: TLB flushing (which is allowed to
> >> happen at any time) and serialization regarding access to critical pages
> >> (which seems to be broken in the Xen case as you suggest).
> > 
> > Indeed, and now hyper-v as well.
> 
> Is it possible to distinguish between non-critical calls of
> flush_tlb_others() (which should be the majority IMHO) and critical ones
> regarding the above problem? I guess the only problem is the case when a
> page table can be freed because its last valid entry is gone, right?
> 
> We might want to add a serialization flag to indicate that flushing _and_
> serialization via IPI should be performed.

Possible, but not trivial. Especially things like transparent huge
pages, which swizzle PMDs around, make things tricky.

By far the easiest solution is to switch over to HAVE_RCU_TABLE_FREE
when either Xen or Hyper-V is doing this. Ideally it would not have a
significant performance hit (needs testing), and then we can simply
always do this when PARAVIRT is enabled; otherwise we need to get
creative with static_keys or something.
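
For reference, the idea behind HAVE_RCU_TABLE_FREE is to batch the
unhooked page-table pages and defer the actual free to an RCU grace
period rather than relying on the flush IPI; a simplified sketch (the
real code lives in mm/memory.c and differs in detail):

    struct mmu_table_batch {
            struct rcu_head rcu;
            unsigned int    nr;
            void            *tables[MAX_TABLE_BATCH];
    };

    static void tlb_remove_table_rcu(struct rcu_head *head)
    {
            struct mmu_table_batch *batch;
            int i;

            batch = container_of(head, struct mmu_table_batch, rcu);
            for (i = 0; i < batch->nr; i++)
                    __tlb_remove_table(batch->tables[i]); /* arch hook,
                                                           * frees the page */
            free_page((unsigned long)batch);
    }

    /* instead of freeing directly after the (paravirt) flush: */
    call_rcu_sched(&batch->rcu, tlb_remove_table_rcu);

The walker's IF-cleared section still works then, because a CPU with
interrupts disabled also holds off an RCU-sched grace period.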



Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Juergen Gross
On 11/08/17 14:54, Peter Zijlstra wrote:
> On Fri, Aug 11, 2017 at 02:46:41PM +0200, Juergen Gross wrote:
>> Aah, okay. Now I understand the problem. The TLB isn't the issue but the
>> IPI is serving two purposes here: TLB flushing (which is allowed to
>> happen at any time) and serialization regarding access to critical pages
>> (which seems to be broken in the Xen case as you suggest).
> 
> Indeed, and now hyper-v as well.

Is it possible to distinguish between non-critical calls of
flush_tlb_others() (which should be the majority IMHO) and critical ones
regarding the above problem? I guess the only problem is the case when a
page table can be freed because its last valid entry is gone, right?

We might want to add a serialization flag to indicate that flushing _and_
serialization via IPI should be performed.
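
Something along these lines, as a purely hypothetical sketch (the flag
name and the plumbing below are made up):

    #define FLUSH_TLB_SERIALIZE    0x1    /* hypothetical flag */

    static void flush_tlb_others(const struct cpumask *cpumask,
                                 struct mm_struct *mm, unsigned long start,
                                 unsigned long end, unsigned int flags)
    {
            if (flags & FLUSH_TLB_SERIALIZE)
                    /* a page-table page is about to be freed: use a real
                     * IPI to serialize against IF-cleared walkers */
                    native_flush_tlb_others(cpumask, mm, start, end);
            else
                    /* plain flush: the hypercall-based path is fine */
                    paravirt_flush_tlb_others(cpumask, mm, start, end);
    }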


Juergen



Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Peter Zijlstra
On Fri, Aug 11, 2017 at 02:46:41PM +0200, Juergen Gross wrote:
> Aah, okay. Now I understand the problem. The TLB isn't the issue but the
> IPI is serving two purposes here: TLB flushing (which is allowed to
> happen at any time) and serialization regarding access to critical pages
> (which seems to be broken in the Xen case as you suggest).

Indeed, and now hyper-v as well.



Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Juergen Gross
On 11/08/17 14:35, Peter Zijlstra wrote:
> On Fri, Aug 11, 2017 at 02:22:25PM +0200, Juergen Gross wrote:
>> Wait - the TLB can be cleared at any time, as Andrew was pointing out.
>> No cpu can rely on an address being accessible just because IF is
>> cleared. All that matters is an existing and valid page table entry.
>>
>> So clearing IF on a cpu isn't meant to protect the TLB from being
>> cleared, but just to avoid interrupts (as the name of the flag
>> suggests).
> 
> Yes, but by holding off the TLB invalidate IPI, we hold off the freeing
> of the concurrently unhooked page-table.
> 
>> In the Xen case the hypervisor does the following:
>>
>> - it checks whether any of the vcpus specified in the cpumask of the
>>   flush request is running on any physical cpu
>> - if a running vcpu is found, an IPI is sent to its physical cpu and
>>   the hypervisor does the TLB flush there
> 
> And this will preempt a vcpu which could have IF cleared, right?
> 
>> - any vcpu addressed by the flush and not currently running is flagged
>>   to flush its TLB the next time it is scheduled
>>
>> This ensures that no TLB entry to be flushed can still be used after
>> xen_flush_tlb_others() returns.
> 
> But that is not a sufficient guarantee. We need the IF to hold off the
> TLB invalidate and thereby hold off the freeing of our page-table pages.

Aah, okay. Now I understand the problem. The TLB isn't the issue but the
IPI is serving two purposes here: TLB flushing (which is allowed to
happen at any time) and serialization regarding access to critical pages
(which seems to be broken in the Xen case as you suggest).

Juergen

> 




Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Peter Zijlstra
On Fri, Aug 11, 2017 at 02:22:25PM +0200, Juergen Gross wrote:
> Wait - the TLB can be cleared at any time, as Andrew was pointing out.
> No cpu can rely on an address being accessible just because IF is
> cleared. All that matters is an existing and valid page table entry.
> 
> So clearing IF on a cpu isn't meant to protect the TLB from being
> cleared, but just to avoid interrupts (as the name of the flag
> suggests).

Yes, but by holding off the TLB invalidate IPI, we hold off the freeing
of the concurrently unhooked page-table.

> In the Xen case the hypervisor does the following:
> 
> - it checks whether any of the vcpus specified in the cpumask of the
>   flush request is running on any physical cpu
> - if a running vcpu is found, an IPI is sent to its physical cpu and
>   the hypervisor does the TLB flush there

And this will preempt a vcpu which could have IF cleared, right?

> - any vcpu addressed by the flush and not currently running is flagged
>   to flush its TLB the next time it is scheduled
> 
> This ensures that no TLB entry to be flushed can still be used after
> xen_flush_tlb_others() returns.

But that is not a sufficient guarantee. We need the IF to hold off the
TLB invalidate and thereby hold off the freeing of our page-table pages.



Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Juergen Gross
On 11/08/17 12:56, Peter Zijlstra wrote:
> On Fri, Aug 11, 2017 at 11:23:10AM +0200, Vitaly Kuznetsov wrote:
>> Peter Zijlstra writes:
>>
>>> On Thu, Aug 10, 2017 at 07:08:22PM +, Jork Loeser wrote:
>>>
>>>>>> Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote
>>>>>> TLB flush
>>>>>>
>>>>>> Hold on.. if we don't IPI for TLB invalidation. What serializes our
>>>>>> software page table walkers like fast_gup() ?
>>>>>
>>>>> Hypervisor may implement this functionality via an IPI.
>>>>>
>>>>> K. Y
>>>>
>>>> HvFlushVirtualAddressList() states:
>>>> This call guarantees that by the time control returns back to the
>>>> caller, the observable effects of all flushes on the specified virtual
>>>> processors have occurred.
>>>>
>>>> HvFlushVirtualAddressListEx() refers to HvFlushVirtualAddressList() as
>>>> adding sparse target VP lists.
>>>>
>>>> Is this enough of a guarantee, or do you see other races?
>>>
>>> That's nowhere near enough. We need the remote CPU to have completed any
>>> guest IF section that was in progress at the time of the call.
>>>
>>> So if a host IPI can interrupt a guest while the guest has IF cleared,
>>> and we then process the host IPI -- clear the TLBs -- before resuming the
>>> guest, which still has IF cleared, we've got a problem.
>>>
>>> Because at that point, our software page-table walker, which relies on
>>> IF being clear to guarantee the page-tables exist (it holds off the TLB
>>> invalidate and thereby the freeing of the pages), gets its pages ripped
>>> out from under it.
>>
>> Oh, I see your concern. Hyper-V, however, is not the first x86
>> hypervisor trying to avoid IPIs on remote TLB flush, Xen does this
>> too. Briefly looking at xen_flush_tlb_others() I don't see anything
>> special, do we know how serialization is achieved there?
> 
> No idea on how Xen works, I always just hope it goes away :-) But let's
> ask some Xen folks.

Wait - the TLB can be cleared at any time, as Andrew was pointing out.
No cpu can rely on an address being accessible just because IF is
cleared. All that matters is an existing and valid page table entry.

So clearing IF on a cpu isn't meant to protect the TLB from being
cleared, but just to avoid interrupts (as the name of the flag
suggests).

In the Xen case the hypervisor does the following:

- it checks whether any of the vcpus specified in the cpumask of the
  flush request is running on any physical cpu
- if a running vcpu is found, an IPI is sent to its physical cpu and
  the hypervisor does the TLB flush there
- any vcpu addressed by the flush and not currently running is flagged
  to flush its TLB the next time it is scheduled

This ensures that no TLB entry to be flushed can still be used after
xen_flush_tlb_others() returns.
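
In illustrative pseudocode (all names below are made up; this is not
the actual Xen source):

    /* hypervisor side of a guest's flush request */
    for_each_vcpu_in(flush_mask, v) {
            if (vcpu_is_running(v))
                    send_flush_ipi(v->pcpu);        /* hypervisor flushes
                                                     * the TLB there */
            else
                    v->flush_tlb_on_resume = true;  /* flush before the
                                                     * vcpu runs again */
    }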


Juergen



Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Peter Zijlstra
On Fri, Aug 11, 2017 at 12:05:45PM +0100, Andrew Cooper wrote:
> >> Oh, I see your concern. Hyper-V, however, is not the first x86
> >> hypervisor trying to avoid IPIs on remote TLB flush, Xen does this
> >> too. Briefly looking at xen_flush_tlb_others() I don't see anything
> >> special, do we know how serialization is achieved there?
> > No idea on how Xen works, I always just hope it goes away :-) But let's
> > ask some Xen folks.
> 
> How is the software pagewalker relying on IF being clear safe at all (on
> native, let alone under virtualisation)?  Hardware has no architectural
> requirement to keep entries in the TLB.

No, but it _can_; therefore, when we unhook pages we _must_ invalidate.

It goes like:

    CPU0                        CPU1

    unhook page
                                cli
                                traverse page tables
    TLB invalidate --->
                                sti

                                TLB invalidate
          <--  complete

    free page

So the CPU1 page-table walker gets an existence guarantee of the
page-tables by clearing IF.
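
This is the pattern the kernel's fast GUP path relies on; simplified
from __get_user_pages_fast() (details differ):

    local_irq_save(flags);
    gup_pgd_range(start, end, write, pages, &nr);   /* lockless walk */
    local_irq_restore(flags);

With IF clear the invalidate IPI cannot be serviced in between, so CPU0
above cannot reach "free page" while CPU1 is still walking.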

> In the virtualisation case, at any point the vcpu can be scheduled on a
> different pcpu even during a critical region like that, so the TLB
> really can empty itself under your feet.

Not the point.




Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Andrew Cooper
On 11/08/17 11:56, Peter Zijlstra wrote:
> On Fri, Aug 11, 2017 at 11:23:10AM +0200, Vitaly Kuznetsov wrote:
>> Peter Zijlstra writes:
>>
>>> On Thu, Aug 10, 2017 at 07:08:22PM +, Jork Loeser wrote:
>>>
>>>>>> Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote
>>>>>> TLB flush
>>>>>> Hold on.. if we don't IPI for TLB invalidation. What serializes our
>>>>>> software page table walkers like fast_gup() ?
>>>>> Hypervisor may implement this functionality via an IPI.
>>>>>
>>>>> K. Y
>>>> HvFlushVirtualAddressList() states:
>>>> This call guarantees that by the time control returns back to the
>>>> caller, the observable effects of all flushes on the specified virtual
>>>> processors have occurred.
>>>>
>>>> HvFlushVirtualAddressListEx() refers to HvFlushVirtualAddressList() as
>>>> adding sparse target VP lists.
>>>>
>>>> Is this enough of a guarantee, or do you see other races?
>>> That's nowhere near enough. We need the remote CPU to have completed any
>>> guest IF section that was in progress at the time of the call.
>>>
>>> So if a host IPI can interrupt a guest while the guest has IF cleared,
>>> and we then process the host IPI -- clear the TLBs -- before resuming the
>>> guest, which still has IF cleared, we've got a problem.
>>>
>>> Because at that point, our software page-table walker, which relies on
>>> IF being clear to guarantee the page-tables exist (it holds off the TLB
>>> invalidate and thereby the freeing of the pages), gets its pages ripped
>>> out from under it.
>> Oh, I see your concern. Hyper-V, however, is not the first x86
>> hypervisor trying to avoid IPIs on remote TLB flush, Xen does this
>> too. Briefly looking at xen_flush_tlb_others() I don't see anything
>> special, do we know how serialization is achieved there?
> No idea on how Xen works, I always just hope it goes away :-) But let's
> ask some Xen folks.

How is the software pagewalker relying on IF being clear safe at all (on
native, let alone under virtualisation)?  Hardware has no architectural
requirement to keep entries in the TLB.

In the virtualisation case, at any point the vcpu can be scheduled on a
different pcpu even during a critical region like that, so the TLB
really can empty itself under your feet.

~Andrew



Re: [Xen-devel] [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

2017-08-11 Thread Peter Zijlstra
On Fri, Aug 11, 2017 at 11:23:10AM +0200, Vitaly Kuznetsov wrote:
> Peter Zijlstra writes:
> 
> > On Thu, Aug 10, 2017 at 07:08:22PM +, Jork Loeser wrote:
> >
> >> > > Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote 
> >> > > TLB flush
> >> 
> >> > > Hold on.. if we don't IPI for TLB invalidation. What serializes our
> >> > > software page table walkers like fast_gup() ?
> >> > 
> >> > Hypervisor may implement this functionality via an IPI.
> >> > 
> >> > K. Y
> >> 
> >> HvFlushVirtualAddressList() states:
> >> This call guarantees that by the time control returns back to the
> >> caller, the observable effects of all flushes on the specified virtual
> >> processors have occurred.
> >> 
> >> HvFlushVirtualAddressListEx() refers to HvFlushVirtualAddressList() as 
> >> adding sparse target VP lists.
> >> 
> >> Is this enough of a guarantee, or do you see other races?
> >
> > That's nowhere near enough. We need the remote CPU to have completed any
> > guest IF section that was in progress at the time of the call.
> >
> > So if a host IPI can interrupt a guest while the guest has IF cleared,
> > and we then process the host IPI -- clear the TLBs -- before resuming the
> > guest, which still has IF cleared, we've got a problem.
> >
> > Because at that point, our software page-table walker, which relies on
> > IF being clear to guarantee the page-tables exist (it holds off the TLB
> > invalidate and thereby the freeing of the pages), gets its pages ripped
> > out from under it.
> 
> Oh, I see your concern. Hyper-V, however, is not the first x86
> hypervisor trying to avoid IPIs on remote TLB flush, Xen does this
> too. Briefly looking at xen_flush_tlb_others() I don't see anything
> special, do we know how serialization is achieved there?

No idea on how Xen works, I always just hope it goes away :-) But let's
ask some Xen folks.

