Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-09 Thread Juergen Gross
On 08/03/18 16:06, Jan Beulich wrote:
 On 02.03.18 at 09:14,  wrote:
>> @@ -123,22 +142,14 @@ unsigned int flush_area_local(const void *va, unsigned 
>> int flags)
>>  u32 t = pre_flush();
>>  
>>  if ( !cpu_has_invpcid )
>> -{
>> -unsigned long cr4 = read_cr4();
>> -
>> -write_cr4(cr4 & ~X86_CR4_PGE);
>> -barrier();
>> -write_cr4(cr4);
>> -}
>> +do_flush_tlb(0);
>>  else
>> -{
>>  /*
>>   * Using invpcid to flush all mappings works
>>   * regardless of whether PCID is enabled or not.
>>   * It is faster than read-modify-write CR4.
>>   */
>>  invpcid_flush_all();
>> -}
> 
> Btw, this is correct for FLUSH_TLB_GLOBAL, but goes too far for
> FLUSH_TLB.

You are aware that my patches didn't change anything in this regard?


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-09 Thread Juergen Gross
On 09/03/18 09:34, Jan Beulich wrote:
 On 09.03.18 at 06:23,  wrote:
>>>  From: Jan Beulich [mailto:jbeul...@suse.com]
>>> Sent: Thursday, March 8, 2018 9:39 PM
>>>
 --- a/xen/include/asm-x86/domain.h
 +++ b/xen/include/asm-x86/domain.h
 @@ -622,7 +622,8 @@ unsigned long pv_guest_cr4_fixup(const struct
>>> vcpu *, unsigned long guest_cr4);
  X86_CR4_SMAP | X86_CR4_OSXSAVE |\
  X86_CR4_FSGSBASE))  \
| ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0)) \
 - & ~X86_CR4_DE)
 + & ~(X86_CR4_DE |   \
 + ((v)->domain->arch.pv_domain.xpti ? X86_CR4_PGE : 0)))
>>>
>>> With this you manage to turn off global pages when switching to
>>> a PV vCPU. But I can't see how you turn global pages back on when
>>> switching away from it. I can see they would be turned back on e.g.
>>> on the first entry to a VMX guest, but how about an SVM one? And
>>> how about the time between switching away from the PV vCPU and
>>> that VM entry? Granted all flushes are global ones right now, but
>>> that should change with the modification here: If you look back at
>>> 4.2 code, you'll see that FLUSH_TLB was handled differently in that
>>> case, retaining Xen's global mappings. Any flush IPI not requesting
>>> global pages to be flushed could then leave intact Xen's own TLB
>>> entries, which takes as a prereq that CR4.PGE gets turned back on
>>> earlier.
>>
>> btw does PGE really matter regarding to entry to HVM guest? Xen's 
>> mappings are either all flushed (vpid disabled) or all sustained
>> (vpid enabled) at VM entries, regardless of global setting. then if 
>> PGE is anyway turned off for PV and doesn't matter for HVM is it 
>> still useful to keep it turned on between switches?
> 
> Well, yes indeed, but that's only half of where we may want to
> retain them (if possible). The other half is the time between
> context switching in a HVM vCPU and the subsequent VM entry (or
> to be precise the point in time when interrupts get turned off
> before that VM entry, as that's where flush IPIs can still arrive).

I'd like to move loading cr4 from *_ctxt_switch_to() to write_ptbase()
in order to minimize TLB flushes. This will make it much easier to use
the optimal flushing strategy as all intermediate states are handled in
one place.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-09 Thread Jan Beulich
>>> On 09.03.18 at 06:23,  wrote:
>>  From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Thursday, March 8, 2018 9:39 PM
>> 
>> > --- a/xen/include/asm-x86/domain.h
>> > +++ b/xen/include/asm-x86/domain.h
>> > @@ -622,7 +622,8 @@ unsigned long pv_guest_cr4_fixup(const struct
>> vcpu *, unsigned long guest_cr4);
>> >  X86_CR4_SMAP | X86_CR4_OSXSAVE |\
>> >  X86_CR4_FSGSBASE))  \
>> >| ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0)) \
>> > - & ~X86_CR4_DE)
>> > + & ~(X86_CR4_DE |   \
>> > + ((v)->domain->arch.pv_domain.xpti ? X86_CR4_PGE : 0)))
>> 
>> With this you manage to turn off global pages when switching to
>> a PV vCPU. But I can't see how you turn global pages back on when
>> switching away from it. I can see they would be turned back on e.g.
>> on the first entry to a VMX guest, but how about an SVM one? And
>> how about the time between switching away from the PV vCPU and
>> that VM entry? Granted all flushes are global ones right now, but
>> that should change with the modification here: If you look back at
>> 4.2 code, you'll see that FLUSH_TLB was handled differently in that
>> case, retaining Xen's global mappings. Any flush IPI not requesting
>> global pages to be flushed could then leave intact Xen's own TLB
>> entries, which takes as a prereq that CR4.PGE gets turned back on
>> earlier.
> 
> btw does PGE really matter regarding to entry to HVM guest? Xen's 
> mappings are either all flushed (vpid disabled) or all sustained
> (vpid enabled) at VM entries, regardless of global setting. then if 
> PGE is anyway turned off for PV and doesn't matter for HVM is it 
> still useful to keep it turned on between switches?

Well, yes indeed, but that's only half of where we may want to
retain them (if possible). The other half is the time between
context switching in a HVM vCPU and the subsequent VM entry (or
to be precise the point in time when interrupts get turned off
before that VM entry, as that's where flush IPIs can still arrive).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-08 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Thursday, March 8, 2018 9:39 PM
> 
> > --- a/xen/include/asm-x86/domain.h
> > +++ b/xen/include/asm-x86/domain.h
> > @@ -622,7 +622,8 @@ unsigned long pv_guest_cr4_fixup(const struct
> vcpu *, unsigned long guest_cr4);
> >  X86_CR4_SMAP | X86_CR4_OSXSAVE |\
> >  X86_CR4_FSGSBASE))  \
> >| ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0)) \
> > - & ~X86_CR4_DE)
> > + & ~(X86_CR4_DE |   \
> > + ((v)->domain->arch.pv_domain.xpti ? X86_CR4_PGE : 0)))
> 
> With this you manage to turn off global pages when switching to
> a PV vCPU. But I can't see how you turn global pages back on when
> switching away from it. I can see they would be turned back on e.g.
> on the first entry to a VMX guest, but how about an SVM one? And
> how about the time between switching away from the PV vCPU and
> that VM entry? Granted all flushes are global ones right now, but
> that should change with the modification here: If you look back at
> 4.2 code, you'll see that FLUSH_TLB was handled differently in that
> case, retaining Xen's global mappings. Any flush IPI not requesting
> global pages to be flushed could then leave intact Xen's own TLB
> entries, which takes as a prereq that CR4.PGE gets turned back on
> earlier.
> 

btw does PGE really matter regarding to entry to HVM guest? Xen's 
mappings are either all flushed (vpid disabled) or all sustained
(vpid enabled) at VM entries, regardless of global setting. then if 
PGE is anyway turned off for PV and doesn't matter for HVM is it 
still useful to keep it turned on between switches?
 
Thanks
Kevin

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-08 Thread Jan Beulich
>>> On 02.03.18 at 09:14,  wrote:
> @@ -123,22 +142,14 @@ unsigned int flush_area_local(const void *va, unsigned 
> int flags)
>  u32 t = pre_flush();
>  
>  if ( !cpu_has_invpcid )
> -{
> -unsigned long cr4 = read_cr4();
> -
> -write_cr4(cr4 & ~X86_CR4_PGE);
> -barrier();
> -write_cr4(cr4);
> -}
> +do_flush_tlb(0);
>  else
> -{
>  /*
>   * Using invpcid to flush all mappings works
>   * regardless of whether PCID is enabled or not.
>   * It is faster than read-modify-write CR4.
>   */
>  invpcid_flush_all();
> -}

Btw, this is correct for FLUSH_TLB_GLOBAL, but goes too far for
FLUSH_TLB.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-08 Thread Juergen Gross
On 08/03/18 15:33, Jan Beulich wrote:
 On 08.03.18 at 15:05,  wrote:
>> On 08/03/18 14:38, Jan Beulich wrote:
>> On 02.03.18 at 09:14,  wrote:
 Instead of flushing the TLB from global pages when switching address
 spaces with XPTI being active just disable global pages via %cr4
 completely when a domain subject to XPTI is active. This avoids the
 need for extra TLB flushes as loading %cr3 will remove all TLB
 entries.
>>>
>>> Hmm, it's far from obvious that this is an improvement overall.
>>> Besides Xen's global pages, we also prevent guest user pages to
>>> be evicted from the TLB across user <-> kernel mode changes.
>>> And the effects of this are likely quite work load dependent.
>>
>> With XPTI active we flush the TLB, including all global entries, when
>> switching between page tables when returning to the guest. So there are
>> no entries which could survive.
> 
> Well, right, but that's the "light" in "XPTI light". From a functionality
> POV we don't need the guest user mappings to be flushed. Did you
> happen to look at what the effect would be of running Xen with
> PTE.G uniformly clear (and with the CR4 writes dropped from the
> exit paths)?

Not yet. I wanted to do that when adding PCID support for XPTI=false to
have an idea whether it makes sense to keep the global pages in case a
processor doesn't support PCID or if there are cases where global pages
are better than using PCID.

> 
 @@ -412,18 +414,22 @@ static void prepare_set(void)
write_cr0(read_cr0() | X86_CR0_CD);
wbinvd();
  
 -  /*  TLB flushing here relies on Xen always using CR4.PGE. */
 -  BUILD_BUG_ON(!(XEN_MINIMAL_CR4 & X86_CR4_PGE));
 -  write_cr4(read_cr4() & ~X86_CR4_PGE);
 +  cr4 = read_cr4();
 +  if (cr4 & X86_CR4_PGE)
 +  write_cr4(cr4 & ~X86_CR4_PGE);
 +  else
 +  asm volatile( "mov %0, %%cr3" : : "r" (read_cr3()) : "memory" );
  
/*  Save MTRR state */
rdmsrl(MSR_MTRRdefType, deftype);
  
/*  Disable MTRRs, and set the default type to uncached  */
mtrr_wrmsr(MSR_MTRRdefType, deftype & ~0xcff);
 +
 +  return !!(cr4 & X86_CR4_PGE);
>>>
>>> Unnecessary !!.
>>
>> Return type is bool. Isn't !! better in this case?
> 
> That was strictly a requirement when we were still using bool_t.
> But with bool the compiler is required to do the conversion even
> without the !! (and you'll observe that step by step we're getting
> rid of those !! now).

Okay, I'll drop the !!.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-08 Thread Jan Beulich
>>> On 08.03.18 at 15:05,  wrote:
> On 08/03/18 14:38, Jan Beulich wrote:
> On 02.03.18 at 09:14,  wrote:
>>> Instead of flushing the TLB from global pages when switching address
>>> spaces with XPTI being active just disable global pages via %cr4
>>> completely when a domain subject to XPTI is active. This avoids the
>>> need for extra TLB flushes as loading %cr3 will remove all TLB
>>> entries.
>> 
>> Hmm, it's far from obvious that this is an improvement overall.
>> Besides Xen's global pages, we also prevent guest user pages to
>> be evicted from the TLB across user <-> kernel mode changes.
>> And the effects of this are likely quite work load dependent.
> 
> With XPTI active we flush the TLB, including all global entries, when
> switching between page tables when returning to the guest. So there are
> no entries which could survive.

Well, right, but that's the "light" in "XPTI light". From a functionality
POV we don't need the guest user mappings to be flushed. Did you
happen to look at what the effect would be of running Xen with
PTE.G uniformly clear (and with the CR4 writes dropped from the
exit paths)?

>>> @@ -412,18 +414,22 @@ static void prepare_set(void)
>>> write_cr0(read_cr0() | X86_CR0_CD);
>>> wbinvd();
>>>  
>>> -   /*  TLB flushing here relies on Xen always using CR4.PGE. */
>>> -   BUILD_BUG_ON(!(XEN_MINIMAL_CR4 & X86_CR4_PGE));
>>> -   write_cr4(read_cr4() & ~X86_CR4_PGE);
>>> +   cr4 = read_cr4();
>>> +   if (cr4 & X86_CR4_PGE)
>>> +   write_cr4(cr4 & ~X86_CR4_PGE);
>>> +   else
>>> +   asm volatile( "mov %0, %%cr3" : : "r" (read_cr3()) : "memory" );
>>>  
>>> /*  Save MTRR state */
>>> rdmsrl(MSR_MTRRdefType, deftype);
>>>  
>>> /*  Disable MTRRs, and set the default type to uncached  */
>>> mtrr_wrmsr(MSR_MTRRdefType, deftype & ~0xcff);
>>> +
>>> +   return !!(cr4 & X86_CR4_PGE);
>> 
>> Unnecessary !!.
> 
> Return type is bool. Isn't !! better in this case?

That was strictly a requirement when we were still using bool_t.
But with bool the compiler is required to do the conversion even
without the !! (and you'll observe that step by step we're getting
rid of those !! now).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-08 Thread Juergen Gross
On 08/03/18 14:38, Jan Beulich wrote:
 On 02.03.18 at 09:14,  wrote:
>> Instead of flushing the TLB from global pages when switching address
>> spaces with XPTI being active just disable global pages via %cr4
>> completely when a domain subject to XPTI is active. This avoids the
>> need for extra TLB flushes as loading %cr3 will remove all TLB
>> entries.
> 
> Hmm, it's far from obvious that this is an improvement overall.
> Besides Xen's global pages, we also prevent guest user pages to
> be evicted from the TLB across user <-> kernel mode changes.
> And the effects of this are likely quite work load dependent.

With XPTI active we flush the TLB, including all global entries, when
switching between page tables when returning to the guest. So there are
no entries which could survive.

>> @@ -412,18 +414,22 @@ static void prepare_set(void)
>>  write_cr0(read_cr0() | X86_CR0_CD);
>>  wbinvd();
>>  
>> -/*  TLB flushing here relies on Xen always using CR4.PGE. */
>> -BUILD_BUG_ON(!(XEN_MINIMAL_CR4 & X86_CR4_PGE));
>> -write_cr4(read_cr4() & ~X86_CR4_PGE);
>> +cr4 = read_cr4();
>> +if (cr4 & X86_CR4_PGE)
>> +write_cr4(cr4 & ~X86_CR4_PGE);
>> +else
>> +asm volatile( "mov %0, %%cr3" : : "r" (read_cr3()) : "memory" );
>>  
>>  /*  Save MTRR state */
>>  rdmsrl(MSR_MTRRdefType, deftype);
>>  
>>  /*  Disable MTRRs, and set the default type to uncached  */
>>  mtrr_wrmsr(MSR_MTRRdefType, deftype & ~0xcff);
>> +
>> +return !!(cr4 & X86_CR4_PGE);
> 
> Unnecessary !!.

Return type is bool. Isn't !! better in this case?

> 
>> --- a/xen/arch/x86/flushtlb.c
>> +++ b/xen/arch/x86/flushtlb.c
>> @@ -72,20 +72,39 @@ static void post_flush(u32 t)
>>  this_cpu(tlbflush_time) = t;
>>  }
>>  
>> +static void do_flush_tlb(unsigned long cr3)
> 
> I think this is not a good name, because for its use in write_cr3()
> the TLB flush is specifically a secondary effect. Personally I'd
> prefer the function to be named e.g. do_write_cr3().

Okay.

> 
>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -622,7 +622,8 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, 
>> unsigned long guest_cr4);
>>  X86_CR4_SMAP | X86_CR4_OSXSAVE |\
>>  X86_CR4_FSGSBASE))  \
>>| ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0)) \
>> - & ~X86_CR4_DE)
>> + & ~(X86_CR4_DE |   \
>> + ((v)->domain->arch.pv_domain.xpti ? X86_CR4_PGE : 0)))
> 
> With this you manage to turn off global pages when switching to
> a PV vCPU. But I can't see how you turn global pages back on when
> switching away from it. I can see they would be turned back on e.g.
> on the first entry to a VMX guest, but how about an SVM one? And
> how about the time between switching away from the PV vCPU and
> that VM entry? Granted all flushes are global ones right now, but
> that should change with the modification here: If you look back at
> 4.2 code, you'll see that FLUSH_TLB was handled differently in that
> case, retaining Xen's global mappings. Any flush IPI not requesting
> global pages to be flushed could then leave intact Xen's own TLB
> entries, which takes as a prereq that CR4.PGE gets turned back on
> earlier.

Right, turning PGE on again is missing. I had a different solution for
switching PGE on and off in the beginning, but things got rather
complicated. So I changed my mind and turning PGE on again must have
slipped through.

> And one more change would belong into this patch, I think: In patch
> 2 you change write_ptbase(). The bare CR3 write there would
> become eligible to tick the TLB flush clock with what you do here.

Yes, I'll add that.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-08 Thread Jan Beulich
>>> On 02.03.18 at 09:14,  wrote:
> Instead of flushing the TLB from global pages when switching address
> spaces with XPTI being active just disable global pages via %cr4
> completely when a domain subject to XPTI is active. This avoids the
> need for extra TLB flushes as loading %cr3 will remove all TLB
> entries.

Hmm, it's far from obvious that this is an improvement overall.
Besides Xen's global pages, we also prevent guest user pages to
be evicted from the TLB across user <-> kernel mode changes.
And the effects of this are likely quite work load dependent.

> @@ -412,18 +414,22 @@ static void prepare_set(void)
>   write_cr0(read_cr0() | X86_CR0_CD);
>   wbinvd();
>  
> - /*  TLB flushing here relies on Xen always using CR4.PGE. */
> - BUILD_BUG_ON(!(XEN_MINIMAL_CR4 & X86_CR4_PGE));
> - write_cr4(read_cr4() & ~X86_CR4_PGE);
> + cr4 = read_cr4();
> + if (cr4 & X86_CR4_PGE)
> + write_cr4(cr4 & ~X86_CR4_PGE);
> + else
> + asm volatile( "mov %0, %%cr3" : : "r" (read_cr3()) : "memory" );
>  
>   /*  Save MTRR state */
>   rdmsrl(MSR_MTRRdefType, deftype);
>  
>   /*  Disable MTRRs, and set the default type to uncached  */
>   mtrr_wrmsr(MSR_MTRRdefType, deftype & ~0xcff);
> +
> + return !!(cr4 & X86_CR4_PGE);

Unnecessary !!.

> --- a/xen/arch/x86/flushtlb.c
> +++ b/xen/arch/x86/flushtlb.c
> @@ -72,20 +72,39 @@ static void post_flush(u32 t)
>  this_cpu(tlbflush_time) = t;
>  }
>  
> +static void do_flush_tlb(unsigned long cr3)

I think this is not a good name, because for its use in write_cr3()
the TLB flush is specifically a secondary effect. Personally I'd
prefer the function to be named e.g. do_write_cr3().

> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -622,7 +622,8 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, 
> unsigned long guest_cr4);
>  X86_CR4_SMAP | X86_CR4_OSXSAVE |\
>  X86_CR4_FSGSBASE))  \
>| ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0)) \
> - & ~X86_CR4_DE)
> + & ~(X86_CR4_DE |   \
> + ((v)->domain->arch.pv_domain.xpti ? X86_CR4_PGE : 0)))

With this you manage to turn off global pages when switching to
a PV vCPU. But I can't see how you turn global pages back on when
switching away from it. I can see they would be turned back on e.g.
on the first entry to a VMX guest, but how about an SVM one? And
how about the time between switching away from the PV vCPU and
that VM entry? Granted all flushes are global ones right now, but
that should change with the modification here: If you look back at
4.2 code, you'll see that FLUSH_TLB was handled differently in that
case, retaining Xen's global mappings. Any flush IPI not requesting
global pages to be flushed could then leave intact Xen's own TLB
entries, which takes as a prereq that CR4.PGE gets turned back on
earlier.

And one more change would belong into this patch, I think: In patch
2 you change write_ptbase(). The bare CR3 write there would
become eligible to tick the TLB flush clock with what you do here.

Talking of VMX: I wonder whether it wouldn't be better (perhaps both
cheaper and long term possibly more correct) if vmx_ctxt_switch_to()
didn't write CR4, but instead sync-ed HOST_CR4 to what read_cr4()
returns. Jun, Kevin, do you have any thoughts on this? For the patch
here, this would get the behavior on par with SVM, as described above.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-02 Thread Juergen Gross
On 02/03/18 12:03, Wei Liu wrote:
> On Fri, Mar 02, 2018 at 09:14:01AM +0100, Juergen Gross wrote:
>> Instead of flushing the TLB from global pages when switching address
>> spaces with XPTI being active just disable global pages via %cr4
>> completely when a domain subject to XPTI is active. This avoids the
>> need for extra TLB flushes as loading %cr3 will remove all TLB
>> entries.
>>
>> Signed-off-by: Juergen Gross 
> 
> I think you patch tries to disable PGE wholesale for Xen, but I don't
> think this patch catches all of them.

I do it only while a domain subject to XPTI is active on the cpu.

> XEN_MINIMAL_CR4 still has PGE in it. EFI path and the stack resyncing
> asm in __start_xen are missing.

That's on purpose. Without XPTI (e.g. on AMD cpus) PGE is still good
for performance of 64-bit pv domains (but that's the only case AFAIK).

> You also missed fixing the comment of setting MTRR. It is not a big deal
> though.

I'm not sure this comment wants to be modified as normally PGE is still
on, e.g. when booting. The only case I could think of where MTRR would
be modified with PGE off should be a dom0 hypercall, e.g.
XENPF_add_memtype.

> I have in fact written a small series to make CR4.PGE configurable but
> haven't got around to send the patches yet because I didn't think it is
> very useful in its own.

Depends on whether re-using global TLB entry saves more than flushing
the TLB from global pages does cost.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-02 Thread Wei Liu
On Fri, Mar 02, 2018 at 09:14:01AM +0100, Juergen Gross wrote:
> Instead of flushing the TLB from global pages when switching address
> spaces with XPTI being active just disable global pages via %cr4
> completely when a domain subject to XPTI is active. This avoids the
> need for extra TLB flushes as loading %cr3 will remove all TLB
> entries.
> 
> Signed-off-by: Juergen Gross 

I think you patch tries to disable PGE wholesale for Xen, but I don't
think this patch catches all of them.

XEN_MINIMAL_CR4 still has PGE in it. EFI path and the stack resyncing
asm in __start_xen are missing. 

You also missed fixing the comment of setting MTRR. It is not a big deal
though.

I have in fact written a small series to make CR4.PGE configurable but
haven't got around to send the patches yet because I didn't think it is
very useful in its own.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 4/6] xen/x86: disable global pages for domains with XPTI active

2018-03-02 Thread Juergen Gross
Instead of flushing the TLB from global pages when switching address
spaces with XPTI being active just disable global pages via %cr4
completely when a domain subject to XPTI is active. This avoids the
need for extra TLB flushes as loading %cr3 will remove all TLB
entries.

Signed-off-by: Juergen Gross 
---
 xen/arch/x86/cpu/mtrr/generic.c | 32 +---
 xen/arch/x86/flushtlb.c | 39 +--
 xen/arch/x86/x86_64/entry.S | 10 --
 xen/include/asm-x86/domain.h|  3 ++-
 4 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/cpu/mtrr/generic.c b/xen/arch/x86/cpu/mtrr/generic.c
index e9c0e5e059..d705138100 100644
--- a/xen/arch/x86/cpu/mtrr/generic.c
+++ b/xen/arch/x86/cpu/mtrr/generic.c
@@ -400,8 +400,10 @@ static DEFINE_SPINLOCK(set_atomicity_lock);
  * has been called.
  */
 
-static void prepare_set(void)
+static bool prepare_set(void)
 {
+   unsigned long cr4;
+
/*  Note that this is not ideal, since the cache is only 
flushed/disabled
   for this CPU while the MTRRs are changed, but changing this requires
   more invasive changes to the way the kernel boots  */
@@ -412,18 +414,22 @@ static void prepare_set(void)
write_cr0(read_cr0() | X86_CR0_CD);
wbinvd();
 
-   /*  TLB flushing here relies on Xen always using CR4.PGE. */
-   BUILD_BUG_ON(!(XEN_MINIMAL_CR4 & X86_CR4_PGE));
-   write_cr4(read_cr4() & ~X86_CR4_PGE);
+   cr4 = read_cr4();
+   if (cr4 & X86_CR4_PGE)
+   write_cr4(cr4 & ~X86_CR4_PGE);
+   else
+   asm volatile( "mov %0, %%cr3" : : "r" (read_cr3()) : "memory" );
 
/*  Save MTRR state */
rdmsrl(MSR_MTRRdefType, deftype);
 
/*  Disable MTRRs, and set the default type to uncached  */
mtrr_wrmsr(MSR_MTRRdefType, deftype & ~0xcff);
+
+   return !!(cr4 & X86_CR4_PGE);
 }
 
-static void post_set(void)
+static void post_set(bool pge)
 {
/* Intel (P6) standard MTRRs */
mtrr_wrmsr(MSR_MTRRdefType, deftype);
@@ -432,7 +438,10 @@ static void post_set(void)
write_cr0(read_cr0() & ~X86_CR0_CD);
 
/*  Reenable CR4.PGE (also flushes the TLB) */
-   write_cr4(read_cr4() | X86_CR4_PGE);
+   if (pge)
+   write_cr4(read_cr4() | X86_CR4_PGE);
+   else
+   asm volatile( "mov %0, %%cr3" : : "r" (read_cr3()) : "memory" );
 
spin_unlock(_atomicity_lock);
 }
@@ -441,14 +450,15 @@ static void generic_set_all(void)
 {
unsigned long mask, count;
unsigned long flags;
+   bool pge;
 
local_irq_save(flags);
-   prepare_set();
+   pge = prepare_set();
 
/* Actually set the state */
mask = set_mtrr_state();
 
-   post_set();
+   post_set(pge);
local_irq_restore(flags);
 
/*  Use the atomic bitops to update the global mask  */
@@ -457,7 +467,6 @@ static void generic_set_all(void)
set_bit(count, _changes_mask);
mask >>= 1;
}
-   
 }
 
 static void generic_set_mtrr(unsigned int reg, unsigned long base,
@@ -474,11 +483,12 @@ static void generic_set_mtrr(unsigned int reg, unsigned 
long base,
 {
unsigned long flags;
struct mtrr_var_range *vr;
+   bool pge;
 
vr = _state.var_ranges[reg];
 
local_irq_save(flags);
-   prepare_set();
+   pge = prepare_set();
 
if (size == 0) {
/* The invalid bit is kept in the mask, so we simply clear the
@@ -499,7 +509,7 @@ static void generic_set_mtrr(unsigned int reg, unsigned 
long base,
mtrr_wrmsr(MSR_IA32_MTRR_PHYSMASK(reg), vr->mask);
}
 
-   post_set();
+   post_set(pge);
local_irq_restore(flags);
 }
 
diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index e4ea4f3297..186d9099f6 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -72,20 +72,39 @@ static void post_flush(u32 t)
 this_cpu(tlbflush_time) = t;
 }
 
+static void do_flush_tlb(unsigned long cr3)
+{
+unsigned long cr4;
+
+cr4 = read_cr4();
+if ( cr4 & X86_CR4_PGE )
+{
+write_cr4(cr4 & ~X86_CR4_PGE);
+if ( cr3 )
+asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );
+else
+barrier();
+write_cr4(cr4);
+}
+else
+{
+if ( !cr3 )
+cr3 = read_cr3();
+asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );
+}
+}
+
 void write_cr3(unsigned long cr3)
 {
-unsigned long flags, cr4;
+unsigned long flags;
 u32 t;
 
 /* This non-reentrant function is sometimes called in interrupt context. */
 local_irq_save(flags);
 
 t = pre_flush();
-cr4 = read_cr4();
 
-write_cr4(cr4 & ~X86_CR4_PGE);
-asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );
-write_cr4(cr4);
+do_flush_tlb(cr3);
 
 post_flush(t);