Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 03:51 PM, Boris Ostrovsky wrote:
> On 01/12/2017 03:48 PM, Andrew Cooper wrote:
>> On 12/01/17 20:46, Boris Ostrovsky wrote:
>>> On 01/12/2017 02:27 PM, Andrew Cooper wrote:
 On 12/01/17 18:00, Boris Ostrovsky wrote:
>> Ahh! found it.  This is a side effect of starting to generate the dom0
>> policy in Xen.
>>
>> Can you try this patch?
> Intel/AMD HVM/PV 64/32bit all look good. So
>
> Tested-by: Boris Ostrovsky 
 Does this mean that newer versions of Linux more picky about what they
 tolerate in cpuid?
>>> We started to fail after change in Xen so I am not sure it's something
>>> new in Linux.
>> Right, but Linux 4.4 was entirely happy with this bug, both with and
>> without having CPUID faulting imposed on it.
> Oh, I see. My tests (typically) build and run the latest Linux tree (and
> Xen staging) every morning.
>
> I am trying to see what part of Linux caused the crash.


So the problem starts in Linux ht_detect(), where we check
X86_FEATURE_CMP_LEGACY. On Intel this is supposed to be clear and we
should end up setting phys_proc_id below. This value is then used in
topology_update_package_map(). If the value is incorrect (which it will
be if we bail early in ht_detect()) we may get a BUG_ON() at the caller.
Unfortunately we were too early to see the splat from the BUG_ON so it
wasn't clear right away why we were dying.

On AMD phys_proc_id is set elsewhere.

And the reason you haven't seen problems with earlier versions of Linux
is because the last two or so kernel releases saw major changes in
topology discovery (and, more importantly, topology validation). There
have been a bunch of Xen regressions due to that (the most recent is the
one Konrad reported a few days ago with 32 cores). This all is very
fragile for Xen guests due to bogus CPUID/APICID values.

(+Mohit who has been looking into another problem related to topology)

-boris


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 03:48 PM, Andrew Cooper wrote:
> On 12/01/17 20:46, Boris Ostrovsky wrote:
>> On 01/12/2017 02:27 PM, Andrew Cooper wrote:
>>> On 12/01/17 18:00, Boris Ostrovsky wrote:
> Ahh! found it.  This is a side effect of starting to generate the dom0
> policy in Xen.
>
> Can you try this patch?
 Intel/AMD HVM/PV 64/32bit all look good. So

 Tested-by: Boris Ostrovsky 
>>> Does this mean that newer versions of Linux more picky about what they
>>> tolerate in cpuid?
>> We started to fail after change in Xen so I am not sure it's something
>> new in Linux.
> Right, but Linux 4.4 was entirely happy with this bug, both with and
> without having CPUID faulting imposed on it.

Oh, I see. My tests (typically) build and run the latest Linux tree (and
Xen staging) every morning.

I am trying to see what part of Linux caused the crash.

-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 20:46, Boris Ostrovsky wrote:
> On 01/12/2017 02:27 PM, Andrew Cooper wrote:
>> On 12/01/17 18:00, Boris Ostrovsky wrote:
 Ahh! found it.  This is a side effect of starting to generate the dom0
 policy in Xen.

 Can you try this patch?
>>> Intel/AMD HVM/PV 64/32bit all look good. So
>>>
>>> Tested-by: Boris Ostrovsky 
>> Does this mean that newer versions of Linux more picky about what they
>> tolerate in cpuid?
> We started to fail after change in Xen so I am not sure it's something
> new in Linux.

Right, but Linux 4.4 was entirely happy with this bug, both with and
without having CPUID faulting imposed on it.

Either way, it is definitely a bug in my code.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 02:27 PM, Andrew Cooper wrote:
> On 12/01/17 18:00, Boris Ostrovsky wrote:
>>> Ahh! found it.  This is a side effect of starting to generate the dom0
>>> policy in Xen.
>>>
>>> Can you try this patch?
>> Intel/AMD HVM/PV 64/32bit all look good. So
>>
>> Tested-by: Boris Ostrovsky 
> Does this mean that newer versions of Linux more picky about what they
> tolerate in cpuid?

We started to fail after change in Xen so I am not sure it's something
new in Linux.

-boris


>
> This bug highlights a hole in my testing strategy, which I will attempt
> to plug.
>
> ~Andrew



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 18:00, Boris Ostrovsky wrote:
>> Ahh! found it.  This is a side effect of starting to generate the dom0
>> policy in Xen.
>>
>> Can you try this patch?
>
> Intel/AMD HVM/PV 64/32bit all look good. So
>
> Tested-by: Boris Ostrovsky 

Does this mean that newer versions of Linux more picky about what they
tolerate in cpuid?

This bug highlights a hole in my testing strategy, which I will attempt
to plug.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky

> Ahh! found it.  This is a side effect of starting to generate the dom0
> policy in Xen.
>
> Can you try this patch?


Intel/AMD HVM/PV 64/32bit all look good. So

Tested-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 15:50, Boris Ostrovsky wrote:
> On 01/12/2017 10:31 AM, Andrew Cooper wrote:
>> On 12/01/17 15:22, Boris Ostrovsky wrote:
  case 0x8001:
 -c &= pv_featureset[FEATURESET_e1c];
 -d &= pv_featureset[FEATURESET_e1d];
 +c = p->extd.e1c;
>>> This appears to crash guests Intel, at least for dom0.
>> Is this a PVH dom0?  I can't see from this snippet which function you
>> are in.
> No, this is normal PV dom0.
>
> I may have gone too far trimming the patch. It's this chunk:
>
>
> @@ -1291,15 +1281,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
>  }
>  
>  case 1:
> -a &= pv_featureset[FEATURESET_Da1];
> +a = p->xstate.Da1;
>  b = c = d = 0;
>  break;
>  }
>  break;
>  
>  case 0x8001:
> -c &= pv_featureset[FEATURESET_e1c];
> -d &= pv_featureset[FEATURESET_e1d];
> +c = p->extd.e1c;
> +d = p->extd.e1d;
>
>
>>> p->extd.e1c is 0x3 and bit 1 is reserved on Intel.
>>> I haven't traced it yet to exact place that causes dom0 to crash but
>>> clearing this bit make dom0 boot.
>> The logic immediately below the snippet should clean out the common bits
>> if vendor != AMD.  Do we perhaps have a bad vendor setting?
>>
> -bash-4.1# ./cpuid 0
> CPUID 0x: eax = 0x000d ebx = 0x756e6547 ecx = 0x6c65746e edx
> = 0x49656e69
> -bash-4.1#
>
> This is machine that I run my nightly tests on and it failed this
> morning so it's not a new HW.
>
> As far as adjusting the bits based on vendor --- don't you only do this
> for edx:
>
> arch/x86/cpuid.c: pv_cpuid():
>
>case 0x8001:
> res->c = p->extd.e1c;
> res->c &= ~2U; // My workaround
> res->d = p->extd.e1d;
>
> /* If not emulating AMD, clear the duplicated features in e1d. */
> if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
> res->d &= ~CPUID_COMMON_1D_FEATURES;

Ahh! found it.  This is a side effect of starting to generate the dom0
policy in Xen.

Can you try this patch?

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index b685874..1e5013d 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -164,14 +164,6 @@ static void __init calculate_pv_max_policy(void)
 /* Unconditionally claim to be able to set the hypervisor bit. */
 __set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
 
-/*
- * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
- * affect how to interpret topology information in other cpuid leaves.
- */
-__set_bit(X86_FEATURE_HTT, pv_featureset);
-__set_bit(X86_FEATURE_X2APIC, pv_featureset);
-__set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
-
 sanitise_featureset(pv_featureset);
 cpuid_featureset_to_policy(pv_featureset, p);
 }
@@ -199,14 +191,6 @@ static void __init calculate_hvm_max_policy(void)
 __set_bit(X86_FEATURE_HYPERVISOR, hvm_featureset);
 
 /*
- * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
- * affect how to interpret topology information in other cpuid leaves.
- */
-__set_bit(X86_FEATURE_HTT, hvm_featureset);
-__set_bit(X86_FEATURE_X2APIC, hvm_featureset);
-__set_bit(X86_FEATURE_CMP_LEGACY, hvm_featureset);
-
-/*
  * Xen can provide an APIC emulation to HVM guests even if the
host's APIC
  * isn't enabled.
  */
@@ -301,6 +285,14 @@ void recalculate_cpuid_policy(struct domain *d)
 }
 
 /*
+ * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+ * affect how to interpret topology information in other cpuid leaves.
+ */
+__set_bit(X86_FEATURE_HTT, max_fs);
+__set_bit(X86_FEATURE_X2APIC, max_fs);
+__set_bit(X86_FEATURE_CMP_LEGACY, max_fs);
+
+/*
  * 32bit PV domains can't use any Long Mode features, and cannot use
  * SYSCALL on non-AMD hardware.
  */


The toolstack fudge is still necessary for PV guests (where faulting
isn't in use), and still necessary for HVM guests until I fix topology
representation, but we shouldn't be exposing them by default on hardware
which lacks the appropriate bits.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky
On 01/12/2017 10:31 AM, Andrew Cooper wrote:
> On 12/01/17 15:22, Boris Ostrovsky wrote:
>>>  case 0x8001:
>>> -c &= pv_featureset[FEATURESET_e1c];
>>> -d &= pv_featureset[FEATURESET_e1d];
>>> +c = p->extd.e1c;
>> This appears to crash guests Intel, at least for dom0.
> Is this a PVH dom0?  I can't see from this snippet which function you
> are in.

No, this is normal PV dom0.

I may have gone too far trimming the patch. It's this chunk:


@@ -1291,15 +1281,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
 }
 
 case 1:
-a &= pv_featureset[FEATURESET_Da1];
+a = p->xstate.Da1;
 b = c = d = 0;
 break;
 }
 break;
 
 case 0x8001:
-c &= pv_featureset[FEATURESET_e1c];
-d &= pv_featureset[FEATURESET_e1d];
+c = p->extd.e1c;
+d = p->extd.e1d;


>
>> p->extd.e1c is 0x3 and bit 1 is reserved on Intel.
>> I haven't traced it yet to exact place that causes dom0 to crash but
>> clearing this bit make dom0 boot.
> The logic immediately below the snippet should clean out the common bits
> if vendor != AMD.  Do we perhaps have a bad vendor setting?
>

-bash-4.1# ./cpuid 0
CPUID 0x: eax = 0x000d ebx = 0x756e6547 ecx = 0x6c65746e edx
= 0x49656e69
-bash-4.1#

This is machine that I run my nightly tests on and it failed this
morning so it's not a new HW.

As far as adjusting the bits based on vendor --- don't you only do this
for edx:

arch/x86/cpuid.c: pv_cpuid():

   case 0x8001:
res->c = p->extd.e1c;
res->c &= ~2U; // My workaround
res->d = p->extd.e1d;

/* If not emulating AMD, clear the duplicated features in e1d. */
if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
res->d &= ~CPUID_COMMON_1D_FEATURES;


-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Andrew Cooper
On 12/01/17 15:22, Boris Ostrovsky wrote:
>>  case 0x8001:
>> -c &= pv_featureset[FEATURESET_e1c];
>> -d &= pv_featureset[FEATURESET_e1d];
>> +c = p->extd.e1c;
> This appears to crash guests Intel, at least for dom0.

Is this a PVH dom0?  I can't see from this snippet which function you
are in.

>
> p->extd.e1c is 0x3 and bit 1 is reserved on Intel.

>
> I haven't traced it yet to exact place that causes dom0 to crash but
> clearing this bit make dom0 boot.

The logic immediately below the snippet should clean out the common bits
if vendor != AMD.  Do we perhaps have a bad vendor setting?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-12 Thread Boris Ostrovsky

>  case 0x8001:
> -c &= pv_featureset[FEATURESET_e1c];
> -d &= pv_featureset[FEATURESET_e1d];
> +c = p->extd.e1c;

This appears to crash guests Intel, at least for dom0.

p->extd.e1c is 0x3 and bit 1 is reserved on Intel.

I haven't traced it yet to exact place that causes dom0 to crash but
clearing this bit make dom0 boot.


-boris


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 16/25] x86/pv: Use per-domain policy information in pv_cpuid()

2017-01-09 Thread Andrew Cooper
... rather than performing runtime adjustments.  This is safe now that
recalculate_cpuid_policy() perfoms suitable sanitisation when the policy data
is loaded.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
 xen/arch/x86/traps.c | 44 
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 47c7ce7..360b10d 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1064,11 +1064,8 @@ void pv_cpuid(struct cpu_user_regs *regs)
 uint32_t tmp;
 
 case 0x0001:
-c &= pv_featureset[FEATURESET_1c];
-d &= pv_featureset[FEATURESET_1d];
-
-if ( is_pv_32bit_domain(currd) )
-c &= ~cpufeat_mask(X86_FEATURE_CX16);
+c = p->basic._1c;
+d = p->basic._1d;
 
 if ( !is_pvh_domain(currd) )
 {
@@ -1127,7 +1124,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
  *Emulated vs Faulted CPUID is distinguised based on whether a
  *#UD or #GP is currently being serviced.
  */
-/* OSXSAVE cleared by pv_featureset.  Fast-forward CR4 back in. */
+/* OSXSAVE clear in policy.  Fast-forward CR4 back in. */
 if ( (curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE) ||
  (regs->entry_vector == TRAP_invalid_op &&
   guest_kernel_mode(curr, regs) &&
@@ -1203,21 +1200,14 @@ void pv_cpuid(struct cpu_user_regs *regs)
 if ( cpu_has(_cpu_data, X86_FEATURE_DSCPL) )
 c |= cpufeat_mask(X86_FEATURE_DSCPL);
 }
-
-c |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
 break;
 
 case 0x0007:
 if ( subleaf == 0 )
 {
-/* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
-b &= (pv_featureset[FEATURESET_7b0] &
-  ~special_features[FEATURESET_7b0]);
-b |= (host_featureset[FEATURESET_7b0] &
-  special_features[FEATURESET_7b0]);
-
-c &= pv_featureset[FEATURESET_7c0];
-d &= pv_featureset[FEATURESET_7d0];
+b = currd->arch.cpuid->feat._7b0;
+c = currd->arch.cpuid->feat._7c0;
+d = currd->arch.cpuid->feat._7d0;
 
 if ( !is_pvh_domain(currd) )
 {
@@ -1226,7 +1216,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
  * and HVM guests no longer enter a PV codepath.
  */
 
-/* OSPKE cleared by pv_featureset.  Fast-forward CR4 back in. 
*/
+/* OSPKE clear in policy.  Fast-forward CR4 back in. */
 if ( curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_PKE )
 c |= cpufeat_mask(X86_FEATURE_OSPKE);
 }
@@ -1291,15 +1281,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
 }
 
 case 1:
-a &= pv_featureset[FEATURESET_Da1];
+a = p->xstate.Da1;
 b = c = d = 0;
 break;
 }
 break;
 
 case 0x8001:
-c &= pv_featureset[FEATURESET_e1c];
-d &= pv_featureset[FEATURESET_e1d];
+c = p->extd.e1c;
+d = p->extd.e1d;
 
 /* If not emulating AMD, clear the duplicated features in e1d. */
 if ( currd->arch.x86_vendor != X86_VENDOR_AMD )
@@ -1317,25 +1307,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
 if ( is_hardware_domain(currd) && guest_kernel_mode(curr, regs) &&
  cpu_has_mtrr )
 d |= cpufeat_mask(X86_FEATURE_MTRR);
-
-if ( is_pv_32bit_domain(currd) )
-{
-d &= ~cpufeat_mask(X86_FEATURE_LM);
-c &= ~cpufeat_mask(X86_FEATURE_LAHF_LM);
-
-if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
-d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
-}
 break;
 
 case 0x8007:
-d &= (pv_featureset[FEATURESET_e7d] |
-  (host_featureset[FEATURESET_e7d] & 
cpufeat_mask(X86_FEATURE_ITSC)));
+d = p->extd.e7d;
 break;
 
 case 0x8008:
 a = paddr_bits | (vaddr_bits << 8);
-b &= pv_featureset[FEATURESET_e8b];
+b = p->extd.e8b;
 break;
 
 case 0x0005: /* MONITOR/MWAIT */
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel