Re: [PATCH] target-i386: Do not set MCG_SER_P by default

2015-11-23 Thread Eduardo Habkost
On Sat, Nov 21, 2015 at 02:09:25AM +0100, Borislav Petkov wrote:
> On Sat, Nov 21, 2015 at 12:11:35AM +0100, Andreas Färber wrote:
> > Hi,
> > 
> > CC'ing qemu-devel.
> 
> Ah, thanks.
> 
> > Am 21.11.2015 um 00:01 schrieb Borislav Petkov:
> > > From: Borislav Petkov 
> > > 
> > > Software Error Recovery, i.e. SER, is purely an Intel feature and it
> > > shouldn't be set by default. Enable it only on Intel.

What happens when SER is enabled on an AMD CPU? If it really
should't be enabled, why is KVM returning it on
KVM_X86_GET_MCE_CAP_SUPPORTED?

> > 
> > Is this new in 2.5? Otherwise we would probably need compatibility code
> > in pc*.[ch] for incoming live migration from older versions.
> 
> It looks it is really old, AFAIK from 2010:
> 
> c0532a76b407 ("MCE: Relay UCR MCE to guest")
> 
> You'd need to be more verbose about pc*.[ch]. An example perhaps...?

If you change something that's guest-visible and not part of the
migration stream, you need to keep the old behavior on older
machine-types (e.g. pc-i440fx-2.4), or the CPU will change under
the guest's feet when migrating to another host.

For examples, see the recent commits to include/hw/i386/pc.h.
They are all about keeping compatibility when CPUID bits are
changed:

33b5e8c0 target-i386: Disable rdtscp on Opteron_G* CPU models
6aa91e4a target-i386: Remove POPCNT from qemu64 and qemu32 CPU models
71195672 target-i386: Remove ABM from qemu64 CPU model
0909ad24 target-i386: Remove SSE4a from qemu64 CPU model

In the case of this code, it looks like it's already broken
because the resulting mcg_cap depends on host kernel capabilities
(the ones reported by kvm_get_mce_cap_supported()), and the data
initialized by target-i386/cpu.c:mce_init() is silently
overwritten by kvm_arch_init_vcpu(). So we would need to fix that
before implementing a proper compatibility mechanism for
mcg_cap.

-- 
Eduardo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] target-i386: Do not set MCG_SER_P by default

2015-11-23 Thread Paolo Bonzini


On 23/11/2015 14:22, Eduardo Habkost wrote:
> > Software Error Recovery, i.e. SER, is purely an Intel feature and it
> > shouldn't be set by default. Enable it only on Intel.
> 
> What happens when SER is enabled on an AMD CPU? If it really
> should't be enabled, why is KVM returning it on
> KVM_X86_GET_MCE_CAP_SUPPORTED?

Indeed... is it a problem if our frankenstein AMD CPU can recover from
memory errors?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] target-i386: Do not set MCG_SER_P by default

2015-11-23 Thread Borislav Petkov
+ Tony.

On Mon, Nov 23, 2015 at 03:47:44PM +0100, Paolo Bonzini wrote:
> On 23/11/2015 14:22, Eduardo Habkost wrote:
> > > Software Error Recovery, i.e. SER, is purely an Intel feature and it
> > > shouldn't be set by default. Enable it only on Intel.
> > 
> > What happens when SER is enabled on an AMD CPU? If it really
> > should't be enabled, why is KVM returning it on
> > KVM_X86_GET_MCE_CAP_SUPPORTED?
> 
> Indeed... is it a problem if our frankenstein AMD CPU can recover from
> memory errors?

The problem stems from the fact that the guest kernel looks at SER and
does different handling depending on that bit:

machine_check_poll:

...

if (!(flags & MCP_UC) &&
(m.status & (mca_cfg.ser ? MCI_STATUS_S : MCI_STATUS_UC)))
continue;

so when the guest kernel gets a correctable error (injected..., for
example) it sees that bit set. Even though kvm/qemu emulates an AMD
CPU. So on AMD with that bit set, it would puzzle the whole error
handling/reporting in the guest kernel.

Oh, btw, I'm using a kvm guest to inject MCEs. In case you were
wondering why is he even doing that. :-)

And I'm not sure it makes sense to set that bit for an Intel guest too.
For the simple reason that I don't know how much of the Software Error
Recovery stuff is actually implemented there. If stuff is missing, you
probably don't want to advertize it there too. And by "stuff" I mean
all that fun in section "15.6 RECOVERY OF UNCORRECTED RECOVERABLE (UCR)
ERRORS" of the SDM.

It's a whole another question how/whether to do UCR error handling in
the guest or maybe leave it to the host...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] target-i386: Do not set MCG_SER_P by default

2015-11-20 Thread Andreas Färber
Hi,

CC'ing qemu-devel.

Am 21.11.2015 um 00:01 schrieb Borislav Petkov:
> From: Borislav Petkov 
> 
> Software Error Recovery, i.e. SER, is purely an Intel feature and it
> shouldn't be set by default. Enable it only on Intel.

Is this new in 2.5? Otherwise we would probably need compatibility code
in pc*.[ch] for incoming live migration from older versions.

> 
> Signed-off-by: Borislav Petkov 
> ---
>  target-i386/cpu.c | 7 ---
>  target-i386/cpu.h | 9 -
>  target-i386/kvm.c | 5 +
>  3 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 11e5e39a756a..8155ee94fbe1 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2803,13 +2803,6 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error 
> **errp)
>  }
>  #endif
>  
> -
> -#define IS_INTEL_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_INTEL_1 && \
> -   (env)->cpuid_vendor2 == CPUID_VENDOR_INTEL_2 && \
> -   (env)->cpuid_vendor3 == CPUID_VENDOR_INTEL_3)
> -#define IS_AMD_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_AMD_1 && \
> - (env)->cpuid_vendor2 == CPUID_VENDOR_AMD_2 && \
> - (env)->cpuid_vendor3 == CPUID_VENDOR_AMD_3)
>  static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>  {
>  CPUState *cs = CPU(dev);
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index fc4a605d6a29..2605c564239a 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -283,7 +283,7 @@
>  #define MCG_CTL_P   (1ULL<<8)   /* MCG_CAP register available */
>  #define MCG_SER_P   (1ULL<<24) /* MCA recovery/new status bits */
>  
> -#define MCE_CAP_DEF (MCG_CTL_P|MCG_SER_P)
> +#define MCE_CAP_DEF MCG_CTL_P
>  #define MCE_BANKS_DEF   10
>  
>  #define MCG_STATUS_RIPV (1ULL<<0)   /* restart ip valid */
> @@ -610,6 +610,13 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
>  #define CPUID_MWAIT_IBE (1U << 1) /* Interrupts can exit capability */
>  #define CPUID_MWAIT_EMX (1U << 0) /* enumeration supported */
>  
> +#define IS_INTEL_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_INTEL_1 && \
> +   (env)->cpuid_vendor2 == CPUID_VENDOR_INTEL_2 && \
> +   (env)->cpuid_vendor3 == CPUID_VENDOR_INTEL_3)
> +#define IS_AMD_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_AMD_1 && \
> + (env)->cpuid_vendor2 == CPUID_VENDOR_AMD_2 && \
> + (env)->cpuid_vendor3 == CPUID_VENDOR_AMD_3)
> +
>  #ifndef HYPERV_SPINLOCK_NEVER_RETRY
>  #define HYPERV_SPINLOCK_NEVER_RETRY 0x
>  #endif
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 2a9953b2d4b5..082d38d4838d 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -787,8 +787,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
>  if (banks > MCE_BANKS_DEF) {
>  banks = MCE_BANKS_DEF;
>  }
> +
>  mcg_cap &= MCE_CAP_DEF;
>  mcg_cap |= banks;
> +
> + if (IS_INTEL_CPU(env))
> + mcg_cap |= MCG_SER_P;

Tabs and missing braces.

> +
>  ret = kvm_vcpu_ioctl(cs, KVM_X86_SETUP_MCE, _cap);
>  if (ret < 0) {
>  fprintf(stderr, "KVM_X86_SETUP_MCE: %s", strerror(-ret));

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] target-i386: Do not set MCG_SER_P by default

2015-11-20 Thread Borislav Petkov
On Sat, Nov 21, 2015 at 12:11:35AM +0100, Andreas Färber wrote:
> Hi,
> 
> CC'ing qemu-devel.

Ah, thanks.

> Am 21.11.2015 um 00:01 schrieb Borislav Petkov:
> > From: Borislav Petkov 
> > 
> > Software Error Recovery, i.e. SER, is purely an Intel feature and it
> > shouldn't be set by default. Enable it only on Intel.
> 
> Is this new in 2.5? Otherwise we would probably need compatibility code
> in pc*.[ch] for incoming live migration from older versions.

It looks it is really old, AFAIK from 2010:

c0532a76b407 ("MCE: Relay UCR MCE to guest")

You'd need to be more verbose about pc*.[ch]. An example perhaps...?

> >  mcg_cap &= MCE_CAP_DEF;
> >  mcg_cap |= banks;
> > +
> > +   if (IS_INTEL_CPU(env))
> > +   mcg_cap |= MCG_SER_P;
> 
> Tabs and missing braces.

Ok.

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html