Re: [Qemu-devel] [Qemu-arm] [PATCH RESEND v15 10/10] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
On Mon, 26 Nov 2018 at 17:18, gengdongjiu wrote: > > Hi Peter, >Thanks for the comments. > > > > > Yes, but if you look at the code which calls that, it goes on to do: > > env->mcg_cap &= mcg_cap | MCG_CAP_BANKS_MASK; > > > > which means that if the host kernel does not support this feature then we > > will clear those bits in the env->mcg_cap field, so we do not > > advertise it to the guest. But we might be not advertising it to the guest > > at all, if env->mcg_cap was 0 before this code was called. That > > happens if we are presenting the guest with a guest CPU type which does not > > have the feature. > > So you mean "env->mcg_cap" uses a bit to indicate that whether guest CPU > support error recovery, right? If so, how we know whether guest CPU have this > feature? > Should we initialized MCG_SER_P bit of env->mcg_cap to 1 or 0 before > initializing the vcpu? Please together see the reply in the [2], thanks. This is an x86-specific field. We need to identify the Arm equivalent. > > The question is not "does the host CPU / KVM support error reporting". It > > is "does the *guest* CPU / system support error reporting". > > These are distinct questions which may not have the same answer. > > [2]: > But QEMU should not know whether *guest* CPU / system support error reporting. It should and it must. Otherwise we will deliver an exception to a guest that cannot handle it. > From my understanding, in the X86 platform, if host CPU / KVM support error > reporting, > then it will think guest CPU supports error reporting, and set the > MCG_SER_P bit of env->mcg_cap to 1 No, this is not correct. If you look at the code quoted above. it ANDs in the mcg_cap value. This code is run after the CPU is initially created with a default env->mcg_cap which depends on what CPU model is being exposed to the guest. So we only set env->mcg_cap to 1 if the guest vCPU model supports error reporting AND the host KVM does. thanks -- PMM
Re: [Qemu-devel] [Qemu-arm] [PATCH RESEND v15 10/10] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
Hi Peter, Thanks for the comments. > On Fri, 23 Nov 2018 at 14:31, gengdongjiu wrote: > > > > Hi Peter, > > Thanks for the comments and mail. > > > > > > > > On 22 November 2018 at 10:28, Peter Maydell > > > wrote: > > > > On 22 November 2018 at 03:05, gengdongjiu > > > > wrote: > > > >>> > > > > >>> Shouldn't there be something in here to say "only report this error > > > >>> to the guest if we are actually reporting RAS errors to the > guest" ? > > > >> > > > >> Yes, We can say something that such as "report this error to the > > > >> guest", because this error is indeed triggered by guest, which is > > > >> guest > > > error. > > > > > > > > I'm afraid I don't really understand what you mean. Could you try > > > > rephrasing it? > > > > > > > > My understanding was: > > > > * we get this signal if there is a RAS error in the host memory > > > > * if we are exposing RAS errors to the guest (ie we have > > > >told it that in the ACPI table we passed it at startup) > > > >then we should pass on this error to the guest > > > > > > > > but that these are two different conditions. > > > > > > > > If the host hardware detects a RAS error in memory used by the > > > > guest but the guest is not being told about RAS errors, then we > > > > cannot report the error: we have no mechanism to do so, and the > > > > guest is not expecting it. > > > > > > If you look at the x86 version of this function you can see that it > > > tests (env->mcg_cap & MCG_SER_P), which I think is the equivalent x86 "is > > > the guest CPU/config one we can report these errors to" > test. > > > > MCG_SER_P (software error recovery support present) flag indicates (when > > set) that the processor supports software error recovery. > > env->mcg_cap 's value should be got from KVM as shown in the QEMU code[1], > > it indicates whether the KVM support software error > recovery. > > > > [1]: > > - > > ret = kvm_get_mce_cap_supported(cs->kvm_state, _cap, ); > > if (ret < 0) { > >fprintf(stderr, "kvm_get_mce_cap_supported: %s", strerror(-ret)); > >return ret; > >} > > -- > > --- > > Yes, but if you look at the code which calls that, it goes on to do: > env->mcg_cap &= mcg_cap | MCG_CAP_BANKS_MASK; > > which means that if the host kernel does not support this feature then we > will clear those bits in the env->mcg_cap field, so we do not > advertise it to the guest. But we might be not advertising it to the guest at > all, if env->mcg_cap was 0 before this code was called. That > happens if we are presenting the guest with a guest CPU type which does not > have the feature. So you mean "env->mcg_cap" uses a bit to indicate that whether guest CPU support error recovery, right? If so, how we know whether guest CPU have this feature? Should we initialized MCG_SER_P bit of env->mcg_cap to 1 or 0 before initializing the vcpu? Please together see the reply in the [2], thanks. > > -- > > void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr) { > > ... > > > > if (ram_addr != RAM_ADDR_INVALID && > > kvm_physical_memory_addr_from_host(c->kvm_state, addr, > > )) { > > > > If it got to here, it means the host hardware detects a RAS error in > > memory used by the guest using above two judgments. > > Maybe we can test/check whether KVM supports software error recovery > > in [3] > > > > The question is not "does the host CPU / KVM support error reporting". It is > "does the *guest* CPU / system support error reporting". > These are distinct questions which may not have the same answer. [2]: But QEMU should not know whether *guest* CPU / system support error reporting. From my understanding, in the X86 platform, if host CPU / KVM support error reporting, then it will think guest CPU supports error reporting, and set the MCG_SER_P bit of env->mcg_cap to 1 > > > thanks > -- PMM
Re: [Qemu-devel] [Qemu-arm] [PATCH RESEND v15 10/10] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
On Fri, 23 Nov 2018 at 14:31, gengdongjiu wrote: > > Hi Peter, > Thanks for the comments and mail. > > > > > On 22 November 2018 at 10:28, Peter Maydell > > wrote: > > > On 22 November 2018 at 03:05, gengdongjiu wrote: > > >>> > > > >>> Shouldn't there be something in here to say "only report this error to > > >>> the guest if we are actually reporting RAS errors to the guest" ? > > >> > > >> Yes, We can say something that such as "report this error to the guest", > > >> because this error is indeed triggered by guest, which is guest > > error. > > > > > > I'm afraid I don't really understand what you mean. Could you try > > > rephrasing it? > > > > > > My understanding was: > > > * we get this signal if there is a RAS error in the host memory > > > * if we are exposing RAS errors to the guest (ie we have > > >told it that in the ACPI table we passed it at startup) > > >then we should pass on this error to the guest > > > > > > but that these are two different conditions. > > > > > > If the host hardware detects a RAS error in memory used by the guest > > > but the guest is not being told about RAS errors, then we cannot > > > report the error: we have no mechanism to do so, and the guest is not > > > expecting it. > > > > If you look at the x86 version of this function you can see that it tests > > (env->mcg_cap & MCG_SER_P), which I think is the equivalent x86 "is > > the guest CPU/config one we can report these errors to" test. > > MCG_SER_P (software error recovery support present) flag indicates (when set) > that the processor supports software error recovery. > env->mcg_cap 's value should be got from KVM as shown in the QEMU code[1], it > indicates whether the KVM support software error recovery. > > [1]: > - > ret = kvm_get_mce_cap_supported(cs->kvm_state, _cap, ); > if (ret < 0) { >fprintf(stderr, "kvm_get_mce_cap_supported: %s", strerror(-ret)); >return ret; >} > - Yes, but if you look at the code which calls that, it goes on to do: env->mcg_cap &= mcg_cap | MCG_CAP_BANKS_MASK; which means that if the host kernel does not support this feature then we will clear those bits in the env->mcg_cap field, so we do not advertise it to the guest. But we might be not advertising it to the guest at all, if env->mcg_cap was 0 before this code was called. That happens if we are presenting the guest with a guest CPU type which does not have the feature. -- > void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr) > { > ... > > if (ram_addr != RAM_ADDR_INVALID && > kvm_physical_memory_addr_from_host(c->kvm_state, addr, )) { > > If it got to here, it means the host hardware detects a RAS error in memory > used by the guest using above two judgments. > Maybe we can test/check whether KVM supports software error recovery in [3] > The question is not "does the host CPU / KVM support error reporting". It is "does the *guest* CPU / system support error reporting". These are distinct questions which may not have the same answer. thanks -- PMM