Re: [PATCH v3 2/2] arm64: Add software workaround for Falkor erratum 1041

2017-11-15 Thread Timur Tabi
On Tue, Nov 14, 2017 at 7:05 PM, Stephen Boyd  wrote:

> This also applies to Kryo CPUs. I have a patch[1] for the 1003
> Falkor errata that adds the Kryo MIDR check which can also be
> used for this errata.
>
> [1] https://patchwork.kernel.org/patch/10048987/

Please submit a follow-up patch for E1041 after our patch is merged.
We don't want to submit a patch for your SOCs, even if it's the same
erratum and work-around.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v4 00/21] SError rework + RAS for firmware first support

2017-11-15 Thread James Morse
Hi gengdongjiu,

On 15/11/17 09:15, gengdongjiu wrote:
> On 2017/11/15 0:03, James Morse wrote:
>>> Hope this helps?
>> Yes, I'll go looking for a way to expose VSESR_EL2 to user-space.
> 
> what is the purpose to expose VSESR_EL2?
> do you mean set its value after migration?

Yes. Ideally Qemu would know the value it supplied last, and we just need to
tell it if 'the' inject SError has been delivered. But kvm_inject_vabt() makes
this impossible as Qemu can't know whose injected error this is.


> May be we can use similar below Mechanism
> https://www.spinics.net/lists/arm-kernel/msg603525.html

> when user-space sync the register status, it will get these register value.
> it will reuse the IOCTL KVM_GET_ONE_REG and no need to add extra API.

The maintainer NAKed "any patch that will expose _EL2 registers outside of
nested virtualization": https://patchwork.kernel.org/patch/9886019/

Why? If we 'spend' VSESR_EL2's name and encoding on 'the register we will give
to the guest when it can next take an SError', we will need a new name (and
encoding!) for systems with nested virtualization as now the guest has an
VSESR_EL2 too. The sys_reg/get_one_reg stuff is for guest registers. This thing
is part of the hypervisor's state.

Exposing VSESR_EL2 directly wouldn't be enough: A value of all-zeroes doesn't
tell us if an SError is pending, we need HCR_EL2.VSE too.


Your 'give me register' is a very raw interface, it makes it difficult to change
in the future: What if we get a new way to inject SError? We may not be able to
use it if user-space is poking CPU registers directly.
What happens if all those RES0 bits (and there are a lot of them) mean something
on future CPUs? Should we expose them? Should user-space be allowed to set them?
What if we need an errata workaround, based on something user-space can't know?

What about 32bit? The register names and sizes are different. User-space would
need a separate implementation to drive this. This is easier for the kernel to 
do.

We should have an API specific to the feature we are offering user-space. We are
offering a way to trigger an SError, with a specified ESR if the system supports
that. To be migrated it needs to be able to read this information back.

This way we can change the implementation without changing the API.



Thanks,

James
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 33/37] KVM: arm/arm64: Move arm64-only vgic-v2-sr.c file to arm64

2017-11-15 Thread Andre Przywara
Hi,

On 12/10/17 11:41, Christoffer Dall wrote:
> The vgic-v2-sr.c file now only contains the logic to replay unaligned
> accesses to the virtual CPU interface on 16K and 64K page systems, which
> is only relevant on 64-bit platforms.  Therefore move this file to the
> arm64 KVM tree, remove the compile directive from the 32-bit side
> makefile, and remove the ifdef in the C file.

I like that one!

> Signed-off-by: Christoffer Dall 

Reviewed-by: Andre Przywara 

Cheers,
Andre.

> ---
>  arch/arm/kvm/hyp/Makefile | 1 -
>  arch/arm64/kvm/hyp/Makefile   | 2 +-
>  {virt/kvm/arm => arch/arm64/kvm}/hyp/vgic-v2-sr.c | 2 --
>  3 files changed, 1 insertion(+), 4 deletions(-)
>  rename {virt/kvm/arm => arch/arm64/kvm}/hyp/vgic-v2-sr.c (98%)
> 
> diff --git a/arch/arm/kvm/hyp/Makefile b/arch/arm/kvm/hyp/Makefile
> index 8679405..4be5061 100644
> --- a/arch/arm/kvm/hyp/Makefile
> +++ b/arch/arm/kvm/hyp/Makefile
> @@ -6,7 +6,6 @@ ccflags-y += -fno-stack-protector
>  
>  KVM=../../../../virt/kvm
>  
> -obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
>  
> diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
> index 14c4e3b..b746fa5 100644
> --- a/arch/arm64/kvm/hyp/Makefile
> +++ b/arch/arm64/kvm/hyp/Makefile
> @@ -6,10 +6,10 @@ ccflags-y += -fno-stack-protector
>  
>  KVM=../../../../virt/kvm
>  
> -obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
>  
> +obj-$(CONFIG_KVM_ARM_HOST) += vgic-v2-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += sysreg-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += debug-sr.o
>  obj-$(CONFIG_KVM_ARM_HOST) += entry.o
> diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> similarity index 98%
> rename from virt/kvm/arm/hyp/vgic-v2-sr.c
> rename to arch/arm64/kvm/hyp/vgic-v2-sr.c
> index b433257..fcd7b4e 100644
> --- a/virt/kvm/arm/hyp/vgic-v2-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v2-sr.c
> @@ -22,7 +22,6 @@
>  #include 
>  #include 
>  
> -#ifdef CONFIG_ARM64
>  /*
>   * __vgic_v2_perform_cpuif_access -- perform a GICV access on behalf of the
>   *guest.
> @@ -76,4 +75,3 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct 
> kvm_vcpu *vcpu)
>  
>   return 1;
>  }
> -#endif
> 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [v3,2/2] arm64: Add software workaround for Falkor erratum 1041

2017-11-15 Thread Manoj Iyer

On Sun, 12 Nov 2017, Shanker Donthineni wrote:


The ARM architecture defines the memory locations that are permitted
to be accessed as the result of a speculative instruction fetch from
an exception level for which all stages of translation are disabled.
Specifically, the core is permitted to speculatively fetch from the
4KB region containing the current program counter 4K and next 4K.

When translation is changed from enabled to disabled for the running
exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
Falkor core may errantly speculatively access memory locations outside
of the 4KB region permitted by the architecture. The errant memory
access may lead to one of the following unexpected behaviors.

1) A System Error Interrupt (SEI) being raised by the Falkor core due
  to the errant memory access attempting to access a region of memory
  that is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
  memory. This behavior may only occur if the instruction cache is
  disabled prior to or coincident with translation being changed from
  enabled to disabled.

The conditions leading to this erratum will not occur when either of the
following occur:
1) A higher exception level disables translation of a lower exception level
  (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
2) An exception level disabling its stage-1 translation if its stage-2
   translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
   to 0 when HCR_EL2[VM] has a value of 1).

To avoid the errant behavior, software must execute an ISB immediately
prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.

Signed-off-by: Shanker Donthineni 
---
Changes since v1:
 Apply the workaround where it's required.
Changes since v2:
 Repost the corrected patches.

Documentation/arm64/silicon-errata.txt |  1 +
arch/arm64/Kconfig | 10 ++
arch/arm64/include/asm/assembler.h | 19 +++
arch/arm64/include/asm/cpucaps.h   |  3 ++-
arch/arm64/kernel/cpu-reset.S  |  1 +
arch/arm64/kernel/cpu_errata.c | 16 
arch/arm64/kernel/efi-entry.S  |  2 ++
arch/arm64/kernel/head.S   |  1 +
arch/arm64/kernel/relocate_kernel.S|  1 +
arch/arm64/kvm/hyp-init.S  |  1 +
10 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm64/silicon-errata.txt 
b/Documentation/arm64/silicon-errata.txt
index 66e8ce1..704770c0 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -74,3 +74,4 @@ stable kernels.
| Qualcomm Tech. | Falkor v1   | E1003   | QCOM_FALKOR_ERRATUM_1003 
   |
| Qualcomm Tech. | Falkor v1   | E1009   | QCOM_FALKOR_ERRATUM_1009 
   |
| Qualcomm Tech. | QDF2400 ITS | E0065   | 
QCOM_QDF2400_ERRATUM_0065   |
+| Qualcomm Tech. | Falkor v{1,2}   | E1041   | 
QCOM_FALKOR_ERRATUM_1041|
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6..8f73eac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -539,6 +539,16 @@ config QCOM_QDF2400_ERRATUM_0065

  If unsure, say Y.

+config QCOM_FALKOR_ERRATUM_E1041
+   bool "Falkor E1041: Speculative instruction fetches might cause errant 
memory access"
+   default y
+   help
+ Falkor CPU may speculatively fetch instructions from an improper
+ memory location when MMU translation is changed from SCTLR_ELn[M]=1
+ to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem.
+
+ If unsure, say Y.
+
endmenu


diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index d58a625..dd9cec5 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -30,6 +30,7 @@
#include 
#include 
#include 
+#include 

/*
 * Enable and disable interrupts.
@@ -499,4 +500,22 @@
#endif
.endm

+/**
+ * Errata workaround prior to disable MMU. Insert an ISB immediately prior
+ * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0.
+ */
+   .macro pre_disable_mmu_workaround
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041
+alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041
+   isb
+alternative_else_nop_endif
+#endif
+   .endm
+
+   .macro pre_disable_mmu_early_workaround
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041
+   isb
+#endif
+   .endm
+
#endif  /* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8da6216..7f7a59d 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
#define ARM64_WORKAROUND_858921 19
#define ARM64_WORKAROUND_CAVIUM_30115   20
#define ARM64_HAS_DCPOP 21
+#define ARM64_WORKAROUND_QCOM_FALKOR_E1041 22

-#define ARM64_NCAPS 

Re: [PATCH 08/37] KVM: arm64: Defer restoring host VFP state to vcpu_put

2017-11-15 Thread Andrew Jones
On Thu, Oct 12, 2017 at 12:41:12PM +0200, Christoffer Dall wrote:
> Avoid saving the guest VFP registers and restoring the host VFP
> registers on every exit from the VM.  Only when we're about to run
> userspace or other threads in the kernel do we really have to switch the
> state back to the host state.

Rik van Riel's recently post patch "[PATCH v2 0/2] x86,kvm: move qemu/guest
FPU switching out to kvm_arch_vcpu_ioctl_run" indicates that for x86 they
only need to swap guest and userspace VFP registers before exiting VCPU_RUN
to userspace, not for running other threads. I imagine that's the same for
ARM as well.

If so, then I think this hunk

> @@ -209,4 +212,16 @@ void kvm_vcpu_load_sysregs(struct kvm_vcpu *vcpu)
>   */
>  void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)
>  {
> + struct kvm_cpu_context *host_ctxt = vcpu->arch.host_cpu_context;
> + struct kvm_cpu_context *guest_ctxt = >arch.ctxt;
> +
> + /* Restore host FP/SIMD state */
> + if (vcpu->arch.guest_vfp_loaded) {
> + if (vcpu_el1_is_32bit(vcpu))
> + kvm_call_hyp(__fpsimd32_save_state,
> +  kern_hyp_va(guest_ctxt));
> + __fpsimd_save_state(_ctxt->gp_regs.fp_regs);
> + __fpsimd_restore_state(_ctxt->gp_regs.fp_regs);
> + vcpu->arch.guest_vfp_loaded = 0;
> + }
>  }

could be moved to the return of kvm_arch_vcpu_ioctl_run().

Thanks,
drew
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [3/3] arm64: Add software workaround for Falkor erratum 1041

2017-11-15 Thread Manoj Iyer

On Fri, 10 Nov 2017, Manoj Iyer wrote:


On Thu, 9 Nov 2017, Manoj Iyer wrote:



James,

Looks like my VM test raised a false alarm. I retested stock Artful 4.13 
kernel (No erratum 1041 patches applied).




James, an update on the crash (false alarm). We suspect this is a firmware 
crash due to a possible fw bug. Once this is addressed I will be able to send 
you the test results you requested on VM start/stop with the erratum 1041 
patches applied.




James/Shanker,

I can report that VM start/stop/restart tests worked with the patches 
applied to Ubuntu 4.13 (Artful) kernel on the qdf2400 hardware.


Host: Ubuntu 4.13 with Erratum 1041 patches applied
Guest: Stock Ubuntu 4.13 kernel

- create 20 vms one at a time

10 iteration of:
- Stop (virsh destroy) 20 VMs one at a time
- Start (virsh start) 20 VMs one at a time.

Tested-by: Manoj Iyer 




Host: Ubuntu Artful 4.13 kernel with *no* erratum 1041 patches applied.
Guest: Ubuntu Zesty (4.10) kernel.

- Created 20 VMs one at a time

In a loop:
- Stop (virsh destroy) 20 VMs one at a time
- Start (virsh start) 20 VMs one at a time.

And, I am able to reproduce the system reset issue I previously reported. I 
think the problem I reported with VMs might have nothing to do with the 
erratum 1041 patches, and probably needs to be root caused seperately.


With stock 4.13 kernel (no erratum 1041 patches applied):

awrep6 login: [  461.881379] ACPI CPPC: PCC check channel failed. Status=0
[  462.051194] ACPI CPPC: PCC check channel failed. Status=0
[  462.223137] ACPI CPPC: PCC check channel failed. Status=0
[  462.633790] ACPI CPPC: PCC check channel failed. Status=0
[  463.231971] ACPI CPPC: PCC check channel failed. Status=0
[  463.403163] ACPI CPPC: PCC check channel failed. Status=0
[  463.822936] ACPI CPPC: PCC check channel failed. Status=0
[  463.995222] ACPI CPPC: PCC check channel failed. Status=0
[  464.130962] ACPI CPPC: PCC check channel failed. Status=0
[  464.258973] ACPI CPPC: PCC check channel failed. Status=0
[  465.283028] ACPI CPPC: PCC check channel failed. Status=0


SYS_DBG: Running SDI image (immediate mode)
SYS_DBG: Ram Dump Init
SYS_DBG: Failed to init SD card
SYS_DBG: Resetting system!


On Thu, 9 Nov 2017, Manoj Iyer wrote:





On Thu, 9 Nov 2017, Manoj Iyer wrote:



James,

(sorry for top-posting)

Applied patch 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic )

- Start 20 VMs one at a time

In a loop:
- Stop (virsh destroy) 20 VMs one at a time
- Start (virsh start) 20 VMs one at a time.


Fixing some confusion I might have introduced in my prev email.

- Applied all 3 patches to Ubuntu Artful Kernel ( 4.13.0-16-generic )

- Created 20 VMs one at a time

In a loop:
- Stop (virsh destroy) 20 VMs one at a time
- Start (virsh start) 20 VMs one at a time.



The system reset's itself after starting the last VM on the 1st loop 
displaying the following:


awrep6 login: [ 603.349141] ACPI CPPC: PCC check channel failed. Status=0
[ 603.765101] ACPI CPPC: PCC check channel failed. Status=0
[ 603.937389] ACPI CPPC: PCC check channel failed. Status=0
[ 608.285495] ACPI CPPC: PCC check channel failed. Status=0
[ 608.289481] ACPI CPPC: PCC check channel failed. Status=0

SYS_DBG: Running SDI image (immediate mode)
SYS_DBG: Ram Dump Init
SYS_DBG: Failed to init SD card
SYS_DBG: Resetting system!

Followed by the following messages on system reboot:
[ 6.616891] BERT: Error records from previous boot:
[ 6.621655] [Hardware Error]: event severity: fatal
[ 6.626516] [Hardware Error]: imprecise tstamp: -00-00 00:00:00
[ 6.632851] [Hardware Error]: Error 0, type: fatal
[ 6.637713] [Hardware Error]: section type: unknown, 
d2e2621c-f936-468d-0d84-15a4ed015c8b

[ 6.646045] [Hardware Error]: section length: 0x238
[ 6.651082] [Hardware Error]: : 72724502 5220726f 6f736165 
6e55206e .Error Reason Un
[ 6.659761] [Hardware Error]: 0010: 776f6e6b 006e  
 known...
[ 6.668442] [Hardware Error]: 0020:    
 
[ 6.677122] [Hardware Error]: 0030:    
 



On Thu, 9 Nov 2017, James Morse wrote:


Hi Manoj,

On 08/11/17 19:05, Manoj Iyer wrote:

On Thu, 2 Nov 2017, Shanker Donthineni wrote:

The ARM architecture defines the memory locations that are permitted
to be accessed as the result of a speculative instruction fetch from
an exception level for which all stages of translation are disabled.
Specifically, the core is permitted to speculatively fetch from the
4KB region containing the current program counter and next 4KB.

When translation is changed from enabled to disabled for the running
exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
Falkor core may errantly speculatively access memory locations outside
of the 4KB region permitted by the architecture. The errant memory
access may lead to one of the following unexpected behaviors.



I applied the 3 patches to 

Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2017-11-15 Thread gengdongjiu
Hi James,

   Thanks a lot for the review.

On 2017/11/15 0:00, James Morse wrote:
> Hi Dongjiu Geng,
> 
> On 10/11/17 19:54, Dongjiu Geng wrote:
>> If it is not RAS SError, directly inject virtual SError,
>> which will keep the old way. If it is RAS SError, firstly
>> let host ACPI module to handle it.
> 
>> For the ACPI handling,
>> if the error address is invalid, APEI driver will not
>> identify the address to hwpoison memory and can not notify
>> guest to do the recovery.
> 
> The guest can't do any recover either. There is no recovery you can do without
> some information about what the error is.
> 
> This is your memory corruption at an unknown address? We should reboot.
> 
> (I agree memory_failure.c's::me_kernel() is ignoring kernel errors, we should
> try and fix this. It makes some sense for polled or irq notifications, but not
> SEA/SEI).
> 
> 
>> In order to safe, KVM continues
>> categorizing errors and handle it separately.
> 
>> If the RAS error is not propagated, let host user space to
>> handle it. 
> 
> No. Host user space should not know anything about the kernel or platform RAS
> support. Doing so creates an ABI link between EL3 firmware and Qemu. This is
> totally unmaintainable.

Here I have two question:
(1) If the AET(Asynchronous Error Type) is Recoverable error (UER), do you mean 
we also reboot or panic?
(2) what is the chance to set guest ESR for Qemu?  here I return a error code 
to Qemu. when Qemu get this error return,
it will specify guest ESR and inject the abort. here if KVM does not return 
error to Qemu, Qemu will do
not know when to set the guest ESR value and inject abort.


> 
> This thing needs to be portable. The kernel should handle the error, and 
> report
> any symptoms to user-space. e.g. 'this memory is gone'.
> 
> We shouldn't special case KVM.
> 
> 
>> The reason is that sometimes we can only kill the
>> guest effected application instead of panic whose guest OS.
>> Host user space specifies a valid ESR and inject virtual
>> SError, guest can just kill the current application if the
>> non-consumed error coming from guest application.
>>
>> Signed-off-by: Dongjiu Geng 
>> Signed-off-by: Quanming Wu 
> 
> The last Signed-off-by should match the person posting the patch. It's a chain
> of custody for GPL-signoff purposes, not a 'partially-written-by'. If you want
> to credit Quanming Wu you can add CC and they can Ack/Review your patch.

Ok, got it. thanks a lot for your suggestion.


> 
> 
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index 7debb74..1afdc87 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -178,6 +179,66 @@ static exit_handle_fn kvm_get_exit_handler(struct 
>> kvm_vcpu *vcpu)
>>  return arm_exit_handlers[hsr_ec];
>>  }
>>  
>> +/**
>> + * kvm_handle_guest_sei - handles SError interrupt or asynchronous aborts
>> + * @vcpu:   the VCPU pointer
>> + *
>> + * For RAS SError interrupt, firstly let host kernel handle it.
>> + * If the AET is [ESR_ELx_AET_UER], then let user space handle it,
>> + */
>> +static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> +unsigned int esr = kvm_vcpu_get_hsr(vcpu);
>> +bool impdef_syndrome =  esr & ESR_ELx_ISV;  /* aka IDS */
>> +unsigned int aet = esr & ESR_ELx_AET;
>> +
>> +/*
>> + * This is not RAS SError
>> + */
>> +if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
>> +kvm_inject_vabt(vcpu);
>> +return 1;
>> +}
> 
>> +/* The host kernel may handle this abort. */
>> +handle_guest_sei();
> 
> This has to claim the SError as a notification. If APEI claims the error, KVM
> doesn't need to do anything more. You ignore its return code.

Thanks for the pointing out.
I will check the return code, if it return success, KVM doesn't need to do 
anything more,
otherwise, continue run.

> 
> 
>> +
>> +/*
>> + * In below two conditions, it will directly inject the
>> + * virtual SError:
>> + * 1. The Syndrome is IMPLEMENTATION DEFINED
>> + * 2. It is Uncategorized SEI
>> + */
>> +if (impdef_syndrome ||
>> +((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)) {
>> +kvm_inject_vabt(vcpu);
>> +return 1;
>> +}
>> +
>> +switch (aet) {
>> +case ESR_ELx_AET_CE:/* corrected error */
>> +case ESR_ELx_AET_UEO:   /* restartable error, not yet consumed */
>> +return 1;   /* continue processing the guest exit */
> 
>> +case ESR_ELx_AET_UER:   /* The error has not been propagated */
>> +/*
>> + * Userspace only handle the guest SError Interrupt(SEI) if the
>> + * error has not been propagated
>> + */
>> +run->exit_reason = KVM_EXIT_EXCEPTION;
>> +run->ex.exception = ESR_ELx_EC_SERROR;
>> +run->ex.error_code = KVM_SEI_SEV_RECOVERABLE;
>> + 

Re: [PATCH v8 0/7] Support RAS virtualization in KVM

2017-11-15 Thread gengdongjiu
Hi James,
   Thank you very much for your comments and review.

On 2017/11/15 0:00, James Morse wrote:
> Hi Dongjiu Geng,
> 
> On 10/11/17 19:54, Dongjiu Geng wrote:
>> This series patches mainly do below things:
>>
>> 1. Trap RAS ERR* registers Accesses to EL2 from Non-secure EL1,
>>KVM will will do a minimum simulation, there registers are simulated
>>to RAZ/WI in KVM.
>> 2. Route synchronous External Abort exceptions from Non-secure EL0
>>and EL1 to EL2. When exception EL3 routing is enabled by firmware,
>>system will trap to EL3 firmware instead of EL2 KVM, then firmware
>>judges whether El2 routing is enabled, if enabled, jump to EL2 KVM, 
>>otherwise jump to EL1 host kernel.
>> 3. Enable APEI ARv8 SEI notification to parse the CPER records for SError
>>in the ACPI GHES driver, KVM will call handle_guest_sei() to let ACPI
>>driver to parse the CPER record for SError which happened in the guest
>> 4. Although we can use APEI driver to handle the guest SError, but not all
>>system support SEI notification, such as kernel-first. So here KVM will
>>also classify the Error through Exception Syndrome Register and do 
>> different
>>approaches according to Asynchronous Error Type
> 
>> 5. If the guest SError error is not propagated and not consumed, then KVM 
>> return
>>recoverable error status to user-space, user-space will specify the guest 
>> ESR
> 
> I thought we'd gone over this. There should be no RAS errors/notifications in
> user space. Only the symptoms should be sent, using the SIGBUS_MCEERR_A{O,R} 
> if
> the kernel has handled as much as it can. This hides the actual mechanisms the
> kernel and firmware used.

Yes, I understand it.
For guest SError, if it is not  propagated and not consumed by PE, and the 
error address recorded by firmware is not accurate,
what is your suggestion about this scenario ?

I check again the comments in [0](as shown below), you ever suggest system 
panic.

-
"I think in this scenario your firmware should describe a memory-error with an
unknown address. (i.e. don't set the 'physical address valid' bit in CPER's
'Table 275 Memory Error Record'). When Linux gets one of these, it should
panic(): We know some memory is corrupt, we don't know where it is"


but I think it is not better, you ever have below concern in [0]
"The fault may be in the page tables belonging to the guest kernel,
even worse they may belong to they hypervisor's stage2 page tables"

If it is in the page tables, killing the APP, the memory will be free. if there 
is another application
will use this error address again, trigger another SError?


you know the error still not consumed by PE , so we can isolated it by killing 
it.
lets discuses the host EL0, if host El0 APP happen SError and error not 
consumed by the PE.
do you mean we also panic host OS?


> 
> User-space should not have to know how to handle RAS errors directly. This is 
> a
> service the operating system provides for it. This abstraction means the smae
> user-space code is portable between x86, arm64, powerpc etc.
> 
> What if the firmware uses another notification method? User space should 
> expect
> the kernel to hide things like this from it.
> 
> If the kernel has no information to interpret a notification, how is user 
> space
> supposed to know?
> 
> I understand you are trying to work around your 'memory corruption at an 
> unknown
> address'[0] problem, but if the kernel can't know where this corrupt memory is
> it should really reboot. What stops this corrupt data being swapped to disk?
> 
> Killing 'the thing' that was running at the time is not sufficient because we
> don't know that this 'got' all the users of the corrupt memory. KSM can merge

I think if we better using the ESB to isolate the error between EL0 and EL1, 
isolate the error between different guest.
then the error will be isolate to El0 application if it happen in El0. When KSM 
running, the ESB can synchronize
the error out instead of spread the error to other guests.



> pages between guests. This is the difference between the error persisting
> forever killing off all the VMs one by one, and the corrupt page being 
> silently
> re-read from disk clearing the error.
> 
> 
>>and inject a virtual SError. For other Asynchronous Error Type, KVM 
>> directly
>>injects virtual SError with IMPLEMENTATION DEFINED ESR or KVM is panic if 
>> the
>>error is fatal. In the RAS extension, guest virtual ESR must be set, 
>> because
>>all-zero  means 'RAS error: Uncategorized' instead of 'no valid ISS', so 
>> set
>>this ESR to IMPLEMENTATION DEFINED by default if user space does not 
>> specify it.
> 
> 
> Thanks,
> 
> James
> 
> 
> [0] https://www.spinics.net/lists/arm-kernel/msg605345.html
> 
> .
> 

___
kvmarm mailing list

Re: [PATCH v4 00/21] SError rework + RAS for firmware first support

2017-11-15 Thread gengdongjiu
> 
> (While VSESR_EL2 is 64bit[0], the value gets written into the ESR, which is
> 32bit, so I doubt the top 32bits can be used, currently they are all 
> reserved.)

In fact the valid bits for vsesr_el2 is 25bit, which will set to ESR.ISS, bits 
[24:0].
ESR.IL and ESR.EC are not set by vsesr_el2.

> 
> I'll go dig into how x86 uses this...
> 
> 
> Thanks!
> 
> James
> 
> 
> [0]
> https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf
> 
> .
> 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 24/37] KVM: arm64: Prepare to handle traps on deferred EL0 sysregs

2017-11-15 Thread Julien Thierry



On 12/10/17 11:41, Christoffer Dall wrote:

We can trap access to ACTLR_EL1 which we can later defer to only
save/restore during vcpu_load and vcpu_put, so let's read the value
directly from the CPU when necessary.

Signed-off-by: Christoffer Dall 
---
  arch/arm64/kvm/sys_regs_generic_v8.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/sys_regs_generic_v8.c 
b/arch/arm64/kvm/sys_regs_generic_v8.c
index 969ade1..348f227 100644
--- a/arch/arm64/kvm/sys_regs_generic_v8.c
+++ b/arch/arm64/kvm/sys_regs_generic_v8.c
@@ -38,7 +38,10 @@ static bool access_actlr(struct kvm_vcpu *vcpu,
if (p->is_write)
return ignore_write(vcpu, p);
  
-	p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1);

+   if (vcpu->arch.sysregs_loaded_on_cpu)
+   read_sysreg(actlr_el1);


Did you mean "p->regval = read_sysreg(actlr_el1);" ?


+   else
+   p->regval = vcpu_sys_reg(vcpu, ACTLR_EL1);
return true;
  }
  



--
Julien Thierry
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v4 00/21] SError rework + RAS for firmware first support

2017-11-15 Thread gengdongjiu
Hi james,

On 2017/11/15 0:03, James Morse wrote:
>> Hope this helps?
> Yes, I'll go looking for a way to expose VSESR_EL2 to user-space.

what is the purpose to expose VSESR_EL2?
do you mean set its value after migration?

May be we can use similar below Mechanism
https://www.spinics.net/lists/arm-kernel/msg603525.html

when user-space sync the register status, it will get these register value.
it will reuse the IOCTL KVM_GET_ONE_REG and no need to add extra API.

> 
> 
> Thanks!
> 
> James

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm