Re: [PATCH] arm/arm64: KVM: send SIGBUS error to qemu

2017-03-24 Thread gengdongjiu
Hi James,
   thanks for your review.

On 2017/3/23 23:06, James Morse wrote:
> Hi Dongjiu Geng,
> 
> On 23/03/17 13:01, Dongjiu Geng wrote:
>> when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
>> SIGBUS signal from KVM's fault-handling code to qemu, qemu
>> can handle this signal according to the fault address.
> 
> I'm afraid I beat you to it on this one:
> https://www.spinics.net/lists/arm-kernel/msg568919.html
> 
> (Are you the same gengdj who ask me to post that patch?:
>  https://lkml.org/lkml/2017/3/5/187 )

Oh, yes, it is me. recently I do not check my gmail and think you are not reply 
mail.
it is great that you upstream this patch.

> 
> We don't need upstream KVM to do this until either arm or arm64 has
> ARCH_SUPPORTS_MEMORY_FAILURE. Punit and Tyler have discovered problems with 
> the
> way arm64's hugepage and hwpoison interact:
> https://www.spinics.net/lists/arm-kernel/msg568995.html
  ok, thanks James. do you know when the arm or arm64 will have 
ARCH_SUPPORTS_MEMORY_FAILURE?
> 
> 
> Some comments on the differences:
> 
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 962616fd4ddd..1307ec400de3 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu 
>> *vcpu, kvm_pfn_t pfn,
>>  __coherent_cache_guest_page(vcpu, pfn, size);
>>  }
>>  
>> +static void kvm_send_hwpoison_signal(unsigned long address,
>> +struct task_struct *tsk)
>> +{
>> +siginfo_t info;
>> +
>> +info.si_signo   = SIGBUS;
>> +info.si_errno   = 0;
>> +info.si_code= BUS_MCEERR_AR;
>> +info.si_addr= (void __user *)address;
>> +info.si_addr_lsb = PAGE_SHIFT;
> 
> Any version of this patch should handle hugepage for the sizes KVM uses in its
> stage2 mappings. By just passing PAGE_SHIFT you let the guest fault for each
> page that makes up the hugepage.
> 
> 
>> +
>> +send_sig_info(SIGBUS, , tsk);
>> +}
>> +
>>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>struct kvm_memory_slot *memslot, unsigned long hva,
>>unsigned long fault_status)
>> @@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  if (is_error_noslot_pfn(pfn))
>>  return -EFAULT;
>>  
>> +if (is_error_hwpoison_pfn(pfn)) {
>> +kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
>> +current);
>> +return -EFAULT;
> 
> This will return -EFAULT from the KVM_RUN ioctl(). Is Qemu expected to know it
> should try again? This is indistinguishable from the is_error_noslot_pfn() 
> error
> above.
> 
> x86 returns 0 from this path, kvm_handle_bad_page() in arch/x86/kvm/mmu.c as 
> the
> SIGBUS should arrive first. If the SIGBUS is handled the error has been 
> resolved
> and Qemu can call KVM_RUN again. Returning an error and sending SIGBUS 
> suggests
> there are two problems.
  thanks for that. I think your Statement is reasonable.
> 
> 
>> +}
>> +
>>  if (kvm_is_device_pfn(pfn)) {
>>  mem_type = PAGE_S2_DEVICE;
>>  flags |= KVM_S2PTE_FLAG_IS_IOMAP;
> 
> 
> 
> Thanks,
> 
> James
> 
> 
> .
> 



Re: [PATCH] arm/arm64: KVM: send SIGBUS error to qemu

2017-03-24 Thread gengdongjiu
Hi James,
   thanks for your review.

On 2017/3/23 23:06, James Morse wrote:
> Hi Dongjiu Geng,
> 
> On 23/03/17 13:01, Dongjiu Geng wrote:
>> when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
>> SIGBUS signal from KVM's fault-handling code to qemu, qemu
>> can handle this signal according to the fault address.
> 
> I'm afraid I beat you to it on this one:
> https://www.spinics.net/lists/arm-kernel/msg568919.html
> 
> (Are you the same gengdj who ask me to post that patch?:
>  https://lkml.org/lkml/2017/3/5/187 )

Oh, yes, it is me. recently I do not check my gmail and think you are not reply 
mail.
it is great that you upstream this patch.

> 
> We don't need upstream KVM to do this until either arm or arm64 has
> ARCH_SUPPORTS_MEMORY_FAILURE. Punit and Tyler have discovered problems with 
> the
> way arm64's hugepage and hwpoison interact:
> https://www.spinics.net/lists/arm-kernel/msg568995.html
  ok, thanks James. do you know when the arm or arm64 will have 
ARCH_SUPPORTS_MEMORY_FAILURE?
> 
> 
> Some comments on the differences:
> 
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 962616fd4ddd..1307ec400de3 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu 
>> *vcpu, kvm_pfn_t pfn,
>>  __coherent_cache_guest_page(vcpu, pfn, size);
>>  }
>>  
>> +static void kvm_send_hwpoison_signal(unsigned long address,
>> +struct task_struct *tsk)
>> +{
>> +siginfo_t info;
>> +
>> +info.si_signo   = SIGBUS;
>> +info.si_errno   = 0;
>> +info.si_code= BUS_MCEERR_AR;
>> +info.si_addr= (void __user *)address;
>> +info.si_addr_lsb = PAGE_SHIFT;
> 
> Any version of this patch should handle hugepage for the sizes KVM uses in its
> stage2 mappings. By just passing PAGE_SHIFT you let the guest fault for each
> page that makes up the hugepage.
> 
> 
>> +
>> +send_sig_info(SIGBUS, , tsk);
>> +}
>> +
>>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>struct kvm_memory_slot *memslot, unsigned long hva,
>>unsigned long fault_status)
>> @@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  if (is_error_noslot_pfn(pfn))
>>  return -EFAULT;
>>  
>> +if (is_error_hwpoison_pfn(pfn)) {
>> +kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
>> +current);
>> +return -EFAULT;
> 
> This will return -EFAULT from the KVM_RUN ioctl(). Is Qemu expected to know it
> should try again? This is indistinguishable from the is_error_noslot_pfn() 
> error
> above.
> 
> x86 returns 0 from this path, kvm_handle_bad_page() in arch/x86/kvm/mmu.c as 
> the
> SIGBUS should arrive first. If the SIGBUS is handled the error has been 
> resolved
> and Qemu can call KVM_RUN again. Returning an error and sending SIGBUS 
> suggests
> there are two problems.
  thanks for that. I think your Statement is reasonable.
> 
> 
>> +}
>> +
>>  if (kvm_is_device_pfn(pfn)) {
>>  mem_type = PAGE_S2_DEVICE;
>>  flags |= KVM_S2PTE_FLAG_IS_IOMAP;
> 
> 
> 
> Thanks,
> 
> James
> 
> 
> .
> 



Re: [PATCH] arm/arm64: KVM: send SIGBUS error to qemu

2017-03-23 Thread James Morse
Hi Dongjiu Geng,

On 23/03/17 13:01, Dongjiu Geng wrote:
> when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
> SIGBUS signal from KVM's fault-handling code to qemu, qemu
> can handle this signal according to the fault address.

I'm afraid I beat you to it on this one:
https://www.spinics.net/lists/arm-kernel/msg568919.html

(Are you the same gengdj who ask me to post that patch?:
 https://lkml.org/lkml/2017/3/5/187 )

We don't need upstream KVM to do this until either arm or arm64 has
ARCH_SUPPORTS_MEMORY_FAILURE. Punit and Tyler have discovered problems with the
way arm64's hugepage and hwpoison interact:
https://www.spinics.net/lists/arm-kernel/msg568995.html


Some comments on the differences:

> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1307ec400de3 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu 
> *vcpu, kvm_pfn_t pfn,
>   __coherent_cache_guest_page(vcpu, pfn, size);
>  }
>  
> +static void kvm_send_hwpoison_signal(unsigned long address,
> + struct task_struct *tsk)
> +{
> + siginfo_t info;
> +
> + info.si_signo   = SIGBUS;
> + info.si_errno   = 0;
> + info.si_code= BUS_MCEERR_AR;
> + info.si_addr= (void __user *)address;
> + info.si_addr_lsb = PAGE_SHIFT;

Any version of this patch should handle hugepage for the sizes KVM uses in its
stage2 mappings. By just passing PAGE_SHIFT you let the guest fault for each
page that makes up the hugepage.


> +
> + send_sig_info(SIGBUS, , tsk);
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> struct kvm_memory_slot *memslot, unsigned long hva,
> unsigned long fault_status)
> @@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   if (is_error_noslot_pfn(pfn))
>   return -EFAULT;
>  
> + if (is_error_hwpoison_pfn(pfn)) {
> + kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> + current);
> + return -EFAULT;

This will return -EFAULT from the KVM_RUN ioctl(). Is Qemu expected to know it
should try again? This is indistinguishable from the is_error_noslot_pfn() error
above.

x86 returns 0 from this path, kvm_handle_bad_page() in arch/x86/kvm/mmu.c as the
SIGBUS should arrive first. If the SIGBUS is handled the error has been resolved
and Qemu can call KVM_RUN again. Returning an error and sending SIGBUS suggests
there are two problems.


> + }
> +
>   if (kvm_is_device_pfn(pfn)) {
>   mem_type = PAGE_S2_DEVICE;
>   flags |= KVM_S2PTE_FLAG_IS_IOMAP;



Thanks,

James



Re: [PATCH] arm/arm64: KVM: send SIGBUS error to qemu

2017-03-23 Thread James Morse
Hi Dongjiu Geng,

On 23/03/17 13:01, Dongjiu Geng wrote:
> when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
> SIGBUS signal from KVM's fault-handling code to qemu, qemu
> can handle this signal according to the fault address.

I'm afraid I beat you to it on this one:
https://www.spinics.net/lists/arm-kernel/msg568919.html

(Are you the same gengdj who ask me to post that patch?:
 https://lkml.org/lkml/2017/3/5/187 )

We don't need upstream KVM to do this until either arm or arm64 has
ARCH_SUPPORTS_MEMORY_FAILURE. Punit and Tyler have discovered problems with the
way arm64's hugepage and hwpoison interact:
https://www.spinics.net/lists/arm-kernel/msg568995.html


Some comments on the differences:

> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1307ec400de3 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu 
> *vcpu, kvm_pfn_t pfn,
>   __coherent_cache_guest_page(vcpu, pfn, size);
>  }
>  
> +static void kvm_send_hwpoison_signal(unsigned long address,
> + struct task_struct *tsk)
> +{
> + siginfo_t info;
> +
> + info.si_signo   = SIGBUS;
> + info.si_errno   = 0;
> + info.si_code= BUS_MCEERR_AR;
> + info.si_addr= (void __user *)address;
> + info.si_addr_lsb = PAGE_SHIFT;

Any version of this patch should handle hugepage for the sizes KVM uses in its
stage2 mappings. By just passing PAGE_SHIFT you let the guest fault for each
page that makes up the hugepage.


> +
> + send_sig_info(SIGBUS, , tsk);
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> struct kvm_memory_slot *memslot, unsigned long hva,
> unsigned long fault_status)
> @@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   if (is_error_noslot_pfn(pfn))
>   return -EFAULT;
>  
> + if (is_error_hwpoison_pfn(pfn)) {
> + kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> + current);
> + return -EFAULT;

This will return -EFAULT from the KVM_RUN ioctl(). Is Qemu expected to know it
should try again? This is indistinguishable from the is_error_noslot_pfn() error
above.

x86 returns 0 from this path, kvm_handle_bad_page() in arch/x86/kvm/mmu.c as the
SIGBUS should arrive first. If the SIGBUS is handled the error has been resolved
and Qemu can call KVM_RUN again. Returning an error and sending SIGBUS suggests
there are two problems.


> + }
> +
>   if (kvm_is_device_pfn(pfn)) {
>   mem_type = PAGE_S2_DEVICE;
>   flags |= KVM_S2PTE_FLAG_IS_IOMAP;



Thanks,

James



[PATCH] arm/arm64: KVM: send SIGBUS error to qemu

2017-03-23 Thread Dongjiu Geng
when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
SIGBUS signal from KVM's fault-handling code to qemu, qemu
can handle this signal according to the fault address.

Signed-off-by: Dongjiu Geng 
---
 arch/arm/kvm/mmu.c   | 20 
 include/linux/kvm_host.h |  5 +
 2 files changed, 25 insertions(+)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 962616fd4ddd..1307ec400de3 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu 
*vcpu, kvm_pfn_t pfn,
__coherent_cache_guest_page(vcpu, pfn, size);
 }
 
+static void kvm_send_hwpoison_signal(unsigned long address,
+   struct task_struct *tsk)
+{
+   siginfo_t info;
+
+   info.si_signo   = SIGBUS;
+   info.si_errno   = 0;
+   info.si_code= BUS_MCEERR_AR;
+   info.si_addr= (void __user *)address;
+   info.si_addr_lsb = PAGE_SHIFT;
+
+   send_sig_info(SIGBUS, , tsk);
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
  struct kvm_memory_slot *memslot, unsigned long hva,
  unsigned long fault_status)
@@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (is_error_noslot_pfn(pfn))
return -EFAULT;
 
+   if (is_error_hwpoison_pfn(pfn)) {
+   kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
+   current);
+   return -EFAULT;
+   }
+
if (kvm_is_device_pfn(pfn)) {
mem_type = PAGE_S2_DEVICE;
flags |= KVM_S2PTE_FLAG_IS_IOMAP;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2c14ad9809da..610ded9ebe9b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -91,6 +91,11 @@ static inline bool is_noslot_pfn(kvm_pfn_t pfn)
 {
return pfn == KVM_PFN_NOSLOT;
 }
+/* hwpoison pfn indicates that it needs to send SIGBUS */
+static inline bool is_error_hwpoison_pfn(kvm_pfn_t pfn)
+{
+   return pfn == KVM_PFN_ERR_HWPOISON;
+}
 
 /*
  * architectures with KVM_HVA_ERR_BAD other than PAGE_OFFSET (e.g. s390)
-- 
2.11.0



[PATCH] arm/arm64: KVM: send SIGBUS error to qemu

2017-03-23 Thread Dongjiu Geng
when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
SIGBUS signal from KVM's fault-handling code to qemu, qemu
can handle this signal according to the fault address.

Signed-off-by: Dongjiu Geng 
---
 arch/arm/kvm/mmu.c   | 20 
 include/linux/kvm_host.h |  5 +
 2 files changed, 25 insertions(+)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 962616fd4ddd..1307ec400de3 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu 
*vcpu, kvm_pfn_t pfn,
__coherent_cache_guest_page(vcpu, pfn, size);
 }
 
+static void kvm_send_hwpoison_signal(unsigned long address,
+   struct task_struct *tsk)
+{
+   siginfo_t info;
+
+   info.si_signo   = SIGBUS;
+   info.si_errno   = 0;
+   info.si_code= BUS_MCEERR_AR;
+   info.si_addr= (void __user *)address;
+   info.si_addr_lsb = PAGE_SHIFT;
+
+   send_sig_info(SIGBUS, , tsk);
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
  struct kvm_memory_slot *memslot, unsigned long hva,
  unsigned long fault_status)
@@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (is_error_noslot_pfn(pfn))
return -EFAULT;
 
+   if (is_error_hwpoison_pfn(pfn)) {
+   kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
+   current);
+   return -EFAULT;
+   }
+
if (kvm_is_device_pfn(pfn)) {
mem_type = PAGE_S2_DEVICE;
flags |= KVM_S2PTE_FLAG_IS_IOMAP;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2c14ad9809da..610ded9ebe9b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -91,6 +91,11 @@ static inline bool is_noslot_pfn(kvm_pfn_t pfn)
 {
return pfn == KVM_PFN_NOSLOT;
 }
+/* hwpoison pfn indicates that it needs to send SIGBUS */
+static inline bool is_error_hwpoison_pfn(kvm_pfn_t pfn)
+{
+   return pfn == KVM_PFN_ERR_HWPOISON;
+}
 
 /*
  * architectures with KVM_HVA_ERR_BAD other than PAGE_OFFSET (e.g. s390)
-- 
2.11.0