Re: [PATCH v4 05/15] KVM: arm64: Export kvm_handle_user_mem_abort()

2022-01-12 Thread Gavin Shan

Hi Eric,

On 11/11/21 2:02 AM, Eric Auger wrote:

On 8/15/21 2:59 AM, Gavin Shan wrote:

The main work of a stage-2 page fault is handled by user_mem_abort().
When asynchronous page fault is supported, one page fault needs to
be handled with two calls to this function, meaning the page fault
needs to be replayed asynchronously in that case.

* This renames the function to kvm_handle_user_mem_abort() and
  exports it.

* Add arguments @esr and @prefault to user_mem_abort(). @esr is
  the cached value of ESR_EL2, used instead of fetching it from the
  current vCPU when the page fault is replayed in the asynchronous
  page fault scenario. @prefault indicates whether the page fault is
  a replayed one or not.

Also explain that the fault_status arg is not needed anymore since it is
derived from @esr, because otherwise, at first sight, a distracted reviewer
like me may get the impression that you replaced fault_status with prefault,
while the two are totally unrelated.


Yep, good point. Will do in next respin.



* Define helper functions esr_dabt_*() in asm/esr.h to extract
  or check various fields of the passed ESR_EL2 value, because
  the helper functions defined in asm/kvm_emulate.h assume
  the ESR_EL2 value has been cached in the vCPU struct. That
  isn't true when handling a replayed page fault in the
  asynchronous page fault scenario.

I would introduce a separate preliminary patch with those esr macros and
the changes to the call sites + changes below.


Ok. I will split this patch into two.
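
To illustrate the point above about raw ESR values, here is a rough sketch of
how the new esr_dabt_*() macros can be used on a saved ESR_EL2 value (the
function and its caller are hypothetical, not taken from this patch):

	/*
	 * Sketch only: decode a data abort from an ESR_EL2 value captured when
	 * the fault was first taken, instead of from the vCPU's cached ESR.
	 */
	static bool replayed_dabt_is_write(u32 esr)
	{
		/* Faulting stage-1 page table walks are treated as writes. */
		if (esr_dabt_is_s1ptw(esr))
			return true;

		return esr_dabt_is_wnr(esr);
	}

This is essentially what the moved is_write_fault() helper needs to do, except
that it works on a saved ESR value rather than the one cached in the vCPU.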



* Some helper functions defined in asm/kvm_emulate.h are used
  by mmu.c only and are unlikely to be used by other source files
  in the near future. They are moved to mmu.c and renamed accordingly:
  is_exec_fault: kvm_vcpu_trap_is_exec_fault
  is_write_fault: kvm_is_write_fault()

Signed-off-by: Gavin Shan 
---
  arch/arm64/include/asm/esr.h |  6 
  arch/arm64/include/asm/kvm_emulate.h | 27 ++---
  arch/arm64/include/asm/kvm_host.h|  4 +++
  arch/arm64/kvm/mmu.c | 43 ++--
  4 files changed, 48 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 29f97eb3dad4..0f2cb27691de 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -321,8 +321,14 @@
 ESR_ELx_CP15_32_ISS_DIR_READ)
  
  #ifndef __ASSEMBLY__

+#include 
  #include 
  
+#define esr_dabt_fault_type(esr)	(esr & ESR_ELx_FSC_TYPE)

+#define esr_dabt_fault_level(esr)  (FIELD_GET(ESR_ELx_FSC_LEVEL, esr))
+#define esr_dabt_is_wnr(esr)   (!!(FIELD_GET(ESR_ELx_WNR, esr)))
+#define esr_dabt_is_s1ptw(esr) (!!(FIELD_GET(ESR_ELx_S1PTW, esr)))
+
  static inline bool esr_is_data_abort(u32 esr)
  {
const u32 ec = ESR_ELx_EC(esr);
diff --git a/arch/arm64/include/asm/kvm_emulate.h 
b/arch/arm64/include/asm/kvm_emulate.h
index 923b4d08ea9a..90742f4b1acd 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -285,13 +285,13 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const 
struct kvm_vcpu *vcpu)
  
  static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)

  {
-   return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
+   return esr_dabt_is_s1ptw(kvm_vcpu_get_esr(vcpu));
  }
  
  /* Always check for S1PTW *before* using this. */

  static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
  {
-   return kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR;
+   return esr_dabt_is_wnr(kvm_vcpu_get_esr(vcpu));
  }
  
  static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)

@@ -320,11 +320,6 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct 
kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
  }
  
-static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)

-{
-   return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
-}
-
  static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
  {
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -332,12 +327,7 @@ static __always_inline u8 kvm_vcpu_trap_get_fault(const 
struct kvm_vcpu *vcpu)
  
  static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)

  {
-   return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE;
-}
-
-static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu 
*vcpu)
-{
-   return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL;
+   return esr_dabt_fault_type(kvm_vcpu_get_esr(vcpu));
  }
  
  static __always_inline bool kvm_vcpu_abt_issea(const struct kvm_vcpu *vcpu)

@@ -365,17 +355,6 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct 
kvm_vcpu *vcpu)
return ESR_ELx_SYS64_ISS_RT(esr);
  }
  
-static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)

-{
-   if (kvm_vcpu_abt_iss1tw(vcpu))
-   return true;

Re: [PATCH v4 04/15] KVM: x86: Use generic async PF slot management

2022-01-12 Thread Gavin Shan

Hi Eric,

On 11/11/21 1:03 AM, Eric Auger wrote:

On 8/15/21 2:59 AM, Gavin Shan wrote:

This uses the generic slot management mechanism for asynchronous

Now that we have moved the hash table management into the generic code, use
the latter ...


Ok.


page fault by enabling CONFIG_KVM_ASYNC_PF_SLOT because the private
implementation is a complete duplicate of the generic one.

The changes introduced by this are pretty mechanical and shouldn't
cause any logical changes.

suggest: No functional change intended.


Ok. The commit log will be improved accordingly in next respin.



Signed-off-by: Gavin Shan 
---
  arch/x86/include/asm/kvm_host.h |  2 -
  arch/x86/kvm/Kconfig|  1 +
  arch/x86/kvm/mmu/mmu.c  |  2 +-
  arch/x86/kvm/x86.c  | 86 +++--
  4 files changed, 8 insertions(+), 83 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 974cbfb1eefe..409c1e7137cd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -810,7 +810,6 @@ struct kvm_vcpu_arch {
  
  	struct {

bool halted;
-   gfn_t gfns[ASYNC_PF_PER_VCPU];
struct gfn_to_hva_cache data;
u64 msr_en_val; /* MSR_KVM_ASYNC_PF_EN */
u64 msr_int_val; /* MSR_KVM_ASYNC_PF_INT */
@@ -1878,7 +1877,6 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
   struct kvm_async_pf *work);
  void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu);
  bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu);
-extern bool kvm_find_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
  
  int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu);

  int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err);
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ac69894eab88..53a6ef30b6ee 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -32,6 +32,7 @@ config KVM
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_EVENTFD
select KVM_ASYNC_PF
+   select KVM_ASYNC_PF_SLOT
select USER_RETURN_NOTIFIER
select KVM_MMIO
select SCHED_INFO
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c4f4fa23320e..cd8aaa662ac2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3799,7 +3799,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool 
prefault, gfn_t gfn,
  
  	if (!prefault && kvm_can_do_async_pf(vcpu)) {

trace_kvm_try_async_get_page(cr2_or_gpa, gfn);
-   if (kvm_find_async_pf_gfn(vcpu, gfn)) {
+   if (kvm_async_pf_find_slot(vcpu, gfn)) {
trace_kvm_async_pf_doublefault(cr2_or_gpa, gfn);
kvm_make_request(KVM_REQ_APF_HALT, vcpu);
return true;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7f35d9324b99..a5f7d6122178 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -332,13 +332,6 @@ static struct kmem_cache *kvm_alloc_emulator_cache(void)
  
  static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);
  
-static inline void kvm_async_pf_hash_reset(struct kvm_vcpu *vcpu)

-{
-   int i;
-   for (i = 0; i < ASYNC_PF_PER_VCPU; i++)
-   vcpu->arch.apf.gfns[i] = ~0;
-}
-
  static void kvm_on_user_return(struct user_return_notifier *urn)
  {
unsigned slot;
@@ -854,7 +847,7 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long 
old_cr0, unsigned lon
  {
if ((cr0 ^ old_cr0) & X86_CR0_PG) {
kvm_clear_async_pf_completion_queue(vcpu);
-   kvm_async_pf_hash_reset(vcpu);
+   kvm_async_pf_reset_slot(vcpu);
}
  
  	if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS)

@@ -3118,7 +3111,7 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, 
u64 data)
  
  	if (!kvm_pv_async_pf_enabled(vcpu)) {

kvm_clear_async_pf_completion_queue(vcpu);
-   kvm_async_pf_hash_reset(vcpu);
+   kvm_async_pf_reset_slot(vcpu);
return 0;
}
  
@@ -10704,7 +10697,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
  
  	vcpu->arch.pat = MSR_IA32_CR_PAT_DEFAULT;
  
-	kvm_async_pf_hash_reset(vcpu);

+   kvm_async_pf_reset_slot(vcpu);
kvm_pmu_init(vcpu);
  
  	vcpu->arch.pending_external_vector = -1;

@@ -10828,7 +10821,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool 
init_event)
kvmclock_reset(vcpu);
  
  	kvm_clear_async_pf_completion_queue(vcpu);

-   kvm_async_pf_hash_reset(vcpu);
+   kvm_async_pf_reset_slot(vcpu);
vcpu->arch.apf.halted = false;
  
  	if (vcpu->arch.guest_fpu && kvm_mpx_supported()) {

@@ -11737,73 +11730,6 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, 
struct kvm_async_pf *work)
kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true);
  }
  
-static inline u32 kvm_async_pf_hash_fn(gfn_t gfn)

-{
-   

Re: [PATCH v4 03/15] KVM: async_pf: Make GFN slot management generic

2022-01-12 Thread Gavin Shan

Hi Eric,

On 11/11/21 1:00 AM, Eric Auger wrote:

On 8/15/21 2:59 AM, Gavin Shan wrote:

It's not allowed to fire duplicate notification for same GFN on
x86 platform, with help of a hash table. This mechanism is going

s/, with help of a hash table/this is achieved through a hash table

to be used by arm64 and this makes the code generic and shareable

s/and this makes/.\n Turn the code generic

by multiple platforms.

* As this mechanism isn't needed by all platforms, a new kernel
  config option (CONFIG_KVM_ASYNC_PF_SLOT) is introduced so that it
  can be disabled at compiling time.

compile time


Ok.



* The code is basically copied from x86 platform and the functions
  are renamed to reflect the fact: (a) the input parameters are
  vCPU and GFN.

not for reset
(b) The operations are resetting, searching, adding


Ok.


  and removing.

find, add, remove ops are renamed with _slot suffix


Ok. The commit log will be improved based on your suggestions in
next respin :)



* A helper stub is also added for !CONFIG_KVM_ASYNC_PF because we're
  going to use IS_ENABLED() instead of #ifdef on arm64 when the
  asynchronous page fault is supported.

This is preparatory work to use the newly introduced functions on x86
platform and arm64 in subsequent patches.
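
As a rough usage sketch of the new generic API (the call sites here are
illustrative; the real conversions happen in the following patches):

	/* Before telling the guest about a not-present fault on @gfn: */
	if (kvm_async_pf_find_slot(vcpu, gfn)) {
		/* A notification for this GFN is already outstanding. */
		return false;
	}
	kvm_async_pf_add_slot(vcpu, gfn);

	/* Once the page is ready and the guest has been notified: */
	kvm_async_pf_remove_slot(vcpu, gfn);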

Signed-off-by: Gavin Shan 
---
  include/linux/kvm_host.h | 18 +
  virt/kvm/Kconfig |  3 ++
  virt/kvm/async_pf.c  | 85 
  3 files changed, 106 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a5f990f6dc35..a9685c2b2250 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -298,6 +298,9 @@ struct kvm_vcpu {
  
  #ifdef CONFIG_KVM_ASYNC_PF

struct {
+#ifdef CONFIG_KVM_ASYNC_PF_SLOT
+   gfn_t gfns[ASYNC_PF_PER_VCPU];
+#endif
u32 queued;
struct list_head queue;
struct list_head done;
@@ -339,6 +342,13 @@ struct kvm_async_pf {
boolnotpresent_injected;
  };
  
+#ifdef CONFIG_KVM_ASYNC_PF_SLOT

+void kvm_async_pf_reset_slot(struct kvm_vcpu *vcpu);

this does not reset a "slot" but the whole hash table. So to me this
shouldn't be renamed with _slot suffix. reset_hash or reset_all_slots?


Sure, let's have kvm_async_pf_reset_all_slots() in next respin.


+void kvm_async_pf_add_slot(struct kvm_vcpu *vcpu, gfn_t gfn);
+void kvm_async_pf_remove_slot(struct kvm_vcpu *vcpu, gfn_t gfn);
+bool kvm_async_pf_find_slot(struct kvm_vcpu *vcpu, gfn_t gfn);
+#endif
+
  static inline bool kvm_check_async_pf_completion_queue(struct kvm_vcpu *vcpu)
  {
	return !list_empty_careful(&vcpu->async_pf.done);
@@ -350,6 +360,14 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t 
cr2_or_gpa,
unsigned long hva, struct kvm_arch_async_pf *arch);
  int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
  #else
+static inline void kvm_async_pf_reset_slot(struct kvm_vcpu *vcpu) { }
+static inline void kvm_async_pf_add_slot(struct kvm_vcpu *vcpu, gfn_t gfn) { }
+static inline void kvm_async_pf_remove_slot(struct kvm_vcpu *vcpu, gfn_t gfn) 
{ }
+static inline bool kvm_async_pf_find_slot(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+   return false;
+}
+
  static inline bool kvm_check_async_pf_completion_queue(struct kvm_vcpu *vcpu)
  {
return false;
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 62b39149b8c8..59b518c8c205 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -23,6 +23,9 @@ config KVM_MMIO
  config KVM_ASYNC_PF
 bool
  
+config KVM_ASYNC_PF_SLOT

+   bool
+
  # Toggle to switch between direct notification and batch job
  config KVM_ASYNC_PF_SYNC
 bool
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index d145a61a046a..0d1fdb2932af 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -13,12 +13,97 @@
  #include 
  #include 
  #include 
+#ifdef CONFIG_KVM_ASYNC_PF_SLOT
+#include 
+#endif
  
  #include "async_pf.h"

  #include 
  
  static struct kmem_cache *async_pf_cache;
  
+#ifdef CONFIG_KVM_ASYNC_PF_SLOT

+static inline u32 kvm_async_pf_hash(gfn_t gfn)
+{
+   BUILD_BUG_ON(!is_power_of_2(ASYNC_PF_PER_VCPU));
+
+   return hash_32(gfn & 0x, order_base_2(ASYNC_PF_PER_VCPU));
+}
+
+static inline u32 kvm_async_pf_next_slot(u32 key)
+{
+   return (key + 1) & (ASYNC_PF_PER_VCPU - 1);
+}
+
+static u32 kvm_async_pf_slot(struct kvm_vcpu *vcpu, gfn_t gfn)
+{
+   u32 key = kvm_async_pf_hash(gfn);
+   int i;
+
+   for (i = 0; i < ASYNC_PF_PER_VCPU &&
+   (vcpu->async_pf.gfns[key] != gfn &&
+   vcpu->async_pf.gfns[key] != ~0); i++)
+   key = kvm_async_pf_next_slot(key);
+
+   return key;
+}
+
+void kvm_async_pf_reset_slot(struct kvm_vcpu *vcpu)
+{
+   int i;
+
+   for (i = 0; i < ASYNC_PF_PER_VCPU; i++)
+   vcpu->async_pf.gfns[i] = ~0;
+}
+
+void kvm_async_pf_add_slot(struct 

Re: [PATCH v4 02/15] KVM: async_pf: Add helper function to check completion queue

2022-01-12 Thread Gavin Shan

Hi Eric,

On 11/10/21 11:37 PM, Eric Auger wrote:

On 8/15/21 2:59 AM, Gavin Shan wrote:

This adds the inline helper kvm_check_async_pf_completion_queue() to
check if there are pending completions in the queue. An empty stub
is also added for !CONFIG_KVM_ASYNC_PF so that the caller needn't
consider whether CONFIG_KVM_ASYNC_PF is enabled.

All checks on the completion queue are done by the newly added inline
function since list_empty() and list_empty_careful() are interchangeable.

why is it interchangeable?



I think the commit log is misleading. list_empty_careful() is stricter
than list_empty(). In this patch, we replace list_empty() with
list_empty_careful(). I will correct the commit log in next respin like below:

   All checks on the completion queue are done by the newly added inline
   function, where list_empty_careful() instead of list_empty() is used.
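
For reference, simplified versions of the two list helpers (the real ones in
include/linux/list.h add READ_ONCE()/barrier handling on top of this):

	static inline int list_empty(const struct list_head *head)
	{
		return head->next == head;
	}

	static inline int list_empty_careful(const struct list_head *head)
	{
		struct list_head *next = head->next;

		/*
		 * Also checks ->prev, so the list is only reported empty once
		 * both pointers agree; this tolerates a concurrent
		 * list_del_init() of the last entry.
		 */
		return (next == head) && (next == head->prev);
	}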
 


Signed-off-by: Gavin Shan 
---
  arch/x86/kvm/x86.c   |  2 +-
  include/linux/kvm_host.h | 10 ++
  virt/kvm/async_pf.c  | 10 +-
  virt/kvm/kvm_main.c  |  4 +---
  4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5d5c5ed7dd4..7f35d9324b99 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11591,7 +11591,7 @@ static inline bool kvm_guest_apic_has_interrupt(struct 
kvm_vcpu *vcpu)
  
  static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)

  {
-   if (!list_empty_careful(&vcpu->async_pf.done))
+   if (kvm_check_async_pf_completion_queue(vcpu))
return true;
  
  	if (kvm_apic_has_events(vcpu))

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 85b61a456f1c..a5f990f6dc35 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -339,12 +339,22 @@ struct kvm_async_pf {
boolnotpresent_injected;
  };
  
+static inline bool kvm_check_async_pf_completion_queue(struct kvm_vcpu *vcpu)

+{
+   return !list_empty_careful(&vcpu->async_pf.done);
+}
+
  void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu);
  void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu);
  bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
unsigned long hva, struct kvm_arch_async_pf *arch);
  int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
  #else
+static inline bool kvm_check_async_pf_completion_queue(struct kvm_vcpu *vcpu)
+{
+   return false;
+}
+
  static inline void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu) { }
  #endif
  
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c

index dd777688d14a..d145a61a046a 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -70,7 +70,7 @@ static void async_pf_execute(struct work_struct *work)
kvm_arch_async_page_present(vcpu, apf);
  
  	spin_lock(&vcpu->async_pf.lock);

-   first = list_empty(&vcpu->async_pf.done);
+   first = !kvm_check_async_pf_completion_queue(vcpu);
	list_add_tail(&apf->link, &vcpu->async_pf.done);
	apf->vcpu = NULL;
	spin_unlock(&vcpu->async_pf.lock);
@@ -122,7 +122,7 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu 
*vcpu)
	spin_lock(&vcpu->async_pf.lock);
	}
  
-	while (!list_empty(&vcpu->async_pf.done)) {

+   while (kvm_check_async_pf_completion_queue(vcpu)) {

this is replaced by a stronger check. Can you please explain why it is
equivalent?


Access to the completion queue is protected by a spinlock, so the situation
that the additional check in list_empty_careful() guards against (the head's
prev/next being modified on the fly) shouldn't happen. It means the two are
the same in our case.


struct kvm_async_pf *work =
				list_first_entry(&vcpu->async_pf.done,
 typeof(*work), link);
@@ -138,7 +138,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
  {
struct kvm_async_pf *work;
  
-	while (!list_empty_careful(&vcpu->async_pf.done) &&

+   while (kvm_check_async_pf_completion_queue(vcpu) &&
  kvm_arch_can_dequeue_async_page_present(vcpu)) {
		spin_lock(&vcpu->async_pf.lock);
		work = list_first_entry(&vcpu->async_pf.done, typeof(*work),
@@ -205,7 +205,7 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
struct kvm_async_pf *work;
bool first;
  
-	if (!list_empty_careful(&vcpu->async_pf.done))

+   if (kvm_check_async_pf_completion_queue(vcpu))
return 0;
  
  	work = kmem_cache_zalloc(async_pf_cache, GFP_ATOMIC);

@@ -216,7 +216,7 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
	INIT_LIST_HEAD(&work->queue); /* for list_del to work */
  
  	spin_lock(&vcpu->async_pf.lock);

-   first = list_empty(&vcpu->async_pf.done);
+   first = !kvm_check_async_pf_completion_queue(vcpu);
	list_add_tail(&work->link, &vcpu->async_pf.done);
	spin_unlock(&vcpu->async_pf.lock);
  
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c

index b50dbe269f4b..8795503651b1 100644
--- a/virt/kvm/kvm_main.c
+++ 

Re: [PATCH v4 01/15] KVM: async_pf: Move struct kvm_async_pf around

2022-01-12 Thread Gavin Shan

Hi Eric,

On 11/10/21 11:37 PM, Eric Auger wrote:

On 8/15/21 2:59 AM, Gavin Shan wrote:

This moves the definition of "struct kvm_async_pf" and the related
functions after "struct kvm_vcpu" so that the newly added inline functions
in the subsequent patches can dereference "struct kvm_vcpu" properly.
Otherwise, the following unexpected build error is raised:

error: dereferencing pointer to incomplete type ‘struct kvm_vcpu’
return !list_empty_careful(&vcpu->async_pf.done);
^~
Since we're here, the sepator between type and field in "struct kvm_vcpu"

separator


Thanks, it will be fixed in next respin.


is replaced by a tab. The empty stub kvm_check_async_pf_completion() is also
added for !CONFIG_KVM_ASYNC_PF, which is needed by subsequent patches to
support asynchronous page fault on ARM64.
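
A minimal standalone illustration of the same ordering problem (not from this
patch):

	#include <linux/list.h>

	struct kvm_vcpu;	/* only a forward declaration at this point */

	static inline bool has_completed_work(struct kvm_vcpu *vcpu)
	{
		/* error: dereferencing pointer to incomplete type 'struct kvm_vcpu' */
		return !list_empty_careful(&vcpu->async_pf.done);
	}

	struct kvm_vcpu {	/* the full definition arrives too late */
		struct {
			struct list_head done;
		} async_pf;
	};

Moving the helper after the full definition (or the definition before the
helper) makes the dereference legal, which is what this patch does.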

Signed-off-by: Gavin Shan 
---
  include/linux/kvm_host.h | 44 +---
  1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ae7735b490b4..85b61a456f1c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -199,27 +199,6 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum 
kvm_bus bus_idx,
  struct kvm_io_device *kvm_io_bus_get_dev(struct kvm *kvm, enum kvm_bus 
bus_idx,
 gpa_t addr);
  
-#ifdef CONFIG_KVM_ASYNC_PF

-struct kvm_async_pf {
-   struct work_struct work;
-   struct list_head link;
-   struct list_head queue;
-   struct kvm_vcpu *vcpu;
-   struct mm_struct *mm;
-   gpa_t cr2_or_gpa;
-   unsigned long addr;
-   struct kvm_arch_async_pf arch;
-   bool   wakeup_all;
-   bool notpresent_injected;
-};
-
-void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu);
-void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu);
-bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
-   unsigned long hva, struct kvm_arch_async_pf *arch);
-int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
-#endif
-
  #ifdef KVM_ARCH_WANT_MMU_NOTIFIER
  struct kvm_gfn_range {
struct kvm_memory_slot *slot;
@@ -346,6 +325,29 @@ struct kvm_vcpu {
struct kvm_dirty_ring dirty_ring;
  };
  
+#ifdef CONFIG_KVM_ASYNC_PF

+struct kvm_async_pf {
+   struct work_struct  work;
+   struct list_headlink;
+   struct list_headqueue;
+   struct kvm_vcpu *vcpu;
+   struct mm_struct*mm;
+   gpa_t   cr2_or_gpa;
+   unsigned long   addr;
+   struct kvm_arch_async_pfarch;
+   boolwakeup_all;
+   boolnotpresent_injected;
+};
+
+void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu);
+void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu);
+bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+   unsigned long hva, struct kvm_arch_async_pf *arch);
+int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
+#else
+static inline void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu) { }

why is that stub needed on ARM64 and not on the other archs?



We use the following pattern, suggested by James Morse.

int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
{
int r;
switch (ext) {
  :
case KVM_CAP_ASYNC_PF:
case KVM_CAP_ASYNC_PF_INT:
r = IS_ENABLED(CONFIG_KVM_ASYNC_PF) ? 1 : 0;
break;
default:
r = 0;
}

return r;
}
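
The same reasoning applies to the call sites of kvm_check_async_pf_completion():
with IS_ENABLED() the branch is compiled out as dead code, but the call still
has to compile, so a stub (or at least a declaration) must exist when
CONFIG_KVM_ASYNC_PF is off. Roughly (illustrative, not an exact hunk from the
series):

	static void vcpu_handle_requests(struct kvm_vcpu *vcpu)
	{
		if (IS_ENABLED(CONFIG_KVM_ASYNC_PF))
			kvm_check_async_pf_completion(vcpu);
		/*
		 * Without the !CONFIG_KVM_ASYNC_PF stub above, this would not
		 * build on configs without async PF, even though the branch
		 * is eliminated at compile time.
		 */
	}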

Thanks,
Gavin


+#endif
+
  /* must be called with irqs disabled */
  static __always_inline void guest_enter_irqoff(void)
  {







Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

2022-01-12 Thread Gavin Shan

Hi Shannon,

On 1/13/22 3:02 PM, Gavin Shan wrote:

On 1/11/22 5:43 PM, Shannon Zhao wrote:

On 2021/8/15 8:13, Gavin Shan wrote:

+static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
+{
+    struct kvm *kvm = vcpu->kvm;
+    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+    struct kvm_sdei_vcpu_regs *regs;
+    unsigned long index = smccc_get_arg1(vcpu);
+    unsigned long ret = SDEI_SUCCESS;
+
+    /* Sanity check */
+    if (!(ksdei && vsdei)) {
+    ret = SDEI_NOT_SUPPORTED;
+    goto out;
+    }

Maybe we could move this common sanity check code to kvm_sdei_hypercall() to
save some lines.



Not all hypercalls need this check. For example,
COMPLETE/COMPLETE_RESUME/CONTEXT don't
have the SDEI event number as an argument. If we really want to move this
check into kvm_sdei_hypercall(), we would have code like below, with too many
duplicated snippets. I don't think it's better than what we have, if I fully
understand your comments.



oops... sorry. Please ignore my previous reply. I wrongly thought you were
talking about the check on the SDEI event number. Yes, you're correct that
the check should be moved to kvm_sdei_hypercall().

Thanks,
Gavin



Re: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

2022-01-12 Thread Gavin Shan

Hi Shannon,

On 1/11/22 5:40 PM, Shannon Zhao wrote:

On 2021/8/15 8:13, Gavin Shan wrote:

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e9a2b8f27792..2f021aa41632 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
  kvm_vgic_early_init(kvm);
+    kvm_sdei_init_vm(kvm);
+
  /* The maximum number of VCPUs is limited by the host's GIC model */
  kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();

Hi, is it possible to let user space choose whether to enable SDEI or not,
rather than enabling it by default?



It's possible, but what's the benefit of doing so? I thought about it for
a while and I don't think it's necessary, at least for now. First of all,
SDEI events are injected by individual modules in userspace
(QEMU) or the host kernel (async PF). If we really want the function to be
disabled, the individual modules can accept a parameter indicating whether
SDEI event injection is allowed or not. In this case, SDEI is enabled
by default, but the individual modules can choose not to use it :)

Thanks,
Gavin



Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

2022-01-12 Thread Gavin Shan

Hi Shannon,

On 1/11/22 5:43 PM, Shannon Zhao wrote:

On 2021/8/15 8:13, Gavin Shan wrote:

+static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
+{
+    struct kvm *kvm = vcpu->kvm;
+    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+    struct kvm_sdei_vcpu_regs *regs;
+    unsigned long index = smccc_get_arg1(vcpu);
+    unsigned long ret = SDEI_SUCCESS;
+
+    /* Sanity check */
+    if (!(ksdei && vsdei)) {
+    ret = SDEI_NOT_SUPPORTED;
+    goto out;
+    }

Maybe we could move this common sanity check code to kvm_sdei_hypercall() to
save some lines.



Not all hypercalls need this check. For example,
COMPLETE/COMPLETE_RESUME/CONTEXT don't
have the SDEI event number as an argument. If we really want to move this
check into kvm_sdei_hypercall(), we would have code like below, with too many
duplicated snippets. I don't think it's better than what we have, if I fully
understand your comments.

  switch (...) {
  case REGISTER:
	if (!(ksdei && vsdei)) {
		ret = SDEI_NOT_SUPPORTED;
		break;
	}

	ret = kvm_sdei_hypercall_register(vcpu);
	break;
  case UNREGISTER:
	if (!(ksdei && vsdei)) {
		ret = SDEI_NOT_SUPPORTED;
		break;
	}

	ret = kvm_sdei_hypercall_unregister(vcpu);
	break;
  case CONTEXT:
	ret = kvm_sdei_hypercall_context(vcpu);
	break;
	:
  }

Thanks,
Gavin



Re: [kvm-unit-tests PATCH v2] arm64: debug: mark test_[bp,wp,ss] as noinline

2022-01-12 Thread Andrew Jones
On Wed, Jan 12, 2022 at 07:21:55AM -0800, Ricardo Koller wrote:
> Clang inlines some functions (like test_ss) which define global labels
> in inline assembly (e.g., ss_start). This results in:
> 
> arm/debug.c:382:15: error: invalid symbol redefinition
> asm volatile("ss_start:\n"
>  ^
> :1:2: note: instantiated into assembly here
> ss_start:
> ^
> 1 error generated.
> 
> Fix these functions by marking them as "noinline".
> 
> Cc: Andrew Jones 
> Signed-off-by: Ricardo Koller 
> ---
> This applies on top of: "[kvm-unit-tests PATCH 0/3] arm64: debug: add 
> migration tests for debug state"
> which is in https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/arm/queue.
> 
>  arm/debug.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arm/debug.c b/arm/debug.c
> index 54f059d..e9f8056 100644
> --- a/arm/debug.c
> +++ b/arm/debug.c
> @@ -264,7 +264,7 @@ static void do_migrate(void)
>   report_info("Migration complete");
>  }
>  
> -static void test_hw_bp(bool migrate)
> +static noinline void test_hw_bp(bool migrate)
>  {
>   extern unsigned char hw_bp0;
>   uint32_t bcr;
> @@ -310,7 +310,7 @@ static void test_hw_bp(bool migrate)
>  
>  static volatile char write_data[16];
>  
> -static void test_wp(bool migrate)
> +static noinline void test_wp(bool migrate)
>  {
>   uint32_t wcr;
>   uint32_t mdscr;
> @@ -353,7 +353,7 @@ static void test_wp(bool migrate)
>   }
>  }
>  
> -static void test_ss(bool migrate)
> +static noinline void test_ss(bool migrate)
>  {
>   extern unsigned char ss_start;
>   uint32_t mdscr;
> -- 
> 2.34.1.575.g55b058a8bb-goog
>

Applied to arm/queue.

Thanks,
drew 



Re: [RFC PATCH 3/3] KVM: selftests: Add vgic initialization for dirty log perf test for ARM

2022-01-12 Thread Marc Zyngier
On Tue, 11 Jan 2022 22:16:01 +,
Jing Zhang  wrote:
> 
> On Tue, Jan 11, 2022 at 2:30 AM Marc Zyngier  wrote:
> >
> > On Mon, 10 Jan 2022 21:04:41 +,
> > Jing Zhang  wrote:
> > >
> > > For ARM64, if no vgic is setup before the dirty log perf test, the
> > > userspace irqchip would be used, which would affect the dirty log perf
> > > test result.
> >
> > Doesn't it affect *all* performance tests? How much does this change
> > contribute to the performance numbers you give in the cover letter?
> >
> This bottleneck showed up after adding the fast path patch. I didn't
> try other performance tests with this, but I think it is a good idea
> to add a vgic setup for all performance tests. I can post another
> patch later to make it available for all performance tests after
> finishing this one and verifying all other performance tests.
> Below is the test result without adding the vgic setup. It shows a
> 20~30% improvement across the different numbers of vCPUs.
> +-------+------------------------+
> | #vCPU | dirty memory time (ms) |
> +-------+------------------------+
> |     1 |                    965 |
> +-------+------------------------+
> |     2 |                   1006 |
> +-------+------------------------+
> |     4 |                   1128 |
> +-------+------------------------+
> |     8 |                   2005 |
> +-------+------------------------+
> |    16 |                   3903 |
> +-------+------------------------+
> |    32 |                   7595 |
> +-------+------------------------+
> |    64 |                  15783 |
> +-------+------------------------+
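> 
> For reference, the "vgic setup" here just means creating an in-kernel GICv3
> for the VM before the vCPUs start running, along the lines of the raw-ioctl
> sketch below (addresses are hypothetical; the selftest helper added in this
> series may look different):
> 
> 	struct kvm_create_device gic = {
> 		.type = KVM_DEV_TYPE_ARM_VGIC_V3,
> 	};
> 	uint64_t dist_base = 0x8000000;	/* hypothetical GPA */
> 	struct kvm_device_attr dist = {
> 		.group = KVM_DEV_ARM_VGIC_GRP_ADDR,
> 		.attr  = KVM_VGIC_V3_ADDR_TYPE_DIST,
> 		.addr  = (uint64_t)&dist_base,
> 	};
> 	struct kvm_device_attr init = {
> 		.group = KVM_DEV_ARM_VGIC_GRP_CTRL,
> 		.attr  = KVM_DEV_ARM_VGIC_CTRL_INIT,
> 	};
> 
> 	ioctl(vm_fd, KVM_CREATE_DEVICE, &gic);	/* gic.fd is filled in */
> 	ioctl(gic.fd, KVM_SET_DEVICE_ATTR, &dist);
> 	/* (redistributor region set up similarly) */
> 	ioctl(gic.fd, KVM_SET_DEVICE_ATTR, &init);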

So please use these numbers in your cover letter when you repost your
series, as the improvement you'd observe on actual workloads is likely
to be less than what you claim due to this change in the test itself
(in other words, if you are going to benchmark something, don't
change the benchmark halfway).

M.

-- 
Without deviation from the norm, progress is not possible.