Re: [PATCH] kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP

2019-04-09 Thread Zenghui Yu


On 2019/4/9 22:59, Suzuki K Poulose wrote:

Hi Zenghui

On 04/09/2019 09:05 AM, Zenghui Yu wrote:



On 2019/4/9 2:40, Suzuki K Poulose wrote:

Hi Zenghui,

On 04/08/2019 04:11 PM, Zenghui Yu wrote:

Hi Suzuki,

Thanks for the reply.



...


Hi Suzuki,

Why not make use of fault_supports_stage2_huge_mapping()?  Let it do
some checks for us.

fault_supports_stage2_huge_mapping() was intended to do a *two-step*
check to tell us whether we can create stage2 huge block mappings, and this
check is both for hugetlbfs and THP.  With commit a80868f398554842b14,
we pass PAGE_SIZE as "map_size" for normal size pages (which turned out
to be almost meaningless), and unfortunately the THP check no longer
works.


That's correct.



So we want to rework the *THP* check process.  Your patch fixes the first
checking step, but the second is still missed, am I wrong?


It fixes the step explicitly for THP by making sure that the GPA and
the HVA are aligned to the map size.


Yes, I understand how your patch fixed the issue.  But what I'm
really concerned about here is the *second* checking step in
fault_supports_stage2_huge_mapping().

We have to check whether we are mapping a non-block-aligned or non-block-sized
memslot; if so, we cannot create block mappings for the beginning
and end of this memslot.  This is what the second part of
fault_supports_stage2_huge_mapping() does.

I haven't seen this checking-step in your patch, did I miss something?
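[Editor's note: for readers without the source to hand, the two checks being
discussed correspond roughly to the sketch below.  This is a paraphrase, not
the exact kernel code, and the helper name and parameter list are made up for
illustration:]

    /* Paraphrased sketch of fault_supports_stage2_huge_mapping()'s logic. */
    static bool huge_mapping_allowed(phys_addr_t gpa_start,
                                     unsigned long uaddr_start,
                                     unsigned long uaddr_end,
                                     unsigned long hva,
                                     unsigned long map_size)
    {
            /* Step 1: IPA and HVA must share the same offset within a block. */
            if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
                    return false;

            /* Step 2: the block containing hva must lie entirely inside the memslot. */
            return (hva & ~(map_size - 1)) >= uaddr_start &&
                   (hva & ~(map_size - 1)) + map_size <= uaddr_end;
    }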



I see.


I don't think this calls for a VM_BUG_ON(). It is simply a case where
the GPA is not aligned to the HVA, but for a normal VMA that could be made THP.


We had this VM_BUG_ON(), which would never have hit because we would
have set force_pte if they were not aligned.


Yes, I agree.


+    /* Skip memslots with unaligned IPA and user address */
+    if ((gfn & mask) != (pfn & mask))
+    return false;
  if (pfn & mask) {
  *ipap &= PMD_MASK;
  kvm_release_pfn_clean(pfn);
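[Editor's note: a concrete illustration of what this new check catches, using
made-up numbers and assuming PMD-sized (2MB) THP:]

    /*
     * mask = PTRS_PER_PMD - 1 = 0x1ff           (in units of 4K pages)
     * gfn  = 0x40200  ->  gfn & mask = 0x000    (IPA is 2MB aligned)
     * pfn  = 0x89a80  ->  pfn & mask = 0x080    (HVA is only 512K aligned)
     *
     * The offsets within the 2MB block differ, so a stage2 block mapping
     * would cover the wrong host pages; the check above bails out and the
     * fault is handled with a regular PTE mapping instead.
     */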



---8<---

Rework fault_supports_stage2_huge_mapping(), let it check THP again.

Signed-off-by: Zenghui Yu 
---
  virt/kvm/arm/mmu.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..5e1b258 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1632,6 +1632,15 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,

  uaddr_end = uaddr_start + size;

  /*
+ * If the memslot is _not_ backed by hugetlbfs, then check if it
+ * can be backed by transparent hugepages.
+ *
+ * Currently only PMD_SIZE THPs are supported, revisit it later.
+ */
+    if (map_size == PAGE_SIZE)
+    map_size = PMD_SIZE;
+


This looks hackish. What if we support PUD_SIZE huge pages in the future?


Yes, this might make the code a little difficult to understand. But by
doing so, we follow the same logic as before commit a80868f398554842b14;
that is, we do the two-step checking for normal size pages in
fault_supports_stage2_huge_mapping() to decide if we can create THP
mappings for these pages.

As for PUD_SIZE THPs, to be honest, I have no idea now :(


How about the following diff ?

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 97b5417..98e5cec 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1791,7 +1791,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,

   * currently supported. This code will need to be
   * updated to support other THP sizes.
   */
-    if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+    if (fault_supports_stage2_huge_mappings(memslot, hva, PMD_SIZE) &&
+        transparent_hugepage_adjust(&pfn, &fault_ipa))
  vma_pagesize = PMD_SIZE;
  }


I think this is good enough for the issue.

(One minor concern: With this change, it seems that we no longer need
"force_pte" and can just use "logging_active" instead. But this is not
much related to what we're fixing.)


I would still leave the force_pte there to avoid checking for a THP case
in a situation where we are forced to a PTE-level mapping on a hugepage-backed
VMA. It would serve to avoid another check.


Hi Suzuki,

Yes, I agree, thanks.


zenghui



[PATCH v13 7/8] arm64: KVM: avoid isb's by using direct pmxevtyper sysreg

2019-04-09 Thread Andrew Murray
Upon entering or exiting a guest we may modify multiple PMU counters to
enable or disable EL0 filtering. We presently do this via the indirect
PMXEVTYPER_EL0 system register (where the counter we modify is selected
by PMSELR). With this approach it is necessary to order the writes via
isb instructions such that we select the correct counter before modifying
it.

Let's avoid potentially expensive instruction barriers by using the
direct PMEVTYPER_EL0 registers instead.

As the change to counter type relates only to EL0 filtering we can rely
on the implicit instruction barrier which occurs when we transition from
EL2 to EL1 on entering the guest. On returning to userspace we can, at the
latest, rely on the implicit barrier between EL2 and EL0. We can also
depend on the explicit isb in armv8pmu_select_counter to order our write
against any other kernel changes by the PMU driver to the type register as
a result of preemption.

Signed-off-by: Andrew Murray 
---
 arch/arm64/kvm/pmu.c | 84 ++--
 1 file changed, 74 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 3407a529e116..4d55193ccc71 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -91,6 +91,74 @@ void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
write_sysreg(pmu->events_host, pmcntenset_el0);
 }
 
+#define PMEVTYPER_READ_CASE(idx)   \
+   case idx:   \
+   return read_sysreg(pmevtyper##idx##_el0)
+
+#define PMEVTYPER_WRITE_CASE(idx)  \
+   case idx:   \
+   write_sysreg(val, pmevtyper##idx##_el0);\
+   break
+
+#define PMEVTYPER_CASES(readwrite) \
+   PMEVTYPER_##readwrite##_CASE(0);\
+   PMEVTYPER_##readwrite##_CASE(1);\
+   PMEVTYPER_##readwrite##_CASE(2);\
+   PMEVTYPER_##readwrite##_CASE(3);\
+   PMEVTYPER_##readwrite##_CASE(4);\
+   PMEVTYPER_##readwrite##_CASE(5);\
+   PMEVTYPER_##readwrite##_CASE(6);\
+   PMEVTYPER_##readwrite##_CASE(7);\
+   PMEVTYPER_##readwrite##_CASE(8);\
+   PMEVTYPER_##readwrite##_CASE(9);\
+   PMEVTYPER_##readwrite##_CASE(10);   \
+   PMEVTYPER_##readwrite##_CASE(11);   \
+   PMEVTYPER_##readwrite##_CASE(12);   \
+   PMEVTYPER_##readwrite##_CASE(13);   \
+   PMEVTYPER_##readwrite##_CASE(14);   \
+   PMEVTYPER_##readwrite##_CASE(15);   \
+   PMEVTYPER_##readwrite##_CASE(16);   \
+   PMEVTYPER_##readwrite##_CASE(17);   \
+   PMEVTYPER_##readwrite##_CASE(18);   \
+   PMEVTYPER_##readwrite##_CASE(19);   \
+   PMEVTYPER_##readwrite##_CASE(20);   \
+   PMEVTYPER_##readwrite##_CASE(21);   \
+   PMEVTYPER_##readwrite##_CASE(22);   \
+   PMEVTYPER_##readwrite##_CASE(23);   \
+   PMEVTYPER_##readwrite##_CASE(24);   \
+   PMEVTYPER_##readwrite##_CASE(25);   \
+   PMEVTYPER_##readwrite##_CASE(26);   \
+   PMEVTYPER_##readwrite##_CASE(27);   \
+   PMEVTYPER_##readwrite##_CASE(28);   \
+   PMEVTYPER_##readwrite##_CASE(29);   \
+   PMEVTYPER_##readwrite##_CASE(30)
+
+/*
+ * Read a value direct from PMEVTYPER
+ */
+static u64 kvm_vcpu_pmu_read_evtype_direct(int idx)
+{
+   switch (idx) {
+   PMEVTYPER_CASES(READ);
+   default:
+   WARN_ON(1);
+   }
+
+   return 0;
+}
+
+/*
+ * Write a value direct to PMEVTYPER
+ */
+static void kvm_vcpu_pmu_write_evtype_direct(int idx, u32 val)
+{
+   switch (idx) {
+   PMEVTYPER_CASES(WRITE);
+   default:
+   WARN_ON(1);
+   }
+}
+
 /*
  * Modify ARMv8 PMU events to include EL0 counting
  */
@@ -100,11 +168,9 @@ static void kvm_vcpu_pmu_enable_el0(unsigned long events)
u32 counter;
 
for_each_set_bit(counter, &events, 32) {
-   write_sysreg(counter, pmselr_el0);
-   isb();
-   typer = read_sysreg(pmxevtyper_el0) & ~ARMV8_PMU_EXCLUDE_EL0;
-   write_sysreg(typer, pmxevtyper_el0);
-   isb();
+   typer = kvm_vcpu_pmu_read_evtype_direct(counter);
+   typer &= ~ARMV8_PMU_EXCLUDE_EL0;
+   kvm_vcpu_pmu_write_evtype_direct(counter, typer);

[PATCH v13 2/8] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data

2019-04-09 Thread Andrew Murray
The virt/arm core allocates a kvm_cpu_context_t percpu; at present this is
a typedef to kvm_cpu_context and is used to store host cpu context. The
kvm_cpu_context structure is also used elsewhere to hold vcpu context.
In order to use the percpu to hold additional future host information we
encapsulate kvm_cpu_context in a new structure and rename the typedef and
percpu to match.

Signed-off-by: Andrew Murray 
Reviewed-by: Suzuki K Poulose 
---
 arch/arm/include/asm/kvm_host.h   | 10 +++---
 arch/arm64/include/asm/kvm_asm.h  |  3 ++-
 arch/arm64/include/asm/kvm_host.h | 16 ++--
 arch/arm64/kernel/asm-offsets.c   |  1 +
 virt/kvm/arm/arm.c| 14 --
 5 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 770d73257ad9..356f72b3643c 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -150,9 +150,13 @@ struct kvm_cpu_context {
u32 cp15[NR_CP15_REGS];
 };
 
-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+   struct kvm_cpu_context host_ctxt;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;
 
-static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
+static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt,
 int cpu)
 {
/* The host's MPIDR is immutable, so let's set it up at boot time */
@@ -182,7 +186,7 @@ struct kvm_vcpu_arch {
struct kvm_vcpu_fault_info fault;
 
/* Host FP context */
-   kvm_cpu_context_t *host_cpu_context;
+   struct kvm_cpu_context *host_cpu_context;
 
/* VGIC state */
struct vgic_cpu vgic_cpu;
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index f5b79e995f40..ff73f5462aca 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -108,7 +108,8 @@ extern u32 __kvm_get_mdcr_el2(void);
 .endm
 
 .macro get_host_ctxt reg, tmp
-   hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
+   hyp_adr_this_cpu \reg, kvm_host_data, \tmp
+   add \reg, \reg, #HOST_DATA_CONTEXT
 .endm
 
 .macro get_vcpu_ptr vcpu, ctxt
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a01fe087e022..6530ad522a16 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -212,7 +212,11 @@ struct kvm_cpu_context {
struct kvm_vcpu *__hyp_running_vcpu;
 };
 
-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+   struct kvm_cpu_context host_ctxt;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;
 
 struct vcpu_reset_state {
unsigned long   pc;
@@ -255,7 +259,7 @@ struct kvm_vcpu_arch {
struct kvm_guest_debug_arch external_debug_state;
 
/* Pointer to host CPU context */
-   kvm_cpu_context_t *host_cpu_context;
+   struct kvm_cpu_context *host_cpu_context;
 
struct thread_info *host_thread_info;   /* hyp VA */
struct user_fpsimd_state *host_fpsimd_state;/* hyp VA */
@@ -432,9 +436,9 @@ void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
+DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
 
-static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
+static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt,
 int cpu)
 {
/* The host's MPIDR is immutable, so let's set it up at boot time */
@@ -452,8 +456,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 * so that we can use adr_l to access per-cpu variables in EL2.
 */
-   u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_cpu_state) -
-(u64)kvm_ksym_ref(kvm_host_cpu_state));
+   u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_data) -
+(u64)kvm_ksym_ref(kvm_host_data));
 
/*
 * Call initialization code, and switch to the full blown HYP code.
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 7f40dcbdd51d..fbf0409cc2be 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -128,6 +128,7 @@ int main(void)
   DEFINE(CPU_GP_REGS,  offsetof(struct kvm_cpu_context, gp_regs));
   DEFINE(CPU_USER_PT_REGS, offsetof(struct kvm_regs, regs));
   DEFINE(HOST_CONTEXT_VCPU,offsetof(struct kvm_cpu_context, __hyp_running_vcpu));
+  DEFINE(HOST_DATA_CONTEXT,offsetof(struct kvm_host_data, host_ctxt));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_CTX_SP,   offsetof(struct cpu_suspend_ctx, sp));
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 

[PATCH v13 1/8] arm64: arm_pmu: remove unnecessary isb instruction

2019-04-09 Thread Andrew Murray
The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters - this doesn't provide any value
and is inconsistent with the armv8pmu_disable_event_counter.

In any case armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.

Let's remove the unnecessary isb instruction.

Signed-off-by: Andrew Murray 
Reviewed-by: Suzuki K Poulose 
Acked-by: Mark Rutland 
---
 arch/arm64/kernel/perf_event.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 4addb38bc250..cccf4fc86d67 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -533,7 +533,6 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
armv8pmu_enable_counter(idx);
if (armv8pmu_event_is_chained(event))
armv8pmu_enable_counter(idx - 1);
-   isb();
 }
 
 static inline int armv8pmu_disable_counter(int idx)
-- 
2.21.0



[PATCH v13 8/8] arm64: docs: document perf event attributes

2019-04-09 Thread Andrew Murray
The interaction between the exclude_{host,guest} flags,
exclude_{user,kernel,hv} flags and presence of VHE can result in
different exception levels being filtered by the ARMv8 PMU. As this
can be confusing let's document how they work on arm64.

Signed-off-by: Andrew Murray 
---
 Documentation/arm64/perf.txt | 85 
 1 file changed, 85 insertions(+)
 create mode 100644 Documentation/arm64/perf.txt

diff --git a/Documentation/arm64/perf.txt b/Documentation/arm64/perf.txt
new file mode 100644
index ..0d6a7d87d49e
--- /dev/null
+++ b/Documentation/arm64/perf.txt
@@ -0,0 +1,85 @@
+Perf Event Attributes
+=====================
+
+Author: Andrew Murray 
+Date: 2019-03-06
+
+exclude_user
+------------
+
+This attribute excludes userspace.
+
+Userspace always runs at EL0 and thus this attribute will exclude EL0.
+
+
+exclude_kernel
+--------------
+
+This attribute excludes the kernel.
+
+The kernel runs at EL2 with VHE and EL1 without. Guest kernels always run
+at EL1.
+
+For the host this attribute will exclude EL1 and additionally EL2 on a VHE
+system.
+
+For the guest this attribute will exclude EL1. Please note that EL2 is
+never counted within a guest.
+
+
+exclude_hv
+----------
+
+This attribute excludes the hypervisor.
+
+For a VHE host this attribute is ignored as we consider the host kernel to
+be the hypervisor.
+
+For a non-VHE host this attribute will exclude EL2 as we consider the
+hypervisor to be any code that runs at EL2 which is predominantly used for
+guest/host transitions.
+
+For the guest this attribute has no effect. Please note that EL2 is
+never counted within a guest.
+
+
+exclude_host / exclude_guest
+----------------------------
+
+These attributes exclude the KVM host and guest, respectively.
+
+The KVM host may run at EL0 (userspace), EL1 (non-VHE kernel) and EL2 (VHE
+kernel or non-VHE hypervisor).
+
+The KVM guest may run at EL0 (userspace) and EL1 (kernel).
+
+Due to the overlapping exception levels between host and guests we cannot
+exclusively rely on the PMU's hardware exception filtering - therefore we
+must enable/disable counting on the entry and exit to the guest. This is
+performed differently on VHE and non-VHE systems.
+
+For non-VHE systems we exclude EL2 for exclude_host - upon entering and
+exiting the guest we disable/enable the event as appropriate based on the
+exclude_host and exclude_guest attributes.
+
+For VHE systems we exclude EL1 for exclude_guest and exclude both EL0,EL2
+for exclude_host. Upon entering and exiting the guest we modify the event
+to include/exclude EL0 as appropriate based on the exclude_host and
+exclude_guest attributes.
+
+The statements above also apply when these attributes are used within a
+non-VHE guest however please note that EL2 is never counted within a guest.
+
+
+Accuracy
+--------
+
+On non-VHE hosts we enable/disable counters on the entry/exit of host/guest
+transition at EL2 - however there is a period of time between
+enabling/disabling the counters and entering/exiting the guest. We are
+able to eliminate counters counting host events on the boundaries of guest
+entry/exit when counting guest events by filtering out EL2 for
+exclude_host. However when using !exclude_hv there is a small blackout
+window at the guest entry/exit where host events are not captured.
+
+On VHE systems there are no blackout windows.
-- 
2.21.0



[PATCH v13 6/8] arm64: KVM: Enable VHE support for :G/:H perf event modifiers

2019-04-09 Thread Andrew Murray
With VHE different exception levels are used between the host (EL2) and
guest (EL1) with a shared exception level for userspace (EL0). We can take
advantage of this and use the PMU's exception level filtering to avoid
enabling/disabling counters in the world-switch code. Instead we just
modify the counter type to include or exclude EL0 at vcpu_{load,put} time.

We also ensure that trapped PMU system register writes do not re-enable
EL0 when reconfiguring the backing perf events.

This approach completely avoids blackout windows seen with !VHE.

Suggested-by: Christoffer Dall 
Signed-off-by: Andrew Murray 
Acked-by: Will Deacon 
---
 arch/arm/include/asm/kvm_host.h   |  3 ++
 arch/arm64/include/asm/kvm_host.h |  5 +-
 arch/arm64/kernel/perf_event.c|  6 ++-
 arch/arm64/kvm/pmu.c  | 87 ++-
 arch/arm64/kvm/sys_regs.c |  3 ++
 virt/kvm/arm/arm.c|  2 +
 6 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 356f72b3643c..1adfe35ba1eb 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -365,6 +365,9 @@ static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
 
+static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
+static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
+
 static inline void kvm_arm_vhe_guest_enter(void) {}
 static inline void kvm_arm_vhe_guest_exit(void) {}
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a3bfb75f0be9..4f290dad3a48 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -528,7 +528,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
 
 static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
 {
-   return attr->exclude_host;
+   return (!has_vhe() && attr->exclude_host);
 }
 
 #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
@@ -542,6 +542,9 @@ void kvm_clr_pmu_events(u32 clr);
 
 void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
 bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);
+
+void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
+void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
 #else
 static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u32 clr) {}
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 6bb28aaf5aea..314b1adedf06 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -847,8 +847,12 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 * with other architectures (x86 and Power).
 */
if (is_kernel_in_hyp_mode()) {
-   if (!attr->exclude_kernel)
+   if (!attr->exclude_kernel && !attr->exclude_host)
config_base |= ARMV8_PMU_INCLUDE_EL2;
+   if (attr->exclude_guest)
+   config_base |= ARMV8_PMU_EXCLUDE_EL1;
+   if (attr->exclude_host)
+   config_base |= ARMV8_PMU_EXCLUDE_EL0;
} else {
if (!attr->exclude_hv && !attr->exclude_host)
config_base |= ARMV8_PMU_INCLUDE_EL2;
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 48949f0cdba0..3407a529e116 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -8,11 +8,19 @@
 #include 
 
 /*
- * Given the exclude_{host,guest} attributes, determine if we are going
- * to need to switch counters at guest entry/exit.
+ * Given the perf event attributes and system type, determine
+ * if we are going to need to switch counters at guest entry/exit.
  */
 static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
 {
+   /**
+* With VHE the guest kernel runs at EL1 and the host at EL2,
+* where user (EL0) is excluded then we have no reason to switch
+* counters.
+*/
+   if (has_vhe() && attr->exclude_user)
+   return false;
+
/* Only switch if attributes are different */
return (attr->exclude_host != attr->exclude_guest);
 }
@@ -83,3 +91,78 @@ void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
write_sysreg(pmu->events_host, pmcntenset_el0);
 }
 
+/*
+ * Modify ARMv8 PMU events to include EL0 counting
+ */
+static void kvm_vcpu_pmu_enable_el0(unsigned long events)
+{
+   u64 typer;
+   u32 counter;
+
+   for_each_set_bit(counter, &events, 32) {
+   write_sysreg(counter, pmselr_el0);
+   isb();
+   typer = read_sysreg(pmxevtyper_el0) & ~ARMV8_PMU_EXCLUDE_EL0;
+   write_sysreg(typer, pmxevtyper_el0);
+   

[PATCH v13 0/8] arm64: Support perf event modifiers :G and :H

2019-04-09 Thread Andrew Murray
This patchset provides support for the perf event modifiers :G and :H which
allow filtering of PMU events between host and guest when used
with KVM.

As the underlying hardware cannot distinguish between guest and host
context we must switch the performance counters upon entry/exit to the
guest.

For non-VHE the counters must be stopped and started upon entry/exit to
the guest in __kvm_vcpu_run_nvhe.

For VHE we keep the counters enabled but instead change the event type
to include/exclude EL0 as appropriate. This allows us to perform this
switch outside the critical section of world-switch code in vcpu_load.

This has been tested with VHE and non-VHE kernels with a KVM guest.
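
[Editor's note: for context, the :G and :H modifiers (e.g. "perf stat -e
cycles:G") translate into the perf_event_attr exclude bits roughly as shown
below. This is an illustrative sketch only, not part of the series:]

    #include <linux/perf_event.h>
    #include <string.h>

    static void setup_guest_only_cycles(struct perf_event_attr *attr)
    {
            memset(attr, 0, sizeof(*attr));
            attr->size = sizeof(*attr);
            attr->type = PERF_TYPE_HARDWARE;
            attr->config = PERF_COUNT_HW_CPU_CYCLES;

            /* "cycles:G": count only while a KVM guest is running. */
            attr->exclude_host = 1;

            /* "cycles:H" would instead set: attr->exclude_guest = 1; */
    }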

Changes from v12:

 - Rebased to v5.1-rc4

 - Fixed breakage for arm32

 - Improved clarity of documentation

Changes from v11:

 - Rebased to v5.1-rc2

 - Minor changes to commit messages, fixing typos, removing superfluous
   comments, code.

 - Change ^ to != in kvm_pmu_switch_needed

Changes from v10:

 - Remove counter switch code from kvm_vcpu_run_vhe and replace with
   enable/disable EL0 event type in kvm_arch_vcpu_load instead

 - Reduce counter switch code in __kvm_vcpu_run_nvhe by moving some
   logic to kvm_set_pmu_events (kvm_pmu_switch_needed)

 - Simplify code by removing need for KVM_PMU_EVENTS_{HOST,GUEST}

 - Add kvm_host helper function to let PMU determine if event should
   start counting when it is enabled

 - Exclude EL2 on !VHE when exclude_host (to avoid counting host events
   as guest during entry/exit)

 - Moved PMU switching code to its own file

 - Rebased to v5.0

 - Added documentation

 - Removed Reviewed-By's for changed patches and updated commit messages

Changes from v9:

 - Rebased to v5.0-rc2
 - Ensure get_host_ctxt considers host_ctxt offset

Changes from v8:

 - Added additional comments
 - Fixed bisect build failure
 - Renamed cpu_ctxt variable to cpu_data

Changes from v7:

 - Added additional patch to encapsulate kvm_cpu_context in kvm_host_data

Changes from v6:

 - Move events_host/events_guest out of kvm_cpu_context

Changes from v5:

 - Tweak logic in use of kvm_set_pmu_events

Changes from v4:

 - Prevent unnecessary write_sysreg calls by improving
   __pmu_switch_to_xxx logic.

Changes from v3:

 - Remove confusing _only suffix from bitfields in kvm_cpu_context
 - Remove unnecessary condition when clearing event bits in disable
 - Simplify API of KVM accessors
 - Prevent unnecessary setting of pmcnten when guest/host events are
   the same.

Changes from v2:

 - Ensured that exclude_kernel works for guest
 - Removed unnecessary exclusion of EL2 with exclude_host on !VHE
 - Renamed kvm_clr_set_host_pmu_events to reflect args order
 - Added additional information to isb patch

Changes from v1:

 - Removed unnecessary exclusion of EL1 with exclude_guest on VHE
 - Removed unnecessary isb from existing perf_event.c driver
 - Folded perf_event.c patches together
 - Added additional information to last patch commit message


Andrew Murray (8):
  arm64: arm_pmu: remove unnecessary isb instruction
  arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data
  arm64: KVM: add accessors to track guest/host only counters
  arm64: arm_pmu: Add !VHE support for exclude_host/exclude_guest
attributes
  arm64: KVM: Enable !VHE support for :G/:H perf event modifiers
  arm64: KVM: Enable VHE support for :G/:H perf event modifiers
  arm64: KVM: avoid isb's by using direct pmxevtyper sysreg
  arm64: docs: document perf event attributes

 Documentation/arm64/perf.txt  |  85 +++
 arch/arm/include/asm/kvm_host.h   |  13 +-
 arch/arm64/include/asm/kvm_asm.h  |   3 +-
 arch/arm64/include/asm/kvm_host.h |  39 -
 arch/arm64/kernel/asm-offsets.c   |   1 +
 arch/arm64/kernel/perf_event.c|  50 +--
 arch/arm64/kvm/Makefile   |   2 +-
 arch/arm64/kvm/hyp/switch.c   |   6 +
 arch/arm64/kvm/pmu.c  | 232 ++
 arch/arm64/kvm/sys_regs.c |   3 +
 virt/kvm/arm/arm.c|  16 ++-
 11 files changed, 424 insertions(+), 26 deletions(-)
 create mode 100644 Documentation/arm64/perf.txt
 create mode 100644 arch/arm64/kvm/pmu.c

-- 
2.21.0



[PATCH v13 3/8] arm64: KVM: add accessors to track guest/host only counters

2019-04-09 Thread Andrew Murray
In order to efficiently switch events_{guest,host} perf counters at
guest entry/exit we add bitfields to kvm_cpu_context for guest and host
events as well as accessors for updating them.

A function is also provided which allows the PMU driver to determine
if a counter should start counting when it is enabled. With exclude_host,
we may only start counting when entering the guest.

Signed-off-by: Andrew Murray 
---
 arch/arm64/include/asm/kvm_host.h | 17 
 arch/arm64/kvm/Makefile   |  2 +-
 arch/arm64/kvm/pmu.c  | 45 +++
 3 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/pmu.c

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 6530ad522a16..213465196e6b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -212,8 +212,14 @@ struct kvm_cpu_context {
struct kvm_vcpu *__hyp_running_vcpu;
 };
 
+struct kvm_pmu_events {
+   u32 events_host;
+   u32 events_guest;
+};
+
 struct kvm_host_data {
struct kvm_cpu_context host_ctxt;
+   struct kvm_pmu_events pmu_events;
 };
 
 typedef struct kvm_host_data kvm_host_data_t;
@@ -520,11 +526,22 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
 
+static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
+{
+   return attr->exclude_host;
+}
+
 #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
 static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 {
return kvm_arch_vcpu_run_map_fp(vcpu);
 }
+
+void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
+void kvm_clr_pmu_events(u32 clr);
+#else
+static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
+static inline void kvm_clr_pmu_events(u32 clr) {}
 #endif
 
 static inline void kvm_arm_vhe_guest_enter(void)
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 690e033a91c0..3ac1a64d2fb9 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -17,7 +17,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
-kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
+kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o pmu.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
new file mode 100644
index ..5414b134f99a
--- /dev/null
+++ b/arch/arm64/kvm/pmu.c
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2019 Arm Limited
+ * Author: Andrew Murray 
+ */
+#include 
+#include 
+
+/*
+ * Given the exclude_{host,guest} attributes, determine if we are going
+ * to need to switch counters at guest entry/exit.
+ */
+static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
+{
+   /* Only switch if attributes are different */
+   return (attr->exclude_host != attr->exclude_guest);
+}
+
+/*
+ * Add events to track that we may want to switch at guest entry/exit
+ * time.
+ */
+void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
+{
+   struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+   if (!kvm_pmu_switch_needed(attr))
+   return;
+
+   if (!attr->exclude_host)
+   ctx->pmu_events.events_host |= set;
+   if (!attr->exclude_guest)
+   ctx->pmu_events.events_guest |= set;
+}
+
+/*
+ * Stop tracking events
+ */
+void kvm_clr_pmu_events(u32 clr)
+{
+   struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+   ctx->pmu_events.events_host &= ~clr;
+   ctx->pmu_events.events_guest &= ~clr;
+}
-- 
2.21.0



[PATCH v13 5/8] arm64: KVM: Enable !VHE support for :G/:H perf event modifiers

2019-04-09 Thread Andrew Murray
Enable/disable event counters as appropriate when entering and exiting
the guest to enable support for guest or host only event counting.

For both VHE and non-VHE we switch the counters between host/guest at
EL2.

The PMU may be on when we change which counters are enabled; however,
we avoid adding an isb as we instead rely on existing context
synchronisation events: the eret to enter the guest (__guest_enter)
and the eret in kvm_call_hyp for __kvm_vcpu_run_nvhe on returning.

Signed-off-by: Andrew Murray 
---
 arch/arm64/include/asm/kvm_host.h |  3 +++
 arch/arm64/kvm/hyp/switch.c   |  6 +
 arch/arm64/kvm/pmu.c  | 40 +++
 3 files changed, 49 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 213465196e6b..a3bfb75f0be9 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -539,6 +539,9 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 
 void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
 void kvm_clr_pmu_events(u32 clr);
+
+void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
+bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);
 #else
 static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u32 clr) {}
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 3563fe655cd5..d1dc40e53f22 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -524,6 +524,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 {
struct kvm_cpu_context *host_ctxt;
struct kvm_cpu_context *guest_ctxt;
+   bool pmu_switch_needed;
u64 exit_code;
 
/*
@@ -543,6 +544,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;
 
+   pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
+
__sysreg_save_state_nvhe(host_ctxt);
 
__activate_vm(kern_hyp_va(vcpu->kvm));
@@ -589,6 +592,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
 */
__debug_switch_to_host(vcpu);
 
+   if (pmu_switch_needed)
+   __pmu_switch_to_host(host_ctxt);
+
/* Returning to host will clear PSR.I, remask PMR if needed */
if (system_uses_irq_prio_masking())
gic_write_pmr(GIC_PRIO_IRQOFF);
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 5414b134f99a..48949f0cdba0 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -5,6 +5,7 @@
  */
 #include 
 #include 
+#include 
 
 /*
  * Given the exclude_{host,guest} attributes, determine if we are going
@@ -43,3 +44,42 @@ void kvm_clr_pmu_events(u32 clr)
ctx->pmu_events.events_host &= ~clr;
ctx->pmu_events.events_guest &= ~clr;
 }
+
+/**
+ * Disable host events, enable guest events
+ */
+bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
+{
+   struct kvm_host_data *host;
+   struct kvm_pmu_events *pmu;
+
+   host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+   pmu = &host->pmu_events;
+
+   if (pmu->events_host)
+   write_sysreg(pmu->events_host, pmcntenclr_el0);
+
+   if (pmu->events_guest)
+   write_sysreg(pmu->events_guest, pmcntenset_el0);
+
+   return (pmu->events_host || pmu->events_guest);
+}
+
+/**
+ * Disable guest events, enable host events
+ */
+void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
+{
+   struct kvm_host_data *host;
+   struct kvm_pmu_events *pmu;
+
+   host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
+   pmu = &host->pmu_events;
+
+   if (pmu->events_guest)
+   write_sysreg(pmu->events_guest, pmcntenclr_el0);
+
+   if (pmu->events_host)
+   write_sysreg(pmu->events_host, pmcntenset_el0);
+}
+
-- 
2.21.0



[PATCH v13 4/8] arm64: arm_pmu: Add !VHE support for exclude_host/exclude_guest attributes

2019-04-09 Thread Andrew Murray
Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.

We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer from starting or stopping events based
on their event attributes.

With !VHE we switch the counters between host/guest at EL2. We are able
to eliminate counters counting host events on the boundaries of guest
entry/exit when using :G by filtering out EL2 for exclude_host. When
using !exclude_hv there is a small blackout window at the guest
entry/exit where host events are not captured.

Signed-off-by: Andrew Murray 
Acked-by: Will Deacon 
---
 arch/arm64/kernel/perf_event.c | 43 --
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index cccf4fc86d67..6bb28aaf5aea 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -26,6 +26,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -528,11 +529,21 @@ static inline int armv8pmu_enable_counter(int idx)
 
 static inline void armv8pmu_enable_event_counter(struct perf_event *event)
 {
+   struct perf_event_attr *attr = &event->attr;
int idx = event->hw.idx;
+   u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
-   armv8pmu_enable_counter(idx);
if (armv8pmu_event_is_chained(event))
-   armv8pmu_enable_counter(idx - 1);
+   counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+   kvm_set_pmu_events(counter_bits, attr);
+
+   /* We rely on the hypervisor switch code to enable guest counters */
+   if (!kvm_pmu_counter_deferred(attr)) {
+   armv8pmu_enable_counter(idx);
+   if (armv8pmu_event_is_chained(event))
+   armv8pmu_enable_counter(idx - 1);
+   }
 }
 
 static inline int armv8pmu_disable_counter(int idx)
@@ -545,11 +556,21 @@ static inline int armv8pmu_disable_counter(int idx)
 static inline void armv8pmu_disable_event_counter(struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
+   struct perf_event_attr *attr = &event->attr;
int idx = hwc->idx;
+   u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
 
if (armv8pmu_event_is_chained(event))
-   armv8pmu_disable_counter(idx - 1);
-   armv8pmu_disable_counter(idx);
+   counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
+
+   kvm_clr_pmu_events(counter_bits);
+
+   /* We rely on the hypervisor switch code to disable guest counters */
+   if (!kvm_pmu_counter_deferred(attr)) {
+   if (armv8pmu_event_is_chained(event))
+   armv8pmu_disable_counter(idx - 1);
+   armv8pmu_disable_counter(idx);
+   }
 }
 
 static inline int armv8pmu_enable_intens(int idx)
@@ -829,11 +850,16 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
if (!attr->exclude_kernel)
config_base |= ARMV8_PMU_INCLUDE_EL2;
} else {
-   if (attr->exclude_kernel)
-   config_base |= ARMV8_PMU_EXCLUDE_EL1;
-   if (!attr->exclude_hv)
+   if (!attr->exclude_hv && !attr->exclude_host)
config_base |= ARMV8_PMU_INCLUDE_EL2;
}
+
+   /*
+* Filter out !VHE kernels and guest kernels
+*/
+   if (attr->exclude_kernel)
+   config_base |= ARMV8_PMU_EXCLUDE_EL1;
+
if (attr->exclude_user)
config_base |= ARMV8_PMU_EXCLUDE_EL0;
 
@@ -863,6 +889,9 @@ static void armv8pmu_reset(void *info)
armv8pmu_disable_intens(idx);
}
 
+   /* Clear the counters we flip at guest entry/exit */
+   kvm_clr_pmu_events(U32_MAX);
+
/*
 * Initialize & Reset PMNC. Request overflow interrupt for
 * 64 bit cycle counter but cheat in armv8pmu_write_counter().
-- 
2.21.0



Re: [PATCH v12 6/8] arm64: KVM: Enable VHE support for :G/:H perf event modifiers

2019-04-09 Thread Andrew Murray
On Tue, Apr 09, 2019 at 06:52:27PM +0100, Will Deacon wrote:
> On Thu, Mar 28, 2019 at 10:37:29AM +, Andrew Murray wrote:
> > With VHE different exception levels are used between the host (EL2) and
> > guest (EL1) with a shared exception level for userpace (EL0). We can take
> > advantage of this and use the PMU's exception level filtering to avoid
> > enabling/disabling counters in the world-switch code. Instead we just
> > modify the counter type to include or exclude EL0 at vcpu_{load,put} time.
> > 
> > We also ensure that trapped PMU system register writes do not re-enable
> > EL0 when reconfiguring the backing perf events.
> > 
> > This approach completely avoids blackout windows seen with !VHE.
> > 
> > Suggested-by: Christoffer Dall 
> > Signed-off-by: Andrew Murray 
> > ---
> >  arch/arm/include/asm/kvm_host.h   |  3 ++
> >  arch/arm64/include/asm/kvm_host.h |  5 +-
> >  arch/arm64/kernel/perf_event.c|  6 ++-
> >  arch/arm64/kvm/pmu.c  | 87 ++-
> >  arch/arm64/kvm/sys_regs.c |  3 ++
> >  virt/kvm/arm/arm.c|  2 +
> >  6 files changed, 102 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_host.h 
> > b/arch/arm/include/asm/kvm_host.h
> > index 427c28be6452..481411295b3b 100644
> > --- a/arch/arm/include/asm/kvm_host.h
> > +++ b/arch/arm/include/asm/kvm_host.h
> > @@ -365,6 +365,9 @@ static inline void kvm_arch_vcpu_load_fp(struct 
> > kvm_vcpu *vcpu) {}
> >  static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
> >  static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
> >  
> > +static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
> > +
> >  static inline void kvm_arm_vhe_guest_enter(void) {}
> >  static inline void kvm_arm_vhe_guest_exit(void) {}
> >  
> > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > b/arch/arm64/include/asm/kvm_host.h
> > index a3bfb75f0be9..4f290dad3a48 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -528,7 +528,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> >  
> >  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
> >  {
> > -   return attr->exclude_host;
> > +   return (!has_vhe() && attr->exclude_host);
> >  }
> >  
> >  #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
> > @@ -542,6 +542,9 @@ void kvm_clr_pmu_events(u32 clr);
> >  
> >  void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
> >  bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);
> > +
> > +void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> > +void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> >  #else
> >  static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr 
> > *attr) {}
> >  static inline void kvm_clr_pmu_events(u32 clr) {}
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 6bb28aaf5aea..314b1adedf06 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -847,8 +847,12 @@ static int armv8pmu_set_event_filter(struct 
> > hw_perf_event *event,
> >  * with other architectures (x86 and Power).
> >  */
> > if (is_kernel_in_hyp_mode()) {
> > -   if (!attr->exclude_kernel)
> > +   if (!attr->exclude_kernel && !attr->exclude_host)
> > config_base |= ARMV8_PMU_INCLUDE_EL2;
> > +   if (attr->exclude_guest)
> > +   config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > +   if (attr->exclude_host)
> > +   config_base |= ARMV8_PMU_EXCLUDE_EL0;
> > } else {
> > if (!attr->exclude_hv && !attr->exclude_host)
> > config_base |= ARMV8_PMU_INCLUDE_EL2;
> 
> I still don't really like these semantics, but it's consistent and
> you're documenting it so:
> 
> Acked-by: Will Deacon 

Much appreciated.

Thanks,

Andrew Murray

> 
> Will


Re: [PATCH v12 6/8] arm64: KVM: Enable VHE support for :G/:H perf event modifiers

2019-04-09 Thread Will Deacon
On Thu, Mar 28, 2019 at 10:37:29AM +, Andrew Murray wrote:
> With VHE different exception levels are used between the host (EL2) and
> guest (EL1) with a shared exception level for userpace (EL0). We can take
> advantage of this and use the PMU's exception level filtering to avoid
> enabling/disabling counters in the world-switch code. Instead we just
> modify the counter type to include or exclude EL0 at vcpu_{load,put} time.
> 
> We also ensure that trapped PMU system register writes do not re-enable
> EL0 when reconfiguring the backing perf events.
> 
> This approach completely avoids blackout windows seen with !VHE.
> 
> Suggested-by: Christoffer Dall 
> Signed-off-by: Andrew Murray 
> ---
>  arch/arm/include/asm/kvm_host.h   |  3 ++
>  arch/arm64/include/asm/kvm_host.h |  5 +-
>  arch/arm64/kernel/perf_event.c|  6 ++-
>  arch/arm64/kvm/pmu.c  | 87 ++-
>  arch/arm64/kvm/sys_regs.c |  3 ++
>  virt/kvm/arm/arm.c|  2 +
>  6 files changed, 102 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 427c28be6452..481411295b3b 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -365,6 +365,9 @@ static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu 
> *vcpu) {}
>  static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
>  
> +static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
> +
>  static inline void kvm_arm_vhe_guest_enter(void) {}
>  static inline void kvm_arm_vhe_guest_exit(void) {}
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index a3bfb75f0be9..4f290dad3a48 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -528,7 +528,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
>  
>  static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
>  {
> - return attr->exclude_host;
> + return (!has_vhe() && attr->exclude_host);
>  }
>  
>  #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
> @@ -542,6 +542,9 @@ void kvm_clr_pmu_events(u32 clr);
>  
>  void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
>  bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);
> +
> +void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
>  #else
>  static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) 
> {}
>  static inline void kvm_clr_pmu_events(u32 clr) {}
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 6bb28aaf5aea..314b1adedf06 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -847,8 +847,12 @@ static int armv8pmu_set_event_filter(struct 
> hw_perf_event *event,
>* with other architectures (x86 and Power).
>*/
>   if (is_kernel_in_hyp_mode()) {
> - if (!attr->exclude_kernel)
> + if (!attr->exclude_kernel && !attr->exclude_host)
>   config_base |= ARMV8_PMU_INCLUDE_EL2;
> + if (attr->exclude_guest)
> + config_base |= ARMV8_PMU_EXCLUDE_EL1;
> + if (attr->exclude_host)
> + config_base |= ARMV8_PMU_EXCLUDE_EL0;
>   } else {
>   if (!attr->exclude_hv && !attr->exclude_host)
>   config_base |= ARMV8_PMU_INCLUDE_EL2;

I still don't really like these semantics, but it's consistent and
you're documenting it so:

Acked-by: Will Deacon 

Will


Re: [PATCH] kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP

2019-04-09 Thread Suzuki K Poulose

Hi Zenghui

On 04/09/2019 09:05 AM, Zenghui Yu wrote:



On 2019/4/9 2:40, Suzuki K Poulose wrote:

Hi Zenghui,

On 04/08/2019 04:11 PM, Zenghui Yu wrote:

Hi Suzuki,

Thanks for the reply.



...


Hi Suzuki,

Why not make use of fault_supports_stage2_huge_mapping()?  Let it do
some checks for us.

fault_supports_stage2_huge_mapping() was intended to do a *two-step*
check to tell us whether we can create stage2 huge block mappings, and this
check is both for hugetlbfs and THP.  With commit a80868f398554842b14,
we pass PAGE_SIZE as "map_size" for normal size pages (which turned out
to be almost meaningless), and unfortunately the THP check no longer
works.


That's correct.



So we want to rework the *THP* check process.  Your patch fixes the first
checking step, but the second is still missed, am I wrong?


It fixes the step explicitly for THP by making sure that the GPA and
the HVA are aligned to the map size.


Yes, I understand how your patch fixed the issue.  But what I'm
really concerned about here is the *second* checking step in
fault_supports_stage2_huge_mapping().

We have to check whether we are mapping a non-block-aligned or non-block-sized
memslot; if so, we cannot create block mappings for the beginning
and end of this memslot.  This is what the second part of
fault_supports_stage2_huge_mapping() does.

I haven't seen this checking-step in your patch, did I miss something?



I see.


I don't think this calls for a VM_BUG_ON(). It is simply a case where
the GPA is not aligned to the HVA, but for a normal VMA that could be made THP.


We had this VM_BUG_ON(), which would never have hit because we would
have set force_pte if they were not aligned.


Yes, I agree.


+    /* Skip memslots with unaligned IPA and user address */
+    if ((gfn & mask) != (pfn & mask))
+    return false;
  if (pfn & mask) {
  *ipap &= PMD_MASK;
  kvm_release_pfn_clean(pfn);



---8<---

Rework fault_supports_stage2_huge_mapping(), let it check THP again.

Signed-off-by: Zenghui Yu 
---
  virt/kvm/arm/mmu.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..5e1b258 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1632,6 +1632,15 @@ static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,

  uaddr_end = uaddr_start + size;

  /*
+ * If the memslot is _not_ backed by hugetlbfs, then check if it
+ * can be backed by transparent hugepages.
+ *
+ * Currently only PMD_SIZE THPs are supported, revisit it later.
+ */
+    if (map_size == PAGE_SIZE)
+    map_size = PMD_SIZE;
+


This looks hackish. What if we support PUD_SIZE huge pages in the future?


Yes, this might make the code a little difficult to understand. But by
doing so, we follow the same logic as before commit a80868f398554842b14;
that is, we do the two-step checking for normal size pages in
fault_supports_stage2_huge_mapping() to decide if we can create THP
mappings for these pages.

As for PUD_SIZE THPs, to be honest, I have no idea now :(


How about the following diff ?

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 97b5417..98e5cec 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1791,7 +1791,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,

   * currently supported. This code will need to be
   * updated to support other THP sizes.
   */
-    if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+    if (fault_supports_stage2_huge_mappings(memslot, hva, PMD_SIZE) &&
+        transparent_hugepage_adjust(&pfn, &fault_ipa))
  vma_pagesize = PMD_SIZE;
  }


I think this is good enough for the issue.

(One minor concern: With this change, it seems that we no longer need
"force_pte" and can just use "logging_active" instead. But this is not
much related to what we're fixing.)


I would still leave the force_pte there to avoid checking for a THP case
in a situation where we are forced to a PTE-level mapping on a hugepage-backed
VMA. It would serve to avoid another check.

Cheers
Suzuki





thanks.






Re: [PATCH v12 8/8] arm64: docs: document perf event attributes

2019-04-09 Thread Andrew Murray
On Fri, Apr 05, 2019 at 01:43:08PM +0100, Will Deacon wrote:
> On Thu, Apr 04, 2019 at 08:33:51PM +0100, Andrew Murray wrote:
> > On Thu, Apr 04, 2019 at 05:21:28PM +0100, Will Deacon wrote:
> > > On Thu, Mar 28, 2019 at 10:37:31AM +, Andrew Murray wrote:
> > > > +exclude_kernel
> > > > +--
> > > > +
> > > > +This attribute excludes the kernel.
> > > > +
> > > > +The kernel runs at EL2 with VHE and EL1 without. Guest kernels always 
> > > > run
> > > > +at EL1.
> > > > +
> > > > +This attribute will exclude EL1 and additionally EL2 on a VHE system.
> > > 
> > > I find this last sentence a bit confusing, because it can be read to imply
> > > that if you don't set exclude_kernel and you're in a guest on a VHE 
> > > system,
> > > then you can profile EL2.
> > 
> > Yes this could be misleading.
> > 
> > However from the perspective of the guest, when exclude_kernel is not set we
> > do indeed allow the guest to program it's PMU with ARMV8_PMU_INCLUDE_EL2 - 
> > and
> > thus the statement above is correct in terms of what the kernel believes it 
> > is
> > doing.
> > 
> > I think these statements are less confusing if we treat the exception levels
> > as those 'detected' by the running context (e.g. consider the impact of 
> > nested
> > virt here) - and we if ignore what the hypervisor (KVM) does outside (e.g.
> > stops counting upon switching between guest/host, translating PMU filters in
> > kvm_pmu_set_counter_event_type etc, etc). This then makes this document 
> > useful
> > for those wishing to change this logic (which is the intent) rather than 
> > those
> > trying to understand how we filter for EL levels as seen bare-metal.
> > 
> > With regards to the example you gave (exclude_kernel, EL2) - yes we want the
> > kernel to believe it can count EL2 - because one day we may want to update
> > KVM to allow the guest to count it's hypervisor overhead (e.g. host kernel
> > time associated with the guest).
> 
> If we were to support this in the future, then exclude_hv will suddenly
> start meaning something in a guest, so this could be considered to be an ABI
> break.
> 
> > I could write some preface that describes this outlook. Alternatively I 
> > could
> > just spell out what happens on a guest, e.g.
> > 
> > "For the host this attribute will exclude EL1 and additionally EL2 on a VHE
> > system.
> > 
> > For the guest this attribute will exclude EL1."
> > 
> > Though I'm less comfortable with this, as the last statement "For the guest 
> > this
> > attribute will exclude EL1." describes the product of both
> > kvm_pmu_set_counter_event_type and armv8pmu_set_event_filter which is 
> > confusing
> > to work out and also makes an assumption that we don't have nested virt 
> > (true
> > for now at least) and also reasons about bare-metal EL levels which probably
> > aren't that useful for someone changing this logic or understanding what the
> > flags do for there performance analysis.
> > 
> > Do you have a preference for how this is improved?
> 
> I think you should be explicit about what is counted. If we don't count EL2
> when profiling in a guest (regardless of the exclude_*) flags, then we
> should say that. By not documenting this we don't actually buy ourselves
> room to change things in future, we should have an emergent behaviour which
> isn't covered by our docs.

OK no problem, I'll update this.

Andrew Murray

> 
> Will


Re: [PATCH v12 2/8] arm64: KVM: encapsulate kvm_cpu_context in kvm_host_data

2019-04-09 Thread Andrew Murray
On Thu, Mar 28, 2019 at 03:28:37PM +, Suzuki K Poulose wrote:
> On 03/28/2019 10:37 AM, Andrew Murray wrote:
> > The virt/arm core allocates a kvm_cpu_context_t percpu, at present this is
> > a typedef to kvm_cpu_context and is used to store host cpu context. The
> > kvm_cpu_context structure is also used elsewhere to hold vcpu context.
> > In order to use the percpu to hold additional future host information we
> > encapsulate kvm_cpu_context in a new structure and rename the typedef and
> > percpu to match.
> > 
> > Signed-off-by: Andrew Murray 
> > ---
> >   arch/arm/include/asm/kvm_host.h   |  8 ++--
> >   arch/arm64/include/asm/kvm_asm.h  |  3 ++-
> >   arch/arm64/include/asm/kvm_host.h | 16 ++--
> >   arch/arm64/kernel/asm-offsets.c   |  1 +
> >   virt/kvm/arm/arm.c| 14 --
> >   5 files changed, 27 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_host.h 
> > b/arch/arm/include/asm/kvm_host.h
> > index 770d73257ad9..427c28be6452 100644
> > --- a/arch/arm/include/asm/kvm_host.h
> > +++ b/arch/arm/include/asm/kvm_host.h
> > @@ -150,7 +150,11 @@ struct kvm_cpu_context {
> > u32 cp15[NR_CP15_REGS];
> >   };
> > -typedef struct kvm_cpu_context kvm_cpu_context_t;
> > +struct kvm_host_data {
> > +   struct kvm_cpu_context host_ctxt;
> > +};
> > +
> > +typedef struct kvm_host_data kvm_host_data_t;
> >   static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
> >  int cpu)
> 
> We need to fix this function prototype to accept struct kvm_cpu_context,
> instead of the now removed kvm_cpu_context_t, to prevent a build break
> on arm32 ?

Yes this was a breakage, thanks for pointing this out.

Thanks,

Andrew Murray

> 
> With that :
> 
> Reviewed-by: Suzuki K Poulose 


Re: [kvm-unit-tests PATCH] lib: arm: Use correct halt() prototype from smp.h

2019-04-09 Thread Andrew Jones
On Tue, Apr 09, 2019 at 10:15:59AM +0100, Alexandru Elisei wrote:
> On 4/9/19 8:40 AM, Andrew Jones wrote:
> > On Mon, Apr 08, 2019 at 04:11:25PM +0100, Alexandru Elisei wrote:
> >> The prototype for the halt() function is incorrect, because halt() doesn't
> >> take any arguments. Fix it by using the prototype from smp.h.
> >>
> >> Signed-off-by: Alexandru Elisei 
> >> ---
> >>  lib/arm/io.c | 5 ++---
> >>  1 file changed, 2 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/lib/arm/io.c b/lib/arm/io.c
> >> index 8226b765bdc5..6d3d7afed002 100644
> >> --- a/lib/arm/io.c
> >> +++ b/lib/arm/io.c
> >> @@ -15,11 +15,10 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >>  
> >>  #include "io.h"
> >>  
> >> -extern void halt(int code);
> >> -
> >>  static struct spinlock uart_lock;
> >>  /*
> >>   * Use this guess for the uart base in order to make an attempt at
> >> @@ -93,6 +92,6 @@ void exit(int code)
> >>  {
> >>chr_testdev_exit(code);
> >>psci_system_off();
> >> -  halt(code);
> >> +  halt();
> >>__builtin_unreachable();
> >>  }
> >> -- 
> >> 2.17.0
> >>
> > I don't mind this change, because per the code it is the "correct"
> > thing to do. However, I was being a bit tricky here when I wrote it.
> > By changing the prototype to take 'code' as argument we guarantee
> > that 'code' will be in x0/r0 when we halt, giving us a last chance
> > to see it when inspecting the halted unit test state.
> >
> > Anyway, like I said, I'm fine with the cleanup, but the prototype
> > abuse does serve a purpose - maybe just not a good enough purpose
> > to justify the weirdness.
> 
> Now it makes sense. I didn't think it was intentional, but now that you have
> mentioned it, I see the same pattern is used by powerpc.

No surprise there. I wrote that too, based on the arm code :)

> 
> Perhaps a comment explaining that having different prototypes was on purpose
> would be the best solution?

Yes, either a comment explaining the weirdness or your patch to
remove it would be a good idea in order to avoid future head
scratching. I'll send a patch that adds comments to both arm and
powerpc.
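
Something along these lines, presumably (a sketch of the kind of comment being
proposed, not the actual follow-up patch):

/* lib/arm/io.c */

/*
 * halt() takes no arguments, but it is deliberately declared here with an
 * 'int code' parameter: exit() passes the exit code through, so the value
 * is still sitting in r0/x0 when the test halts and can be read back when
 * inspecting the stopped unit test.
 */
extern void halt(int code);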

Thanks,
drew


Re: [kvm-unit-tests PATCH] lib: arm: Use correct halt() prototype from smp.h

2019-04-09 Thread Alexandru Elisei
On 4/9/19 8:40 AM, Andrew Jones wrote:
> On Mon, Apr 08, 2019 at 04:11:25PM +0100, Alexandru Elisei wrote:
>> The prototype for the halt() function is incorrect, because halt() doesn't
>> take any arguments. Fix it by using the prototype from smp.h.
>>
>> Signed-off-by: Alexandru Elisei 
>> ---
>>  lib/arm/io.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/arm/io.c b/lib/arm/io.c
>> index 8226b765bdc5..6d3d7afed002 100644
>> --- a/lib/arm/io.c
>> +++ b/lib/arm/io.c
>> @@ -15,11 +15,10 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include "io.h"
>>  
>> -extern void halt(int code);
>> -
>>  static struct spinlock uart_lock;
>>  /*
>>   * Use this guess for the uart base in order to make an attempt at
>> @@ -93,6 +92,6 @@ void exit(int code)
>>  {
>>  chr_testdev_exit(code);
>>  psci_system_off();
>> -halt(code);
>> +halt();
>>  __builtin_unreachable();
>>  }
>> -- 
>> 2.17.0
>>
> I don't mind this change, because per the code it is the "correct"
> thing to do. However, I was being a bit tricky here when I wrote it.
> By changing the prototype to take 'code' as argument we guarantee
> that 'code' will be in x0/r0 when we halt, giving us a last chance
> to see it when inspecting the halted unit test state.
>
> Anyway, like I said, I'm fine with the cleanup, but the prototype
> abuse does serve a purpose - maybe just not a good enough purpose
> to justify the weirdness.

Now it makes sense. I didn't think it was intentional, but now that you have
mentioned it, I see the same pattern is used by powerpc.

Perhaps a comment explaining that having different prototypes was on purpose
would be the best solution?

>
> Thanks,
> drew


Re: [PATCH v8 4/9] KVM: arm/arm64: preserve host HCR_EL2 value

2019-04-09 Thread Marc Zyngier
On 08/04/2019 19:39, Kristina Martsenko wrote:
> On 08/04/2019 14:05, Amit Daniel Kachhap wrote:
>> Hi James,
>>
>> On 4/6/19 4:07 PM, James Morse wrote:
>>> Hi Amit,
>>>
>>> On 02/04/2019 03:27, Amit Daniel Kachhap wrote:
 From: Mark Rutland 

 When restoring HCR_EL2 for the host, KVM uses HCR_HOST_VHE_FLAGS, which
 is a constant value. This works today, as the host HCR_EL2 value is
 always the same, but this will get in the way of supporting extensions
 that require HCR_EL2 bits to be set conditionally for the host.

 To allow such features to work without KVM having to explicitly handle
 every possible host feature combination, this patch has KVM save/restore
 for the host HCR when switching to/from a guest HCR. The saving of the
 register is done once during CPU hypervisor initialization and the value is
 simply restored after each switch back from the guest.

 For fetching HCR_EL2 during kvm initialisation, a hyp call is made using
 kvm_call_hyp, which is helpful in the non-VHE case.

 For the hyp TLB maintenance code, __tlb_switch_to_host_vhe() is updated
 to toggle the TGE bit with an RMW sequence, as we already do in
 __tlb_switch_to_guest_vhe().

 The value of hcr_el2 is now stored in struct kvm_cpu_context as both host
 and guest can now use this field in a common way.
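
For readers following along, the RMW sequence being described is the usual
pattern (sketched from the commit message above, not copied from the series):

/* __tlb_switch_to_host_vhe(): set TGE back with a read-modify-write rather
 * than writing the HCR_HOST_VHE_FLAGS constant, so whatever host HCR_EL2
 * value was saved at init time is preserved. */
u64 val;

val = read_sysreg(hcr_el2);
val |= HCR_TGE;
write_sysreg(val, hcr_el2);
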
>>>
>>> These HCR_EL2 flags have had me confused for quite a while.
>>> I thought this was preserving the value that head.S or cpufeature.c had 
>>> set, and with
>>> ptrauth we couldn't know what this register should be anymore, so the host
>>> flags would have to vary.
>>>
>>> Kristina's explanation of it[0] clarified things, and with a bit more 
>>> digging it appears
>>> we always set API/APK, even if the hardware doesn't support the feature (as 
>>> it's harmless).
>>> So we don't need to vary the host flags...
>>
>> API/APK is always set for NVHE host mode.
>>>
>>> My question is, what breaks if this patch isn't merged? (the MDCR change is 
>>> cleanup we can
>>> do because of this HCR change), is this HCR change just cleanup too? If so, 
>>> can we merge
>>> ptrauth without either, so we only make the change when it's needed? (it 
>>> will cause some
>>> changes in your patch 7, but I can't see where you depend on the host 
>>> flags).
>>
>> Yes, you are right that this patch does not directly affect pointer 
>> authentication functionality but contains several optimizations and cleanups 
>> such as,
>>
>> * Removes assigning static flags HCR_HOST_VHE_FLAGS/HCR_HOST_NVHE_FLAGS from 
>> switch.c so switching functions now are more generic in nature.
>> * Currently the variation in hcr_el2 flags is across modes (VHE/NVHE). Any 
>> future conditional change to the host HCR_EL2 within those modes may not 
>> require code changes in switch.c.
>> * The save of hcr_el2 is done once at hyp init time, so it adds no cost on 
>> the switching path.
>>
>> I am fine with posting it separately as well.
> 
> FWIW I think it makes sense to post the HCR and MDCR patches separately
> from this series. That should make it clear that pointer auth does not
> depend on these changes, and should make it easier to evaluate the
> changes on their own.
> 
> Others' opinions are welcome as well.

Agreed. I'm quite eager to move forward with this series, and the fewer
unrelated changes it makes, the better. Cleanups and optimizations can
always be merged at a later time.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH] kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP

2019-04-09 Thread Zenghui Yu



On 2019/4/9 2:40, Suzuki K Poulose wrote:

Hi Zenhui,

On 04/08/2019 04:11 PM, Zenghui Yu wrote:

Hi Suzuki,

Thanks for the reply.



...


Hi Suzuki,

Why not make use of fault_supports_stage2_huge_mapping()?  Let it do
some checks for us.

fault_supports_stage2_huge_mapping() was intended to do a *two-step*
check to tell us whether we can create stage2 huge block mappings, and this
check is both for hugetlbfs and THP.  With commit a80868f398554842b14,
we pass PAGE_SIZE as "map_size" for normal size pages (which turned out
to be almost meaningless), and unfortunately the THP check no longer
works.


Thats correct.



So we want to rework the *THP* check process.  Your patch fixes the first
checking-step, but the second is still missing, am I wrong?


It fixes the step explicitly for the THP by making sure that the GPA and
the HVA are aligned to the map size.


Yes, I understand how your patch had fixed the issue.  But what I'm
really concerned about here is the *second* checking-step in
fault_supports_stage2_huge_mapping().

We have to check if we are mapping a non-block aligned or non-block
sized memslot; if so, we cannot create block mappings for the beginning
and end of this memslot.  This is what the second part of
fault_supports_stage2_huge_mapping() had done.

I haven't seen this checking-step in your patch, did I miss something?



I see.


I don't think this calls for a VM_BUG_ON(). It is simply a case where
the GPA is not aligned to HVA, but for normal VMA that could be made 
THP.


We had this VM_BUG_ON(), which would have never hit because we would
have set force_pte if they were not aligned.


Yes, I agree.


+    /* Skip memslots with unaligned IPA and user address */
+    if ((gfn & mask) != (pfn & mask))
+    return false;
  if (pfn & mask) {
  *ipap &= PMD_MASK;
  kvm_release_pfn_clean(pfn);



---8>---

Rework fault_supports_stage2_huge_mapping(), let it check THP again.

Signed-off-by: Zenghui Yu 
---
  virt/kvm/arm/mmu.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..5e1b258 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1632,6 +1632,15 @@ static bool 
fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,

  uaddr_end = uaddr_start + size;

  /*
+ * If the memslot is _not_ backed by hugetlbfs, then check if it
+ * can be backed by transparent hugepages.
+ *
+ * Currently only PMD_SIZE THPs are supported, revisit it later.
+ */
+    if (map_size == PAGE_SIZE)
+    map_size = PMD_SIZE;
+


This looks hackish. What if we support PUD_SIZE huge pages in the future?


Yes, this might make the code a little difficult to understand. But by
doing so, we follow the same logic as before commit a80868f398554842b14;
that is, we do the two-step checking for normal size pages in
fault_supports_stage2_huge_mapping(), to decide if we can create THP
mappings for these pages.

As for PUD_SIZE THPs, to be honest, I have no idea now :(


How about the following diff ?

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 97b5417..98e5cec 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1791,7 +1791,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,

   * currently supported. This code will need to be
   * updated to support other THP sizes.
   */
-    if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+    if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
+        transparent_hugepage_adjust(&pfn, &fault_ipa))
  vma_pagesize = PMD_SIZE;
  }


I think this is good enough for the issue.

(One minor concern: With this change, it seems that we no longer need
"force_pte" and can just use "logging_active" instead. But this is not
much related to what we're fixing.)
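
As a rough illustration of that possible cleanup (my sketch on top of the hunk
above, not something posted in this thread):

        /* user_mem_abort(): with the memslot check done up front, force_pte
         * could arguably go away and logging_active be used directly. */
        if (vma_pagesize == PAGE_SIZE && !logging_active) {
                if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) &&
                    transparent_hugepage_adjust(&pfn, &fault_ipa))
                        vma_pagesize = PMD_SIZE;
        }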


thanks.




Re: [kvm-unit-tests PATCH] lib: arm: Use correct halt() prototype from smp.h

2019-04-09 Thread Andrew Jones
On Mon, Apr 08, 2019 at 04:11:25PM +0100, Alexandru Elisei wrote:
> The prototype for the halt() function is incorrect, because halt() doesn't
> take any arguments. Fix it by using the prototype from smp.h.
> 
> Signed-off-by: Alexandru Elisei 
> ---
>  lib/arm/io.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/arm/io.c b/lib/arm/io.c
> index 8226b765bdc5..6d3d7afed002 100644
> --- a/lib/arm/io.c
> +++ b/lib/arm/io.c
> @@ -15,11 +15,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "io.h"
>  
> -extern void halt(int code);
> -
>  static struct spinlock uart_lock;
>  /*
>   * Use this guess for the uart base in order to make an attempt at
> @@ -93,6 +92,6 @@ void exit(int code)
>  {
>   chr_testdev_exit(code);
>   psci_system_off();
> - halt(code);
> + halt();
>   __builtin_unreachable();
>  }
> -- 
> 2.17.0
>

I don't mind this change, because per the code it is the "correct"
thing to do. However, I was being a bit tricky here when I wrote it.
By changing the prototype to take 'code' as argument we guarantee
that 'code' will be in x0/r0 when we halt, giving us a last chance
to see it when inspecting the halted unit test state.

Anyway, like I said, I'm fine with the cleanup, but the prototype
abuse does serve a purpose - maybe just not a good enough purpose
to justify the weirdness.

Thanks,
drew