Re: [PATCH 2/4] KVM: arm64: Use find_vma_intersection()

2021-03-15 Thread Keqian Zhu
Hi Gavin,

On 2021/3/16 11:52, Gavin Shan wrote:
> Hi Keqian,
> 
> On 3/15/21 8:42 PM, Gavin Shan wrote:
>> On 3/15/21 7:04 PM, Keqian Zhu wrote:
>>> On 2021/3/15 12:18, Gavin Shan wrote:
>>>> find_vma_intersection() has been existing to search the intersected
>>>> vma. This uses the function where it's applicable, to simplify the
>>>> code.
>>>>
>>>> Signed-off-by: Gavin Shan 
>>>> ---
>>>>   arch/arm64/kvm/mmu.c | 10 ++
>>>>   1 file changed, 6 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>>> index 84e70f953de6..286b603ed0d3 100644
>>>> --- a/arch/arm64/kvm/mmu.c
>>>> +++ b/arch/arm64/kvm/mmu.c
>>>> @@ -421,10 +421,11 @@ static void stage2_unmap_memslot(struct kvm *kvm,
>>>>  * ++
>>>>  */
>>>>   do {
>>>> -struct vm_area_struct *vma = find_vma(current->mm, hva);
>>>> +struct vm_area_struct *vma;
>>>>   hva_t vm_start, vm_end;
>>>> -if (!vma || vma->vm_start >= reg_end)
>>>> +vma = find_vma_intersection(current->mm, hva, reg_end);
>>> Nit: Keeping the same style may be better (assign vma when declaring it).
>>> Otherwise it looks good to me.
>>>
>>
>> Yeah, I agree. I will adjust the code in v2 and include your r-b.
>> Thanks for taking the time to review.
>>
> 
> After rechecking the code, I think it'd be better to keep the current
> style because there is a follow-on validation on @vma. Keeping them
> together seems a good idea. I think it won't be a big deal to you, so I
> will keep the current style with your r-b in v2.
Sure, both are OK. ;-)

Thanks,
Keqian
> 
> vma = find_vma_intersection(current->mm, hva, reg_end);
> if (!vma)
>  break;
> Thanks,
> Gavin
>  
>>>> +if (!vma)
>>>>   break;
>>>>   /*
>>>> @@ -1330,10 +1331,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>>>>  * ++
>>>>  */
>>>>   do {
>>>> -struct vm_area_struct *vma = find_vma(current->mm, hva);
>>>> +struct vm_area_struct *vma;
>>>>   hva_t vm_start, vm_end;
>>>> -if (!vma || vma->vm_start >= reg_end)
>>>> +vma = find_vma_intersection(current->mm, hva, reg_end);
>>>> +if (!vma)
>>>>   break;
>>>>   /*



[PATCH v2 2/3] KVM: arm64: Use find_vma_intersection()

2021-03-15 Thread Gavin Shan
find_vma_intersection() already exists to search for a vma that
intersects the given address range. Use it where applicable to simplify
the code.

Signed-off-by: Gavin Shan 
Reviewed-by: Keqian Zhu 
---
 arch/arm64/kvm/mmu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 28f3b3736dc8..192e0df2fc8e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -421,10 +421,11 @@ static void stage2_unmap_memslot(struct kvm *kvm,
 * ++
 */
do {
-   struct vm_area_struct *vma = find_vma(current->mm, hva);
+   struct vm_area_struct *vma;
hva_t vm_start, vm_end;
 
-   if (!vma || vma->vm_start >= reg_end)
+   vma = find_vma_intersection(current->mm, hva, reg_end);
+   if (!vma)
break;
 
/*
@@ -1329,10 +1330,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 * ++
 */
do {
-   struct vm_area_struct *vma = find_vma(current->mm, hva);
+   struct vm_area_struct *vma;
hva_t vm_start, vm_end;
 
-   if (!vma || vma->vm_start >= reg_end)
+   vma = find_vma_intersection(current->mm, hva, reg_end);
+   if (!vma)
break;
 
/*
-- 
2.23.0
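
For reference, find_vma_intersection() is a small inline helper from
include/linux/mm.h; at the time of this series it was essentially the
open-coded pattern being replaced (a sketch, not a verbatim copy):

    /* Find the first VMA overlapping [start_addr, end_addr), or NULL. */
    static inline struct vm_area_struct *
    find_vma_intersection(struct mm_struct *mm, unsigned long start_addr,
                          unsigned long end_addr)
    {
            struct vm_area_struct *vma = find_vma(mm, start_addr);

            if (vma && end_addr <= vma->vm_start)
                    vma = NULL;
            return vma;
    }

This is why the conversion is behaviour-preserving: the old check
"!vma || vma->vm_start >= reg_end" is exactly the NULL result above.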



[PATCH v2 3/3] KVM: arm64: Don't retrieve memory slot again in page fault handler

2021-03-15 Thread Gavin Shan
We needn't retrieve the memory slot again in user_mem_abort() because
the corresponding memory slot is passed in by the caller. This saves
some CPU cycles. For example, the time used to write 1GB of memory,
which is backed by 2MB hugetlb pages and write-protected, drops by
6.8%, from 928ms to 864ms.

Signed-off-by: Gavin Shan 
Reviewed-by: Keqian Zhu 
---
 arch/arm64/kvm/mmu.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 192e0df2fc8e..2491b40a294a 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -843,10 +843,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 * unmapped afterwards, the call to kvm_unmap_hva will take it away
 * from us again properly. This smp_rmb() interacts with the smp_wmb()
 * in kvm_mmu_notifier_invalidate_.
+*
+* Besides, __gfn_to_pfn_memslot() is used instead of gfn_to_pfn_prot()
+* to avoid the unnecessary overhead of locating the memory slot, which
+* is always fixed even when @gfn is adjusted for huge pages.
 */
smp_rmb();
 
-   pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writable);
+   pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
+  write_fault, &writable, NULL);
if (pfn == KVM_PFN_ERR_HWPOISON) {
kvm_send_hwpoison_signal(hva, vma_shift);
return 0;
@@ -912,7 +917,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
/* Mark the page dirty only if the fault is handled successfully */
if (writable && !ret) {
kvm_set_pfn_dirty(pfn);
-   mark_page_dirty(kvm, gfn);
+   mark_page_dirty_in_slot(kvm, memslot, gfn);
}
 
 out_unlock:
-- 
2.23.0
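
For context, gfn_to_pfn_prot() is (roughly) a thin wrapper that
re-derives the memslot from the gfn on every call; a sketch based on
the generic KVM code of that era:

    kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
                              bool *writable)
    {
            /* gfn_to_memslot() is the per-fault lookup being avoided. */
            return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn,
                                        false, NULL, write_fault,
                                        writable, NULL);
    }

Passing the memslot that user_mem_abort() already holds skips the
gfn_to_memslot() lookup, which is where the quoted 6.8% comes from.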



[PATCH v2 1/3] KVM: arm64: Hide kvm_mmu_wp_memory_region()

2021-03-15 Thread Gavin Shan
We needn't expose the function as it's only used by mmu.c since it
was introduced by commit c64735554c0a ("KVM: arm: Add initial dirty
page locking support").

Signed-off-by: Gavin Shan 
Reviewed-by: Keqian Zhu 
---
 arch/arm64/include/asm/kvm_host.h | 1 -
 arch/arm64/kvm/mmu.c  | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3d10e6527f7d..688f2df1957b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -632,7 +632,6 @@ void kvm_arm_resume_guest(struct kvm *kvm);
})
 
 void force_vm_exit(const cpumask_t *mask);
-void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, int exception_index);
 void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8711894db8c2..28f3b3736dc8 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -555,7 +555,7 @@ static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
  * serializing operations for VM memory regions.
  */
-void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
+static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 {
struct kvm_memslots *slots = kvm_memslots(kvm);
struct kvm_memory_slot *memslot = id_to_memslot(slots, slot);
-- 
2.23.0



[PATCH v2 0/3] KVM: arm64: Minor page fault handler improvement

2021-03-15 Thread Gavin Shan
The series includes several minor improvements to the stage-2 page
fault handler: PATCH[1/2] clean up the code, and PATCH[3] avoids
retrieving the memory slot again in the page fault handler to save some
CPU cycles.

Changelog
=========
v2:
   * Rebased to 5.12-rc3 and included r-bs from Keqian     (Gavin)
   * Dropped the patch fixing the IPA limit boundary issue (Keqian)
   * Added comments on why __gfn_to_pfn_memslot() is used  (Keqian)

Gavin Shan (3):
  KVM: arm64: Hide kvm_mmu_wp_memory_region()
  KVM: arm64: Use find_vma_intersection()
  KVM: arm64: Don't retrieve memory slot again in page fault handler

 arch/arm64/include/asm/kvm_host.h |  1 -
 arch/arm64/kvm/mmu.c  | 21 ++++++++++++++-------
 2 files changed, 14 insertions(+), 8 deletions(-)

-- 
2.23.0





Re: [RFC PATCH 3/4] KVM: stats: Add ioctl commands to pull statistics in binary format

2021-03-15 Thread Jing Zhang
Hi Paolo,

On Wed, Mar 10, 2021 at 8:55 AM Paolo Bonzini  wrote:
>
> On 10/03/21 01:30, Jing Zhang wrote:
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 383df23514b9..87dd62516c8b 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -3464,6 +3464,51 @@ static long kvm_vcpu_ioctl(struct file *filp,
> >   r = kvm_arch_vcpu_ioctl_set_fpu(vcpu, fpu);
> >   break;
> >   }
> > + case KVM_STATS_GET_INFO: {
> > + struct kvm_stats_info stats_info;
> > +
> > + r = -EFAULT;
> > + stats_info.num_stats = VCPU_STAT_COUNT;
> > + if (copy_to_user(argp, &stats_info, sizeof(stats_info)))
> > + goto out;
> > + r = 0;
> > + break;
> > + }
> > + case KVM_STATS_GET_NAMES: {
> > + struct kvm_stats_names stats_names;
> > +
> > + r = -EFAULT;
> > + if (copy_from_user(&stats_names, argp, sizeof(stats_names)))
> > + goto out;
> > + r = -EINVAL;
> > + if (stats_names.size < VCPU_STAT_COUNT * KVM_STATS_NAME_LEN)
> > + goto out;
> > +
> > + r = -EFAULT;
> > + if (copy_to_user(argp + sizeof(stats_names),
> > + kvm_vcpu_stat_strings,
> > + VCPU_STAT_COUNT * KVM_STATS_NAME_LEN))
>
> The only reason to separate the strings in patch 1 is to pass them here.
>   But this is a poor API because it imposes a limit on the length of the
> statistics, and makes that length part of the binary interface.
>
> I would prefer a completely different interface, where you have a file
> descriptor that can be created and associated to a vCPU or VM (or even
> to /dev/kvm).  Having a file descriptor is important because the fd can
We are considering how to create the file descriptor. It might be risky
to create an extra fd for every vCPU, as that could easily hit the fd
limit for the process or the system on machines running a ton of small
VMs. Creating an extra file descriptor for every VM looks like a better
option; we could then query per-vCPU stats through an ioctl on the VM
fd by passing the vCPU index.
What do you think?
> be passed to a less-privileged process that takes care of gathering the
> metrics
>
> The result of reading the file descriptor could be either ASCII or
> binary.  IMO the real cost lies in opening and reading a multitude of
> files rather than in the ASCII<->binary conversion.
>
> The format could be one of the following:
>
> * binary:
>
> 4 bytes flags (always zero)
> 4 bytes number of statistics
> 4 bytes offset of the first stat description
> 4 bytes offset of the first stat value
> stat descriptions:
>- 4 bytes for the type (for now always zero: uint64_t)
>- 4 bytes for the flags (for now always zero)
>- length of name
>- name
> statistics in 64-bit format
>
> * text:
>
> stat1_name uint64 123
> stat2_name uint64 456
> ...
>
> What do you think?
>
> Paolo
>
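
To make the proposed binary layout concrete, here is a minimal C sketch
of the header and per-stat descriptor it describes. The struct and
field names are illustrative only, not an existing kernel API, and the
width of the name-length field is an assumption:

    struct kvm_stats_header {               /* hypothetical */
            __u32 flags;                    /* always zero for now */
            __u32 num_stats;
            __u32 desc_offset;              /* offset of first stat description */
            __u32 value_offset;             /* offset of first 64-bit value */
    };

    struct kvm_stats_desc {                 /* hypothetical */
            __u32 type;                     /* 0: uint64_t */
            __u32 flags;                    /* always zero for now */
            __u32 name_len;
            char  name[];                   /* name, then the next descriptor */
    };

A reader would read() or mmap the fd, walk num_stats descriptors from
desc_offset, and index the 64-bit values starting at value_offset.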


Re: [PATCH v5 14/36] KVM: arm64: Provide __flush_dcache_area at EL2

2021-03-15 Thread Quentin Perret
On Monday 15 Mar 2021 at 16:33:23 (+), Will Deacon wrote:
> On Mon, Mar 15, 2021 at 02:35:14PM +, Quentin Perret wrote:
> > We will need to do cache maintenance at EL2 soon, so compile a copy of
> > __flush_dcache_area at EL2, and provide a copy of arm64_ftr_reg_ctrel0
> > as it is needed by the read_ctr macro.
> > 
> > Signed-off-by: Quentin Perret 
> > ---
> >  arch/arm64/include/asm/kvm_cpufeature.h |  2 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile|  3 ++-
> >  arch/arm64/kvm/hyp/nvhe/cache.S | 13 +
> >  arch/arm64/kvm/sys_regs.c   |  1 +
> >  4 files changed, 18 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S
> > 
> > diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
> > index 3fd9f60d2180..efba1b89b8a4 100644
> > --- a/arch/arm64/include/asm/kvm_cpufeature.h
> > +++ b/arch/arm64/include/asm/kvm_cpufeature.h
> > @@ -13,3 +13,5 @@
> >  #define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg kvm_nvhe_sym(name)
> >  #endif
> >  #endif
> > +
> > +KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
> 
> I still think this is a bit weird. If you really want to macro-ise stuff,
> then why not follow the sort of thing we do for e.g. per-cpu variables and
> have separate DECLARE_HYP_CPU_FTR_REG and DEFINE_HYP_CPU_FTR_REG macros.
> 
> That way kvm_cpufeature.h can have header guards like a normal header and
> we can drop the '#ifndef KVM_HYP_CPU_FTR_REG' altogether. I don't think
> the duplication of the symbol name really matters -- it should fail at
> build time if something is missing.

I just tend to hate unnecessary boilerplate, but if you feel strongly
about it, happy to change :)

Cheers,
Quentin


Re: [PATCH v5 29/36] KVM: arm64: Use page-table to track page ownership

2021-03-15 Thread Quentin Perret
On Monday 15 Mar 2021 at 16:36:19 (+), Will Deacon wrote:
> On Mon, Mar 15, 2021 at 02:35:29PM +, Quentin Perret wrote:
> > As the host stage 2 will be identity mapped, all the .hyp memory regions
> > and/or memory pages donated to protected guests will have to be marked
> > invalid in the host stage 2 page-table. At the same time, the hypervisor
> > will need a way to track the ownership of each physical page to ensure
> > memory sharing or donation between entities (host, guests, hypervisor) is
> > legal.
> > 
> > In order to enable this tracking at EL2, let's use the host stage 2
> > page-table itself. The idea is to use the top bits of invalid mappings
> > to store the unique identifier of the page owner. The page-table owner
> > (the host) gets identifier 0 such that, at boot time, it owns the entire
> > IPA space as the pgd starts zeroed.
> > 
> > Provide kvm_pgtable_stage2_set_owner(), which allows modifying the
> > ownership of pages in the host stage 2. It re-uses most of the map()
> > logic, but ends up creating invalid mappings instead. This impacts
> > how we do refcounting, as we now need to count invalid mappings when
> > they are used for ownership tracking.
> > 
> > Signed-off-by: Quentin Perret 
> > ---
> >  arch/arm64/include/asm/kvm_pgtable.h |  21 +
> >  arch/arm64/kvm/hyp/pgtable.c | 127 ++-
> >  2 files changed, 124 insertions(+), 24 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > index 4ae19247837b..683e96abdc24 100644
> > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > @@ -238,6 +238,27 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
> >u64 phys, enum kvm_pgtable_prot prot,
> >void *mc);
> >  
> > +/**
> > + * kvm_pgtable_stage2_set_owner() - Annotate invalid mappings with metadata
> > + * encoding the ownership of a page in the
> > + * IPA space.
> 
> The function does more than this, though, as it will also go ahead and unmap
> existing valid mappings which I think should be mentioned here, no?

Right, I see what you mean. How about:

'Unmap and annotate pages in the IPA space to track ownership'

> > +int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
> > +void *mc, u8 owner_id)
> > +{
> > +   int ret;
> > +   struct stage2_map_data map_data = {
> > +   .phys   = KVM_PHYS_INVALID,
> > +   .mmu= pgt->mmu,
> > +   .memcache   = mc,
> > +   .mm_ops = pgt->mm_ops,
> > +   .owner_id   = owner_id,
> > +   };
> > +   struct kvm_pgtable_walker walker = {
> > +   .cb = stage2_map_walker,
> > +   .flags  = KVM_PGTABLE_WALK_TABLE_PRE |
> > + KVM_PGTABLE_WALK_LEAF |
> > + KVM_PGTABLE_WALK_TABLE_POST,
> > +   .arg= &map_data,
> > +   };
> > +
> > +   if (owner_id > KVM_MAX_OWNER_ID)
> > +   return -EINVAL;
> > +
> > +   ret = kvm_pgtable_walk(pgt, addr, size, &walker);
> > +   dsb(ishst);
> 
> Why is the DSB needed here? afaict, we only ever unmap a valid entry (which
> will have a DSB as part of the TLBI sequence) or we update the owner for an
> existing invalid entry, in which case the walker doesn't care.

Indeed, that is now unnecessary. I'll remove it.

Thanks,
Quentin
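
For illustration, the annotation described in the commit message boils
down to stashing the owner ID in high bits of an otherwise-invalid PTE
(bit 0 clear, so the hardware walker ignores it). A sketch along the
lines of the series; the exact mask and helper names are assumptions
and may differ in the final version:

    #define KVM_INVALID_PTE_OWNER_MASK  GENMASK(63, 56)
    #define KVM_MAX_OWNER_ID            FIELD_MAX(KVM_INVALID_PTE_OWNER_MASK)

    /* Build an invalid PTE carrying only the owner ID. */
    static kvm_pte_t kvm_init_invalid_leaf_owner(u8 owner_id)
    {
            return FIELD_PREP(KVM_INVALID_PTE_OWNER_MASK, owner_id);
    }

Owner ID 0 is the host, which is why a zeroed PGD means the host owns
the entire IPA space at boot.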


[PATCH v5 09/36] KVM: arm64: Allow using kvm_nvhe_sym() in hyp code

2021-03-15 Thread Quentin Perret
In order to allow the usage of code shared by the host and the hyp in
static inline library functions, allow the usage of kvm_nvhe_sym() at
EL2 by defaulting to the raw symbol name.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/hyp_image.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h b/arch/arm64/include/asm/hyp_image.h
index 78cd77990c9c..b4b3076a76fb 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -10,11 +10,15 @@
 #define __HYP_CONCAT(a, b) a ## b
 #define HYP_CONCAT(a, b)   __HYP_CONCAT(a, b)
 
+#ifndef __KVM_NVHE_HYPERVISOR__
 /*
  * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_,
  * to separate it from the kernel proper.
  */
 #define kvm_nvhe_sym(sym)  __kvm_nvhe_##sym
+#else
+#define kvm_nvhe_sym(sym)  sym
+#endif
 
 #ifdef LINKER_SCRIPT
 
-- 
2.31.0.rc2.261.g7f71774620-goog
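
For a concrete picture of what this buys, consider the cpufeature
declaration macro introduced later in this series (shown here as an
illustrative sketch, not part of this patch):

    /* Compiled into the kernel proper, this expands to:
     *   extern struct arm64_ftr_reg __kvm_nvhe_arm64_ftr_reg_ctrel0;
     * Compiled at EL2 (__KVM_NVHE_HYPERVISOR__ defined), it expands to:
     *   extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
     */
    #define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg kvm_nvhe_sym(name)

    KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);

Shared static inline code can thus reference the hyp copy of a symbol
with one spelling on both sides of the EL1/EL2 split.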



[PATCH v5 28/36] KVM: arm64: Always zero invalid PTEs

2021-03-15 Thread Quentin Perret
kvm_set_invalid_pte() currently only clears bit 0 of a PTE because
stage2_map_walk_table_post() needs to be able to follow the anchor. In
preparation for re-using bits 63-1 of invalid PTEs, make sure to zero
them entirely by caching the anchor's child upfront.

Acked-by: Will Deacon 
Suggested-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index bdd6e3d4eeb6..f37b4179b880 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -156,10 +156,9 @@ static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops *mm_op
return mm_ops->phys_to_virt(kvm_pte_to_phys(pte));
 }
 
-static void kvm_set_invalid_pte(kvm_pte_t *ptep)
+static void kvm_clear_pte(kvm_pte_t *ptep)
 {
-   kvm_pte_t pte = *ptep;
-   WRITE_ONCE(*ptep, pte & ~KVM_PTE_VALID);
+   WRITE_ONCE(*ptep, 0);
 }
 
 static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp,
@@ -443,6 +442,7 @@ struct stage2_map_data {
kvm_pte_t   attr;
 
kvm_pte_t   *anchor;
+   kvm_pte_t   *childp;
 
struct kvm_s2_mmu   *mmu;
void*memcache;
@@ -532,7 +532,7 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
 * There's an existing different valid leaf entry, so perform
 * break-before-make.
 */
-   kvm_set_invalid_pte(ptep);
+   kvm_clear_pte(ptep);
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
mm_ops->put_page(ptep);
}
@@ -553,7 +553,8 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
if (!kvm_block_mapping_supported(addr, end, data->phys, level))
return 0;
 
-   kvm_set_invalid_pte(ptep);
+   data->childp = kvm_pte_follow(*ptep, data->mm_ops);
+   kvm_clear_pte(ptep);
 
/*
 * Invalidate the whole stage-2, as we may have numerous leaf
@@ -599,7 +600,7 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 * will be mapped lazily.
 */
if (kvm_pte_valid(pte)) {
-   kvm_set_invalid_pte(ptep);
+   kvm_clear_pte(ptep);
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
mm_ops->put_page(ptep);
}
@@ -615,19 +616,24 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
  struct stage2_map_data *data)
 {
struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
+   kvm_pte_t *childp;
int ret = 0;
 
if (!data->anchor)
return 0;
 
-   mm_ops->put_page(kvm_pte_follow(*ptep, mm_ops));
-   mm_ops->put_page(ptep);
-
if (data->anchor == ptep) {
+   childp = data->childp;
data->anchor = NULL;
+   data->childp = NULL;
ret = stage2_map_walk_leaf(addr, end, level, ptep, data);
+   } else {
+   childp = kvm_pte_follow(*ptep, mm_ops);
}
 
+   mm_ops->put_page(childp);
+   mm_ops->put_page(ptep);
+
return ret;
 }
 
@@ -736,7 +742,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 * block entry and rely on the remaining portions being faulted
 * back lazily.
 */
-   kvm_set_invalid_pte(ptep);
+   kvm_clear_pte(ptep);
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, addr, level);
mm_ops->put_page(ptep);
 
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 25/36] KVM: arm64: Make memcache anonymous in pgtable allocator

2021-03-15 Thread Quentin Perret
The current stage2 page-table allocator uses a memcache to get
pre-allocated pages when it needs any. To allow re-using this code at
EL2 which uses a concept of memory pools, make the memcache argument of
kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page()
callbacks use it the way they need to.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 6 +++---
 arch/arm64/kvm/hyp/pgtable.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 9cdc198ea6b4..4ae19247837b 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -213,8 +213,8 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  * @size:  Size of the mapping.
  * @phys:  Physical address of the memory to map.
  * @prot:  Permissions and attributes for the mapping.
- * @mc:Cache of pre-allocated GFP_PGTABLE_USER memory from which to
- * allocate page-table pages.
+ * @mc:Cache of pre-allocated and zeroed memory from which to allocate
+ * page-table pages.
  *
  * The offset of @addr within a page is ignored, @size is rounded-up to
  * the next page boundary and @phys is rounded-down to the previous page
@@ -236,7 +236,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  */
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
-  struct kvm_mmu_memory_cache *mc);
+  void *mc);
 
 /**
 * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 296675e5600d..bdd6e3d4eeb6 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -445,7 +445,7 @@ struct stage2_map_data {
kvm_pte_t   *anchor;
 
struct kvm_s2_mmu   *mmu;
-   struct kvm_mmu_memory_cache *memcache;
+   void*memcache;
 
struct kvm_pgtable_mm_ops   *mm_ops;
 };
@@ -669,7 +669,7 @@ static int stage2_map_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
 
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
-  struct kvm_mmu_memory_cache *mc)
+  void *mc)
 {
int ret;
struct stage2_map_data map_data = {
-- 
2.31.0.rc2.261.g7f71774620-goog
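
To see what the void * buys, here is a hedged sketch of the two kinds
of callers this enables, based on the direction of the series; the
function names are taken from later patches and may differ. The host
passes its kvm_mmu_memory_cache, EL2 passes a hyp_pool, and each
zalloc_page() callback interprets the opaque cookie itself:

    /* Host (EL1): mc is really a struct kvm_mmu_memory_cache. */
    static void *stage2_memcache_zalloc_page(void *mc)
    {
            return kvm_mmu_memory_cache_alloc(mc);  /* pages are pre-zeroed */
    }

    /* Hyp (EL2, protected mode): mc is really a struct hyp_pool. */
    static void *host_s2_zalloc_page(void *pool)
    {
            return hyp_alloc_pages(pool, 0);
    }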



[PATCH v5 07/36] KVM: arm64: Introduce a BSS section for use at Hyp

2021-03-15 Thread Quentin Perret
Currently, the hyp code cannot make full use of a bss, as the kernel
section is mapped read-only.

While this mapping could simply be changed to read-write, it would
intermingle even more the hyp and kernel state than they currently are.
Instead, introduce a __hyp_bss section, that uses reserved pages, and
create the appropriate RW hyp mappings during KVM init.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/sections.h |  1 +
 arch/arm64/kernel/vmlinux.lds.S   | 52 ---
 arch/arm64/kvm/arm.c  | 14 -
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S |  1 +
 4 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 2f36b16a5b5d..e4ad9db53af1 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -13,6 +13,7 @@ extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
 extern char __hyp_text_start[], __hyp_text_end[];
 extern char __hyp_rodata_start[], __hyp_rodata_end[];
 extern char __hyp_reloc_begin[], __hyp_reloc_end[];
+extern char __hyp_bss_start[], __hyp_bss_end[];
 extern char __idmap_text_start[], __idmap_text_end[];
 extern char __initdata_begin[], __initdata_end[];
 extern char __inittext_begin[], __inittext_end[];
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..e96173ce211b 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -5,24 +5,7 @@
  * Written by Martin Mares 
  */
 
-#define RO_EXCEPTION_TABLE_ALIGN   8
-#define RUNTIME_DISCARD_EXIT
-
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-
-#include "image.h"
-
-OUTPUT_ARCH(aarch64)
-ENTRY(_text)
-
-jiffies = jiffies_64;
-
-
 #ifdef CONFIG_KVM
 #define HYPERVISOR_EXTABLE \
. = ALIGN(SZ_8);\
@@ -51,13 +34,43 @@ jiffies = jiffies_64;
__hyp_reloc_end = .;\
}
 
+#define BSS_FIRST_SECTIONS \
+   __hyp_bss_start = .;\
+   *(HYP_SECTION_NAME(.bss))   \
+   . = ALIGN(PAGE_SIZE);   \
+   __hyp_bss_end = .;
+
+/*
+ * We require that __hyp_bss_start and __bss_start are aligned, and enforce it
+ * with an assertion. But the BSS_SECTION macro places an empty .sbss section
+ * between them, which can in some cases cause the linker to misalign them. To
+ * work around the issue, force a page alignment for __bss_start.
+ */
+#define SBSS_ALIGN PAGE_SIZE
 #else /* CONFIG_KVM */
 #define HYPERVISOR_EXTABLE
 #define HYPERVISOR_DATA_SECTIONS
 #define HYPERVISOR_PERCPU_SECTION
 #define HYPERVISOR_RELOC_SECTION
+#define SBSS_ALIGN 0
 #endif
 
+#define RO_EXCEPTION_TABLE_ALIGN   8
+#define RUNTIME_DISCARD_EXIT
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "image.h"
+
+OUTPUT_ARCH(aarch64)
+ENTRY(_text)
+
+jiffies = jiffies_64;
+
 #define HYPERVISOR_TEXT\
/*  \
 * Align to 4 KB so that\
@@ -276,7 +289,7 @@ SECTIONS
__pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
_edata = .;
 
-   BSS_SECTION(0, 0, 0)
+   BSS_SECTION(SBSS_ALIGN, 0, 0)
 
. = ALIGN(PAGE_SIZE);
init_pg_dir = .;
@@ -324,6 +337,9 @@ ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
 ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
"Entry trampoline text too big")
 #endif
+#ifdef CONFIG_KVM
+ASSERT(__hyp_bss_start == __bss_start, "HYP and Host BSS are misaligned")
+#endif
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 2d1e7ef69c04..3f8bcf8db036 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1770,7 +1770,19 @@ static int init_hyp_mode(void)
goto out_err;
}
 
-   err = create_hyp_mappings(kvm_ksym_ref(__bss_start),
+   /*
+* .hyp.bss is guaranteed to be placed at the beginning of the .bss
+* section thanks to an assertion in the linker script. Map it RW and
+* the rest of .bss RO.
+*/
+   err = create_hyp_mappings(kvm_ksym_ref(__hyp_bss_start),
+ kvm_ksym_ref(__hyp_bss_end), PAGE_HYP);
+   if (err) {
+   kvm_err("Cannot map hyp bss section: %d\n", err);
+   goto out_err;
+   }
+
+   err = create_hyp_mappings(kvm_ksym_ref(__hyp_bss_end),
  kvm_ksym_ref(__bss_stop), PAGE_HYP_RO);
if (err) {
kvm_err("Cannot map bss section\n");
diff --git 

[PATCH v5 18/36] KVM: arm64: Elevate hypervisor mappings creation at EL2

2021-03-15 Thread Quentin Perret
Previous commits have introduced infrastructure to enable the EL2 code
to manage its own stage 1 mappings. However, this was preliminary work,
and none of it is currently in use.

Put all of this together by elevating the mapping creation at EL2 when
memory protection is enabled. In this case, the host kernel running
at EL1 still creates _temporary_ EL2 mappings, only used while
initializing the hypervisor, but frees them right after.

As such, all calls to create_hyp_mappings() after kvm init has finished
turn into hypercalls, as the host now has no 'legal' way to modify the
hypervisor page tables directly.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h |  2 +-
 arch/arm64/kvm/arm.c | 87 +---
 arch/arm64/kvm/mmu.c | 43 ++--
 3 files changed, 120 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 5c42ec023cc7..ce02a4052dcf 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -166,7 +166,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu);
 
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
-int kvm_mmu_init(void);
+int kvm_mmu_init(u32 *hyp_va_bits);
 
 static inline void *__kvm_vector_slot2addr(void *base,
   enum arm64_hyp_spectre_vector slot)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 26e573cdede3..7d62211109d9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1421,7 +1421,7 @@ static void cpu_prepare_hyp_mode(int cpu)
kvm_flush_dcache_to_poc(params, sizeof(*params));
 }
 
-static void cpu_init_hyp_mode(void)
+static void hyp_install_host_vector(void)
 {
struct kvm_nvhe_init_params *params;
struct arm_smccc_res res;
@@ -1439,6 +1439,11 @@ static void cpu_init_hyp_mode(void)
params = this_cpu_ptr_nvhe_sym(kvm_init_params);
 arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), &res);
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   hyp_install_host_vector();
 
/*
 * Disabling SSBD on a non-VHE system requires us to enable SSBS
@@ -1481,7 +1486,10 @@ static void cpu_set_hyp_vector(void)
 struct bp_hardening_data *data = this_cpu_ptr(&bp_hardening_data);
void *vector = hyp_spectre_vector_selector[data->slot];
 
-   *this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+   if (!is_protected_kvm_enabled())
+   *this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+   else
+   kvm_call_hyp_nvhe(__pkvm_cpu_set_vector, data->slot);
 }
 
 static void cpu_hyp_reinit(void)
@@ -1489,13 +1497,14 @@ static void cpu_hyp_reinit(void)

 kvm_init_host_cpu_context(&this_cpu_ptr_hyp_sym(kvm_host_data)->host_ctxt);
 
cpu_hyp_reset();
-   cpu_set_hyp_vector();
 
if (is_kernel_in_hyp_mode())
kvm_timer_init_vhe();
else
cpu_init_hyp_mode();
 
+   cpu_set_hyp_vector();
+
kvm_arm_init_debug();
 
if (vgic_present)
@@ -1691,18 +1700,59 @@ static void teardown_hyp_mode(void)
}
 }
 
+static int do_pkvm_init(u32 hyp_va_bits)
+{
+   void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base);
+   int ret;
+
+   preempt_disable();
+   hyp_install_host_vector();
+   ret = kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base, hyp_mem_size,
+   num_possible_cpus(), kern_hyp_va(per_cpu_base),
+   hyp_va_bits);
+   preempt_enable();
+
+   return ret;
+}
+
+static int kvm_hyp_init_protection(u32 hyp_va_bits)
+{
+   void *addr = phys_to_virt(hyp_mem_base);
+   int ret;
+
+   ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
+   if (ret)
+   return ret;
+
+   ret = do_pkvm_init(hyp_va_bits);
+   if (ret)
+   return ret;
+
+   free_hyp_pgds();
+
+   return 0;
+}
+
 /**
  * Inits Hyp-mode on all online CPUs
  */
 static int init_hyp_mode(void)
 {
+   u32 hyp_va_bits;
int cpu;
-   int err = 0;
+   int err = -ENOMEM;
+
+   /*
+* The protected Hyp-mode cannot be initialized if the memory pool
+* allocation has failed.
+*/
+   if (is_protected_kvm_enabled() && !hyp_mem_base)
+   goto out_err;
 
/*
 * Allocate Hyp PGD and setup Hyp identity mapping
 */
-   err = kvm_mmu_init();
+   err = kvm_mmu_init(&hyp_va_bits);
if (err)
goto out_err;
 
@@ -1818,6 +1868,14 @@ static int init_hyp_mode(void)
goto out_err;
}
 
+   if (is_protected_kvm_enabled()) {
+   err = kvm_hyp_init_protection(hyp_va_bits);
+   if (err) {
+   kvm_err("Failed 

[PATCH v5 21/36] KVM: arm64: Set host stage 2 using kvm_nvhe_init_params

2021-03-15 Thread Quentin Perret
Move the registers relevant to host stage 2 enablement to
kvm_nvhe_init_params to prepare the ground for enabling it in later
patches.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h   |  3 +++
 arch/arm64/kernel/asm-offsets.c|  3 +++
 arch/arm64/kvm/arm.c   |  5 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S | 14 +-
 arch/arm64/kvm/hyp/nvhe/switch.c   |  5 +
 5 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index db20a9477870..6dce860f8bca 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -158,6 +158,9 @@ struct kvm_nvhe_init_params {
unsigned long tpidr_el2;
unsigned long stack_hyp_va;
phys_addr_t pgd_pa;
+   unsigned long hcr_el2;
+   unsigned long vttbr;
+   unsigned long vtcr;
 };
 
 /* Translate a kernel address @ptr into its equivalent linear mapping */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a36e2fc330d4..8930b42f6418 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -120,6 +120,9 @@ int main(void)
   DEFINE(NVHE_INIT_TPIDR_EL2,  offsetof(struct kvm_nvhe_init_params, tpidr_el2));
   DEFINE(NVHE_INIT_STACK_HYP_VA,   offsetof(struct kvm_nvhe_init_params, stack_hyp_va));
   DEFINE(NVHE_INIT_PGD_PA, offsetof(struct kvm_nvhe_init_params, pgd_pa));
+  DEFINE(NVHE_INIT_HCR_EL2,offsetof(struct kvm_nvhe_init_params, hcr_el2));
+  DEFINE(NVHE_INIT_VTTBR,  offsetof(struct kvm_nvhe_init_params, vttbr));
+  DEFINE(NVHE_INIT_VTCR,   offsetof(struct kvm_nvhe_init_params, vtcr));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_CTX_SP,   offsetof(struct cpu_suspend_ctx, sp));
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 7d62211109d9..d474eec606a3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1413,6 +1413,11 @@ static void cpu_prepare_hyp_mode(int cpu)
 
params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) 
+ PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
+   if (is_protected_kvm_enabled())
+   params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
+   else
+   params->hcr_el2 = HCR_HOST_NVHE_FLAGS;
+   params->vttbr = params->vtcr = 0;
 
/*
 * Flush the init params from the data cache because the struct will
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index a2b8b6a84cbd..a50ad9e9fc05 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -83,11 +83,6 @@ SYM_CODE_END(__kvm_hyp_init)
  * x0: struct kvm_nvhe_init_params PA
  */
 SYM_CODE_START_LOCAL(___kvm_hyp_init)
-alternative_if ARM64_KVM_PROTECTED_MODE
-   mov_q   x1, HCR_HOST_NVHE_PROTECTED_FLAGS
-   msr hcr_el2, x1
-alternative_else_nop_endif
-
ldr x1, [x0, #NVHE_INIT_TPIDR_EL2]
msr tpidr_el2, x1
 
@@ -97,6 +92,15 @@ alternative_else_nop_endif
ldr x1, [x0, #NVHE_INIT_MAIR_EL2]
msr mair_el2, x1
 
+   ldr x1, [x0, #NVHE_INIT_HCR_EL2]
+   msr hcr_el2, x1
+
+   ldr x1, [x0, #NVHE_INIT_VTTBR]
+   msr vttbr_el2, x1
+
+   ldr x1, [x0, #NVHE_INIT_VTCR]
+   msr vtcr_el2, x1
+
ldr x1, [x0, #NVHE_INIT_PGD_PA]
phys_to_ttbr x2, x1
 alternative_if ARM64_HAS_CNP
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index f3d0e9eca56c..979a76cdf9fb 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -97,10 +97,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
 
write_sysreg(mdcr_el2, mdcr_el2);
-   if (is_protected_kvm_enabled())
-   write_sysreg(HCR_HOST_NVHE_PROTECTED_FLAGS, hcr_el2);
-   else
-   write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2);
+   write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);
write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
write_sysreg(__kvm_hyp_host_vector, vbar_el2);
 }
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 19/36] KVM: arm64: Use kvm_arch for stage 2 pgtable

2021-03-15 Thread Quentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2,
use struct kvm_arch in lieu of struct kvm as the host will have the
former but not the latter.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 5 +++--
 arch/arm64/kvm/hyp/pgtable.c | 6 +++---
 arch/arm64/kvm/mmu.c | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index bf7a3cc49420..7945ec87eaec 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -162,12 +162,13 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
 /**
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
- * @kvm:   KVM structure representing the guest virtual machine.
+ * @arch:  Arch-specific KVM structure representing the guest virtual
+ * machine.
  * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 7ce0969203e8..3d79c8094cdd 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -879,11 +879,11 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
return kvm_pgtable_walk(pgt, addr, size, );
 }
 
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
struct kvm_pgtable_mm_ops *mm_ops)
 {
size_t pgd_sz;
-   u64 vtcr = kvm->arch.vtcr;
+   u64 vtcr = arch->vtcr;
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
@@ -896,7 +896,7 @@ int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
pgt->ia_bits= ia_bits;
pgt->start_level= start_level;
pgt->mm_ops = mm_ops;
-   pgt->mmu= &kvm->arch.mmu;
+   pgt->mmu= &arch->mmu;
 
/* Ensure zeroed PGD pages are visible to the hardware walker */
dsb(ishst);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 9d331bf262d2..41f9c03cbcc3 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -457,7 +457,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
if (!pgt)
return -ENOMEM;
 
-   err = kvm_pgtable_stage2_init(pgt, kvm, &kvm_s2_mm_ops);
+   err = kvm_pgtable_stage2_init(pgt, &kvm->arch, &kvm_s2_mm_ops);
if (err)
goto out_free_pgtable;
 
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 12/36] KVM: arm64: Introduce a Hyp buddy page allocator

2021-03-15 Thread Quentin Perret
When memory protection is enabled, the hyp code will require a basic
form of memory management in order to allocate and free memory pages at
EL2. This is needed for various use-cases, including the creation of hyp
mappings or the allocation of stage 2 page tables.

To address these use-cases, introduce a simple memory allocator in the
hyp code. The allocator is designed as a conventional 'buddy allocator',
working at page granularity. It allows allocating and freeing
physically contiguous pages from memory 'pools', with a guaranteed order
alignment in the PA space. Each page in a memory pool is associated
with a struct hyp_page which holds the page's metadata, including its
refcount, as well as its current order, hence mimicking the kernel's
buddy system in the GFP infrastructure. The hyp_page metadata are made
accessible through a hyp_vmemmap, following the concept of
SPARSE_VMEMMAP in the kernel.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h|  68 
 arch/arm64/kvm/hyp/include/nvhe/memory.h |  28 
 arch/arm64/kvm/hyp/nvhe/Makefile |   2 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c | 195 +++
 4 files changed, 292 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/gfp.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/page_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/gfp.h b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
new file mode 100644
index ..55b3f0ce5bc8
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_GFP_H
+#define __KVM_HYP_GFP_H
+
+#include 
+
+#include 
+#include 
+
+#define HYP_NO_ORDER   UINT_MAX
+
+struct hyp_pool {
+   /*
+* Spinlock protecting concurrent changes to the memory pool as well as
+* the struct hyp_page of the pool's pages until we have a proper atomic
+* API at EL2.
+*/
+   hyp_spinlock_t lock;
+   struct list_head free_area[MAX_ORDER];
+   phys_addr_t range_start;
+   phys_addr_t range_end;
+   unsigned int max_order;
+};
+
+static inline void hyp_page_ref_inc(struct hyp_page *p)
+{
+   struct hyp_pool *pool = hyp_page_to_pool(p);
+
+   hyp_spin_lock(&pool->lock);
+   p->refcount++;
+   hyp_spin_unlock(&pool->lock);
+}
+
+static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
+{
+   struct hyp_pool *pool = hyp_page_to_pool(p);
+   int ret;
+
+   hyp_spin_lock(&pool->lock);
+   p->refcount--;
+   ret = (p->refcount == 0);
+   hyp_spin_unlock(&pool->lock);
+
+   return ret;
+}
+
+static inline void hyp_set_page_refcounted(struct hyp_page *p)
+{
+   struct hyp_pool *pool = hyp_page_to_pool(p);
+
+   hyp_spin_lock(&pool->lock);
+   if (p->refcount) {
+   hyp_spin_unlock(&pool->lock);
+   hyp_panic();
+   }
+   p->refcount = 1;
+   hyp_spin_unlock(&pool->lock);
+}
+
+/* Allocation */
+void *hyp_alloc_pages(struct hyp_pool *pool, unsigned int order);
+void hyp_get_page(void *addr);
+void hyp_put_page(void *addr);
+
+/* Used pages cannot be freed */
+int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
+ unsigned int reserved_pages);
+#endif /* __KVM_HYP_GFP_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index 3e49eaa7e682..d2fb307c5952 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -6,7 +6,17 @@
 
 #include 
 
+struct hyp_pool;
+struct hyp_page {
+   unsigned int refcount;
+   unsigned int order;
+   struct hyp_pool *pool;
+   struct list_head node;
+};
+
 extern s64 hyp_physvirt_offset;
+extern u64 __hyp_vmemmap;
+#define hyp_vmemmap ((struct hyp_page *)__hyp_vmemmap)
 
 #define __hyp_pa(virt) ((phys_addr_t)(virt) + hyp_physvirt_offset)
 #define __hyp_va(phys) ((void *)((phys_addr_t)(phys) - hyp_physvirt_offset))
@@ -21,4 +31,22 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
return __hyp_pa(addr);
 }
 
+#define hyp_phys_to_pfn(phys)  ((phys) >> PAGE_SHIFT)
+#define hyp_pfn_to_phys(pfn)   ((phys_addr_t)((pfn) << PAGE_SHIFT))
+#define hyp_phys_to_page(phys) (&hyp_vmemmap[hyp_phys_to_pfn(phys)])
+#define hyp_virt_to_page(virt) hyp_phys_to_page(__hyp_pa(virt))
+#define hyp_virt_to_pfn(virt)  hyp_phys_to_pfn(__hyp_pa(virt))
+
+#define hyp_page_to_pfn(page)  ((struct hyp_page *)(page) - hyp_vmemmap)
+#define hyp_page_to_phys(page)  hyp_pfn_to_phys((hyp_page_to_pfn(page)))
+#define hyp_page_to_virt(page) __hyp_va(hyp_page_to_phys(page))
+#define hyp_page_to_pool(page) (((struct hyp_page *)page)->pool)
+
+static inline int hyp_page_count(void *addr)
+{
+   struct hyp_page *p = hyp_virt_to_page(addr);
+
+   return p->refcount;
+}
+
 #endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 
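
The page_alloc.c body is cut off above; the heart of the buddy scheme
it implements can be sketched as follows (paraphrased from the patch,
not a verbatim copy). The buddy of a page at a given order differs in
exactly one physical-address bit, so it is found with an XOR:

    static struct hyp_page *__find_buddy(struct hyp_pool *pool,
                                         struct hyp_page *p,
                                         unsigned int order)
    {
            phys_addr_t addr = hyp_page_to_phys(p);

            addr ^= (PAGE_SIZE << order);   /* flip the order-th page bit */
            if (addr < pool->range_start || addr >= pool->range_end)
                    return NULL;

            return hyp_phys_to_page(addr);
    }

Freeing a page then walks orders upwards, coalescing with free buddies
of the same order until max_order is reached, just like the kernel's
buddy system.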

[PATCH v5 35/36] KVM: arm64: Disable PMU support in protected mode

2021-03-15 Thread Quentin Perret
The host currently writes directly in EL2 per-CPU data sections from
the PMU code when running in nVHE. In preparation for unmapping the EL2
sections from the host stage 2, disable PMU support in protected mode as
we currently do not have a use-case for it.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/perf.c | 3 ++-
 arch/arm64/kvm/pmu.c  | 8 
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
index 739164324afe..8f860ae56bb7 100644
--- a/arch/arm64/kvm/perf.c
+++ b/arch/arm64/kvm/perf.c
@@ -55,7 +55,8 @@ int kvm_perf_init(void)
 * hardware performance counters. This could ensure the presence of
 * a physical PMU and CONFIG_PERF_EVENT is selected.
 */
-   if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0)
+   if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0
+  && !is_protected_kvm_enabled())
 static_branch_enable(&kvm_arm_pmu_available);
 
 return perf_register_guest_info_callbacks(&kvm_guest_cbs);
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index faf32a44ba04..03a6c1f4a09a 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -33,7 +33,7 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
 {
struct kvm_host_data *ctx = this_cpu_ptr_hyp_sym(kvm_host_data);
 
-   if (!ctx || !kvm_pmu_switch_needed(attr))
+   if (!kvm_arm_support_pmu_v3() || !ctx || !kvm_pmu_switch_needed(attr))
return;
 
if (!attr->exclude_host)
@@ -49,7 +49,7 @@ void kvm_clr_pmu_events(u32 clr)
 {
struct kvm_host_data *ctx = this_cpu_ptr_hyp_sym(kvm_host_data);
 
-   if (!ctx)
+   if (!kvm_arm_support_pmu_v3() || !ctx)
return;
 
ctx->pmu_events.events_host &= ~clr;
@@ -172,7 +172,7 @@ void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
struct kvm_host_data *host;
u32 events_guest, events_host;
 
-   if (!has_vhe())
+   if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
 
preempt_disable();
@@ -193,7 +193,7 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
struct kvm_host_data *host;
u32 events_guest, events_host;
 
-   if (!has_vhe())
+   if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
 
host = this_cpu_ptr_hyp_sym(kvm_host_data);
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 20/36] KVM: arm64: Use kvm_arch in kvm_s2_mmu

2021-03-15 Thread Quentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2,
change kvm_s2_mmu to use a kvm_arch pointer in lieu of the kvm pointer,
as the host will have the former but not the latter.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_host.h | 2 +-
 arch/arm64/include/asm/kvm_mmu.h  | 6 +-
 arch/arm64/kvm/mmu.c  | 8 
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index b9d45a1f8538..90565782ce3e 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -94,7 +94,7 @@ struct kvm_s2_mmu {
/* The last vcpu id that ran on each physical CPU */
int __percpu *last_vcpu_ran;
 
-   struct kvm *kvm;
+   struct kvm_arch *arch;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index ce02a4052dcf..6f743e20cb06 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -272,7 +272,7 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu)
  */
 static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
 {
-   write_sysreg(kern_hyp_va(mmu->kvm)->arch.vtcr, vtcr_el2);
+   write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
/*
@@ -283,5 +283,9 @@ static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
+static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
+{
+   return container_of(mmu->arch, struct kvm, arch);
+}
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 41f9c03cbcc3..3257cadfab24 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -165,7 +165,7 @@ static void *kvm_host_va(phys_addr_t phys)
 static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size,
 bool may_block)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
phys_addr_t end = start + size;
 
 assert_spin_locked(&kvm->mmu_lock);
@@ -470,7 +470,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
for_each_possible_cpu(cpu)
*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
 
-   mmu->kvm = kvm;
+   mmu->arch = &kvm->arch;
mmu->pgt = pgt;
mmu->pgd_phys = __pa(pgt->pgd);
mmu->vmid.vmid_gen = 0;
@@ -552,7 +552,7 @@ void stage2_unmap_vm(struct kvm *kvm)
 
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
struct kvm_pgtable *pgt = NULL;
 
 spin_lock(&kvm->mmu_lock);
@@ -621,7 +621,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
  */
 static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
 stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_wrprotect);
 }
 
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 14/36] KVM: arm64: Provide __flush_dcache_area at EL2

2021-03-15 Thread Quentin Perret
We will need to do cache maintenance at EL2 soon, so compile a copy of
__flush_dcache_area at EL2, and provide a copy of arm64_ftr_reg_ctrel0
as it is needed by the read_ctr macro.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_cpufeature.h |  2 ++
 arch/arm64/kvm/hyp/nvhe/Makefile|  3 ++-
 arch/arm64/kvm/hyp/nvhe/cache.S | 13 +
 arch/arm64/kvm/sys_regs.c   |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S

diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
index 3fd9f60d2180..efba1b89b8a4 100644
--- a/arch/arm64/include/asm/kvm_cpufeature.h
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -13,3 +13,5 @@
 #define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg kvm_nvhe_sym(name)
 #endif
 #endif
+
+KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 6894a917f290..42dde4bb80b1 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -13,7 +13,8 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
+cache.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/cache.S b/arch/arm64/kvm/hyp/nvhe/cache.S
new file mode 100644
index ..36cef6915428
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/cache.S
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Code copied from arch/arm64/mm/cache.S.
+ */
+
+#include 
+#include 
+#include 
+
+SYM_FUNC_START_PI(__flush_dcache_area)
+   dcache_by_line_op civac, sy, x0, x1, x2, x3
+   ret
+SYM_FUNC_END_PI(__flush_dcache_area)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6c5d133689ae..3ec34c25e877 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2783,6 +2783,7 @@ struct __ftr_reg_copy_entry {
u32 sys_id;
struct arm64_ftr_reg*dst;
 } hyp_ftr_regs[] __initdata = {
+   CPU_FTR_REG_HYP_COPY(SYS_CTR_EL0, arm64_ftr_reg_ctrel0),
 };
 
 void __init setup_kvm_el2_caps(void)
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 10/36] KVM: arm64: Introduce an early Hyp page allocator

2021-03-15 Thread Quentin Perret
With nVHE, the host currently creates all stage 1 hypervisor mappings at
EL1 during boot, installs them at EL2, and extends them as required
(e.g. when creating a new VM). But in a world where the host is no
longer trusted, it cannot have full control over the code mapped in the
hypervisor.

In preparation for enabling the hypervisor to create its own stage 1
mappings during boot, introduce an early page allocator, with minimal
functionality. This allocator is designed to be used only during early
bootstrap of the hyp code when memory protection is enabled, which will
then switch to using a full-fledged page allocator after init.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h | 14 +
 arch/arm64/kvm/hyp/include/nvhe/memory.h  | 24 +
 arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
 arch/arm64/kvm/hyp/nvhe/early_alloc.c | 54 +++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c  |  4 +-
 5 files changed, 94 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/memory.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/early_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
new file mode 100644
index ..dc61aaa56f31
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_EARLY_ALLOC_H
+#define __KVM_HYP_EARLY_ALLOC_H
+
+#include 
+
+void hyp_early_alloc_init(void *virt, unsigned long size);
+unsigned long hyp_early_alloc_nr_used_pages(void);
+void *hyp_early_alloc_page(void *arg);
+void *hyp_early_alloc_contig(unsigned int nr_pages);
+
+extern struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+
+#endif /* __KVM_HYP_EARLY_ALLOC_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
new file mode 100644
index ..3e49eaa7e682
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_MEMORY_H
+#define __KVM_HYP_MEMORY_H
+
+#include 
+
+#include 
+
+extern s64 hyp_physvirt_offset;
+
+#define __hyp_pa(virt) ((phys_addr_t)(virt) + hyp_physvirt_offset)
+#define __hyp_va(phys) ((void *)((phys_addr_t)(phys) - hyp_physvirt_offset))
+
+static inline void *hyp_phys_to_virt(phys_addr_t phys)
+{
+   return __hyp_va(phys);
+}
+
+static inline phys_addr_t hyp_virt_to_phys(void *addr)
+{
+   return __hyp_pa(addr);
+}
+
+#endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index bc98f8e3d1da..24ff99e2eac5 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -13,7 +13,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/early_alloc.c b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
new file mode 100644
index ..1306c430ab87
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#include <asm/kvm_pgtable.h>
+
+#include <nvhe/early_alloc.h>
+#include <nvhe/memory.h>
+
+struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+s64 __ro_after_init hyp_physvirt_offset;
+
+static unsigned long base;
+static unsigned long end;
+static unsigned long cur;
+
+unsigned long hyp_early_alloc_nr_used_pages(void)
+{
+   return (cur - base) >> PAGE_SHIFT;
+}
+
+void *hyp_early_alloc_contig(unsigned int nr_pages)
+{
+   unsigned long size = (nr_pages << PAGE_SHIFT);
+   void *ret = (void *)cur;
+
+   if (!nr_pages)
+   return NULL;
+
+   if (end - cur < size)
+   return NULL;
+
+   cur += size;
+   memset(ret, 0, size);
+
+   return ret;
+}
+
+void *hyp_early_alloc_page(void *arg)
+{
+   return hyp_early_alloc_contig(1);
+}
+
+void hyp_early_alloc_init(void *virt, unsigned long size)
+{
+   base = cur = (unsigned long)virt;
+   end = base + size;
+
+   hyp_early_alloc_mm_ops.zalloc_page = hyp_early_alloc_page;
+   hyp_early_alloc_mm_ops.phys_to_virt = hyp_phys_to_virt;
+   hyp_early_alloc_mm_ops.virt_to_phys = hyp_virt_to_phys;
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
index 63de71c0481e..08508783ec3d 100644
--- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
+++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
@@ -11,6 +11,7 @@
 #include 
 

[PATCH v5 26/36] KVM: arm64: Reserve memory for host stage 2

2021-03-15 Thread Quentin Perret
Extend the memory pool allocated for the hypervisor to include enough
pages to map all of memory at page granularity for the host stage 2.
While at it, also reserve some memory for device mappings.
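
As a rough worked example of the sizing (assuming 4K pages, i.e. 512
entries per table level, and a single 1 GiB memblock): mapping its
262144 pages at page granularity needs at most 512 + 1 + 1 + 1 = 515
table pages across the four levels, so host_s2_mem_pgtable_pages()
would return 515 + 16 once the concatenated-pgd safety margin below
is added.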

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/mm.h | 27 ++-
 arch/arm64/kvm/hyp/nvhe/setup.c  | 12 
 arch/arm64/kvm/hyp/reserved_mem.c|  2 ++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index ac0f7fcffd08..0095f6289742 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -53,7 +53,7 @@ static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
return total;
 }
 
-static inline unsigned long hyp_s1_pgtable_pages(void)
+static inline unsigned long __hyp_pgtable_total_pages(void)
 {
unsigned long res = 0, i;
 
@@ -63,9 +63,34 @@ static inline unsigned long hyp_s1_pgtable_pages(void)
res += __hyp_pgtable_max_pages(reg->size >> PAGE_SHIFT);
}
 
+   return res;
+}
+
+static inline unsigned long hyp_s1_pgtable_pages(void)
+{
+   unsigned long res;
+
+   res = __hyp_pgtable_total_pages();
+
/* Allow 1 GiB for private mappings */
res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
 
return res;
 }
+
+static inline unsigned long host_s2_mem_pgtable_pages(void)
+{
+   /*
+* Include an extra 16 pages to safely upper-bound the worst case of
+* concatenated pgds.
+*/
+   return __hyp_pgtable_total_pages() + 16;
+}
+
+static inline unsigned long host_s2_dev_pgtable_pages(void)
+{
+   /* Allow 1 GiB for MMIO mappings */
+   return __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
+}
+
 #endif /* __KVM_HYP_MM_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 1e8bcd8b0299..c1a3e7e0ebbc 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -24,6 +24,8 @@ unsigned long hyp_nr_cpus;
 
 static void *vmemmap_base;
 static void *hyp_pgt_base;
+static void *host_s2_mem_pgt_base;
+static void *host_s2_dev_pgt_base;
 
 static int divide_memory_pool(void *virt, unsigned long size)
 {
@@ -42,6 +44,16 @@ static int divide_memory_pool(void *virt, unsigned long size)
if (!hyp_pgt_base)
return -ENOMEM;
 
+   nr_pages = host_s2_mem_pgtable_pages();
+   host_s2_mem_pgt_base = hyp_early_alloc_contig(nr_pages);
+   if (!host_s2_mem_pgt_base)
+   return -ENOMEM;
+
+   nr_pages = host_s2_dev_pgtable_pages();
+   host_s2_dev_pgt_base = hyp_early_alloc_contig(nr_pages);
+   if (!host_s2_dev_pgt_base)
+   return -ENOMEM;
+
return 0;
 }
 
diff --git a/arch/arm64/kvm/hyp/reserved_mem.c b/arch/arm64/kvm/hyp/reserved_mem.c
index 9bc6a6d27904..fd42705a3c26 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -52,6 +52,8 @@ void __init kvm_hyp_reserve(void)
}
 
hyp_mem_pages += hyp_s1_pgtable_pages();
+   hyp_mem_pages += host_s2_mem_pgtable_pages();
+   hyp_mem_pages += host_s2_dev_pgtable_pages();
 
/*
 * The hyp_vmemmap needs to be backed by pages, but these pages
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v5 22/36] KVM: arm64: Refactor kvm_arm_setup_stage2()

2021-03-15 Thread Quentin Perret
In order to re-use some of the stage 2 setup code at EL2, factor parts
of kvm_arm_setup_stage2() out into separate functions.

No functional change intended.
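
After the split, the helpers compose as follows (sketch of the call
site, mirroring the new kvm_arm_setup_stage2() body):

	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
	kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);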

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 26 +
 arch/arm64/kvm/hyp/pgtable.c | 32 +
 arch/arm64/kvm/reset.c   | 42 +++-
 3 files changed, 62 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 7945ec87eaec..9cdc198ea6b4 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -13,6 +13,16 @@
 
 #define KVM_PGTABLE_MAX_LEVELS 4U
 
+static inline u64 kvm_get_parange(u64 mmfr0)
+{
+   u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
+   ID_AA64MMFR0_PARANGE_SHIFT);
+   if (parange > ID_AA64MMFR0_PARANGE_MAX)
+   parange = ID_AA64MMFR0_PARANGE_MAX;
+
+   return parange;
+}
+
 typedef u64 kvm_pte_t;
 
 /**
@@ -159,6 +169,22 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt);
 int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
enum kvm_pgtable_prot prot);
 
+/**
+ * kvm_get_vtcr() - Helper to construct VTCR_EL2
+ * @mmfr0: Sanitized value of SYS_ID_AA64MMFR0_EL1 register.
+ * @mmfr1: Sanitized value of SYS_ID_AA64MMFR1_EL1 register.
+ * @phys_shift: Value to set in VTCR_EL2.T0SZ.
+ *
+ * The VTCR value is common across all the physical CPUs on the system.
+ * We use system wide sanitised values to fill in different fields,
+ * except for Hardware Management of Access Flags. HA Flag is set
+ * unconditionally on all CPUs, as it is safe to run with or without
+ * the feature and the bit is RES0 on CPUs that don't support it.
+ *
+ * Return: VTCR_EL2 value
+ */
+u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
+
 /**
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 3d79c8094cdd..296675e5600d 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -9,6 +9,7 @@
 
 #include <linux/bitfield.h>
 #include <asm/kvm_pgtable.h>
+#include <asm/stage2_pgtable.h>
 
 #define KVM_PTE_VALID  BIT(0)
 
@@ -449,6 +450,37 @@ struct stage2_map_data {
struct kvm_pgtable_mm_ops   *mm_ops;
 };
 
+u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
+{
+   u64 vtcr = VTCR_EL2_FLAGS;
+   u8 lvls;
+
+   vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT;
+   vtcr |= VTCR_EL2_T0SZ(phys_shift);
+   /*
+* Use a minimum 2 level page table to prevent splitting
+* host PMD huge pages at stage2.
+*/
+   lvls = stage2_pgtable_levels(phys_shift);
+   if (lvls < 2)
+   lvls = 2;
+   vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
+
+   /*
+* Enable the Hardware Access Flag management, unconditionally
+* on all CPUs. The feature is RES0 on CPUs without the support
+* and must be ignored by those CPUs.
+*/
+   vtcr |= VTCR_EL2_HA;
+
+   /* Set the vmid bits */
+   vtcr |= (get_vmid_bits(mmfr1) == 16) ?
+   VTCR_EL2_VS_16BIT :
+   VTCR_EL2_VS_8BIT;
+
+   return vtcr;
+}
+
 static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
struct stage2_map_data *data)
 {
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 47f3f035f3ea..6aae118c960a 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -332,19 +332,10 @@ int kvm_set_ipa_limit(void)
return 0;
 }
 
-/*
- * Configure the VTCR_EL2 for this VM. The VTCR value is common
- * across all the physical CPUs on the system. We use system wide
- * sanitised values to fill in different fields, except for Hardware
- * Management of Access Flags. HA Flag is set unconditionally on
- * all CPUs, as it is safe to run with or without the feature and
- * the bit is RES0 on CPUs that don't support it.
- */
 int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
 {
-   u64 vtcr = VTCR_EL2_FLAGS, mmfr0;
-   u32 parange, phys_shift;
-   u8 lvls;
+   u64 mmfr0, mmfr1;
+   u32 phys_shift;
 
if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
return -EINVAL;
@@ -359,33 +350,8 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
}
 
mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
-   parange = cpuid_feature_extract_unsigned_field(mmfr0,
-   ID_AA64MMFR0_PARANGE_SHIFT);
-   if (parange > ID_AA64MMFR0_PARANGE_MAX)
-   parange = ID_AA64MMFR0_PARANGE_MAX;
-   vtcr |= parange << VTCR_EL2_PS_SHIFT;
-
-   vtcr |= VTCR_EL2_T0SZ(phys_shift);
-   /*
-* Use a minimum 2 level page 

[PATCH v5 24/36] KVM: arm64: Refactor __populate_fault_info()

2021-03-15 Thread Quentin Perret
Refactor __populate_fault_info() to introduce __get_fault_info() which
will be used once the host is wrapped in a stage 2.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 34 +
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 6c1f51f25eb3..40c274da5a7c 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -160,19 +160,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
return true;
 }
 
-static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+static inline bool __get_fault_info(u64 esr, struct kvm_vcpu_fault_info *fault)
 {
-   u8 ec;
-   u64 esr;
-   u64 hpfar, far;
-
-   esr = vcpu->arch.fault.esr_el2;
-   ec = ESR_ELx_EC(esr);
-
-   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-   return true;
-
-   far = read_sysreg_el2(SYS_FAR);
+   fault->far_el2 = read_sysreg_el2(SYS_FAR);
 
/*
 * The HPFAR can be invalid if the stage 2 fault did not
@@ -188,17 +178,29 @@ static inline bool __populate_fault_info(struct kvm_vcpu 
*vcpu)
if (!(esr & ESR_ELx_S1PTW) &&
(cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
 (esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) {
-   if (!__translate_far_to_hpfar(far, &hpfar))
+   if (!__translate_far_to_hpfar(fault->far_el2, &fault->hpfar_el2))
return false;
} else {
-   hpfar = read_sysreg(hpfar_el2);
+   fault->hpfar_el2 = read_sysreg(hpfar_el2);
}
 
-   vcpu->arch.fault.far_el2 = far;
-   vcpu->arch.fault.hpfar_el2 = hpfar;
return true;
 }
 
+static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+{
+   u8 ec;
+   u64 esr;
+
+   esr = vcpu->arch.fault.esr_el2;
+   ec = ESR_ELx_EC(esr);
+
+   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
+   return true;
+
+   return __get_fault_info(esr, &vcpu->arch.fault);
+}
+
 /* Check for an FPSIMD/SVE trap and handle as appropriate */
 static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
 {
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 30/36] KVM: arm64: Refactor the *_map_set_prot_attr() helpers

2021-03-15 Thread Quentin Perret
In order to ease their re-use in other code paths, refactor the
*_map_set_prot_attr() helpers to not depend on a map_data struct.
No functional change intended.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index bd44e84dedc4..a5347d78293f 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -324,8 +324,7 @@ struct hyp_map_data {
struct kvm_pgtable_mm_ops   *mm_ops;
 };
 
-static int hyp_map_set_prot_attr(enum kvm_pgtable_prot prot,
-struct hyp_map_data *data)
+static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
 {
bool device = prot & KVM_PGTABLE_PROT_DEVICE;
u32 mtype = device ? MT_DEVICE_nGnRE : MT_NORMAL;
@@ -350,7 +349,8 @@ static int hyp_map_set_prot_attr(enum kvm_pgtable_prot prot,
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
-   data->attr = attr;
+   *ptep = attr;
+
return 0;
 }
 
@@ -407,7 +407,7 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
.arg = &map_data,
};
 
-   ret = hyp_map_set_prot_attr(prot, &map_data);
+   ret = hyp_set_prot_attr(prot, &map_data.attr);
if (ret)
return ret;
 
@@ -500,8 +500,7 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
return vtcr;
 }
 
-static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
-   struct stage2_map_data *data)
+static int stage2_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
 {
bool device = prot & KVM_PGTABLE_PROT_DEVICE;
kvm_pte_t attr = device ? PAGE_S2_MEMATTR(DEVICE_nGnRE) :
@@ -521,7 +520,8 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
 
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
-   data->attr = attr;
+   *ptep = attr;
+
return 0;
 }
 
@@ -741,7 +741,7 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
.arg = &map_data,
};
 
-   ret = stage2_map_set_prot_attr(prot, &map_data);
+   ret = stage2_set_prot_attr(prot, &map_data.attr);
if (ret)
return ret;
 
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 34/36] KVM: arm64: Page-align the .hyp sections

2021-03-15 Thread Quentin Perret
We will soon unmap the .hyp sections from the host stage 2 in Protected
nVHE mode, which obviously works with at least page granularity, so make
sure to align them correctly.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kernel/vmlinux.lds.S | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index e96173ce211b..709d2c433c5e 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -15,9 +15,11 @@
 
 #define HYPERVISOR_DATA_SECTIONS   \
HYP_SECTION_NAME(.rodata) : {   \
+   . = ALIGN(PAGE_SIZE);   \
__hyp_rodata_start = .; \
*(HYP_SECTION_NAME(.data..ro_after_init))   \
*(HYP_SECTION_NAME(.rodata))\
+   . = ALIGN(PAGE_SIZE);   \
__hyp_rodata_end = .;   \
}
 
@@ -72,21 +74,14 @@ ENTRY(_text)
 jiffies = jiffies_64;
 
 #define HYPERVISOR_TEXT\
-   /*  \
-* Align to 4 KB so that\
-* a) the HYP vector table is at its minimum\
-*alignment of 2048 bytes   \
-* b) the HYP init code will not cross a page   \
-*boundary if its size does not exceed  \
-*4 KB (see related ASSERT() below) \
-*/ \
-   . = ALIGN(SZ_4K);   \
+   . = ALIGN(PAGE_SIZE);   \
__hyp_idmap_text_start = .; \
*(.hyp.idmap.text)  \
__hyp_idmap_text_end = .;   \
__hyp_text_start = .;   \
*(.hyp.text)\
HYPERVISOR_EXTABLE  \
+   . = ALIGN(PAGE_SIZE);   \
__hyp_text_end = .;
 
 #define IDMAP_TEXT \
@@ -322,11 +317,12 @@ SECTIONS
 #include "image-vars.h"
 
 /*
- * The HYP init code and ID map text can't be longer than a page each,
- * and should not cross a page boundary.
+ * The HYP init code and ID map text can't be longer than a page each. The
+ * former is page-aligned, but the latter may not be with 16K or 64K pages, so
+ * it should also not cross a page boundary.
  */
-ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
-   "HYP init code too big or misaligned")
+ASSERT(__hyp_idmap_text_end - __hyp_idmap_text_start <= PAGE_SIZE,
+   "HYP init code too big")
 ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
"ID map text too big or misaligned")
 #ifdef CONFIG_HIBERNATION
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 36/36] KVM: arm64: Protect the .hyp sections from the host

2021-03-15 Thread Quentin Perret
When KVM runs in nVHE protected mode, use the host stage 2 to unmap the
hypervisor sections by marking them as owned by the hypervisor itself.
The long-term goal is to ensure the EL2 code can remain robust
regardless of the host's state, so this starts by making sure the host
cannot e.g. write to the .hyp sections directly.
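
For instance, marking one further physical range as hypervisor-owned
would follow the same pattern as the sections handled below (sketch;
the __my_region_* symbols are purely hypothetical):

	ret = pkvm_mark_hyp(__pa_symbol(__my_region_start),
			    __pa_symbol(__my_region_end));
	if (ret)
		return ret;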

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h  |  1 +
 arch/arm64/kvm/arm.c  | 46 +++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  2 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  9 
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 33 +
 5 files changed, 91 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index b127af02bd45..d468c4b37190 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -62,6 +62,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping 17
 #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector18
 #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize 19
+#define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp  20
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 7e6a81079652..d6baf76d4747 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1894,11 +1894,57 @@ void _kvm_host_prot_finalize(void *discard)
WARN_ON(kvm_call_hyp_nvhe(__pkvm_prot_finalize));
 }
 
+static inline int pkvm_mark_hyp(phys_addr_t start, phys_addr_t end)
+{
+   return kvm_call_hyp_nvhe(__pkvm_mark_hyp, start, end);
+}
+
+#define pkvm_mark_hyp_section(__section)   \
+   pkvm_mark_hyp(__pa_symbol(__section##_start),   \
+   __pa_symbol(__section##_end))
+
 static int finalize_hyp_mode(void)
 {
+   int cpu, ret;
+
if (!is_protected_kvm_enabled())
return 0;
 
+   ret = pkvm_mark_hyp_section(__hyp_idmap_text);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp_section(__hyp_text);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp_section(__hyp_rodata);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp_section(__hyp_bss);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp(hyp_mem_base, hyp_mem_base + hyp_mem_size);
+   if (ret)
+   return ret;
+
+   for_each_possible_cpu(cpu) {
+   phys_addr_t start = virt_to_phys((void *)kvm_arm_hyp_percpu_base[cpu]);
+   phys_addr_t end = start + (PAGE_SIZE << nvhe_percpu_order());
+
+   ret = pkvm_mark_hyp(start, end);
+   if (ret)
+   return ret;
+
+   start = virt_to_phys((void *)per_cpu(kvm_arm_hyp_stack_page, cpu));
+   end = start + PAGE_SIZE;
+   ret = pkvm_mark_hyp(start, end);
+   if (ret)
+   return ret;
+   }
+
/*
 * Flip the static key upfront as that may no longer be possible
 * once the host stage 2 is installed.
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index d293cb328cc4..42d81ec739fa 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -21,6 +21,8 @@ struct host_kvm {
 extern struct host_kvm host_kvm;
 
 int __pkvm_prot_finalize(void);
+int __pkvm_mark_hyp(phys_addr_t start, phys_addr_t end);
+
 int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool);
 void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f47028d3fd0a..3df33d4de4a1 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -156,6 +156,14 @@ static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
 {
cpu_reg(host_ctxt, 1) = __pkvm_prot_finalize();
 }
+
+static void handle___pkvm_mark_hyp(struct kvm_cpu_context *host_ctxt)
+{
+   DECLARE_REG(phys_addr_t, start, host_ctxt, 1);
+   DECLARE_REG(phys_addr_t, end, host_ctxt, 2);
+
+   cpu_reg(host_ctxt, 1) = __pkvm_mark_hyp(start, end);
+}
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x) [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -180,6 +188,7 @@ static const hcall_t host_hcall[] = {
HANDLE_FUNC(__pkvm_create_mappings),
HANDLE_FUNC(__pkvm_create_private_mapping),
HANDLE_FUNC(__pkvm_prot_finalize),
+   HANDLE_FUNC(__pkvm_mark_hyp),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 5c88a325e6fc..dd03252b9574 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ 

[PATCH v5 32/36] KVM: arm64: Provide sanitized mmfr* registers at EL2

2021-03-15 Thread Quentin Perret
We will need to read sanitized values of mmfr{0,1}_el1 at EL2 soon, so
add them to the list of copied variables.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_cpufeature.h | 2 ++
 arch/arm64/kvm/sys_regs.c   | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
index efba1b89b8a4..48cba6cecd71 100644
--- a/arch/arm64/include/asm/kvm_cpufeature.h
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -15,3 +15,5 @@
 #endif
 
 KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
+KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_id_aa64mmfr0_el1);
+KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_id_aa64mmfr1_el1);
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 3ec34c25e877..dfb3b4f9ca84 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2784,6 +2784,8 @@ struct __ftr_reg_copy_entry {
struct arm64_ftr_reg*dst;
 } hyp_ftr_regs[] __initdata = {
CPU_FTR_REG_HYP_COPY(SYS_CTR_EL0, arm64_ftr_reg_ctrel0),
+   CPU_FTR_REG_HYP_COPY(SYS_ID_AA64MMFR0_EL1, arm64_ftr_reg_id_aa64mmfr0_el1),
+   CPU_FTR_REG_HYP_COPY(SYS_ID_AA64MMFR1_EL1, arm64_ftr_reg_id_aa64mmfr1_el1),
 };
 
 void __init setup_kvm_el2_caps(void)
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 31/36] KVM: arm64: Add kvm_pgtable_stage2_find_range()

2021-03-15 Thread Quentin Perret
Since the host stage 2 will be identity mapped, and since it will own
most of memory, it would be preferable for performance to try and use
large block mappings whenever possible. To ease this, introduce a new
helper in the KVM page-table code which allows searching for large
ranges of available IPA space. This will be used in the host memory
abort path to greedily idmap large portions of the PA space.
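
A caller in the host memory abort path would use it roughly as follows
(sketch: the initial range is assumed to already be clamped to the
faulting memblock, and host_stage2_idmap() is a hypothetical helper):

	struct kvm_mem_range range = { .start = 0, .end = ULONG_MAX };
	int ret;

	ret = kvm_pgtable_stage2_find_range(pgt, fault_ipa, prot, &range);
	if (!ret)
		/* [range.start, range.end) can be idmapped in one go */
		ret = host_stage2_idmap(range.start, range.end, prot);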

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 29 +
 arch/arm64/kvm/hyp/pgtable.c | 89 ++--
 2 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 683e96abdc24..b93a2a3526ab 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -94,6 +94,16 @@ enum kvm_pgtable_prot {
 #define PAGE_HYP_RO(KVM_PGTABLE_PROT_R)
 #define PAGE_HYP_DEVICE(PAGE_HYP | KVM_PGTABLE_PROT_DEVICE)
 
+/**
+ * struct kvm_mem_range - Range of Intermediate Physical Addresses
+ * @start: Start of the range.
+ * @end:   End of the range.
+ */
+struct kvm_mem_range {
+   u64 start;
+   u64 end;
+};
+
 /**
  * enum kvm_pgtable_walk_flags - Flags to control a depth-first page-table 
walk.
  * @KVM_PGTABLE_WALK_LEAF: Visit leaf entries, including invalid
@@ -398,4 +408,23 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size);
 int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size,
 struct kvm_pgtable_walker *walker);
 
+/**
+ * kvm_pgtable_stage2_find_range() - Find a range of Intermediate Physical
+ *  Addresses with compatible permission
+ *  attributes.
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @addr:  Address that must be covered by the range.
+ * @prot:  Protection attributes that the range must be compatible with.
+ * @range: Range structure used to limit the search space at call time and
+ * that will hold the result.
+ *
+ * The offset of @addr within a page is ignored. An IPA is compatible with @prot
+ * iff its corresponding stage-2 page-table entry has default ownership and, if
+ * valid, is mapped with protection attributes identical to @prot.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kvm_pgtable_stage2_find_range(struct kvm_pgtable *pgt, u64 addr,
+ enum kvm_pgtable_prot prot,
+ struct kvm_mem_range *range);
 #endif /* __ARM64_KVM_PGTABLE_H__ */
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index a5347d78293f..3a971df278bd 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -48,6 +48,8 @@
 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
 KVM_PTE_LEAF_ATTR_HI_S2_XN)
 
+#define KVM_PTE_LEAF_ATTR_S2_IGNORED   GENMASK(58, 55)
+
 #define KVM_INVALID_PTE_OWNER_MASK GENMASK(63, 56)
 #define KVM_MAX_OWNER_ID   1
 
@@ -77,15 +79,20 @@ static bool kvm_phys_is_valid(u64 phys)
return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_PARANGE_MAX));
 }
 
-static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
+static bool kvm_level_supports_block_mapping(u32 level)
 {
-   u64 granule = kvm_granule_size(level);
-
/*
 * Reject invalid block mappings and don't bother with 4TB mappings for
 * 52-bit PAs.
 */
-   if (level == 0 || (PAGE_SIZE != SZ_4K && level == 1))
+   return !(level == 0 || (PAGE_SIZE != SZ_4K && level == 1));
+}
+
+static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
+{
+   u64 granule = kvm_granule_size(level);
+
+   if (!kvm_level_supports_block_mapping(level))
return false;
 
if (granule > (end - addr))
@@ -1053,3 +1060,77 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
pgt->mm_ops->free_pages_exact(pgt->pgd, pgd_sz);
pgt->pgd = NULL;
 }
+
+#define KVM_PTE_LEAF_S2_COMPAT_MASK (KVM_PTE_LEAF_ATTR_S2_PERMS | \
+KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | \
+KVM_PTE_LEAF_ATTR_S2_IGNORED)
+
+static int stage2_check_permission_walker(u64 addr, u64 end, u32 level,
+ kvm_pte_t *ptep,
+ enum kvm_pgtable_walk_flags flag,
+ void * const arg)
+{
+   kvm_pte_t old_attr, pte = *ptep, *new_attr = arg;
+
+   /*
+* Compatible mappings are either invalid and owned by the page-table
+* owner (whose id is 0), or valid with matching permission attributes.
+*/
+   if (kvm_pte_valid(pte)) 

[PATCH v5 15/36] KVM: arm64: Factor out vector address calculation

2021-03-15 Thread Quentin Perret
In order to re-map the guest vectors at EL2 when pKVM is enabled,
refactor __kvm_vector_slot2idx() and kvm_init_vector_slot() to move all
the address calculation logic into a static inline function.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 8 
 arch/arm64/kvm/arm.c | 9 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 90873851f677..5c42ec023cc7 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -168,6 +168,14 @@ phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 
+static inline void *__kvm_vector_slot2addr(void *base,
+  enum arm64_hyp_spectre_vector slot)
+{
+   int idx = slot - (slot != HYP_VECTOR_DIRECT);
+
+   return base + (idx * SZ_2K);
+}
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)   __flush_dcache_area((a), (l))
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3f8bcf8db036..26e573cdede3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1345,16 +1345,9 @@ static unsigned long nvhe_percpu_order(void)
 /* A lookup table holding the hypervisor VA for each vector slot */
 static void *hyp_spectre_vector_selector[BP_HARDEN_EL2_SLOTS];
 
-static int __kvm_vector_slot2idx(enum arm64_hyp_spectre_vector slot)
-{
-   return slot - (slot != HYP_VECTOR_DIRECT);
-}
-
 static void kvm_init_vector_slot(void *base, enum arm64_hyp_spectre_vector slot)
 {
-   int idx = __kvm_vector_slot2idx(slot);
-
-   hyp_spectre_vector_selector[slot] = base + (idx * SZ_2K);
+   hyp_spectre_vector_selector[slot] = __kvm_vector_slot2addr(base, slot);
 }
 
 static int kvm_init_vector_slots(void)
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 16/36] arm64: asm: Provide set_sctlr_el2 macro

2021-03-15 Thread Quentin Perret
We will soon need to turn the EL2 stage 1 MMU on and off in nVHE
protected mode, so refactor the set_sctlr_el1 macro to make it usable
for that purpose.

Acked-by: Will Deacon 
Suggested-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/assembler.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index ca31594d3d6c..fb651c1f26e9 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -676,11 +676,11 @@ USER(\label, ic	ivau, \tmp2)		// invalidate I line PoU
.endm
 
 /*
- * Set SCTLR_EL1 to the passed value, and invalidate the local icache
+ * Set SCTLR_ELx to the @reg value, and invalidate the local icache
  * in the process. This is called when setting the MMU on.
  */
-.macro set_sctlr_el1, reg
-   msr sctlr_el1, \reg
+.macro set_sctlr, sreg, reg
+   msr \sreg, \reg
isb
/*
 * Invalidate the local I-cache so that any instructions fetched
@@ -692,6 +692,14 @@ USER(\label, ic	ivau, \tmp2)		// invalidate I line PoU
isb
 .endm
 
+.macro set_sctlr_el1, reg
+   set_sctlr sctlr_el1, \reg
+.endm
+
+.macro set_sctlr_el2, reg
+   set_sctlr sctlr_el2, \reg
+.endm
+
 /*
  * Check whether to yield to another runnable task from kernel mode NEON code
  * (which runs with preemption disabled).
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 13/36] KVM: arm64: Enable access to sanitized CPU features at EL2

2021-03-15 Thread Quentin Perret
Introduce infrastructure in KVM to copy CPU feature registers into
EL2-owned data structures, allowing sanitised values to be read
directly at EL2 in nVHE.

Given that only a subset of these features are being read by the
hypervisor, the ones that need to be copied are to be listed under
<asm/kvm_cpufeature.h> together with the name of the nVHE variable that
will hold the copy. This introduces only the infrastructure enabling
this copy. The first users will follow shortly.
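
Concretely, exposing a register then takes two lines (this is exactly
what later patches in the series do for CTR_EL0):

	/* in <asm/kvm_cpufeature.h> */
	KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);

	/* in the hyp_ftr_regs[] table below */
	CPU_FTR_REG_HYP_COPY(SYS_CTR_EL0, arm64_ftr_reg_ctrel0),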

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/cpufeature.h |  1 +
 arch/arm64/include/asm/kvm_cpufeature.h | 15 +++
 arch/arm64/include/asm/kvm_host.h   |  4 
 arch/arm64/kernel/cpufeature.c  | 13 +
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c   |  7 +++
 arch/arm64/kvm/sys_regs.c   | 19 +++
 6 files changed, 59 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_cpufeature.h

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 61177bac49fa..a85cea2cac57 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -607,6 +607,7 @@ void check_local_cpu_capabilities(void);
 
 u64 read_sanitised_ftr_reg(u32 id);
 u64 __read_sysreg_by_encoding(u32 sys_id);
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst);
 
 static inline bool cpu_supports_mixed_endian_el0(void)
 {
diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
new file mode 100644
index ..3fd9f60d2180
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 - Google LLC
+ * Author: Quentin Perret 
+ */
+
+#include <asm/cpufeature.h>
+
+#ifndef KVM_HYP_CPU_FTR_REG
+#if defined(__KVM_NVHE_HYPERVISOR__)
+#define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg name
+#else
+#define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg kvm_nvhe_sym(name)
+#endif
+#endif
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 06ca4828005f..459ee557f87c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -751,9 +751,13 @@ void kvm_clr_pmu_events(u32 clr);
 
 void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
 void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
+
+void setup_kvm_el2_caps(void);
 #else
 static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u32 clr) {}
+
+static inline void setup_kvm_el2_caps(void) {}
 #endif
 
 void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 066030717a4c..6252476e4e73 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1154,6 +1154,18 @@ u64 read_sanitised_ftr_reg(u32 id)
 }
 EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg);
 
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst)
+{
+   struct arm64_ftr_reg *regp = get_arm64_ftr_reg(id);
+
+   if (!regp)
+   return -EINVAL;
+
+   *dst = *regp;
+
+   return 0;
+}
+
 #define read_sysreg_case(r)\
case r: val = read_sysreg_s(r); break;
 
@@ -2773,6 +2785,7 @@ void __init setup_cpu_features(void)
 
setup_system_capabilities();
setup_elf_hwcaps(arm64_elf_hwcaps);
+   setup_kvm_el2_caps();
 
if (system_supports_32bit_el0())
setup_elf_hwcaps(compat_elf_hwcaps);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
index 879559057dee..cc829b9db0da 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
@@ -38,3 +38,10 @@ unsigned long __hyp_per_cpu_offset(unsigned int cpu)
elf_base = (unsigned long)&__per_cpu_start;
return this_cpu_base - elf_base;
 }
+
+/*
+ * Define the CPU feature registers variables that will hold the copies of
+ * the host's sanitized values.
+ */
+#define KVM_HYP_CPU_FTR_REG(name) struct arm64_ftr_reg name
+#include <asm/kvm_cpufeature.h>
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 4f2f1e3145de..6c5d133689ae 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -21,6 +21,7 @@
 #include <asm/debug-monitors.h>
 #include <asm/esr.h>
 #include <asm/kvm_arm.h>
+#include <asm/kvm_cpufeature.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
 #include <asm/kvm_mmu.h>
@@ -2775,3 +2776,21 @@ void kvm_sys_reg_table_init(void)
/* Clear all higher bits. */
cache_levels &= (1 << (i*3))-1;
 }
+
+#define CPU_FTR_REG_HYP_COPY(id, name) \
+   { .sys_id = id, .dst = (struct arm64_ftr_reg *)&kvm_nvhe_sym(name) }
+struct __ftr_reg_copy_entry {
+   u32 sys_id;
+   struct arm64_ftr_reg*dst;
+} hyp_ftr_regs[] __initdata = {
+};
+
+void __init setup_kvm_el2_caps(void)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(hyp_ftr_regs); i++) {
+   WARN(copy_ftr_reg(hyp_ftr_regs[i].sys_id, hyp_ftr_regs[i].dst),
+

[PATCH v5 33/36] KVM: arm64: Wrap the host with a stage 2

2021-03-15 Thread Quentin Perret
When KVM runs in protected nVHE mode, make use of a stage 2 page-table
to give the hypervisor some control over the host memory accesses. The
host stage 2 is created lazily using large block mappings if possible,
and will default to page mappings in the absence of a better solution.

From this point on, memory accesses from the host to protected memory
regions (e.g. not 'owned' by the host) are fatal and lead to hyp_panic().
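
In other words, on a host stage-2 fault the handler tries to install
the largest identity block mapping compatible with the faulting PA
(using kvm_pgtable_stage2_find_range(), introduced earlier in the
series), and falls back to a single-page mapping when no larger range
qualifies.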

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h  |   1 +
 arch/arm64/kernel/image-vars.h|   3 +
 arch/arm64/kvm/arm.c  |  10 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  34 +++
 arch/arm64/kvm/hyp/nvhe/Makefile  |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S|   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  11 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 246 ++
 arch/arm64/kvm/hyp/nvhe/setup.c   |   5 +
 arch/arm64/kvm/hyp/nvhe/switch.c  |   7 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c |   4 +-
 11 files changed, 317 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mem_protect.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 6dce860f8bca..b127af02bd45 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -61,6 +61,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_create_mappings   16
 #define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping 17
 #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector18
+#define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize 19
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 940c378fa837..d5dc2b792651 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -131,6 +131,9 @@ KVM_NVHE_ALIAS(__hyp_bss_end);
 KVM_NVHE_ALIAS(__hyp_rodata_start);
 KVM_NVHE_ALIAS(__hyp_rodata_end);
 
+/* pKVM static key */
+KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d474eec606a3..7e6a81079652 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1889,12 +1889,22 @@ static int init_hyp_mode(void)
return err;
 }
 
+void _kvm_host_prot_finalize(void *discard)
+{
+   WARN_ON(kvm_call_hyp_nvhe(__pkvm_prot_finalize));
+}
+
 static int finalize_hyp_mode(void)
 {
if (!is_protected_kvm_enabled())
return 0;
 
+   /*
+* Flip the static key upfront as that may no longer be possible
+* once the host stage 2 is installed.
+*/
static_branch_enable(&kvm_protected_mode_initialized);
+   on_each_cpu(_kvm_host_prot_finalize, NULL, 1);
 
return 0;
 }
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
new file mode 100644
index ..d293cb328cc4
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#ifndef __KVM_NVHE_MEM_PROTECT__
+#define __KVM_NVHE_MEM_PROTECT__
+#include <linux/kvm_host.h>
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_pgtable.h>
+#include <asm/virt.h>
+#include <nvhe/spinlock.h>
+
+struct host_kvm {
+   struct kvm_arch arch;
+   struct kvm_pgtable pgt;
+   struct kvm_pgtable_mm_ops mm_ops;
+   hyp_spinlock_t lock;
+};
+extern struct host_kvm host_kvm;
+
+int __pkvm_prot_finalize(void);
+int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool);
+void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
+
+static __always_inline void __load_host_stage2(void)
+{
+   if (static_branch_likely(&kvm_protected_mode_initialized))
+   __load_stage2(&host_kvm.arch.mmu, host_kvm.arch.vtcr);
+   else
+   write_sysreg(0, vttbr_el2);
+}
+#endif /* __KVM_NVHE_MEM_PROTECT__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b334354b8dd0..f55201a7ff33 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-cache.o setup.o mm.o
+cache.o setup.o mm.o mem_protect.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index a50ad9e9fc05..c164045af238 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S

[PATCH v5 08/36] KVM: arm64: Make kvm_call_hyp() a function call at Hyp

2021-03-15 Thread Quentin Perret
kvm_call_hyp() has some logic to issue a function call or a hypercall
depending on the EL at which the kernel is running. However, all the
code compiled under __KVM_NVHE_HYPERVISOR__ is guaranteed to only run
at EL2, which allows us to simplify.

Add ifdefery to kvm_host.h to simplify kvm_call_hyp() in .hyp.text.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_host.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3d10e6527f7d..06ca4828005f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -591,6 +591,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 void kvm_arm_halt_guest(struct kvm *kvm);
 void kvm_arm_resume_guest(struct kvm *kvm);
 
+#ifndef __KVM_NVHE_HYPERVISOR__
 #define kvm_call_hyp_nvhe(f, ...)  \
({  \
struct arm_smccc_res res;   \
@@ -630,6 +631,11 @@ void kvm_arm_resume_guest(struct kvm *kvm);
\
ret;\
})
+#else /* __KVM_NVHE_HYPERVISOR__ */
+#define kvm_call_hyp(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_ret(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_nvhe(f, ...) f(__VA_ARGS__)
+#endif /* __KVM_NVHE_HYPERVISOR__ */
 
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 11/36] KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp

2021-03-15 Thread Quentin Perret
In order to use the kernel list library at EL2, introduce stubs for the
CONFIG_DEBUG_LIST out-of-line calls.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/nvhe/Makefile |  2 +-
 arch/arm64/kvm/hyp/nvhe/stub.c   | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/nvhe/stub.c

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 24ff99e2eac5..144da72ad510 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -13,7 +13,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/stub.c b/arch/arm64/kvm/hyp/nvhe/stub.c
new file mode 100644
index ..c0aa6bbfd79d
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/stub.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Stubs for out-of-line function calls caused by re-using kernel
+ * infrastructure at EL2.
+ *
+ * Copyright (C) 2020 - Google LLC
+ */
+
+#include <linux/list.h>
+
+#ifdef CONFIG_DEBUG_LIST
+bool __list_add_valid(struct list_head *new, struct list_head *prev,
+ struct list_head *next)
+{
+   return true;
+}
+
+bool __list_del_entry_valid(struct list_head *entry)
+{
+   return true;
+}
+#endif
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 06/36] KVM: arm64: Factor memory allocation out of pgtable.c

2021-03-15 Thread Quentin Perret
In preparation for enabling the creation of page-tables at EL2, factor
all memory allocation out of the page-table code, hence making it
re-usable with any compatible memory allocator.

No functional changes intended.
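
For instance, a host-side instance of these ops backed by the kernel's
page allocator could look like this (illustrative sketch only; the
struct fields are the new API, the helper names are hypothetical):

	static void *host_zalloc_page(void *arg)
	{
		return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
	}

	static void *host_va(phys_addr_t phys)
	{
		return __va(phys);
	}

	static phys_addr_t host_pa(void *addr)
	{
		return __pa(addr);
	}

	static struct kvm_pgtable_mm_ops host_mm_ops = {
		.zalloc_page	= host_zalloc_page,
		.phys_to_virt	= host_va,
		.virt_to_phys	= host_pa,
	};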

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 41 +++-
 arch/arm64/kvm/hyp/pgtable.c | 98 +---
 arch/arm64/kvm/mmu.c | 66 ++-
 3 files changed, 163 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 8886d43cfb11..bbe840e430cb 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -13,17 +13,50 @@
 
 typedef u64 kvm_pte_t;
 
+/**
+ * struct kvm_pgtable_mm_ops - Memory management callbacks.
+ * @zalloc_page:   Allocate a single zeroed memory page. The @arg parameter
+ * can be used by the walker to pass a memcache. The
+ * initial refcount of the page is 1.
+ * @zalloc_pages_exact: Allocate an exact number of zeroed memory pages. The
+ * @size parameter is in bytes, and is rounded-up to the
+ * next page boundary. The resulting allocation is
+ * physically contiguous.
+ * @free_pages_exact:  Free an exact number of memory pages previously
+ * allocated by zalloc_pages_exact.
+ * @get_page:  Increment the refcount on a page.
+ * @put_page:  Decrement the refcount on a page. When the refcount
+ * reaches 0 the page is automatically freed.
+ * @page_count:Return the refcount of a page.
+ * @phys_to_virt:  Convert a physical address into a virtual address mapped
+ * in the current context.
+ * @virt_to_phys:  Convert a virtual address mapped in the current context
+ * into a physical address.
+ */
+struct kvm_pgtable_mm_ops {
+   void*   (*zalloc_page)(void *arg);
+   void*   (*zalloc_pages_exact)(size_t size);
+   void(*free_pages_exact)(void *addr, size_t size);
+   void(*get_page)(void *addr);
+   void(*put_page)(void *addr);
+   int (*page_count)(void *addr);
+   void*   (*phys_to_virt)(phys_addr_t phys);
+   phys_addr_t (*virt_to_phys)(void *addr);
+};
+
 /**
  * struct kvm_pgtable - KVM page-table.
  * @ia_bits:   Maximum input address size, in bits.
  * @start_level:   Level at which the page-table walk starts.
  * @pgd:   Pointer to the first top-level entry of the page-table.
+ * @mm_ops:Memory management callbacks.
  * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
  */
 struct kvm_pgtable {
u32 ia_bits;
u32 start_level;
kvm_pte_t   *pgd;
+   struct kvm_pgtable_mm_ops   *mm_ops;
 
/* Stage-2 only */
struct kvm_s2_mmu   *mmu;
@@ -86,10 +119,12 @@ struct kvm_pgtable_walker {
  * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @va_bits:   Maximum virtual address bits.
+ * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits);
+int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
+struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
  * kvm_pgtable_hyp_destroy() - Destroy an unused hypervisor stage-1 page-table.
@@ -126,10 +161,12 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @kvm:   KVM structure representing the guest virtual machine.
+ * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm);
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+   struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
  * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 81fe032f34d1..b975a67d1f85 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -152,9 +152,9 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
return pte;
 }
 
-static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte)
+static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops *mm_ops)
 {
-   return __va(kvm_pte_to_phys(pte));
+   

[PATCH v5 03/36] arm64: kvm: Add standalone ticket spinlock implementation for use at hyp

2021-03-15 Thread Quentin Perret
From: Will Deacon 

We will soon need to synchronise multiple CPUs in the hyp text at EL2.
The qspinlock-based locking used by the host is overkill for this purpose
and relies on the kernel's "percpu" implementation for the MCS nodes.

Implement a simple ticket locking scheme based heavily on the code removed
by commit c11090474d70 ("arm64: locking: Replace ticket lock implementation
with qspinlock").

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h | 92 ++
 1 file changed, 92 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/spinlock.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/spinlock.h b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
new file mode 100644
index ..76b537f8d1c6
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A stand-alone ticket spinlock implementation for use by the non-VHE
+ * KVM hypervisor code running at EL2.
+ *
+ * Copyright (C) 2020 Google LLC
+ * Author: Will Deacon 
+ *
+ * Heavily based on the implementation removed by c11090474d70 which was:
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __ARM64_KVM_NVHE_SPINLOCK_H__
+#define __ARM64_KVM_NVHE_SPINLOCK_H__
+
+#include <asm/alternative.h>
+#include <asm/lse.h>
+
+typedef union hyp_spinlock {
+   u32 __val;
+   struct {
+#ifdef __AARCH64EB__
+   u16 next, owner;
+#else
+   u16 owner, next;
+#endif
+   };
+} hyp_spinlock_t;
+
+#define hyp_spin_lock_init(l)  \
+do {   \
+   *(l) = (hyp_spinlock_t){ .__val = 0 };  \
+} while (0)
+
+static inline void hyp_spin_lock(hyp_spinlock_t *lock)
+{
+   u32 tmp;
+   hyp_spinlock_t lockval, newval;
+
+   asm volatile(
+   /* Atomically increment the next ticket. */
+   ARM64_LSE_ATOMIC_INSN(
+   /* LL/SC */
+"  prfm	pstl1strm, %3\n"
+"1:ldaxr   %w0, %3\n"
+"  add %w1, %w0, #(1 << 16)\n"
+"  stxr%w2, %w1, %3\n"
+"  cbnz%w2, 1b\n",
+   /* LSE atomics */
+"  mov %w2, #(1 << 16)\n"
+"  ldadda  %w2, %w0, %3\n"
+   __nops(3))
+
+   /* Did we get the lock? */
+"  eor %w1, %w0, %w0, ror #16\n"
+"  cbz %w1, 3f\n"
+   /*
+* No: spin on the owner. Send a local event to avoid missing an
+* unlock before the exclusive load.
+*/
+"  sevl\n"
+"2:wfe\n"
+"  ldaxrh  %w2, %4\n"
+"  eor %w1, %w2, %w0, lsr #16\n"
+"  cbnz%w1, 2b\n"
+   /* We got the lock. Critical section starts here. */
+"3:"
+   : "=&r" (lockval), "=&r" (newval), "=&r" (tmp), "+Q" (*lock)
+   : "Q" (lock->owner)
+   : "memory");
+}
+
+static inline void hyp_spin_unlock(hyp_spinlock_t *lock)
+{
+   u64 tmp;
+
+   asm volatile(
+   ARM64_LSE_ATOMIC_INSN(
+   /* LL/SC */
+   "   ldrh	%w1, %0\n"
+   "   add %w1, %w1, #1\n"
+   "   stlrh   %w1, %0",
+   /* LSE atomics */
+   "   mov %w1, #1\n"
+   "   staddlh %w1, %0\n"
+   __nops(1))
+   : "=Q" (lock->owner), "=&r" (tmp)
+   :
+   : "memory");
+}
+
+#endif /* __ARM64_KVM_NVHE_SPINLOCK_H__ */
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 01/36] arm64: lib: Annotate {clear, copy}_page() as position-independent

2021-03-15 Thread Quentin Perret
From: Will Deacon 

clear_page() and copy_page() are suitable for use outside of the kernel
address space, so annotate them as position-independent code.

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/lib/clear_page.S | 4 ++--
 arch/arm64/lib/copy_page.S  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
index 073acbf02a7c..b84b179edba3 100644
--- a/arch/arm64/lib/clear_page.S
+++ b/arch/arm64/lib/clear_page.S
@@ -14,7 +14,7 @@
  * Parameters:
  * x0 - dest
  */
-SYM_FUNC_START(clear_page)
+SYM_FUNC_START_PI(clear_page)
mrs x1, dczid_el0
and w1, w1, #0xf
mov x2, #4
@@ -25,5 +25,5 @@ SYM_FUNC_START(clear_page)
tst x0, #(PAGE_SIZE - 1)
b.ne1b
ret
-SYM_FUNC_END(clear_page)
+SYM_FUNC_END_PI(clear_page)
 EXPORT_SYMBOL(clear_page)
diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index e7a793961408..29144f4cd449 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -17,7 +17,7 @@
  * x0 - dest
  * x1 - src
  */
-SYM_FUNC_START(copy_page)
+SYM_FUNC_START_PI(copy_page)
 alternative_if ARM64_HAS_NO_HW_PREFETCH
// Prefetch three cache lines ahead.
prfmpldl1strm, [x1, #128]
@@ -75,5 +75,5 @@ alternative_else_nop_endif
stnpx16, x17, [x0, #112 - 256]
 
ret
-SYM_FUNC_END(copy_page)
+SYM_FUNC_END_PI(copy_page)
 EXPORT_SYMBOL(copy_page)
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 05/36] KVM: arm64: Avoid free_page() in page-table allocator

2021-03-15 Thread Quentin Perret
Currently, the KVM page-table allocator uses a mix of put_page() and
free_page() calls depending on the context even though page-allocation
is always achieved using variants of __get_free_page().

Make the code consistent by using put_page() throughout, and reduce the
memory management API surface used by the page-table code. This will
ease factoring out page-allocation from pgtable.c, which is a
pre-requisite to creating page-tables at EL2.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 4d177ce1d536..81fe032f34d1 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -413,7 +413,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits)
 static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
   enum kvm_pgtable_walk_flags flag, void * const arg)
 {
-   free_page((unsigned long)kvm_pte_follow(*ptep));
+   put_page(virt_to_page(kvm_pte_follow(*ptep)));
return 0;
 }
 
@@ -425,7 +425,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
};
 
WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), &walker));
-   free_page((unsigned long)pgt->pgd);
+   put_page(virt_to_page(pgt->pgd));
pgt->pgd = NULL;
 }
 
@@ -577,7 +577,7 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
if (!data->anchor)
return 0;
 
-   free_page((unsigned long)kvm_pte_follow(*ptep));
+   put_page(virt_to_page(kvm_pte_follow(*ptep)));
put_page(virt_to_page(ptep));
 
if (data->anchor == ptep) {
@@ -700,7 +700,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
}
 
if (childp)
-   free_page((unsigned long)childp);
+   put_page(virt_to_page(childp));
 
return 0;
 }
@@ -897,7 +897,7 @@ static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
put_page(virt_to_page(ptep));
 
if (kvm_pte_table(pte, level))
-   free_page((unsigned long)kvm_pte_follow(pte));
+   put_page(virt_to_page(kvm_pte_follow(pte)));
 
return 0;
 }
-- 
2.31.0.rc2.261.g7f71774620-goog



[PATCH v5 27/36] KVM: arm64: Sort the hypervisor memblocks

2021-03-15 Thread Quentin Perret
We will soon need to check if a physical address belongs to a memblock
at EL2, so sort the regions up front to make this check efficient.
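
With the regions sorted by base address, such a check can be a simple
binary search (sketch; the helper name is hypothetical, and EL2 would
operate on its own copy of the array):

	static bool pa_is_memory(phys_addr_t pa)
	{
		int lo = 0, hi = hyp_memblock_nr - 1;

		while (lo <= hi) {
			int mid = lo + (hi - lo) / 2;
			struct memblock_region *r = &hyp_memory[mid];

			if (pa < r->base)
				hi = mid - 1;
			else if (pa >= r->base + r->size)
				lo = mid + 1;
			else
				return true;	/* pa falls inside r */
		}

		return false;
	}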

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/reserved_mem.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/kvm/hyp/reserved_mem.c b/arch/arm64/kvm/hyp/reserved_mem.c
index fd42705a3c26..83ca23ac259b 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -6,6 +6,7 @@
 
 #include <linux/kvm_host.h>
 #include <linux/memblock.h>
+#include <linux/sort.h>
 
 #include <asm/kvm_host.h>
 
static unsigned int *hyp_memblock_nr_ptr = &kvm_nvhe_sym(hyp_memblock_nr);
 phys_addr_t hyp_mem_base;
 phys_addr_t hyp_mem_size;
 
+static int cmp_hyp_memblock(const void *p1, const void *p2)
+{
+   const struct memblock_region *r1 = p1;
+   const struct memblock_region *r2 = p2;
+
+   return r1->base < r2->base ? -1 : (r1->base > r2->base);
+}
+
+static void __init sort_memblock_regions(void)
+{
+   sort(hyp_memory,
+*hyp_memblock_nr_ptr,
+sizeof(struct memblock_region),
+cmp_hyp_memblock,
+NULL);
+}
+
 static int __init register_memblock_regions(void)
 {
struct memblock_region *reg;
@@ -29,6 +47,7 @@ static int __init register_memblock_regions(void)
hyp_memory[*hyp_memblock_nr_ptr] = *reg;
(*hyp_memblock_nr_ptr)++;
}
+   sort_memblock_regions();
 
return 0;
 }
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v5 23/36] KVM: arm64: Refactor __load_guest_stage2()

2021-03-15 Thread Quentin Perret
Refactor __load_guest_stage2() to introduce __load_stage2() which will
be re-used when loading the host stage 2.
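
To make the intended reuse concrete, the two eventual call sites would
look roughly like this (the host-side line is an assumption about a
later patch in the series; only the guest wrapper exists after this
one):

	/* Guest path, introduced by this patch: */
	__load_stage2(mmu, kern_hyp_va(mmu->arch)->vtcr);

	/* Host path (hypothetical at this point in the series): */
	__load_stage2(host_mmu, host_vtcr);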

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6f743e20cb06..9d64fa73ee67 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -270,9 +270,9 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu)
  * Must be called from hyp code running at EL2 with an updated VTTBR
  * and interrupts disabled.
  */
-static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
+static __always_inline void __load_stage2(struct kvm_s2_mmu *mmu, unsigned long vtcr)
 {
-   write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
+   write_sysreg(vtcr, vtcr_el2);
write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
/*
@@ -283,6 +283,11 @@ static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
+static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
+{
+   __load_stage2(mmu, kern_hyp_va(mmu->arch)->vtcr);
+}
+
 static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
 {
return container_of(mmu->arch, struct kvm, arch);
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v5 02/36] KVM: arm64: Link position-independent string routines into .hyp.text

2021-03-15 Thread Quentin Perret
From: Will Deacon 

Pull clear_page(), copy_page(), memcpy() and memset() into the nVHE hyp
code and ensure that we always execute the '__pi_' entry point on the
off chance that it changes in future.

[ qperret: Commit title nits and added linker script alias ]
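
To make the alias mechanism concrete, one of the new lines expands, at
link time, to the following (assuming kvm_nvhe_sym() keeps prefixing
symbols with __kvm_nvhe_, as defined earlier in hyp_image.h):

	/* KVM_NVHE_ALIAS_HYP(memcpy, __pi_memcpy) becomes: */
	__kvm_nvhe_memcpy = __kvm_nvhe___pi_memcpy;

so a memcpy() call from hyp code resolves to the position-independent
__pi_memcpy entry point linked in by this patch.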

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/hyp_image.h |  3 +++
 arch/arm64/kernel/image-vars.h | 11 +++
 arch/arm64/kvm/hyp/nvhe/Makefile   |  4 
 3 files changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h b/arch/arm64/include/asm/hyp_image.h
index 737ded6b6d0d..78cd77990c9c 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -56,6 +56,9 @@
  */
 #define KVM_NVHE_ALIAS(sym)kvm_nvhe_sym(sym) = sym;
 
+/* Defines a linker script alias for KVM nVHE hyp symbols */
+#define KVM_NVHE_ALIAS_HYP(first, sec) kvm_nvhe_sym(first) = kvm_nvhe_sym(sec);
+
 #endif /* LINKER_SCRIPT */
 
 #endif /* __ARM64_HYP_IMAGE_H__ */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 5aa9ed1e9ec6..4eb7a15c8b60 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -104,6 +104,17 @@ KVM_NVHE_ALIAS(kvm_arm_hyp_percpu_base);
 /* PMU available static key */
 KVM_NVHE_ALIAS(kvm_arm_pmu_available);
 
+/* Position-independent library routines */
+KVM_NVHE_ALIAS_HYP(clear_page, __pi_clear_page);
+KVM_NVHE_ALIAS_HYP(copy_page, __pi_copy_page);
+KVM_NVHE_ALIAS_HYP(memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(memset, __pi_memset);
+
+#ifdef CONFIG_KASAN
+KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(__memset, __pi_memset);
+#endif
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index a6707df4f6c0..bc98f8e3d1da 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -9,10 +9,14 @@ ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
 hostprogs := gen-hyprel
 HOST_EXTRACFLAGS += -I$(objtree)/include
 
+lib-objs := clear_page.o copy_page.o memcpy.o memset.o
+lib-objs := $(addprefix ../../../lib/, $(lib-objs))
+
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 hyp-main.o hyp-smp.o psci-relay.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
+obj-y += $(lib-objs)
 
 ##
 ## Build rules for compiling nVHE hyp code
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v5 04/36] KVM: arm64: Initialize kvm_nvhe_init_params early

2021-03-15 Thread Quentin Perret
Move the initialization of kvm_nvhe_init_params into a dedicated function
that is run early, and only once during KVM init, rather than every time
the KVM vectors are set and reset.

This also gives the hypervisor the opportunity to change the init structs
during boot, hence simplifying the replacement of the host-provided
page-table with the one the hypervisor will create for itself.
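
As a sketch, the call flow after this patch becomes:

	/*
	 * init_hyp_mode()                 - boot time, once
	 *   for_each_possible_cpu(cpu)
	 *     cpu_prepare_hyp_mode(cpu);  - fill and flush the per-CPU
	 *                                   kvm_nvhe_init_params
	 *
	 * cpu_init_hyp_mode()             - each time the vectors are set
	 *   __hyp_set_vectors(kvm_get_idmap_vector());
	 *   arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init),
	 *                     virt_to_phys(params), &res);
	 */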

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/arm.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fc4c95dd2d26..2d1e7ef69c04 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1383,22 +1383,18 @@ static int kvm_init_vector_slots(void)
return 0;
 }
 
-static void cpu_init_hyp_mode(void)
+static void cpu_prepare_hyp_mode(int cpu)
 {
-   struct kvm_nvhe_init_params *params = this_cpu_ptr_nvhe_sym(kvm_init_params);
-   struct arm_smccc_res res;
+   struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
unsigned long tcr;
 
-   /* Switch from the HYP stub to our own HYP init vector */
-   __hyp_set_vectors(kvm_get_idmap_vector());
-
/*
 * Calculate the raw per-cpu offset without a translation from the
 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 * so that we can use adr_l to access per-cpu variables in EL2.
 * Also drop the KASAN tag which gets in the way...
 */
-   params->tpidr_el2 = (unsigned long)kasan_reset_tag(this_cpu_ptr_nvhe_sym(__per_cpu_start)) -
+   params->tpidr_el2 = (unsigned long)kasan_reset_tag(per_cpu_ptr_nvhe_sym(__per_cpu_start, cpu)) -
			    (unsigned long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));
 
params->mair_el2 = read_sysreg(mair_el1);
@@ -1422,7 +1418,7 @@ static void cpu_init_hyp_mode(void)
tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
params->tcr_el2 = tcr;
 
-   params->stack_hyp_va = kern_hyp_va(__this_cpu_read(kvm_arm_hyp_stack_page) + PAGE_SIZE);
+   params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
 
/*
@@ -1430,6 +1426,15 @@ static void cpu_init_hyp_mode(void)
 * be read while the MMU is off.
 */
kvm_flush_dcache_to_poc(params, sizeof(*params));
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   struct kvm_nvhe_init_params *params;
+   struct arm_smccc_res res;
+
+   /* Switch from the HYP stub to our own HYP init vector */
+   __hyp_set_vectors(kvm_get_idmap_vector());
 
/*
 * Call initialization code, and switch to the full blown HYP code.
@@ -1438,6 +1443,7 @@ static void cpu_init_hyp_mode(void)
 * cpus_have_const_cap() wrapper.
 */
BUG_ON(!system_capabilities_finalized());
+   params = this_cpu_ptr_nvhe_sym(kvm_init_params);
	arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), &res);
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
 
@@ -1785,19 +1791,19 @@ static int init_hyp_mode(void)
}
}
 
-   /*
-* Map Hyp percpu pages
-*/
for_each_possible_cpu(cpu) {
char *percpu_begin = (char *)kvm_arm_hyp_percpu_base[cpu];
char *percpu_end = percpu_begin + nvhe_percpu_size();
 
+   /* Map Hyp percpu pages */
err = create_hyp_mappings(percpu_begin, percpu_end, PAGE_HYP);
-
if (err) {
kvm_err("Cannot map hyp percpu region\n");
goto out_err;
}
+
+   /* Prepare the CPU initialization parameters */
+   cpu_prepare_hyp_mode(cpu);
}
 
if (is_protected_kvm_enabled()) {
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v5 29/36] KVM: arm64: Use page-table to track page ownership

2021-03-15 Thread Quentin Perret
As the host stage 2 will be identity mapped, all the .hyp memory regions
and/or memory pages donated to protected guests will have to be marked
invalid in the host stage 2 page-table. At the same time, the hypervisor
will need a way to track the ownership of each physical page to ensure
memory sharing or donation between entities (host, guests, hypervisor) is
legal.

In order to enable this tracking at EL2, let's use the host stage 2
page-table itself. The idea is to use the top bits of invalid mappings
to store the unique identifier of the page owner. The page-table owner
(the host) gets identifier 0 such that, at boot time, it owns the entire
IPA space as the pgd starts zeroed.

Provide kvm_pgtable_stage2_set_owner() which allows to modify the
ownership of pages in the host stage 2. It re-uses most of the map()
logic, but ends up creating invalid mappings instead. This impacts
how we do refcounting, as we now need to count invalid mappings when they
are used for ownership tracking.
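
For reference, the invalid-PTE layout this introduces looks roughly as
follows (bit positions per KVM_INVALID_PTE_OWNER_MASK in the patch; the
extraction helper below is hypothetical):

	/*
	 *  63        56                                            0
	 * +-----------+--------------------------------------+-----+
	 * | owner_id  |              (ignored)               |  0  |
	 * +-----------+--------------------------------------+-----+
	 * Bit 0, the valid bit, is clear, so the CPU never walks this
	 * entry; the owner identifier lives in the top byte.
	 */
	static inline u8 example_pte_owner(kvm_pte_t pte)
	{
		return FIELD_GET(KVM_INVALID_PTE_OWNER_MASK, pte);
	}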

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h |  21 +
 arch/arm64/kvm/hyp/pgtable.c | 127 ++-
 2 files changed, 124 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 4ae19247837b..683e96abdc24 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -238,6 +238,27 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
   void *mc);
 
+/**
+ * kvm_pgtable_stage2_set_owner() - Annotate invalid mappings with metadata
+ * encoding the ownership of a page in the
+ * IPA space.
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @addr:  Base intermediate physical address to annotate.
+ * @size:  Size of the annotated range.
+ * @mc:		Cache of pre-allocated and zeroed memory from which to allocate
+ *		page-table pages.
+ * @owner_id:  Unique identifier for the owner of the page.
+ *
+ * By default, all page-tables are owned by identifier 0. This function can be
+ * used to mark portions of the IPA space as owned by other entities. When a
+ * stage 2 is used with identity-mappings, these annotations allow to use the
+ * page-table data structure as a simple rmap.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
+void *mc, u8 owner_id);
+
 /**
 * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 page-table.
  * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index f37b4179b880..bd44e84dedc4 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -48,6 +48,9 @@
 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
 KVM_PTE_LEAF_ATTR_HI_S2_XN)
 
+#define KVM_INVALID_PTE_OWNER_MASK GENMASK(63, 56)
+#define KVM_MAX_OWNER_ID   1
+
 struct kvm_pgtable_walk_data {
struct kvm_pgtable  *pgt;
struct kvm_pgtable_walker   *walker;
@@ -67,6 +70,13 @@ static u64 kvm_granule_size(u32 level)
return BIT(kvm_granule_shift(level));
 }
 
+#define KVM_PHYS_INVALID (-1ULL)
+
+static bool kvm_phys_is_valid(u64 phys)
+{
+   return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_PARANGE_MAX));
+}
+
 static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
 {
u64 granule = kvm_granule_size(level);
@@ -81,7 +91,10 @@ static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
if (granule > (end - addr))
return false;
 
-   return IS_ALIGNED(addr, granule) && IS_ALIGNED(phys, granule);
+   if (kvm_phys_is_valid(phys) && !IS_ALIGNED(phys, granule))
+   return false;
+
+   return IS_ALIGNED(addr, granule);
 }
 
 static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level)
@@ -186,6 +199,11 @@ static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level)
return pte;
 }
 
+static kvm_pte_t kvm_init_invalid_leaf_owner(u8 owner_id)
+{
+   return FIELD_PREP(KVM_INVALID_PTE_OWNER_MASK, owner_id);
+}
+
 static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr,
  u32 level, kvm_pte_t *ptep,
  enum kvm_pgtable_walk_flags flag)
@@ -440,6 +458,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
 struct stage2_map_data {
u64 phys;
kvm_pte_t   attr;
+   u8  owner_id;

[PATCH v5 17/36] KVM: arm64: Prepare the creation of s1 mappings at EL2

2021-03-15 Thread Quentin Perret
When memory protection is enabled, the EL2 code needs the ability to
create and manage its own page-table. To do so, introduce a new set of
hypercalls to bootstrap a memory management system at EL2.

This leads to the following boot flow in nVHE Protected mode:

 1. the host allocates memory for the hypervisor very early on, using
the memblock API;

 2. the host creates a set of stage 1 page-tables for EL2, installs the
EL2 vectors, and issues the __pkvm_init hypercall;

 3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table
and stores it in the memory pool provided by the host;

 4. the hypervisor then extends its stage 1 mappings to include a
vmemmap in the EL2 VA space, hence allowing it to use the buddy
allocator introduced in a previous patch;

 5. the hypervisor jumps back into the idmap page, switches from the
host-provided page-table to the new one, and wraps up its
initialization by enabling the new allocator, before returning to
the host.

 6. the host can free the now unused page-table created for EL2, and
will now need to issue hypercalls to make changes to the EL2 stage 1
mappings instead of modifying them directly.

Note that for the sake of simplifying the review, this patch focuses on
the hypervisor side of things. In other words, this only implements the
new hypercalls, but does not make use of them from the host yet. The
host-side changes will follow in a subsequent patch.

Credits to Will for __pkvm_init_switch_pgd.
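
For a feel of step 2, the eventual host-side invocation might look like
this (a sketch only: the real host-side change lands in a subsequent
patch, but the arguments follow the __pkvm_init prototype added below):

	ret = kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base, hyp_mem_size,
				num_possible_cpus(), kvm_arm_hyp_percpu_base,
				hyp_va_bits);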

Acked-by: Will Deacon 
Co-authored-by: Will Deacon 
Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h |   4 +
 arch/arm64/include/asm/kvm_host.h|   7 +
 arch/arm64/include/asm/kvm_hyp.h |   8 ++
 arch/arm64/include/asm/kvm_pgtable.h |   2 +
 arch/arm64/kernel/image-vars.h   |  16 +++
 arch/arm64/kvm/hyp/Makefile  |   2 +-
 arch/arm64/kvm/hyp/include/nvhe/mm.h |  71 ++
 arch/arm64/kvm/hyp/nvhe/Makefile |   4 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  27 
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   |  49 +++
 arch/arm64/kvm/hyp/nvhe/mm.c | 173 +++
 arch/arm64/kvm/hyp/nvhe/setup.c  | 197 +++
 arch/arm64/kvm/hyp/pgtable.c |   2 -
 arch/arm64/kvm/hyp/reserved_mem.c|  92 +
 arch/arm64/mm/init.c |   3 +
 15 files changed, 652 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mm.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mm.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/setup.c
 create mode 100644 arch/arm64/kvm/hyp/reserved_mem.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 22d933e9b59e..db20a9477870 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -57,6 +57,10 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_get_mdcr_el2   12
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs  13
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs   14
+#define __KVM_HOST_SMCCC_FUNC___pkvm_init  15
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_mappings   16
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping17
+#define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector18
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 459ee557f87c..b9d45a1f8538 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -781,5 +781,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
 
 int kvm_trng_call(struct kvm_vcpu *vcpu);
+#ifdef CONFIG_KVM
+extern phys_addr_t hyp_mem_base;
+extern phys_addr_t hyp_mem_size;
+void __init kvm_hyp_reserve(void);
+#else
+static inline void kvm_hyp_reserve(void) { }
+#endif
 
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c0450828378b..ae55351b99a4 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -100,4 +100,12 @@ void __noreturn hyp_panic(void);
 void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
 #endif
 
+#ifdef __KVM_NVHE_HYPERVISOR__
+void __pkvm_init_switch_pgd(phys_addr_t phys, unsigned long size,
+   phys_addr_t pgd, void *sp, void *cont_fn);
+int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
+   unsigned long *per_cpu_base, u32 hyp_va_bits);
+void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
+#endif
+
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index bbe840e430cb..bf7a3cc49420 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -11,6 +11,8 @@
 #include 
 

[PATCH v5 00/36] KVM: arm64: A stage 2 for the host

2021-03-15 Thread Quentin Perret
Hi all,

This is the v5 of the series previously posted here:

  https://lore.kernel.org/kvmarm/20210310175751.3320106-1-qper...@google.com/

This basically allows us to wrap the host with a stage 2 when running in
nVHE, hence paving the way for protecting guest memory from the host in
the future (among other use-cases). For more details about the
motivation and the design angle taken here, I would recommend to have a
look at the cover letter of v1, and/or to watch these presentations at
LPC [1] and KVM forum 2020 [2].

Changes since v4:

 - simplified the infrastructure allowing to copy feature registers for
   use at EL2;

 - reworked the page-ownership path in the pgtable code to use an
   invalid PA instead of setting a valid bit upfront in the map() path;

 - refactored hyp_map_set_prot_attr() to match its stage-2 counterpart;

 - and a handful of small cleanups / cosmetic changes.

This series depends on Will's vCPU context fix ([3]) and Marc's PMU
fixes ([4]). And here's a branch with all the goodies applied:

  https://android-kvm.googlesource.com/linux qperret/host-stage2-v5

Thanks,
Quentin

[1] https://youtu.be/54q6RzS9BpQ?t=10859
[2] https://youtu.be/wY-u6n75iXc
[3] https://lore.kernel.org/kvmarm/20210226181211.14542-1-w...@kernel.org/
[4] 
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pmu-undef-NV

Quentin Perret (33):
  KVM: arm64: Initialize kvm_nvhe_init_params early
  KVM: arm64: Avoid free_page() in page-table allocator
  KVM: arm64: Factor memory allocation out of pgtable.c
  KVM: arm64: Introduce a BSS section for use at Hyp
  KVM: arm64: Make kvm_call_hyp() a function call at Hyp
  KVM: arm64: Allow using kvm_nvhe_sym() in hyp code
  KVM: arm64: Introduce an early Hyp page allocator
  KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp
  KVM: arm64: Introduce a Hyp buddy page allocator
  KVM: arm64: Enable access to sanitized CPU features at EL2
  KVM: arm64: Provide __flush_dcache_area at EL2
  KVM: arm64: Factor out vector address calculation
  arm64: asm: Provide set_sctlr_el2 macro
  KVM: arm64: Prepare the creation of s1 mappings at EL2
  KVM: arm64: Elevate hypervisor mappings creation at EL2
  KVM: arm64: Use kvm_arch for stage 2 pgtable
  KVM: arm64: Use kvm_arch in kvm_s2_mmu
  KVM: arm64: Set host stage 2 using kvm_nvhe_init_params
  KVM: arm64: Refactor kvm_arm_setup_stage2()
  KVM: arm64: Refactor __load_guest_stage2()
  KVM: arm64: Refactor __populate_fault_info()
  KVM: arm64: Make memcache anonymous in pgtable allocator
  KVM: arm64: Reserve memory for host stage 2
  KVM: arm64: Sort the hypervisor memblocks
  KVM: arm64: Always zero invalid PTEs
  KVM: arm64: Use page-table to track page ownership
  KVM: arm64: Refactor the *_map_set_prot_attr() helpers
  KVM: arm64: Add kvm_pgtable_stage2_find_range()
  KVM: arm64: Provide sanitized mmfr* registers at EL2
  KVM: arm64: Wrap the host with a stage 2
  KVM: arm64: Page-align the .hyp sections
  KVM: arm64: Disable PMU support in protected mode
  KVM: arm64: Protect the .hyp sections from the host

Will Deacon (3):
  arm64: lib: Annotate {clear,copy}_page() as position-independent
  KVM: arm64: Link position-independent string routines into .hyp.text
  arm64: kvm: Add standalone ticket spinlock implementation for use at
hyp

 arch/arm64/include/asm/assembler.h|  14 +-
 arch/arm64/include/asm/cpufeature.h   |   1 +
 arch/arm64/include/asm/hyp_image.h|   7 +
 arch/arm64/include/asm/kvm_asm.h  |   9 +
 arch/arm64/include/asm/kvm_cpufeature.h   |  19 +
 arch/arm64/include/asm/kvm_host.h |  19 +-
 arch/arm64/include/asm/kvm_hyp.h  |   8 +
 arch/arm64/include/asm/kvm_mmu.h  |  23 +-
 arch/arm64/include/asm/kvm_pgtable.h  | 128 +-
 arch/arm64/include/asm/sections.h |   1 +
 arch/arm64/kernel/asm-offsets.c   |   3 +
 arch/arm64/kernel/cpufeature.c|  13 +
 arch/arm64/kernel/image-vars.h|  30 ++
 arch/arm64/kernel/vmlinux.lds.S   |  74 ++--
 arch/arm64/kvm/arm.c  | 199 +++--
 arch/arm64/kvm/hyp/Makefile   |   2 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h   |  34 +-
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h |  14 +
 arch/arm64/kvm/hyp/include/nvhe/gfp.h |  68 
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  36 ++
 arch/arm64/kvm/hyp/include/nvhe/memory.h  |  52 +++
 arch/arm64/kvm/hyp/include/nvhe/mm.h  |  96 +
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h|  92 +
 arch/arm64/kvm/hyp/nvhe/Makefile  |   9 +-
 arch/arm64/kvm/hyp/nvhe/cache.S   |  13 +
 arch/arm64/kvm/hyp/nvhe/early_alloc.c |  54 +++
 arch/arm64/kvm/hyp/nvhe/hyp-init.S|  42 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  69 
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c |   7 +
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S |   1 +
 

[PATCH] KVM: arm64: Update comment for parameter rename

2021-03-15 Thread Andrew Scull
The first parameter of __hyp_do_panic() was changed, so update the
comment that's intended to explain the significance of passing zero.
This hunk previously got lost in the merge.

Fixes: c4b000c3928d ("KVM: arm64: Fix nVHE hyp panic host context restore")
Signed-off-by: Andrew Scull 
---

Applied on 5.12-rc3. The backports of the original patch contained this
hunk, and it's mainly cosmetic anyway, so no further action is needed.

---
 arch/arm64/kvm/hyp/nvhe/host.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index 5d94584840cc..c419648c1d3f 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -152,7 +152,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v4 00/11] GIC fixes and improvements

2021-03-15 Thread Andrew Jones
On Fri, Feb 19, 2021 at 12:13:26PM +, Alexandru Elisei wrote:
> What started this series is Andre's SPI and group interrupts tests [1],
> which prompted me to attempt to rewrite check_acked() so it's more flexible
> and not so complicated to review. When I was doing that I noticed that the
> message passing pattern for accesses to the acked, bad_irq and bad_sender
> arrays didn't look quite right, and that turned into the first 7 patches of
> the series. Even though the diffs are relatively small, they are not
> trivial and the reviewer can skip them for the more palatable patches that
> follow. I would still appreciate someone having a look at the memory
> ordering fixes.
> 
> Patch #8 ("Split check_acked() into two functions") is where check_acked()
> is reworked with an eye towards supporting different timeout values or
> silent reporting without adding too many arguments to check_acked().
> 
> After changing the IPI tests, I turned my attention to the LPI tests, which
> followed the same memory synchronization patterns, but invented their own
> interrupt handler and testing functions. Instead of redoing the work that I
> did for the IPI tests, I decided to convert the LPI tests to use the same
> infrastructure.
>

Applied to arm/queue

https://gitlab.com/rhdrjones/kvm-unit-tests/-/tree/arm/queue

Thanks,
drew

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 14/36] KVM: arm64: Provide __flush_dcache_area at EL2

2021-03-15 Thread Will Deacon
On Mon, Mar 15, 2021 at 04:56:21PM +, Quentin Perret wrote:
> On Monday 15 Mar 2021 at 16:33:23 (+), Will Deacon wrote:
> > On Mon, Mar 15, 2021 at 02:35:14PM +, Quentin Perret wrote:
> > > We will need to do cache maintenance at EL2 soon, so compile a copy of
> > > __flush_dcache_area at EL2, and provide a copy of arm64_ftr_reg_ctrel0
> > > as it is needed by the read_ctr macro.
> > > 
> > > Signed-off-by: Quentin Perret 
> > > ---
> > >  arch/arm64/include/asm/kvm_cpufeature.h |  2 ++
> > >  arch/arm64/kvm/hyp/nvhe/Makefile|  3 ++-
> > >  arch/arm64/kvm/hyp/nvhe/cache.S | 13 +
> > >  arch/arm64/kvm/sys_regs.c   |  1 +
> > >  4 files changed, 18 insertions(+), 1 deletion(-)
> > >  create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
> > > index 3fd9f60d2180..efba1b89b8a4 100644
> > > --- a/arch/arm64/include/asm/kvm_cpufeature.h
> > > +++ b/arch/arm64/include/asm/kvm_cpufeature.h
> > > @@ -13,3 +13,5 @@
> > >  #define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg kvm_nvhe_sym(name)
> > >  #endif
> > >  #endif
> > > +
> > > +KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
> > 
> > I still think this is a bit weird. If you really want to macro-ise stuff,
> > then why not follow the sort of thing we do for e.g. per-cpu variables and
> > have separate DECLARE_HYP_CPU_FTR_REG and DEFINE_HYP_CPU_FTR_REG macros.
> > 
> > That way kvm_cpufeature.h can have header guards like a normal header and
> > we can drop the '#ifndef KVM_HYP_CPU_FTR_REG' altogether. I don't think
> > the duplication of the symbol name really matters -- it should fail at
> > build time if something is missing.
> 
> I just tend to hate unnecessary boilerplate, but if you feel strongly
> about it, happy to change :)

I don't like it either, but I prefer it to overriding macros like this! I
think having the "boilerplate" is a better starting point should we decide
to consolidate the definitions somehow.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 29/36] KVM: arm64: Use page-table to track page ownership

2021-03-15 Thread Will Deacon
On Mon, Mar 15, 2021 at 04:53:18PM +, Quentin Perret wrote:
> On Monday 15 Mar 2021 at 16:36:19 (+), Will Deacon wrote:
> > On Mon, Mar 15, 2021 at 02:35:29PM +, Quentin Perret wrote:
> > > As the host stage 2 will be identity mapped, all the .hyp memory regions
> > > and/or memory pages donated to protected guests will have to be marked
> > > invalid in the host stage 2 page-table. At the same time, the hypervisor
> > > will need a way to track the ownership of each physical page to ensure
> > > memory sharing or donation between entities (host, guests, hypervisor) is
> > > legal.
> > > 
> > > In order to enable this tracking at EL2, let's use the host stage 2
> > > page-table itself. The idea is to use the top bits of invalid mappings
> > > to store the unique identifier of the page owner. The page-table owner
> > > (the host) gets identifier 0 such that, at boot time, it owns the entire
> > > IPA space as the pgd starts zeroed.
> > > 
> > > Provide kvm_pgtable_stage2_set_owner() which allows to modify the
> > > ownership of pages in the host stage 2. It re-uses most of the map()
> > > logic, but ends up creating invalid mappings instead. This impacts
> > > how we do refcounting, as we now need to count invalid mappings when they
> > > are used for ownership tracking.
> > > 
> > > Signed-off-by: Quentin Perret 
> > > ---
> > >  arch/arm64/include/asm/kvm_pgtable.h |  21 +
> > >  arch/arm64/kvm/hyp/pgtable.c | 127 ++-
> > >  2 files changed, 124 insertions(+), 24 deletions(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > > index 4ae19247837b..683e96abdc24 100644
> > > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > > @@ -238,6 +238,27 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
> > >  u64 phys, enum kvm_pgtable_prot prot,
> > >  void *mc);
> > >  
> > > +/**
> > > + * kvm_pgtable_stage2_set_owner() - Annotate invalid mappings with 
> > > metadata
> > > + *   encoding the ownership of a page in 
> > > the
> > > + *   IPA space.
> > 
> > The function does more than this, though, as it will also go ahead and unmap
> > existing valid mappings which I think should be mentioned here, no?
> 
> Right, I see what you mean. How about:
> 
> 'Unmap and annotate pages in the IPA space to track ownership'

I think I'd go with:

'Unmap pages and annotate the invalid mappings with ownership metadata for
 the unmapped IPA range.'

as it's the page-table which is annotated, not the actual pages (which could
potentially be mapped by other page-tables).

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH] configure: arm/arm64: Add --earlycon option to set UART type and address

2021-03-15 Thread Alexandru Elisei
Hi Andre,

On 3/3/21 2:18 PM, Andre Przywara wrote:
> On Fri, 19 Feb 2021 16:37:18 +
> Alexandru Elisei  wrote:
>
>> Currently, the UART early address is set indirectly with the --vmm option
>> and there are only two possible values: if the VMM is qemu (the default),
>> then the UART address is set to 0x09000000; if the VMM is kvmtool, then the
>> UART address is set to 0x3f8.
>>
>> There are several efforts under way to change the kvmtool UART address, and
>> kvm-unit-tests so far hasn't had a mechanism to let the user set a specific
>> address, which means that the early UART won't be available.
>>
>> This situation will only become worse as kvm-unit-tests gains support to
>> run as an EFI app, as each platform will have their own UART type and
>> address.
>>
>> To address both issues, a new configure option is added, --earlycon. The
>> syntax and semantics are identical to the kernel parameter with the same
>> name.
> Nice one! I like that reusing of an existing scheme.
>
>> Specifying this option will overwrite the UART address set by --vmm.
>>
>> At the moment, the UART type and register width parameters are ignored
>> since both qemu's and kvmtool's UART emulation use the same offset for the
>> TX register and no other registers are used by kvm-unit-tests, but the
>> parameters will become relevant once EFI support is added.
>>
>> Signed-off-by: Alexandru Elisei 
>> ---
>> The kvmtool patches I was referring to are the patches to unify ioport and
>> MMIO emulation [1] and to allow the user to specify a custom memory layout
>> for the VM [2] (these patches are very old, but I plan to revive them after
>> the ioport and MMIO unification series are merged).
>>
>> [1] 
>> https://lore.kernel.org/kvm/20201210142908.169597-1-andre.przyw...@arm.com/T/#t
>> [2] 
>> https://lore.kernel.org/kvm/1569245722-23375-1-git-send-email-alexandru.eli...@arm.com/
>>
>>  configure | 35 +++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/configure b/configure
>> index cdcd34e94030..d94b92255088 100755
>> --- a/configure
>> +++ b/configure
>> @@ -26,6 +26,7 @@ errata_force=0
>>  erratatxt="$srcdir/errata.txt"
>>  host_key_document=
>>  page_size=
>> +earlycon=
>>  
>>  usage() {
>>  cat <<-EOF
>> @@ -54,6 +55,17 @@ usage() {
>>  --page-size=PAGE_SIZE
>> Specify the page size (translation granule) 
>> (4k, 16k or
>> 64k, default is 64k, arm64 only)
>> +--earlycon=EARLYCON
>> +   Specify the UART name, type and address 
>> (optional, arm and
>> +   arm64 only). The specified address will 
>> overwrite the UART
>> +   address set by the --vmm option. EARLYCON 
>> can be one of (case
>> +   sensitive):
>> +   uart[8250],mmio,ADDR
>> +   Specify an 8250 compatible UART at address 
>> ADDR. Supported
>> +   register stride is 8 bit only.
>> +   pl011,mmio,ADDR
>> +   Specify a PL011 compatible UART at address 
>> ADDR. Supported
>> +   register stride is 8 bit only.
> I think the PL011 only ever specified 32-bit register accesses? I just

You are correct, according to Arm Base System Architecture 1.0 (DEN0094A), page 
43:

"The registers that are described in this specification are a subset of the Arm
PL011 r1p5 UART. [..] The Generic UART is specified as a set of 32-bit 
registers.
[..] The Generic UART is little-endian."

> see that we actually do a writeb() for puts, that is not guaranteed to
> work on a hardware PL011, AFAIK. I guess QEMU just doesn't care ...

Table 19, page 43 of DEN0094A, says that the permitted access sizes for the UARTDR
register are 8, 16 and 32 bits. I think using writeb() at address 0 of the UART
memory region to write a character is correct.

> Looks like we should fix this, maybe we get mmio32 for uart8250 for
> free, then.
>
> The kernel specifies "pl011,mmio32,ADDR" or "pl011,ADDR", so I think we
> should keep it compatible. "mmio[32]" is pretty much redundant on the
> PL011 (no port I/O), I think it's just for consistency with the 8250.
> Can you tweak the routine below to make this optional, and also accept
> mmio32?

Definitely, my intention was to make it as close as possible to what Linux 
does. I
made a mistake when I allowed this value, I will change it in the next version.
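
So, to spell out the forms that would be accepted (illustrative values;
the addresses depend on the platform):

	--earlycon=pl011,0x9000000
	--earlycon=pl011,mmio32,0x9000000
	--earlycon=uart8250,mmio,0x3f8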

Thanks,

Alex

>
> Cheers,
> Andre
>
>>  EOF
>>  exit 1
>>  }
>> @@ -112,6 +124,9 @@ while [[ "$1" = -* ]]; do
>>  --page-size)
>>  page_size="$arg"
>>  ;;
>> +--earlycon)
>> +earlycon="$arg"
>> +;;
>>  --help)
>>  usage
>>  ;;
>> @@ -170,6 +185,26 @@ elif [ "$arch" = "arm" ] || [ "$arch" = "arm64" ]; then
>>  echo '--vmm must be one of "qemu" or "kvmtool"!'
>>  usage
>>  

Re: [PATCH v5 29/36] KVM: arm64: Use page-table to track page ownership

2021-03-15 Thread Will Deacon
On Mon, Mar 15, 2021 at 02:35:29PM +, Quentin Perret wrote:
> As the host stage 2 will be identity mapped, all the .hyp memory regions
> and/or memory pages donated to protected guests will have to be marked
> invalid in the host stage 2 page-table. At the same time, the hypervisor
> will need a way to track the ownership of each physical page to ensure
> memory sharing or donation between entities (host, guests, hypervisor) is
> legal.
> 
> In order to enable this tracking at EL2, let's use the host stage 2
> page-table itself. The idea is to use the top bits of invalid mappings
> to store the unique identifier of the page owner. The page-table owner
> (the host) gets identifier 0 such that, at boot time, it owns the entire
> IPA space as the pgd starts zeroed.
> 
> Provide kvm_pgtable_stage2_set_owner() which allows to modify the
> ownership of pages in the host stage 2. It re-uses most of the map()
> logic, but ends up creating invalid mappings instead. This impacts
> how we do refcounting, as we now need to count invalid mappings when they
> are used for ownership tracking.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_pgtable.h |  21 +
>  arch/arm64/kvm/hyp/pgtable.c | 127 ++-
>  2 files changed, 124 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 4ae19247837b..683e96abdc24 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -238,6 +238,27 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
>  u64 phys, enum kvm_pgtable_prot prot,
>  void *mc);
>  
> +/**
> + * kvm_pgtable_stage2_set_owner() - Annotate invalid mappings with metadata
> + *   encoding the ownership of a page in the
> + *   IPA space.

The function does more than this, though, as it will also go ahead and unmap
existing valid mappings which I think should be mentioned here, no?

> +int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
> +  void *mc, u8 owner_id)
> +{
> + int ret;
> + struct stage2_map_data map_data = {
> + .phys   = KVM_PHYS_INVALID,
> + .mmu= pgt->mmu,
> + .memcache   = mc,
> + .mm_ops = pgt->mm_ops,
> + .owner_id   = owner_id,
> + };
> + struct kvm_pgtable_walker walker = {
> + .cb = stage2_map_walker,
> + .flags  = KVM_PGTABLE_WALK_TABLE_PRE |
> +   KVM_PGTABLE_WALK_LEAF |
> +   KVM_PGTABLE_WALK_TABLE_POST,
> + .arg= _data,
> + };
> +
> + if (owner_id > KVM_MAX_OWNER_ID)
> + return -EINVAL;
> +
> + ret = kvm_pgtable_walk(pgt, addr, size, );
> + dsb(ishst);

Why is the DSB needed here? afaict, we only ever unmap a valid entry (which
will have a DSB as part of the TLBI sequence) or we update the owner for an
existing invalid entry, in which case the walker doesn't care.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 14/36] KVM: arm64: Provide __flush_dcache_area at EL2

2021-03-15 Thread Will Deacon
On Mon, Mar 15, 2021 at 02:35:14PM +, Quentin Perret wrote:
> We will need to do cache maintenance at EL2 soon, so compile a copy of
> __flush_dcache_area at EL2, and provide a copy of arm64_ftr_reg_ctrel0
> as it is needed by the read_ctr macro.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_cpufeature.h |  2 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile|  3 ++-
>  arch/arm64/kvm/hyp/nvhe/cache.S | 13 +
>  arch/arm64/kvm/sys_regs.c   |  1 +
>  4 files changed, 18 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S
> 
> diff --git a/arch/arm64/include/asm/kvm_cpufeature.h b/arch/arm64/include/asm/kvm_cpufeature.h
> index 3fd9f60d2180..efba1b89b8a4 100644
> --- a/arch/arm64/include/asm/kvm_cpufeature.h
> +++ b/arch/arm64/include/asm/kvm_cpufeature.h
> @@ -13,3 +13,5 @@
>  #define KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg kvm_nvhe_sym(name)
>  #endif
>  #endif
> +
> +KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);

I still think this is a bit weird. If you really want to macro-ise stuff,
then why not follow the sort of thing we do for e.g. per-cpu variables and
have separate DECLARE_HYP_CPU_FTR_REG and DEFINE_HYP_CPU_FTR_REG macros.

That way kvm_cpufeature.h can have header guards like a normal header and
we can drop the '#ifndef KVM_HYP_CPU_FTR_REG' altogether. I don't think
the duplication of the symbol name really matters -- it should fail at
build time if something is missing.
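
Concretely, the suggested split might look something like this (macro
names are hypothetical):

	#define DECLARE_KVM_HYP_CPU_FTR_REG(name) \
		extern struct arm64_ftr_reg kvm_nvhe_sym(name)

	#define DEFINE_KVM_HYP_CPU_FTR_REG(name) \
		struct arm64_ftr_reg kvm_nvhe_sym(name)

with the header carrying only the DECLARE_* lines under normal header
guards, and the EL2 code providing the matching DEFINE_* instances.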

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 32/36] KVM: arm64: Provide sanitized mmfr* registers at EL2

2021-03-15 Thread Will Deacon
On Mon, Mar 15, 2021 at 02:35:32PM +, Quentin Perret wrote:
> We will need to read sanitized values of mmfr{0,1}_el1 at EL2 soon, so
> add them to the list of copied variables.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_cpufeature.h | 2 ++
>  arch/arm64/kvm/sys_regs.c   | 2 ++
>  2 files changed, 4 insertions(+)

Acked-by: Will Deacon 

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 31/36] KVM: arm64: Add kvm_pgtable_stage2_find_range()

2021-03-15 Thread Will Deacon
On Mon, Mar 15, 2021 at 02:35:31PM +, Quentin Perret wrote:
> Since the host stage 2 will be identity mapped, and since it will own
> most of memory, it would preferable for performance to try and use large
> block mappings whenever that is possible. To ease this, introduce a new
> helper in the KVM page-table code which allows to search for large
> ranges of available IPA space. This will be used in the host memory
> abort path to greedily idmap large portion of the PA space.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 29 +
>  arch/arm64/kvm/hyp/pgtable.c | 89 ++--
>  2 files changed, 114 insertions(+), 4 deletions(-)

Acked-by: Will Deacon 

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH] KVM: arm64: Fix nVHE hyp panic host context restore

2021-03-15 Thread Andrew Scull
Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.

When panicking from the nVHE hyp and restoring the host context, x29 is
expected to hold a pointer to the host context. This wasn't being done,
so fix it to make sure there's a valid pointer to the host context being
used.

Rather than passing a boolean indicating whether or not the host context
should be restored, instead pass the pointer to the host context. NULL
is passed to indicate that no context should be restored.
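
In short, the new convention at the two kinds of call sites is:

	__hyp_do_panic(host_ctxt, spsr, elr, par); /* restore host context */
	__hyp_do_panic(NULL, spsr, elr, par);      /* skip the restore */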

Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
Cc: sta...@vger.kernel.org # 5.10.y only
Signed-off-by: Andrew Scull 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
---
 arch/arm64/include/asm/kvm_hyp.h |  3 ++-
 arch/arm64/kvm/hyp/nvhe/host.S   | 20 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c |  3 +--
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 6b664de5ec1f..183bc9c7e1cb 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -94,7 +94,8 @@ u64 __guest_enter(struct kvm_vcpu *vcpu);
 
 void __noreturn hyp_panic(void);
 #ifdef __KVM_NVHE_HYPERVISOR__
-void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+  u64 elr, u64 par);
 #endif
 
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index ed27f06a31ba..4ce934fc1f72 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -64,10 +64,15 @@ __host_enter_without_restoring:
 SYM_FUNC_END(__host_exit)
 
 /*
- * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+ * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+ *   u64 elr, u64 par);
  */
 SYM_FUNC_START(__hyp_do_panic)
-   /* Load the format arguments into x1-7 */
+   mov x29, x0
+
+   /* Load the format string into x0 and arguments into x1-7 */
+   ldr x0, =__hyp_panic_string
+
mov x6, x3
get_vcpu_ptr x7, x3
 
@@ -82,13 +87,8 @@ SYM_FUNC_START(__hyp_do_panic)
ldr lr, =panic
msr elr_el2, lr
 
-   /*
-* Set the panic format string and enter the host, conditionally
-* restoring the host context.
-*/
-   cmp x0, xzr
-   ldr x0, =__hyp_panic_string
-   b.eq__host_enter_without_restoring
+   /* Enter the host, conditionally restoring the host context. */
+   cbz x29, __host_enter_without_restoring
b   __host_enter_for_panic
 SYM_FUNC_END(__hyp_do_panic)
 
@@ -144,7 +144,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 8ae8160bc93a..1b16a3457e2b 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -251,7 +251,6 @@ void __noreturn hyp_panic(void)
u64 spsr = read_sysreg_el2(SYS_SPSR);
u64 elr = read_sysreg_el2(SYS_ELR);
u64 par = read_sysreg_par();
-   bool restore_host = true;
struct kvm_cpu_context *host_ctxt;
struct kvm_vcpu *vcpu;
 
@@ -265,7 +264,7 @@ void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}
 
-   __hyp_do_panic(restore_host, spsr, elr, par);
+   __hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
 }
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH] KVM: arm64: Fix nVHE hyp panic host context restore

2021-03-15 Thread Andrew Scull
Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.

When panicking from the nVHE hyp and restoring the host context, x29 is
expected to hold a pointer to the host context. This wasn't being done,
so fix it to make sure there's a valid pointer to the host context being
used.

Rather than passing a boolean indicating whether or not the host context
should be restored, instead pass the pointer to the host context. NULL
is passed to indicate that no context should be restored.

Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
Cc: sta...@vger.kernel.org # 5.11.y only
Signed-off-by: Andrew Scull 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
---
 arch/arm64/include/asm/kvm_hyp.h |  3 ++-
 arch/arm64/kvm/hyp/nvhe/host.S   | 20 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c |  3 +--
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c0450828378b..fb8404fefd1f 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -97,7 +97,8 @@ bool kvm_host_psci_handler(struct kvm_cpu_context *host_ctxt);
 
 void __noreturn hyp_panic(void);
 #ifdef __KVM_NVHE_HYPERVISOR__
-void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+  u64 elr, u64 par);
 #endif
 
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index a820dfdc9c25..3a06085aab6f 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -71,10 +71,15 @@ SYM_FUNC_START(__host_enter)
 SYM_FUNC_END(__host_enter)
 
 /*
- * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+ * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+ *   u64 elr, u64 par);
  */
 SYM_FUNC_START(__hyp_do_panic)
-   /* Load the format arguments into x1-7 */
+   mov x29, x0
+
+   /* Load the format string into x0 and arguments into x1-7 */
+   ldr x0, =__hyp_panic_string
+
mov x6, x3
get_vcpu_ptr x7, x3
 
@@ -89,13 +94,8 @@ SYM_FUNC_START(__hyp_do_panic)
ldr lr, =panic
msr elr_el2, lr
 
-   /*
-* Set the panic format string and enter the host, conditionally
-* restoring the host context.
-*/
-   cmp x0, xzr
-   ldr x0, =__hyp_panic_string
-   b.eq__host_enter_without_restoring
+   /* Enter the host, conditionally restoring the host context. */
+   cbz x29, __host_enter_without_restoring
b   __host_enter_for_panic
 SYM_FUNC_END(__hyp_do_panic)
 
@@ -150,7 +150,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index f3d0e9eca56c..038147b7674b 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -257,7 +257,6 @@ void __noreturn hyp_panic(void)
u64 spsr = read_sysreg_el2(SYS_SPSR);
u64 elr = read_sysreg_el2(SYS_ELR);
u64 par = read_sysreg_par();
-   bool restore_host = true;
struct kvm_cpu_context *host_ctxt;
struct kvm_vcpu *vcpu;
 
@@ -271,7 +270,7 @@ void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}
 
-   __hyp_do_panic(restore_host, spsr, elr, par);
+   __hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
 }
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH 3/6] arm/arm64: Remove unnecessary ISB when doing dcache maintenance

2021-03-15 Thread Alexandru Elisei
Hi Drew,

On 3/12/21 2:59 PM, Andrew Jones wrote:
> On Sat, Feb 27, 2021 at 10:41:58AM +, Alexandru Elisei wrote:
>> The dcache_by_line_op macro executes a DSB to complete the cache
>> maintenance operations. According to ARM DDI 0487G.a, page B2-150:
>>
>> "In addition, no instruction that appears in program order after the DSB
>> instruction can alter any state of the system or perform any part of its
>> functionality until the DSB completes other than:
>>
>> - Being fetched from memory and decoded.
>> - Reading the general-purpose, SIMD and floating-point, Special-purpose, or
>>   System registers that are directly or indirectly read without causing
>>   side-effects."
>>
>> Similar definition for ARM in ARM DDI 0406C.d, page A3-150:
>>
>> "In addition, no instruction that appears in program order after the DSB
>> instruction can execute until the DSB completes."
>>
>> This means that we don't need the ISB to prevent reordering of the cache
>> maintenance instructions.
>>
>> We are also not doing icache maintenance, where an ISB would be required
>> for the PE to discard instructions speculated before the invalidation.
>>
>> In conclusion, the ISB is unnecessary, so remove it.
> Hi Alexandru,
>
> We can go ahead and take this patch, since you've written quite a
> convincing commit message, but in general I'd prefer we be overly cautious
> in our common code. We'd like to ensure we don't introduce difficult to
> debug issues there, and we don't care about optimizations, let alone
> micro-optimizations. Testing barrier needs to the letter of the spec is a
> good idea, but it's probably better to do that in the test cases.

You are correct, the intention of this patch was to do the minimum necessary to
ensure correctness.

Thank you for the explanation, I will keep this in mind for future patches.

Thanks,

Alex

>
> Thanks,
> drew
>
>> Signed-off-by: Alexandru Elisei 
>> ---
>>  arm/cstart.S   | 1 -
>>  arm/cstart64.S | 1 -
>>  2 files changed, 2 deletions(-)
>>
>> diff --git a/arm/cstart.S b/arm/cstart.S
>> index 954748b00f64..2d62c1e6d40d 100644
>> --- a/arm/cstart.S
>> +++ b/arm/cstart.S
>> @@ -212,7 +212,6 @@ asm_mmu_disable:
>>  ldr r1, [r1]
>>  sub r1, r1, r0
>>  dcache_by_line_op dccimvac, sy, r0, r1, r2, r3
>> -isb
>>  
>>  mov pc, lr
>>  
>> diff --git a/arm/cstart64.S b/arm/cstart64.S
>> index 046bd3914098..c1deff842f03 100644
>> --- a/arm/cstart64.S
>> +++ b/arm/cstart64.S
>> @@ -219,7 +219,6 @@ asm_mmu_disable:
>>  ldr x1, [x1, :lo12:__phys_end]
>>  sub x1, x1, x0
>>  dcache_by_line_op civac, sy, x0, x1, x2, x3
>> -isb
>>  
>>  ret
>>  
>> -- 
>> 2.30.1
>>
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH 2/6] arm/arm64: Remove dcache_line_size global variable

2021-03-15 Thread Alexandru Elisei
Hi Andre,

On 3/4/21 3:00 PM, Andre Przywara wrote:
> On Sat, 27 Feb 2021 10:41:57 +
> Alexandru Elisei  wrote:
>
>> Compute the dcache line size when doing dcache maintenance instead of using
>> a global variable computed in setup(), which allows us to do dcache
>> maintenance at any point in the boot process. This will be useful for
>> running as an EFI app and it also aligns our implementation to that of the
>> Linux kernel.
> Can you add that this changes the semantic of dcache_by_line_op to use
> the size instead of the end address?

Sure, I can do that. The dcache_by_line_op was never visible to code outside
assembly, and it was only used by asm_mmu_disable, so no other callers are
affected by this change.

>
>> For consistency, the arm code has been similary modified.
>>
>> Signed-off-by: Alexandru Elisei 
>> ---
>>  lib/arm/asm/assembler.h   | 44 
>>  lib/arm/asm/processor.h   |  7 --
>>  lib/arm64/asm/assembler.h | 53 +++
>>  lib/arm64/asm/processor.h |  7 --
>>  lib/arm/setup.c   |  7 --
>>  arm/cstart.S  | 18 +++--
>>  arm/cstart64.S| 16 ++--
>>  7 files changed, 102 insertions(+), 50 deletions(-)
>>  create mode 100644 lib/arm/asm/assembler.h
>>  create mode 100644 lib/arm64/asm/assembler.h
>>
>> diff --git a/lib/arm/asm/assembler.h b/lib/arm/asm/assembler.h
>> new file mode 100644
>> index ..6b932df86204
>> --- /dev/null
>> +++ b/lib/arm/asm/assembler.h
>> @@ -0,0 +1,44 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Based on several files from Linux version v5.10: 
>> arch/arm/mm/proc-macros.S,
>> + * arch/arm/mm/proc-v7.S.
>> + */
>> +
>> +/*
>> + * dcache_line_size - get the minimum D-cache line size from the CTR 
>> register
>> + * on ARMv7.

Well, it's in the arm directory and there's a file with the same name under
lib/arm64/asm/, so I don't think there's any room for confusion here.

>> + */
>> +.macro  dcache_line_size, reg, tmp
>> +mrc p15, 0, \tmp, c0, c0, 1 // read ctr
>> +lsr \tmp, \tmp, #16
>> +and \tmp, \tmp, #0xf// cache line size encoding
>> +mov \reg, #4// bytes per word
>> +mov \reg, \reg, lsl \tmp// actual cache line size
>> +.endm
>> +
>> +/*
>> + * Macro to perform a data cache maintenance for the interval
>> + * [addr, addr + size).
>> + *
>> + *  op: operation to execute
>> + *  domain  domain used in the dsb instruction
>> + *  addr:   starting virtual address of the region
>> + *  size:   size of the region
>> + *  Corrupts:   addr, size, tmp1, tmp2
>> + */
>> +.macro dcache_by_line_op op, domain, addr, size, tmp1, tmp2
>> +dcache_line_size \tmp1, \tmp2
>> +add \size, \addr, \size
>> +sub \tmp2, \tmp1, #1
>> +bic \addr, \addr, \tmp2
> Just a nit, but since my brain was in assembly land: We could skip tmp2,
> by adding back #1 to tmp1 after the bic.
> Same for the arm64 code.

Using one less temporary register wouldn't help with register pressure:

- On arm, registers r0-r3 are used, which ARM IHI 0042F says can be used
as scratch registers; the caller will save their contents before calling
the function (or not use them at all).

- On arm64, register x0-x3 are used, which have a similar usage according to ARM
IHI 0055B.

Using one less temporary register means one more instruction, but that's not
relevant since the macro will perform writes anyway, as even invalidation is
transformed to a clean + invalidate under virtualization.

The reason I chose to keep the macro unchanged for arm64 is that it matches the
Linux definition, and I think it's better to try not to deviate too much from it,
as in the long run it will make maintenance easier for everyone.

For arm, I wrote it this way to match the arm64 definition.

>
>> +9998:
>> +.ifc\op, dccimvac
>> +mcr p15, 0, \addr, c7, c14, 1
>> +.else
>> +.err
>> +.endif
>> +add \addr, \addr, \tmp1
>> +cmp \addr, \size
>> +blo 9998b
>> +dsb \domain
>> +.endm
>> diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
>> index 273366d1fe1c..3c36eac903f0 100644
>> --- a/lib/arm/asm/processor.h
>> +++ b/lib/arm/asm/processor.h
>> @@ -9,11 +9,6 @@
>>  #include 
>>  #include 
> Do we want the same protection against inclusion from C here as in the
> arm64 version?

We do, I will add it in the next iteration.

>
>> -#define CTR_DMINLINE_SHIFT  16
>> -#define CTR_DMINLINE_MASK   (0xf << 16)
>> -#define CTR_DMINLINE(x) \
>> -(((x) & CTR_DMINLINE_MASK) >> CTR_DMINLINE_SHIFT)
>> -
>>  enum vector {
>>  EXCPTN_RST,
>>  EXCPTN_UND,
>> @@ -89,6 +84,4 @@ static inline u32 get_ctr(void)
>>  return read_sysreg(CTR);
>>  }
>>  
>> -extern unsigned long dcache_line_size;
>> -
>>  #endif /* _ASMARM_PROCESSOR_H_ */
>> diff --git 

[PATCH kvmtool v3 22/22] hw/rtc: ARM/arm64: Use MMIO at higher addresses

2021-03-15 Thread Andre Przywara
Using the RTC device at its legacy I/O address as set by IBM in 1981
was a kludge we used for simplicity on ARM platforms as well.
However, this poses problems due to its missing alignment and overlap
with the PCI I/O address space.

Now that we can switch a device easily between using ioports and
MMIO, let's move the RTC out of the first 4K of memory on ARM platforms.

That should be transparent for well behaved guests, since the change is
naturally reflected in the device tree.
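
For illustration, the new location works out as follows (a sketch using the
map constants below; the concrete values assume the platform MMIO area
starts at 16M, as in the memory map rework earlier in this series):

#define ARM_MMIO_AREA           0x01000000  /* platform MMIO starts at 16M */
#define ARM_UART_MMIO_BASE      ARM_MMIO_AREA
#define ARM_UART_MMIO_SIZE      0x10000
#define ARM_RTC_MMIO_BASE       (ARM_UART_MMIO_BASE + ARM_UART_MMIO_SIZE)
/* -> 0x01010000: index register at base + 0, data register at base + 1 */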

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 arm/include/arm-common/kvm-arch.h |  7 +--
 hw/rtc.c  | 24 
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/arm/include/arm-common/kvm-arch.h 
b/arm/include/arm-common/kvm-arch.h
index bf34d742..436b67b8 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -14,8 +14,8 @@
  * +---++---+---++-+-+---..
  * |  PCI  || plat  |   || | |
  * |  I/O  || MMIO: | Flash | virtio | GIC |   PCI   |  DRAM
- * | space || UART  |   |  MMIO  | |  (AXI)  |
- * |   ||   |   || | |
+ * | space || UART, |   |  MMIO  | |  (AXI)  |
+ * |   || RTC   |   || | |
  * +---++---+---++-+-+---..
  */
 
@@ -31,6 +31,9 @@
 #define ARM_UART_MMIO_BASE ARM_MMIO_AREA
#define ARM_UART_MMIO_SIZE 0x10000
 
+#define ARM_RTC_MMIO_BASE  (ARM_UART_MMIO_BASE + ARM_UART_MMIO_SIZE)
+#define ARM_RTC_MMIO_SIZE  0x10000
+
#define KVM_FLASH_MMIO_BASE (ARM_MMIO_AREA + 0x1000000)
#define KVM_FLASH_MAX_SIZE 0x1000000
 
diff --git a/hw/rtc.c b/hw/rtc.c
index ee4c9102..aec31c52 100644
--- a/hw/rtc.c
+++ b/hw/rtc.c
@@ -5,6 +5,15 @@
 
 #include 
 
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
+#define RTC_BUS_TYPE   DEVICE_BUS_MMIO
+#define RTC_BASE_ADDRESS   ARM_RTC_MMIO_BASE
+#else
+/* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
+#define RTC_BUS_TYPE   DEVICE_BUS_IOPORT
+#define RTC_BASE_ADDRESS   0x70
+#endif
+
 /*
  * MC146818 RTC registers
  */
@@ -49,7 +58,7 @@ static void cmos_ram_io(struct kvm_cpu *vcpu, u64 addr, u8 
*data,
time_t ti;
 
if (is_write) {
-   if (addr == 0x70) { /* index register */
+   if (addr == RTC_BASE_ADDRESS) { /* index register */
u8 value = ioport__read8(data);
 
vcpu->kvm->nmi_disabled = value & (1UL << 7);
@@ -70,7 +79,7 @@ static void cmos_ram_io(struct kvm_cpu *vcpu, u64 addr, u8 
*data,
return;
}
 
-   if (addr == 0x70)
+   if (addr == RTC_BASE_ADDRESS)   /* index register is write-only */
return;
 
time(&ti);
@@ -127,7 +136,7 @@ static void generate_rtc_fdt_node(void *fdt,
u8 irq,
enum irq_type))
 {
-   u64 reg_prop[2] = { cpu_to_fdt64(0x70), cpu_to_fdt64(2) };
+   u64 reg_prop[2] = { cpu_to_fdt64(RTC_BASE_ADDRESS), cpu_to_fdt64(2) };
 
_FDT(fdt_begin_node(fdt, "rtc"));
_FDT(fdt_property_string(fdt, "compatible", "motorola,mc146818"));
@@ -139,7 +148,7 @@ static void generate_rtc_fdt_node(void *fdt,
 #endif
 
 struct device_header rtc_dev_hdr = {
-   .bus_type = DEVICE_BUS_IOPORT,
+   .bus_type = RTC_BUS_TYPE,
.data = generate_rtc_fdt_node,
 };
 
@@ -151,8 +160,8 @@ int rtc__init(struct kvm *kvm)
if (r < 0)
return r;
 
-   /* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
-   r = kvm__register_pio(kvm, 0x0070, 2, cmos_ram_io, NULL);
+   r = kvm__register_iotrap(kvm, RTC_BASE_ADDRESS, 2, cmos_ram_io, NULL,
+RTC_BUS_TYPE);
if (r < 0)
goto out_device;
 
@@ -170,8 +179,7 @@ dev_init(rtc__init);
 
 int rtc__exit(struct kvm *kvm)
 {
-   /* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
-   kvm__deregister_pio(kvm, 0x0070);
+   kvm__deregister_iotrap(kvm, RTC_BASE_ADDRESS, RTC_BUS_TYPE);
 
return 0;
 }
-- 
2.17.5



[PATCH kvmtool v3 21/22] hw/serial: ARM/arm64: Use MMIO at higher addresses

2021-03-15 Thread Andre Przywara
Using the UART devices at their legacy I/O addresses as set by IBM in
1981 was a kludge we used for simplicity on ARM platforms as well.
However this poses problems due to their missing alignment and overlap
with the PCI I/O address space.

Now that we can switch a device easily between using ioports and MMIO,
let's move the UARTs out of the first 4K of memory on ARM platforms.

That should be transparent for well behaved guests, since the change is
naturally reflected in the device tree. Even "earlycon" keeps working,
as the stdout-path property is adjusted automatically.

People providing direct earlycon parameters via the command line need to
adjust it to: "earlycon=uart,mmio,0x100".

Signed-off-by: Andre Przywara 
---
 arm/include/arm-common/kvm-arch.h |  7 ++--
 hw/serial.c   | 54 +--
 2 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/arm/include/arm-common/kvm-arch.h 
b/arm/include/arm-common/kvm-arch.h
index a2e32953..bf34d742 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -13,8 +13,8 @@
  * 0  64K  16M 32M 48M1GB   2GB
  * +---++---+---++-+-+---..
  * |  PCI  || plat  |   || | |
- * |  I/O  || MMIO  | Flash | virtio | GIC |   PCI   |  DRAM
- * | space ||   |   |  MMIO  | |  (AXI)  |
+ * |  I/O  || MMIO: | Flash | virtio | GIC |   PCI   |  DRAM
+ * | space || UART  |   |  MMIO  | |  (AXI)  |
  * |   ||   |   || | |
  * +---++---+---++-+-+---..
  */
@@ -28,6 +28,9 @@
#define ARM_IOPORT_SIZE (1U << 16)
 
 
+#define ARM_UART_MMIO_BASE ARM_MMIO_AREA
+#define ARM_UART_MMIO_SIZE 0x10000
+
#define KVM_FLASH_MMIO_BASE (ARM_MMIO_AREA + 0x1000000)
#define KVM_FLASH_MAX_SIZE 0x1000000
 
diff --git a/hw/serial.c b/hw/serial.c
index 16af493b..3d533623 100644
--- a/hw/serial.c
+++ b/hw/serial.c
@@ -13,6 +13,24 @@
 
 #include 
 
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
+#define serial_iobase(nr)  (ARM_UART_MMIO_BASE + (nr) * 0x1000)
+#define serial_irq(nr) (32 + (nr))
+#define SERIAL8250_BUS_TYPE    DEVICE_BUS_MMIO
+#else
+#define serial_iobase_0    (KVM_IOPORT_AREA + 0x3f8)
+#define serial_iobase_1    (KVM_IOPORT_AREA + 0x2f8)
+#define serial_iobase_2    (KVM_IOPORT_AREA + 0x3e8)
+#define serial_iobase_3    (KVM_IOPORT_AREA + 0x2e8)
+#define serial_irq_0   4
+#define serial_irq_1   3
+#define serial_irq_2   4
+#define serial_irq_3   3
+#define serial_iobase(nr)  serial_iobase_##nr
+#define serial_irq(nr) serial_irq_##nr
+#define SERIAL8250_BUS_TYPE    DEVICE_BUS_IOPORT
+#endif
+
 /*
  * This fakes a U6_16550A. The fifo len needs to be 64 as the kernel
  * expects that for autodetection.
@@ -27,7 +45,7 @@ struct serial8250_device {
struct mutexmutex;
u8  id;
 
-   u16 iobase;
+   u32 iobase;
u8  irq;
u8  irq_state;
int txcnt;
@@ -65,56 +83,56 @@ static struct serial8250_device devices[] = {
/* ttyS0 */
[0] = {
.dev_hdr = {
-   .bus_type   = DEVICE_BUS_IOPORT,
+   .bus_type   = SERIAL8250_BUS_TYPE,
.data   = serial8250_generate_fdt_node,
},
.mutex  = MUTEX_INITIALIZER,
 
.id = 0,
-   .iobase = 0x3f8,
-   .irq= 4,
+   .iobase = serial_iobase(0),
+   .irq= serial_irq(0),
 
SERIAL_REGS_SETTING
},
/* ttyS1 */
[1] = {
.dev_hdr = {
-   .bus_type   = DEVICE_BUS_IOPORT,
+   .bus_type   = SERIAL8250_BUS_TYPE,
.data   = serial8250_generate_fdt_node,
},
.mutex  = MUTEX_INITIALIZER,
 
.id = 1,
-   .iobase = 0x2f8,
-   .irq= 3,
+   .iobase = serial_iobase(1),
+   .irq= serial_irq(1),
 
SERIAL_REGS_SETTING
},
/* ttyS2 */
[2] = {
.dev_hdr = {
-   .bus_type   = DEVICE_BUS_IOPORT,
+   .bus_type   = SERIAL8250_BUS_TYPE,
.data   = serial8250_generate_fdt_node,
},
.mutex   

[PATCH kvmtool v3 20/22] arm: Reorganise and document memory map

2021-03-15 Thread Andre Przywara
The hardcoded memory map we expose to a guest is currently described
using a series of partially interconnected preprocessor constants,
which is hard to read and follow.

In preparation for moving the UART and RTC to some different MMIO
region, document the current map with some ASCII art, and clean up the
definition of the sections.

This changes the only internally used value of ARM_MMIO_AREA, to better
align with its actual meaning and future extensions.

No functional change.
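
Since the regions are now defined in terms of their neighbours, the layout
can be checked mechanically. A hedged sketch (not part of the patch), using
the values from the diff below:

#define ARM_MMIO_AREA           0x01000000UL    /* 16M */
#define KVM_FLASH_MMIO_BASE     (ARM_MMIO_AREA + 0x1000000)
#define KVM_FLASH_MAX_SIZE      0x1000000
#define KVM_VIRTIO_MMIO_AREA    (KVM_FLASH_MMIO_BASE + KVM_FLASH_MAX_SIZE)

_Static_assert(KVM_VIRTIO_MMIO_AREA == 0x03000000UL,
               "virtio MMIO must start at 48M, right after the flash");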

Signed-off-by: Andre Przywara 
---
 arm/include/arm-common/kvm-arch.h | 41 ++-
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/arm/include/arm-common/kvm-arch.h 
b/arm/include/arm-common/kvm-arch.h
index d84e50cd..a2e32953 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -7,14 +7,33 @@
 
 #include "arm-common/gic.h"
 
+/*
+ * The memory map used for ARM guests (not to scale):
+ *
+ * 0  64K  16M 32M 48M1GB   2GB
+ * +---++---+---++-+-+---..
+ * |  PCI  || plat  |   || | |
+ * |  I/O  || MMIO  | Flash | virtio | GIC |   PCI   |  DRAM
+ * | space ||   |   |  MMIO  | |  (AXI)  |
+ * |   ||   |   || | |
+ * +---++---+---++-+-+---..
+ */
+
 #define ARM_IOPORT_AREA    _AC(0x00000000, UL)
-#define ARM_FLASH_AREA     _AC(0x02000000, UL)
-#define ARM_MMIO_AREA      _AC(0x03000000, UL)
+#define ARM_MMIO_AREA      _AC(0x01000000, UL)
 #define ARM_AXI_AREA       _AC(0x40000000, UL)
 #define ARM_MEMORY_AREA    _AC(0x80000000, UL)
 
-#define ARM_LOMAP_MAX_MEMORY   ((1ULL << 32) - ARM_MEMORY_AREA)
-#define ARM_HIMAP_MAX_MEMORY   ((1ULL << 40) - ARM_MEMORY_AREA)
+#define KVM_IOPORT_AREA    ARM_IOPORT_AREA
+#define ARM_IOPORT_SIZE    (1U << 16)
+
+
+#define KVM_FLASH_MMIO_BASE    (ARM_MMIO_AREA + 0x1000000)
+#define KVM_FLASH_MAX_SIZE     0x1000000
+
+#define KVM_VIRTIO_MMIO_AREA   (KVM_FLASH_MMIO_BASE + KVM_FLASH_MAX_SIZE)
+#define ARM_VIRTIO_MMIO_SIZE   (ARM_AXI_AREA - \
+   (KVM_VIRTIO_MMIO_AREA + ARM_GIC_SIZE))
 
 #define ARM_GIC_DIST_BASE  (ARM_AXI_AREA - ARM_GIC_DIST_SIZE)
 #define ARM_GIC_CPUI_BASE  (ARM_GIC_DIST_BASE - ARM_GIC_CPUI_SIZE)
@@ -22,19 +41,17 @@
 #define ARM_GIC_DIST_SIZE  0x10000
 #define ARM_GIC_CPUI_SIZE  0x20000
 
-#define KVM_FLASH_MMIO_BASE    ARM_FLASH_AREA
-#define KVM_FLASH_MAX_SIZE (ARM_MMIO_AREA - ARM_FLASH_AREA)
 
-#define ARM_IOPORT_SIZE    (1U << 16)
-#define ARM_VIRTIO_MMIO_SIZE   (ARM_AXI_AREA - (ARM_MMIO_AREA + ARM_GIC_SIZE))
+#define KVM_PCI_CFG_AREA   ARM_AXI_AREA
 #define ARM_PCI_CFG_SIZE   (1ULL << 24)
+#define KVM_PCI_MMIO_AREA  (KVM_PCI_CFG_AREA + ARM_PCI_CFG_SIZE)
 #define ARM_PCI_MMIO_SIZE  (ARM_MEMORY_AREA - \
(ARM_AXI_AREA + ARM_PCI_CFG_SIZE))
 
-#define KVM_IOPORT_AREA    ARM_IOPORT_AREA
-#define KVM_PCI_CFG_AREA   ARM_AXI_AREA
-#define KVM_PCI_MMIO_AREA  (KVM_PCI_CFG_AREA + ARM_PCI_CFG_SIZE)
-#define KVM_VIRTIO_MMIO_AREA   ARM_MMIO_AREA
+
+#define ARM_LOMAP_MAX_MEMORY   ((1ULL << 32) - ARM_MEMORY_AREA)
+#define ARM_HIMAP_MAX_MEMORY   ((1ULL << 40) - ARM_MEMORY_AREA)
+
 
 #define KVM_IOEVENTFD_HAS_PIO  0
 
-- 
2.17.5



[PATCH kvmtool v3 19/22] Remove ioport specific routines

2021-03-15 Thread Andre Przywara
Now that all users of the dedicated ioport trap handler interface are
gone, we can retire the code associated with it.

This removes ioport.c and ioport.h, along with removing prototypes from
other header files.

This also transfers the responsibility for port I/O trap handling
entirely into the new routine in mmio.c.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 Makefile |   1 -
 include/kvm/ioport.h |  27 --
 include/kvm/kvm.h|   2 -
 ioport.c | 195 ---
 mmio.c   |   2 +-
 5 files changed, 1 insertion(+), 226 deletions(-)
 delete mode 100644 ioport.c

diff --git a/Makefile b/Makefile
index 35bb1182..94ff5da6 100644
--- a/Makefile
+++ b/Makefile
@@ -56,7 +56,6 @@ OBJS  += framebuffer.o
 OBJS   += guest_compat.o
 OBJS   += hw/rtc.o
 OBJS   += hw/serial.o
-OBJS   += ioport.o
 OBJS   += irq.o
 OBJS   += kvm-cpu.o
 OBJS   += kvm.o
diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h
index a61038e2..b6f579cb 100644
--- a/include/kvm/ioport.h
+++ b/include/kvm/ioport.h
@@ -1,13 +1,8 @@
 #ifndef KVM__IOPORT_H
 #define KVM__IOPORT_H
 
-#include "kvm/devices.h"
 #include "kvm/kvm-cpu.h"
-#include "kvm/rbtree-interval.h"
-#include "kvm/fdt.h"
 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -15,30 +10,8 @@
 /* some ports we reserve for own use */
 #define IOPORT_DBG 0xe0
 
-struct kvm;
-
-struct ioport {
-   struct rb_int_node  node;
-   struct ioport_operations*ops;
-   void*priv;
-   struct device_headerdev_hdr;
-   u32 refcount;
-   boolremove;
-};
-
-struct ioport_operations {
-   bool (*io_in)(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size);
-   bool (*io_out)(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size);
-};
-
 void ioport__map_irq(u8 *irq);
 
-int __must_check ioport__register(struct kvm *kvm, u16 port, struct 
ioport_operations *ops,
- int count, void *param);
-int ioport__unregister(struct kvm *kvm, u16 port);
-int ioport__init(struct kvm *kvm);
-int ioport__exit(struct kvm *kvm);
-
 static inline u8 ioport__read8(u8 *data)
 {
return *data;
diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index 306b258a..6c28afa3 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -126,8 +126,6 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level);
 void kvm__irq_trigger(struct kvm *kvm, int irq);
 bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int 
direction, int size, u32 count);
 bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, 
u8 is_write);
-bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data,
- int direction, int size, u32 count);
 int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void 
*userspace_addr);
 int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void 
*userspace_addr,
  enum kvm_mem_type type);
diff --git a/ioport.c b/ioport.c
deleted file mode 100644
index ce29e7e7..
--- a/ioport.c
+++ /dev/null
@@ -1,195 +0,0 @@
-#include "kvm/ioport.h"
-
-#include "kvm/kvm.h"
-#include "kvm/util.h"
-#include "kvm/rbtree-interval.h"
-#include "kvm/mutex.h"
-
-#include  /* for KVM_EXIT_* */
-#include 
-
-#include 
-#include 
-#include 
-#include 
-
-#define ioport_node(n) rb_entry(n, struct ioport, node)
-
-static DEFINE_MUTEX(ioport_lock);
-
-static struct rb_root  ioport_tree = RB_ROOT;
-
-static struct ioport *ioport_search(struct rb_root *root, u64 addr)
-{
-   struct rb_int_node *node;
-
-   node = rb_int_search_single(root, addr);
-   if (node == NULL)
-   return NULL;
-
-   return ioport_node(node);
-}
-
-static int ioport_insert(struct rb_root *root, struct ioport *data)
-{
-   return rb_int_insert(root, &data->node);
-}
-
-static void ioport_remove(struct rb_root *root, struct ioport *data)
-{
-   rb_int_erase(root, &data->node);
-}
-
-static struct ioport *ioport_get(struct rb_root *root, u64 addr)
-{
-   struct ioport *ioport;
-
-   mutex_lock(&ioport_lock);
-   ioport = ioport_search(root, addr);
-   if (ioport)
-   ioport->refcount++;
-   mutex_unlock(&ioport_lock);
-
-   return ioport;
-}
-
-/* Called with ioport_lock held. */
-static void ioport_unregister(struct rb_root *root, struct ioport *data)
-{
-   ioport_remove(root, data);
-   free(data);
-}
-
-static void ioport_put(struct rb_root *root, struct ioport *data)
-{
-   mutex_lock(&ioport_lock);
-   data->refcount--;
-   if (data->remove && data->refcount == 0)
-   ioport_unregister(root, data);
-   mutex_unlock(&ioport_lock);
-}
-
-int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, 
int count, void *param)
-{
-   

[PATCH kvmtool v3 15/22] vfio: Refactor ioport trap handler

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide an emulation function compatible with the MMIO prototype.

Adjust the I/O port trap handler to use that new function, and provide
shims to implement the old ioport interface, for now.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 vfio/core.c | 51 ---
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/vfio/core.c b/vfio/core.c
index 0b45e78b..ddd3c2c7 100644
--- a/vfio/core.c
+++ b/vfio/core.c
@@ -81,15 +81,12 @@ out_free_buf:
return ret;
 }
 
-static bool vfio_ioport_in(struct ioport *ioport, struct kvm_cpu *vcpu,
-  u16 port, void *data, int len)
+static bool _vfio_ioport_in(struct vfio_region *region, u32 offset,
+   void *data, int len)
 {
-   u32 val;
-   ssize_t nr;
-   struct vfio_region *region = ioport->priv;
struct vfio_device *vdev = region->vdev;
-
-   u32 offset = port - region->port_base;
+   ssize_t nr;
+   u32 val;
 
if (!(region->info.flags & VFIO_REGION_INFO_FLAG_READ))
return false;
@@ -97,7 +94,7 @@ static bool vfio_ioport_in(struct ioport *ioport, struct 
kvm_cpu *vcpu,
nr = pread(vdev->fd, &val, len, region->info.offset + offset);
if (nr != len) {
vfio_dev_err(vdev, "could not read %d bytes from I/O port 
0x%x\n",
-len, port);
+len, offset + region->port_base);
return false;
}
 
@@ -118,15 +115,13 @@ static bool vfio_ioport_in(struct ioport *ioport, struct 
kvm_cpu *vcpu,
return true;
 }
 
-static bool vfio_ioport_out(struct ioport *ioport, struct kvm_cpu *vcpu,
-   u16 port, void *data, int len)
+static bool _vfio_ioport_out(struct vfio_region *region, u32 offset,
+void *data, int len)
 {
-   u32 val;
-   ssize_t nr;
-   struct vfio_region *region = ioport->priv;
struct vfio_device *vdev = region->vdev;
+   ssize_t nr;
+   u32 val;
 
-   u32 offset = port - region->port_base;
 
if (!(region->info.flags & VFIO_REGION_INFO_FLAG_WRITE))
return false;
@@ -148,11 +143,37 @@ static bool vfio_ioport_out(struct ioport *ioport, struct 
kvm_cpu *vcpu,
nr = pwrite(vdev->fd, &val, len, region->info.offset + offset);
if (nr != len)
vfio_dev_err(vdev, "could not write %d bytes to I/O port 0x%x",
-len, port);
+len, offset + region->port_base);
 
return nr == len;
 }
 
+static void vfio_ioport_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+u8 is_write, void *ptr)
+{
+   struct vfio_region *region = ptr;
+   u32 offset = addr - region->port_base;
+
+   if (is_write)
+   _vfio_ioport_out(region, offset, data, len);
+   else
+   _vfio_ioport_in(region, offset, data, len);
+}
+
+static bool vfio_ioport_out(struct ioport *ioport, struct kvm_cpu *vcpu,
+   u16 port, void *data, int len)
+{
+   vfio_ioport_mmio(vcpu, port, data, len, true, ioport->priv);
+   return true;
+}
+
+static bool vfio_ioport_in(struct ioport *ioport, struct kvm_cpu *vcpu,
+  u16 port, void *data, int len)
+{
+   vfio_ioport_mmio(vcpu, port, data, len, false, ioport->priv);
+   return true;
+}
+
 static struct ioport_operations vfio_ioport_ops = {
.io_in  = vfio_ioport_in,
.io_out = vfio_ioport_out,
-- 
2.17.5



[PATCH kvmtool v3 18/22] pci: Switch trap handling to use MMIO handler

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide an emulation function compatible with the MMIO prototype.

Merge the existing _in and _out handlers to adhere to that MMIO
interface, and register these using the new registration function.
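
For context, these two ports implement PCI configuration mechanism #1. A
guest-side sketch (outl()/inl() are assumed helper names, not part of this
patch; PCI_CONFIG_ADDRESS/PCI_CONFIG_DATA are the usual 0xcf8/0xcfc ports):

static uint32_t pci_cfg_read(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
        /* Bit 31 enables the access; reg is 4-byte aligned in the address. */
        outl(PCI_CONFIG_ADDRESS, (1u << 31) | ((uint32_t)bus << 16) |
                                 ((uint32_t)dev << 11) | ((uint32_t)fn << 8) |
                                 (reg & 0xfc));
        /* Sub-word accesses use PCI_CONFIG_DATA + (reg & 3), as the
         * handler's comment explains. */
        return inl(PCI_CONFIG_DATA);
}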

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 pci.c | 82 +--
 1 file changed, 24 insertions(+), 58 deletions(-)

diff --git a/pci.c b/pci.c
index 2e2c0270..d6da79e0 100644
--- a/pci.c
+++ b/pci.c
@@ -87,29 +87,16 @@ static void *pci_config_address_ptr(u16 port)
return base + offset;
 }
 
-static bool pci_config_address_out(struct ioport *ioport, struct kvm_cpu 
*vcpu, u16 port, void *data, int size)
+static void pci_config_address_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+   u32 len, u8 is_write, void *ptr)
 {
-   void *p = pci_config_address_ptr(port);
+   void *p = pci_config_address_ptr(addr);
 
-   memcpy(p, data, size);
-
-   return true;
-}
-
-static bool pci_config_address_in(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
-{
-   void *p = pci_config_address_ptr(port);
-
-   memcpy(data, p, size);
-
-   return true;
+   if (is_write)
+   memcpy(p, data, len);
+   else
+   memcpy(data, p, len);
 }
-
-static struct ioport_operations pci_config_address_ops = {
-   .io_in  = pci_config_address_in,
-   .io_out = pci_config_address_out,
-};
-
 static bool pci_device_exists(u8 bus_number, u8 device_number, u8 
function_number)
 {
union pci_config_address pci_config_address;
@@ -125,49 +112,27 @@ static bool pci_device_exists(u8 bus_number, u8 
device_number, u8 function_numbe
return !IS_ERR_OR_NULL(device__find_dev(DEVICE_BUS_PCI, device_number));
 }
 
-static bool pci_config_data_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
-{
-   union pci_config_address pci_config_address;
-
-   if (size > 4)
-   size = 4;
-
pci_config_address.w = ioport__read32(&pci_config_address_bits);
-   /*
-* If someone accesses PCI configuration space offsets that are not
-* aligned to 4 bytes, it uses ioports to signify that.
-*/
-   pci_config_address.reg_offset = port - PCI_CONFIG_DATA;
-
-   pci__config_wr(vcpu->kvm, pci_config_address, data, size);
-
-   return true;
-}
-
-static bool pci_config_data_in(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
+static void pci_config_data_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+u32 len, u8 is_write, void *kvm)
 {
union pci_config_address pci_config_address;
 
-   if (size > 4)
-   size = 4;
+   if (len > 4)
+   len = 4;
 
pci_config_address.w = ioport__read32(&pci_config_address_bits);
/*
 * If someone accesses PCI configuration space offsets that are not
 * aligned to 4 bytes, it uses ioports to signify that.
 */
-   pci_config_address.reg_offset = port - PCI_CONFIG_DATA;
+   pci_config_address.reg_offset = addr - PCI_CONFIG_DATA;
 
-   pci__config_rd(vcpu->kvm, pci_config_address, data, size);
-
-   return true;
+   if (is_write)
+   pci__config_wr(vcpu->kvm, pci_config_address, data, len);
+   else
+   pci__config_rd(vcpu->kvm, pci_config_address, data, len);
 }
 
-static struct ioport_operations pci_config_data_ops = {
-   .io_in  = pci_config_data_in,
-   .io_out = pci_config_data_out,
-};
-
 static int pci_activate_bar(struct kvm *kvm, struct pci_device_header *pci_hdr,
int bar_num)
 {
@@ -512,11 +477,12 @@ int pci__init(struct kvm *kvm)
 {
int r;
 
-   r = ioport__register(kvm, PCI_CONFIG_DATA + 0, &pci_config_data_ops, 4, NULL);
+   r = kvm__register_pio(kvm, PCI_CONFIG_DATA, 4,
+pci_config_data_mmio, NULL);
if (r < 0)
return r;
-
-   r = ioport__register(kvm, PCI_CONFIG_ADDRESS + 0, &pci_config_address_ops, 4, NULL);
+   r = kvm__register_pio(kvm, PCI_CONFIG_ADDRESS, 4,
+pci_config_address_mmio, NULL);
if (r < 0)
goto err_unregister_data;
 
@@ -528,17 +494,17 @@ int pci__init(struct kvm *kvm)
return 0;
 
 err_unregister_addr:
-   ioport__unregister(kvm, PCI_CONFIG_ADDRESS);
+   kvm__deregister_pio(kvm, PCI_CONFIG_ADDRESS);
 err_unregister_data:
-   ioport__unregister(kvm, PCI_CONFIG_DATA);
+   kvm__deregister_pio(kvm, PCI_CONFIG_DATA);
return r;
 }
 dev_base_init(pci__init);
 
 int pci__exit(struct kvm *kvm)
 {
-   ioport__unregister(kvm, PCI_CONFIG_DATA);
-   ioport__unregister(kvm, PCI_CONFIG_ADDRESS);
+   kvm__deregister_pio(kvm, PCI_CONFIG_DATA);
+   

[PATCH kvmtool v3 17/22] virtio: Switch trap handling to use MMIO handler

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide an emulation function compatible with the MMIO prototype.

Adjust the existing MMIO callback routine to automatically determine
the region this trap came through, and call the existing I/O handlers.
Register the ioport region using the new registration function.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 virtio/pci.c | 46 ++
 1 file changed, 14 insertions(+), 32 deletions(-)

diff --git a/virtio/pci.c b/virtio/pci.c
index 6eea6c68..eb91f512 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -178,15 +178,6 @@ static bool virtio_pci__data_in(struct kvm_cpu *vcpu, 
struct virtio_device *vdev
return ret;
 }
 
-static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
-{
-   struct virtio_device *vdev = ioport->priv;
-   struct virtio_pci *vpci = vdev->virtio;
-   unsigned long offset = port - virtio_pci__port_addr(vpci);
-
-   return virtio_pci__data_in(vcpu, vdev, offset, data, size);
-}
-
 static void update_msix_map(struct virtio_pci *vpci,
struct msix_table *msix_entry, u32 vecnum)
 {
@@ -334,20 +325,6 @@ static bool virtio_pci__data_out(struct kvm_cpu *vcpu, 
struct virtio_device *vde
return ret;
 }
 
-static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
-{
-   struct virtio_device *vdev = ioport->priv;
-   struct virtio_pci *vpci = vdev->virtio;
-   unsigned long offset = port - virtio_pci__port_addr(vpci);
-
-   return virtio_pci__data_out(vcpu, vdev, offset, data, size);
-}
-
-static struct ioport_operations virtio_pci__io_ops = {
-   .io_in  = virtio_pci__io_in,
-   .io_out = virtio_pci__io_out,
-};
-
 static void virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu,
   u64 addr, u8 *data, u32 len,
   u8 is_write, void *ptr)
@@ -455,12 +432,19 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu 
*vcpu,
 {
struct virtio_device *vdev = ptr;
struct virtio_pci *vpci = vdev->virtio;
-   u32 mmio_addr = virtio_pci__mmio_addr(vpci);
+   u32 ioport_addr = virtio_pci__port_addr(vpci);
+   u32 base_addr;
+
+   if (addr >= ioport_addr &&
+   addr < ioport_addr + pci__bar_size(&vpci->pci_hdr, 0))
+   base_addr = ioport_addr;
+   else
+   base_addr = virtio_pci__mmio_addr(vpci);
 
if (!is_write)
-   virtio_pci__data_in(vcpu, vdev, addr - mmio_addr, data, len);
+   virtio_pci__data_in(vcpu, vdev, addr - base_addr, data, len);
else
-   virtio_pci__data_out(vcpu, vdev, addr - mmio_addr, data, len);
+   virtio_pci__data_out(vcpu, vdev, addr - base_addr, data, len);
 }
 
 static int virtio_pci__bar_activate(struct kvm *kvm,
@@ -478,10 +462,8 @@ static int virtio_pci__bar_activate(struct kvm *kvm,
 
switch (bar_num) {
case 0:
-   r = ioport__register(kvm, bar_addr, &virtio_pci__io_ops,
-bar_size, vdev);
-   if (r > 0)
-   r = 0;
+   r = kvm__register_pio(kvm, bar_addr, bar_size,
+ virtio_pci__io_mmio_callback, vdev);
break;
case 1:
r =  kvm__register_mmio(kvm, bar_addr, bar_size, false,
@@ -510,7 +492,7 @@ static int virtio_pci__bar_deactivate(struct kvm *kvm,
 
switch (bar_num) {
case 0:
-   r = ioport__unregister(kvm, bar_addr);
+   r = kvm__deregister_pio(kvm, bar_addr);
break;
case 1:
case 2:
@@ -625,7 +607,7 @@ int virtio_pci__exit(struct kvm *kvm, struct virtio_device 
*vdev)
virtio_pci__reset(kvm, vdev);
kvm__deregister_mmio(kvm, virtio_pci__mmio_addr(vpci));
kvm__deregister_mmio(kvm, virtio_pci__msix_io_addr(vpci));
-   ioport__unregister(kvm, virtio_pci__port_addr(vpci));
+   kvm__deregister_pio(kvm, virtio_pci__port_addr(vpci));
 
return 0;
 }
-- 
2.17.5



[PATCH kvmtool v3 16/22] vfio: Switch to new ioport trap handlers

2021-03-15 Thread Andre Przywara
Now that the vfio device has a trap handler adhering to the MMIO fault
handler prototype, let's switch over to the joint registration routine.

This allows us to get rid of the ioport shim routines.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 vfio/core.c | 37 ++---
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/vfio/core.c b/vfio/core.c
index ddd3c2c7..3ff2c0b0 100644
--- a/vfio/core.c
+++ b/vfio/core.c
@@ -81,7 +81,7 @@ out_free_buf:
return ret;
 }
 
-static bool _vfio_ioport_in(struct vfio_region *region, u32 offset,
+static bool vfio_ioport_in(struct vfio_region *region, u32 offset,
void *data, int len)
 {
struct vfio_device *vdev = region->vdev;
@@ -115,7 +115,7 @@ static bool _vfio_ioport_in(struct vfio_region *region, u32 
offset,
return true;
 }
 
-static bool _vfio_ioport_out(struct vfio_region *region, u32 offset,
+static bool vfio_ioport_out(struct vfio_region *region, u32 offset,
 void *data, int len)
 {
struct vfio_device *vdev = region->vdev;
@@ -155,30 +155,11 @@ static void vfio_ioport_mmio(struct kvm_cpu *vcpu, u64 
addr, u8 *data, u32 len,
u32 offset = addr - region->port_base;
 
if (is_write)
-   _vfio_ioport_out(region, offset, data, len);
+   vfio_ioport_out(region, offset, data, len);
else
-   _vfio_ioport_in(region, offset, data, len);
+   vfio_ioport_in(region, offset, data, len);
 }
 
-static bool vfio_ioport_out(struct ioport *ioport, struct kvm_cpu *vcpu,
-   u16 port, void *data, int len)
-{
-   vfio_ioport_mmio(vcpu, port, data, len, true, ioport->priv);
-   return true;
-}
-
-static bool vfio_ioport_in(struct ioport *ioport, struct kvm_cpu *vcpu,
-  u16 port, void *data, int len)
-{
-   vfio_ioport_mmio(vcpu, port, data, len, false, ioport->priv);
-   return true;
-}
-
-static struct ioport_operations vfio_ioport_ops = {
-   .io_in  = vfio_ioport_in,
-   .io_out = vfio_ioport_out,
-};
-
 static void vfio_mmio_access(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
 u8 is_write, void *ptr)
 {
@@ -223,9 +204,11 @@ static int vfio_setup_trap_region(struct kvm *kvm, struct 
vfio_device *vdev,
  struct vfio_region *region)
 {
if (region->is_ioport) {
-   int port = ioport__register(kvm, region->port_base,
-  &vfio_ioport_ops, region->info.size,
-  region);
+   int port;
+
+   port = kvm__register_pio(kvm, region->port_base,
+region->info.size, vfio_ioport_mmio,
+region);
if (port < 0)
return port;
return 0;
@@ -292,7 +275,7 @@ void vfio_unmap_region(struct kvm *kvm, struct vfio_region 
*region)
munmap(region->host_addr, region->info.size);
region->host_addr = NULL;
} else if (region->is_ioport) {
-   ioport__unregister(kvm, region->port_base);
+   kvm__deregister_pio(kvm, region->port_base);
} else {
kvm__deregister_mmio(kvm, region->guest_phys_addr);
}
-- 
2.17.5



[PATCH kvmtool v3 14/22] hw/serial: Switch to new trap handlers

2021-03-15 Thread Andre Przywara
Now that the serial device has a trap handler adhering to the MMIO fault
handler prototype, let's switch over to the joint registration routine.

This allows us to get rid of the ioport shim routines.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/serial.c | 31 +++
 1 file changed, 3 insertions(+), 28 deletions(-)

diff --git a/hw/serial.c b/hw/serial.c
index 3f797452..16af493b 100644
--- a/hw/serial.c
+++ b/hw/serial.c
@@ -393,26 +393,6 @@ static void serial8250_mmio(struct kvm_cpu *vcpu, u64 
addr, u8 *data, u32 len,
serial8250_in(dev, vcpu, addr - dev->iobase, data);
 }
 
-static bool serial8250_ioport_out(struct ioport *ioport, struct kvm_cpu *vcpu,
- u16 port, void *data, int size)
-{
-   struct serial8250_device *dev = ioport->priv;
-
-   serial8250_mmio(vcpu, port, data, 1, true, dev);
-
-   return true;
-}
-
-static bool serial8250_ioport_in(struct ioport *ioport, struct kvm_cpu *vcpu,
-u16 port, void *data, int size)
-{
-   struct serial8250_device *dev = ioport->priv;
-
-   serial8250_mmio(vcpu, port, data, 1, false, dev);
-
-   return true;
-}
-
 #ifdef CONFIG_HAS_LIBFDT
 
 char *fdt_stdout_path = NULL;
@@ -450,11 +430,6 @@ void serial8250_generate_fdt_node(void *fdt, struct 
device_header *dev_hdr,
 }
 #endif
 
-static struct ioport_operations serial8250_ops = {
-   .io_in  = serial8250_ioport_in,
-   .io_out = serial8250_ioport_out,
-};
-
 static int serial8250__device_init(struct kvm *kvm,
   struct serial8250_device *dev)
 {
@@ -465,7 +440,7 @@ static int serial8250__device_init(struct kvm *kvm,
return r;
 
ioport__map_irq(&dev->irq);
-   r = ioport__register(kvm, dev->iobase, &serial8250_ops, 8, dev);
+   r = kvm__register_pio(kvm, dev->iobase, 8, serial8250_mmio, dev);
 
return r;
 }
@@ -488,7 +463,7 @@ cleanup:
for (j = 0; j <= i; j++) {
struct serial8250_device *dev = &devices[j];
 
-   ioport__unregister(kvm, dev->iobase);
+   kvm__deregister_pio(kvm, dev->iobase);
device__unregister(&dev->dev_hdr);
}
 
@@ -504,7 +479,7 @@ int serial8250__exit(struct kvm *kvm)
for (i = 0; i < ARRAY_SIZE(devices); i++) {
struct serial8250_device *dev = &devices[i];
 
-   r = ioport__unregister(kvm, dev->iobase);
+   r = kvm__deregister_pio(kvm, dev->iobase);
if (r < 0)
return r;
device__unregister(&dev->dev_hdr);
-- 
2.17.5



[PATCH kvmtool v3 13/22] hw/serial: Refactor trap handler

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide an emulation function compatible with the MMIO prototype.

Adjust the trap handler to use that new function, and provide shims to
implement the old ioport interface, for now.

Signed-off-by: Andre Przywara 
---
 hw/serial.c | 50 +-
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/hw/serial.c b/hw/serial.c
index b0465d99..3f797452 100644
--- a/hw/serial.c
+++ b/hw/serial.c
@@ -242,18 +242,14 @@ void serial8250__inject_sysrq(struct kvm *kvm, char sysrq)
sysrq_pending = sysrq;
 }
 
-static bool serial8250_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port,
-  void *data, int size)
+static bool serial8250_out(struct serial8250_device *dev, struct kvm_cpu *vcpu,
+  u16 offset, void *data)
 {
-   struct serial8250_device *dev = ioport->priv;
-   u16 offset;
bool ret = true;
char *addr = data;
 
mutex_lock(&dev->mutex);
 
-   offset = port - dev->iobase;
-
switch (offset) {
case UART_TX:
if (dev->lcr & UART_LCR_DLAB) {
@@ -336,16 +332,13 @@ static void serial8250_rx(struct serial8250_device *dev, 
void *data)
}
 }
 
-static bool serial8250_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
+static bool serial8250_in(struct serial8250_device *dev, struct kvm_cpu *vcpu,
+ u16 offset, void *data)
 {
-   struct serial8250_device *dev = ioport->priv;
-   u16 offset;
bool ret = true;
 
mutex_lock(&dev->mutex);
 
-   offset = port - dev->iobase;
-
switch (offset) {
case UART_RX:
if (dev->lcr & UART_LCR_DLAB)
@@ -389,6 +382,37 @@ static bool serial8250_in(struct ioport *ioport, struct 
kvm_cpu *vcpu, u16 port,
return ret;
 }
 
+static void serial8250_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+   u8 is_write, void *ptr)
+{
+   struct serial8250_device *dev = ptr;
+
+   if (is_write)
+   serial8250_out(dev, vcpu, addr - dev->iobase, data);
+   else
+   serial8250_in(dev, vcpu, addr - dev->iobase, data);
+}
+
+static bool serial8250_ioport_out(struct ioport *ioport, struct kvm_cpu *vcpu,
+ u16 port, void *data, int size)
+{
+   struct serial8250_device *dev = ioport->priv;
+
+   serial8250_mmio(vcpu, port, data, 1, true, dev);
+
+   return true;
+}
+
+static bool serial8250_ioport_in(struct ioport *ioport, struct kvm_cpu *vcpu,
+u16 port, void *data, int size)
+{
+   struct serial8250_device *dev = ioport->priv;
+
+   serial8250_mmio(vcpu, port, data, 1, false, dev);
+
+   return true;
+}
+
 #ifdef CONFIG_HAS_LIBFDT
 
 char *fdt_stdout_path = NULL;
@@ -427,8 +451,8 @@ void serial8250_generate_fdt_node(void *fdt, struct 
device_header *dev_hdr,
 #endif
 
 static struct ioport_operations serial8250_ops = {
-   .io_in  = serial8250_in,
-   .io_out = serial8250_out,
+   .io_in  = serial8250_ioport_in,
+   .io_out = serial8250_ioport_out,
 };
 
 static int serial8250__device_init(struct kvm *kvm,
-- 
2.17.5



[PATCH kvmtool v3 12/22] hw/vesa: Switch trap handling to use MMIO handler

2021-03-15 Thread Andre Przywara
To be able to use the VESA device with the new generic I/O trap handler,
we need to use the different MMIO handler callback routine.

Replace the existing dummy in and out handlers with a joint dummy
MMIO handler, and register this using the new registration function.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/vesa.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/hw/vesa.c b/hw/vesa.c
index 8659a002..7f82cdb4 100644
--- a/hw/vesa.c
+++ b/hw/vesa.c
@@ -43,21 +43,11 @@ static struct framebuffer vesafb = {
.mem_size   = VESA_MEM_SIZE,
 };
 
-static bool vesa_pci_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
+static void vesa_pci_io(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+   u8 is_write, void *ptr)
 {
-   return true;
 }
 
-static bool vesa_pci_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
-{
-   return true;
-}
-
-static struct ioport_operations vesa_io_ops = {
-   .io_in  = vesa_pci_io_in,
-   .io_out = vesa_pci_io_out,
-};
-
 static int vesa__bar_activate(struct kvm *kvm, struct pci_device_header 
*pci_hdr,
  int bar_num, void *data)
 {
@@ -82,7 +72,8 @@ struct framebuffer *vesa__init(struct kvm *kvm)
BUILD_BUG_ON(VESA_MEM_SIZE < VESA_BPP/8 * VESA_WIDTH * VESA_HEIGHT);
 
vesa_base_addr = pci_get_io_port_block(PCI_IO_SIZE);
-   r = ioport__register(kvm, vesa_base_addr, &vesa_io_ops, PCI_IO_SIZE, NULL);
+   r = kvm__register_pio(kvm, vesa_base_addr, PCI_IO_SIZE, vesa_pci_io,
+ NULL);
if (r < 0)
goto out_error;
 
@@ -116,7 +107,7 @@ unmap_dev:
 unregister_device:
device__unregister(&vesa_device);
 unregister_ioport:
-   ioport__unregister(kvm, vesa_base_addr);
+   kvm__deregister_pio(kvm, vesa_base_addr);
 out_error:
return ERR_PTR(r);
 }
-- 
2.17.5



[PATCH kvmtool v3 11/22] hw/rtc: Switch to new trap handler

2021-03-15 Thread Andre Przywara
Now that the RTC device has a trap handler adhering to the MMIO fault
handler prototype, let's switch over to the joint registration routine.

This allows us to get rid of the ioport shim routines.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/rtc.c | 21 ++---
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/hw/rtc.c b/hw/rtc.c
index 664d4cb0..ee4c9102 100644
--- a/hw/rtc.c
+++ b/hw/rtc.c
@@ -120,23 +120,6 @@ static void cmos_ram_io(struct kvm_cpu *vcpu, u64 addr, u8 
*data,
}
 }
 
-static bool cmos_ram_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size)
-{
-   cmos_ram_io(vcpu, port, data, size, false, NULL);
-   return true;
-}
-
-static bool cmos_ram_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
-{
-   cmos_ram_io(vcpu, port, data, size, true, NULL);
-   return true;
-}
-
-static struct ioport_operations cmos_ram_ioport_ops = {
-   .io_out = cmos_ram_out,
-   .io_in  = cmos_ram_in,
-};
-
 #ifdef CONFIG_HAS_LIBFDT
 static void generate_rtc_fdt_node(void *fdt,
  struct device_header *dev_hdr,
@@ -169,7 +152,7 @@ int rtc__init(struct kvm *kvm)
return r;
 
/* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
-   r = ioport__register(kvm, 0x0070, &cmos_ram_ioport_ops, 2, NULL);
+   r = kvm__register_pio(kvm, 0x0070, 2, cmos_ram_io, NULL);
if (r < 0)
goto out_device;
 
@@ -188,7 +171,7 @@ dev_init(rtc__init);
 int rtc__exit(struct kvm *kvm)
 {
/* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
-   ioport__unregister(kvm, 0x0070);
+   kvm__deregister_pio(kvm, 0x0070);
 
return 0;
 }
-- 
2.17.5



[PATCH kvmtool v3 10/22] hw/rtc: Refactor trap handlers

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide emulation functions compatible with the MMIO prototype.

Merge the two different trap handlers into one function, checking for
read/write and data/index register inside.
Adjust the trap handlers to use that new function, and provide shims to
implement the old ioport interface, for now.
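
For reference, the guest-visible protocol the merged handler now serves in
one place (a sketch; outb()/inb() are assumed helpers, and the legacy x86
port numbers are shown):

static uint8_t cmos_read(uint8_t idx)
{
        outb(0x70, idx & 0x7f);  /* index register; bit 7 gates the NMI */
        return inb(0x71);        /* data register for the selected index */
}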

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/rtc.c | 70 
 1 file changed, 35 insertions(+), 35 deletions(-)

diff --git a/hw/rtc.c b/hw/rtc.c
index 5483879f..664d4cb0 100644
--- a/hw/rtc.c
+++ b/hw/rtc.c
@@ -42,11 +42,37 @@ static inline unsigned char bin2bcd(unsigned val)
return ((val / 10) << 4) + val % 10;
 }
 
-static bool cmos_ram_data_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
+static void cmos_ram_io(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+   u32 len, u8 is_write, void *ptr)
 {
struct tm *tm;
time_t ti;
 
+   if (is_write) {
+   if (addr == 0x70) { /* index register */
+   u8 value = ioport__read8(data);
+
+   vcpu->kvm->nmi_disabled = value & (1UL << 7);
+   rtc.cmos_idx= value & ~(1UL << 7);
+
+   return;
+   }
+
+   switch (rtc.cmos_idx) {
+   case RTC_REG_C:
+   case RTC_REG_D:
+   /* Read-only */
+   break;
+   default:
+   rtc.cmos_data[rtc.cmos_idx] = ioport__read8(data);
+   break;
+   }
+   return;
+   }
+
+   if (addr == 0x70)
+   return;
+
time(&ti);
 
tm = gmtime(&ti);
@@ -92,42 +118,23 @@ static bool cmos_ram_data_in(struct ioport *ioport, struct 
kvm_cpu *vcpu, u16 po
ioport__write8(data, rtc.cmos_data[rtc.cmos_idx]);
break;
}
-
-   return true;
 }
 
-static bool cmos_ram_data_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
+static bool cmos_ram_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size)
 {
-   switch (rtc.cmos_idx) {
-   case RTC_REG_C:
-   case RTC_REG_D:
-   /* Read-only */
-   break;
-   default:
-   rtc.cmos_data[rtc.cmos_idx] = ioport__read8(data);
-   break;
-   }
-
+   cmos_ram_io(vcpu, port, data, size, false, NULL);
return true;
 }
 
-static struct ioport_operations cmos_ram_data_ioport_ops = {
-   .io_out = cmos_ram_data_out,
-   .io_in  = cmos_ram_data_in,
-};
-
-static bool cmos_ram_index_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
+static bool cmos_ram_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
 {
-   u8 value = ioport__read8(data);
-
-   vcpu->kvm->nmi_disabled = value & (1UL << 7);
-   rtc.cmos_idx= value & ~(1UL << 7);
-
+   cmos_ram_io(vcpu, port, data, size, true, NULL);
return true;
 }
 
-static struct ioport_operations cmos_ram_index_ioport_ops = {
-   .io_out = cmos_ram_index_out,
+static struct ioport_operations cmos_ram_ioport_ops = {
+   .io_out = cmos_ram_out,
+   .io_in  = cmos_ram_in,
 };
 
 #ifdef CONFIG_HAS_LIBFDT
@@ -162,21 +169,15 @@ int rtc__init(struct kvm *kvm)
return r;
 
/* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
-   r = ioport__register(kvm, 0x0070, &cmos_ram_index_ioport_ops, 1, NULL);
+   r = ioport__register(kvm, 0x0070, &cmos_ram_ioport_ops, 2, NULL);
if (r < 0)
goto out_device;
 
-   r = ioport__register(kvm, 0x0071, &cmos_ram_data_ioport_ops, 1, NULL);
-   if (r < 0)
-   goto out_ioport;
-
/* Set the VRT bit in Register D to indicate valid RAM and time */
rtc.cmos_data[RTC_REG_D] = RTC_REG_D_VRT;
 
return r;
 
-out_ioport:
-   ioport__unregister(kvm, 0x0070);
 out_device:
device__unregister(&rtc_dev_hdr);
 
@@ -188,7 +189,6 @@ int rtc__exit(struct kvm *kvm)
 {
/* PORT 0070-007F - CMOS RAM/RTC (REAL TIME CLOCK) */
ioport__unregister(kvm, 0x0070);
-   ioport__unregister(kvm, 0x0071);
 
return 0;
 }
-- 
2.17.5



[PATCH kvmtool v3 09/22] x86/ioport: Switch to new trap handlers

2021-03-15 Thread Andre Przywara
Now that the x86 I/O ports have trap handlers adhering to the MMIO fault
handler prototype, let's switch over to the joint registration routine.

This allows us to get rid of the ioport shim routines.

Since the debug output was done in ioport.c, we would lose this
functionality when moving over to the MMIO handlers. So bring this back
here explicitly, by introducing debug_io().

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 x86/ioport.c | 101 +++
 1 file changed, 37 insertions(+), 64 deletions(-)

diff --git a/x86/ioport.c b/x86/ioport.c
index b198de7a..06b7defb 100644
--- a/x86/ioport.c
+++ b/x86/ioport.c
@@ -8,15 +8,29 @@ static void dummy_io(struct kvm_cpu *vcpu, u64 addr, u8 
*data, u32 len,
 {
 }
 
-static bool debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
+static void debug_io(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+u8 is_write, void *ptr)
 {
-   dummy_io(vcpu, port, data, size, true, NULL);
-   return 0;
-}
+   if (!vcpu->kvm->cfg.ioport_debug)
+   return;
 
-static struct ioport_operations debug_ops = {
-   .io_out = debug_io_out,
-};
+   fprintf(stderr, "debug port %s from VCPU%lu: port=0x%lx, size=%u",
+   is_write ? "write" : "read", vcpu->cpu_id,
+   (unsigned long)addr, len);
+   if (is_write) {
+   u32 value;
+
+   switch (len) {
+   case 1: value = ioport__read8(data); break;
+   case 2: value = ioport__read16((u16*)data); break;
+   case 4: value = ioport__read32((u32*)data); break;
+   default: value = 0; break;
+   }
+   fprintf(stderr, ", data: 0x%x\n", value);
+   } else {
+   fprintf(stderr, "\n");
+   }
+}
 
 static void seabios_debug_io(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 u32 len, u8 is_write, void *ptr)
@@ -31,37 +45,6 @@ static void seabios_debug_io(struct kvm_cpu *vcpu, u64 addr, 
u8 *data,
putchar(ch);
 }
 
-static bool seabios_debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
-{
-   seabios_debug_io(vcpu, port, data, size, true, NULL);
-   return 0;
-}
-
-static struct ioport_operations seabios_debug_ops = {
-   .io_out = seabios_debug_io_out,
-};
-
-static bool dummy_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size)
-{
-   dummy_io(vcpu, port, data, size, false, NULL);
-   return true;
-}
-
-static bool dummy_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
-{
-   dummy_io(vcpu, port, data, size, true, NULL);
-   return true;
-}
-
-static struct ioport_operations dummy_read_write_ioport_ops = {
-   .io_in  = dummy_io_in,
-   .io_out = dummy_io_out,
-};
-
-static struct ioport_operations dummy_write_only_ioport_ops = {
-   .io_out = dummy_io_out,
-};
-
 /*
  * The "fast A20 gate"
  */
@@ -76,17 +59,6 @@ static void ps2_control_io(struct kvm_cpu *vcpu, u64 addr, 
u8 *data, u32 len,
ioport__write8(data, 0x02);
 }
 
-static bool ps2_control_a_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
-{
-   ps2_control_io(vcpu, port, data, size, false, NULL);
-   return true;
-}
-
-static struct ioport_operations ps2_control_a_ops = {
-   .io_in  = ps2_control_a_io_in,
-   .io_out = dummy_io_out,
-};
-
 void ioport__map_irq(u8 *irq)
 {
 }
@@ -98,75 +70,76 @@ static int ioport__setup_arch(struct kvm *kvm)
/* Legacy ioport setup */
 
/* 0000 - 001F - DMA1 controller */
-   r = ioport__register(kvm, 0x0000, &dummy_read_write_ioport_ops, 32, NULL);
+   r = kvm__register_pio(kvm, 0x0000, 32, dummy_io, NULL);
if (r < 0)
return r;
 
/* 0x0020 - 0x003F - 8259A PIC 1 */
-   r = ioport__register(kvm, 0x0020, &dummy_read_write_ioport_ops, 2, NULL);
+   r = kvm__register_pio(kvm, 0x0020, 2, dummy_io, NULL);
if (r < 0)
return r;
 
/* PORT 0040-005F - PIT - PROGRAMMABLE INTERVAL TIMER (8253, 8254) */
-   r = ioport__register(kvm, 0x0040, &dummy_read_write_ioport_ops, 4, NULL);
+   r = kvm__register_pio(kvm, 0x0040, 4, dummy_io, NULL);
if (r < 0)
return r;
 
/* 0092 - PS/2 system control port A */
-   r = ioport__register(kvm, 0x0092, &ps2_control_a_ops, 1, NULL);
+   r = kvm__register_pio(kvm, 0x0092, 1, ps2_control_io, NULL);
if (r < 0)
return r;
 
/* 0x00A0 - 0x00AF - 8259A PIC 2 */
-   r = ioport__register(kvm, 0x00A0, &dummy_read_write_ioport_ops, 2, NULL);
+   r = kvm__register_pio(kvm, 0x00A0, 2, dummy_io, NULL);
if (r < 0)
return r;
 
/* 00C0 - 001F - DMA2 controller */
-   r = 

[PATCH kvmtool v3 08/22] x86/ioport: Refactor trap handlers

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide emulation functions compatible with the MMIO
prototype.

Adjust the trap handlers to use that new function, and provide shims to
implement the old ioport interface, for now.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 x86/ioport.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/x86/ioport.c b/x86/ioport.c
index a8d2bb1a..b198de7a 100644
--- a/x86/ioport.c
+++ b/x86/ioport.c
@@ -3,8 +3,14 @@
 #include 
 #include 
 
+static void dummy_io(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+u8 is_write, void *ptr)
+{
+}
+
 static bool debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
 {
+   dummy_io(vcpu, port, data, size, true, NULL);
return 0;
 }
 
@@ -12,15 +18,23 @@ static struct ioport_operations debug_ops = {
.io_out = debug_io_out,
 };
 
-static bool seabios_debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
+static void seabios_debug_io(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+u32 len, u8 is_write, void *ptr)
 {
char ch;
 
+   if (!is_write)
+   return;
+
ch = ioport__read8(data);
 
putchar(ch);
+}
 
-   return true;
+static bool seabios_debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
+{
+   seabios_debug_io(vcpu, port, data, size, true, NULL);
+   return 0;
 }
 
 static struct ioport_operations seabios_debug_ops = {
@@ -29,11 +43,13 @@ static struct ioport_operations seabios_debug_ops = {
 
 static bool dummy_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size)
 {
+   dummy_io(vcpu, port, data, size, false, NULL);
return true;
 }
 
 static bool dummy_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
port, void *data, int size)
 {
+   dummy_io(vcpu, port, data, size, true, NULL);
return true;
 }
 
@@ -50,13 +66,19 @@ static struct ioport_operations dummy_write_only_ioport_ops 
= {
  * The "fast A20 gate"
  */
 
-static bool ps2_control_a_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
+static void ps2_control_io(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+  u8 is_write, void *ptr)
 {
/*
 * A20 is always enabled.
 */
-   ioport__write8(data, 0x02);
+   if (!is_write)
+   ioport__write8(data, 0x02);
+}
 
+static bool ps2_control_a_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, 
u16 port, void *data, int size)
+{
+   ps2_control_io(vcpu, port, data, size, false, NULL);
return true;
 }
 
-- 
2.17.5



[PATCH kvmtool v3 07/22] hw/i8042: Switch to new trap handlers

2021-03-15 Thread Andre Przywara
Now that the PC keyboard has a trap handler adhering to the MMIO fault
handler prototype, let's switch over to the joint registration routine.

This allows us to get rid of the ioport shim routines.

Make the kbd_init() function static on the way.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/i8042.c  | 30 --
 include/kvm/i8042.h |  1 -
 2 files changed, 4 insertions(+), 27 deletions(-)

diff --git a/hw/i8042.c b/hw/i8042.c
index ab82..20be36c4 100644
--- a/hw/i8042.c
+++ b/hw/i8042.c
@@ -325,40 +325,18 @@ static void kbd_io(struct kvm_cpu *vcpu, u64 addr, u8 
*data, u32 len,
ioport__write8(data, value);
 }
 
-/*
- * Called when the OS has written to one of the keyboard's ports (0x60 or 0x64)
- */
-static bool kbd_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void 
*data, int size)
-{
-   kbd_io(vcpu, port, data, size, false, NULL);
-
-   return true;
-}
-
-static bool kbd_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size)
-{
-   kbd_io(vcpu, port, data, size, true, NULL);
-
-   return true;
-}
-
-static struct ioport_operations kbd_ops = {
-   .io_in  = kbd_in,
-   .io_out = kbd_out,
-};
-
-int kbd__init(struct kvm *kvm)
+static int kbd__init(struct kvm *kvm)
 {
int r;
 
kbd_reset();
state.kvm = kvm;
-   r = ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL);
+   r = kvm__register_pio(kvm, I8042_DATA_REG, 2, kbd_io, NULL);
if (r < 0)
return r;
-   r = ioport__register(kvm, I8042_COMMAND_REG, &kbd_ops, 2, NULL);
+   r = kvm__register_pio(kvm, I8042_COMMAND_REG, 2, kbd_io, NULL);
if (r < 0) {
-   ioport__unregister(kvm, I8042_DATA_REG);
+   kvm__deregister_pio(kvm, I8042_DATA_REG);
return r;
}
 
diff --git a/include/kvm/i8042.h b/include/kvm/i8042.h
index 3b4ab688..cd4ae6bb 100644
--- a/include/kvm/i8042.h
+++ b/include/kvm/i8042.h
@@ -7,6 +7,5 @@ struct kvm;
 
 void mouse_queue(u8 c);
 void kbd_queue(u8 c);
-int kbd__init(struct kvm *kvm);
 
 #endif
-- 
2.17.5



[PATCH kvmtool v3 06/22] hw/i8042: Refactor trap handler

2021-03-15 Thread Andre Przywara
With the planned retirement of the special ioport emulation code, we
need to provide an emulation function compatible with the MMIO
prototype.

Adjust the trap handler to use that new function, and provide shims to
implement the old ioport interface, for now.
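
The shim pattern used across this series looks like this in the abstract
(my_io() stands in for a device's new MMIO-style handler; some devices pass
ioport->priv as the opaque pointer, others, like this one, use a global):

static bool my_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port,
                  void *data, int size)
{
        my_io(vcpu, port, data, size, false, ioport->priv);

        return true;
}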

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/i8042.c | 68 +++---
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/hw/i8042.c b/hw/i8042.c
index 7d1f9772..ab82 100644
--- a/hw/i8042.c
+++ b/hw/i8042.c
@@ -292,52 +292,52 @@ static void kbd_reset(void)
};
 }
 
-/*
- * Called when the OS has written to one of the keyboard's ports (0x60 or 0x64)
- */
-static bool kbd_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void 
*data, int size)
+static void kbd_io(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
+  u8 is_write, void *ptr)
 {
-   switch (port) {
-   case I8042_COMMAND_REG: {
-   u8 value = kbd_read_status();
-   ioport__write8(data, value);
+   u8 value;
+
+   if (is_write)
+   value = ioport__read8(data);
+
+   switch (addr) {
+   case I8042_COMMAND_REG:
+   if (is_write)
+   kbd_write_command(vcpu->kvm, value);
+   else
+   value = kbd_read_status();
break;
-   }
-   case I8042_DATA_REG: {
-   u8 value = kbd_read_data();
-   ioport__write8(data, value);
+   case I8042_DATA_REG:
+   if (is_write)
+   kbd_write_data(value);
+   else
+   value = kbd_read_data();
break;
-   }
-   case I8042_PORT_B_REG: {
-   ioport__write8(data, 0x20);
+   case I8042_PORT_B_REG:
+   if (!is_write)
+   value = 0x20;
break;
-   }
default:
-   return false;
+   return;
}
 
+   if (!is_write)
+   ioport__write8(data, value);
+}
+
+/*
+ * Called when the OS has written to one of the keyboard's ports (0x60 or 0x64)
+ */
+static bool kbd_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void 
*data, int size)
+{
+   kbd_io(vcpu, port, data, size, false, NULL);
+
return true;
 }
 
 static bool kbd_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size)
 {
-   switch (port) {
-   case I8042_COMMAND_REG: {
-   u8 value = ioport__read8(data);
-   kbd_write_command(vcpu->kvm, value);
-   break;
-   }
-   case I8042_DATA_REG: {
-   u8 value = ioport__read8(data);
-   kbd_write_data(value);
-   break;
-   }
-   case I8042_PORT_B_REG: {
-   break;
-   }
-   default:
-   return false;
-   }
+   kbd_io(vcpu, port, data, size, true, NULL);
 
return true;
 }
-- 
2.17.5



[PATCH kvmtool v3 05/22] hw/i8042: Clean up data types

2021-03-15 Thread Andre Przywara
The i8042 is clearly an 8-bit era device, so there is little room for
32-bit registers.
Clean up the data types used.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/i8042.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/i8042.c b/hw/i8042.c
index 37a99a2d..7d1f9772 100644
--- a/hw/i8042.c
+++ b/hw/i8042.c
@@ -64,11 +64,11 @@
 struct kbd_state {
struct kvm  *kvm;
 
-   charkq[QUEUE_SIZE]; /* Keyboard queue */
+   u8  kq[QUEUE_SIZE]; /* Keyboard queue */
int kread, kwrite;  /* Indexes into the queue */
int kcount; /* number of elements in queue 
*/
 
-   charmq[QUEUE_SIZE];
+   u8  mq[QUEUE_SIZE];
int mread, mwrite;
int mcount;
 
@@ -82,7 +82,7 @@ struct kbd_state {
 * Some commands (on port 0x64) have arguments;
 * we store the command here while we wait for the argument
 */
-   u32 write_cmd;
+   u8  write_cmd;
 };
 
 static struct kbd_statestate;
@@ -173,9 +173,9 @@ static void kbd_write_command(struct kvm *kvm, u8 val)
 /*
  * Called when the OS reads from port 0x60 (PS/2 data)
  */
-static u32 kbd_read_data(void)
+static u8 kbd_read_data(void)
 {
-   u32 ret;
+   u8 ret;
int i;
 
if (state.kcount != 0) {
@@ -202,9 +202,9 @@ static u32 kbd_read_data(void)
 /*
  * Called when the OS read from port 0x64, the command port
  */
-static u32 kbd_read_status(void)
+static u8 kbd_read_status(void)
 {
-   return (u32)state.status;
+   return state.status;
 }
 
 /*
@@ -212,7 +212,7 @@ static u32 kbd_read_status(void)
  * Things written here are generally arguments to commands previously
  * written to port 0x64 and stored in state.write_cmd
  */
-static void kbd_write_data(u32 val)
+static void kbd_write_data(u8 val)
 {
switch (state.write_cmd) {
case I8042_CMD_CTL_WCTR:
@@ -266,8 +266,8 @@ static void kbd_write_data(u32 val)
break;
default:
break;
-   }
-   break;
+   }
+   break;
case 0:
/* Just send the ID */
kbd_queue(RESPONSE_ACK);
@@ -304,8 +304,8 @@ static bool kbd_in(struct ioport *ioport, struct kvm_cpu 
*vcpu, u16 port, void *
break;
}
case I8042_DATA_REG: {
-   u32 value = kbd_read_data();
-   ioport__write32(data, value);
+   u8 value = kbd_read_data();
+   ioport__write8(data, value);
break;
}
case I8042_PORT_B_REG: {
@@ -328,7 +328,7 @@ static bool kbd_out(struct ioport *ioport, struct kvm_cpu 
*vcpu, u16 port, void
break;
}
case I8042_DATA_REG: {
-   u32 value = ioport__read32(data);
+   u8 value = ioport__read8(data);
kbd_write_data(value);
break;
}
-- 
2.17.5

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH kvmtool v3 04/22] mmio: Extend handling to include ioport emulation

2021-03-15 Thread Andre Przywara
In their core functionality MMIO and I/O port traps are not really
different, yet we still have two totally separate code paths for
handling them. Devices need to decide on one conduit or provide
different handler functions for each of them.

Extend the existing MMIO emulation to also cover ioport handlers.
This just adds another RB tree root for holding the I/O port handlers,
but otherwise uses the same tree population and lookup code.
"ioport" or "mmio" just become a flag in the registration function.
Provide wrappers to not break existing users, and allow an easy
transition for the existing ioport handlers.

This also means that ioport handlers now can use the same emulation
callback prototype as MMIO handlers, which means we have to migrate them
over. To allow a smooth transition, we hook up the new I/O emulate
function to the end of the existing ioport emulation code.
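
As an illustration (not part of this patch; the foo_* names are made up),
a device using the joint interface provides a single handler for both
directions and picks the bus when registering:

	static void foo_io(struct kvm_cpu *vcpu, u64 addr, u8 *data,
			   u32 len, u8 is_write, void *ptr)
	{
		struct foo_device *foo = ptr;	/* hypothetical device state */

		if (is_write)
			foo_reg_write(foo, addr, data, len);	/* hypothetical */
		else
			foo_reg_read(foo, addr, data, len);	/* hypothetical */
	}

	/* I/O port flavour; kvm__register_mmio() works the same way */
	r = kvm__register_pio(kvm, FOO_BASE, FOO_SIZE, foo_io, foo);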

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 include/kvm/kvm.h | 49 ---
 ioport.c  |  4 +--
 mmio.c| 65 +++
 3 files changed, 102 insertions(+), 16 deletions(-)

diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index f1f0afd7..306b258a 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -27,10 +27,23 @@
 #define PAGE_SIZE (sysconf(_SC_PAGE_SIZE))
 #endif
 
+/*
+ * We are reusing the existing DEVICE_BUS_MMIO and DEVICE_BUS_IOPORT constants
+ * from kvm/devices.h to differentiate between registering an I/O port and an
+ * MMIO region.
+ * To avoid collisions with future additions of more bus types, we reserve
+ * a generous 4 bits for the bus mask here.
+ */
+#define IOTRAP_BUS_MASK		0xf
+#define IOTRAP_COALESCE		(1U << 4)
+
 #define DEFINE_KVM_EXT(ext)\
.name = #ext,   \
.code = ext
 
+struct kvm_cpu;
+typedef void (*mmio_handler_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+   u32 len, u8 is_write, void *ptr);
 typedef void (*fdt_irq_fn)(void *fdt, u8 irq, enum irq_type irq_type);
 
 enum {
@@ -113,6 +126,8 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level);
 void kvm__irq_trigger(struct kvm *kvm, int irq);
 bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int 
direction, int size, u32 count);
 bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, 
u8 is_write);
+bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data,
+ int direction, int size, u32 count);
 int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void 
*userspace_addr);
 int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void 
*userspace_addr,
  enum kvm_mem_type type);
@@ -136,10 +151,36 @@ static inline int kvm__reserve_mem(struct kvm *kvm, u64 
guest_phys, u64 size)
 KVM_MEM_TYPE_RESERVED);
 }
 
-int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 
phys_addr_len, bool coalesce,
-   void (*mmio_fn)(struct kvm_cpu *vcpu, u64 
addr, u8 *data, u32 len, u8 is_write, void *ptr),
-   void *ptr);
-bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr);
+int __must_check kvm__register_iotrap(struct kvm *kvm, u64 phys_addr, u64 len,
+ mmio_handler_fn mmio_fn, void *ptr,
+ unsigned int flags);
+
+static inline
+int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr,
+   u64 phys_addr_len, bool coalesce,
+   mmio_handler_fn mmio_fn, void *ptr)
+{
+   return kvm__register_iotrap(kvm, phys_addr, phys_addr_len, mmio_fn, ptr,
+   DEVICE_BUS_MMIO | (coalesce ? IOTRAP_COALESCE : 0));
+}
+static inline
+int __must_check kvm__register_pio(struct kvm *kvm, u16 port, u16 len,
+  mmio_handler_fn mmio_fn, void *ptr)
+{
+   return kvm__register_iotrap(kvm, port, len, mmio_fn, ptr,
+   DEVICE_BUS_IOPORT);
+}
+
+bool kvm__deregister_iotrap(struct kvm *kvm, u64 phys_addr, unsigned int 
flags);
+static inline bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr)
+{
+   return kvm__deregister_iotrap(kvm, phys_addr, DEVICE_BUS_MMIO);
+}
+static inline bool kvm__deregister_pio(struct kvm *kvm, u16 port)
+{
+   return kvm__deregister_iotrap(kvm, port, DEVICE_BUS_IOPORT);
+}
+
 void kvm__reboot(struct kvm *kvm);
 void kvm__pause(struct kvm *kvm);
 void kvm__continue(struct kvm *kvm);
diff --git a/ioport.c b/ioport.c
index e0123f27..ce29e7e7 100644
--- a/ioport.c
+++ b/ioport.c
@@ -162,7 +162,8 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void 
*data, int direction,
 
	entry = ioport_get(&ioport_tree, port);
if (!entry)
-   goto out;
+		return kvm__emulate_pio(vcpu, port, data, direction, size, count);

[PATCH kvmtool v3 03/22] ioport: Retire .generate_fdt_node functionality

2021-03-15 Thread Andre Przywara
The ioport routines support a special way of registering FDT node
generator functions. There is no reason to have this separate from the
already existing way via the device header.

Now that the only user of this special ioport variety has been
transferred, we can retire this code, to simplify ioport handling.

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 include/kvm/ioport.h |  4 
 ioport.c | 34 --
 2 files changed, 38 deletions(-)

diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h
index d0213541..a61038e2 100644
--- a/include/kvm/ioport.h
+++ b/include/kvm/ioport.h
@@ -29,10 +29,6 @@ struct ioport {
 struct ioport_operations {
bool (*io_in)(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size);
bool (*io_out)(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, 
void *data, int size);
-   void (*generate_fdt_node)(struct ioport *ioport, void *fdt,
- void (*generate_irq_prop)(void *fdt,
-   u8 irq,
-   enum irq_type));
 };
 
 void ioport__map_irq(u8 *irq);
diff --git a/ioport.c b/ioport.c
index a6972179..e0123f27 100644
--- a/ioport.c
+++ b/ioport.c
@@ -56,7 +56,6 @@ static struct ioport *ioport_get(struct rb_root *root, u64 
addr)
 /* Called with ioport_lock held. */
 static void ioport_unregister(struct rb_root *root, struct ioport *data)
 {
-	device__unregister(&data->dev_hdr);
ioport_remove(root, data);
free(data);
 }
@@ -70,30 +69,6 @@ static void ioport_put(struct rb_root *root, struct ioport 
*data)
	mutex_unlock(&ioport_lock);
 }
 
-#ifdef CONFIG_HAS_LIBFDT
-static void generate_ioport_fdt_node(void *fdt,
-struct device_header *dev_hdr,
-void (*generate_irq_prop)(void *fdt,
-  u8 irq,
-  enum irq_type))
-{
-   struct ioport *ioport = container_of(dev_hdr, struct ioport, dev_hdr);
-   struct ioport_operations *ops = ioport->ops;
-
-   if (ops->generate_fdt_node)
-   ops->generate_fdt_node(ioport, fdt, generate_irq_prop);
-}
-#else
-static void generate_ioport_fdt_node(void *fdt,
-struct device_header *dev_hdr,
-void (*generate_irq_prop)(void *fdt,
-  u8 irq,
-  enum irq_type))
-{
-   die("Unable to generate device tree nodes without libfdt\n");
-}
-#endif
-
 int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, 
int count, void *param)
 {
struct ioport *entry;
@@ -107,10 +82,6 @@ int ioport__register(struct kvm *kvm, u16 port, struct 
ioport_operations *ops, i
.node   = RB_INT_INIT(port, port + count),
.ops= ops,
.priv   = param,
-   .dev_hdr= (struct device_header) {
-   .bus_type   = DEVICE_BUS_IOPORT,
-   .data   = generate_ioport_fdt_node,
-   },
/*
 * Start from 0 because ioport__unregister() doesn't decrement
 * the reference count.
@@ -123,15 +94,10 @@ int ioport__register(struct kvm *kvm, u16 port, struct 
ioport_operations *ops, i
	r = ioport_insert(&ioport_tree, entry);
if (r < 0)
goto out_free;
-	r = device__register(&entry->dev_hdr);
-   if (r < 0)
-   goto out_remove;
	mutex_unlock(&ioport_lock);
 
return port;
 
-out_remove:
-	ioport_remove(&ioport_tree, entry);
 out_free:
free(entry);
	mutex_unlock(&ioport_lock);
-- 
2.17.5

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH kvmtool v3 01/22] ioport: Remove ioport__setup_arch()

2021-03-15 Thread Andre Przywara
Since x86 had a special need for registering tons of special I/O ports,
we had an ioport__setup_arch() callback, to allow each architecture
to do the same. As it turns out no one uses it beside x86, so we remove
that unnecessary abstraction.

The generic function was registered via a device_base_init() call, so
we just do the same for the x86 specific function only, and can remove
the unneeded ioport__setup_arch().
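
For reference, the registration pattern this leaves behind is the one
visible in the x86 hunk below:

	static int ioport__setup_arch(struct kvm *kvm)
	{
		/* register the x86 legacy I/O ports here */
		return 0;
	}
	dev_base_init(ioport__setup_arch);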

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 arm/ioport.c | 5 -
 include/kvm/ioport.h | 1 -
 ioport.c | 6 --
 mips/kvm.c   | 5 -
 powerpc/ioport.c | 6 --
 x86/ioport.c | 3 ++-
 6 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/arm/ioport.c b/arm/ioport.c
index 2f0feb9a..24092c9d 100644
--- a/arm/ioport.c
+++ b/arm/ioport.c
@@ -1,11 +1,6 @@
 #include "kvm/ioport.h"
 #include "kvm/irq.h"
 
-int ioport__setup_arch(struct kvm *kvm)
-{
-   return 0;
-}
-
 void ioport__map_irq(u8 *irq)
 {
*irq = irq__alloc_line();
diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h
index 039633f7..d0213541 100644
--- a/include/kvm/ioport.h
+++ b/include/kvm/ioport.h
@@ -35,7 +35,6 @@ struct ioport_operations {
enum irq_type));
 };
 
-int ioport__setup_arch(struct kvm *kvm);
 void ioport__map_irq(u8 *irq);
 
 int __must_check ioport__register(struct kvm *kvm, u16 port, struct 
ioport_operations *ops,
diff --git a/ioport.c b/ioport.c
index 844a832d..a6972179 100644
--- a/ioport.c
+++ b/ioport.c
@@ -221,12 +221,6 @@ out:
return !kvm->cfg.ioport_debug;
 }
 
-int ioport__init(struct kvm *kvm)
-{
-   return ioport__setup_arch(kvm);
-}
-dev_base_init(ioport__init);
-
 int ioport__exit(struct kvm *kvm)
 {
ioport__unregister_all();
diff --git a/mips/kvm.c b/mips/kvm.c
index 26355930..e110e5d5 100644
--- a/mips/kvm.c
+++ b/mips/kvm.c
@@ -100,11 +100,6 @@ void kvm__irq_trigger(struct kvm *kvm, int irq)
die_perror("KVM_IRQ_LINE ioctl");
 }
 
-int ioport__setup_arch(struct kvm *kvm)
-{
-   return 0;
-}
-
 bool kvm__arch_cpu_supports_vm(void)
 {
return true;
diff --git a/powerpc/ioport.c b/powerpc/ioport.c
index 0c188b61..a5cff4ee 100644
--- a/powerpc/ioport.c
+++ b/powerpc/ioport.c
@@ -12,12 +12,6 @@
 
 #include 
 
-int ioport__setup_arch(struct kvm *kvm)
-{
-   /* PPC has no legacy ioports to set up */
-   return 0;
-}
-
 void ioport__map_irq(u8 *irq)
 {
 }
diff --git a/x86/ioport.c b/x86/ioport.c
index 7ad7b8f3..a8d2bb1a 100644
--- a/x86/ioport.c
+++ b/x86/ioport.c
@@ -69,7 +69,7 @@ void ioport__map_irq(u8 *irq)
 {
 }
 
-int ioport__setup_arch(struct kvm *kvm)
+static int ioport__setup_arch(struct kvm *kvm)
 {
int r;
 
@@ -150,3 +150,4 @@ int ioport__setup_arch(struct kvm *kvm)
 
return 0;
 }
+dev_base_init(ioport__setup_arch);
-- 
2.17.5

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH kvmtool v3 02/22] hw/serial: Use device abstraction for FDT generator function

2021-03-15 Thread Andre Przywara
At the moment we use the .generate_fdt_node member of the ioport ops
structure to store the function pointer for the FDT node generator
function. ioport__register() will then put a wrapper and this pointer
into the device header.
The serial device is the only device making use of this special ioport
feature, so let's move this over to using the device header directly.

This will allow us to get rid of this .generate_fdt_node member in the
ops and simplify the code.
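
A minimal sketch of the direct route (field names as in the diff below;
the FDT generator is wired up statically and the device header is
registered once at init time):

	static struct serial8250_device dev = {
		.dev_hdr = {
			.bus_type	= DEVICE_BUS_IOPORT,
			.data		= serial8250_generate_fdt_node,
		},
		.mutex	= MUTEX_INITIALIZER,
	};

	r = device__register(&dev.dev_hdr);
	if (r < 0)
		return r;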

Signed-off-by: Andre Przywara 
Reviewed-by: Alexandru Elisei 
---
 hw/serial.c   | 49 +--
 include/kvm/kvm.h |  2 ++
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/hw/serial.c b/hw/serial.c
index 13c4663e..b0465d99 100644
--- a/hw/serial.c
+++ b/hw/serial.c
@@ -23,6 +23,7 @@
 #define UART_IIR_TYPE_BITS 0xc0
 
 struct serial8250_device {
+   struct device_headerdev_hdr;
struct mutexmutex;
u8  id;
 
@@ -53,9 +54,20 @@ struct serial8250_device {
.msr= UART_MSR_DCD | UART_MSR_DSR | UART_MSR_CTS, \
.mcr= UART_MCR_OUT2,
 
+#ifdef CONFIG_HAS_LIBFDT
+static
+void serial8250_generate_fdt_node(void *fdt, struct device_header *dev_hdr,
+ fdt_irq_fn irq_fn);
+#else
+#define serial8250_generate_fdt_node   NULL
+#endif
 static struct serial8250_device devices[] = {
/* ttyS0 */
[0] = {
+   .dev_hdr = {
+   .bus_type   = DEVICE_BUS_IOPORT,
+   .data   = serial8250_generate_fdt_node,
+   },
.mutex  = MUTEX_INITIALIZER,
 
.id = 0,
@@ -66,6 +78,10 @@ static struct serial8250_device devices[] = {
},
/* ttyS1 */
[1] = {
+   .dev_hdr = {
+   .bus_type   = DEVICE_BUS_IOPORT,
+   .data   = serial8250_generate_fdt_node,
+   },
.mutex  = MUTEX_INITIALIZER,
 
.id = 1,
@@ -76,6 +92,10 @@ static struct serial8250_device devices[] = {
},
/* ttyS2 */
[2] = {
+   .dev_hdr = {
+   .bus_type   = DEVICE_BUS_IOPORT,
+   .data   = serial8250_generate_fdt_node,
+   },
.mutex  = MUTEX_INITIALIZER,
 
.id = 2,
@@ -86,6 +106,10 @@ static struct serial8250_device devices[] = {
},
/* ttyS3 */
[3] = {
+   .dev_hdr = {
+   .bus_type   = DEVICE_BUS_IOPORT,
+   .data   = serial8250_generate_fdt_node,
+   },
.mutex  = MUTEX_INITIALIZER,
 
.id = 3,
@@ -371,13 +395,14 @@ char *fdt_stdout_path = NULL;
 
 #define DEVICE_NAME_MAX_LEN 32
 static
-void serial8250_generate_fdt_node(struct ioport *ioport, void *fdt,
- void (*generate_irq_prop)(void *fdt,
-   u8 irq,
-   enum irq_type))
+void serial8250_generate_fdt_node(void *fdt, struct device_header *dev_hdr,
+ fdt_irq_fn irq_fn)
 {
char dev_name[DEVICE_NAME_MAX_LEN];
-   struct serial8250_device *dev = ioport->priv;
+   struct serial8250_device *dev = container_of(dev_hdr,
+struct serial8250_device,
+dev_hdr);
+
u64 addr = KVM_IOPORT_AREA + dev->iobase;
u64 reg_prop[] = {
cpu_to_fdt64(addr),
@@ -395,24 +420,26 @@ void serial8250_generate_fdt_node(struct ioport *ioport, 
void *fdt,
_FDT(fdt_begin_node(fdt, dev_name));
_FDT(fdt_property_string(fdt, "compatible", "ns16550a"));
_FDT(fdt_property(fdt, "reg", reg_prop, sizeof(reg_prop)));
-   generate_irq_prop(fdt, dev->irq, IRQ_TYPE_LEVEL_HIGH);
+   irq_fn(fdt, dev->irq, IRQ_TYPE_LEVEL_HIGH);
_FDT(fdt_property_cell(fdt, "clock-frequency", 1843200));
_FDT(fdt_end_node(fdt));
 }
-#else
-#define serial8250_generate_fdt_node   NULL
 #endif
 
 static struct ioport_operations serial8250_ops = {
.io_in  = serial8250_in,
.io_out = serial8250_out,
-   .generate_fdt_node  = serial8250_generate_fdt_node,
 };
 
-static int serial8250__device_init(struct kvm *kvm, struct serial8250_device 
*dev)
+static int serial8250__device_init(struct kvm *kvm,
+  struct serial8250_device *dev)
 {
int r;
 
+	r = device__register(&dev->dev_hdr);
+	if (r < 0)
+		return r;

[PATCH kvmtool v3 00/22] Unify I/O port and MMIO trap handling

2021-03-15 Thread Andre Przywara
Hi,

this version is addressing Alexandru's comments, fixing mostly minor
issues in the naming scheme. The biggest change is to keep the
ioport__read/ioport_write wrappers for the serial device.
For more details see the changelog below.
==

At the moment we use two separate code paths to handle exits for
KVM_EXIT_IO (ioport.c) and KVM_EXIT_MMIO (mmio.c), even though they
are semantically very similar. Because the trap handler callback routine
is different, devices need to decide on one conduit or provide
different handler functions for both of them.

This is not only unnecessary code duplication, but makes switching
devices from I/O port to MMIO a tedious task, even though there is no
real difference between the two, especially on ARM and PowerPC.

For ARM we aim at providing a flexible memory layout, and also have
trouble with the UART and RTC device overlapping with the PCI I/O area,
so it seems indicated to tackle this once and for all.

The first three patches do some cleanup, to simplify things later.

Patch 04/22 lays the groundwork, by extending mmio.c to be able to also
register I/O port trap handlers, using the same callback prototype as
we use for MMIO.

The next 14 patches then convert devices that use the I/O port
interface over to the new joint interface. This requires to rework
the trap handler routine to adhere to the same prototype as the existing
MMIO handlers. For most devices this is done in two steps: a first to
introduce the reworked handler routine, and a second to switch to the new
joint registration routine. For some devices the first step is trivial,
so it's done in one patch.

Patch 19/22 then retires the old I/O port interface, by removing ioport.c
and friends.
Patch 20/22 uses the opportunity to clean up the memory map description,
also declares a new region (from 16MB on), where the final two patches
switch the UART and the RTC device to. They are now registered
on the MMIO "bus", when running on ARM or arm64. This moves them away
from the first 64KB, so they are not in the PCI I/O area anymore.
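
To make the conversion concrete, this is the shape of the change each
device goes through (a sketch with a made-up "foo" device, not taken
from any single patch):

	/* old: two ioport callbacks, bool return, port-based decoding */
	static bool foo_in(struct ioport *ioport, struct kvm_cpu *vcpu,
			   u16 port, void *data, int size);
	static bool foo_out(struct ioport *ioport, struct kvm_cpu *vcpu,
			    u16 port, void *data, int size);

	/* new: one MMIO-style handler covering both directions */
	static void foo_io(struct kvm_cpu *vcpu, u64 addr, u8 *data,
			   u32 len, u8 is_write, void *ptr);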

Please have a look and comment!

Cheers,
Andre

Changelog v2 .. v3:
- use _io as function prefix for x86 I/O port devices
- retain ioport__{read,write}8() wrappers for serial device
- fix memory map ASCII art
- fix serial base declaration
- minor nit fixes
- add Reviewed-by: tags

Changelog v1 .. v2:
- rework memory map definition
- add explicit debug output for debug I/O port
- add explicit check for MMIO coalescing on I/O ports
- drop usage of ioport__{read,write}8() from serial
- drop explicit I/O port cleanup routine (to mimic MMIO operation)
- add comment for IOTRAP_BUS_MASK
- minor cleanups / formatting changes


Andre Przywara (22):
  ioport: Remove ioport__setup_arch()
  hw/serial: Use device abstraction for FDT generator function
  ioport: Retire .generate_fdt_node functionality
  mmio: Extend handling to include ioport emulation
  hw/i8042: Clean up data types
  hw/i8042: Refactor trap handler
  hw/i8042: Switch to new trap handlers
  x86/ioport: Refactor trap handlers
  x86/ioport: Switch to new trap handlers
  hw/rtc: Refactor trap handlers
  hw/rtc: Switch to new trap handler
  hw/vesa: Switch trap handling to use MMIO handler
  hw/serial: Refactor trap handler
  hw/serial: Switch to new trap handlers
  vfio: Refactor ioport trap handler
  vfio: Switch to new ioport trap handlers
  virtio: Switch trap handling to use MMIO handler
  pci: Switch trap handling to use MMIO handler
  Remove ioport specific routines
  arm: Reorganise and document memory map
  hw/serial: ARM/arm64: Use MMIO at higher addresses
  hw/rtc: ARM/arm64: Use MMIO at higher addresses

 Makefile  |   1 -
 arm/include/arm-common/kvm-arch.h |  47 --
 arm/ioport.c  |   5 -
 hw/i8042.c|  94 +---
 hw/rtc.c  |  91 ++--
 hw/serial.c   | 126 +++-
 hw/vesa.c |  19 +--
 include/kvm/i8042.h   |   1 -
 include/kvm/ioport.h  |  32 
 include/kvm/kvm.h |  49 ++-
 ioport.c  | 235 --
 mips/kvm.c|   5 -
 mmio.c|  65 +++--
 pci.c |  82 +++
 powerpc/ioport.c  |   6 -
 vfio/core.c   |  50 ---
 virtio/pci.c  |  46 ++
 x86/ioport.c  | 108 +++---
 18 files changed, 421 insertions(+), 641 deletions(-)
 delete mode 100644 ioport.c

-- 
2.17.5

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 7/8] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-03-15 Thread Marc Zyngier
On Mon, 15 Mar 2021 12:55:42 +,
Shameerali Kolothum Thodi  wrote:
> 
> 
> 
> > -Original Message-
> > From: Marc Zyngier [mailto:m...@kernel.org]
> > Sent: 05 March 2021 18:53
> > To: Paolo Bonzini 
> > Cc: Alexandru Elisei ; Andre Przywara
> > ; Andrew Scull ; Catalin
> > Marinas ; Christoffer Dall
> > ; Howard Zhang ; Jia
> > He ; Mark Rutland ; Quentin
> > Perret ; Shameerali Kolothum Thodi
> > ; Suzuki K Poulose
> > ; Will Deacon ; James Morse
> > ; Julien Thierry ;
> > kernel-t...@android.com; linux-arm-ker...@lists.infradead.org;
> > kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org
> > Subject: [PATCH 7/8] KVM: arm64: Workaround firmware wrongly advertising
> > GICv2-on-v3 compatibility
> > 
> > It looks like we have broken firmware out there that wrongly advertises
> > a GICv2 compatibility interface, despite the CPUs not being able to deal
> > with it.
> > 
> > To work around this, check that the CPU initialising KVM is actually able
> > to switch to MMIO instead of system registers, and use that as a
> > precondition to enable GICv2 compatibility in KVM.
> > 
> > Note that the detection happens on a single CPU. If the firmware is
> > lying *and* the CPUs are asymmetric, all hope is lost anyway.
> > 
> > Reported-by: Shameerali Kolothum Thodi
> > 
> > Tested-by: Shameer Kolothum 
> > Signed-off-by: Marc Zyngier 
> 
> Is it possible to add a stable tag for this? Looks like we do have
> systems out there that report issues.

It is already merged. Which kernel versions do you need that for? In
any case, please submit the backports, and I'll review them.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 5.11 306/306] KVM: arm64: Fix nVHE hyp panic host context restore

2021-03-15 Thread gregkh
From: Greg Kroah-Hartman 

From: Andrew Scull 

Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.

When panicking from the nVHE hyp and restoring the host context, x29 is
expected to hold a pointer to the host context. This wasn't being done,
so fix it to make sure there's a valid pointer to the host context being
used.

Rather than passing a boolean indicating whether or not the host context
should be restored, instead pass the pointer to the host context. NULL
is passed to indicate that no context should be restored.
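
A sketch of the resulting calling convention (argument order as in the
diff below):

	__hyp_do_panic(host_ctxt, spsr, elr, par);	/* restore host context */
	__hyp_do_panic(NULL, spsr, elr, par);		/* nothing to restore */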

Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
Cc: sta...@vger.kernel.org # 5.11.y only
Signed-off-by: Andrew Scull 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
Signed-off-by: Greg Kroah-Hartman 
---
 arch/arm64/include/asm/kvm_hyp.h |3 ++-
 arch/arm64/kvm/hyp/nvhe/host.S   |   20 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c |3 +--
 3 files changed, 13 insertions(+), 13 deletions(-)

--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -102,7 +102,8 @@ bool kvm_host_psci_handler(struct kvm_cp
 
 void __noreturn hyp_panic(void);
 #ifdef __KVM_NVHE_HYPERVISOR__
-void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+  u64 elr, u64 par);
 #endif
 
 #endif /* __ARM64_KVM_HYP_H__ */
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -71,10 +71,15 @@ SYM_FUNC_START(__host_enter)
 SYM_FUNC_END(__host_enter)
 
 /*
- * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 
par);
+ * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+ *   u64 elr, u64 par);
  */
 SYM_FUNC_START(__hyp_do_panic)
-   /* Load the format arguments into x1-7 */
+   mov x29, x0
+
+   /* Load the format string into x0 and arguments into x1-7 */
+   ldr x0, =__hyp_panic_string
+
mov x6, x3
get_vcpu_ptr x7, x3
 
@@ -89,13 +94,8 @@ SYM_FUNC_START(__hyp_do_panic)
ldr lr, =panic
msr elr_el2, lr
 
-   /*
-* Set the panic format string and enter the host, conditionally
-* restoring the host context.
-*/
-   cmp x0, xzr
-   ldr x0, =__hyp_panic_string
-   b.eq__host_enter_without_restoring
+   /* Enter the host, conditionally restoring the host context. */
+   cbz x29, __host_enter_without_restoring
b   __host_enter_for_panic
 SYM_FUNC_END(__hyp_do_panic)
 
@@ -150,7 +150,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -266,7 +266,6 @@ void __noreturn hyp_panic(void)
u64 spsr = read_sysreg_el2(SYS_SPSR);
u64 elr = read_sysreg_el2(SYS_ELR);
u64 par = read_sysreg_par();
-   bool restore_host = true;
struct kvm_cpu_context *host_ctxt;
struct kvm_vcpu *vcpu;
 
@@ -280,7 +279,7 @@ void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}
 
-   __hyp_do_panic(restore_host, spsr, elr, par);
+   __hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
 }
 


___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH 5.10 289/290] KVM: arm64: Fix nVHE hyp panic host context restore

2021-03-15 Thread gregkh
From: Greg Kroah-Hartman 

From: Andrew Scull 

Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.

When panicking from the nVHE hyp and restoring the host context, x29 is
expected to hold a pointer to the host context. This wasn't being done,
so fix it to make sure there's a valid pointer to the host context being
used.

Rather than passing a boolean indicating whether or not the host context
should be restored, instead pass the pointer to the host context. NULL
is passed to indicate that no context should be restored.

Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
Cc: sta...@vger.kernel.org # 5.10.y only
Signed-off-by: Andrew Scull 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
Signed-off-by: Greg Kroah-Hartman 
---
 arch/arm64/include/asm/kvm_hyp.h |3 ++-
 arch/arm64/kvm/hyp/nvhe/host.S   |   20 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c |3 +--
 3 files changed, 13 insertions(+), 13 deletions(-)

--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -99,7 +99,8 @@ u64 __guest_enter(struct kvm_vcpu *vcpu)
 
 void __noreturn hyp_panic(void);
 #ifdef __KVM_NVHE_HYPERVISOR__
-void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+  u64 elr, u64 par);
 #endif
 
 #endif /* __ARM64_KVM_HYP_H__ */
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -64,10 +64,15 @@ __host_enter_without_restoring:
 SYM_FUNC_END(__host_exit)
 
 /*
- * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 
par);
+ * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+ *   u64 elr, u64 par);
  */
 SYM_FUNC_START(__hyp_do_panic)
-   /* Load the format arguments into x1-7 */
+   mov x29, x0
+
+   /* Load the format string into x0 and arguments into x1-7 */
+   ldr x0, =__hyp_panic_string
+
mov x6, x3
get_vcpu_ptr x7, x3
 
@@ -82,13 +87,8 @@ SYM_FUNC_START(__hyp_do_panic)
ldr lr, =panic
msr elr_el2, lr
 
-   /*
-* Set the panic format string and enter the host, conditionally
-* restoring the host context.
-*/
-   cmp x0, xzr
-   ldr x0, =__hyp_panic_string
-   b.eq__host_enter_without_restoring
+   /* Enter the host, conditionally restoring the host context. */
+   cbz x29, __host_enter_without_restoring
b   __host_enter_for_panic
 SYM_FUNC_END(__hyp_do_panic)
 
@@ -144,7 +144,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -260,7 +260,6 @@ void __noreturn hyp_panic(void)
u64 spsr = read_sysreg_el2(SYS_SPSR);
u64 elr = read_sysreg_el2(SYS_ELR);
u64 par = read_sysreg_par();
-   bool restore_host = true;
struct kvm_cpu_context *host_ctxt;
struct kvm_vcpu *vcpu;
 
@@ -274,7 +273,7 @@ void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}
 
-   __hyp_do_panic(restore_host, spsr, elr, par);
+   __hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
 }
 


___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


RE: [PATCH 7/8] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-03-15 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 05 March 2021 18:53
> To: Paolo Bonzini 
> Cc: Alexandru Elisei ; Andre Przywara
> ; Andrew Scull ; Catalin
> Marinas ; Christoffer Dall
> ; Howard Zhang ; Jia
> He ; Mark Rutland ; Quentin
> Perret ; Shameerali Kolothum Thodi
> ; Suzuki K Poulose
> ; Will Deacon ; James Morse
> ; Julien Thierry ;
> kernel-t...@android.com; linux-arm-ker...@lists.infradead.org;
> kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org
> Subject: [PATCH 7/8] KVM: arm64: Workaround firmware wrongly advertising
> GICv2-on-v3 compatibility
> 
> It looks like we have broken firmware out there that wrongly advertises
> a GICv2 compatibility interface, despite the CPUs not being able to deal
> with it.
> 
> To work around this, check that the CPU initialising KVM is actually able
> to switch to MMIO instead of system registers, and use that as a
> precondition to enable GICv2 compatibility in KVM.
> 
> Note that the detection happens on a single CPU. If the firmware is
> lying *and* the CPUs are asymmetric, all hope is lost anyway.
> 
> Reported-by: Shameerali Kolothum Thodi
> 
> Tested-by: Shameer Kolothum 
> Signed-off-by: Marc Zyngier 

Is it possible to add a stable tag for this? Looks like we do have systems
out there that report issues.

Thanks,
Shameer

> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 35 +++--
>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
>  2 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 005daa0c9dd7..ee3682b9873c 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -408,11 +408,42 @@ void __vgic_v3_init_lrs(void)
>  /*
>   * Return the GIC CPU configuration:
>   * - [31:0]  ICH_VTR_EL2
> - * - [63:32] RES0
> + * - [62:32] RES0
> + * - [63]MMIO (GICv2) capable
>   */
>  u64 __vgic_v3_get_gic_config(void)
>  {
> - return read_gicreg(ICH_VTR_EL2);
> + u64 val, sre = read_gicreg(ICC_SRE_EL1);
> + unsigned long flags = 0;
> +
> + /*
> +  * To check whether we have a MMIO-based (GICv2 compatible)
> +  * CPU interface, we need to disable the system register
> +  * view. To do that safely, we have to prevent any interrupt
> +  * from firing (which would be deadly).
> +  *
> +  * Note that this only makes sense on VHE, as interrupts are
> +  * already masked for nVHE as part of the exception entry to
> +  * EL2.
> +  */
> + if (has_vhe())
> + flags = local_daif_save();
> +
> + write_gicreg(0, ICC_SRE_EL1);
> + isb();
> +
> + val = read_gicreg(ICC_SRE_EL1);
> +
> + write_gicreg(sre, ICC_SRE_EL1);
> + isb();
> +
> + if (has_vhe())
> + local_daif_restore(flags);
> +
> + val  = (val & ICC_SRE_EL1_SRE) ? 0 : (1ULL << 63);
> + val |= read_gicreg(ICH_VTR_EL2);
> +
> + return val;
>  }
> 
>  u64 __vgic_v3_read_vmcr(void)
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index c3e6c3fd333b..6f530925a231 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -575,8 +575,10 @@ early_param("kvm-arm.vgic_v4_enable",
> early_gicv4_enable);
>  int vgic_v3_probe(const struct gic_kvm_info *info)
>  {
>   u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
> + bool has_v2;
>   int ret;
> 
> + has_v2 = ich_vtr_el2 >> 63;
>   ich_vtr_el2 = (u32)ich_vtr_el2;
> 
>   /*
> @@ -596,13 +598,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
>gicv4_enable ? "en" : "dis");
>   }
> 
> + kvm_vgic_global_state.vcpu_base = 0;
> +
>   if (!info->vcpu.start) {
>   kvm_info("GICv3: no GICV resource entry\n");
> - kvm_vgic_global_state.vcpu_base = 0;
> + } else if (!has_v2) {
> + pr_warn(FW_BUG "CPU interface incapable of MMIO access\n");
>   } else if (!PAGE_ALIGNED(info->vcpu.start)) {
>   pr_warn("GICV physical address 0x%llx not page aligned\n",
>   (unsigned long long)info->vcpu.start);
> - kvm_vgic_global_state.vcpu_base = 0;
>   } else {
>   kvm_vgic_global_state.vcpu_base = info->vcpu.start;
>   kvm_vgic_global_state.can_emulate_gicv2 = true;
> --
> 2.29.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Patch "KVM: arm64: Fix nVHE hyp panic host context restore" has been added to the 5.11-stable tree

2021-03-15 Thread gregkh


This is a note to let you know that I've just added the patch titled

KVM: arm64: Fix nVHE hyp panic host context restore

to the 5.11-stable tree which can be found at:

http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
 kvm-arm64-fix-nvhe-hyp-panic-host-context-restore.patch
and it can be found in the queue-5.11 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.


>From foo@baz Mon Mar 15 01:37:54 PM CET 2021
From: Andrew Scull 
Date: Mon, 15 Mar 2021 12:21:36 +
Subject: KVM: arm64: Fix nVHE hyp panic host context restore
To: kvmarm@lists.cs.columbia.edu
Cc: m...@kernel.org, kernel-t...@android.com, Andrew Scull , 
sta...@vger.kernel.org
Message-ID: <20210315122136.1687370-1-asc...@google.com>

From: Andrew Scull 

Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.

When panicking from the nVHE hyp and restoring the host context, x29 is
expected to hold a pointer to the host context. This wasn't being done,
so fix it to make sure there's a valid pointer to the host context being
used.

Rather than passing a boolean indicating whether or not the host context
should be restored, instead pass the pointer to the host context. NULL
is passed to indicate that no context should be restored.

Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
Cc: sta...@vger.kernel.org # 5.11.y only
Signed-off-by: Andrew Scull 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
Signed-off-by: Greg Kroah-Hartman 
---
 arch/arm64/include/asm/kvm_hyp.h |3 ++-
 arch/arm64/kvm/hyp/nvhe/host.S   |   20 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c |3 +--
 3 files changed, 13 insertions(+), 13 deletions(-)

--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -102,7 +102,8 @@ bool kvm_host_psci_handler(struct kvm_cp
 
 void __noreturn hyp_panic(void);
 #ifdef __KVM_NVHE_HYPERVISOR__
-void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+  u64 elr, u64 par);
 #endif
 
 #endif /* __ARM64_KVM_HYP_H__ */
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -71,10 +71,15 @@ SYM_FUNC_START(__host_enter)
 SYM_FUNC_END(__host_enter)
 
 /*
- * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 
par);
+ * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+ *   u64 elr, u64 par);
  */
 SYM_FUNC_START(__hyp_do_panic)
-   /* Load the format arguments into x1-7 */
+   mov x29, x0
+
+   /* Load the format string into x0 and arguments into x1-7 */
+   ldr x0, =__hyp_panic_string
+
mov x6, x3
get_vcpu_ptr x7, x3
 
@@ -89,13 +94,8 @@ SYM_FUNC_START(__hyp_do_panic)
ldr lr, =panic
msr elr_el2, lr
 
-   /*
-* Set the panic format string and enter the host, conditionally
-* restoring the host context.
-*/
-   cmp x0, xzr
-   ldr x0, =__hyp_panic_string
-   b.eq__host_enter_without_restoring
+   /* Enter the host, conditionally restoring the host context. */
+   cbz x29, __host_enter_without_restoring
b   __host_enter_for_panic
 SYM_FUNC_END(__hyp_do_panic)
 
@@ -150,7 +150,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -266,7 +266,6 @@ void __noreturn hyp_panic(void)
u64 spsr = read_sysreg_el2(SYS_SPSR);
u64 elr = read_sysreg_el2(SYS_ELR);
u64 par = read_sysreg_par();
-   bool restore_host = true;
struct kvm_cpu_context *host_ctxt;
struct kvm_vcpu *vcpu;
 
@@ -280,7 +279,7 @@ void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}
 
-   __hyp_do_panic(restore_host, spsr, elr, par);
+   __hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
 }
 


Patches currently in stable-queue which might be from asc...@google.com are

queue-5.11/kvm-arm64-fix-nvhe-hyp-panic-host-context-restore.patch
queue-5.11/kvm-arm64-avoid-corrupting-vcpu-context-register-in-guest-exit.patch
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH] KVM: arm64: Fix nVHE hyp panic host context restore

2021-03-15 Thread Greg KH
On Mon, Mar 15, 2021 at 12:21:36PM +, Andrew Scull wrote:
> Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.
> 
> When panicking from the nVHE hyp and restoring the host context, x29 is
> expected to hold a pointer to the host context. This wasn't being done
> so fix it to make sure there's a valid pointer the host context being
> used.
> 
> Rather than passing a boolean indicating whether or not the host context
> should be restored, instead pass the pointer to the host context. NULL
> is passed to indicate that no context should be restored.
> 
> Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
> Cc: sta...@vger.kernel.org # 5.11.y only
> Signed-off-by: Andrew Scull 
> Signed-off-by: Marc Zyngier 
> Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
> ---
>  arch/arm64/include/asm/kvm_hyp.h |  3 ++-
>  arch/arm64/kvm/hyp/nvhe/host.S   | 20 ++--
>  arch/arm64/kvm/hyp/nvhe/switch.c |  3 +--
>  3 files changed, 13 insertions(+), 13 deletions(-)

Both backports now queued up, thanks.

greg k-h
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Patch "KVM: arm64: Fix nVHE hyp panic host context restore" has been added to the 5.10-stable tree

2021-03-15 Thread gregkh


This is a note to let you know that I've just added the patch titled

KVM: arm64: Fix nVHE hyp panic host context restore

to the 5.10-stable tree which can be found at:

http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
 kvm-arm64-fix-nvhe-hyp-panic-host-context-restore.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.


>From foo@baz Mon Mar 15 01:38:17 PM CET 2021
From: Andrew Scull 
Date: Mon, 15 Mar 2021 12:22:10 +
Subject: KVM: arm64: Fix nVHE hyp panic host context restore
To: kvmarm@lists.cs.columbia.edu
Cc: m...@kernel.org, kernel-t...@android.com, Andrew Scull , 
sta...@vger.kernel.org
Message-ID: <20210315122210.1688894-1-asc...@google.com>

From: Andrew Scull 

Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream.

When panicking from the nVHE hyp and restoring the host context, x29 is
expected to hold a pointer to the host context. This wasn't being done,
so fix it to make sure there's a valid pointer to the host context being
used.

Rather than passing a boolean indicating whether or not the host context
should be restored, instead pass the pointer to the host context. NULL
is passed to indicate that no context should be restored.

Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics")
Cc: sta...@vger.kernel.org # 5.10.y only
Signed-off-by: Andrew Scull 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210219122406.1337626-1-asc...@google.com
Signed-off-by: Greg Kroah-Hartman 
---
 arch/arm64/include/asm/kvm_hyp.h |3 ++-
 arch/arm64/kvm/hyp/nvhe/host.S   |   20 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c |3 +--
 3 files changed, 13 insertions(+), 13 deletions(-)

--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -99,7 +99,8 @@ u64 __guest_enter(struct kvm_vcpu *vcpu)
 
 void __noreturn hyp_panic(void);
 #ifdef __KVM_NVHE_HYPERVISOR__
-void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 par);
+void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+  u64 elr, u64 par);
 #endif
 
 #endif /* __ARM64_KVM_HYP_H__ */
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -64,10 +64,15 @@ __host_enter_without_restoring:
 SYM_FUNC_END(__host_exit)
 
 /*
- * void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 
par);
+ * void __noreturn __hyp_do_panic(struct kvm_cpu_context *host_ctxt, u64 spsr,
+ *   u64 elr, u64 par);
  */
 SYM_FUNC_START(__hyp_do_panic)
-   /* Load the format arguments into x1-7 */
+   mov x29, x0
+
+   /* Load the format string into x0 and arguments into x1-7 */
+   ldr x0, =__hyp_panic_string
+
mov x6, x3
get_vcpu_ptr x7, x3
 
@@ -82,13 +87,8 @@ SYM_FUNC_START(__hyp_do_panic)
ldr lr, =panic
msr elr_el2, lr
 
-   /*
-* Set the panic format string and enter the host, conditionally
-* restoring the host context.
-*/
-   cmp x0, xzr
-   ldr x0, =__hyp_panic_string
-   b.eq__host_enter_without_restoring
+   /* Enter the host, conditionally restoring the host context. */
+   cbz x29, __host_enter_without_restoring
b   __host_enter_for_panic
 SYM_FUNC_END(__hyp_do_panic)
 
@@ -144,7 +144,7 @@ SYM_FUNC_END(__hyp_do_panic)
 
 .macro invalid_host_el1_vect
.align 7
-   mov x0, xzr /* restore_host = false */
+   mov x0, xzr /* host_ctxt = NULL */
mrs x1, spsr_el2
mrs x2, elr_el2
mrs x3, par_el1
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -260,7 +260,6 @@ void __noreturn hyp_panic(void)
u64 spsr = read_sysreg_el2(SYS_SPSR);
u64 elr = read_sysreg_el2(SYS_ELR);
u64 par = read_sysreg_par();
-   bool restore_host = true;
struct kvm_cpu_context *host_ctxt;
struct kvm_vcpu *vcpu;
 
@@ -274,7 +273,7 @@ void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}
 
-   __hyp_do_panic(restore_host, spsr, elr, par);
+   __hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
 }
 


Patches currently in stable-queue which might be from asc...@google.com are

queue-5.10/kvm-arm64-fix-nvhe-hyp-panic-host-context-restore.patch
queue-5.10/kvm-arm64-avoid-corrupting-vcpu-context-register-in-guest-exit.patch
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH kvmtool v2 21/22] hw/serial: ARM/arm64: Use MMIO at higher addresses

2021-03-15 Thread Andre Przywara
On Tue, 9 Mar 2021 16:02:20 +
Alexandru Elisei  wrote:

> Hi Andre,
> 
> I think you forgot to change the way the address is generated in
> serial8250_generate_fdt_node, it's still KVM_IOPORT_AREA + dev->iobase. It's
> technically correct, as KVM_IOPORT_AREA == ARM_IOPORT_AREA == 0x0, but very
> confusing (and prone to breakage if something changes in the memory layout).

So I moved the addition of KVM_IOPORT_AREA into the definition of
serial_iobase() (at the beginning of the file). This means I can remove
it below, and just use dev->iobase as is.

> One more comment below.
> 
> On 2/25/21 12:59 AM, Andre Przywara wrote:
> > Using the UART devices at their legacy I/O addresses as set by IBM in
> > 1981 was a kludge we used for simplicity on ARM platforms as well.
> > However this imposes problems due to their missing alignment and overlap
> > with the PCI I/O address space.
> >
> > Now that we can switch a device easily between using ioports and MMIO,
> > let's move the UARTs out of the first 4K of memory on ARM platforms.
> >
> > That should be transparent for well behaved guests, since the change is
> > naturally reflected in the device tree. Even "earlycon" keeps working,
> > as the stdout-path property is adjusted automatically.
> >
> > People providing direct earlycon parameters via the command line need to
> > adjust it to: "earlycon=uart,mmio,0x1000000".
> >
> > Signed-off-by: Andre Przywara 
> > ---
> >  arm/include/arm-common/kvm-arch.h |  3 +++
> >  hw/serial.c   | 45 ---
> >  2 files changed, 32 insertions(+), 16 deletions(-)
> >
> > diff --git a/arm/include/arm-common/kvm-arch.h 
> > b/arm/include/arm-common/kvm-arch.h
> > index b12255b0..633ea8fa 100644
> > --- a/arm/include/arm-common/kvm-arch.h
> > +++ b/arm/include/arm-common/kvm-arch.h
> > @@ -28,6 +28,9 @@
> >  #define ARM_IOPORT_SIZE	(1U << 16)
> >  
> >  
> > +#define ARM_UART_MMIO_BASE ARM_MMIO_AREA
> > +#define ARM_UART_MMIO_SIZE	0x10000
> > +
> > +#define KVM_FLASH_MMIO_BASE	(ARM_MMIO_AREA + 0x1000000)
> > +#define KVM_FLASH_MAX_SIZE	0x1000000
> >  
> > diff --git a/hw/serial.c b/hw/serial.c
> > index 4be188a1..1854add2 100644
> > --- a/hw/serial.c
> > +++ b/hw/serial.c
> > @@ -13,6 +13,17 @@
> >  
> >  #include 
> >  
> > +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > +#define serial_iobase(nr)  (ARM_UART_MMIO_BASE + (nr) * 0x1000)
> > +#define serial_irq(nr) (32 + (nr))
> > +#define SERIAL8250_BUS_TYPEDEVICE_BUS_MMIO
> > +#else
> > +#define serial_iobase(nr)	((((nr) & 1) ? 0x200 : 0x300) + \
> > +				 ((nr) >= 2 ? 0xe8 : 0xf8))
> > +#define serial_irq(nr)		(((nr) & 1) ? 3 : 4)
> 
> Those two defines are hard to read, is there a reason for changing them from 
> v1?
> They looked a lot more readable in v1.

Oh, good catch, must have been a rebase artefact, from the very first
draft version I had. Fixed that.

Cheers,
Andre

> 
> > +#define SERIAL8250_BUS_TYPEDEVICE_BUS_IOPORT
> > +#endif
> > +
> >  /*
> >   * This fakes a U6_16550A. The fifo len needs to be 64 as the kernel
> >   * expects that for autodetection.
> > @@ -27,7 +38,7 @@ struct serial8250_device {
> > 	struct mutex		mutex;
> > u8  id;
> >  
> > -   u16 iobase;
> > +   u32 iobase;
> > u8  irq;
> > u8  irq_state;
> > int txcnt;
> > @@ -65,56 +76,56 @@ static struct serial8250_device devices[] = {
> > /* ttyS0 */
> > [0] = {
> > .dev_hdr = {
> > -   .bus_type   = DEVICE_BUS_IOPORT,
> > +   .bus_type   = SERIAL8250_BUS_TYPE,
> > .data   = serial8250_generate_fdt_node,
> > },
> > .mutex  = MUTEX_INITIALIZER,
> >  
> > .id = 0,
> > -   .iobase = 0x3f8,
> > -   .irq= 4,
> > +   .iobase = serial_iobase(0),
> > +   .irq= serial_irq(0),
> >  
> > SERIAL_REGS_SETTING
> > },
> > /* ttyS1 */
> > [1] = {
> > .dev_hdr = {
> > -   .bus_type   = DEVICE_BUS_IOPORT,
> > +   .bus_type   = SERIAL8250_BUS_TYPE,
> > .data   = serial8250_generate_fdt_node,
> > },
> > .mutex  = MUTEX_INITIALIZER,
> >  
> > .id = 1,
> > -   .iobase = 0x2f8,
> > -   .irq= 3,
> > +   .iobase = serial_iobase(1),
> > +   .irq= serial_irq(1),
> >  
> > SERIAL_REGS_SETTING
> > },
> > /* ttyS2 */
> > [2] = {
> > .dev_hdr = {
> > -   

Re: [PATCH kvmtool v2 20/22] arm: Reorganise and document memory map

2021-03-15 Thread Andre Przywara
On Tue, 9 Mar 2021 15:46:29 +
Alexandru Elisei  wrote:

Hi,

> Hi Andre,
> 
> This is a really good idea, thank you for implementing it!
> 
> Some comments below.
> 
> On 2/25/21 12:59 AM, Andre Przywara wrote:
> > The hardcoded memory map we expose to a guest is currently described
> > using a series of partially interconnected preprocessor constants,
> > which is hard to read and follow.
> >
> > In preparation for moving the UART and RTC to some different MMIO
> > region, document the current map with some ASCII art, and clean up the
> > definition of the sections.
> >
> > No functional change.
> >
> > Signed-off-by: Andre Przywara 
> > ---
> >  arm/include/arm-common/kvm-arch.h | 41 ++-
> >  1 file changed, 29 insertions(+), 12 deletions(-)
> >
> > diff --git a/arm/include/arm-common/kvm-arch.h 
> > b/arm/include/arm-common/kvm-arch.h
> > index d84e50cd..b12255b0 100644
> > --- a/arm/include/arm-common/kvm-arch.h
> > +++ b/arm/include/arm-common/kvm-arch.h
> > @@ -7,14 +7,33 @@
> >  
> >  #include "arm-common/gic.h"
> >  
> > +/*
> > + * The memory map used for ARM guests (not to scale):
> > + *
> > + * 0  64K  16M 32M 48M    1GB   2GB
> > + * +---+-..-+---+---+----+-+--.--+---..
> > + * | (PCI) || int.  |   || | |
> > + * |  I/O  || MMIO: | Flash | virtio | GIC |   PCI   |  DRAM
> > + * | ports || UART, |   |  MMIO  | |  (AXI)  |
> > + * |   || RTC   |   || | |
> > + * +---+-..-+---+---+----+-+--.--+---..
> > + */  
> 
> Nitpick: I searched the PCI Local Bus Specification revision 3.0 (which 
> kvmtool
> currently implements) for the term I/O ports, and found one mention in a 
> schematic
> for an add-in card. In the spec, the I/O region is called I/O Space.

Right, will change that.
 
> I don't know what "int." means in the region for the UART and RTC.

It was meant to mean "internal", but this is really nonsensical, so I
will replace it with "plat MMIO".
 
> The comment says that the art is not to scale, so I don't think there's any 
> need
> for the "..." between the corners of the regions. To my eyes, it makes the 
> ASCII
> art look crooked.

fixed.
 
> The next patches add the UART and RTC outside the first 64K; I think the 
> region
> should be documented in the patches where the changes are made, not here. 
> Another
> alternative would be to move this patch to the end of the series instead of
> incrementally changing the memory ASCII art (which I imagine is time 
> consuming).

You passed the scrutiny test ;-)
I noticed this last minute, but didn't feel like changing it then.
Have done it now.

> 
> Otherwise, the numbers look OK.
> 
> > +
> >  #define ARM_IOPORT_AREA	_AC(0x0000000000000000, UL)
> > -#define ARM_FLASH_AREA	_AC(0x0000000002000000, UL)
> > -#define ARM_MMIO_AREA	_AC(0x0000000003000000, UL)
> > +#define ARM_MMIO_AREA	_AC(0x0000000001000000, UL)
> 
> The patch says it is *documenting* the memory layout, but here it is 
> *changing*
> the layout. Other than that, I like the shuffling of definitions so the 
> kvmtool
> global defines are closer to the arch values.

I amended the commit message to mention that I change the value of
ARM_MMIO_AREA, but that stays internal to that file and doesn't affect
the other definitions.
In fact I imported this header into a C file and printed all
externally used names, before and after: it didn't show any changes.

Cheers,
Andre

> >  #define ARM_AXI_AREA	_AC(0x0000000040000000, UL)
> >  #define ARM_MEMORY_AREA	_AC(0x0000000080000000, UL)
> >  
> > -#define ARM_LOMAP_MAX_MEMORY   ((1ULL << 32) - ARM_MEMORY_AREA)
> > -#define ARM_HIMAP_MAX_MEMORY   ((1ULL << 40) - ARM_MEMORY_AREA)
> > +#define KVM_IOPORT_AREA	ARM_IOPORT_AREA
> > +#define ARM_IOPORT_SIZE	(1U << 16)
> > +
> > +
> > +#define KVM_FLASH_MMIO_BASE	(ARM_MMIO_AREA + 0x1000000)
> > +#define KVM_FLASH_MAX_SIZE	0x1000000
> > +
> > +#define KVM_VIRTIO_MMIO_AREA   (KVM_FLASH_MMIO_BASE + 
> > KVM_FLASH_MAX_SIZE)
> > +#define ARM_VIRTIO_MMIO_SIZE   (ARM_AXI_AREA - \
> > +   (KVM_VIRTIO_MMIO_AREA + ARM_GIC_SIZE))
> >  
> >  #define ARM_GIC_DIST_BASE  (ARM_AXI_AREA - ARM_GIC_DIST_SIZE)
> >  #define ARM_GIC_CPUI_BASE  (ARM_GIC_DIST_BASE - ARM_GIC_CPUI_SIZE)
> > @@ -22,19 +41,17 @@
> >  #define ARM_GIC_DIST_SIZE	0x10000
> >  #define ARM_GIC_CPUI_SIZE	0x20000
> >  
> > -#define KVM_FLASH_MMIO_BASE	ARM_FLASH_AREA
> > -#define KVM_FLASH_MAX_SIZE	(ARM_MMIO_AREA - ARM_FLASH_AREA)
> >  
> > -#define ARM_IOPORT_SIZE	(1U << 16)
> > -#define ARM_VIRTIO_MMIO_SIZE   (ARM_AXI_AREA - (ARM_MMIO_AREA + 
> > ARM_GIC_SIZE))
> > +#define KVM_PCI_CFG_AREA   ARM_AXI_AREA
> >  #define ARM_PCI_CFG_SIZE   (1ULL << 24)
> > +#define KVM_PCI_MMIO_AREA  (KVM_PCI_CFG_AREA + 

Re: [PATCH kvmtool v2 13/22] hw/serial: Refactor trap handler

2021-03-15 Thread Andre Przywara
On Fri, 12 Mar 2021 11:29:43 +
Alexandru Elisei  wrote:

> Hi Andre,
> 
> On 2/25/21 12:59 AM, Andre Przywara wrote:
> > With the planned retirement of the special ioport emulation code, we
> > need to provide an emulation function compatible with the MMIO prototype.
> >
> > Adjust the trap handler to use that new function, and provide shims to
> > implement the old ioport interface, for now.
> >
> > We drop the usage of ioport__read8/write8 entirely, as this would only
> > be applicable for I/O port accesses, and does nothing for 8-bit wide
> > accesses anyway.
> >
> > Signed-off-by: Andre Przywara 
> > ---
> >  hw/serial.c | 93 +
> >  1 file changed, 58 insertions(+), 35 deletions(-)
> >
> > diff --git a/hw/serial.c b/hw/serial.c
> > index b0465d99..c495eac1 100644
> > --- a/hw/serial.c
> > +++ b/hw/serial.c
> > @@ -242,36 +242,31 @@ void serial8250__inject_sysrq(struct kvm *kvm, char 
> > sysrq)
> > sysrq_pending = sysrq;
> >  }
> >  
> > -static bool serial8250_out(struct ioport *ioport, struct kvm_cpu *vcpu, 
> > u16 port,
> > -  void *data, int size)
> > +static bool serial8250_out(struct serial8250_device *dev, struct kvm_cpu 
> > *vcpu,
> > +  u16 offset, u8 data)
> >  {
> > -   struct serial8250_device *dev = ioport->priv;
> > -   u16 offset;
> > bool ret = true;
> > -   char *addr = data;
> >  
> > mutex_lock(>mutex);
> >  
> > -   offset = port - dev->iobase;
> > -
> > switch (offset) {
> > case UART_TX:
> > if (dev->lcr & UART_LCR_DLAB) {
> > -   dev->dll = ioport__read8(data);
> > +   dev->dll = data;
> > break;
> > }
> >  
> > /* Loopback mode */
> > if (dev->mcr & UART_MCR_LOOP) {
> > if (dev->rxcnt < FIFO_LEN) {
> > -   dev->rxbuf[dev->rxcnt++] = *addr;
> > +   dev->rxbuf[dev->rxcnt++] = data;
> > dev->lsr |= UART_LSR_DR;
> > }
> > break;
> > }
> >  
> > if (dev->txcnt < FIFO_LEN) {
> > -   dev->txbuf[dev->txcnt++] = *addr;
> > +   dev->txbuf[dev->txcnt++] = data;
> > dev->lsr &= ~UART_LSR_TEMT;
> > if (dev->txcnt == FIFO_LEN / 2)
> > dev->lsr &= ~UART_LSR_THRE;
> > @@ -283,18 +278,18 @@ static bool serial8250_out(struct ioport *ioport, 
> > struct kvm_cpu *vcpu, u16 port
> > break;
> > case UART_IER:
> > if (!(dev->lcr & UART_LCR_DLAB))
> > -   dev->ier = ioport__read8(data) & 0x0f;
> > +   dev->ier = data & 0x0f;
> > else
> > -   dev->dlm = ioport__read8(data);
> > +   dev->dlm = data;
> > break;
> > case UART_FCR:
> > -   dev->fcr = ioport__read8(data);
> > +   dev->fcr = data;
> > break;
> > case UART_LCR:
> > -   dev->lcr = ioport__read8(data);
> > +   dev->lcr = data;
> > break;
> > case UART_MCR:
> > -   dev->mcr = ioport__read8(data);
> > +   dev->mcr = data;
> > break;
> > case UART_LSR:
> > /* Factory test */
> > @@ -303,7 +298,7 @@ static bool serial8250_out(struct ioport *ioport, 
> > struct kvm_cpu *vcpu, u16 port
> > /* Not used */
> > break;
> > case UART_SCR:
> > -   dev->scr = ioport__read8(data);
> > +   dev->scr = data;
> > break;
> > default:
> > ret = false;
> > @@ -317,7 +312,7 @@ static bool serial8250_out(struct ioport *ioport, 
> > struct kvm_cpu *vcpu, u16 port
> > return ret;
> >  }
> >  
> > -static void serial8250_rx(struct serial8250_device *dev, void *data)
> > +static void serial8250_rx(struct serial8250_device *dev, u8 *data)
> >  {
> > if (dev->rxdone == dev->rxcnt)
> > return;
> > @@ -325,57 +320,54 @@ static void serial8250_rx(struct serial8250_device 
> > *dev, void *data)
> > /* Break issued ? */
> > if (dev->lsr & UART_LSR_BI) {
> > dev->lsr &= ~UART_LSR_BI;
> > -   ioport__write8(data, 0);
> > +   *data = 0;
> > return;
> > }
> >  
> > -   ioport__write8(data, dev->rxbuf[dev->rxdone++]);
> > +   *data = dev->rxbuf[dev->rxdone++];
> > if (dev->rxcnt == dev->rxdone) {
> > dev->lsr &= ~UART_LSR_DR;
> > dev->rxcnt = dev->rxdone = 0;
> > }
> >  }
> >  
> > -static bool serial8250_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
> > port, void *data, int size)
> > +static bool serial8250_in(struct serial8250_device *dev, struct kvm_cpu 
> > *vcpu,
> > + u16 offset, u8 *data)
> >  {
> > -   struct serial8250_device *dev = ioport->priv;
> > -   u16 offset;
> > bool 

Re: [PATCH kvmtool v2 09/22] x86/ioport: Switch to new trap handlers

2021-03-15 Thread Andre Przywara
On Tue, 9 Mar 2021 12:09:52 +
Alexandru Elisei  wrote:

> Hi Andre,
> 
> On 2/25/21 12:59 AM, Andre Przywara wrote:
> > Now that the x86 I/O ports have trap handlers adhering to the MMIO fault
> > handler prototype, let's switch over to the joint registration routine.
> >
> > This allows us to get rid of the ioport shim routines.
> >
> > Since the debug output was done in ioport.c, we would lose this
> > functionality when moving over to the MMIO handlers. So bring this back
> > here explicitly, by introducing debug_mmio().
> >
> > Signed-off-by: Andre Przywara 
> > ---
> >  x86/ioport.c | 102 +++
> >  1 file changed, 37 insertions(+), 65 deletions(-)
> >
> > diff --git a/x86/ioport.c b/x86/ioport.c
> > index 78f9a863..9fcbb6c9 100644
> > --- a/x86/ioport.c
> > +++ b/x86/ioport.c
> > @@ -3,21 +3,35 @@
> >  #include 
> >  #include 
> >  
> > -static void dummy_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
> > +static void debug_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
> >u8 is_write, void *ptr)  
> 
> Since the function is about ioports (it even checks cfg.ioport_debug), 
> shouldn't
> something like debug_io/debug_pio/debug_ioport/ related> be
> more appropriate?

Yes, I changed the function to debug_io(), to be in line with the other
functions and the names in the other devices.

> Otherwise looks good: the only emulation callback that could fail was
> debug_ops->debug_io_out, which triggered a print if cfg.ioport_debug was
> set, and the callback is replaced by debug_mmio. With the name change:
> 
> Reviewed-by: Alexandru Elisei 

Thanks!

Andre

> >  {
> > +   if (!vcpu->kvm->cfg.ioport_debug)
> > +   return;
> > +
> > +   fprintf(stderr, "debug port %s from VCPU%lu: port=0x%lx, size=%u",
> > +   is_write ? "write" : "read", vcpu->cpu_id,
> > +   (unsigned long)addr, len);
> > +   if (is_write) {
> > +   u32 value;
> > +
> > +   switch (len) {
> > +   case 1: value = ioport__read8(data); break;
> > +   case 2: value = ioport__read16((u16*)data); break;
> > +   case 4: value = ioport__read32((u32*)data); break;
> > +   default: value = 0; break;
> > +   }
> > +   fprintf(stderr, ", data: 0x%x\n", value);
> > +   } else {
> > +   fprintf(stderr, "\n");
> > +   }
> >  }
> >  
> > -static bool debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > +static void dummy_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
> > +  u8 is_write, void *ptr)
> >  {
> > -   dummy_mmio(vcpu, port, data, size, true, NULL);
> > -   return 0;
> >  }
> >  
> > -static struct ioport_operations debug_ops = {
> > -   .io_out = debug_io_out,
> > -};
> > -
> >  static void seabios_debug_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data,
> >u32 len, u8 is_write, void *ptr)
> >  {
> > @@ -31,37 +45,6 @@ static void seabios_debug_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data,
> > putchar(ch);
> >  }
> >  
> > -static bool seabios_debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > -{
> > -   seabios_debug_mmio(vcpu, port, data, size, true, NULL);
> > -   return 0;
> > -}
> > -
> > -static struct ioport_operations seabios_debug_ops = {
> > -   .io_out = seabios_debug_io_out,
> > -};
> > -
> > -static bool dummy_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > -{
> > -   dummy_mmio(vcpu, port, data, size, false, NULL);
> > -   return true;
> > -}
> > -
> > -static bool dummy_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > -{
> > -   dummy_mmio(vcpu, port, data, size, true, NULL);
> > -   return true;
> > -}
> > -
> > -static struct ioport_operations dummy_read_write_ioport_ops = {
> > -   .io_in  = dummy_io_in,
> > -   .io_out = dummy_io_out,
> > -};
> > -
> > -static struct ioport_operations dummy_write_only_ioport_ops = {
> > -   .io_out = dummy_io_out,
> > -};
> > -
> >  /*
> >   * The "fast A20 gate"
> >   */
> > @@ -76,17 +59,6 @@ static void ps2_control_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
> > ioport__write8(data, 0x02);
> >  }
> >  
> > -static bool ps2_control_a_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > -{
> > -   ps2_control_mmio(vcpu, port, data, size, false, NULL);
> > -   return true;
> > -}
> > -
> > -static struct ioport_operations ps2_control_a_ops = {
> > -   .io_in  = ps2_control_a_io_in,
> > -   .io_out = dummy_io_out,
> > -};
> > -
> >  void ioport__map_irq(u8 *irq)
> >  {
> >  }
> > @@ -98,75 +70,75 @@ static int ioport__setup_arch(struct kvm *kvm)
> > /* Legacy ioport setup */
> >  
> > /* 0000 - 001F - DMA1 controller */
> > -   r = 
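
(The quoted hunk is cut short by the archive. For orientation, a sketch of
the shape of one such conversion, assuming the old ioport__register()
signature used elsewhere in kvmtool; the exact registrations in the real
patch may differ:)

	/* old style: an ioport_operations struct plus a port registration */
	r = ioport__register(kvm, 0x0000, &dummy_read_write_ioport_ops, 32, NULL);

	/* new style: one handler function, hooked up via the joint trap API */
	r = kvm__register_pio(kvm, 0x0000, 32, dummy_mmio, NULL);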

Re: [PATCH kvmtool v2 08/22] x86/ioport: Refactor trap handlers

2021-03-15 Thread Andre Przywara
On Tue, 9 Mar 2021 11:49:47 +0000
Alexandru Elisei  wrote:

> Hi Andre,
> 
> Regarding the naming of the functions, these are real ioport emulation
> functions, which are executed because of a KVM_EXIT_IO exit reason from
> KVM_RUN. Wouldn't naming the functions something like *_pio or *_io be
> more appropriate?

Yes, indeed these devices here are per definition pure port I/O
devices. I changed the names to _io, to be in line with the later
patches.

Cheers,
Andre
 
> On 2/25/21 12:59 AM, Andre Przywara wrote:
> > With the planned retirement of the special ioport emulation code, we
> > need to provide emulation functions compatible with the MMIO
> > prototype.
> >
> > Adjust the trap handlers to use that new function, and provide shims to
> > implement the old ioport interface, for now.
> >
> > Signed-off-by: Andre Przywara 
> > Reviewed-by: Alexandru Elisei 
> > ---
> >  x86/ioport.c | 30 ++
> >  1 file changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/x86/ioport.c b/x86/ioport.c
> > index a8d2bb1a..78f9a863 100644
> > --- a/x86/ioport.c
> > +++ b/x86/ioport.c
> > @@ -3,8 +3,14 @@
> >  #include 
> >  #include 
> >  
> > +static void dummy_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
> > +  u8 is_write, void *ptr)
> > +{
> > +}
> > +
> >  static bool debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> >  {
> > +   dummy_mmio(vcpu, port, data, size, true, NULL);
> > return 0;
> >  }
> >  
> > @@ -12,15 +18,23 @@ static struct ioport_operations debug_ops = {
> > .io_out = debug_io_out,
> >  };
> >  
> > -static bool seabios_debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > +static void seabios_debug_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data,
> > +  u32 len, u8 is_write, void *ptr)
> >  {
> > char ch;
> >  
> > +   if (!is_write)
> > +   return;
> > +
> > ch = ioport__read8(data);
> >  
> > putchar(ch);
> > +}
> >  
> > -   return true;
> > +static bool seabios_debug_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > +{
> > +   seabios_debug_mmio(vcpu, port, data, size, true, NULL);
> > +   return 0;
> >  }
> >  
> >  static struct ioport_operations seabios_debug_ops = {
> > @@ -29,11 +43,13 @@ static struct ioport_operations seabios_debug_ops = {
> >  
> >  static bool dummy_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> >  {
> > +   dummy_mmio(vcpu, port, data, size, false, NULL);
> > return true;
> >  }
> >  
> >  static bool dummy_io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> >  {
> > +   dummy_mmio(vcpu, port, data, size, true, NULL);
> > return true;
> >  }
> >  
> > @@ -50,13 +66,19 @@ static struct ioport_operations dummy_write_only_ioport_ops = {
> >   * The "fast A20 gate"
> >   */
> >  
> > -static bool ps2_control_a_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > +static void ps2_control_mmio(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
> > +u8 is_write, void *ptr)
> >  {
> > /*
> >  * A20 is always enabled.
> >  */
> > -   ioport__write8(data, 0x02);
> > +   if (!is_write)
> > +   ioport__write8(data, 0x02);
> > +}
> >  
> > +static bool ps2_control_a_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size)
> > +{
> > +   ps2_control_mmio(vcpu, port, data, size, false, NULL);
> > return true;
> >  }
> >

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH kvmtool v2 04/22] mmio: Extend handling to include ioport emulation

2021-03-15 Thread Andre Przywara
On Wed, 3 Mar 2021 17:58:29 +0000
Alexandru Elisei  wrote:

Hi Alex,

> On 2/25/21 12:58 AM, Andre Przywara wrote:
> > In their core functionality MMIO and I/O port traps are not really
> > different, yet we still have two totally separate code paths for
> > handling them. Devices need to decide on one conduit or need to provide
> > different handler functions for each of them.
> >
> > Extend the existing MMIO emulation to also cover ioport handlers.
> > This just adds another RB tree root for holding the I/O port handlers,
> > but otherwise uses the same tree population and lookup code.
> > "ioport" or "mmio" just become a flag in the registration function.
> > Provide wrappers to not break existing users, and allow an easy
> > transition for the existing ioport handlers.
> >
> > This also means that ioport handlers now can use the same emulation
> > callback prototype as MMIO handlers, which means we have to migrate them
> > over. To allow a smooth transition, we hook up the new I/O emulate
> > function to the end of the existing ioport emulation code.
> >
> > Signed-off-by: Andre Przywara 
> > ---
> >  include/kvm/kvm.h | 49 ---
> >  ioport.c  |  4 +--
> >  mmio.c| 65 +++
> >  3 files changed, 102 insertions(+), 16 deletions(-)
> >
> > diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
> > index f1f0afd7..306b258a 100644
> > --- a/include/kvm/kvm.h
> > +++ b/include/kvm/kvm.h
> > @@ -27,10 +27,23 @@
> >  #define PAGE_SIZE (sysconf(_SC_PAGE_SIZE))
> >  #endif
> >  
> > +/*
> > + * We are reusing the existing DEVICE_BUS_MMIO and DEVICE_BUS_IOPORT constants
> > + * from kvm/devices.h to differentiate between registering an I/O port and an
> > + * MMIO region.
> > + * MMIO region.
> > + * To avoid collisions with future additions of more bus types, we reserve
> > + * a generous 4 bits for the bus mask here.
> > + */
> > +#define IOTRAP_BUS_MASK	0xf
> > +#define IOTRAP_COALESCE	(1U << 4)
> > +
> >  #define DEFINE_KVM_EXT(ext)	\
> > .name = #ext,   \
> > .code = ext
> >  
> > +struct kvm_cpu;
> > +typedef void (*mmio_handler_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data,
> > +   u32 len, u8 is_write, void *ptr);
> >  typedef void (*fdt_irq_fn)(void *fdt, u8 irq, enum irq_type irq_type);
> >  
> >  enum {
> > @@ -113,6 +126,8 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level);
> >  void kvm__irq_trigger(struct kvm *kvm, int irq);
> >  bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, int size, u32 count);
> >  bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u8 is_write);
> > +bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data,
> > + int direction, int size, u32 count);
> >  int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr);
> >  int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr,
> >   enum kvm_mem_type type);
> > @@ -136,10 +151,36 @@ static inline int kvm__reserve_mem(struct kvm *kvm, u64 guest_phys, u64 size)
> >  KVM_MEM_TYPE_RESERVED);
> >  }
> >  
> > -int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce,
> > -   void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr),
> > -   void *ptr);
> > -bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr);
> > +int __must_check kvm__register_iotrap(struct kvm *kvm, u64 phys_addr, u64 len,
> > + mmio_handler_fn mmio_fn, void *ptr,
> > + unsigned int flags);
> > +
> > +static inline
> > +int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr,
> > +   u64 phys_addr_len, bool coalesce,
> > +   mmio_handler_fn mmio_fn, void *ptr)
> > +{
> > +   return kvm__register_iotrap(kvm, phys_addr, phys_addr_len, mmio_fn, ptr,
> > +   DEVICE_BUS_MMIO | (coalesce ? IOTRAP_COALESCE : 0));
> > +}
> > +static inline
> > +int __must_check kvm__register_pio(struct kvm *kvm, u16 port, u16 len,
> > +  mmio_handler_fn mmio_fn, void *ptr)
> > +{
> > +   return kvm__register_iotrap(kvm, port, len, mmio_fn, ptr,
> > +   DEVICE_BUS_IOPORT);
> > +}
> > +
> > +bool kvm__deregister_iotrap(struct kvm *kvm, u64 phys_addr, unsigned int flags);
> > +static inline bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr)
> > +{
> > +   return kvm__deregister_iotrap(kvm, phys_addr, DEVICE_BUS_MMIO);
> > +}
> > +static inline bool kvm__deregister_pio(struct kvm *kvm, u16 port)
> > +{
> > +   return kvm__deregister_iotrap(kvm, port, 
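
(Not part of the series: a minimal sketch of how a device could use the new
joint registration, based only on the prototypes quoted above. "struct
my_dev", its single register and port 0x510 are invented for illustration.)

struct my_dev {
	u8 reg;
};

/* handler following the mmio_handler_fn prototype introduced above */
static void my_dev_io(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len,
		      u8 is_write, void *ptr)
{
	struct my_dev *dev = ptr;

	if (is_write)
		dev->reg = ioport__read8(data);	/* guest OUT to the port */
	else
		ioport__write8(data, dev->reg);	/* guest IN from the port */
}

static int my_dev_init(struct kvm *kvm, struct my_dev *dev)
{
	/* one call instead of a struct ioport_operations registration */
	return kvm__register_pio(kvm, 0x510, 1, my_dev_io, dev);
}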

Re: [PATCH][stable-4.{4,9}] KVM: arm64: Fix exclusive limit for IPA size

2021-03-15 Thread Greg KH
On Mon, Mar 15, 2021 at 11:46:46AM +0000, Marc Zyngier wrote:
> Commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream.
> 
> When registering a memslot, we check the size and location of that
> memslot against the IPA size to ensure that we can provide guest
> access to the whole of the memory.
> 
> Unfortunately, this check rejects memslots that end up at the exact
> limit of the addressing capability for a given IPA size. For example,
> it refuses the creation of a 2GB memslot at 0x80000000 with a 32bit
> IPA space.
> 
> Fix it by relaxing the check to accept a memslot reaching the
> limit of the IPA space.
> 
> Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
> Reviewed-by: Eric Auger 
> Signed-off-by: Marc Zyngier 
> Cc: sta...@vger.kernel.org # 4.4, 4.9
> Reviewed-by: Andrew Jones 
> Link: https://lore.kernel.org/r/20210311100016.3830038-3-...@kernel.org
> ---
>  arch/arm/kvm/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

That worked, now queued up, thanks!

greg k-h
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Patch "KVM: arm64: Fix exclusive limit for IPA size" has been added to the 4.9-stable tree

2021-03-15 Thread gregkh


This is a note to let you know that I've just added the patch titled

KVM: arm64: Fix exclusive limit for IPA size

to the 4.9-stable tree which can be found at:

http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
 kvm-arm64-fix-exclusive-limit-for-ipa-size.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.


>From foo@baz Mon Mar 15 01:04:48 PM CET 2021
From: Marc Zyngier 
Date: Mon, 15 Mar 2021 11:46:46 +0000
Subject: KVM: arm64: Fix exclusive limit for IPA size
To: gre...@linuxfoundation.org
Cc: kernel-t...@android.com, kvmarm@lists.cs.columbia.edu, Eric Auger 
, sta...@vger.kernel.org, Andrew Jones 

Message-ID: <20210315114646.4137198-1-...@kernel.org>

From: Marc Zyngier 

Commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream.

When registering a memslot, we check the size and location of that
memslot against the IPA size to ensure that we can provide guest
access to the whole of the memory.

Unfortunately, this check rejects memslots that end up at the exact
limit of the addressing capability for a given IPA size. For example,
it refuses the creation of a 2GB memslot at 0x80000000 with a 32bit
IPA space.

Fix it by relaxing the check to accept a memslot reaching the
limit of the IPA space.

Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
Reviewed-by: Eric Auger 
Signed-off-by: Marc Zyngier 
Cc: sta...@vger.kernel.org # 4.4, 4.9
Reviewed-by: Andrew Jones 
Link: https://lore.kernel.org/r/20210311100016.3830038-3-...@kernel.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/arm/kvm/mmu.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1834,7 +1834,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	 * Prevent userspace from creating a memory region outside of the IPA
 	 * space addressable by the KVM guest IPA space.
 	 */
-	if (memslot->base_gfn + memslot->npages >=
+	if (memslot->base_gfn + memslot->npages >
 	    (KVM_PHYS_SIZE >> PAGE_SHIFT))
 		return -EFAULT;
 


Patches currently in stable-queue which might be from m...@kernel.org are

queue-4.9/kvm-arm64-fix-exclusive-limit-for-ipa-size.patch
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Patch "KVM: arm64: Fix exclusive limit for IPA size" has been added to the 4.4-stable tree

2021-03-15 Thread gregkh


This is a note to let you know that I've just added the patch titled

KVM: arm64: Fix exclusive limit for IPA size

to the 4.4-stable tree which can be found at:

http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
 kvm-arm64-fix-exclusive-limit-for-ipa-size.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.


>From foo@baz Mon Mar 15 01:04:46 PM CET 2021
From: Marc Zyngier 
Date: Mon, 15 Mar 2021 11:46:46 +0000
Subject: KVM: arm64: Fix exclusive limit for IPA size
To: gre...@linuxfoundation.org
Cc: kernel-t...@android.com, kvmarm@lists.cs.columbia.edu, Eric Auger 
, sta...@vger.kernel.org, Andrew Jones 

Message-ID: <20210315114646.4137198-1-...@kernel.org>

From: Marc Zyngier 

Commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream.

When registering a memslot, we check the size and location of that
memslot against the IPA size to ensure that we can provide guest
access to the whole of the memory.

Unfortunately, this check rejects memslots that end up at the exact
limit of the addressing capability for a given IPA size. For example,
it refuses the creation of a 2GB memslot at 0x80000000 with a 32bit
IPA space.

Fix it by relaxing the check to accept a memslot reaching the
limit of the IPA space.

Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
Reviewed-by: Eric Auger 
Signed-off-by: Marc Zyngier 
Cc: sta...@vger.kernel.org # 4.4, 4.9
Reviewed-by: Andrew Jones 
Link: https://lore.kernel.org/r/20210311100016.3830038-3-...@kernel.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/arm/kvm/mmu.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1789,7 +1789,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	 * Prevent userspace from creating a memory region outside of the IPA
 	 * space addressable by the KVM guest IPA space.
 	 */
-	if (memslot->base_gfn + memslot->npages >=
+	if (memslot->base_gfn + memslot->npages >
 	    (KVM_PHYS_SIZE >> PAGE_SHIFT))
 		return -EFAULT;
 


Patches currently in stable-queue which might be from m...@kernel.org are

queue-4.4/kvm-arm64-fix-exclusive-limit-for-ipa-size.patch
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v6 3/5] ARM: implement support for SMCCC TRNG entropy source

2021-03-15 Thread Ard Biesheuvel
On Wed, 6 Jan 2021 at 11:35, Andre Przywara  wrote:
>
> From: Ard Biesheuvel 
>
> Implement arch_get_random_seed_*() for ARM based on the firmware
> or hypervisor provided entropy source described in ARM DEN0098.
>
> This will make the kernel's random number generator consume entropy
> provided by this interface, at early boot, and periodically at
> runtime when reseeding.
>
> Cc: Linus Walleij 
> Cc: Russell King 
> Signed-off-by: Ard Biesheuvel 
> [Andre: rework to be initialised by the SMCCC firmware driver]
> Signed-off-by: Andre Przywara 
> Reviewed-by: Linus Walleij 

I think this one could be dropped into rmk's patch tracker now, right?


> ---
>  arch/arm/Kconfig  |  4 ++
>  arch/arm/include/asm/archrandom.h | 64 +++
>  2 files changed, 68 insertions(+)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 138248999df7..bfe642510b0a 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1644,6 +1644,10 @@ config STACKPROTECTOR_PER_TASK
>   Enable this option to switch to a different method that uses a
>   different canary value for each task.
>
> +config ARCH_RANDOM
> +   def_bool y
> +   depends on HAVE_ARM_SMCCC_DISCOVERY
> +
>  endmenu
>
>  menu "Boot options"
> diff --git a/arch/arm/include/asm/archrandom.h b/arch/arm/include/asm/archrandom.h
> index a8e84ca5c2ee..f3e96a5b65f8 100644
> --- a/arch/arm/include/asm/archrandom.h
> +++ b/arch/arm/include/asm/archrandom.h
> @@ -2,9 +2,73 @@
>  #ifndef _ASM_ARCHRANDOM_H
>  #define _ASM_ARCHRANDOM_H
>
> +#ifdef CONFIG_ARCH_RANDOM
> +
> +#include 
> +#include 
> +
> +#define ARM_SMCCC_TRNG_MIN_VERSION	0x10000UL
> +
> +extern bool smccc_trng_available;
> +
> +static inline bool __init smccc_probe_trng(void)
> +{
> +   struct arm_smccc_res res;
> +
> +   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_VERSION, &res);
> +   if ((s32)res.a0 < 0)
> +   return false;
> +   if (res.a0 >= ARM_SMCCC_TRNG_MIN_VERSION) {
> +   /* double check that the 32-bit flavor is available */
> +   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_FEATURES,
> +ARM_SMCCC_TRNG_RND32,
> +&res);
> +   if ((s32)res.a0 >= 0)
> +   return true;
> +   }
> +
> +   return false;
> +}
> +
> +static inline bool __must_check arch_get_random_long(unsigned long *v)
> +{
> +   return false;
> +}
> +
> +static inline bool __must_check arch_get_random_int(unsigned int *v)
> +{
> +   return false;
> +}
> +
> +static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
> +{
> +   struct arm_smccc_res res;
> +
> +   if (smccc_trng_available) {
> +   arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_RND32, 8 * sizeof(*v), &res);
> +
> +   if (res.a0 != 0)
> +   return false;
> +
> +   *v = res.a3;
> +   return true;
> +   }
> +
> +   return false;
> +}
> +
> +static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
> +{
> +   return arch_get_random_seed_long((unsigned long *)v);
> +}
> +
> +
> +#else /* !CONFIG_ARCH_RANDOM */
> +
>  static inline bool __init smccc_probe_trng(void)
>  {
> return false;
>  }
>
> +#endif /* CONFIG_ARCH_RANDOM */
>  #endif /* _ASM_ARCHRANDOM_H */
> --
> 2.17.1
>
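
(Not part of the patch: a minimal caller sketch showing the contract of the
helper above; mix_into_pool() is a made-up stand-in for whatever consumes
the entropy.)

	unsigned long seed;

	if (arch_get_random_seed_long(&seed)) {
		/* firmware TRNG returned up to 32 bits of entropy in seed */
		mix_into_pool(&seed, sizeof(seed));
	} else {
		/* TRNG absent or temporarily out of entropy: fall back */
	}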
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH][stable-4.{4,9}] KVM: arm64: Fix exclusive limit for IPA size

2021-03-15 Thread Marc Zyngier
Commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream.

When registering a memslot, we check the size and location of that
memslot against the IPA size to ensure that we can provide guest
access to the whole of the memory.

Unfortunately, this check rejects memslots that end up at the exact
limit of the addressing capability for a given IPA size. For example,
it refuses the creation of a 2GB memslot at 0x80000000 with a 32bit
IPA space.

Fix it by relaxing the check to accept a memslot reaching the
limit of the IPA space.

Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE")
Reviewed-by: Eric Auger 
Signed-off-by: Marc Zyngier 
Cc: sta...@vger.kernel.org # 4.4, 4.9
Reviewed-by: Andrew Jones 
Link: https://lore.kernel.org/r/20210311100016.3830038-3-...@kernel.org
---
 arch/arm/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index edd392fdc14b..b44fdee5cd6b 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1789,7 +1789,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	 * Prevent userspace from creating a memory region outside of the IPA
 	 * space addressable by the KVM guest IPA space.
 	 */
-	if (memslot->base_gfn + memslot->npages >=
+	if (memslot->base_gfn + memslot->npages >
 	    (KVM_PHYS_SIZE >> PAGE_SHIFT))
 		return -EFAULT;
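
(A worked example of the off-by-one, assuming 4K pages: a 2GB memslot at
0x80000000 in a 32bit IPA space has base_gfn = 0x80000 and npages = 0x80000,
while KVM_PHYS_SIZE >> PAGE_SHIFT = 0x100000. The sum is exactly 0x100000,
so the old '>=' rejected a slot ending precisely at the IPA limit, whereas
the relaxed '>' accepts it and still rejects anything crossing the limit.)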
 
-- 
2.29.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

