Re: [PATCH v2 00/11] KVM: arm64: Enable SVE support on nVHE systems

2021-03-19 Thread Mark Brown
On Thu, Mar 18, 2021 at 12:25:21PM +, Marc Zyngier wrote:

> Most of this series only repaints things so that VHE and nVHE look as
> similar as possible, the ZCR_EL2 management being the most visible
> exception. This results in a bunch of preparatory patches that aim at
> making the code slightly more readable.

That readability stuff is definitely helping from my PoV.

Reviewed-by: Mark Brown 

(FWIW, from the SVE side)

> This has been tested on a FVP model with both VHE/nVHE configurations
> using the string tests included with the "optimized-routines"
> library[2].

There's also some tests Dave wrote which got upstreamed in kselftest
now, some normal kselftests and also a stress test that sits and
writes/reads bit patterns into the registers and is pretty good at
picking up any context switch issues.


signature.asc
Description: PGP signature
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v14 07/13] iommu/smmuv3: Implement cache_invalidate

2021-03-19 Thread Auger Eric
Hi Chenxiang,

On 3/4/21 8:55 AM, chenxiang (M) wrote:
> Hi Eric,
> 
> 
> 在 2021/2/24 4:56, Eric Auger 写道:
>> Implement domain-selective, pasid selective and page-selective
>> IOTLB invalidations.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v13 -> v14:
>> - Add domain invalidation
>> - do global inval when asid is not provided with addr
>>   granularity
>>
>> v7 -> v8:
>> - ASID based invalidation using iommu_inv_pasid_info
>> - check ARCHID/PASID flags in addr based invalidation
>> - use __arm_smmu_tlb_inv_context and __arm_smmu_tlb_inv_range_nosync
>>
>> v6 -> v7
>> - check the uapi version
>>
>> v3 -> v4:
>> - adapt to changes in the uapi
>> - add support for leaf parameter
>> - do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context
>>   anymore
>>
>> v2 -> v3:
>> - replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync
>>
>> v1 -> v2:
>> - properly pass the asid
>> ---
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 74 +
>>  1 file changed, 74 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 4c19a1114de4..df3adc49111c 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -2949,6 +2949,79 @@ static void arm_smmu_detach_pasid_table(struct 
>> iommu_domain *domain)
>>  mutex_unlock(_domain->init_mutex);
>>  }
>>  
>> +static int
>> +arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
>> +  struct iommu_cache_invalidate_info *inv_info)
>> +{
>> +struct arm_smmu_cmdq_ent cmd = {.opcode = CMDQ_OP_TLBI_NSNH_ALL};
>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> +struct arm_smmu_device *smmu = smmu_domain->smmu;
>> +
>> +if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
>> +return -EINVAL;
>> +
>> +if (!smmu)
>> +return -EINVAL;
>> +
>> +if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
>> +return -EINVAL;
>> +
>> +if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID ||
>> +inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) {
>> +return -ENOENT;
>> +}
>> +
>> +if (!(inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB))
>> +return -EINVAL;
>> +
>> +/* IOTLB invalidation */
>> +
>> +switch (inv_info->granularity) {
>> +case IOMMU_INV_GRANU_PASID:
>> +{
>> +struct iommu_inv_pasid_info *info =
>> +_info->granu.pasid_info;
>> +
>> +if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
>> +return -ENOENT;
>> +if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID))
>> +return -EINVAL;
>> +
>> +__arm_smmu_tlb_inv_context(smmu_domain, info->archid);
>> +return 0;
>> +}
>> +case IOMMU_INV_GRANU_ADDR:
>> +{
>> +struct iommu_inv_addr_info *info = _info->granu.addr_info;
>> +size_t size = info->nb_granules * info->granule_size;
>> +bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF;
>> +
>> +if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
>> +return -ENOENT;
>> +
>> +if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID))
>> +break;
>> +
>> +arm_smmu_tlb_inv_range_domain(info->addr, size,
>> +  info->granule_size, leaf,
>> +  info->archid, smmu_domain);
> 
> Is it possible to add a check before the function to make sure that
> info->granule_size can be recognized by SMMU?
> There is a scenario which will cause TLBI issue: RIL feature is enabled
> on guest but is disabled on host, and then on
> host it just invalidate 4K/2M/1G once a time, but from QEMU,
> info->nb_granules is always 1 and info->granule_size = size,
> if size is not equal to 4K or 2M or 1G (for example size = granule_size
> is 5M), it will only part of the size it wants to invalidate.
> 
> I think maybe we can add a check here: if RIL is not enabled and also
> size is not the granule_size (4K/2M/1G) supported by
> SMMU hardware, can we just simply use the smallest granule_size
> supported by hardware all the time?
> 
>> +
>> +arm_smmu_cmdq_issue_sync(smmu);
>> +return 0;
>> +}
>> +case IOMMU_INV_GRANU_DOMAIN:
>> +break;
> 
> I check the qemu code
> (https://github.com/eauger/qemu/tree/v5.2.0-2stage-rfcv8), for opcode
> CMD_TLBI_NH_ALL or CMD_TLBI_NSNH_ALL from guest OS
> it calls smmu_inv_notifiers_all() to unamp all notifiers of all mr's,
> but it seems not set event.entry.granularity which i think it should set
> IOMMU_INV_GRAN_ADDR.
this is because IOMMU_INV_GRAN_ADDR = 0. But for clarity I should rather
set it explicitly ;-)
> BTW, for opcode CMD_TLBI_NH_ALL or CMD_TLBI_NSNH_ALL, it needs to unmap
> size = 0x1 on 48bit 

Re: [PATCH v2 05/11] arm64: sve: Provide a conditional update accessor for ZCR_ELx

2021-03-19 Thread Mark Brown
On Fri, Mar 19, 2021 at 04:51:46PM +, Marc Zyngier wrote:
> On 2021-03-19 16:42, Mark Brown wrote:

> > Do compilers actually do much better with this than with a static
> > inline like the other functions in this header?  Seems like something
> > they should be figuring out.

> It's not about performance or anything of the sort: in most cases
> where we end-up using this, it is on the back of an exception.
> So performance is the least of our worries.

> However, the "reg" parameter to read/write_sysreg_s() cannot
> be a variable, because it is directly fed to the assembler.
> If you want to use functions, you need to specialise them per
> register. At this point, I'm pretty happy with a #define.

Ah, that makes sense - it was more of a "that's weird" than "that's
actually a problem", hence the Reviwed-by.


signature.asc
Description: PGP signature
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v3] configure: arm/arm64: Add --earlycon option to set UART type and address

2021-03-19 Thread Alexandru Elisei
Currently, the UART early address is set indirectly with the --vmm option
and there are only two possible values: if the VMM is qemu (the default),
then the UART address is set to 0x0900; if the VMM is kvmtool, then the
UART address is set to 0x3f8.

The upstream kvmtool commit 45b4968e0de1 ("hw/serial: ARM/arm64: Use MMIO
at higher addresses") changed the UART address to 0x100, and
kvm-unit-tests so far hasn't had mechanism to let the user set a specific
address, which means that for recent versions of kvmtool the early UART
won't be available.

This situation will only become worse as kvm-unit-tests gains support to
run as an EFI app, as each platform will have their own UART type and
address.

To address both issues, a new configure option is added, --earlycon. The
syntax and semantics are identical to the kernel parameter with the same
name. For example, for kvmtool, --earlycon=uart,mmio,0x100 will set the
correct UART address. Specifying this option will overwrite the UART
address set by --vmm.

At the moment, the UART type and register width parameters are ignored
since both qemu's and kvmtool's UART emulation use the same offset for the
TX register and no other registers are used by kvm-unit-tests, but the
parameters will become relevant once EFI support is added.

Signed-off-by: Alexandru Elisei 
---
Besides working with current versions of kvmtool, this will also make early
console work if the user specifies a custom memory layout [1] (patches are
old, but I plan to pick them up at some point in the future).

Changes in v3:
* Switched to using IFS and read instead of cut.
* Fixed typo in option description.
* Added check that $addr is a valid number.

Changes in v2:
* kvmtool patches were merged, so I reworked the commit message to point to
  the corresponding kvmtool commit.
* Restricted pl011 register size to 32 bits, as per Arm Base System
  Architecture 1.0 (DEN0094A), and to match Linux.
* Reworked the way the fields are extracted to make it more precise
  (without the -s argument, the entire string is echo'ed when no delimiter
  is found).
* The changes are not trivial, so I dropped Drew's Reviewed-by.

[1] 
https://lore.kernel.org/kvm/1569245722-23375-1-git-send-email-alexandru.eli...@arm.com/

 configure | 53 +
 1 file changed, 53 insertions(+)

diff --git a/configure b/configure
index cdcd34e94030..01a0b262a9f2 100755
--- a/configure
+++ b/configure
@@ -26,6 +26,7 @@ errata_force=0
 erratatxt="$srcdir/errata.txt"
 host_key_document=
 page_size=
+earlycon=
 
 usage() {
 cat <<-EOF
@@ -54,6 +55,18 @@ usage() {
--page-size=PAGE_SIZE
   Specify the page size (translation granule) 
(4k, 16k or
   64k, default is 64k, arm64 only)
+   --earlycon=EARLYCON
+  Specify the UART name, type and address 
(optional, arm and
+  arm64 only). The specified address will 
overwrite the UART
+  address set by the --vmm option. EARLYCON 
can be one of
+  (case sensitive):
+  uart[8250],mmio,ADDR
+  Specify an 8250 compatible UART at address 
ADDR. Supported
+  register stride is 8 bit only.
+  pl011,ADDR
+  pl011,mmio32,ADDR
+  Specify a PL011 compatible UART at address 
ADDR. Supported
+  register stride is 32 bit only.
 EOF
 exit 1
 }
@@ -112,6 +125,9 @@ while [[ "$1" = -* ]]; do
--page-size)
page_size="$arg"
;;
+   --earlycon)
+   earlycon="$arg"
+   ;;
--help)
usage
;;
@@ -170,6 +186,43 @@ elif [ "$arch" = "arm" ] || [ "$arch" = "arm64" ]; then
 echo '--vmm must be one of "qemu" or "kvmtool"!'
 usage
 fi
+
+if [ "$earlycon" ]; then
+IFS=, read -r name type_addr addr <<<"$earlycon"
+if [ "$name" != "uart" ] && [ "$name" != "uart8250" ] &&
+[ "$name" != "pl011" ]; then
+echo "unknown earlycon name: $name"
+usage
+fi
+
+if [ "$name" = "pl011" ]; then
+if [ -z "$addr" ]; then
+addr=$type_addr
+else
+if [ "$type_addr" != "mmio32" ]; then
+echo "unknown $name earlycon type: $type_addr"
+usage
+fi
+fi
+else
+if [ "$type_addr" != "mmio" ]; then
+echo "unknown $name earlycon type: $type_addr"
+usage
+fi
+fi
+
+if [ -z "$addr" ]; then
+echo "missing $name earlycon address"
+usage
+fi
+if [[ $addr =~ ^0(x|X)[0-9a-fA-F]+$ ]] ||
+  

Re: [PATCH v2 05/11] arm64: sve: Provide a conditional update accessor for ZCR_ELx

2021-03-19 Thread Marc Zyngier

On 2021-03-19 16:42, Mark Brown wrote:

On Thu, Mar 18, 2021 at 12:25:26PM +, Marc Zyngier wrote:


A common pattern is to conditionally update ZCR_ELx in order
to avoid the "self-synchronizing" effect that writing to this
register has.

Let's provide an accessor that does exactly this.


Reviewed-by: Mark Brown 


+#define sve_cond_update_zcr_vq(val, reg)   \
+   do {\
+   u64 __zcr = read_sysreg_s((reg));   \
+   u64 __new = __zcr & ~ZCR_ELx_LEN_MASK;  \
+   __new |= (val) & ZCR_ELx_LEN_MASK;  \
+   if (__zcr != __new) \
+   write_sysreg_s(__new, (reg));   \
+   } while (0)
+


Do compilers actually do much better with this than with a static
inline like the other functions in this header?  Seems like something
they should be figuring out.


It's not about performance or anything of the sort: in most cases
where we end-up using this, it is on the back of an exception.
So performance is the least of our worries.

However, the "reg" parameter to read/write_sysreg_s() cannot
be a variable, because it is directly fed to the assembler.
If you want to use functions, you need to specialise them per
register. At this point, I'm pretty happy with a #define.

M.
--
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2 05/11] arm64: sve: Provide a conditional update accessor for ZCR_ELx

2021-03-19 Thread Mark Brown
On Thu, Mar 18, 2021 at 12:25:26PM +, Marc Zyngier wrote:

> A common pattern is to conditionally update ZCR_ELx in order
> to avoid the "self-synchronizing" effect that writing to this
> register has.
> 
> Let's provide an accessor that does exactly this.

Reviewed-by: Mark Brown 

> +#define sve_cond_update_zcr_vq(val, reg) \
> + do {\
> + u64 __zcr = read_sysreg_s((reg));   \
> + u64 __new = __zcr & ~ZCR_ELx_LEN_MASK;  \
> + __new |= (val) & ZCR_ELx_LEN_MASK;  \
> + if (__zcr != __new) \
> + write_sysreg_s(__new, (reg));   \
> + } while (0)
> +

Do compilers actually do much better with this than with a static
inline like the other functions in this header?  Seems like something
they should be figuring out.


signature.asc
Description: PGP signature
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH 3/4] KVM: arm64: Install the block entry before unmapping the page mappings

2021-03-19 Thread Alexandru Elisei
Hi Yanan,

Sorry for taking so long to reply, been busy with other things unfortunately. I
did notice that you sent a new version of this series, but I would like to
continue our discussion on this patch, since it's easier to get the full 
context.

On 3/4/21 7:07 AM, wangyanan (Y) wrote:
> Hi Alex,
>
> On 2021/3/4 1:27, Alexandru Elisei wrote:
>> Hi Yanan,
>>
>> On 3/3/21 11:04 AM, wangyanan (Y) wrote:
>>> Hi Alex,
>>>
>>> On 2021/3/3 1:13, Alexandru Elisei wrote:
 Hello,

 On 2/8/21 11:22 AM, Yanan Wang wrote:
> When KVM needs to coalesce the normal page mappings into a block mapping,
> we currently invalidate the old table entry first followed by invalidation
> of TLB, then unmap the page mappings, and install the block entry at last.
>
> It will cost a long time to unmap the numerous page mappings, which means
> there will be a long period when the table entry can be found invalid.
> If other vCPUs access any guest page within the block range and find the
> table entry invalid, they will all exit from guest with a translation 
> fault
> which is not necessary. And KVM will make efforts to handle these faults,
> especially when performing CMOs by block range.
>
> So let's quickly install the block entry at first to ensure uninterrupted
> memory access of the other vCPUs, and then unmap the page mappings after
> installation. This will reduce most of the time when the table entry is
> invalid, and avoid most of the unnecessary translation faults.
 I'm not convinced I've fully understood what is going on yet, but it seems 
 to me
 that the idea is sound. Some questions and comments below.
>>> What I am trying to do in this patch is to adjust the order of rebuilding 
>>> block
>>> mappings from page mappings.
>>> Take the rebuilding of 1G block mappings as an example.
>>> Before this patch, the order is like:
>>> 1) invalidate the table entry of the 1st level(PUD)
>>> 2) flush TLB by VMID
>>> 3) unmap the old PMD/PTE tables
>>> 4) install the new block entry to the 1st level(PUD)
>>>
>>> So entry in the 1st level can be found invalid by other vcpus in 1), 2), 
>>> and 3),
>>> and it's a long time in 3) to unmap
>>> the numerous old PMD/PTE tables, which means the total time of the entry 
>>> being
>>> invalid is long enough to
>>> affect the performance.
>>>
>>> After this patch, the order is like:
>>> 1) invalidate the table ebtry of the 1st level(PUD)
>>> 2) flush TLB by VMID
>>> 3) install the new block entry to the 1st level(PUD)
>>> 4) unmap the old PMD/PTE tables
>>>
>>> The change ensures that period of entry in the 1st level(PUD) being invalid 
>>> is
>>> only in 1) and 2),
>>> so if other vcpus access memory within 1G, there will be less chance to 
>>> find the
>>> entry invalid
>>> and as a result trigger an unnecessary translation fault.
>> Thank you for the explanation, that was my understand of it also, and I 
>> believe
>> your idea is correct. I was more concerned that I got some of the details 
>> wrong,
>> and you have kindly corrected me below.
>>
> Signed-off-by: Yanan Wang 
> ---
>    arch/arm64/kvm/hyp/pgtable.c | 26 --
>    1 file changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 78a560446f80..308c36b9cd21 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -434,6 +434,7 @@ struct stage2_map_data {
>    kvm_pte_t    attr;
>      kvm_pte_t    *anchor;
> +    kvm_pte_t    *follow;
>      struct kvm_s2_mmu    *mmu;
>    struct kvm_mmu_memory_cache    *memcache;
> @@ -553,15 +554,14 @@ static int stage2_map_walk_table_pre(u64 addr, u64 
> end,
> u32 level,
>    if (!kvm_block_mapping_supported(addr, end, data->phys, level))
>    return 0;
>    -    kvm_set_invalid_pte(ptep);
> -
>    /*
> - * Invalidate the whole stage-2, as we may have numerous leaf
> - * entries below us which would otherwise need invalidating
> - * individually.
> + * If we need to coalesce existing table entries into a block here,
> + * then install the block entry first and the sub-level page mappings
> + * will be unmapped later.
>     */
> -    kvm_call_hyp(__kvm_tlb_flush_vmid, data->mmu);
>    data->anchor = ptep;
> +    data->follow = kvm_pte_follow(*ptep);
> +    stage2_coalesce_tables_into_block(addr, level, ptep, data);
 Here's how stage2_coalesce_tables_into_block() is implemented from the 
 previous
 patch (it might be worth merging it with this patch, I found it impossible 
 to
 judge if the function is correct without seeing how it is used and what is
 replacing):
>>> Ok, will do this if v2 is going to be post.
 static void 

Re: [PATCH v3 5/5] KVM: arm64: Log source when panicking from nVHE hyp

2021-03-19 Thread Andrew Scull
> > > diff --git a/arch/arm64/include/asm/kvm_mmu.h 
> > > b/arch/arm64/include/asm/kvm_mmu.h
> > > index 90873851f677..7c17a67d2291 100644
> > > --- a/arch/arm64/include/asm/kvm_mmu.h
> > > +++ b/arch/arm64/include/asm/kvm_mmu.h
> > > @@ -121,6 +121,8 @@ void kvm_update_va_mask(struct alt_instr *alt,
> > >  void kvm_compute_layout(void);
> > >  void kvm_apply_hyp_relocations(void);
> > >
> > > +#define __hyp_pa(x) (((phys_addr_t)(x)) + hyp_physvirt_offset)
> >
> > Just a heads up: but Quentin's series moves this same macro, but to a
> > different header (arch/arm64/kvm/hyp/include/nvhe/memory.h)
>
> I can make sure we're putting it in the same header to ease the merge.

Seems I was too optimistic since nvhe/memory.h can only be used from
nVHE hyp. I could rebase no top of Quentin to move the macro somewhere
that can be used from here, but that would add a series dependency
that you usually seem to try and avoid so I'm not going to change
this, unless you'd prefer that I do.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 38/38] KVM: arm64: Protect the .hyp sections from the host

2021-03-19 Thread Quentin Perret
When KVM runs in nVHE protected mode, use the host stage 2 to unmap the
hypervisor sections by marking them as owned by the hypervisor itself.
The long-term goal is to ensure the EL2 code can remain robust
regardless of the host's state, so this starts by making sure the host
cannot e.g. write to the .hyp sections directly.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h  |  1 +
 arch/arm64/kvm/arm.c  | 46 +++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  2 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  9 
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 33 +
 5 files changed, 91 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4149283b4cd1..cf8df032b9c3 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -62,6 +62,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping17
 #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector18
 #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize 19
+#define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp  20
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d237c378e6fb..368159021dee 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1899,11 +1899,57 @@ void _kvm_host_prot_finalize(void *discard)
WARN_ON(kvm_call_hyp_nvhe(__pkvm_prot_finalize));
 }
 
+static inline int pkvm_mark_hyp(phys_addr_t start, phys_addr_t end)
+{
+   return kvm_call_hyp_nvhe(__pkvm_mark_hyp, start, end);
+}
+
+#define pkvm_mark_hyp_section(__section)   \
+   pkvm_mark_hyp(__pa_symbol(__section##_start),   \
+   __pa_symbol(__section##_end))
+
 static int finalize_hyp_mode(void)
 {
+   int cpu, ret;
+
if (!is_protected_kvm_enabled())
return 0;
 
+   ret = pkvm_mark_hyp_section(__hyp_idmap_text);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp_section(__hyp_text);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp_section(__hyp_rodata);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp_section(__hyp_bss);
+   if (ret)
+   return ret;
+
+   ret = pkvm_mark_hyp(hyp_mem_base, hyp_mem_base + hyp_mem_size);
+   if (ret)
+   return ret;
+
+   for_each_possible_cpu(cpu) {
+   phys_addr_t start = virt_to_phys((void 
*)kvm_arm_hyp_percpu_base[cpu]);
+   phys_addr_t end = start + (PAGE_SIZE << nvhe_percpu_order());
+
+   ret = pkvm_mark_hyp(start, end);
+   if (ret)
+   return ret;
+
+   start = virt_to_phys((void *)per_cpu(kvm_arm_hyp_stack_page, 
cpu));
+   end = start + PAGE_SIZE;
+   ret = pkvm_mark_hyp(start, end);
+   if (ret)
+   return ret;
+   }
+
/*
 * Flip the static key upfront as that may no longer be possible
 * once the host stage 2 is installed.
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h 
b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index d293cb328cc4..42d81ec739fa 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -21,6 +21,8 @@ struct host_kvm {
 extern struct host_kvm host_kvm;
 
 int __pkvm_prot_finalize(void);
+int __pkvm_mark_hyp(phys_addr_t start, phys_addr_t end);
+
 int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool);
 void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c 
b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 69163f2cbb63..b4eaa7ef13e0 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -156,6 +156,14 @@ static void handle___pkvm_prot_finalize(struct 
kvm_cpu_context *host_ctxt)
 {
cpu_reg(host_ctxt, 1) = __pkvm_prot_finalize();
 }
+
+static void handle___pkvm_mark_hyp(struct kvm_cpu_context *host_ctxt)
+{
+   DECLARE_REG(phys_addr_t, start, host_ctxt, 1);
+   DECLARE_REG(phys_addr_t, end, host_ctxt, 2);
+
+   cpu_reg(host_ctxt, 1) = __pkvm_mark_hyp(start, end);
+}
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x) [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -180,6 +188,7 @@ static const hcall_t host_hcall[] = {
HANDLE_FUNC(__pkvm_create_mappings),
HANDLE_FUNC(__pkvm_create_private_mapping),
HANDLE_FUNC(__pkvm_prot_finalize),
+   HANDLE_FUNC(__pkvm_mark_hyp),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c 
b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 77b48c47344d..808e2471091b 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ 

[PATCH v6 37/38] KVM: arm64: Disable PMU support in protected mode

2021-03-19 Thread Quentin Perret
The host currently writes directly in EL2 per-CPU data sections from
the PMU code when running in nVHE. In preparation for unmapping the EL2
sections from the host stage 2, disable PMU support in protected mode as
we currently do not have a use-case for it.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/perf.c | 3 ++-
 arch/arm64/kvm/pmu.c  | 8 
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
index 739164324afe..8f860ae56bb7 100644
--- a/arch/arm64/kvm/perf.c
+++ b/arch/arm64/kvm/perf.c
@@ -55,7 +55,8 @@ int kvm_perf_init(void)
 * hardware performance counters. This could ensure the presence of
 * a physical PMU and CONFIG_PERF_EVENT is selected.
 */
-   if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0)
+   if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0
+  && !is_protected_kvm_enabled())
static_branch_enable(_arm_pmu_available);
 
return perf_register_guest_info_callbacks(_guest_cbs);
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index faf32a44ba04..03a6c1f4a09a 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -33,7 +33,7 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
 {
struct kvm_host_data *ctx = this_cpu_ptr_hyp_sym(kvm_host_data);
 
-   if (!ctx || !kvm_pmu_switch_needed(attr))
+   if (!kvm_arm_support_pmu_v3() || !ctx || !kvm_pmu_switch_needed(attr))
return;
 
if (!attr->exclude_host)
@@ -49,7 +49,7 @@ void kvm_clr_pmu_events(u32 clr)
 {
struct kvm_host_data *ctx = this_cpu_ptr_hyp_sym(kvm_host_data);
 
-   if (!ctx)
+   if (!kvm_arm_support_pmu_v3() || !ctx)
return;
 
ctx->pmu_events.events_host &= ~clr;
@@ -172,7 +172,7 @@ void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
struct kvm_host_data *host;
u32 events_guest, events_host;
 
-   if (!has_vhe())
+   if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
 
preempt_disable();
@@ -193,7 +193,7 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
struct kvm_host_data *host;
u32 events_guest, events_host;
 
-   if (!has_vhe())
+   if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
 
host = this_cpu_ptr_hyp_sym(kvm_host_data);
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 36/38] KVM: arm64: Page-align the .hyp sections

2021-03-19 Thread Quentin Perret
We will soon unmap the .hyp sections from the host stage 2 in Protected
nVHE mode, which obviously works with at least page granularity, so make
sure to align them correctly.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kernel/vmlinux.lds.S | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index e96173ce211b..709d2c433c5e 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -15,9 +15,11 @@
 
 #define HYPERVISOR_DATA_SECTIONS   \
HYP_SECTION_NAME(.rodata) : {   \
+   . = ALIGN(PAGE_SIZE);   \
__hyp_rodata_start = .; \
*(HYP_SECTION_NAME(.data..ro_after_init))   \
*(HYP_SECTION_NAME(.rodata))\
+   . = ALIGN(PAGE_SIZE);   \
__hyp_rodata_end = .;   \
}
 
@@ -72,21 +74,14 @@ ENTRY(_text)
 jiffies = jiffies_64;
 
 #define HYPERVISOR_TEXT\
-   /*  \
-* Align to 4 KB so that\
-* a) the HYP vector table is at its minimum\
-*alignment of 2048 bytes   \
-* b) the HYP init code will not cross a page   \
-*boundary if its size does not exceed  \
-*4 KB (see related ASSERT() below) \
-*/ \
-   . = ALIGN(SZ_4K);   \
+   . = ALIGN(PAGE_SIZE);   \
__hyp_idmap_text_start = .; \
*(.hyp.idmap.text)  \
__hyp_idmap_text_end = .;   \
__hyp_text_start = .;   \
*(.hyp.text)\
HYPERVISOR_EXTABLE  \
+   . = ALIGN(PAGE_SIZE);   \
__hyp_text_end = .;
 
 #define IDMAP_TEXT \
@@ -322,11 +317,12 @@ SECTIONS
 #include "image-vars.h"
 
 /*
- * The HYP init code and ID map text can't be longer than a page each,
- * and should not cross a page boundary.
+ * The HYP init code and ID map text can't be longer than a page each. The
+ * former is page-aligned, but the latter may not be with 16K or 64K pages, so
+ * it should also not cross a page boundary.
  */
-ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
-   "HYP init code too big or misaligned")
+ASSERT(__hyp_idmap_text_end - __hyp_idmap_text_start <= PAGE_SIZE,
+   "HYP init code too big")
 ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
"ID map text too big or misaligned")
 #ifdef CONFIG_HIBERNATION
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 33/38] KVM: arm64: Introduce KVM_PGTABLE_S2_IDMAP stage 2 flag

2021-03-19 Thread Quentin Perret
Introduce a new stage 2 configuration flag to specify that all mappings
in a given page-table will be identity-mapped, as will be the case for
the host. This allows to introduce sanity checks in the map path and to
avoid programming errors.

Suggested-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 2 ++
 arch/arm64/kvm/hyp/pgtable.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 55452f4831d2..c3674c47d48c 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -60,9 +60,11 @@ struct kvm_pgtable_mm_ops {
  * enum kvm_pgtable_stage2_flags - Stage-2 page-table flags.
  * @KVM_PGTABLE_S2_NOFWB:  Don't enforce Normal-WB even if the CPUs have
  * ARM64_HAS_STAGE2_FWB.
+ * @KVM_PGTABLE_S2_IDMAP:  Only use identity mappings.
  */
 enum kvm_pgtable_stage2_flags {
KVM_PGTABLE_S2_NOFWB= BIT(0),
+   KVM_PGTABLE_S2_IDMAP= BIT(1),
 };
 
 /**
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index b22b4860630c..c37c1dc4feaf 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -760,6 +760,9 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 
addr, u64 size,
.arg= _data,
};
 
+   if (WARN_ON((pgt->flags & KVM_PGTABLE_S2_IDMAP) && (addr != phys)))
+   return -EINVAL;
+
ret = stage2_set_prot_attr(pgt, prot, _data.attr);
if (ret)
return ret;
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 19/38] KVM: arm64: Use kvm_arch for stage 2 pgtable

2021-03-19 Thread Quentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2,
use struct kvm_arch in lieu of struct kvm as the host will have the
former but not the latter.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 5 +++--
 arch/arm64/kvm/hyp/pgtable.c | 6 +++---
 arch/arm64/kvm/mmu.c | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index bf7a3cc49420..7945ec87eaec 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -162,12 +162,13 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
addr, u64 size, u64 phys,
 /**
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
- * @kvm:   KVM structure representing the guest virtual machine.
+ * @arch:  Arch-specific KVM structure representing the guest virtual
+ * machine.
  * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 82aca35a22f6..ea95bbc6ba80 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -880,11 +880,11 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 
addr, u64 size)
return kvm_pgtable_walk(pgt, addr, size, );
 }
 
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
struct kvm_pgtable_mm_ops *mm_ops)
 {
size_t pgd_sz;
-   u64 vtcr = kvm->arch.vtcr;
+   u64 vtcr = arch->vtcr;
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
@@ -897,7 +897,7 @@ int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct 
kvm *kvm,
pgt->ia_bits= ia_bits;
pgt->start_level= start_level;
pgt->mm_ops = mm_ops;
-   pgt->mmu= >arch.mmu;
+   pgt->mmu= >mmu;
 
/* Ensure zeroed PGD pages are visible to the hardware walker */
dsb(ishst);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index de0ad79d2c90..d6eb1fb21232 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -457,7 +457,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu 
*mmu)
if (!pgt)
return -ENOMEM;
 
-   err = kvm_pgtable_stage2_init(pgt, kvm, _s2_mm_ops);
+   err = kvm_pgtable_stage2_init(pgt, >arch, _s2_mm_ops);
if (err)
goto out_free_pgtable;
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 34/38] KVM: arm64: Provide sanitized mmfr* registers at EL2

2021-03-19 Thread Quentin Perret
We will need to read sanitized values of mmfr{0,1}_el1 at EL2 soon, so
add them to the list of copied variables.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_cpufeature.h | 2 ++
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c   | 2 ++
 arch/arm64/kvm/sys_regs.c   | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_cpufeature.h 
b/arch/arm64/include/asm/kvm_cpufeature.h
index c2e7735f502b..ff302d15e840 100644
--- a/arch/arm64/include/asm/kvm_cpufeature.h
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -20,5 +20,7 @@
 #endif
 
 DECLARE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
+DECLARE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_id_aa64mmfr0_el1);
+DECLARE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_id_aa64mmfr1_el1);
 
 #endif
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c 
b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
index 71f00aca90e7..17ad1b3a9530 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
@@ -13,6 +13,8 @@
  * Copies of the host's CPU features registers holding sanitized values.
  */
 DEFINE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
+DEFINE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_id_aa64mmfr0_el1);
+DEFINE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_id_aa64mmfr1_el1);
 
 /*
  * nVHE copy of data structures tracking available CPU cores.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 3ec34c25e877..dfb3b4f9ca84 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2784,6 +2784,8 @@ struct __ftr_reg_copy_entry {
struct arm64_ftr_reg*dst;
 } hyp_ftr_regs[] __initdata = {
CPU_FTR_REG_HYP_COPY(SYS_CTR_EL0, arm64_ftr_reg_ctrel0),
+   CPU_FTR_REG_HYP_COPY(SYS_ID_AA64MMFR0_EL1, 
arm64_ftr_reg_id_aa64mmfr0_el1),
+   CPU_FTR_REG_HYP_COPY(SYS_ID_AA64MMFR1_EL1, 
arm64_ftr_reg_id_aa64mmfr1_el1),
 };
 
 void __init setup_kvm_el2_caps(void)
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 32/38] KVM: arm64: Introduce KVM_PGTABLE_S2_NOFWB stage 2 flag

2021-03-19 Thread Quentin Perret
In order to further configure stage 2 page-tables, pass flags to the
init function using a new enum.

The first of these flags allows to disable FWB even if the hardware
supports it as we will need to do so for the host stage 2.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h  | 43 +---
 arch/arm64/include/asm/pgtable-prot.h |  4 +-
 arch/arm64/kvm/hyp/pgtable.c  | 56 +++
 3 files changed, 62 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index e1fed14aee17..55452f4831d2 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -56,6 +56,15 @@ struct kvm_pgtable_mm_ops {
phys_addr_t (*virt_to_phys)(void *addr);
 };
 
+/**
+ * enum kvm_pgtable_stage2_flags - Stage-2 page-table flags.
+ * @KVM_PGTABLE_S2_NOFWB:  Don't enforce Normal-WB even if the CPUs have
+ * ARM64_HAS_STAGE2_FWB.
+ */
+enum kvm_pgtable_stage2_flags {
+   KVM_PGTABLE_S2_NOFWB= BIT(0),
+};
+
 /**
  * struct kvm_pgtable - KVM page-table.
  * @ia_bits:   Maximum input address size, in bits.
@@ -72,6 +81,7 @@ struct kvm_pgtable {
 
/* Stage-2 only */
struct kvm_s2_mmu   *mmu;
+   enum kvm_pgtable_stage2_flags   flags;
 };
 
 /**
@@ -196,20 +206,25 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
addr, u64 size, u64 phys,
 u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
 
 /**
- * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
+ * kvm_pgtable_stage2_init_flags() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @arch:  Arch-specific KVM structure representing the guest virtual
  * machine.
  * @mm_ops:Memory management callbacks.
+ * @flags: Stage-2 configuration flags.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_arch *arch,
-   struct kvm_pgtable_mm_ops *mm_ops);
+int kvm_pgtable_stage2_init_flags(struct kvm_pgtable *pgt, struct kvm_arch 
*arch,
+ struct kvm_pgtable_mm_ops *mm_ops,
+ enum kvm_pgtable_stage2_flags flags);
+
+#define kvm_pgtable_stage2_init(pgt, arch, mm_ops) \
+   kvm_pgtable_stage2_init_flags(pgt, arch, mm_ops, 0)
 
 /**
  * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
- * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init*().
  *
  * The page-table is assumed to be unreachable by any hardware walkers prior
  * to freeing and therefore no TLB invalidation is performed.
@@ -218,7 +233,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
 
 /**
  * kvm_pgtable_stage2_map() - Install a mapping in a guest stage-2 page-table.
- * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:  Intermediate physical address at which to place the mapping.
  * @size:  Size of the mapping.
  * @phys:  Physical address of the memory to map.
@@ -251,7 +266,7 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 
addr, u64 size,
 /**
  * kvm_pgtable_stage2_set_owner() - Unmap and annotate pages in the IPA space 
to
  * track ownership.
- * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:  Base intermediate physical address to annotate.
  * @size:  Size of the annotated range.
  * @mc:Cache of pre-allocated and zeroed memory from which to 
allocate
@@ -270,7 +285,7 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, 
u64 addr, u64 size,
 
 /**
  * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 
page-table.
- * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:  Intermediate physical address from which to remove the mapping.
  * @size:  Size of the mapping.
  *
@@ -290,7 +305,7 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 
addr, u64 size);
 /**
  * kvm_pgtable_stage2_wrprotect() - Write-protect guest stage-2 address range
  *  without TLB invalidation.
- * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init*().
  * @addr:  Intermediate physical address from which to write-protect,
  * @size:  Size of the range.
  *

[PATCH v6 26/38] KVM: arm64: Reserve memory for host stage 2

2021-03-19 Thread Quentin Perret
Extend the memory pool allocated for the hypervisor to include enough
pages to map all of memory at page granularity for the host stage 2.
While at it, also reserve some memory for device mappings.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/mm.h | 27 ++-
 arch/arm64/kvm/hyp/nvhe/setup.c  | 12 
 arch/arm64/kvm/hyp/reserved_mem.c|  2 ++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h 
b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index ac0f7fcffd08..0095f6289742 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -53,7 +53,7 @@ static inline unsigned long __hyp_pgtable_max_pages(unsigned 
long nr_pages)
return total;
 }
 
-static inline unsigned long hyp_s1_pgtable_pages(void)
+static inline unsigned long __hyp_pgtable_total_pages(void)
 {
unsigned long res = 0, i;
 
@@ -63,9 +63,34 @@ static inline unsigned long hyp_s1_pgtable_pages(void)
res += __hyp_pgtable_max_pages(reg->size >> PAGE_SHIFT);
}
 
+   return res;
+}
+
+static inline unsigned long hyp_s1_pgtable_pages(void)
+{
+   unsigned long res;
+
+   res = __hyp_pgtable_total_pages();
+
/* Allow 1 GiB for private mappings */
res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
 
return res;
 }
+
+static inline unsigned long host_s2_mem_pgtable_pages(void)
+{
+   /*
+* Include an extra 16 pages to safely upper-bound the worst case of
+* concatenated pgds.
+*/
+   return __hyp_pgtable_total_pages() + 16;
+}
+
+static inline unsigned long host_s2_dev_pgtable_pages(void)
+{
+   /* Allow 1 GiB for MMIO mappings */
+   return __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
+}
+
 #endif /* __KVM_HYP_MM_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 1e8bcd8b0299..c1a3e7e0ebbc 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -24,6 +24,8 @@ unsigned long hyp_nr_cpus;
 
 static void *vmemmap_base;
 static void *hyp_pgt_base;
+static void *host_s2_mem_pgt_base;
+static void *host_s2_dev_pgt_base;
 
 static int divide_memory_pool(void *virt, unsigned long size)
 {
@@ -42,6 +44,16 @@ static int divide_memory_pool(void *virt, unsigned long size)
if (!hyp_pgt_base)
return -ENOMEM;
 
+   nr_pages = host_s2_mem_pgtable_pages();
+   host_s2_mem_pgt_base = hyp_early_alloc_contig(nr_pages);
+   if (!host_s2_mem_pgt_base)
+   return -ENOMEM;
+
+   nr_pages = host_s2_dev_pgtable_pages();
+   host_s2_dev_pgt_base = hyp_early_alloc_contig(nr_pages);
+   if (!host_s2_dev_pgt_base)
+   return -ENOMEM;
+
return 0;
 }
 
diff --git a/arch/arm64/kvm/hyp/reserved_mem.c 
b/arch/arm64/kvm/hyp/reserved_mem.c
index 9bc6a6d27904..fd42705a3c26 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -52,6 +52,8 @@ void __init kvm_hyp_reserve(void)
}
 
hyp_mem_pages += hyp_s1_pgtable_pages();
+   hyp_mem_pages += host_s2_mem_pgtable_pages();
+   hyp_mem_pages += host_s2_dev_pgtable_pages();
 
/*
 * The hyp_vmemmap needs to be backed by pages, but these pages
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 31/38] KVM: arm64: Add kvm_pgtable_stage2_find_range()

2021-03-19 Thread Quentin Perret
Since the host stage 2 will be identity mapped, and since it will own
most of memory, it would preferable for performance to try and use large
block mappings whenever that is possible. To ease this, introduce a new
helper in the KVM page-table code which allows to search for large
ranges of available IPA space. This will be used in the host memory
abort path to greedily idmap large portion of the PA space.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 29 +
 arch/arm64/kvm/hyp/pgtable.c | 89 ++--
 2 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index eea2e2b0acaa..e1fed14aee17 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -94,6 +94,16 @@ enum kvm_pgtable_prot {
 #define PAGE_HYP_RO(KVM_PGTABLE_PROT_R)
 #define PAGE_HYP_DEVICE(PAGE_HYP | KVM_PGTABLE_PROT_DEVICE)
 
+/**
+ * struct kvm_mem_range - Range of Intermediate Physical Addresses
+ * @start: Start of the range.
+ * @end:   End of the range.
+ */
+struct kvm_mem_range {
+   u64 start;
+   u64 end;
+};
+
 /**
  * enum kvm_pgtable_walk_flags - Flags to control a depth-first page-table 
walk.
  * @KVM_PGTABLE_WALK_LEAF: Visit leaf entries, including invalid
@@ -397,4 +407,23 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 
addr, u64 size);
 int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size,
 struct kvm_pgtable_walker *walker);
 
+/**
+ * kvm_pgtable_stage2_find_range() - Find a range of Intermediate Physical
+ *  Addresses with compatible permission
+ *  attributes.
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @addr:  Address that must be covered by the range.
+ * @prot:  Protection attributes that the range must be compatible with.
+ * @range: Range structure used to limit the search space at call time and
+ * that will hold the result.
+ *
+ * The offset of @addr within a page is ignored. An IPA is compatible with 
@prot
+ * iff its corresponding stage-2 page-table entry has default ownership and, if
+ * valid, is mapped with protection attributes identical to @prot.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kvm_pgtable_stage2_find_range(struct kvm_pgtable *pgt, u64 addr,
+ enum kvm_pgtable_prot prot,
+ struct kvm_mem_range *range);
 #endif /* __ARM64_KVM_PGTABLE_H__ */
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index f4a514a2e7ae..dc6ef2cfe3eb 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -48,6 +48,8 @@
 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
 KVM_PTE_LEAF_ATTR_HI_S2_XN)
 
+#define KVM_PTE_LEAF_ATTR_S2_IGNORED   GENMASK(58, 55)
+
 #define KVM_INVALID_PTE_OWNER_MASK GENMASK(63, 56)
 #define KVM_MAX_OWNER_ID   1
 
@@ -77,15 +79,20 @@ static bool kvm_phys_is_valid(u64 phys)
return phys < 
BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_PARANGE_MAX));
 }
 
-static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
+static bool kvm_level_supports_block_mapping(u32 level)
 {
-   u64 granule = kvm_granule_size(level);
-
/*
 * Reject invalid block mappings and don't bother with 4TB mappings for
 * 52-bit PAs.
 */
-   if (level == 0 || (PAGE_SIZE != SZ_4K && level == 1))
+   return !(level == 0 || (PAGE_SIZE != SZ_4K && level == 1));
+}
+
+static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
+{
+   u64 granule = kvm_granule_size(level);
+
+   if (!kvm_level_supports_block_mapping(level))
return false;
 
if (granule > (end - addr))
@@ -1053,3 +1060,77 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
pgt->mm_ops->free_pages_exact(pgt->pgd, pgd_sz);
pgt->pgd = NULL;
 }
+
+#define KVM_PTE_LEAF_S2_COMPAT_MASK(KVM_PTE_LEAF_ATTR_S2_PERMS | \
+KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR | \
+KVM_PTE_LEAF_ATTR_S2_IGNORED)
+
+static int stage2_check_permission_walker(u64 addr, u64 end, u32 level,
+ kvm_pte_t *ptep,
+ enum kvm_pgtable_walk_flags flag,
+ void * const arg)
+{
+   kvm_pte_t old_attr, pte = *ptep, *new_attr = arg;
+
+   /*
+* Compatible mappings are either invalid and owned by the page-table
+* owner (whose id is 0), or valid with matching permission attributes.
+*/
+   

[PATCH v6 18/38] KVM: arm64: Elevate hypervisor mappings creation at EL2

2021-03-19 Thread Quentin Perret
Previous commits have introduced infrastructure to enable the EL2 code
to manage its own stage 1 mappings. However, this was preliminary work,
and none of it is currently in use.

Put all of this together by elevating the mapping creation at EL2 when
memory protection is enabled. In this case, the host kernel running
at EL1 still creates _temporary_ EL2 mappings, only used while
initializing the hypervisor, but frees them right after.

As such, all calls to create_hyp_mappings() after kvm init has finished
turn into hypercalls, as the host now has no 'legal' way to modify the
hypevisor page tables directly.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h |  2 +-
 arch/arm64/kvm/arm.c | 87 +---
 arch/arm64/kvm/mmu.c | 43 ++--
 3 files changed, 120 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 5c42ec023cc7..ce02a4052dcf 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -166,7 +166,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu);
 
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
-int kvm_mmu_init(void);
+int kvm_mmu_init(u32 *hyp_va_bits);
 
 static inline void *__kvm_vector_slot2addr(void *base,
   enum arm64_hyp_spectre_vector slot)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e2c471117bff..d93ea0b82491 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1426,7 +1426,7 @@ static void cpu_prepare_hyp_mode(int cpu)
kvm_flush_dcache_to_poc(params, sizeof(*params));
 }
 
-static void cpu_init_hyp_mode(void)
+static void hyp_install_host_vector(void)
 {
struct kvm_nvhe_init_params *params;
struct arm_smccc_res res;
@@ -1444,6 +1444,11 @@ static void cpu_init_hyp_mode(void)
params = this_cpu_ptr_nvhe_sym(kvm_init_params);
arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), 
virt_to_phys(params), );
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   hyp_install_host_vector();
 
/*
 * Disabling SSBD on a non-VHE system requires us to enable SSBS
@@ -1486,7 +1491,10 @@ static void cpu_set_hyp_vector(void)
struct bp_hardening_data *data = this_cpu_ptr(_hardening_data);
void *vector = hyp_spectre_vector_selector[data->slot];
 
-   *this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+   if (!is_protected_kvm_enabled())
+   *this_cpu_ptr_hyp_sym(kvm_hyp_vector) = (unsigned long)vector;
+   else
+   kvm_call_hyp_nvhe(__pkvm_cpu_set_vector, data->slot);
 }
 
 static void cpu_hyp_reinit(void)
@@ -1494,13 +1502,14 @@ static void cpu_hyp_reinit(void)

kvm_init_host_cpu_context(_cpu_ptr_hyp_sym(kvm_host_data)->host_ctxt);
 
cpu_hyp_reset();
-   cpu_set_hyp_vector();
 
if (is_kernel_in_hyp_mode())
kvm_timer_init_vhe();
else
cpu_init_hyp_mode();
 
+   cpu_set_hyp_vector();
+
kvm_arm_init_debug();
 
if (vgic_present)
@@ -1696,18 +1705,59 @@ static void teardown_hyp_mode(void)
}
 }
 
+static int do_pkvm_init(u32 hyp_va_bits)
+{
+   void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base);
+   int ret;
+
+   preempt_disable();
+   hyp_install_host_vector();
+   ret = kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base, hyp_mem_size,
+   num_possible_cpus(), kern_hyp_va(per_cpu_base),
+   hyp_va_bits);
+   preempt_enable();
+
+   return ret;
+}
+
+static int kvm_hyp_init_protection(u32 hyp_va_bits)
+{
+   void *addr = phys_to_virt(hyp_mem_base);
+   int ret;
+
+   ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
+   if (ret)
+   return ret;
+
+   ret = do_pkvm_init(hyp_va_bits);
+   if (ret)
+   return ret;
+
+   free_hyp_pgds();
+
+   return 0;
+}
+
 /**
  * Inits Hyp-mode on all online CPUs
  */
 static int init_hyp_mode(void)
 {
+   u32 hyp_va_bits;
int cpu;
-   int err = 0;
+   int err = -ENOMEM;
+
+   /*
+* The protected Hyp-mode cannot be initialized if the memory pool
+* allocation has failed.
+*/
+   if (is_protected_kvm_enabled() && !hyp_mem_base)
+   goto out_err;
 
/*
 * Allocate Hyp PGD and setup Hyp identity mapping
 */
-   err = kvm_mmu_init();
+   err = kvm_mmu_init(_va_bits);
if (err)
goto out_err;
 
@@ -1823,6 +1873,14 @@ static int init_hyp_mode(void)
goto out_err;
}
 
+   if (is_protected_kvm_enabled()) {
+   err = kvm_hyp_init_protection(hyp_va_bits);
+   if (err) {
+   kvm_err("Failed 

[PATCH v6 14/38] KVM: arm64: Provide __flush_dcache_area at EL2

2021-03-19 Thread Quentin Perret
We will need to do cache maintenance at EL2 soon, so compile a copy of
__flush_dcache_area at EL2, and provide a copy of arm64_ftr_reg_ctrel0
as it is needed by the read_ctr macro.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_cpufeature.h |  2 ++
 arch/arm64/kvm/hyp/nvhe/Makefile|  3 ++-
 arch/arm64/kvm/hyp/nvhe/cache.S | 13 +
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c   |  6 ++
 arch/arm64/kvm/sys_regs.c   |  1 +
 5 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/nvhe/cache.S

diff --git a/arch/arm64/include/asm/kvm_cpufeature.h 
b/arch/arm64/include/asm/kvm_cpufeature.h
index 3d245f96a9fe..c2e7735f502b 100644
--- a/arch/arm64/include/asm/kvm_cpufeature.h
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -19,4 +19,6 @@
 #define DEFINE_KVM_HYP_CPU_FTR_REG(name) BUILD_BUG()
 #endif
 
+DECLARE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
+
 #endif
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 6894a917f290..42dde4bb80b1 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -13,7 +13,8 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
+cache.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/cache.S b/arch/arm64/kvm/hyp/nvhe/cache.S
new file mode 100644
index ..36cef6915428
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/cache.S
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Code copied from arch/arm64/mm/cache.S.
+ */
+
+#include 
+#include 
+#include 
+
+SYM_FUNC_START_PI(__flush_dcache_area)
+   dcache_by_line_op civac, sy, x0, x1, x2, x3
+   ret
+SYM_FUNC_END_PI(__flush_dcache_area)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c 
b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
index 879559057dee..71f00aca90e7 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c
@@ -5,9 +5,15 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
+/*
+ * Copies of the host's CPU features registers holding sanitized values.
+ */
+DEFINE_KVM_HYP_CPU_FTR_REG(arm64_ftr_reg_ctrel0);
+
 /*
  * nVHE copy of data structures tracking available CPU cores.
  * Only entries for CPUs that were online at KVM init are populated.
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6c5d133689ae..3ec34c25e877 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2783,6 +2783,7 @@ struct __ftr_reg_copy_entry {
u32 sys_id;
struct arm64_ftr_reg*dst;
 } hyp_ftr_regs[] __initdata = {
+   CPU_FTR_REG_HYP_COPY(SYS_CTR_EL0, arm64_ftr_reg_ctrel0),
 };
 
 void __init setup_kvm_el2_caps(void)
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 35/38] KVM: arm64: Wrap the host with a stage 2

2021-03-19 Thread Quentin Perret
When KVM runs in protected nVHE mode, make use of a stage 2 page-table
to give the hypervisor some control over the host memory accesses. The
host stage 2 is created lazily using large block mappings if possible,
and will default to page mappings in absence of a better solution.

>From this point on, memory accesses from the host to protected memory
regions (e.g. not 'owned' by the host) are fatal and lead to hyp_panic().

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h  |   1 +
 arch/arm64/kernel/image-vars.h|   3 +
 arch/arm64/kvm/arm.c  |  10 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  34 +++
 arch/arm64/kvm/hyp/nvhe/Makefile  |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S|   1 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  10 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 246 ++
 arch/arm64/kvm/hyp/nvhe/setup.c   |   5 +
 arch/arm64/kvm/hyp/nvhe/switch.c  |   7 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c |   4 +-
 11 files changed, 316 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mem_protect.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 08f63c09cd11..4149283b4cd1 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -61,6 +61,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_create_mappings   16
 #define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping17
 #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector18
+#define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize 19
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 940c378fa837..d5dc2b792651 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -131,6 +131,9 @@ KVM_NVHE_ALIAS(__hyp_bss_end);
 KVM_NVHE_ALIAS(__hyp_rodata_start);
 KVM_NVHE_ALIAS(__hyp_rodata_end);
 
+/* pKVM static key */
+KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a6b5ba195ca9..d237c378e6fb 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1894,12 +1894,22 @@ static int init_hyp_mode(void)
return err;
 }
 
+void _kvm_host_prot_finalize(void *discard)
+{
+   WARN_ON(kvm_call_hyp_nvhe(__pkvm_prot_finalize));
+}
+
 static int finalize_hyp_mode(void)
 {
if (!is_protected_kvm_enabled())
return 0;
 
+   /*
+* Flip the static key upfront as that may no longer be possible
+* once the host stage 2 is installed.
+*/
static_branch_enable(_protected_mode_initialized);
+   on_each_cpu(_kvm_host_prot_finalize, NULL, 1);
 
return 0;
 }
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h 
b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
new file mode 100644
index ..d293cb328cc4
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#ifndef __KVM_NVHE_MEM_PROTECT__
+#define __KVM_NVHE_MEM_PROTECT__
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct host_kvm {
+   struct kvm_arch arch;
+   struct kvm_pgtable pgt;
+   struct kvm_pgtable_mm_ops mm_ops;
+   hyp_spinlock_t lock;
+};
+extern struct host_kvm host_kvm;
+
+int __pkvm_prot_finalize(void);
+int kvm_host_prepare_stage2(void *mem_pgt_pool, void *dev_pgt_pool);
+void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
+
+static __always_inline void __load_host_stage2(void)
+{
+   if (static_branch_likely(_protected_mode_initialized))
+   __load_stage2(_kvm.arch.mmu, host_kvm.arch.vtcr);
+   else
+   write_sysreg(0, vttbr_el2);
+}
+#endif /* __KVM_NVHE_MEM_PROTECT__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index b334354b8dd0..f55201a7ff33 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-cache.o setup.o mm.o
+cache.o setup.o mm.o mem_protect.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S 
b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index 60d51515153e..c953fb4b9a13 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S

[PATCH v6 13/38] KVM: arm64: Enable access to sanitized CPU features at EL2

2021-03-19 Thread Quentin Perret
Introduce the infrastructure in KVM enabling to copy CPU feature
registers into EL2-owned data-structures, to allow reading sanitised
values directly at EL2 in nVHE.

Given that only a subset of these features are being read by the
hypervisor, the ones that need to be copied are to be listed under
 together with the name of the nVHE variable that
will hold the copy. This introduces only the infrastructure enabling
this copy. The first users will follow shortly.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/cpufeature.h |  1 +
 arch/arm64/include/asm/kvm_cpufeature.h | 22 ++
 arch/arm64/include/asm/kvm_host.h   |  4 
 arch/arm64/kernel/cpufeature.c  | 13 +
 arch/arm64/kvm/sys_regs.c   | 19 +++
 5 files changed, 59 insertions(+)
 create mode 100644 arch/arm64/include/asm/kvm_cpufeature.h

diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 61177bac49fa..a85cea2cac57 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -607,6 +607,7 @@ void check_local_cpu_capabilities(void);
 
 u64 read_sanitised_ftr_reg(u32 id);
 u64 __read_sysreg_by_encoding(u32 sys_id);
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst);
 
 static inline bool cpu_supports_mixed_endian_el0(void)
 {
diff --git a/arch/arm64/include/asm/kvm_cpufeature.h 
b/arch/arm64/include/asm/kvm_cpufeature.h
new file mode 100644
index ..3d245f96a9fe
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_cpufeature.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 - Google LLC
+ * Author: Quentin Perret 
+ */
+
+#ifndef __ARM64_KVM_CPUFEATURE_H__
+#define __ARM64_KVM_CPUFEATURE_H__
+
+#include 
+
+#include 
+
+#if defined(__KVM_NVHE_HYPERVISOR__)
+#define DECLARE_KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg name
+#define DEFINE_KVM_HYP_CPU_FTR_REG(name) struct arm64_ftr_reg name
+#else
+#define DECLARE_KVM_HYP_CPU_FTR_REG(name) extern struct arm64_ftr_reg 
kvm_nvhe_sym(name)
+#define DEFINE_KVM_HYP_CPU_FTR_REG(name) BUILD_BUG()
+#endif
+
+#endif
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 6a2031af9562..02e172dc5087 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -740,9 +740,13 @@ void kvm_clr_pmu_events(u32 clr);
 
 void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
 void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
+
+void setup_kvm_el2_caps(void);
 #else
 static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u32 clr) {}
+
+static inline void setup_kvm_el2_caps(void) {}
 #endif
 
 void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 066030717a4c..6252476e4e73 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1154,6 +1154,18 @@ u64 read_sanitised_ftr_reg(u32 id)
 }
 EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg);
 
+int copy_ftr_reg(u32 id, struct arm64_ftr_reg *dst)
+{
+   struct arm64_ftr_reg *regp = get_arm64_ftr_reg(id);
+
+   if (!regp)
+   return -EINVAL;
+
+   *dst = *regp;
+
+   return 0;
+}
+
 #define read_sysreg_case(r)\
case r: val = read_sysreg_s(r); break;
 
@@ -2773,6 +2785,7 @@ void __init setup_cpu_features(void)
 
setup_system_capabilities();
setup_elf_hwcaps(arm64_elf_hwcaps);
+   setup_kvm_el2_caps();
 
if (system_supports_32bit_el0())
setup_elf_hwcaps(compat_elf_hwcaps);
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 4f2f1e3145de..6c5d133689ae 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2775,3 +2776,21 @@ void kvm_sys_reg_table_init(void)
/* Clear all higher bits. */
cache_levels &= (1 << (i*3))-1;
 }
+
+#define CPU_FTR_REG_HYP_COPY(id, name) \
+   { .sys_id = id, .dst = (struct arm64_ftr_reg *)_nvhe_sym(name) }
+struct __ftr_reg_copy_entry {
+   u32 sys_id;
+   struct arm64_ftr_reg*dst;
+} hyp_ftr_regs[] __initdata = {
+};
+
+void __init setup_kvm_el2_caps(void)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(hyp_ftr_regs); i++) {
+   WARN(copy_ftr_reg(hyp_ftr_regs[i].sys_id, hyp_ftr_regs[i].dst),
+"%u feature register not found\n", hyp_ftr_regs[i].sys_id);
+   }
+}
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 25/38] KVM: arm64: Make memcache anonymous in pgtable allocator

2021-03-19 Thread Quentin Perret
The current stage2 page-table allocator uses a memcache to get
pre-allocated pages when it needs any. To allow re-using this code at
EL2 which uses a concept of memory pools, make the memcache argument of
kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page()
callbacks use it the way they need to.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 6 +++---
 arch/arm64/kvm/hyp/pgtable.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 9cdc198ea6b4..4ae19247837b 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -213,8 +213,8 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  * @size:  Size of the mapping.
  * @phys:  Physical address of the memory to map.
  * @prot:  Permissions and attributes for the mapping.
- * @mc:Cache of pre-allocated GFP_PGTABLE_USER memory from 
which to
- * allocate page-table pages.
+ * @mc:Cache of pre-allocated and zeroed memory from which to 
allocate
+ * page-table pages.
  *
  * The offset of @addr within a page is ignored, @size is rounded-up to
  * the next page boundary and @phys is rounded-down to the previous page
@@ -236,7 +236,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
  */
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
-  struct kvm_mmu_memory_cache *mc);
+  void *mc);
 
 /**
  * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 
page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 4e15ccafd640..15de1708cfcd 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -446,7 +446,7 @@ struct stage2_map_data {
kvm_pte_t   *anchor;
 
struct kvm_s2_mmu   *mmu;
-   struct kvm_mmu_memory_cache *memcache;
+   void*memcache;
 
struct kvm_pgtable_mm_ops   *mm_ops;
 };
@@ -670,7 +670,7 @@ static int stage2_map_walker(u64 addr, u64 end, u32 level, 
kvm_pte_t *ptep,
 
 int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
-  struct kvm_mmu_memory_cache *mc)
+  void *mc)
 {
int ret;
struct stage2_map_data map_data = {
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 10/38] KVM: arm64: Introduce an early Hyp page allocator

2021-03-19 Thread Quentin Perret
With nVHE, the host currently creates all stage 1 hypervisor mappings at
EL1 during boot, installs them at EL2, and extends them as required
(e.g. when creating a new VM). But in a world where the host is no
longer trusted, it cannot have full control over the code mapped in the
hypervisor.

In preparation for enabling the hypervisor to create its own stage 1
mappings during boot, introduce an early page allocator, with minimal
functionality. This allocator is designed to be used only during early
bootstrap of the hyp code when memory protection is enabled, which will
then switch to using a full-fledged page allocator after init.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h | 14 +
 arch/arm64/kvm/hyp/include/nvhe/memory.h  | 24 +
 arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
 arch/arm64/kvm/hyp/nvhe/early_alloc.c | 54 +++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c  |  4 +-
 5 files changed, 94 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/memory.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/early_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h 
b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
new file mode 100644
index ..dc61aaa56f31
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/early_alloc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_EARLY_ALLOC_H
+#define __KVM_HYP_EARLY_ALLOC_H
+
+#include 
+
+void hyp_early_alloc_init(void *virt, unsigned long size);
+unsigned long hyp_early_alloc_nr_used_pages(void);
+void *hyp_early_alloc_page(void *arg);
+void *hyp_early_alloc_contig(unsigned int nr_pages);
+
+extern struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+
+#endif /* __KVM_HYP_EARLY_ALLOC_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h 
b/arch/arm64/kvm/hyp/include/nvhe/memory.h
new file mode 100644
index ..3e49eaa7e682
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_MEMORY_H
+#define __KVM_HYP_MEMORY_H
+
+#include 
+
+#include 
+
+extern s64 hyp_physvirt_offset;
+
+#define __hyp_pa(virt) ((phys_addr_t)(virt) + hyp_physvirt_offset)
+#define __hyp_va(phys) ((void *)((phys_addr_t)(phys) - hyp_physvirt_offset))
+
+static inline void *hyp_phys_to_virt(phys_addr_t phys)
+{
+   return __hyp_va(phys);
+}
+
+static inline phys_addr_t hyp_virt_to_phys(void *addr)
+{
+   return __hyp_pa(addr);
+}
+
+#endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index bc98f8e3d1da..24ff99e2eac5 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -13,7 +13,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/early_alloc.c 
b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
new file mode 100644
index ..1306c430ab87
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/early_alloc.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google LLC
+ * Author: Quentin Perret 
+ */
+
+#include 
+
+#include 
+#include 
+
+struct kvm_pgtable_mm_ops hyp_early_alloc_mm_ops;
+s64 __ro_after_init hyp_physvirt_offset;
+
+static unsigned long base;
+static unsigned long end;
+static unsigned long cur;
+
+unsigned long hyp_early_alloc_nr_used_pages(void)
+{
+   return (cur - base) >> PAGE_SHIFT;
+}
+
+void *hyp_early_alloc_contig(unsigned int nr_pages)
+{
+   unsigned long size = (nr_pages << PAGE_SHIFT);
+   void *ret = (void *)cur;
+
+   if (!nr_pages)
+   return NULL;
+
+   if (end - cur < size)
+   return NULL;
+
+   cur += size;
+   memset(ret, 0, size);
+
+   return ret;
+}
+
+void *hyp_early_alloc_page(void *arg)
+{
+   return hyp_early_alloc_contig(1);
+}
+
+void hyp_early_alloc_init(void *virt, unsigned long size)
+{
+   base = cur = (unsigned long)virt;
+   end = base + size;
+
+   hyp_early_alloc_mm_ops.zalloc_page = hyp_early_alloc_page;
+   hyp_early_alloc_mm_ops.phys_to_virt = hyp_phys_to_virt;
+   hyp_early_alloc_mm_ops.virt_to_phys = hyp_virt_to_phys;
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c 
b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
index 63de71c0481e..08508783ec3d 100644
--- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
+++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
@@ -11,6 +11,7 @@
 #include 
 

[PATCH v6 20/38] KVM: arm64: Use kvm_arch in kvm_s2_mmu

2021-03-19 Thread Quentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2,
change kvm_s2_mmu to use a kvm_arch pointer in lieu of the kvm pointer,
as the host will have the former but not the latter.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_host.h | 2 +-
 arch/arm64/include/asm/kvm_mmu.h  | 6 +-
 arch/arm64/kvm/mmu.c  | 8 
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index f813e1191027..4859c9de75d7 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -94,7 +94,7 @@ struct kvm_s2_mmu {
/* The last vcpu id that ran on each physical CPU */
int __percpu *last_vcpu_ran;
 
-   struct kvm *kvm;
+   struct kvm_arch *arch;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index ce02a4052dcf..6f743e20cb06 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -272,7 +272,7 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu 
*mmu)
  */
 static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
 {
-   write_sysreg(kern_hyp_va(mmu->kvm)->arch.vtcr, vtcr_el2);
+   write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
/*
@@ -283,5 +283,9 @@ static __always_inline void __load_guest_stage2(struct 
kvm_s2_mmu *mmu)
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
+static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
+{
+   return container_of(mmu->arch, struct kvm, arch);
+}
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d6eb1fb21232..0f16b70befa8 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -165,7 +165,7 @@ static void *kvm_host_va(phys_addr_t phys)
 static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, 
u64 size,
 bool may_block)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
phys_addr_t end = start + size;
 
assert_spin_locked(>mmu_lock);
@@ -470,7 +470,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu 
*mmu)
for_each_possible_cpu(cpu)
*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
 
-   mmu->kvm = kvm;
+   mmu->arch = >arch;
mmu->pgt = pgt;
mmu->pgd_phys = __pa(pgt->pgd);
mmu->vmid.vmid_gen = 0;
@@ -552,7 +552,7 @@ void stage2_unmap_vm(struct kvm *kvm)
 
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
struct kvm_pgtable *pgt = NULL;
 
spin_lock(>mmu_lock);
@@ -621,7 +621,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t 
guest_ipa,
  */
 static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, 
phys_addr_t end)
 {
-   struct kvm *kvm = mmu->kvm;
+   struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
stage2_apply_range_resched(kvm, addr, end, 
kvm_pgtable_stage2_wrprotect);
 }
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 09/38] KVM: arm64: Allow using kvm_nvhe_sym() in hyp code

2021-03-19 Thread Quentin Perret
In order to allow the usage of code shared by the host and the hyp in
static inline library functions, allow the usage of kvm_nvhe_sym() at
EL2 by defaulting to the raw symbol name.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/hyp_image.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h 
b/arch/arm64/include/asm/hyp_image.h
index 78cd77990c9c..b4b3076a76fb 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -10,11 +10,15 @@
 #define __HYP_CONCAT(a, b) a ## b
 #define HYP_CONCAT(a, b)   __HYP_CONCAT(a, b)
 
+#ifndef __KVM_NVHE_HYPERVISOR__
 /*
  * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_,
  * to separate it from the kernel proper.
  */
 #define kvm_nvhe_sym(sym)  __kvm_nvhe_##sym
+#else
+#define kvm_nvhe_sym(sym)  sym
+#endif
 
 #ifdef LINKER_SCRIPT
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 17/38] KVM: arm64: Prepare the creation of s1 mappings at EL2

2021-03-19 Thread Quentin Perret
When memory protection is enabled, the EL2 code needs the ability to
create and manage its own page-table. To do so, introduce a new set of
hypercalls to bootstrap a memory management system at EL2.

This leads to the following boot flow in nVHE Protected mode:

 1. the host allocates memory for the hypervisor very early on, using
the memblock API;

 2. the host creates a set of stage 1 page-table for EL2, installs the
EL2 vectors, and issues the __pkvm_init hypercall;

 3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table
and stores it in the memory pool provided by the host;

 4. the hypervisor then extends its stage 1 mappings to include a
vmemmap in the EL2 VA space, hence allowing to use the buddy
allocator introduced in a previous patch;

 5. the hypervisor jumps back in the idmap page, switches from the
host-provided page-table to the new one, and wraps up its
initialization by enabling the new allocator, before returning to
the host.

 6. the host can free the now unused page-table created for EL2, and
will now need to issue hypercalls to make changes to the EL2 stage 1
mappings instead of modifying them directly.

Note that for the sake of simplifying the review, this patch focuses on
the hypervisor side of things. In other words, this only implements the
new hypercalls, but does not make use of them from the host yet. The
host-side changes will follow in a subsequent patch.

Credits to Will for __pkvm_init_switch_pgd.

Acked-by: Will Deacon 
Co-authored-by: Will Deacon 
Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h |   4 +
 arch/arm64/include/asm/kvm_host.h|   7 +
 arch/arm64/include/asm/kvm_hyp.h |   8 ++
 arch/arm64/include/asm/kvm_pgtable.h |   2 +
 arch/arm64/kernel/image-vars.h   |  16 +++
 arch/arm64/kvm/hyp/Makefile  |   2 +-
 arch/arm64/kvm/hyp/include/nvhe/mm.h |  71 ++
 arch/arm64/kvm/hyp/nvhe/Makefile |   4 +-
 arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  27 
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   |  49 +++
 arch/arm64/kvm/hyp/nvhe/mm.c | 173 +++
 arch/arm64/kvm/hyp/nvhe/setup.c  | 197 +++
 arch/arm64/kvm/hyp/pgtable.c |   2 -
 arch/arm64/kvm/hyp/reserved_mem.c|  92 +
 arch/arm64/mm/init.c |   3 +
 15 files changed, 652 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mm.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/mm.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/setup.c
 create mode 100644 arch/arm64/kvm/hyp/reserved_mem.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index a7ab84f781f7..ebe7007b820b 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -57,6 +57,10 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_get_mdcr_el2   12
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs  13
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs   14
+#define __KVM_HOST_SMCCC_FUNC___pkvm_init  15
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_mappings   16
+#define __KVM_HOST_SMCCC_FUNC___pkvm_create_private_mapping17
+#define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector18
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 02e172dc5087..f813e1191027 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -770,5 +770,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
 
 int kvm_trng_call(struct kvm_vcpu *vcpu);
+#ifdef CONFIG_KVM
+extern phys_addr_t hyp_mem_base;
+extern phys_addr_t hyp_mem_size;
+void __init kvm_hyp_reserve(void);
+#else
+static inline void kvm_hyp_reserve(void) { }
+#endif
 
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 8b6c3a7aac51..de40a565d7e5 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -108,4 +108,12 @@ void __noreturn __hyp_do_panic(struct kvm_cpu_context 
*host_ctxt, u64 spsr,
   u64 elr, u64 par);
 #endif
 
+#ifdef __KVM_NVHE_HYPERVISOR__
+void __pkvm_init_switch_pgd(phys_addr_t phys, unsigned long size,
+   phys_addr_t pgd, void *sp, void *cont_fn);
+int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
+   unsigned long *per_cpu_base, u32 hyp_va_bits);
+void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
+#endif
+
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index bbe840e430cb..bf7a3cc49420 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -11,6 +11,8 

[PATCH v6 30/38] KVM: arm64: Refactor the *_map_set_prot_attr() helpers

2021-03-19 Thread Quentin Perret
In order to ease their re-use in other code paths, refactor the
*_map_set_prot_attr() helpers to not depend on a map_data struct.
No functional change intended.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index eefb226e1d1e..f4a514a2e7ae 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -325,8 +325,7 @@ struct hyp_map_data {
struct kvm_pgtable_mm_ops   *mm_ops;
 };
 
-static int hyp_map_set_prot_attr(enum kvm_pgtable_prot prot,
-struct hyp_map_data *data)
+static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
 {
bool device = prot & KVM_PGTABLE_PROT_DEVICE;
u32 mtype = device ? MT_DEVICE_nGnRE : MT_NORMAL;
@@ -351,7 +350,8 @@ static int hyp_map_set_prot_attr(enum kvm_pgtable_prot prot,
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
-   data->attr = attr;
+   *ptep = attr;
+
return 0;
 }
 
@@ -408,7 +408,7 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, 
u64 size, u64 phys,
.arg= _data,
};
 
-   ret = hyp_map_set_prot_attr(prot, _data);
+   ret = hyp_set_prot_attr(prot, _data.attr);
if (ret)
return ret;
 
@@ -501,8 +501,7 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
return vtcr;
 }
 
-static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
-   struct stage2_map_data *data)
+static int stage2_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
 {
bool device = prot & KVM_PGTABLE_PROT_DEVICE;
kvm_pte_t attr = device ? PAGE_S2_MEMATTR(DEVICE_nGnRE) :
@@ -522,7 +521,8 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot 
prot,
 
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
-   data->attr = attr;
+   *ptep = attr;
+
return 0;
 }
 
@@ -742,7 +742,7 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 
addr, u64 size,
.arg= _data,
};
 
-   ret = stage2_map_set_prot_attr(prot, _data);
+   ret = stage2_set_prot_attr(prot, _data.attr);
if (ret)
return ret;
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 29/38] KVM: arm64: Use page-table to track page ownership

2021-03-19 Thread Quentin Perret
As the host stage 2 will be identity mapped, all the .hyp memory regions
and/or memory pages donated to protected guestis will have to marked
invalid in the host stage 2 page-table. At the same time, the hypervisor
will need a way to track the ownership of each physical page to ensure
memory sharing or donation between entities (host, guests, hypervisor) is
legal.

In order to enable this tracking at EL2, let's use the host stage 2
page-table itself. The idea is to use the top bits of invalid mappings
to store the unique identifier of the page owner. The page-table owner
(the host) gets identifier 0 such that, at boot time, it owns the entire
IPA space as the pgd starts zeroed.

Provide kvm_pgtable_stage2_set_owner() which allows to modify the
ownership of pages in the host stage 2. It re-uses most of the map()
logic, but ends up creating invalid mappings instead. This impacts
how we do refcount as we now need to count invalid mappings when they
are used for ownership tracking.

Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h |  20 +
 arch/arm64/kvm/hyp/pgtable.c | 126 ++-
 2 files changed, 122 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 4ae19247837b..eea2e2b0acaa 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -238,6 +238,26 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 
addr, u64 size,
   u64 phys, enum kvm_pgtable_prot prot,
   void *mc);
 
+/**
+ * kvm_pgtable_stage2_set_owner() - Unmap and annotate pages in the IPA space 
to
+ * track ownership.
+ * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
+ * @addr:  Base intermediate physical address to annotate.
+ * @size:  Size of the annotated range.
+ * @mc:Cache of pre-allocated and zeroed memory from which to 
allocate
+ * page-table pages.
+ * @owner_id:  Unique identifier for the owner of the page.
+ *
+ * By default, all page-tables are owned by identifier 0. This function can be
+ * used to mark portions of the IPA space as owned by other entities. When a
+ * stage 2 is used with identity-mappings, these annotations allow to use the
+ * page-table data structure as a simple rmap.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
+void *mc, u8 owner_id);
+
 /**
  * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 
page-table.
  * @pgt:   Page-table structure initialised by kvm_pgtable_stage2_init().
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 0a674010afb6..eefb226e1d1e 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -48,6 +48,9 @@
 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
 KVM_PTE_LEAF_ATTR_HI_S2_XN)
 
+#define KVM_INVALID_PTE_OWNER_MASK GENMASK(63, 56)
+#define KVM_MAX_OWNER_ID   1
+
 struct kvm_pgtable_walk_data {
struct kvm_pgtable  *pgt;
struct kvm_pgtable_walker   *walker;
@@ -67,6 +70,13 @@ static u64 kvm_granule_size(u32 level)
return BIT(kvm_granule_shift(level));
 }
 
+#define KVM_PHYS_INVALID (-1ULL)
+
+static bool kvm_phys_is_valid(u64 phys)
+{
+   return phys < 
BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_PARANGE_MAX));
+}
+
 static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
 {
u64 granule = kvm_granule_size(level);
@@ -81,7 +91,10 @@ static bool kvm_block_mapping_supported(u64 addr, u64 end, 
u64 phys, u32 level)
if (granule > (end - addr))
return false;
 
-   return IS_ALIGNED(addr, granule) && IS_ALIGNED(phys, granule);
+   if (kvm_phys_is_valid(phys) && !IS_ALIGNED(phys, granule))
+   return false;
+
+   return IS_ALIGNED(addr, granule);
 }
 
 static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level)
@@ -186,6 +199,11 @@ static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t 
attr, u32 level)
return pte;
 }
 
+static kvm_pte_t kvm_init_invalid_leaf_owner(u8 owner_id)
+{
+   return FIELD_PREP(KVM_INVALID_PTE_OWNER_MASK, owner_id);
+}
+
 static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr,
  u32 level, kvm_pte_t *ptep,
  enum kvm_pgtable_walk_flags flag)
@@ -441,6 +459,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
 struct stage2_map_data {
u64 phys;
kvm_pte_t   attr;
+   u8  owner_id;
 
kvm_pte_t   

[PATCH v6 16/38] arm64: asm: Provide set_sctlr_el2 macro

2021-03-19 Thread Quentin Perret
We will soon need to turn the EL2 stage 1 MMU on and off in nVHE
protected mode, so refactor the set_sctlr_el1 macro to make it usable
for that purpose.

Acked-by: Will Deacon 
Suggested-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/assembler.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index ca31594d3d6c..fb651c1f26e9 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -676,11 +676,11 @@ USER(\label, ic   ivau, \tmp2)// 
invalidate I line PoU
.endm
 
 /*
- * Set SCTLR_EL1 to the passed value, and invalidate the local icache
+ * Set SCTLR_ELx to the @reg value, and invalidate the local icache
  * in the process. This is called when setting the MMU on.
  */
-.macro set_sctlr_el1, reg
-   msr sctlr_el1, \reg
+.macro set_sctlr, sreg, reg
+   msr \sreg, \reg
isb
/*
 * Invalidate the local I-cache so that any instructions fetched
@@ -692,6 +692,14 @@ USER(\label, icivau, \tmp2)// 
invalidate I line PoU
isb
 .endm
 
+.macro set_sctlr_el1, reg
+   set_sctlr sctlr_el1, \reg
+.endm
+
+.macro set_sctlr_el2, reg
+   set_sctlr sctlr_el2, \reg
+.endm
+
 /*
  * Check whether to yield to another runnable task from kernel mode NEON code
  * (which runs with preemption disabled).
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 28/38] KVM: arm64: Always zero invalid PTEs

2021-03-19 Thread Quentin Perret
kvm_set_invalid_pte() currently only clears bit 0 from a PTE because
stage2_map_walk_table_post() needs to be able to follow the anchor. In
preparation for re-using bits 63-01 from invalid PTEs, make sure to zero
it entirely by ensuring to cache the anchor's child upfront.

Acked-by: Will Deacon 
Suggested-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 15de1708cfcd..0a674010afb6 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -156,10 +156,9 @@ static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct 
kvm_pgtable_mm_ops *mm_op
return mm_ops->phys_to_virt(kvm_pte_to_phys(pte));
 }
 
-static void kvm_set_invalid_pte(kvm_pte_t *ptep)
+static void kvm_clear_pte(kvm_pte_t *ptep)
 {
-   kvm_pte_t pte = *ptep;
-   WRITE_ONCE(*ptep, pte & ~KVM_PTE_VALID);
+   WRITE_ONCE(*ptep, 0);
 }
 
 static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp,
@@ -444,6 +443,7 @@ struct stage2_map_data {
kvm_pte_t   attr;
 
kvm_pte_t   *anchor;
+   kvm_pte_t   *childp;
 
struct kvm_s2_mmu   *mmu;
void*memcache;
@@ -533,7 +533,7 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, 
u32 level,
 * There's an existing different valid leaf entry, so perform
 * break-before-make.
 */
-   kvm_set_invalid_pte(ptep);
+   kvm_clear_pte(ptep);
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
mm_ops->put_page(ptep);
}
@@ -554,7 +554,8 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 
level,
if (!kvm_block_mapping_supported(addr, end, data->phys, level))
return 0;
 
-   kvm_set_invalid_pte(ptep);
+   data->childp = kvm_pte_follow(*ptep, data->mm_ops);
+   kvm_clear_pte(ptep);
 
/*
 * Invalidate the whole stage-2, as we may have numerous leaf
@@ -600,7 +601,7 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 
level, kvm_pte_t *ptep,
 * will be mapped lazily.
 */
if (kvm_pte_valid(pte)) {
-   kvm_set_invalid_pte(ptep);
+   kvm_clear_pte(ptep);
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
mm_ops->put_page(ptep);
}
@@ -616,19 +617,24 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, 
u32 level,
  struct stage2_map_data *data)
 {
struct kvm_pgtable_mm_ops *mm_ops = data->mm_ops;
+   kvm_pte_t *childp;
int ret = 0;
 
if (!data->anchor)
return 0;
 
-   mm_ops->put_page(kvm_pte_follow(*ptep, mm_ops));
-   mm_ops->put_page(ptep);
-
if (data->anchor == ptep) {
+   childp = data->childp;
data->anchor = NULL;
+   data->childp = NULL;
ret = stage2_map_walk_leaf(addr, end, level, ptep, data);
+   } else {
+   childp = kvm_pte_follow(*ptep, mm_ops);
}
 
+   mm_ops->put_page(childp);
+   mm_ops->put_page(ptep);
+
return ret;
 }
 
@@ -737,7 +743,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 
level, kvm_pte_t *ptep,
 * block entry and rely on the remaining portions being faulted
 * back lazily.
 */
-   kvm_set_invalid_pte(ptep);
+   kvm_clear_pte(ptep);
kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, addr, level);
mm_ops->put_page(ptep);
 
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 27/38] KVM: arm64: Sort the hypervisor memblocks

2021-03-19 Thread Quentin Perret
We will soon need to check if a Physical Address belongs to a memblock
at EL2, so make sure to sort them so this can be done efficiently.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/reserved_mem.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/kvm/hyp/reserved_mem.c 
b/arch/arm64/kvm/hyp/reserved_mem.c
index fd42705a3c26..83ca23ac259b 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -18,6 +19,23 @@ static unsigned int *hyp_memblock_nr_ptr = 
_nvhe_sym(hyp_memblock_nr);
 phys_addr_t hyp_mem_base;
 phys_addr_t hyp_mem_size;
 
+static int cmp_hyp_memblock(const void *p1, const void *p2)
+{
+   const struct memblock_region *r1 = p1;
+   const struct memblock_region *r2 = p2;
+
+   return r1->base < r2->base ? -1 : (r1->base > r2->base);
+}
+
+static void __init sort_memblock_regions(void)
+{
+   sort(hyp_memory,
+*hyp_memblock_nr_ptr,
+sizeof(struct memblock_region),
+cmp_hyp_memblock,
+NULL);
+}
+
 static int __init register_memblock_regions(void)
 {
struct memblock_region *reg;
@@ -29,6 +47,7 @@ static int __init register_memblock_regions(void)
hyp_memory[*hyp_memblock_nr_ptr] = *reg;
(*hyp_memblock_nr_ptr)++;
}
+   sort_memblock_regions();
 
return 0;
 }
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 15/38] KVM: arm64: Factor out vector address calculation

2021-03-19 Thread Quentin Perret
In order to re-map the guest vectors at EL2 when pKVM is enabled,
refactor __kvm_vector_slot2idx() and kvm_init_vector_slot() to move all
the address calculation logic in a static inline function.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 8 
 arch/arm64/kvm/arm.c | 9 +
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 90873851f677..5c42ec023cc7 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -168,6 +168,14 @@ phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 
+static inline void *__kvm_vector_slot2addr(void *base,
+  enum arm64_hyp_spectre_vector slot)
+{
+   int idx = slot - (slot != HYP_VECTOR_DIRECT);
+
+   return base + (idx * SZ_2K);
+}
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)   __flush_dcache_area((a), (l))
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 22d6df525254..e2c471117bff 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1350,16 +1350,9 @@ static unsigned long nvhe_percpu_order(void)
 /* A lookup table holding the hypervisor VA for each vector slot */
 static void *hyp_spectre_vector_selector[BP_HARDEN_EL2_SLOTS];
 
-static int __kvm_vector_slot2idx(enum arm64_hyp_spectre_vector slot)
-{
-   return slot - (slot != HYP_VECTOR_DIRECT);
-}
-
 static void kvm_init_vector_slot(void *base, enum arm64_hyp_spectre_vector 
slot)
 {
-   int idx = __kvm_vector_slot2idx(slot);
-
-   hyp_spectre_vector_selector[slot] = base + (idx * SZ_2K);
+   hyp_spectre_vector_selector[slot] = __kvm_vector_slot2addr(base, slot);
 }
 
 static int kvm_init_vector_slots(void)
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 12/38] KVM: arm64: Introduce a Hyp buddy page allocator

2021-03-19 Thread Quentin Perret
When memory protection is enabled, the hyp code will require a basic
form of memory management in order to allocate and free memory pages at
EL2. This is needed for various use-cases, including the creation of hyp
mappings or the allocation of stage 2 page tables.

To address these use-case, introduce a simple memory allocator in the
hyp code. The allocator is designed as a conventional 'buddy allocator',
working with a page granularity. It allows to allocate and free
physically contiguous pages from memory 'pools', with a guaranteed order
alignment in the PA space. Each page in a memory pool is associated
with a struct hyp_page which holds the page's metadata, including its
refcount, as well as its current order, hence mimicking the kernel's
buddy system in the GFP infrastructure. The hyp_page metadata are made
accessible through a hyp_vmemmap, following the concept of
SPARSE_VMEMMAP in the kernel.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h|  68 
 arch/arm64/kvm/hyp/include/nvhe/memory.h |  28 
 arch/arm64/kvm/hyp/nvhe/Makefile |   2 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c | 195 +++
 4 files changed, 292 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/gfp.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/page_alloc.c

diff --git a/arch/arm64/kvm/hyp/include/nvhe/gfp.h 
b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
new file mode 100644
index ..55b3f0ce5bc8
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_HYP_GFP_H
+#define __KVM_HYP_GFP_H
+
+#include 
+
+#include 
+#include 
+
+#define HYP_NO_ORDER   UINT_MAX
+
+struct hyp_pool {
+   /*
+* Spinlock protecting concurrent changes to the memory pool as well as
+* the struct hyp_page of the pool's pages until we have a proper atomic
+* API at EL2.
+*/
+   hyp_spinlock_t lock;
+   struct list_head free_area[MAX_ORDER];
+   phys_addr_t range_start;
+   phys_addr_t range_end;
+   unsigned int max_order;
+};
+
+static inline void hyp_page_ref_inc(struct hyp_page *p)
+{
+   struct hyp_pool *pool = hyp_page_to_pool(p);
+
+   hyp_spin_lock(>lock);
+   p->refcount++;
+   hyp_spin_unlock(>lock);
+}
+
+static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
+{
+   struct hyp_pool *pool = hyp_page_to_pool(p);
+   int ret;
+
+   hyp_spin_lock(>lock);
+   p->refcount--;
+   ret = (p->refcount == 0);
+   hyp_spin_unlock(>lock);
+
+   return ret;
+}
+
+static inline void hyp_set_page_refcounted(struct hyp_page *p)
+{
+   struct hyp_pool *pool = hyp_page_to_pool(p);
+
+   hyp_spin_lock(>lock);
+   if (p->refcount) {
+   hyp_spin_unlock(>lock);
+   hyp_panic();
+   }
+   p->refcount = 1;
+   hyp_spin_unlock(>lock);
+}
+
+/* Allocation */
+void *hyp_alloc_pages(struct hyp_pool *pool, unsigned int order);
+void hyp_get_page(void *addr);
+void hyp_put_page(void *addr);
+
+/* Used pages cannot be freed */
+int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
+ unsigned int reserved_pages);
+#endif /* __KVM_HYP_GFP_H */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h 
b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index 3e49eaa7e682..d2fb307c5952 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -6,7 +6,17 @@
 
 #include 
 
+struct hyp_pool;
+struct hyp_page {
+   unsigned int refcount;
+   unsigned int order;
+   struct hyp_pool *pool;
+   struct list_head node;
+};
+
 extern s64 hyp_physvirt_offset;
+extern u64 __hyp_vmemmap;
+#define hyp_vmemmap ((struct hyp_page *)__hyp_vmemmap)
 
 #define __hyp_pa(virt) ((phys_addr_t)(virt) + hyp_physvirt_offset)
 #define __hyp_va(phys) ((void *)((phys_addr_t)(phys) - hyp_physvirt_offset))
@@ -21,4 +31,22 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
return __hyp_pa(addr);
 }
 
+#define hyp_phys_to_pfn(phys)  ((phys) >> PAGE_SHIFT)
+#define hyp_pfn_to_phys(pfn)   ((phys_addr_t)((pfn) << PAGE_SHIFT))
+#define hyp_phys_to_page(phys) (_vmemmap[hyp_phys_to_pfn(phys)])
+#define hyp_virt_to_page(virt) hyp_phys_to_page(__hyp_pa(virt))
+#define hyp_virt_to_pfn(virt)  hyp_phys_to_pfn(__hyp_pa(virt))
+
+#define hyp_page_to_pfn(page)  ((struct hyp_page *)(page) - hyp_vmemmap)
+#define hyp_page_to_phys(page)  hyp_pfn_to_phys((hyp_page_to_pfn(page)))
+#define hyp_page_to_virt(page) __hyp_va(hyp_page_to_phys(page))
+#define hyp_page_to_pool(page) (((struct hyp_page *)page)->pool)
+
+static inline int hyp_page_count(void *addr)
+{
+   struct hyp_page *p = hyp_virt_to_page(addr);
+
+   return p->refcount;
+}
+
 #endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 

[PATCH v6 23/38] KVM: arm64: Refactor __load_guest_stage2()

2021-03-19 Thread Quentin Perret
Refactor __load_guest_stage2() to introduce __load_stage2() which will
be re-used when loading the host stage 2.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_mmu.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 6f743e20cb06..9d64fa73ee67 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -270,9 +270,9 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu 
*mmu)
  * Must be called from hyp code running at EL2 with an updated VTTBR
  * and interrupts disabled.
  */
-static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
+static __always_inline void __load_stage2(struct kvm_s2_mmu *mmu, unsigned 
long vtcr)
 {
-   write_sysreg(kern_hyp_va(mmu->arch)->vtcr, vtcr_el2);
+   write_sysreg(vtcr, vtcr_el2);
write_sysreg(kvm_get_vttbr(mmu), vttbr_el2);
 
/*
@@ -283,6 +283,11 @@ static __always_inline void __load_guest_stage2(struct 
kvm_s2_mmu *mmu)
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
+static __always_inline void __load_guest_stage2(struct kvm_s2_mmu *mmu)
+{
+   __load_stage2(mmu, kern_hyp_va(mmu->arch)->vtcr);
+}
+
 static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
 {
return container_of(mmu->arch, struct kvm, arch);
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 24/38] KVM: arm64: Refactor __populate_fault_info()

2021-03-19 Thread Quentin Perret
Refactor __populate_fault_info() to introduce __get_fault_info() which
will be used once the host is wrapped in a stage 2.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 28 +++--
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h 
b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 1073f176e92c..cdf42e347d3f 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -160,18 +160,10 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 
*hpfar)
return true;
 }
 
-static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+static inline bool __get_fault_info(u64 esr, struct kvm_vcpu_fault_info *fault)
 {
-   u8 ec;
-   u64 esr;
u64 hpfar, far;
 
-   esr = vcpu->arch.fault.esr_el2;
-   ec = ESR_ELx_EC(esr);
-
-   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-   return true;
-
far = read_sysreg_el2(SYS_FAR);
 
/*
@@ -194,11 +186,25 @@ static inline bool __populate_fault_info(struct kvm_vcpu 
*vcpu)
hpfar = read_sysreg(hpfar_el2);
}
 
-   vcpu->arch.fault.far_el2 = far;
-   vcpu->arch.fault.hpfar_el2 = hpfar;
+   fault->far_el2 = far;
+   fault->hpfar_el2 = hpfar;
return true;
 }
 
+static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
+{
+   u8 ec;
+   u64 esr;
+
+   esr = vcpu->arch.fault.esr_el2;
+   ec = ESR_ELx_EC(esr);
+
+   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
+   return true;
+
+   return __get_fault_info(esr, >arch.fault);
+}
+
 static inline void __hyp_sve_save_host(struct kvm_vcpu *vcpu)
 {
struct thread_struct *thread;
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 11/38] KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp

2021-03-19 Thread Quentin Perret
In order to use the kernel list library at EL2, introduce stubs for the
CONFIG_DEBUG_LIST out-of-lines calls.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/nvhe/Makefile |  2 +-
 arch/arm64/kvm/hyp/nvhe/stub.c   | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/nvhe/stub.c

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 24ff99e2eac5..144da72ad510 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -13,7 +13,7 @@ lib-objs := clear_page.o copy_page.o memcpy.o memset.o
 lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
-hyp-main.o hyp-smp.o psci-relay.o early_alloc.o
+hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/stub.c b/arch/arm64/kvm/hyp/nvhe/stub.c
new file mode 100644
index ..c0aa6bbfd79d
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/stub.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Stubs for out-of-line function calls caused by re-using kernel
+ * infrastructure at EL2.
+ *
+ * Copyright (C) 2020 - Google LLC
+ */
+
+#include 
+
+#ifdef CONFIG_DEBUG_LIST
+bool __list_add_valid(struct list_head *new, struct list_head *prev,
+ struct list_head *next)
+{
+   return true;
+}
+
+bool __list_del_entry_valid(struct list_head *entry)
+{
+   return true;
+}
+#endif
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 07/38] KVM: arm64: Introduce a BSS section for use at Hyp

2021-03-19 Thread Quentin Perret
Currently, the hyp code cannot make full use of a bss, as the kernel
section is mapped read-only.

While this mapping could simply be changed to read-write, it would
intermingle even more the hyp and kernel state than they currently are.
Instead, introduce a __hyp_bss section, that uses reserved pages, and
create the appropriate RW hyp mappings during KVM init.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/sections.h |  1 +
 arch/arm64/kernel/vmlinux.lds.S   | 52 ---
 arch/arm64/kvm/arm.c  | 14 -
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S |  1 +
 4 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/sections.h 
b/arch/arm64/include/asm/sections.h
index 2f36b16a5b5d..e4ad9db53af1 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -13,6 +13,7 @@ extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
 extern char __hyp_text_start[], __hyp_text_end[];
 extern char __hyp_rodata_start[], __hyp_rodata_end[];
 extern char __hyp_reloc_begin[], __hyp_reloc_end[];
+extern char __hyp_bss_start[], __hyp_bss_end[];
 extern char __idmap_text_start[], __idmap_text_end[];
 extern char __initdata_begin[], __initdata_end[];
 extern char __inittext_begin[], __inittext_end[];
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..e96173ce211b 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -5,24 +5,7 @@
  * Written by Martin Mares 
  */
 
-#define RO_EXCEPTION_TABLE_ALIGN   8
-#define RUNTIME_DISCARD_EXIT
-
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
-
-#include "image.h"
-
-OUTPUT_ARCH(aarch64)
-ENTRY(_text)
-
-jiffies = jiffies_64;
-
-
 #ifdef CONFIG_KVM
 #define HYPERVISOR_EXTABLE \
. = ALIGN(SZ_8);\
@@ -51,13 +34,43 @@ jiffies = jiffies_64;
__hyp_reloc_end = .;\
}
 
+#define BSS_FIRST_SECTIONS \
+   __hyp_bss_start = .;\
+   *(HYP_SECTION_NAME(.bss))   \
+   . = ALIGN(PAGE_SIZE);   \
+   __hyp_bss_end = .;
+
+/*
+ * We require that __hyp_bss_start and __bss_start are aligned, and enforce it
+ * with an assertion. But the BSS_SECTION macro places an empty .sbss section
+ * between them, which can in some cases cause the linker to misalign them. To
+ * work around the issue, force a page alignment for __bss_start.
+ */
+#define SBSS_ALIGN PAGE_SIZE
 #else /* CONFIG_KVM */
 #define HYPERVISOR_EXTABLE
 #define HYPERVISOR_DATA_SECTIONS
 #define HYPERVISOR_PERCPU_SECTION
 #define HYPERVISOR_RELOC_SECTION
+#define SBSS_ALIGN 0
 #endif
 
+#define RO_EXCEPTION_TABLE_ALIGN   8
+#define RUNTIME_DISCARD_EXIT
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "image.h"
+
+OUTPUT_ARCH(aarch64)
+ENTRY(_text)
+
+jiffies = jiffies_64;
+
 #define HYPERVISOR_TEXT\
/*  \
 * Align to 4 KB so that\
@@ -276,7 +289,7 @@ SECTIONS
__pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
_edata = .;
 
-   BSS_SECTION(0, 0, 0)
+   BSS_SECTION(SBSS_ALIGN, 0, 0)
 
. = ALIGN(PAGE_SIZE);
init_pg_dir = .;
@@ -324,6 +337,9 @@ ASSERT(__hibernate_exit_text_end - 
(__hibernate_exit_text_start & ~(SZ_4K - 1))
 ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
"Entry trampoline text too big")
 #endif
+#ifdef CONFIG_KVM
+ASSERT(__hyp_bss_start == __bss_start, "HYP and Host BSS are misaligned")
+#endif
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 2adb8d878bb9..22d6df525254 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1775,7 +1775,19 @@ static int init_hyp_mode(void)
goto out_err;
}
 
-   err = create_hyp_mappings(kvm_ksym_ref(__bss_start),
+   /*
+* .hyp.bss is guaranteed to be placed at the beginning of the .bss
+* section thanks to an assertion in the linker script. Map it RW and
+* the rest of .bss RO.
+*/
+   err = create_hyp_mappings(kvm_ksym_ref(__hyp_bss_start),
+ kvm_ksym_ref(__hyp_bss_end), PAGE_HYP);
+   if (err) {
+   kvm_err("Cannot map hyp bss section: %d\n", err);
+   goto out_err;
+   }
+
+   err = create_hyp_mappings(kvm_ksym_ref(__hyp_bss_end),
  kvm_ksym_ref(__bss_stop), PAGE_HYP_RO);
if (err) {
kvm_err("Cannot map bss section\n");
diff --git 

[PATCH v6 08/38] KVM: arm64: Make kvm_call_hyp() a function call at Hyp

2021-03-19 Thread Quentin Perret
kvm_call_hyp() has some logic to issue a function call or a hypercall
depending on the EL at which the kernel is running. However, all the
code compiled under __KVM_NVHE_HYPERVISOR__ is guaranteed to only run
at EL2 which allows us to simplify.

Add ifdefery to kvm_host.h to simplify kvm_call_hyp() in .hyp.text.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_host.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 08f500b2551a..6a2031af9562 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -593,6 +593,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 void kvm_arm_halt_guest(struct kvm *kvm);
 void kvm_arm_resume_guest(struct kvm *kvm);
 
+#ifndef __KVM_NVHE_HYPERVISOR__
 #define kvm_call_hyp_nvhe(f, ...)  
\
({  \
struct arm_smccc_res res;   \
@@ -632,6 +633,11 @@ void kvm_arm_resume_guest(struct kvm *kvm);
\
ret;\
})
+#else /* __KVM_NVHE_HYPERVISOR__ */
+#define kvm_call_hyp(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_ret(f, ...) f(__VA_ARGS__)
+#define kvm_call_hyp_nvhe(f, ...) f(__VA_ARGS__)
+#endif /* __KVM_NVHE_HYPERVISOR__ */
 
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 22/38] KVM: arm64: Refactor kvm_arm_setup_stage2()

2021-03-19 Thread Quentin Perret
In order to re-use some of the stage 2 setup code at EL2, factor parts
of kvm_arm_setup_stage2() out into separate functions.

No functional change intended.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 26 +
 arch/arm64/kvm/hyp/pgtable.c | 32 +
 arch/arm64/kvm/reset.c   | 42 +++-
 3 files changed, 62 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 7945ec87eaec..9cdc198ea6b4 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -13,6 +13,16 @@
 
 #define KVM_PGTABLE_MAX_LEVELS 4U
 
+static inline u64 kvm_get_parange(u64 mmfr0)
+{
+   u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
+   ID_AA64MMFR0_PARANGE_SHIFT);
+   if (parange > ID_AA64MMFR0_PARANGE_MAX)
+   parange = ID_AA64MMFR0_PARANGE_MAX;
+
+   return parange;
+}
+
 typedef u64 kvm_pte_t;
 
 /**
@@ -159,6 +169,22 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt);
 int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
enum kvm_pgtable_prot prot);
 
+/**
+ * kvm_get_vtcr() - Helper to construct VTCR_EL2
+ * @mmfr0: Sanitized value of SYS_ID_AA64MMFR0_EL1 register.
+ * @mmfr1: Sanitized value of SYS_ID_AA64MMFR1_EL1 register.
+ * @phys_shfit:Value to set in VTCR_EL2.T0SZ.
+ *
+ * The VTCR value is common across all the physical CPUs on the system.
+ * We use system wide sanitised values to fill in different fields,
+ * except for Hardware Management of Access Flags. HA Flag is set
+ * unconditionally on all CPUs, as it is safe to run with or without
+ * the feature and the bit is RES0 on CPUs that don't support it.
+ *
+ * Return: VTCR_EL2 value
+ */
+u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift);
+
 /**
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index ea95bbc6ba80..4e15ccafd640 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include 
 
 #define KVM_PTE_VALID  BIT(0)
 
@@ -450,6 +451,37 @@ struct stage2_map_data {
struct kvm_pgtable_mm_ops   *mm_ops;
 };
 
+u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
+{
+   u64 vtcr = VTCR_EL2_FLAGS;
+   u8 lvls;
+
+   vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT;
+   vtcr |= VTCR_EL2_T0SZ(phys_shift);
+   /*
+* Use a minimum 2 level page table to prevent splitting
+* host PMD huge pages at stage2.
+*/
+   lvls = stage2_pgtable_levels(phys_shift);
+   if (lvls < 2)
+   lvls = 2;
+   vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
+
+   /*
+* Enable the Hardware Access Flag management, unconditionally
+* on all CPUs. The features is RES0 on CPUs without the support
+* and must be ignored by the CPUs.
+*/
+   vtcr |= VTCR_EL2_HA;
+
+   /* Set the vmid bits */
+   vtcr |= (get_vmid_bits(mmfr1) == 16) ?
+   VTCR_EL2_VS_16BIT :
+   VTCR_EL2_VS_8BIT;
+
+   return vtcr;
+}
+
 static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
struct stage2_map_data *data)
 {
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 67f30953d6d0..86d94f616a1e 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -329,19 +329,10 @@ int kvm_set_ipa_limit(void)
return 0;
 }
 
-/*
- * Configure the VTCR_EL2 for this VM. The VTCR value is common
- * across all the physical CPUs on the system. We use system wide
- * sanitised values to fill in different fields, except for Hardware
- * Management of Access Flags. HA Flag is set unconditionally on
- * all CPUs, as it is safe to run with or without the feature and
- * the bit is RES0 on CPUs that don't support it.
- */
 int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
 {
-   u64 vtcr = VTCR_EL2_FLAGS, mmfr0;
-   u32 parange, phys_shift;
-   u8 lvls;
+   u64 mmfr0, mmfr1;
+   u32 phys_shift;
 
if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
return -EINVAL;
@@ -361,33 +352,8 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long 
type)
}
 
mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
-   parange = cpuid_feature_extract_unsigned_field(mmfr0,
-   ID_AA64MMFR0_PARANGE_SHIFT);
-   if (parange > ID_AA64MMFR0_PARANGE_MAX)
-   parange = ID_AA64MMFR0_PARANGE_MAX;
-   vtcr |= parange << VTCR_EL2_PS_SHIFT;
-
-   vtcr |= VTCR_EL2_T0SZ(phys_shift);
-   /*
-* Use a minimum 2 level page 

[PATCH v6 21/38] KVM: arm64: Set host stage 2 using kvm_nvhe_init_params

2021-03-19 Thread Quentin Perret
Move the registers relevant to host stage 2 enablement to
kvm_nvhe_init_params to prepare the ground for enabling it in later
patches.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_asm.h   |  3 +++
 arch/arm64/kernel/asm-offsets.c|  3 +++
 arch/arm64/kvm/arm.c   |  5 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S | 14 +-
 arch/arm64/kvm/hyp/nvhe/switch.c   |  5 +
 5 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ebe7007b820b..08f63c09cd11 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -158,6 +158,9 @@ struct kvm_nvhe_init_params {
unsigned long tpidr_el2;
unsigned long stack_hyp_va;
phys_addr_t pgd_pa;
+   unsigned long hcr_el2;
+   unsigned long vttbr;
+   unsigned long vtcr;
 };
 
 /* Translate a kernel address @ptr into its equivalent linear mapping */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a36e2fc330d4..8930b42f6418 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -120,6 +120,9 @@ int main(void)
   DEFINE(NVHE_INIT_TPIDR_EL2,  offsetof(struct kvm_nvhe_init_params, 
tpidr_el2));
   DEFINE(NVHE_INIT_STACK_HYP_VA,   offsetof(struct kvm_nvhe_init_params, 
stack_hyp_va));
   DEFINE(NVHE_INIT_PGD_PA, offsetof(struct kvm_nvhe_init_params, pgd_pa));
+  DEFINE(NVHE_INIT_HCR_EL2,offsetof(struct kvm_nvhe_init_params, hcr_el2));
+  DEFINE(NVHE_INIT_VTTBR,  offsetof(struct kvm_nvhe_init_params, vttbr));
+  DEFINE(NVHE_INIT_VTCR,   offsetof(struct kvm_nvhe_init_params, vtcr));
 #endif
 #ifdef CONFIG_CPU_PM
   DEFINE(CPU_CTX_SP,   offsetof(struct cpu_suspend_ctx, sp));
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d93ea0b82491..a6b5ba195ca9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1418,6 +1418,11 @@ static void cpu_prepare_hyp_mode(int cpu)
 
params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) 
+ PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
+   if (is_protected_kvm_enabled())
+   params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
+   else
+   params->hcr_el2 = HCR_HOST_NVHE_FLAGS;
+   params->vttbr = params->vtcr = 0;
 
/*
 * Flush the init params from the data cache because the struct will
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S 
b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index 557fb79b29cf..60d51515153e 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -83,11 +83,6 @@ SYM_CODE_END(__kvm_hyp_init)
  * x0: struct kvm_nvhe_init_params PA
  */
 SYM_CODE_START_LOCAL(___kvm_hyp_init)
-alternative_if ARM64_KVM_PROTECTED_MODE
-   mov_q   x1, HCR_HOST_NVHE_PROTECTED_FLAGS
-   msr hcr_el2, x1
-alternative_else_nop_endif
-
ldr x1, [x0, #NVHE_INIT_TPIDR_EL2]
msr tpidr_el2, x1
 
@@ -97,6 +92,15 @@ alternative_else_nop_endif
ldr x1, [x0, #NVHE_INIT_MAIR_EL2]
msr mair_el2, x1
 
+   ldr x1, [x0, #NVHE_INIT_HCR_EL2]
+   msr hcr_el2, x1
+
+   ldr x1, [x0, #NVHE_INIT_VTTBR]
+   msr vttbr_el2, x1
+
+   ldr x1, [x0, #NVHE_INIT_VTCR]
+   msr vtcr_el2, x1
+
ldr x1, [x0, #NVHE_INIT_PGD_PA]
phys_to_ttbr x2, x1
 alternative_if ARM64_HAS_CNP
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index f6d542ecf6a7..99323563022a 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -97,10 +97,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
 
write_sysreg(mdcr_el2, mdcr_el2);
-   if (is_protected_kvm_enabled())
-   write_sysreg(HCR_HOST_NVHE_PROTECTED_FLAGS, hcr_el2);
-   else
-   write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2);
+   write_sysreg(this_cpu_ptr(_init_params)->hcr_el2, hcr_el2);
 
cptr = CPTR_EL2_DEFAULT;
if (vcpu_has_sve(vcpu) && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED))
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 02/38] KVM: arm64: Link position-independent string routines into .hyp.text

2021-03-19 Thread Quentin Perret
From: Will Deacon 

Pull clear_page(), copy_page(), memcpy() and memset() into the nVHE hyp
code and ensure that we always execute the '__pi_' entry point on the
offchance that it changes in future.

[ qperret: Commit title nits and added linker script alias ]

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/hyp_image.h |  3 +++
 arch/arm64/kernel/image-vars.h | 11 +++
 arch/arm64/kvm/hyp/nvhe/Makefile   |  4 
 3 files changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/hyp_image.h 
b/arch/arm64/include/asm/hyp_image.h
index 737ded6b6d0d..78cd77990c9c 100644
--- a/arch/arm64/include/asm/hyp_image.h
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -56,6 +56,9 @@
  */
 #define KVM_NVHE_ALIAS(sym)kvm_nvhe_sym(sym) = sym;
 
+/* Defines a linker script alias for KVM nVHE hyp symbols */
+#define KVM_NVHE_ALIAS_HYP(first, sec) kvm_nvhe_sym(first) = kvm_nvhe_sym(sec);
+
 #endif /* LINKER_SCRIPT */
 
 #endif /* __ARM64_HYP_IMAGE_H__ */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 5aa9ed1e9ec6..4eb7a15c8b60 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -104,6 +104,17 @@ KVM_NVHE_ALIAS(kvm_arm_hyp_percpu_base);
 /* PMU available static key */
 KVM_NVHE_ALIAS(kvm_arm_pmu_available);
 
+/* Position-independent library routines */
+KVM_NVHE_ALIAS_HYP(clear_page, __pi_clear_page);
+KVM_NVHE_ALIAS_HYP(copy_page, __pi_copy_page);
+KVM_NVHE_ALIAS_HYP(memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(memset, __pi_memset);
+
+#ifdef CONFIG_KASAN
+KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy);
+KVM_NVHE_ALIAS_HYP(__memset, __pi_memset);
+#endif
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index a6707df4f6c0..bc98f8e3d1da 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -9,10 +9,14 @@ ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
 hostprogs := gen-hyprel
 HOST_EXTRACFLAGS += -I$(objtree)/include
 
+lib-objs := clear_page.o copy_page.o memcpy.o memset.o
+lib-objs := $(addprefix ../../../lib/, $(lib-objs))
+
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 hyp-main.o hyp-smp.o psci-relay.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 ../fpsimd.o ../hyp-entry.o ../exception.o
+obj-y += $(lib-objs)
 
 ##
 ## Build rules for compiling nVHE hyp code
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 03/38] arm64: kvm: Add standalone ticket spinlock implementation for use at hyp

2021-03-19 Thread Quentin Perret
From: Will Deacon 

We will soon need to synchronise multiple CPUs in the hyp text at EL2.
The qspinlock-based locking used by the host is overkill for this purpose
and relies on the kernel's "percpu" implementation for the MCS nodes.

Implement a simple ticket locking scheme based heavily on the code removed
by commit c11090474d70 ("arm64: locking: Replace ticket lock implementation
with qspinlock").

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h | 92 ++
 1 file changed, 92 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/spinlock.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/spinlock.h 
b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
new file mode 100644
index ..76b537f8d1c6
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A stand-alone ticket spinlock implementation for use by the non-VHE
+ * KVM hypervisor code running at EL2.
+ *
+ * Copyright (C) 2020 Google LLC
+ * Author: Will Deacon 
+ *
+ * Heavily based on the implementation removed by c11090474d70 which was:
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __ARM64_KVM_NVHE_SPINLOCK_H__
+#define __ARM64_KVM_NVHE_SPINLOCK_H__
+
+#include 
+#include 
+
+typedef union hyp_spinlock {
+   u32 __val;
+   struct {
+#ifdef __AARCH64EB__
+   u16 next, owner;
+#else
+   u16 owner, next;
+#endif
+   };
+} hyp_spinlock_t;
+
+#define hyp_spin_lock_init(l)  \
+do {   \
+   *(l) = (hyp_spinlock_t){ .__val = 0 };  \
+} while (0)
+
+static inline void hyp_spin_lock(hyp_spinlock_t *lock)
+{
+   u32 tmp;
+   hyp_spinlock_t lockval, newval;
+
+   asm volatile(
+   /* Atomically increment the next ticket. */
+   ARM64_LSE_ATOMIC_INSN(
+   /* LL/SC */
+"  prfmpstl1strm, %3\n"
+"1:ldaxr   %w0, %3\n"
+"  add %w1, %w0, #(1 << 16)\n"
+"  stxr%w2, %w1, %3\n"
+"  cbnz%w2, 1b\n",
+   /* LSE atomics */
+"  mov %w2, #(1 << 16)\n"
+"  ldadda  %w2, %w0, %3\n"
+   __nops(3))
+
+   /* Did we get the lock? */
+"  eor %w1, %w0, %w0, ror #16\n"
+"  cbz %w1, 3f\n"
+   /*
+* No: spin on the owner. Send a local event to avoid missing an
+* unlock before the exclusive load.
+*/
+"  sevl\n"
+"2:wfe\n"
+"  ldaxrh  %w2, %4\n"
+"  eor %w1, %w2, %w0, lsr #16\n"
+"  cbnz%w1, 2b\n"
+   /* We got the lock. Critical section starts here. */
+"3:"
+   : "=" (lockval), "=" (newval), "=" (tmp), "+Q" (*lock)
+   : "Q" (lock->owner)
+   : "memory");
+}
+
+static inline void hyp_spin_unlock(hyp_spinlock_t *lock)
+{
+   u64 tmp;
+
+   asm volatile(
+   ARM64_LSE_ATOMIC_INSN(
+   /* LL/SC */
+   "   ldrh%w1, %0\n"
+   "   add %w1, %w1, #1\n"
+   "   stlrh   %w1, %0",
+   /* LSE atomics */
+   "   mov %w1, #1\n"
+   "   staddlh %w1, %0\n"
+   __nops(1))
+   : "=Q" (lock->owner), "=" (tmp)
+   :
+   : "memory");
+}
+
+#endif /* __ARM64_KVM_NVHE_SPINLOCK_H__ */
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 00/38] KVM: arm64: Stage-2 for the host

2021-03-19 Thread Quentin Perret
Hi all,

This is the v6 of the series previously posted here:

  https://lore.kernel.org/r/20210315143536.214621-1-qper...@google.com/

This basically allows us to wrap the host with a stage 2 when running in
nVHE, hence paving the way for protecting guest memory from the host in
the future (among other use-cases). For more details about the
motivation and the design angle taken here, I would recommend to have a
look at the cover letter of v1, and/or to watch these presentations at
LPC [1] and KVM forum 2020 [2].

Changes since v5:

 - disabled FWB for the host even when the CPUs support it using stage-2
   config flags;

 - added a stage-2 config flag to enfore identity mappings for the host;

 - refactored/simplified the cpu feature register copy;

 - removed unecessary ISB() from the set_ownership() path, and improved
   kerneldoc;

 - rebased on kvmarm/next to fix (trivial) conflicts with Marc's SVE
   series [3].

And as usual, there is a branch available here:

  https://android-kvm.googlesource.com/linux qperret/host-stage2-v6

Thanks,
Quentin

[1] https://youtu.be/54q6RzS9BpQ?t=10859
[2] https://youtu.be/wY-u6n75iXc
[3] https://lore.kernel.org/r/20210318122532.505263-1-...@kernel.org/

Quentin Perret (35):
  KVM: arm64: Initialize kvm_nvhe_init_params early
  KVM: arm64: Avoid free_page() in page-table allocator
  KVM: arm64: Factor memory allocation out of pgtable.c
  KVM: arm64: Introduce a BSS section for use at Hyp
  KVM: arm64: Make kvm_call_hyp() a function call at Hyp
  KVM: arm64: Allow using kvm_nvhe_sym() in hyp code
  KVM: arm64: Introduce an early Hyp page allocator
  KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp
  KVM: arm64: Introduce a Hyp buddy page allocator
  KVM: arm64: Enable access to sanitized CPU features at EL2
  KVM: arm64: Provide __flush_dcache_area at EL2
  KVM: arm64: Factor out vector address calculation
  arm64: asm: Provide set_sctlr_el2 macro
  KVM: arm64: Prepare the creation of s1 mappings at EL2
  KVM: arm64: Elevate hypervisor mappings creation at EL2
  KVM: arm64: Use kvm_arch for stage 2 pgtable
  KVM: arm64: Use kvm_arch in kvm_s2_mmu
  KVM: arm64: Set host stage 2 using kvm_nvhe_init_params
  KVM: arm64: Refactor kvm_arm_setup_stage2()
  KVM: arm64: Refactor __load_guest_stage2()
  KVM: arm64: Refactor __populate_fault_info()
  KVM: arm64: Make memcache anonymous in pgtable allocator
  KVM: arm64: Reserve memory for host stage 2
  KVM: arm64: Sort the hypervisor memblocks
  KVM: arm64: Always zero invalid PTEs
  KVM: arm64: Use page-table to track page ownership
  KVM: arm64: Refactor the *_map_set_prot_attr() helpers
  KVM: arm64: Add kvm_pgtable_stage2_find_range()
  KVM: arm64: Introduce KVM_PGTABLE_S2_NOFWB stage 2 flag
  KVM: arm64: Introduce KVM_PGTABLE_S2_IDMAP stage 2 flag
  KVM: arm64: Provide sanitized mmfr* registers at EL2
  KVM: arm64: Wrap the host with a stage 2
  KVM: arm64: Page-align the .hyp sections
  KVM: arm64: Disable PMU support in protected mode
  KVM: arm64: Protect the .hyp sections from the host

Will Deacon (3):
  arm64: lib: Annotate {clear,copy}_page() as position-independent
  KVM: arm64: Link position-independent string routines into .hyp.text
  arm64: kvm: Add standalone ticket spinlock implementation for use at
hyp

 arch/arm64/include/asm/assembler.h|  14 +-
 arch/arm64/include/asm/cpufeature.h   |   1 +
 arch/arm64/include/asm/hyp_image.h|   7 +
 arch/arm64/include/asm/kvm_asm.h  |   9 +
 arch/arm64/include/asm/kvm_cpufeature.h   |  26 ++
 arch/arm64/include/asm/kvm_host.h |  19 +-
 arch/arm64/include/asm/kvm_hyp.h  |   8 +
 arch/arm64/include/asm/kvm_mmu.h  |  23 +-
 arch/arm64/include/asm/kvm_pgtable.h  | 164 ++-
 arch/arm64/include/asm/pgtable-prot.h |   4 +-
 arch/arm64/include/asm/sections.h |   1 +
 arch/arm64/kernel/asm-offsets.c   |   3 +
 arch/arm64/kernel/cpufeature.c|  13 +
 arch/arm64/kernel/image-vars.h|  30 ++
 arch/arm64/kernel/vmlinux.lds.S   |  74 ++--
 arch/arm64/kvm/arm.c  | 199 +++--
 arch/arm64/kvm/hyp/Makefile   |   2 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h   |  28 +-
 arch/arm64/kvm/hyp/include/nvhe/early_alloc.h |  14 +
 arch/arm64/kvm/hyp/include/nvhe/gfp.h |  68 +++
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  36 ++
 arch/arm64/kvm/hyp/include/nvhe/memory.h  |  52 +++
 arch/arm64/kvm/hyp/include/nvhe/mm.h  |  96 
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h|  92 
 arch/arm64/kvm/hyp/nvhe/Makefile  |   9 +-
 arch/arm64/kvm/hyp/nvhe/cache.S   |  13 +
 arch/arm64/kvm/hyp/nvhe/early_alloc.c |  54 +++
 arch/arm64/kvm/hyp/nvhe/hyp-init.S|  42 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c|  68 +++
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c |   8 +
 arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 

[PATCH v6 05/38] KVM: arm64: Avoid free_page() in page-table allocator

2021-03-19 Thread Quentin Perret
Currently, the KVM page-table allocator uses a mix of put_page() and
free_page() calls depending on the context even though page-allocation
is always achieved using variants of __get_free_page().

Make the code consistent by using put_page() throughout, and reduce the
memory management API surface used by the page-table code. This will
ease factoring out page-allocation from pgtable.c, which is a
pre-requisite to creating page-tables at EL2.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/hyp/pgtable.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 926fc07074f5..0990fda19198 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -414,7 +414,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 
va_bits)
 static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
   enum kvm_pgtable_walk_flags flag, void * const arg)
 {
-   free_page((unsigned long)kvm_pte_follow(*ptep));
+   put_page(virt_to_page(kvm_pte_follow(*ptep)));
return 0;
 }
 
@@ -426,7 +426,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
};
 
WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), ));
-   free_page((unsigned long)pgt->pgd);
+   put_page(virt_to_page(pgt->pgd));
pgt->pgd = NULL;
 }
 
@@ -578,7 +578,7 @@ static int stage2_map_walk_table_post(u64 addr, u64 end, 
u32 level,
if (!data->anchor)
return 0;
 
-   free_page((unsigned long)kvm_pte_follow(*ptep));
+   put_page(virt_to_page(kvm_pte_follow(*ptep)));
put_page(virt_to_page(ptep));
 
if (data->anchor == ptep) {
@@ -701,7 +701,7 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 
level, kvm_pte_t *ptep,
}
 
if (childp)
-   free_page((unsigned long)childp);
+   put_page(virt_to_page(childp));
 
return 0;
 }
@@ -898,7 +898,7 @@ static int stage2_free_walker(u64 addr, u64 end, u32 level, 
kvm_pte_t *ptep,
put_page(virt_to_page(ptep));
 
if (kvm_pte_table(pte, level))
-   free_page((unsigned long)kvm_pte_follow(pte));
+   put_page(virt_to_page(kvm_pte_follow(pte)));
 
return 0;
 }
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 06/38] KVM: arm64: Factor memory allocation out of pgtable.c

2021-03-19 Thread Quentin Perret
In preparation for enabling the creation of page-tables at EL2, factor
all memory allocation out of the page-table code, hence making it
re-usable with any compatible memory allocator.

No functional changes intended.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/include/asm/kvm_pgtable.h | 41 +++-
 arch/arm64/kvm/hyp/pgtable.c | 98 +---
 arch/arm64/kvm/mmu.c | 66 ++-
 3 files changed, 163 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h 
b/arch/arm64/include/asm/kvm_pgtable.h
index 8886d43cfb11..bbe840e430cb 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -13,17 +13,50 @@
 
 typedef u64 kvm_pte_t;
 
+/**
+ * struct kvm_pgtable_mm_ops - Memory management callbacks.
+ * @zalloc_page:   Allocate a single zeroed memory page. The @arg parameter
+ * can be used by the walker to pass a memcache. The
+ * initial refcount of the page is 1.
+ * @zalloc_pages_exact:Allocate an exact number of zeroed memory 
pages. The
+ * @size parameter is in bytes, and is rounded-up to the
+ * next page boundary. The resulting allocation is
+ * physically contiguous.
+ * @free_pages_exact:  Free an exact number of memory pages previously
+ * allocated by zalloc_pages_exact.
+ * @get_page:  Increment the refcount on a page.
+ * @put_page:  Decrement the refcount on a page. When the refcount
+ * reaches 0 the page is automatically freed.
+ * @page_count:Return the refcount of a page.
+ * @phys_to_virt:  Convert a physical address into a virtual address mapped
+ * in the current context.
+ * @virt_to_phys:  Convert a virtual address mapped in the current context
+ * into a physical address.
+ */
+struct kvm_pgtable_mm_ops {
+   void*   (*zalloc_page)(void *arg);
+   void*   (*zalloc_pages_exact)(size_t size);
+   void(*free_pages_exact)(void *addr, size_t size);
+   void(*get_page)(void *addr);
+   void(*put_page)(void *addr);
+   int (*page_count)(void *addr);
+   void*   (*phys_to_virt)(phys_addr_t phys);
+   phys_addr_t (*virt_to_phys)(void *addr);
+};
+
 /**
  * struct kvm_pgtable - KVM page-table.
  * @ia_bits:   Maximum input address size, in bits.
  * @start_level:   Level at which the page-table walk starts.
  * @pgd:   Pointer to the first top-level entry of the page-table.
+ * @mm_ops:Memory management callbacks.
  * @mmu:   Stage-2 KVM MMU struct. Unused for stage-1 page-tables.
  */
 struct kvm_pgtable {
u32 ia_bits;
u32 start_level;
kvm_pte_t   *pgd;
+   struct kvm_pgtable_mm_ops   *mm_ops;
 
/* Stage-2 only */
struct kvm_s2_mmu   *mmu;
@@ -86,10 +119,12 @@ struct kvm_pgtable_walker {
  * kvm_pgtable_hyp_init() - Initialise a hypervisor stage-1 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @va_bits:   Maximum virtual address bits.
+ * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits);
+int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
+struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
  * kvm_pgtable_hyp_destroy() - Destroy an unused hypervisor stage-1 page-table.
@@ -126,10 +161,12 @@ int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 
addr, u64 size, u64 phys,
  * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
  * @pgt:   Uninitialised page-table structure to initialise.
  * @kvm:   KVM structure representing the guest virtual machine.
+ * @mm_ops:Memory management callbacks.
  *
  * Return: 0 on success, negative error code on failure.
  */
-int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm);
+int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm,
+   struct kvm_pgtable_mm_ops *mm_ops);
 
 /**
  * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 0990fda19198..ff478a576f4d 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -152,9 +152,9 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
return pte;
 }
 
-static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte)
+static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops 
*mm_ops)
 {
-   return __va(kvm_pte_to_phys(pte));
+   

[PATCH v6 04/38] KVM: arm64: Initialize kvm_nvhe_init_params early

2021-03-19 Thread Quentin Perret
Move the initialization of kvm_nvhe_init_params in a dedicated function
that is run early, and only once during KVM init, rather than every time
the KVM vectors are set and reset.

This also opens the opportunity for the hypervisor to change the init
structs during boot, hence simplifying the replacement of host-provided
page-table by the one the hypervisor will create for itself.

Acked-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/arm.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c2df58be5b0c..2adb8d878bb9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1388,22 +1388,18 @@ static int kvm_init_vector_slots(void)
return 0;
 }
 
-static void cpu_init_hyp_mode(void)
+static void cpu_prepare_hyp_mode(int cpu)
 {
-   struct kvm_nvhe_init_params *params = 
this_cpu_ptr_nvhe_sym(kvm_init_params);
-   struct arm_smccc_res res;
+   struct kvm_nvhe_init_params *params = 
per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
unsigned long tcr;
 
-   /* Switch from the HYP stub to our own HYP init vector */
-   __hyp_set_vectors(kvm_get_idmap_vector());
-
/*
 * Calculate the raw per-cpu offset without a translation from the
 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 * so that we can use adr_l to access per-cpu variables in EL2.
 * Also drop the KASAN tag which gets in the way...
 */
-   params->tpidr_el2 = (unsigned 
long)kasan_reset_tag(this_cpu_ptr_nvhe_sym(__per_cpu_start)) -
+   params->tpidr_el2 = (unsigned 
long)kasan_reset_tag(per_cpu_ptr_nvhe_sym(__per_cpu_start, cpu)) -
(unsigned 
long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));
 
params->mair_el2 = read_sysreg(mair_el1);
@@ -1427,7 +1423,7 @@ static void cpu_init_hyp_mode(void)
tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
params->tcr_el2 = tcr;
 
-   params->stack_hyp_va = 
kern_hyp_va(__this_cpu_read(kvm_arm_hyp_stack_page) + PAGE_SIZE);
+   params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) 
+ PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
 
/*
@@ -1435,6 +1431,15 @@ static void cpu_init_hyp_mode(void)
 * be read while the MMU is off.
 */
kvm_flush_dcache_to_poc(params, sizeof(*params));
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   struct kvm_nvhe_init_params *params;
+   struct arm_smccc_res res;
+
+   /* Switch from the HYP stub to our own HYP init vector */
+   __hyp_set_vectors(kvm_get_idmap_vector());
 
/*
 * Call initialization code, and switch to the full blown HYP code.
@@ -1443,6 +1448,7 @@ static void cpu_init_hyp_mode(void)
 * cpus_have_const_cap() wrapper.
 */
BUG_ON(!system_capabilities_finalized());
+   params = this_cpu_ptr_nvhe_sym(kvm_init_params);
arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), 
virt_to_phys(params), );
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
 
@@ -1790,19 +1796,19 @@ static int init_hyp_mode(void)
}
}
 
-   /*
-* Map Hyp percpu pages
-*/
for_each_possible_cpu(cpu) {
char *percpu_begin = (char *)kvm_arm_hyp_percpu_base[cpu];
char *percpu_end = percpu_begin + nvhe_percpu_size();
 
+   /* Map Hyp percpu pages */
err = create_hyp_mappings(percpu_begin, percpu_end, PAGE_HYP);
-
if (err) {
kvm_err("Cannot map hyp percpu region\n");
goto out_err;
}
+
+   /* Prepare the CPU initialization parameters */
+   cpu_prepare_hyp_mode(cpu);
}
 
if (is_protected_kvm_enabled()) {
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v6 01/38] arm64: lib: Annotate {clear, copy}_page() as position-independent

2021-03-19 Thread Quentin Perret
From: Will Deacon 

clear_page() and copy_page() are suitable for use outside of the kernel
address space, so annotate them as position-independent code.

Signed-off-by: Will Deacon 
Signed-off-by: Quentin Perret 
---
 arch/arm64/lib/clear_page.S | 4 ++--
 arch/arm64/lib/copy_page.S  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
index 073acbf02a7c..b84b179edba3 100644
--- a/arch/arm64/lib/clear_page.S
+++ b/arch/arm64/lib/clear_page.S
@@ -14,7 +14,7 @@
  * Parameters:
  * x0 - dest
  */
-SYM_FUNC_START(clear_page)
+SYM_FUNC_START_PI(clear_page)
mrs x1, dczid_el0
and w1, w1, #0xf
mov x2, #4
@@ -25,5 +25,5 @@ SYM_FUNC_START(clear_page)
tst x0, #(PAGE_SIZE - 1)
b.ne1b
ret
-SYM_FUNC_END(clear_page)
+SYM_FUNC_END_PI(clear_page)
 EXPORT_SYMBOL(clear_page)
diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index e7a793961408..29144f4cd449 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -17,7 +17,7 @@
  * x0 - dest
  * x1 - src
  */
-SYM_FUNC_START(copy_page)
+SYM_FUNC_START_PI(copy_page)
 alternative_if ARM64_HAS_NO_HW_PREFETCH
// Prefetch three cache lines ahead.
prfmpldl1strm, [x1, #128]
@@ -75,5 +75,5 @@ alternative_else_nop_endif
stnpx16, x17, [x0, #112 - 256]
 
ret
-SYM_FUNC_END(copy_page)
+SYM_FUNC_END_PI(copy_page)
 EXPORT_SYMBOL(copy_page)
-- 
2.31.0.rc2.261.g7f71774620-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 5/5] KVM: arm64: Log source when panicking from nVHE hyp

2021-03-19 Thread Andrew Scull
On Thu, 18 Mar 2021 at 16:59, Will Deacon  wrote:
>
> On Thu, Mar 18, 2021 at 02:33:11PM +, Andrew Scull wrote:
> > To aid with debugging, add details of the source of a panic from nVHE
> > hyp. This is done by having nVHE hyp exit to nvhe_hyp_panic_handler()
> > rather than directly to panic(). The handler will then add the extra
> > details for debugging before panicking the kernel.
> >
> > If the panic was due to a BUG(), look up the metadata to log the file
> > and line, if available, otherwise log an address that can be looked up
> > in vmlinux. The hyp offset is also logged to allow other hyp VAs to be
> > converted, similar to how the kernel offset is logged during a panic.
> >
> > __hyp_panic_string is now inlined since it no longer needs to be
> > referenced as a symbol and the message is free to diverge between VHE
> > and nVHE.
> >
> > The following is an example of the logs generated by a BUG in nVHE hyp.
> >
> > [   46.754840] kvm [307]: nVHE hyp BUG at: 
> > arch/arm64/kvm/hyp/nvhe/switch.c:242!
> > [   46.755357] kvm [307]: Hyp Offset: 0xfffea6c58e1e
> > [   46.755824] Kernel panic - not syncing: HYP panic:
> > [   46.755824] PS:43c9 PC:d93a82c705ac ESR:f2000800
> > [   46.755824] FAR:8008 HPFAR:00800800 
> > PAR:
> > [   46.755824] VCPU:d93a880d
> > [   46.756960] CPU: 3 PID: 307 Comm: kvm-vcpu-0 Not tainted 
> > 5.12.0-rc3-5-gc572b99cf65b-dirty #133
> > [   46.757459] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 
> > 02/06/2015
> > [   46.758366] Call trace:
> > [   46.758601]  dump_backtrace+0x0/0x1b0
> > [   46.758856]  show_stack+0x18/0x70
> > [   46.759057]  dump_stack+0xd0/0x12c
> > [   46.759236]  panic+0x16c/0x334
> > [   46.759426]  arm64_kernel_unmapped_at_el0+0x0/0x30
> > [   46.759661]  kvm_arch_vcpu_ioctl_run+0x134/0x750
> > [   46.759936]  kvm_vcpu_ioctl+0x2f0/0x970
> > [   46.760156]  __arm64_sys_ioctl+0xa8/0xec
> > [   46.760379]  el0_svc_common.constprop.0+0x60/0x120
> > [   46.760627]  do_el0_svc+0x24/0x90
> > [   46.760766]  el0_svc+0x2c/0x54
> > [   46.760915]  el0_sync_handler+0x1a4/0x1b0
> > [   46.761146]  el0_sync+0x170/0x180
> > [   46.761889] SMP: stopping secondary CPUs
> > [   46.762786] Kernel Offset: 0x3e1cd282 from 0x80001000
> > [   46.763142] PHYS_OFFSET: 0xa9f68000
> > [   46.763359] CPU features: 0x00240022,61806008
> > [   46.763651] Memory Limit: none
>
> Nice!
>
> > [   46.813867] ---[ end Kernel panic - not syncing: HYP panic:
> > [   46.813867] PS:43c9 PC:d93a82c705ac ESR:f2000800
> > [   46.813867] FAR:8008 HPFAR:00800800 
> > PAR:
> > [   46.813867] VCPU:d93a880d ]---
>
> Why did these last three lines get printed twice?
>
> > Signed-off-by: Andrew Scull 
> > ---
> >  arch/arm64/include/asm/kvm_mmu.h|  2 ++
> >  arch/arm64/kernel/image-vars.h  |  3 +-
> >  arch/arm64/kvm/handle_exit.c| 45 +
> >  arch/arm64/kvm/hyp/include/hyp/switch.h |  2 --
> >  arch/arm64/kvm/hyp/nvhe/host.S  | 18 --
> >  arch/arm64/kvm/hyp/nvhe/psci-relay.c|  2 --
> >  arch/arm64/kvm/hyp/vhe/switch.c |  4 +--
> >  7 files changed, 56 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_mmu.h 
> > b/arch/arm64/include/asm/kvm_mmu.h
> > index 90873851f677..7c17a67d2291 100644
> > --- a/arch/arm64/include/asm/kvm_mmu.h
> > +++ b/arch/arm64/include/asm/kvm_mmu.h
> > @@ -121,6 +121,8 @@ void kvm_update_va_mask(struct alt_instr *alt,
> >  void kvm_compute_layout(void);
> >  void kvm_apply_hyp_relocations(void);
> >
> > +#define __hyp_pa(x) (((phys_addr_t)(x)) + hyp_physvirt_offset)
>
> Just a heads up: but Quentin's series moves this same macro, but to a
> different header (arch/arm64/kvm/hyp/include/nvhe/memory.h)

I can make sure we're putting it in the same header to ease the merge.

> >  static __always_inline unsigned long __kern_hyp_va(unsigned long v)
> >  {
> >   asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
> > diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
> > index 5aa9ed1e9ec6..5ff2b6909387 100644
> > --- a/arch/arm64/kernel/image-vars.h
> > +++ b/arch/arm64/kernel/image-vars.h
> > @@ -70,8 +70,7 @@ KVM_NVHE_ALIAS(kvm_get_kimage_voffset);
> >  KVM_NVHE_ALIAS(kvm_vgic_global_state);
> >
> >  /* Kernel symbols used to call panic() from nVHE hyp code (via ERET). */
> > -KVM_NVHE_ALIAS(__hyp_panic_string);
> > -KVM_NVHE_ALIAS(panic);
> > +KVM_NVHE_ALIAS(nvhe_hyp_panic_handler);
> >
> >  /* Vectors installed by hyp-init on reset HVC. */
> >  KVM_NVHE_ALIAS(__hyp_stub_vectors);
> > diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> > index cebe39f3b1b6..6f48336b1d86 100644
> > --- a/arch/arm64/kvm/handle_exit.c
> > +++ b/arch/arm64/kvm/handle_exit.c
> > @@ -291,3 +291,48 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int 
> > exception_index)
> >   if (exception_index 

Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-03-19 Thread Auger Eric
Hi Krishna,

On 3/18/21 1:16 AM, Krishna Reddy wrote:
> Tested-by: Krishna Reddy 
> 
> Validated nested translations with NVMe PCI device assigned to Guest VM. 
> Tested with both v12 and v13 of Jean-Philippe's patches as base.

Many thanks for that.
> 
>> This is based on Jean-Philippe's
>> [PATCH v12 00/10] iommu: I/O page faults for SMMUv3
>> https://lore.kernel.org/linux-arm-kernel/YBfij71tyYvh8LhB@myrica/T/
> 
> With Jean-Philippe's V13, Patch 12 of this series has a conflict that had to 
> be resolved manually.

Yep I will respin accordingly.

Best Regards

Eric
> 
> -KR
> 
> 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v14 05/13] iommu/smmuv3: Implement attach/detach_pasid_table

2021-03-19 Thread Auger Eric
Hi Keqian,

On 3/2/21 9:35 AM, Keqian Zhu wrote:
> Hi Eric,
> 
> On 2021/2/24 4:56, Eric Auger wrote:
>> On attach_pasid_table() we program STE S1 related info set
>> by the guest into the actual physical STEs. At minimum
>> we need to program the context descriptor GPA and compute
>> whether the stage1 is translated/bypassed or aborted.
>>
>> On detach, the stage 1 config is unset and the abort flag is
>> unset.
>>
>> Signed-off-by: Eric Auger 
>>
> [...]
> 
>> +
>> +/*
>> + * we currently support a single CD so s1fmt and s1dss
>> + * fields are also ignored
>> + */
>> +if (cfg->pasid_bits)
>> +goto out;
>> +
>> +smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr;
> only the "cdtab_dma" field of "cdcfg" is set, we are not able to locate a 
> specific cd using arm_smmu_get_cd_ptr().
> 
> Maybe we'd better use a specialized function to fill other fields of "cdcfg" 
> or add a sanity check in arm_smmu_get_cd_ptr()
> to prevent calling it under nested mode?
> 
> As now we just call arm_smmu_get_cd_ptr() during finalise_s1(), no problem 
> found. Just a suggestion ;-)

forgive me for the delay. yes I can indeed make sure that code is not
called in nested mode. Please could you detail why you would need to
call arm_smmu_get_cd_ptr()?

Thanks

Eric
> 
> Thanks,
> Keqian
> 
> 
>> +smmu_domain->s1_cfg.set = true;
>> +smmu_domain->abort = false;
>> +break;
>> +default:
>> +goto out;
>> +}
>> +spin_lock_irqsave(_domain->devices_lock, flags);
>> +list_for_each_entry(master, _domain->devices, domain_head)
>> +arm_smmu_install_ste_for_dev(master);
>> +spin_unlock_irqrestore(_domain->devices_lock, flags);
>> +ret = 0;
>> +out:
>> +mutex_unlock(_domain->init_mutex);
>> +return ret;
>> +}
>> +
>> +static void arm_smmu_detach_pasid_table(struct iommu_domain *domain)
>> +{
>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> +struct arm_smmu_master *master;
>> +unsigned long flags;
>> +
>> +mutex_lock(_domain->init_mutex);
>> +
>> +if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
>> +goto unlock;
>> +
>> +smmu_domain->s1_cfg.set = false;
>> +smmu_domain->abort = false;
>> +
>> +spin_lock_irqsave(_domain->devices_lock, flags);
>> +list_for_each_entry(master, _domain->devices, domain_head)
>> +arm_smmu_install_ste_for_dev(master);
>> +spin_unlock_irqrestore(_domain->devices_lock, flags);
>> +
>> +unlock:
>> +mutex_unlock(_domain->init_mutex);
>> +}
>> +
>>  static bool arm_smmu_dev_has_feature(struct device *dev,
>>   enum iommu_dev_features feat)
>>  {
>> @@ -2939,6 +3026,8 @@ static struct iommu_ops arm_smmu_ops = {
>>  .of_xlate   = arm_smmu_of_xlate,
>>  .get_resv_regions   = arm_smmu_get_resv_regions,
>>  .put_resv_regions   = generic_iommu_put_resv_regions,
>> +.attach_pasid_table = arm_smmu_attach_pasid_table,
>> +.detach_pasid_table = arm_smmu_detach_pasid_table,
>>  .dev_has_feat   = arm_smmu_dev_has_feature,
>>  .dev_feat_enabled   = arm_smmu_dev_feature_enabled,
>>  .dev_enable_feat= arm_smmu_dev_enable_feature,
>>
> 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


RE: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2021-03-19 Thread Krishna Reddy
Tested-by: Krishna Reddy 

Validated nested translations with NVMe PCI device assigned to Guest VM. 
Tested with both v12 and v13 of Jean-Philippe's patches as base.

> This is based on Jean-Philippe's
> [PATCH v12 00/10] iommu: I/O page faults for SMMUv3
> https://lore.kernel.org/linux-arm-kernel/YBfij71tyYvh8LhB@myrica/T/

With Jean-Philippe's V13, Patch 12 of this series has a conflict that had to be 
resolved manually.

-KR


___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


RE: [PATCH v12 00/13] SMMUv3 Nested Stage Setup (VFIO part)

2021-03-19 Thread Krishna Reddy
Tested-by: Krishna Reddy 

Validated Nested SMMUv3 translations for NVMe PCIe device from Guest VM and is 
functional.

This patch series resolved the mismatch(seen with v11 patches) for 
VFIO_IOMMU_SET_PASID_TABLE and VFIO_IOMMU_CACHE_INVALIDATE Ioctls between linux 
and QEMU patch series "vSMMUv3/pSMMUv3 2 stage VFIO integration" 
(v5.2.0-2stage-rfcv8). 

-KR

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v2] configure: arm/arm64: Add --earlycon option to set UART type and address

2021-03-19 Thread Andrew Jones
On Fri, Mar 19, 2021 at 11:37:51AM +, Alexandru Elisei wrote:
> > You can also drop 'cut' and just do something like
> >
> > IFS=, read -r name type_addr addr <<<"$earlycon"
> 
> That looks much nicer and concise, and I prefer it to my approach.
> 
> However, I believe it requires a certain version of bash to work. On bash 
> 5.1.4
> and 4.3.48 (copyright says it's from 2013), it works as expected. On bash 
> 3.2.57
> (version used by MacOS), the result of the command is that the variable name
> contains the string with the comma replaced by space, and the other variables 
> are
> empty. Using cut works with this version. According to the copyright notice, 
> bash
> 3.2.57 is from 2007, so extremely old.
> 
> Did some googling for the query "bash split string" and according to this 
> stack
> overflow question [1] (second reply), using IFS works for bash >= 4.2. Don't 
> know
> how accurate it is.
> 
> I guess the question here is how compatible we want to be with regard to the 
> bash
> version. I am not familiar with bash programming, so I will defer to your 
> decision.

>From time to time we've had this come up with kvm-unit-tests. The result
has always been to say "yeah, we should figure out our minimally required
Bash version and document that", but then we never do... It sounds like
you've identified Bash 4.2 being required for this IFS idiom. As we
already use this idiom in other places in kvm-unit-tests, then I think
the right thing to do is to test running all the current scripts with
Bash 4.2, and if that works, finally document it. I'll do that ASAP.

> > And/or should we do a quick sanity check on the address?
> > Something like
> >
> >   [[ $addr =~ ^0?x?[0-9a-f]+$ ]]
> 
> Another great suggestion. The pattern above doesn't take into account 
> character
> case and the fact that 0xa is a valid number, but a is not. Best I could come 
> up
> with is:
> 
> [[ $addr =~ ^0(x|X)[0-9a-fA-F]+$ ]] || [[ $addr =~ ^[0-9]+$ ]]
> 
> What do you think?

LGTM

Thanks,
drew

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v2] configure: arm/arm64: Add --earlycon option to set UART type and address

2021-03-19 Thread Alexandru Elisei
Hi Drew,

Thanks for having another look!

On 3/18/21 4:41 PM, Andrew Jones wrote:
> On Thu, Mar 18, 2021 at 04:20:22PM +, Alexandru Elisei wrote:
>> Currently, the UART early address is set indirectly with the --vmm option
>> and there are only two possible values: if the VMM is qemu (the default),
>> then the UART address is set to 0x0900; if the VMM is kvmtool, then the
>> UART address is set to 0x3f8.
>>
>> The upstream kvmtool commit 45b4968e0de1 ("hw/serial: ARM/arm64: Use MMIO
>> at higher addresses") changed the UART address to 0x100, and
>> kvm-unit-tests so far hasn't had mechanism to let the user set a specific
>> address, which means that for recent versions of kvmtool the early UART
>> won't be available.
>>
>> This situation will only become worse as kvm-unit-tests gains support to
>> run as an EFI app, as each platform will have their own UART type and
>> address.
>>
>> To address both issues, a new configure option is added, --earlycon. The
>> syntax and semantics are identical to the kernel parameter with the same
>> name. For example, for kvmtool, --earlycon=uart,mmio,0x100 will set the
>> correct UART address. Specifying this option will overwrite the UART
>> address set by --vmm.
>>
>> At the moment, the UART type and register width parameters are ignored
>> since both qemu's and kvmtool's UART emulation use the same offset for the
>> TX register and no other registers are used by kvm-unit-tests, but the
>> parameters will become relevant once EFI support is added.
>>
>> Signed-off-by: Alexandru Elisei 
>> ---
>> Besides working with current versions of kvmtool, this will also make early
>> console work if the user specifies a custom memory layout [1] (patches are
>> old, but I plan to pick them up at some point in the future).
>>
>> Changes in v2:
>> * kvmtool patches were merged, so I reworked the commit message to point to
>>   the corresponding kvmtool commit.
>> * Restricted pl011 register size to 32 bits, as per Arm Base System
>>   Architecture 1.0 (DEN0094A), and to match Linux.
>> * Reworked the way the fields are extracted to make it more precise
>>   (without the -s argument, the entire string is echo'ed when no delimiter
>>   is found).
> You can also drop 'cut' and just do something like
>
> IFS=, read -r name type_addr addr <<<"$earlycon"

That looks much nicer and concise, and I prefer it to my approach.

However, I believe it requires a certain version of bash to work. On bash 5.1.4
and 4.3.48 (copyright says it's from 2013), it works as expected. On bash 3.2.57
(version used by MacOS), the result of the command is that the variable name
contains the string with the comma replaced by space, and the other variables 
are
empty. Using cut works with this version. According to the copyright notice, 
bash
3.2.57 is from 2007, so extremely old.

Did some googling for the query "bash split string" and according to this stack
overflow question [1] (second reply), using IFS works for bash >= 4.2. Don't 
know
how accurate it is.

I guess the question here is how compatible we want to be with regard to the 
bash
version. I am not familiar with bash programming, so I will defer to your 
decision.

[1]
https://stackoverflow.com/questions/918886/how-do-i-split-a-string-on-a-delimiter-in-bash

>
>> * The changes are not trivial, so I dropped Drew's Reviewed-by.
>>
>> [1] 
>> https://lore.kernel.org/kvm/1569245722-23375-1-git-send-email-alexandru.eli...@arm.com/
>>
>>  configure | 61 +++
>>  1 file changed, 61 insertions(+)
>>
>> diff --git a/configure b/configure
>> index cdcd34e94030..137b165db18f 100755
>> --- a/configure
>> +++ b/configure
>> @@ -26,6 +26,7 @@ errata_force=0
>>  erratatxt="$srcdir/errata.txt"
>>  host_key_document=
>>  page_size=
>> +earlycon=
>>  
>>  usage() {
>>  cat <<-EOF
>> @@ -54,6 +55,18 @@ usage() {
>>  --page-size=PAGE_SIZE
>> Specify the page size (translation granule) 
>> (4k, 16k or
>> 64k, default is 64k, arm64 only)
>> +--earlycon=EARLYCON
>> +   Specify the UART name, type and address 
>> (optional, arm and
>> +   arm64 only). The specified address will 
>> overwrite the UART
>> +   address set by the --vmm option. EARLYCON 
>> can be on of (case
> 'on of' typo still here

Sorry, I missed that in your previous comment.

>
>> +   sensitive):
>> +   uart[8250],mmio,ADDR
>> +   Specify an 8250 compatible UART at address 
>> ADDR. Supported
>> +   register stride is 8 bit only.
>> +   pl011,ADDR
>> +   pl011,mmio32,ADDR
>> +   Specify a PL011 compatible UART at address 
>> ADDR. Supported
>> +   register stride is 32 bit only.
>>  EOF
>> 

Re: [PATCH v3 5/5] KVM: arm64: Log source when panicking from nVHE hyp

2021-03-19 Thread Marc Zyngier

On 2021-03-19 10:38, Andrew Scull wrote:

> > diff --git a/arch/arm64/include/asm/kvm_mmu.h 
b/arch/arm64/include/asm/kvm_mmu.h
> > index 90873851f677..7c17a67d2291 100644
> > --- a/arch/arm64/include/asm/kvm_mmu.h
> > +++ b/arch/arm64/include/asm/kvm_mmu.h
> > @@ -121,6 +121,8 @@ void kvm_update_va_mask(struct alt_instr *alt,
> >  void kvm_compute_layout(void);
> >  void kvm_apply_hyp_relocations(void);
> >
> > +#define __hyp_pa(x) (((phys_addr_t)(x)) + hyp_physvirt_offset)
>
> Just a heads up: but Quentin's series moves this same macro, but to a
> different header (arch/arm64/kvm/hyp/include/nvhe/memory.h)

I can make sure we're putting it in the same header to ease the merge.


Seems I was too optimistic since nvhe/memory.h can only be used from
nVHE hyp. I could rebase no top of Quentin to move the macro somewhere
that can be used from here, but that would add a series dependency
that you usually seem to try and avoid so I'm not going to change
this, unless you'd prefer that I do.


I can deal with this at merge time.

M.
--
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 5/5] KVM: arm64: Log source when panicking from nVHE hyp

2021-03-19 Thread Marc Zyngier
On Thu, 18 Mar 2021 16:59:47 +,
Will Deacon  wrote:
> 
> On Thu, Mar 18, 2021 at 02:33:11PM +, Andrew Scull wrote:
> > To aid with debugging, add details of the source of a panic from nVHE
> > hyp. This is done by having nVHE hyp exit to nvhe_hyp_panic_handler()
> > rather than directly to panic(). The handler will then add the extra
> > details for debugging before panicking the kernel.
> > 
> > If the panic was due to a BUG(), look up the metadata to log the file
> > and line, if available, otherwise log an address that can be looked up
> > in vmlinux. The hyp offset is also logged to allow other hyp VAs to be
> > converted, similar to how the kernel offset is logged during a panic.
> > 
> > __hyp_panic_string is now inlined since it no longer needs to be
> > referenced as a symbol and the message is free to diverge between VHE
> > and nVHE.
> > 
> > The following is an example of the logs generated by a BUG in nVHE hyp.
> > 
> > [   46.754840] kvm [307]: nVHE hyp BUG at: 
> > arch/arm64/kvm/hyp/nvhe/switch.c:242!
> > [   46.755357] kvm [307]: Hyp Offset: 0xfffea6c58e1e
> > [   46.755824] Kernel panic - not syncing: HYP panic:
> > [   46.755824] PS:43c9 PC:d93a82c705ac ESR:f2000800
> > [   46.755824] FAR:8008 HPFAR:00800800 
> > PAR:
> > [   46.755824] VCPU:d93a880d
> > [   46.756960] CPU: 3 PID: 307 Comm: kvm-vcpu-0 Not tainted 
> > 5.12.0-rc3-5-gc572b99cf65b-dirty #133
> > [   46.757459] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 
> > 02/06/2015
> > [   46.758366] Call trace:
> > [   46.758601]  dump_backtrace+0x0/0x1b0
> > [   46.758856]  show_stack+0x18/0x70
> > [   46.759057]  dump_stack+0xd0/0x12c
> > [   46.759236]  panic+0x16c/0x334
> > [   46.759426]  arm64_kernel_unmapped_at_el0+0x0/0x30
> > [   46.759661]  kvm_arch_vcpu_ioctl_run+0x134/0x750
> > [   46.759936]  kvm_vcpu_ioctl+0x2f0/0x970
> > [   46.760156]  __arm64_sys_ioctl+0xa8/0xec
> > [   46.760379]  el0_svc_common.constprop.0+0x60/0x120
> > [   46.760627]  do_el0_svc+0x24/0x90
> > [   46.760766]  el0_svc+0x2c/0x54
> > [   46.760915]  el0_sync_handler+0x1a4/0x1b0
> > [   46.761146]  el0_sync+0x170/0x180
> > [   46.761889] SMP: stopping secondary CPUs
> > [   46.762786] Kernel Offset: 0x3e1cd282 from 0x80001000
> > [   46.763142] PHYS_OFFSET: 0xa9f68000
> > [   46.763359] CPU features: 0x00240022,61806008
> > [   46.763651] Memory Limit: none
> 
> Nice!
> 
> > [   46.813867] ---[ end Kernel panic - not syncing: HYP panic:
> > [   46.813867] PS:43c9 PC:d93a82c705ac ESR:f2000800
> > [   46.813867] FAR:8008 HPFAR:00800800 
> > PAR:
> > [   46.813867] VCPU:d93a880d ]---
> 
> Why did these last three lines get printed twice?

That's the panic string that gets repeated, a standard "feature" of
the panic code:

root@tiger-roach:~# echo -n 'c' > /proc/sysrq-trigger 
[250622.941867] sysrq: Trigger a crash
[250622.941903] Kernel panic - not syncing: sysrq triggered crash
[250622.945515] CPU: 0 PID: 4890 Comm: bash Tainted: GE 5.11.0 
#3124
[250622.952930] Hardware name:  , BIOS 2021.01-rc2-00012-gde865f7ee1 11/16/2020
[250622.959917] Call trace:
[250622.962417]  dump_backtrace+0x0/0x1e4
[250622.966126]  show_stack+0x24/0x80
[250622.969490]  dump_stack+0xc8/0x120
[250622.972940]  panic+0x178/0x38c
[250622.976045]  sysrq_handle_crash+0x28/0x30
[250622.980098]  __handle_sysrq+0x98/0x1a4
[250622.983893]  write_sysrq_trigger+0x94/0x12c
[250622.988120]  proc_reg_write+0xb4/0xf0
[250622.991828]  vfs_write+0xfc/0x29c
[250622.995192]  ksys_write+0x74/0x100
[250622.998642]  __arm64_sys_write+0x28/0x3c
[250623.002610]  el0_svc_common.constprop.0+0x80/0x1c0
[250623.007440]  do_el0_svc+0x30/0xa0
[250623.010803]  el0_svc+0x28/0x70
[250623.013908]  el0_sync_handler+0x1a8/0x1ac
[250623.017962]  el0_sync+0x174/0x180
[250623.021331] SMP: stopping secondary CPUs
[250623.025301] Kernel Offset: 0x435917bf from 0x80001000
[250623.031417] PHYS_OFFSET: 0x9bc5c000
[250623.035643] CPU features: 0x08240022,2aa0a830
[250623.040042] Memory Limit: none
[250623.043153] ---[ end Kernel panic - not syncing: sysrq triggered crash ]---

KVM just happens to have a fairly large, multi line panic string.

M.

-- 
Without deviation from the norm, progress is not possible.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm