Re: [PATCH v3 4/7] arm64: mte: Lock a page for MTE tag initialisation

2022-09-02 Thread Catalin Marinas
On Fri, Sep 02, 2022 at 05:28:47PM +0100, Catalin Marinas wrote:
> On Fri, Sep 02, 2022 at 03:47:33PM +0100, Steven Price wrote:
> > On 10/08/2022 20:30, Peter Collingbourne wrote:
> > > diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
> > > index a78c1db23c68..cd5ad0936e16 100644
> > > --- a/arch/arm64/mm/mteswap.c
> > > +++ b/arch/arm64/mm/mteswap.c
> > > @@ -53,6 +53,9 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page)
> > >   if (!tags)
> > >   return false;
> > >  
> > > + /* racing tag restoring? */
> > > + if (!try_page_mte_tagging(page))
> > > + return false;
> > >   mte_restore_page_tags(page_address(page), tags);
> > 
> > I feel like adding a "set_page_mte_tagged(page);" in here would avoid
> > the need for the comments about mte_restore_tags() taking the lock.
> 
> Good point. I think I blindly followed the set_bit() places but it makes
> sense to move the bit setting to mte_restore_tags().

Something like this (and I need to do some more testing):

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index b956899467f0..be6560e1ff2b 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -25,7 +25,7 @@ unsigned long mte_copy_tags_to_user(void __user *to, void *from,
unsigned long n);
 int mte_save_tags(struct page *page);
 void mte_save_page_tags(const void *page_addr, void *tag_storage);
-bool mte_restore_tags(swp_entry_t entry, struct page *page);
+void mte_restore_tags(swp_entry_t entry, struct page *page);
 void mte_restore_page_tags(void *page_addr, const void *tag_storage);
 void mte_invalidate_tags(int type, pgoff_t offset);
 void mte_invalidate_tags_area(int type);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e6b82ad1e9e6..7d91379382fd 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1049,9 +1049,8 @@ static inline void arch_swap_invalidate_area(int type)
 #define __HAVE_ARCH_SWAP_RESTORE
 static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
 {
-   /* mte_restore_tags() takes the PG_mte_lock */
-   if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
-   set_page_mte_tagged(&folio->page);
+   if (system_supports_mte())
+   mte_restore_tags(entry, &folio->page);
 }
 
 #endif /* CONFIG_ARM64_MTE */
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 634e089b5933..54ab6c4741db 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -41,11 +41,8 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
if (check_swap && is_swap_pte(old_pte)) {
swp_entry_t entry = pte_to_swp_entry(old_pte);
 
-   /* mte_restore_tags() takes the PG_mte_lock */
-   if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) {
-   set_page_mte_tagged(page);
-   return;
-   }
+   if (!non_swap_entry(entry))
+   mte_restore_tags(entry, page);
}
 
if (!pte_is_tagged)
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index cd5ad0936e16..cd508ba80ab1 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -46,19 +46,17 @@ int mte_save_tags(struct page *page)
return 0;
 }
 
-bool mte_restore_tags(swp_entry_t entry, struct page *page)
+void mte_restore_tags(swp_entry_t entry, struct page *page)
 {
void *tags = xa_load(&mte_pages, entry.val);
 
if (!tags)
-   return false;
+   return;
 
-   /* racing tag restoring? */
-   if (!try_page_mte_tagging(page))
-   return false;
-   mte_restore_page_tags(page_address(page), tags);
-
-   return true;
+   if (try_page_mte_tagging(page)) {
+   mte_restore_page_tags(page_address(page), tags);
+   set_page_mte_tagged(page);
+   }
 }
 
 void mte_invalidate_tags(int type, pgoff_t offset)


Re: [PATCH v3 4/7] arm64: mte: Lock a page for MTE tag initialisation

2022-09-02 Thread Catalin Marinas
On Fri, Sep 02, 2022 at 03:47:33PM +0100, Steven Price wrote:
> On 10/08/2022 20:30, Peter Collingbourne wrote:
> > diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
> > index a78c1db23c68..cd5ad0936e16 100644
> > --- a/arch/arm64/mm/mteswap.c
> > +++ b/arch/arm64/mm/mteswap.c
> > @@ -53,6 +53,9 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page)
> > if (!tags)
> > return false;
> >  
> > +   /* racing tag restoring? */
> > +   if (!try_page_mte_tagging(page))
> > +   return false;
> > mte_restore_page_tags(page_address(page), tags);
> 
> I feel like adding a "set_page_mte_tagged(page);" in here would avoid
> the need for the comments about mte_restore_tags() taking the lock.

Good point. I think I blindly followed the set_bit() places but it makes
sense to move the bit setting to mte_restore_tags().

Thanks for the review.

-- 
Catalin


Re: [PATCH v2 0/7] KVM: arm64: Use visibility hook to treat ID regs as RAZ

2022-09-02 Thread Oliver Upton
Lol, messed up my own copypasta:

On Fri, Sep 02, 2022 at 03:47:56PM +0000, Oliver Upton wrote:
> For reasons unknown, the Arm architecture defines the 64-bit views of
> the 32-bit ID registers as UNKNOWN [1]. This combines poorly with the
^ on AArch64-only systems.

> fact that KVM unconditionally exposes these registers to userspace,
> which could throw a wrench in migration between 64-bit only systems.
> 
> This series reworks KVM's definition of these registers to RAZ/WI with
> the goal of providing consistent register values across 64-bit machines.
> 
> Patches 1-3 clean up the ID register accessors, taking advantage of the
> fact that the generic accessors know how to handle RAZ.
> 
> Patches 4-6 switch the handling of potentially nonzero AArch32 ID
> registers to RAZ/WI. RAZ covers up the architecturally UNKNOWN values,
> and WI allows for migration off of kernels that may provide garbage.
> Note that hidden AArch32 ID registers continue to have RAZ behavior with
> the additional expectation of invariance.
> 
> Lastly, patch 7 includes a small test for the issue.
> 
> Applies to 6.0-rc3. Tested with KVM selftests under the fast model w/
> asymmetric 32 bit support and no 32 bit support whatsoever.

[1]: DDI0487H.a Table D12-2 'Instruction encodings for non-Debug System Register accesses'

v1: https://lore.kernel.org/kvmarm/20220817214818.3243383-1-oliver.up...@linux.dev/

--
Thanks,
Oliver

> v1 -> v2:
>  - Collect Reiji's r-b tags (thanks!)
>  - Call sysreg_visible_as_raz() from read_id_reg() (Reiji)
>  - Hoist sysreg_user_write_ignore() into kvm_sys_reg_set_user() (Reiji)
> 
> Oliver Upton (7):
>   KVM: arm64: Use visibility hook to treat ID regs as RAZ
>   KVM: arm64: Remove internal accessor helpers for id regs
>   KVM: arm64: Drop raz parameter from read_id_reg()
>   KVM: arm64: Spin off helper for calling visibility hook
>   KVM: arm64: Add a visibility bit to ignore user writes
>   KVM: arm64: Treat 32bit ID registers as RAZ/WI on 64bit-only system
>   KVM: selftests: Add test for RAZ/WI AArch32 ID registers
> 
>  arch/arm64/kvm/sys_regs.c | 150 +-
>  arch/arm64/kvm/sys_regs.h |  24 ++-
>  tools/testing/selftests/kvm/.gitignore|   1 +
>  tools/testing/selftests/kvm/Makefile  |   1 +
>  .../kvm/aarch64/aarch64_only_id_regs.c| 135 
>  5 files changed, 225 insertions(+), 86 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/aarch64/aarch64_only_id_regs.c
> 
> 
> base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
> -- 
> 2.37.2.789.g6183377224-goog
> 


[PATCH v2 7/7] KVM: selftests: Add test for RAZ/WI AArch32 ID registers

2022-09-02 Thread Oliver Upton
Add a test to assert that KVM handles the AArch64 views of the AArch32
ID registers as RAZ/WI (writable only from userspace).

Signed-off-by: Oliver Upton 
---
 tools/testing/selftests/kvm/.gitignore|   1 +
 tools/testing/selftests/kvm/Makefile  |   1 +
 .../kvm/aarch64/aarch64_only_id_regs.c| 135 ++
 3 files changed, 137 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/aarch64/aarch64_only_id_regs.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index d625a3f83780..4331af62a982 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
+/aarch64/aarch64_only_id_regs
 /aarch64/arch_timer
 /aarch64/debug-exceptions
 /aarch64/get-reg-list
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 4c122f1b1737..efe155259095 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -144,6 +144,7 @@ TEST_GEN_PROGS_x86_64 += system_counter_offset_test
 # Compiled outputs used by test targets
 TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
 
+TEST_GEN_PROGS_aarch64 += aarch64/aarch64_only_id_regs
 TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
 TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
 TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
diff --git a/tools/testing/selftests/kvm/aarch64/aarch64_only_id_regs.c b/tools/testing/selftests/kvm/aarch64/aarch64_only_id_regs.c
new file mode 100644
index 000000000000..704a3e7524a8
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/aarch64_only_id_regs.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * aarch64_only_id_regs - Test for ID register behavior on AArch64-only systems
+ *
+ * Copyright (c) 2022 Google LLC.
+ *
+ * Test that KVM handles the AArch64 views of the AArch32 ID registers as RAZ
+ * and WI from userspace.
+ */
+
+#include <stdint.h>
+
+#include "kvm_util.h"
+#include "processor.h"
+#include "test_util.h"
+
+#define BAD_ID_REG_VAL 0x1badc0deul
+
+#define GUEST_ASSERT_REG_RAZ(reg)  GUEST_ASSERT_EQ(read_sysreg_s(reg), 0)
+
+static void guest_main(void)
+{
+   GUEST_ASSERT_REG_RAZ(SYS_ID_PFR0_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_PFR1_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_DFR0_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_MMFR0_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_MMFR1_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_MMFR2_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_MMFR3_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR0_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR1_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR2_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR3_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR4_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR5_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_MMFR4_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_ISAR6_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_MVFR0_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_MVFR1_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_MVFR2_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_PFR2_EL1);
+   GUEST_ASSERT_REG_RAZ(SYS_ID_MMFR5_EL1);
+
+   GUEST_DONE();
+}
+
+static void test_guest_raz(struct kvm_vcpu *vcpu)
+{
+   struct ucall uc;
+
+   vcpu_run(vcpu);
+
+   switch (get_ucall(vcpu, &uc)) {
+   case UCALL_ABORT:
+   REPORT_GUEST_ASSERT(uc);
+   break;
+   case UCALL_DONE:
+   break;
+   default:
+   TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
+   }
+}
+
+static uint64_t reg_ids[] = {
+   KVM_ARM64_SYS_REG(SYS_ID_PFR0_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_PFR1_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_DFR0_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_MMFR0_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_MMFR1_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_MMFR2_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_MMFR3_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR0_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR1_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR2_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR3_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR4_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR5_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_MMFR4_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_ISAR6_EL1),
+   KVM_ARM64_SYS_REG(SYS_MVFR0_EL1),
+   KVM_ARM64_SYS_REG(SYS_MVFR1_EL1),
+   KVM_ARM64_SYS_REG(SYS_MVFR2_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_PFR2_EL1),
+   KVM_ARM64_SYS_REG(SYS_ID_MMFR5_EL1),
+};
+
+static void test_user_raz_wi(struct kvm_vcpu *vcpu)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(reg_ids); i++) {
+   uint64_t reg_id = reg_ids[i];
+   uint64_t val;
+
+   vcpu_get_reg(vcpu, reg_id, &val);
+   ASSERT_EQ(val, 0);
+
+   /*
+* Expect the ioctl to succeed with no effect on the register
+* value.
+*/
+   vcpu_set_reg(vcpu, reg_id, BAD_ID_REG_VAL);
+
+   vcpu_get_reg(vcpu, reg_id, &val);
+   ASSERT_EQ(val, 0);
+   }
+}

[PATCH v2 6/7] KVM: arm64: Treat 32bit ID registers as RAZ/WI on 64bit-only system

2022-09-02 Thread Oliver Upton
One of the oddities of the architecture is that the AArch64 views of the
AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any EL.
Nonetheless, KVM exposes these registers to userspace for the sake of
save/restore. It is possible that the UNKNOWN value could differ between
systems, leading to a rejected write from userspace.

Avoid the issue altogether by handling the AArch32 ID registers as
RAZ/WI when on an AArch64-only system.
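
As an illustration of the failure mode this avoids, consider a VMM
restoring a saved AArch32 ID register on the destination of a migration
(a minimal sketch from the userspace side; the helper name and fd
plumbing are illustrative, not part of this patch):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Illustrative only. Before this patch, the destination's UNKNOWN value
 * may differ from the one saved on the source, so KVM's set_id_reg()
 * sees a mismatch and the ioctl fails with -EINVAL. With RAZ/WI, reads
 * return 0 and the write is accepted (and ignored).
 */
static int restore_id_reg(int vcpu_fd, uint64_t reg_id, uint64_t saved_val)
{
	struct kvm_one_reg reg = {
		.id   = reg_id,
		.addr = (uint64_t)&saved_val,
	};

	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}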

Signed-off-by: Oliver Upton 
---
 arch/arm64/kvm/sys_regs.c | 63 ++-
 1 file changed, 43 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6d0511247df4..9569772cf09a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1144,6 +1144,20 @@ static unsigned int id_visibility(const struct kvm_vcpu *vcpu,
return 0;
 }
 
+static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
+  const struct sys_reg_desc *r)
+{
+   /*
+* AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any
+* EL. Promote to RAZ/WI in order to guarantee consistency between
+* systems.
+*/
+   if (!kvm_supports_32bit_el0())
+   return REG_RAZ | REG_USER_WI;
+
+   return id_visibility(vcpu, r);
+}
+
 static unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
   const struct sys_reg_desc *r)
 {
@@ -1331,6 +1345,15 @@ static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
.visibility = id_visibility,\
 }
 
+/* sys_reg_desc initialiser for known cpufeature ID registers */
+#define AA32_ID_SANITISED(name) {  \
+   SYS_DESC(SYS_##name),   \
+   .access = access_id_reg,\
+   .get_user = get_id_reg, \
+   .set_user = set_id_reg, \
+   .visibility = aa32_id_visibility,   \
+}
+
 /*
  * sys_reg_desc initialiser for architecturally unallocated cpufeature ID
  * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
@@ -1418,33 +1441,33 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
/* AArch64 mappings of the AArch32 ID registers */
/* CRm=1 */
-   ID_SANITISED(ID_PFR0_EL1),
-   ID_SANITISED(ID_PFR1_EL1),
-   ID_SANITISED(ID_DFR0_EL1),
+   AA32_ID_SANITISED(ID_PFR0_EL1),
+   AA32_ID_SANITISED(ID_PFR1_EL1),
+   AA32_ID_SANITISED(ID_DFR0_EL1),
ID_HIDDEN(ID_AFR0_EL1),
-   ID_SANITISED(ID_MMFR0_EL1),
-   ID_SANITISED(ID_MMFR1_EL1),
-   ID_SANITISED(ID_MMFR2_EL1),
-   ID_SANITISED(ID_MMFR3_EL1),
+   AA32_ID_SANITISED(ID_MMFR0_EL1),
+   AA32_ID_SANITISED(ID_MMFR1_EL1),
+   AA32_ID_SANITISED(ID_MMFR2_EL1),
+   AA32_ID_SANITISED(ID_MMFR3_EL1),
 
/* CRm=2 */
-   ID_SANITISED(ID_ISAR0_EL1),
-   ID_SANITISED(ID_ISAR1_EL1),
-   ID_SANITISED(ID_ISAR2_EL1),
-   ID_SANITISED(ID_ISAR3_EL1),
-   ID_SANITISED(ID_ISAR4_EL1),
-   ID_SANITISED(ID_ISAR5_EL1),
-   ID_SANITISED(ID_MMFR4_EL1),
-   ID_SANITISED(ID_ISAR6_EL1),
+   AA32_ID_SANITISED(ID_ISAR0_EL1),
+   AA32_ID_SANITISED(ID_ISAR1_EL1),
+   AA32_ID_SANITISED(ID_ISAR2_EL1),
+   AA32_ID_SANITISED(ID_ISAR3_EL1),
+   AA32_ID_SANITISED(ID_ISAR4_EL1),
+   AA32_ID_SANITISED(ID_ISAR5_EL1),
+   AA32_ID_SANITISED(ID_MMFR4_EL1),
+   AA32_ID_SANITISED(ID_ISAR6_EL1),
 
/* CRm=3 */
-   ID_SANITISED(MVFR0_EL1),
-   ID_SANITISED(MVFR1_EL1),
-   ID_SANITISED(MVFR2_EL1),
+   AA32_ID_SANITISED(MVFR0_EL1),
+   AA32_ID_SANITISED(MVFR1_EL1),
+   AA32_ID_SANITISED(MVFR2_EL1),
ID_UNALLOCATED(3,3),
-   ID_SANITISED(ID_PFR2_EL1),
+   AA32_ID_SANITISED(ID_PFR2_EL1),
ID_HIDDEN(ID_DFR1_EL1),
-   ID_SANITISED(ID_MMFR5_EL1),
+   AA32_ID_SANITISED(ID_MMFR5_EL1),
ID_UNALLOCATED(3,7),
 
/* AArch64 ID registers */
-- 
2.37.2.789.g6183377224-goog



[PATCH v2 5/7] KVM: arm64: Add a visibility bit to ignore user writes

2022-09-02 Thread Oliver Upton
We're about to ignore writes to AArch32 ID registers on AArch64-only
systems. Add a bit to indicate a register is handled as write ignore
when accessed from userspace.
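
For illustration, a register opts in through its visibility hook; patch
6 does exactly this for the AArch32 ID registers. A minimal sketch (the
hook name is a placeholder):

/* Reads are RAZ everywhere; userspace writes succeed but are discarded. */
static unsigned int example_visibility(const struct kvm_vcpu *vcpu,
				       const struct sys_reg_desc *r)
{
	return REG_RAZ | REG_USER_WI;
}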

Signed-off-by: Oliver Upton 
---
 arch/arm64/kvm/sys_regs.c | 3 +++
 arch/arm64/kvm/sys_regs.h | 7 +++
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0e20a311ea20..6d0511247df4 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2775,6 +2775,9 @@ int kvm_sys_reg_set_user(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg,
if (!r)
return -ENOENT;
 
+   if (sysreg_user_write_ignore(vcpu, r))
+   return 0;
+
if (r->set_user) {
ret = (r->set_user)(vcpu, r, val);
} else {
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index e78b51059622..e4ebb3a379fd 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -86,6 +86,7 @@ struct sys_reg_desc {
 
 #define REG_HIDDEN (1 << 0) /* hidden from userspace and guest */
 #define REG_RAZ (1 << 1) /* RAZ from userspace and guest */
+#define REG_USER_WI (1 << 2) /* WI from userspace only */
 
 static __printf(2, 3)
 inline void print_sys_reg_msg(const struct sys_reg_params *p,
@@ -157,6 +158,12 @@ static inline bool sysreg_visible_as_raz(const struct kvm_vcpu *vcpu,
return sysreg_visibility(vcpu, r) & REG_RAZ;
 }
 
+static inline bool sysreg_user_write_ignore(const struct kvm_vcpu *vcpu,
+   const struct sys_reg_desc *r)
+{
+   return sysreg_visibility(vcpu, r) & REG_USER_WI;
+}
+
 static inline int cmp_sys_reg(const struct sys_reg_desc *i1,
  const struct sys_reg_desc *i2)
 {
-- 
2.37.2.789.g6183377224-goog



[PATCH v2 4/7] KVM: arm64: Spin off helper for calling visibility hook

2022-09-02 Thread Oliver Upton
No functional change intended.

Reviewed-by: Reiji Watanabe 
Signed-off-by: Oliver Upton 
---
 arch/arm64/kvm/sys_regs.h | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index a8c4cc32eb9a..e78b51059622 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -136,22 +136,25 @@ static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r
__vcpu_sys_reg(vcpu, r->reg) = r->val;
 }
 
-static inline bool sysreg_hidden(const struct kvm_vcpu *vcpu,
-const struct sys_reg_desc *r)
+static inline unsigned int sysreg_visibility(const struct kvm_vcpu *vcpu,
+const struct sys_reg_desc *r)
 {
if (likely(!r->visibility))
-   return false;
+   return 0;
 
-   return r->visibility(vcpu, r) & REG_HIDDEN;
+   return r->visibility(vcpu, r);
+}
+
+static inline bool sysreg_hidden(const struct kvm_vcpu *vcpu,
+const struct sys_reg_desc *r)
+{
+   return sysreg_visibility(vcpu, r) & REG_HIDDEN;
 }
 
 static inline bool sysreg_visible_as_raz(const struct kvm_vcpu *vcpu,
 const struct sys_reg_desc *r)
 {
-   if (likely(!r->visibility))
-   return false;
-
-   return r->visibility(vcpu, r) & REG_RAZ;
+   return sysreg_visibility(vcpu, r) & REG_RAZ;
 }
 
 static inline int cmp_sys_reg(const struct sys_reg_desc *i1,
-- 
2.37.2.789.g6183377224-goog



[PATCH v2 3/7] KVM: arm64: Drop raz parameter from read_id_reg()

2022-09-02 Thread Oliver Upton
There is no longer a need for caller-specified RAZ visibility. Hoist the
call to sysreg_visible_as_raz() into read_id_reg() and drop the
parameter.

No functional change intended.

Suggested-by: Reiji Watanabe 
Signed-off-by: Oliver Upton 
---
 arch/arm64/kvm/sys_regs.c | 19 ++-
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 26210f3a0b27..0e20a311ea20 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1063,13 +1063,12 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu,
 }
 
 /* Read a sanitised cpufeature ID register by sys_reg_desc */
-static u64 read_id_reg(const struct kvm_vcpu *vcpu,
-   struct sys_reg_desc const *r, bool raz)
+static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
 {
u32 id = reg_to_encoding(r);
u64 val;
 
-   if (raz)
+   if (sysreg_visible_as_raz(vcpu, r))
return 0;
 
val = read_sanitised_ftr_reg(id);
@@ -1157,12 +1156,10 @@ static bool access_id_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
-   bool raz = sysreg_visible_as_raz(vcpu, r);
-
if (p->is_write)
return write_to_read_only(vcpu, p, r);
 
-   p->regval = read_id_reg(vcpu, r, raz);
+   p->regval = read_id_reg(vcpu, r);
return true;
 }
 
@@ -1199,7 +1196,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
return -EINVAL;
 
/* We can only differ with CSV[23], and anything else is an error */
-   val ^= read_id_reg(vcpu, rd, false);
+   val ^= read_id_reg(vcpu, rd);
val &= ~((0xFUL << ID_AA64PFR0_CSV2_SHIFT) |
 (0xFUL << ID_AA64PFR0_CSV3_SHIFT));
if (val)
@@ -1221,19 +1218,15 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
 static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
  u64 *val)
 {
-   bool raz = sysreg_visible_as_raz(vcpu, rd);
-
-   *val = read_id_reg(vcpu, rd, raz);
+   *val = read_id_reg(vcpu, rd);
return 0;
 }
 
 static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
  u64 val)
 {
-   bool raz = sysreg_visible_as_raz(vcpu, rd);
-
/* This is what we mean by invariant: you can't change it. */
-   if (val != read_id_reg(vcpu, rd, raz))
+   if (val != read_id_reg(vcpu, rd))
return -EINVAL;
 
return 0;
-- 
2.37.2.789.g6183377224-goog



[PATCH v2 1/7] KVM: arm64: Use visibility hook to treat ID regs as RAZ

2022-09-02 Thread Oliver Upton
The generic id reg accessors already handle RAZ registers by way of the
visibility hook. Add a visibility hook that returns REG_RAZ
unconditionally and throw out the RAZ specific accessors.

Reviewed-by: Reiji Watanabe 
Signed-off-by: Oliver Upton 
---
 arch/arm64/kvm/sys_regs.c | 33 ++---
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 3234f50b8c4b..e18efb9211f0 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1145,6 +1145,12 @@ static unsigned int id_visibility(const struct kvm_vcpu *vcpu,
return 0;
 }
 
+static unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
+  const struct sys_reg_desc *r)
+{
+   return REG_RAZ;
+}
+
 /* cpufeature ID register access trap handlers */
 
 static bool __access_id_reg(struct kvm_vcpu *vcpu,
@@ -1168,13 +1174,6 @@ static bool access_id_reg(struct kvm_vcpu *vcpu,
return __access_id_reg(vcpu, p, r, raz);
 }
 
-static bool access_raz_id_reg(struct kvm_vcpu *vcpu,
- struct sys_reg_params *p,
- const struct sys_reg_desc *r)
-{
-   return __access_id_reg(vcpu, p, r, true);
-}
-
 /* Visibility overrides for SVE-specific control registers */
 static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
   const struct sys_reg_desc *rd)
@@ -1262,12 +1261,6 @@ static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
return __set_id_reg(vcpu, rd, val, raz);
 }
 
-static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
- u64 val)
-{
-   return __set_id_reg(vcpu, rd, val, true);
-}
-
 static int get_raz_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
   u64 *val)
 {
@@ -1374,9 +1367,10 @@ static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
  */
 #define ID_UNALLOCATED(crm, op2) { \
Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2), \
-   .access = access_raz_id_reg,\
-   .get_user = get_raz_reg,\
-   .set_user = set_raz_id_reg, \
+   .access = access_id_reg,\
+   .get_user = get_id_reg, \
+   .set_user = set_id_reg, \
+   .visibility = raz_visibility\
 }
 
 /*
@@ -1386,9 +1380,10 @@ static unsigned int mte_visibility(const struct kvm_vcpu *vcpu,
  */
 #define ID_HIDDEN(name) {  \
SYS_DESC(SYS_##name),   \
-   .access = access_raz_id_reg,\
-   .get_user = get_raz_reg,\
-   .set_user = set_raz_id_reg, \
+   .access = access_id_reg,\
+   .get_user = get_id_reg, \
+   .set_user = set_id_reg, \
+   .visibility = raz_visibility,   \
 }
 
 /*
-- 
2.37.2.789.g6183377224-goog



[PATCH v2 2/7] KVM: arm64: Remove internal accessor helpers for id regs

2022-09-02 Thread Oliver Upton
The internal accessors are only ever called once. Dump out their
contents in the caller.

No functional change intended.

Signed-off-by: Oliver Upton 
---
 arch/arm64/kvm/sys_regs.c | 46 ++-
 1 file changed, 12 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index e18efb9211f0..26210f3a0b27 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1153,25 +1153,17 @@ static unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
 
 /* cpufeature ID register access trap handlers */
 
-static bool __access_id_reg(struct kvm_vcpu *vcpu,
-   struct sys_reg_params *p,
-   const struct sys_reg_desc *r,
-   bool raz)
-{
-   if (p->is_write)
-   return write_to_read_only(vcpu, p, r);
-
-   p->regval = read_id_reg(vcpu, r, raz);
-   return true;
-}
-
 static bool access_id_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
bool raz = sysreg_visible_as_raz(vcpu, r);
 
-   return __access_id_reg(vcpu, p, r, raz);
+   if (p->is_write)
+   return write_to_read_only(vcpu, p, r);
+
+   p->regval = read_id_reg(vcpu, r, raz);
+   return true;
 }
 
 /* Visibility overrides for SVE-specific control registers */
@@ -1226,31 +1218,13 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
  * are stored, and for set_id_reg() we don't allow the effective value
  * to be changed.
  */
-static int __get_id_reg(const struct kvm_vcpu *vcpu,
-   const struct sys_reg_desc *rd, u64 *val,
-   bool raz)
-{
-   *val = read_id_reg(vcpu, rd, raz);
-   return 0;
-}
-
-static int __set_id_reg(const struct kvm_vcpu *vcpu,
-   const struct sys_reg_desc *rd, u64 val,
-   bool raz)
-{
-   /* This is what we mean by invariant: you can't change it. */
-   if (val != read_id_reg(vcpu, rd, raz))
-   return -EINVAL;
-
-   return 0;
-}
-
 static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
  u64 *val)
 {
bool raz = sysreg_visible_as_raz(vcpu, rd);
 
-   return __get_id_reg(vcpu, rd, val, raz);
+   *val = read_id_reg(vcpu, rd, raz);
+   return 0;
 }
 
 static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
@@ -1258,7 +1232,11 @@ static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 {
bool raz = sysreg_visible_as_raz(vcpu, rd);
 
-   return __set_id_reg(vcpu, rd, val, raz);
+   /* This is what we mean by invariant: you can't change it. */
+   if (val != read_id_reg(vcpu, rd, raz))
+   return -EINVAL;
+
+   return 0;
 }
 
 static int get_raz_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
-- 
2.37.2.789.g6183377224-goog



[PATCH v2 0/7] KVM: arm64: Use visibility hook to treat ID regs as RAZ

2022-09-02 Thread Oliver Upton
For reasons unknown, the Arm architecture defines the 64-bit views of
the 32-bit ID registers as UNKNOWN [1]. This combines poorly with the
fact that KVM unconditionally exposes these registers to userspace,
which could throw a wrench in migration between 64-bit only systems.

This series reworks KVM's definition of these registers to RAZ/WI with
the goal of providing consistent register values across 64-bit machines.

Patches 1-3 clean up the ID register accessors, taking advantage of the
fact that the generic accessors know how to handle RAZ.

Patches 4-6 switch the handling of potentially nonzero AArch32 ID
registers to RAZ/WI. RAZ covers up the architecturally UNKNOWN values,
and WI allows for migration off of kernels that may provide garbage.
Note that hidden AArch32 ID registers continue to have RAZ behavior with
the additional expectation of invariance.

Lastly, patch 7 includes a small test for the issue.

Applies to 6.0-rc3. Tested with KVM selftests under the fast model w/
asymmetric 32 bit support and no 32 bit support whatsoever.

v1 -> v2:
 - Collect Reiji's r-b tags (thanks!)
 - Call sysreg_visible_as_raz() from read_id_reg() (Reiji)
 - Hoist sysreg_user_write_ignore() into kvm_sys_reg_set_user() (Reiji)

Oliver Upton (7):
  KVM: arm64: Use visibility hook to treat ID regs as RAZ
  KVM: arm64: Remove internal accessor helpers for id regs
  KVM: arm64: Drop raz parameter from read_id_reg()
  KVM: arm64: Spin off helper for calling visibility hook
  KVM: arm64: Add a visibility bit to ignore user writes
  KVM: arm64: Treat 32bit ID registers as RAZ/WI on 64bit-only system
  KVM: selftests: Add test for RAZ/WI AArch32 ID registers

 arch/arm64/kvm/sys_regs.c | 150 +-
 arch/arm64/kvm/sys_regs.h |  24 ++-
 tools/testing/selftests/kvm/.gitignore|   1 +
 tools/testing/selftests/kvm/Makefile  |   1 +
 .../kvm/aarch64/aarch64_only_id_regs.c| 135 
 5 files changed, 225 insertions(+), 86 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/aarch64/aarch64_only_id_regs.c


base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
-- 
2.37.2.789.g6183377224-goog



Re: [PATCH v3 6/7] KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled

2022-09-02 Thread Steven Price
On 02/09/2022 14:45, Catalin Marinas wrote:
> On Wed, Aug 10, 2022 at 12:30:32PM -0700, Peter Collingbourne wrote:
>> Certain VMMs such as crosvm have features (e.g. sandboxing) that depend
>> on being able to map guest memory as MAP_SHARED. The current restriction
>> on sharing MAP_SHARED pages with the guest is preventing the use of
>> those features with MTE. Now that the races between tasks concurrently
>> clearing tags on the same page have been fixed, remove this restriction.
>>
>> Signed-off-by: Peter Collingbourne 
>> ---
>>  arch/arm64/kvm/mmu.c | 8 
>>  1 file changed, 8 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index d54be80e31dd..fc65dc20655d 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -1075,14 +1075,6 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>>  
>>  static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
>>  {
>> -/*
>> - * VM_SHARED mappings are not allowed with MTE to avoid races
>> - * when updating the PG_mte_tagged page flag, see
>> - * sanitise_mte_tags for more details.
>> - */
>> -if (vma->vm_flags & VM_SHARED)
>> -return false;
> 
> I think this is fine with the locking in place (BTW, it may be worth
> mentioning in the commit message that it's a relaxation of the ABI). I'd
> like Steven to have a look as well when he gets the time, in case we
> missed anything on the KVM+MTE side.
> 
> Reviewed-by: Catalin Marinas 

Looks fine to me, and thanks for doing the work: I was never very
pleased with the !VM_SHARED restriction, but I couldn't figure a good
way of getting the locking to work.

Reviewed-by: Steven Price 


Re: [PATCH v3 5/7] KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled

2022-09-02 Thread Steven Price
On 10/08/2022 20:30, Peter Collingbourne wrote:
> Previously we allowed creating a memslot containing a private mapping that
> was not VM_MTE_ALLOWED, but would later reject KVM_RUN with -EFAULT. Now
> we reject the memory region at memslot creation time.
> 
> Since this is a minor tweak to the ABI (a VMM that created one of
> these memslots would fail later anyway), no VMM to my knowledge has
> MTE support yet, and the hardware with the necessary features is not
> generally available, we can probably make this ABI change at this point.
> 
> Signed-off-by: Peter Collingbourne 

Reviewed-by: Steven Price 

> ---
>  arch/arm64/kvm/mmu.c | 25 -
>  1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 750a69a97994..d54be80e31dd 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1073,6 +1073,19 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>   }
>  }
>  
> +static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
> +{
> + /*
> +  * VM_SHARED mappings are not allowed with MTE to avoid races
> +  * when updating the PG_mte_tagged page flag, see
> +  * sanitise_mte_tags for more details.
> +  */
> + if (vma->vm_flags & VM_SHARED)
> + return false;
> +
> + return vma->vm_flags & VM_MTE_ALLOWED;
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> struct kvm_memory_slot *memslot, unsigned long hva,
> unsigned long fault_status)
> @@ -1249,9 +1262,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   }
>  
>   if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
> - /* Check the VMM hasn't introduced a new VM_SHARED VMA */
> - if ((vma->vm_flags & VM_MTE_ALLOWED) &&
> - !(vma->vm_flags & VM_SHARED)) {
> + /* Check the VMM hasn't introduced a new disallowed VMA */
> + if (kvm_vma_mte_allowed(vma)) {
>   sanitise_mte_tags(kvm, pfn, vma_pagesize);
>   } else {
>   ret = -EFAULT;
> @@ -1695,12 +1707,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>   if (!vma)
>   break;
>  
> - /*
> -  * VM_SHARED mappings are not allowed with MTE to avoid races
> -  * when updating the PG_mte_tagged page flag, see
> -  * sanitise_mte_tags for more details.
> -  */
> - if (kvm_has_mte(kvm) && vma->vm_flags & VM_SHARED) {
> + if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) {
>   ret = -EINVAL;
>   break;
>   }



Re: [PATCH v3 4/7] arm64: mte: Lock a page for MTE tag initialisation

2022-09-02 Thread Steven Price
On 10/08/2022 20:30, Peter Collingbourne wrote:
> From: Catalin Marinas 
> 
> Initialising the tags and setting PG_mte_tagged flag for a page can race
> between multiple set_pte_at() on shared pages or setting the stage 2 pte
> via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and
> set it before attempting page initialisation. Given that PG_mte_tagged
> is never cleared for a page, consider setting this flag to mean page
> unlocked and wait on this bit with acquire semantics if the page is
> locked:
> 
> - try_page_mte_tagging() - lock the page for tagging, return true if it
>   can be tagged, false if already tagged. No acquire semantics if it
>   returns true (PG_mte_tagged not set) as there is no serialisation with
>   a previous set_page_mte_tagged().
> 
> - set_page_mte_tagged() - set PG_mte_tagged with release semantics.
> 
> The two-bit locking is based on Peter Collingbourne's idea.
> 
> Signed-off-by: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Marc Zyngier 
> Cc: Steven Price 
> Cc: Peter Collingbourne 
> ---
>  arch/arm64/include/asm/mte.h | 32 
>  arch/arm64/include/asm/pgtable.h |  1 +
>  arch/arm64/kernel/cpufeature.c   |  2 +-
>  arch/arm64/kernel/mte.c  |  7 +--
>  arch/arm64/kvm/guest.c   | 16 ++--
>  arch/arm64/kvm/mmu.c |  2 +-
>  arch/arm64/mm/copypage.c |  2 ++
>  arch/arm64/mm/fault.c|  2 ++
>  arch/arm64/mm/mteswap.c  |  3 +++
>  9 files changed, 57 insertions(+), 10 deletions(-)
> 
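
For illustration, the initialisation pattern this lock enables looks
like the following (a sketch mirroring the cpufeature.c and mteswap.c
hunks below; page_address() stands in for whatever mapping the caller
actually uses):

	if (try_page_mte_tagging(page)) {
		/* won PG_mte_lock: tags are not initialised yet */
		mte_clear_page_tags(page_address(page)); /* or restore from swap */
		set_page_mte_tagged(page); /* release: publish tags, unlock */
	}
	/* a losing thread returns false only once PG_mte_tagged is set */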
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 46618c575eac..ea5158f6f6cb 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -36,6 +36,8 @@ void mte_free_tag_storage(char *storage);
>  
>  /* track which pages have valid allocation tags */
>  #define PG_mte_tagged PG_arch_2
> +/* simple lock to avoid multiple threads tagging the same page */
> +#define PG_mte_lock  PG_arch_3
>  
>  static inline void set_page_mte_tagged(struct page *page)
>  {
> @@ -60,6 +62,32 @@ static inline bool page_mte_tagged(struct page *page)
>   return ret;
>  }
>  
> +/*
> + * Lock the page for tagging and return 'true' if the page can be tagged,
> + * 'false' if already tagged. PG_mte_tagged is never cleared and therefore the
> + * locking only happens once for page initialisation.
> + *
> + * The page MTE lock state:
> + *
> + *   Locked: PG_mte_lock && !PG_mte_tagged
> + *   Unlocked:   !PG_mte_lock || PG_mte_tagged
> + *
> + * Acquire semantics only if the page is tagged (returning 'false').
> + */
> +static inline bool try_page_mte_tagging(struct page *page)
> +{
> + if (!test_and_set_bit(PG_mte_lock, &page->flags))
> + return true;
> +
> + /*
> +  * The tags are either being initialised or have already been initialised,
> +  * wait for the PG_mte_tagged flag to be set.
> +  */
> + smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged));
> +
> + return false;
> +}
> +
>  void mte_zero_clear_page_tags(void *addr);
>  void mte_sync_tags(pte_t old_pte, pte_t pte);
>  void mte_copy_page_tags(void *kto, const void *kfrom);
> @@ -84,6 +112,10 @@ static inline bool page_mte_tagged(struct page *page)
>  {
>   return false;
>  }
> +static inline bool try_page_mte_tagging(struct page *page)
> +{
> + return false;
> +}
>  static inline void mte_zero_clear_page_tags(void *addr)
>  {
>  }
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 82719fa42c0e..e6b82ad1e9e6 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1049,6 +1049,7 @@ static inline void arch_swap_invalidate_area(int type)
>  #define __HAVE_ARCH_SWAP_RESTORE
>  static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>  {
> + /* mte_restore_tags() takes the PG_mte_lock */
>   if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
>   set_page_mte_tagged(&folio->page);
>  }
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 562c301bbf15..33d342ddef87 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2037,7 +2037,7 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
>* Clear the tags in the zero page. This needs to be done via the
>* linear map which has the Tagged attribute.
>*/
> - if (!page_mte_tagged(ZERO_PAGE(0))) {
> + if (try_page_mte_tagging(ZERO_PAGE(0))) {
>   mte_clear_page_tags(lm_alias(empty_zero_page));
>   set_page_mte_tagged(ZERO_PAGE(0));
>   }
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 2287316639f3..634e089b5933 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -41,6 +41,7 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
>   if 

Re: [PATCH v3 2/7] KVM: arm64: Simplify the sanitise_mte_tags() logic

2022-09-02 Thread Steven Price
On 10/08/2022 20:30, Peter Collingbourne wrote:
> From: Catalin Marinas 
> 
> Currently sanitise_mte_tags() checks if it's an online page before
> attempting to sanitise the tags. Such detection should be done in the
> caller via the VM_MTE_ALLOWED vma flag. Since kvm_set_spte_gfn() does
> not have the vma, leave the page unmapped if not already tagged. Tag
> initialisation will be done on a subsequent access fault in
> user_mem_abort().

Looks correct to me.

Reviewed-by: Steven Price 

> Signed-off-by: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Marc Zyngier 
> Cc: Steven Price 
> Cc: Peter Collingbourne 
> ---
>  arch/arm64/kvm/mmu.c | 40 +++-
>  1 file changed, 15 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c9012707f69c..1a3707aeb41f 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1056,23 +1056,14 @@ static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
>   * - mmap_lock protects between a VM faulting a page in and the VMM 
> performing
>   *   an mprotect() to add VM_MTE
>   */
> -static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> -  unsigned long size)
> +static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
> +   unsigned long size)
>  {
>   unsigned long i, nr_pages = size >> PAGE_SHIFT;
> - struct page *page;
> + struct page *page = pfn_to_page(pfn);
>  
>   if (!kvm_has_mte(kvm))
> - return 0;
> -
> - /*
> -  * pfn_to_online_page() is used to reject ZONE_DEVICE pages
> -  * that may not support tags.
> -  */
> - page = pfn_to_online_page(pfn);
> -
> - if (!page)
> - return -EFAULT;
> + return;
>  
>   for (i = 0; i < nr_pages; i++, page++) {
>   if (!page_mte_tagged(page)) {
> @@ -1080,8 +1071,6 @@ static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>   set_page_mte_tagged(page);
>   }
>   }
> -
> - return 0;
>  }
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> @@ -1092,7 +1081,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   bool write_fault, writable, force_pte = false;
>   bool exec_fault;
>   bool device = false;
> - bool shared;
>   unsigned long mmu_seq;
>   struct kvm *kvm = vcpu->kvm;
>   struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
> @@ -1142,8 +1130,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   vma_shift = get_vma_page_shift(vma, hva);
>   }
>  
> - shared = (vma->vm_flags & VM_SHARED);
> -
>   switch (vma_shift) {
>  #ifndef __PAGETABLE_PMD_FOLDED
>   case PUD_SHIFT:
> @@ -1264,12 +1250,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  
>   if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) {
>   /* Check the VMM hasn't introduced a new VM_SHARED VMA */
> - if (!shared)
> - ret = sanitise_mte_tags(kvm, pfn, vma_pagesize);
> - else
> + if ((vma->vm_flags & VM_MTE_ALLOWED) &&
> + !(vma->vm_flags & VM_SHARED)) {
> + sanitise_mte_tags(kvm, pfn, vma_pagesize);
> + } else {
>   ret = -EFAULT;
> - if (ret)
>   goto out_unlock;
> + }
>   }
>  
>   if (writable)
> @@ -1491,15 +1478,18 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
>  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>   kvm_pfn_t pfn = pte_pfn(range->pte);
> - int ret;
>  
>   if (!kvm->arch.mmu.pgt)
>   return false;
>  
>   WARN_ON(range->end - range->start != 1);
>  
> - ret = sanitise_mte_tags(kvm, pfn, PAGE_SIZE);
> - if (ret)
> + /*
> +  * If the page isn't tagged, defer to user_mem_abort() for sanitising
> +  * the MTE tags. The S2 pte should have been unmapped by
> +  * mmu_notifier_invalidate_range_end().
> +  */
> + if (kvm_has_mte(kvm) && !page_mte_tagged(pfn_to_page(pfn)))
>   return false;
>  
>   /*



Re: [PATCH v3 1/7] arm64: mte: Fix/clarify the PG_mte_tagged semantics

2022-09-02 Thread Steven Price
On 10/08/2022 20:30, Peter Collingbourne wrote:
> From: Catalin Marinas 
> 
> Currently the PG_mte_tagged page flag mostly means the page contains
> valid tags and it should be set after the tags have been cleared or
> restored. However, in mte_sync_tags() it is set before setting the tags
> to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
> shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on
> write in another thread can cause the new page to have stale tags.
> Similarly, tag reading via ptrace() can read stale tags of the
> PG_mte_tagged flag is set before actually clearing/restoring the tags.
> 
> Fix the PG_mte_tagged semantics so that it is only set after the tags
> have been cleared or restored. This is safe for swap restoring into a
> MAP_SHARED or CoW page since the core code takes the page lock. Add two
> functions to test and set the PG_mte_tagged flag with acquire and
> release semantics. The downside is that concurrent mprotect(PROT_MTE) on
> a MAP_SHARED page may cause tag loss. This is already the case for KVM
> guests if a VMM changes the page protection while the guest triggers a
> user_mem_abort().
> 
> Signed-off-by: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Marc Zyngier 
> Cc: Steven Price 
> Cc: Peter Collingbourne 

Reviewed-by: Steven Price 

> ---
> v3:
> - fix build with CONFIG_ARM64_MTE disabled
> 
>  arch/arm64/include/asm/mte.h | 30 ++
>  arch/arm64/include/asm/pgtable.h |  2 +-
>  arch/arm64/kernel/cpufeature.c   |  4 +++-
>  arch/arm64/kernel/elfcore.c  |  2 +-
>  arch/arm64/kernel/hibernate.c|  2 +-
>  arch/arm64/kernel/mte.c  | 12 +++-
>  arch/arm64/kvm/guest.c   |  4 ++--
>  arch/arm64/kvm/mmu.c |  4 ++--
>  arch/arm64/mm/copypage.c |  4 ++--
>  arch/arm64/mm/fault.c|  2 +-
>  arch/arm64/mm/mteswap.c  |  2 +-
>  11 files changed, 51 insertions(+), 17 deletions(-)
> 
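
The ordering contract between the two helpers added below can be
summarised as follows (a sketch, not part of the patch):

	/*
	 * initialiser                       reader (e.g. ptrace tag read)
	 * -----------                       -----------------------------
	 * write tags to the page
	 * set_page_mte_tagged(page);        if (page_mte_tagged(page))
	 *   // smp_wmb() + set_bit():         // test_bit() + smp_rmb():
	 *   // release                        // acquire
	 *                                     read tags -- never stale
	 */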
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index aa523591a44e..46618c575eac 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -37,6 +37,29 @@ void mte_free_tag_storage(char *storage);
>  /* track which pages have valid allocation tags */
>  #define PG_mte_tagged PG_arch_2
>  
> +static inline void set_page_mte_tagged(struct page *page)
> +{
> + /*
> +  * Ensure that the tags written prior to this function are visible
> +  * before the page flags update.
> +  */
> + smp_wmb();
> + set_bit(PG_mte_tagged, &page->flags);
> +}
> +
> +static inline bool page_mte_tagged(struct page *page)
> +{
> + bool ret = test_bit(PG_mte_tagged, &page->flags);
> +
> + /*
> +  * If the page is tagged, ensure ordering with a likely subsequent
> +  * read of the tags.
> +  */
> + if (ret)
> + smp_rmb();
> + return ret;
> +}
> +
>  void mte_zero_clear_page_tags(void *addr);
>  void mte_sync_tags(pte_t old_pte, pte_t pte);
>  void mte_copy_page_tags(void *kto, const void *kfrom);
> @@ -54,6 +77,13 @@ size_t mte_probe_user_range(const char __user *uaddr, size_t size);
>  /* unused if !CONFIG_ARM64_MTE, silence the compiler */
>  #define PG_mte_tagged 0
>  
> +static inline void set_page_mte_tagged(struct page *page)
> +{
> +}
> +static inline bool page_mte_tagged(struct page *page)
> +{
> + return false;
> +}
>  static inline void mte_zero_clear_page_tags(void *addr)
>  {
>  }
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index b5df82aa99e6..82719fa42c0e 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1050,7 +1050,7 @@ static inline void arch_swap_invalidate_area(int type)
>  static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
>  {
>   if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
> - set_bit(PG_mte_tagged, &folio->page.flags);
> + set_page_mte_tagged(&folio->page);
>  }
>  
>  #endif /* CONFIG_ARM64_MTE */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 907401e4fffb..562c301bbf15 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2037,8 +2037,10 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
>* Clear the tags in the zero page. This needs to be done via the
>* linear map which has the Tagged attribute.
>*/
> - if (!test_and_set_bit(PG_mte_tagged, &ZERO_PAGE(0)->flags))
> + if (!page_mte_tagged(ZERO_PAGE(0))) {
>   mte_clear_page_tags(lm_alias(empty_zero_page));
> + set_page_mte_tagged(ZERO_PAGE(0));
> + }
>  
>   kasan_init_hw_tags_cpu();
>  }
> diff --git a/arch/arm64/kernel/elfcore.c b/arch/arm64/kernel/elfcore.c
> index 98d67444a5b6..f91bb1572d22 100644
> --- a/arch/arm64/kernel/elfcore.c
> +++ b/arch/arm64/kernel/elfcore.c
> @@ -47,7 +47,7 @@ static int 

Re: [PATCH v3 0/7] KVM: arm64: permit MAP_SHARED mappings with MTE enabled

2022-09-02 Thread Catalin Marinas
On Wed, Aug 10, 2022 at 12:30:26PM -0700, Peter Collingbourne wrote:
> I rebased Catalin's series onto -next, addressed the issues that I
> identified in the review and added the proposed userspace enablement
> patches after the series.
> 
> [1] https://lore.kernel.org/all/20220705142619.4135905-1-catalin.mari...@arm.com/
> 
> Catalin Marinas (3):
>   arm64: mte: Fix/clarify the PG_mte_tagged semantics
>   KVM: arm64: Simplify the sanitise_mte_tags() logic
>   arm64: mte: Lock a page for MTE tag initialisation
> 
> Peter Collingbourne (4):
>   mm: Add PG_arch_3 page flag

BTW, I rebased these four patches on top of 6.0-rc3 and hopefully
addressed the review comments. I pushed them here:

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte-pg-flags

You may have rebased them already but just in case you haven't, feel
free to pick them up from the above branch.

-- 
Catalin


Re: [PATCH v3 7/7] Documentation: document the ABI changes for KVM_CAP_ARM_MTE

2022-09-02 Thread Catalin Marinas
On Wed, Aug 10, 2022 at 12:30:33PM -0700, Peter Collingbourne wrote:
> Document both the restriction on VM_MTE_ALLOWED mappings and
> the relaxation for shared mappings.
> 
> Signed-off-by: Peter Collingbourne 

Acked-by: Catalin Marinas 


Re: [PATCH v3 6/7] KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled

2022-09-02 Thread Catalin Marinas
On Wed, Aug 10, 2022 at 12:30:32PM -0700, Peter Collingbourne wrote:
> Certain VMMs such as crosvm have features (e.g. sandboxing) that depend
> on being able to map guest memory as MAP_SHARED. The current restriction
> on sharing MAP_SHARED pages with the guest is preventing the use of
> those features with MTE. Now that the races between tasks concurrently
> clearing tags on the same page have been fixed, remove this restriction.
> 
> Signed-off-by: Peter Collingbourne 
> ---
>  arch/arm64/kvm/mmu.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index d54be80e31dd..fc65dc20655d 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1075,14 +1075,6 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>  
>  static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
>  {
> - /*
> -  * VM_SHARED mappings are not allowed with MTE to avoid races
> -  * when updating the PG_mte_tagged page flag, see
> -  * sanitise_mte_tags for more details.
> -  */
> - if (vma->vm_flags & VM_SHARED)
> - return false;

I think this is fine with the locking in place (BTW, it may be worth
mentioning in the commit message that it's a relaxation of the ABI). I'd
like Steven to have a look as well when he gets the time, in case we
missed anything on the KVM+MTE side.

Reviewed-by: Catalin Marinas 


Re: [PATCH v3 5/7] KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled

2022-09-02 Thread Catalin Marinas
On Wed, Aug 10, 2022 at 12:30:31PM -0700, Peter Collingbourne wrote:
> Previously we allowed creating a memslot containing a private mapping that
> was not VM_MTE_ALLOWED, but would later reject KVM_RUN with -EFAULT. Now
> we reject the memory region at memslot creation time.
> 
> Since this is a minor tweak to the ABI (a VMM that created one of
> these memslots would fail later anyway), no VMM to my knowledge has
> MTE support yet, and the hardware with the necessary features is not
> generally available, we can probably make this ABI change at this point.

I don't think that's a noticeable ABI change. As you said, such VMs
would fail later anyway when trying to access such a page.

Reviewed-by: Catalin Marinas 


Re: [PATCH v3 1/7] arm64: mte: Fix/clarify the PG_mte_tagged semantics

2022-09-02 Thread Cornelia Huck
On Wed, Aug 10 2022, Peter Collingbourne  wrote:

> From: Catalin Marinas 
>
> Currently the PG_mte_tagged page flag mostly means the page contains
> valid tags and it should be set after the tags have been cleared or
> restored. However, in mte_sync_tags() it is set before setting the tags
> to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
> shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on
> write in another thread can cause the new page to have stale tags.
> Similarly, tag reading via ptrace() can read stale tags of the

s/of/if/

> PG_mte_tagged flag is set before actually clearing/restoring the tags.
>
> Fix the PG_mte_tagged semantics so that it is only set after the tags
> have been cleared or restored. This is safe for swap restoring into a
> MAP_SHARED or CoW page since the core code takes the page lock. Add two
> functions to test and set the PG_mte_tagged flag with acquire and
> release semantics. The downside is that concurrent mprotect(PROT_MTE) on
> a MAP_SHARED page may cause tag loss. This is already the case for KVM
> guests if a VMM changes the page protection while the guest triggers a
> user_mem_abort().
>
> Signed-off-by: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Marc Zyngier 
> Cc: Steven Price 
> Cc: Peter Collingbourne 
> ---
> v3:
> - fix build with CONFIG_ARM64_MTE disabled
>
>  arch/arm64/include/asm/mte.h | 30 ++
>  arch/arm64/include/asm/pgtable.h |  2 +-
>  arch/arm64/kernel/cpufeature.c   |  4 +++-
>  arch/arm64/kernel/elfcore.c  |  2 +-
>  arch/arm64/kernel/hibernate.c|  2 +-
>  arch/arm64/kernel/mte.c  | 12 +++-
>  arch/arm64/kvm/guest.c   |  4 ++--
>  arch/arm64/kvm/mmu.c |  4 ++--
>  arch/arm64/mm/copypage.c |  4 ++--
>  arch/arm64/mm/fault.c|  2 +-
>  arch/arm64/mm/mteswap.c  |  2 +-
>  11 files changed, 51 insertions(+), 17 deletions(-)

Reviewed-by: Cornelia Huck 
