Re: [PATCH kvm-unit-tests] MAINTAINERS: Change drew's email address

2022-06-23 Thread Alex Bennée

Andrew Jones  writes:

> As a side effect of leaving Red Hat I won't be able to use my Red Hat
> email address anymore. I'm also changing the name of my gitlab group.
>
> Signed-off-by: Andrew Jones 
> Signed-off-by: Andrew Jones 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v3 0/3] GIC ITS tests

2022-02-01 Thread Alex Bennée

Andrew Jones  writes:

> On Tue, Nov 30, 2021 at 02:11:34PM +0000, Alex Bennée wrote:
>> 
>> Andrew Jones  writes:
>> 
>> > On Fri, Nov 19, 2021 at 04:30:47PM +0000, Alex Bennée wrote:
>> >> 
>> >> Andrew Jones  writes:
>> >> 
>> >> > On Fri, Nov 12, 2021 at 02:08:01PM +0000, Alex Bennée wrote:
>> >> >> 
>> >> >> Andrew Jones  writes:
>> >> >> 
>> >> >> > On Fri, Nov 12, 2021 at 11:47:31AM +0000, Alex Bennée wrote:
>> >> >> >> Hi,
>> >> >> >> 
>> >> >> >> Sorry this has been sitting in my tree so long. The changes are 
>> >> >> >> fairly
>> >> >> >> minor from v2. I no longer split the tests up into TCG and KVM
>> >> >> >> versions and instead just ensure that ERRATA_FORCE is always set 
>> >> >> >> when
>> >> >> >> run under TCG.
>> >> >> >> 
>> >> >> >> Alex Bennée (3):
>> >> >> >>   arm64: remove invalid check from its-trigger test
>> >> >> >>   arm64: enable its-migration tests for TCG
>> >> >> >>   arch-run: do not process ERRATA when running under TCG
>> >> >> >> 
>> >> >> >>  scripts/arch-run.bash |  4 +++-
>> >> >> >>  arm/gic.c | 16 ++--
>> >> >> >>  arm/unittests.cfg |  3 ---
>> >> >> >>  3 files changed, 9 insertions(+), 14 deletions(-)
>> >> >> >> 
>> >> >> >> -- 
>> >> >> >> 2.30.2
>> >> >> >> 
>> >> >> >
>> >> >> > Hi Alex,
>> >> >> >
>> >> >> > Thanks for this. I've applied to arm/queue, but I see that
>> >> >> >
>> >> >> > FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 pending LPI is 
>> >> >> > received
>> >> >> >
>> >> >> > consistently fails for me. Is that expected? Does it work for you?
>> >> >> 
>> >> >> doh - looks like I cocked up the merge conflict...
>> >> >> 
>> >> >> Did it fail for TCG or for KVM (or both)?
>> >> >
>> >> > Just TCG, which was why I was wondering if it was expected. I've never 
>> >> > run
>> >> > these tests with TCG before.
>> >> 
>> >> Hmm I think expecting the IRQ at all is broken so I think I should
>> >> delete the whole pending test.
>> >
>> > Feel free to repost. I'll update the patches in arm/queue before my next
>> > MR.
>> 
>> Actually I think the problem was with a regression in the TCG ITS
>> support (now fixed in master). So I believe as of v3 everything is
>> correct (and v4 should be ignored).
>> 
>> Are you happy to apply this series or do you want me to repost it as v5?
>
> No need to repost. I'll retest v3 with latest QEMU.

Gentle ping, I'm trying to clear this off my internal JIRA so let me
know if you want me to do anything to help.

>
> Thanks,
> drew


-- 
Alex Bennée


Configuring the host GIC for guest to guest IPI

2021-12-13 Thread Alex Bennée

Hi GIC experts,

This came up last week in the Stratos sync call when we were discussing
Vincent's SCMI setup:

  https://linaro.atlassian.net/wiki/spaces/STR/pages/28665741503/2021-12-09+Project+Stratos+Meeting+Notes

With the shared memory between the two guests, the only reason we
currently exit to the guest userspace (QEMU) is to forward notifications
of virtqueue kicks from one guest to the other. I'm vaguely aware from
my time looking at GIC code that it can be configured for IPI IRQs
between two cores. Do we have that ability between two vCPUs from
different guests?

If not, what would it take to enable such a feature?

-- 
Alex Bennée


Re: [kvm-unit-tests PATCH v9 0/9] MTTCG sanity tests for ARM

2021-12-10 Thread Alex Bennée

Alex Bennée  writes:

> Hi,
>
> Not a great deal has changed from the last posting although I have
> dropped the additional unittests.cfg in favour of setting "nodefault"
> for the tests. Otherwise the clean-ups are mainly textual (removing
> printfs, random newlines and cleaning up comments). As usual the
> details are in the commits below the ---.
>
> I've also tweaked .git/config so get_maintainer.pl should ensure
> direct delivery of the patches ;-)

Gentle ping...

-- 
Alex Bennée


[kvm-unit-tests PATCH v9 9/9] arm/tcg-test: some basic TCG exercising tests

2021-12-02 Thread Alex Bennée
These tests are not really aimed at KVM at all but exist to stretch
QEMU's TCG code generator. In particular these exercise the ability of
the TCG to:

  * Chain TranslationBlocks together (tight)
  * Handle heavy usage of the tb_jump_cache (paged)
  * Pathological case of computed local jumps (computed)

In addition the tests can be varied by adding IPI IRQs or SMC sequences
into the mix to stress the tcg_exit and invalidation mechanisms.

To explicitly stress the tb_flush() mechanism you can use the mod/rounds
parameters to force more frequent tb invalidation, combined with setting
-tb-size 1 in QEMU to limit the code generation buffer size.
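
A sketch of how this might be driven (the -append keywords here are
assumptions based on the description above; the unittests.cfg entries
added by this patch carry the canonical invocations):

  ACCEL=tcg ./arm/run arm/tcg-test.flat -smp 4 \
      -append 'tight rounds=100000 mod=16'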

Signed-off-by: Alex Bennée 
Message-Id: <2028184650.661575-11-alex.ben...@linaro.org>

---
v9
  - moved back to unittests.cfg
  - fixed some missing accel tags
  - s/printf/report_info/
---
 arm/Makefile.arm |   2 +
 arm/Makefile.arm64   |   2 +
 arm/Makefile.common  |   1 +
 arm/tcg-test-asm.S   | 171 ++
 arm/tcg-test-asm64.S | 170 ++
 arm/tcg-test.c   | 338 +++
 arm/unittests.cfg|  84 +++
 7 files changed, 768 insertions(+)
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/tcg-test.c

diff --git a/arm/Makefile.arm b/arm/Makefile.arm
index 3a4cc6b2..05e47f10 100644
--- a/arm/Makefile.arm
+++ b/arm/Makefile.arm
@@ -31,4 +31,6 @@ tests =
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm.o
+
 arch_clean: arm_clean
diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index e8a38d78..ac94f8ed 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -34,5 +34,7 @@ tests += $(TEST_DIR)/cache.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm64.o
+
 arch_clean: arm_clean
$(RM) lib/arm64/.*.d
diff --git a/arm/Makefile.common b/arm/Makefile.common
index 861e5c7f..abb69489 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -14,6 +14,7 @@ tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
 tests-common += $(TEST_DIR)/barrier-litmus-test.flat
+tests-common += $(TEST_DIR)/tcg-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tcg-test-asm.S b/arm/tcg-test-asm.S
new file mode 100644
index ..f58fac08
--- /dev/null
+++ b/arm/tcg-test-asm.S
@@ -0,0 +1,171 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * TCG Test assembler functions for armv7 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée 
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - r0, iterations
+ *  - r1, jump pattern
+ *  - r2-r3, scratch
+ *
+ * Returns r0
+ */
+
+.arm
+
+.section .text
+
+/*
+ * Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+subs r0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tightA
+b   tight_start
+
+tightA:
+subs r0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tightB
+b   tight_start
+
+tightB:
+subs r0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tight_start
+b   tightA
+
+.global tight_end
+tight_end:
+mov pc, lr
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will either cause an exit for the main execution loop or trigger an
+ * inline look up for the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+/* .align 13 == 8192 byte alignment */
+.align 13
+.global computed_start
+computed_start:
+subs r0, r0, #1
+beq computed_end
+
+/* Jump table */
+ror r1, r1, #1
+and r2, r1, #1
+adr r3, computed_jump_table
+ldr r2, [r3, r2, lsl #2]
+mov pc, r2
+
+b   computed_err
+
+computed_jump_table:
+.word   computed_start
+.word   computedA
+
+computedA:
+subs r0, r0, #1
+beq computed_end
+
+/* Jump into code */
+ror r1, r1, #1
+and r2, r1, #1
+adr r3, 1f
+add r3, r2, lsl #2
+mov pc, r3
+1:  b   computed_start
+b   computedB

[kvm-unit-tests PATCH v9 6/9] arm/locking-tests: add comprehensive locking test

2021-12-02 Thread Alex Bennée
This test has been written mainly to stress multi-threaded TCG behaviour
but will demonstrate failure by default on real hardware. The test takes
the following parameters:

  - "lock" use GCC's locking semantics
  - "atomic" use GCC's __atomic primitives
  - "wfelock" use WaitForEvent sleep
  - "excl" use load/store exclusive semantics

Also, two more options allow the test to be tweaked:

  - "noshuffle" disables the memory shuffling
  - "count=%ld" set your own per-CPU increment count

Signed-off-by: Alex Bennée 
Message-Id: <2028184650.661575-8-alex.ben...@linaro.org>

---
v9
  - move back to unittests.cfg, drop accel=tcg
  - s/printf/report_info
---
 arm/Makefile.common |   2 +-
 arm/locking-test.c  | 322 
 arm/spinlock-test.c |  87 
 arm/unittests.cfg   |  30 +
 4 files changed, 353 insertions(+), 88 deletions(-)
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index e3f04f2d..f9059718 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -5,7 +5,6 @@
 #
 
 tests-common  = $(TEST_DIR)/selftest.flat
-tests-common += $(TEST_DIR)/spinlock-test.flat
 tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/pmu.flat
 tests-common += $(TEST_DIR)/gic.flat
@@ -13,6 +12,7 @@ tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
+tests-common += $(TEST_DIR)/locking-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/locking-test.c b/arm/locking-test.c
new file mode 100644
index ..93d9250c
--- /dev/null
+++ b/arm/locking-test.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Locking Test
+ *
+ * This test allows us to stress the various atomic primitives of a VM
+ * guest. A number of methods are available that use various patterns
+ * to implement a lock.
+ *
+ * Copyright (C) 2017 Linaro
+ * Author: Alex Bennée 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define MAX_CPUS 8
+
+/* Test definition structure
+ *
+ * A simple structure that describes the test name, expected pass and
+ * increment function.
+ */
+
+/* Function pointers for test */
+typedef void (*inc_fn)(int cpu);
+
+typedef struct {
+   const char *test_name;
+   bool  should_pass;
+   inc_fn main_fn;
+} test_descr_t;
+
+/* How many increments to do */
+static int increment_count = 100;
+static bool do_shuffle = true;
+
+/* Shared value all the tests attempt to safely increment using
+ * various forms of atomic locking and exclusive behaviour.
+ */
+static unsigned int shared_value;
+
+/* PAGE_SIZE * uint32_t means we span several pages */
+__attribute__((aligned(PAGE_SIZE))) static uint32_t memory_array[PAGE_SIZE];
+
+/* We use the alignment of the following to ensure accesses to locking
+ * and synchronisation primitives don't interfere with the page of the
+ * shared value
+ */
+__attribute__((aligned(PAGE_SIZE))) static unsigned int per_cpu_value[MAX_CPUS];
+__attribute__((aligned(PAGE_SIZE))) static cpumask_t smp_test_complete;
+__attribute__((aligned(PAGE_SIZE))) struct isaac_ctx prng_context[MAX_CPUS];
+
+/* Some of the approaches use a global lock to prevent contention. */
+static int global_lock;
+
+/* In any SMP setting this *should* fail due to cores stepping on
+ * each other updating the shared variable
+ */
+static void increment_shared(int cpu)
+{
+   (void)cpu;
+
+   shared_value++;
+}
+
+/* GCC __sync primitives are deprecated in favour of __atomic */
+static void increment_shared_with_lock(int cpu)
+{
+   (void)cpu;
+
+   while (__sync_lock_test_and_set(&global_lock, 1));
+
+   shared_value++;
+
+   __sync_lock_release(&global_lock);
+}
+
+/*
+ * In practice even __ATOMIC_RELAXED uses ARM's ldxr/stxr exclusive
+ * semantics
+ */
+static void increment_shared_with_atomic(int cpu)
+{
+   (void)cpu;
+
+   __atomic_add_fetch(&shared_value, 1, __ATOMIC_SEQ_CST);
+}
+
+
+/*
+ * Load/store exclusive with WFE (wait-for-event)
+ *
+ * See ARMv8 ARM examples:
+ *   Use of Wait For Event (WFE) and Send Event (SEV) with locks
+ */
+
+static void increment_shared_with_wfelock(int cpu)
+{
+   (void)cpu;
+
+#if defined(__aarch64__)
+   asm volatile(
+   "   mov w1, #1\n"
+   "   sevl\n"
+   "   prfm PSTL1KEEP, [%[lock]]\n"
+   "1: wfe\n"
+   "   ldaxr   w0, [%[lock]]\n"
+   "   cbnz    w0, 1b\n"
+   "   stxr    w0, w1, [%[lock]]\n"
+

[kvm-unit-tests PATCH v9 7/9] arm/barrier-litmus-tests: add simple mp and sal litmus tests

2021-12-02 Thread Alex Bennée
This adds a framework for adding simple barrier litmus tests against
ARM. The litmus tests aren't as comprehensive as the academic exercises
which will attempt to do all sorts of things to keep racing CPUs synced
up. These tests do honour the "sync" parameter to do a poor man's
equivalent.

The two litmus tests are:
  - message passing
  - store-after-load

They both have cases that should fail (although won't on single-threaded
TCG setups). If barriers aren't working properly the store-after-load
test will fail even on an x86 backend as x86 allows re-ordering of
non-aliased stores.

I've imported a few more of the barrier primitives from the Linux source
tree so we consistently use macros.

The arm64 barrier primitives trip up on -Wstrict-aliasing so this is
disabled in the Makefile.
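
For reference, a minimal sketch of the message-passing shape these
tests encode, using the smp_wmb()/smp_rmb() and READ_ONCE/WRITE_ONCE
helpers this series relies on (the variable and function names here
are illustrative, not lifted from the test itself):

  static int data, flag;

  static void mp_writer(void)         /* CPU 0 */
  {
          data = 42;
          smp_wmb();                  /* order data store before flag store */
          WRITE_ONCE(flag, 1);
  }

  static int mp_reader(void)          /* CPU 1 */
  {
          while (!READ_ONCE(flag))
                  ;
          smp_rmb();                  /* order flag load before data load */
          return data;                /* a stale 0 here means reordering */
  }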

Signed-off-by: Alex Bennée 
CC: Will Deacon 
Message-Id: <2028184650.661575-9-alex.ben...@linaro.org>

---
v9
  - return to unittests.cfg, drop accel=tcg
  - use compiler.h for barriers instead of defining them ourselves
  - s/printf/report_info/
---
 arm/Makefile.common   |   1 +
 lib/arm/asm/barrier.h |  19 ++
 lib/arm64/asm/barrier.h   |  50 +
 arm/barrier-litmus-test.c | 450 ++
 arm/unittests.cfg |  31 +++
 5 files changed, 551 insertions(+)
 create mode 100644 arm/barrier-litmus-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index f9059718..861e5c7f 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -13,6 +13,7 @@ tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
+tests-common += $(TEST_DIR)/barrier-litmus-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/lib/arm/asm/barrier.h b/lib/arm/asm/barrier.h
index 7f868314..0f3670b8 100644
--- a/lib/arm/asm/barrier.h
+++ b/lib/arm/asm/barrier.h
@@ -8,6 +8,9 @@
  * This work is licensed under the terms of the GNU GPL, version 2.
  */
 
+#include 
+#include 
+
 #define sev()  asm volatile("sev" : : : "memory")
 #define wfe()  asm volatile("wfe" : : : "memory")
 #define wfi()  asm volatile("wfi" : : : "memory")
@@ -25,4 +28,20 @@
 #define smp_rmb()  smp_mb()
 #define smp_wmb()  dmb(ishst)
 
+extern void abort(void);
+
+#define smp_store_release(p, v) \
+do {   \
+   smp_mb();   \
+   WRITE_ONCE(*p, v);  \
+} while (0)
+
+
+#define smp_load_acquire(p)\
+({ \
+   typeof(*p) ___p1 = READ_ONCE(*p);   \
+   smp_mb();   \
+   ___p1;  \
+})
+
 #endif /* _ASMARM_BARRIER_H_ */
diff --git a/lib/arm64/asm/barrier.h b/lib/arm64/asm/barrier.h
index 0e1904cf..5e405190 100644
--- a/lib/arm64/asm/barrier.h
+++ b/lib/arm64/asm/barrier.h
@@ -24,4 +24,54 @@
 #define smp_rmb()  dmb(ishld)
 #define smp_wmb()  dmb(ishst)
 
+#define smp_store_release(p, v) \
+do {   \
+   switch (sizeof(*p)) {   \
+   case 1: \
+   asm volatile ("stlrb %w1, %0"   \
+   : "=Q" (*p) : "r" (v) : "memory");  \
+   break;  \
+   case 2: \
+   asm volatile ("stlrh %w1, %0"   \
+   : "=Q" (*p) : "r" (v) : "memory");  \
+   break;  \
+   case 4: \
+   asm volatile ("stlr %w1, %0"\
+   : "=Q" (*p) : "r" (v) : "memory");  \
+   break;  \
+   case 8: \
+   asm volatile ("stlr %1, %0" \
+   : "=Q" (*p) : &q

[kvm-unit-tests PATCH v9 5/9] arm/tlbflush-code: TLB flush during code execution

2021-12-02 Thread Alex Bennée
This adds a fairly brain dead torture test for TLB flushes intended
for stressing the MTTCG QEMU build. It takes the usual -smp option for
multiple CPUs.

By default CPU0 will do a TLBIALL flush after each cycle. You can
pass options via -append to control additional aspects of the test:

  - "page" flush each page in turn (one per function)
  - "self" do the flush after each computation cycle
  - "verbose" report progress on each computation cycle

Signed-off-by: Alex Bennée 
CC: Mark Rutland 
Message-Id: <2028184650.661575-7-alex.ben...@linaro.org>

---
v9
  - move tests back into unittests.cfg (with nodefault mttcg)
  - replace printf with report_info
  - drop accel = tcg
---
 arm/Makefile.common |   1 +
 arm/tlbflush-code.c | 209 
 arm/unittests.cfg   |  25 ++
 3 files changed, 235 insertions(+)
 create mode 100644 arm/tlbflush-code.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 99bcf3fc..e3f04f2d 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -12,6 +12,7 @@ tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
+tests-common += $(TEST_DIR)/tlbflush-code.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tlbflush-code.c b/arm/tlbflush-code.c
new file mode 100644
index ..bf9eb111
--- /dev/null
+++ b/arm/tlbflush-code.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * TLB Flush Race Tests
+ *
+ * These tests are designed to test for incorrect TLB flush semantics
+ * under emulation. The initial CPU will set all the others working on a
+ * computation task and will then trigger TLB flushes across the
+ * system. It doesn't actually need to re-map anything but the flushes
+ * themselves will trigger QEMU's TCG self-modifying code detection
+ * which will invalidate any generated code causing re-translation.
+ * Eventually the code buffer will fill and a general tb_flush() will
+ * be triggered.
+ *
+ * Copyright (C) 2016-2021, Linaro, Alex Bennée 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SEQ_LENGTH 10
+#define SEQ_HASH 0x7cd707fe
+
+static cpumask_t smp_test_complete;
+static int flush_count = 100;
+static bool flush_self;
+static bool flush_page;
+static bool flush_verbose;
+
+/*
+ * Work functions
+ *
+ * These work functions need to be:
+ *
+ *  - page aligned, so we can flush one function at a time
+ *  - have branches, so QEMU TCG generates multiple basic blocks
+ *  - call across pages, so we exercise the TCG basic block slow path
+ */
+
+/* Adler32 */
+__attribute__((aligned(PAGE_SIZE))) static
+uint32_t hash_array(const void *buf, size_t buflen)
+{
+   const uint8_t *data = (uint8_t *) buf;
+   uint32_t s1 = 1;
+   uint32_t s2 = 0;
+
+   for (size_t n = 0; n < buflen; n++) {
+   s1 = (s1 + data[n]) % 65521;
+   s2 = (s2 + s1) % 65521;
+   }
+   return (s2 << 16) | s1;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void create_fib_sequence(int length, unsigned int *array)
+{
+   int i;
+
+   /* first two values */
+   array[0] = 0;
+   array[1] = 1;
+   for (i = 2; i < length; i++)
+   array[i] = array[i-2] + array[i-1];
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned long long factorial(unsigned int n)
+{
+   unsigned int i;
+   unsigned long long fac = 1;
+
+   for (i = 1; i <= n; i++)
+   fac = fac * i;
+   return fac;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void factorial_array(unsigned int n, unsigned int *input,
+unsigned long long *output)
+{
+   unsigned int i;
+
+   for (i = 0; i < n; i++)
+   output[i] = factorial(input[i]);
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned int do_computation(void)
+{
+   unsigned int fib_array[SEQ_LENGTH];
+   unsigned long long facfib_array[SEQ_LENGTH];
+   uint32_t fib_hash, facfib_hash;
+
+   create_fib_sequence(SEQ_LENGTH, &fib_array[0]);
+   fib_hash = hash_array(&fib_array[0], sizeof(fib_array));
+   factorial_array(SEQ_LENGTH, &fib_array[0], &facfib_array[0]);
+   facfib_hash = hash_array(&facfib_array[0], sizeof(facfib_array));
+
+   return (fib_hash ^ facfib_hash);
+}
+
+/* This provides a table of the work functions so we can flush each
+ * page individually
+ */
+static void *pages[] = {&hash_array, &create_fib_sequence, &factorial,
+   &factorial_array, &do_computation};
+
+static void do_flush(int i)
+{
+   if (flush_page)
+   flush_tlb_page((unsigned long)pages[i % ARRAY_SIZE(pages)]);
+   else
+   flush_tlb_all();
+}
+
+
+static void just_compute(void)
+{
+   int i, errors = 0;
+   int cpu = smp_processor_id();
+

[kvm-unit-tests PATCH v9 8/9] arm/run: use separate --accel form

2021-12-02 Thread Alex Bennée
This will allow TCG tests to alter things such as tb-size.
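
For example (illustrative), accelerator properties can then be passed
straight through:

  ACCEL='tcg,tb-size=1' ./arm/run arm/tcg-test.flat -smp 4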

Signed-off-by: Alex Bennée 
Message-Id: <2028184650.661575-10-alex.ben...@linaro.org>
---
 arm/run | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arm/run b/arm/run
index a390ca5a..73c6c83a 100755
--- a/arm/run
+++ b/arm/run
@@ -58,8 +58,8 @@ if $qemu $M -device '?' 2>&1 | grep pci-testdev > /dev/null; then
pci_testdev="-device pci-testdev"
 fi
 
-M+=",accel=$ACCEL"
-command="$qemu -nodefaults $M -cpu $processor $chr_testdev $pci_testdev"
+A="-accel $ACCEL"
+command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev"
 command+=" -display none -serial stdio -kernel"
 command="$(migration_cmd) $(timeout_cmd) $command"
 
-- 
2.30.2



[kvm-unit-tests PATCH v9 4/9] lib: add isaac prng library from CCAN

2021-12-02 Thread Alex Bennée
It's often useful to introduce some sort of random variation when
testing several racing CPU conditions. Instead of each test implementing
some half-arsed PRNG bring in a decent one which has good statistical
randomness. Obviously it is deterministic for a given seed value, which
is likely the behaviour you want.

I've pulled in the ISAAC library from CCAN:

http://ccodearchive.net/info/isaac.html

I shaved off the float related stuff which is less useful for unit
testing and re-indented to fit the style. The original license was
CC0 (Public Domain) which is compatible with the LGPL v2 of
kvm-unit-tests.
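
As a sketch of the intended use (the API is declared in lib/prng.h
below; the calling functions and seed are made up for illustration):

  #include "prng.h"

  static isaac_ctx ctx;

  static void delay_setup(void)
  {
          /* deterministic for a given seed, as noted above */
          isaac_init(&ctx, (const unsigned char *)"my-seed", 7);
  }

  static uint32_t pick_delay(void)
  {
          /* uniformly distributed integer in [0, 100) */
          return isaac_next_uint(&ctx, 100);
  }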

Signed-off-by: Alex Bennée 
CC: Timothy B. Terriberry 
Acked-by: Andrew Jones 
Message-Id: <2028184650.661575-6-alex.ben...@linaro.org>
---
 arm/Makefile.common |   1 +
 lib/prng.h  |  82 ++
 lib/prng.c  | 162 
 3 files changed, 245 insertions(+)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 38385e0c..99bcf3fc 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -44,6 +44,7 @@ cflatobjs += lib/pci-testdev.o
 cflatobjs += lib/virtio.o
 cflatobjs += lib/virtio-mmio.o
 cflatobjs += lib/chr-testdev.o
+cflatobjs += lib/prng.o
 cflatobjs += lib/arm/io.o
 cflatobjs += lib/arm/setup.o
 cflatobjs += lib/arm/mmu.o
diff --git a/lib/prng.h b/lib/prng.h
new file mode 100644
index ..bf5776d1
--- /dev/null
+++ b/lib/prng.h
@@ -0,0 +1,82 @@
+/*
+ * PRNG Header
+ */
+#ifndef __PRNG_H__
+#define __PRNG_H__
+
+# include 
+
+
+
+typedef struct isaac_ctx isaac_ctx;
+
+
+
+/*This value may be lowered to reduce memory usage on embedded platforms, at
+  the cost of reducing security and increasing bias.
+  Quoting Bob Jenkins: "The current best guess is that bias is detectable after
+  2**37 values for [ISAAC_SZ_LOG]=3, 2**45 for 4, 2**53 for 5, 2**61 for 6,
+  2**69 for 7, and 2**77 values for [ISAAC_SZ_LOG]=8."*/
+#define ISAAC_SZ_LOG  (8)
+#define ISAAC_SZ  (1<<ISAAC_SZ_LOG)
+#define ISAAC_SEED_SZ_MAX (ISAAC_SZ<<2)
+
+/*ISAAC is the most advanced of a series of pseudo-random number generators
+  designed by Robert J. Jenkins Jr. in 1996.
+  http://www.burtleburtle.net/bob/rand/isaac.html
+  To quote:
+  No efficient method is known for deducing their internal states.
+  ISAAC requires an amortized 18.75 instructions to produce a 32-bit value.
+  There are no cycles in ISAAC shorter than 2**40 values.
+  The expected cycle length is 2**8295 values.*/
+struct isaac_ctx{
+   unsigned n;
+   uint32_t r[ISAAC_SZ];
+   uint32_t m[ISAAC_SZ];
+   uint32_t a;
+   uint32_t b;
+   uint32_t c;
+};
+
+
+/**
+ * isaac_init - Initialize an instance of the ISAAC random number generator.
+ * @_ctx:   The instance to initialize.
+ * @_seed:  The specified seed bytes.
+ *  This may be NULL if _nseed is less than or equal to zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *  If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *   ignored.
+ */
+void isaac_init(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+
+/**
+ * isaac_reseed - Mix a new batch of entropy into the current state.
+ * To reset ISAAC to a known state, call isaac_init() again instead.
+ * @_ctx:   The instance to reseed.
+ * @_seed:  The specified seed bytes.
+ *  This may be NULL if _nseed is zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *  If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *   ignored.
+ */
+void isaac_reseed(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+/**
+ * isaac_next_uint32 - Return the next random 32-bit value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ */
+uint32_t isaac_next_uint32(isaac_ctx *_ctx);
+/**
+ * isaac_next_uint - Uniform random integer less than the given value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ * @_n:   The upper bound on the range of numbers returned (not inclusive).
+ *This must be greater than zero and less than 2**32.
+ *To return integers in the full range 0...2**32-1, use
+ * isaac_next_uint32() instead.
+ * Return: An integer uniformly distributed between 0 and _n-1 (inclusive).
+ */
+uint32_t isaac_next_uint(isaac_ctx *_ctx,uint32_t _n);
+
+#endif
diff --git a/lib/prng.c b/lib/prng.c
new file mode 100644
index ..ebd6df75
--- /dev/null
+++ b/lib/prng.c
@@ -0,0 +1,162 @@
+/*
+ * Pseudo Random Number Generator
+ *
+ * Lifted from ccan modules ilog/isaac under CC0
+ *   - http://ccodearchive.net/info/isaac.html
+ *   - http://ccodearchive.net/info/ilog.html
+ *
+ * And lightly hacked to compile under the KVM unit test environment.
+ * This provides a handy RNG for torture tests that want to vary
+ * delays and the like.
+ *
+ */
+
+/*Written by Timothy B. Terriberry (tterr...@xiph.org) 1999-2009.
+  CC0 (Public domain) - see LICENSE file for details
+  Based on the public domain implementation by Robert J. Jenkins Jr.*/
+
+#include "libcflat.h"
+
+#include 
+#include "prng.h"

[kvm-unit-tests PATCH v9 0/9] MTTCG sanity tests for ARM

2021-12-02 Thread Alex Bennée
Hi,

Not a great deal has changed from the last posting although I have
dropped the additional unittests.cfg in favour of setting "nodefault"
for the tests. Otherwise the clean-ups are mainly textual (removing
printfs, random newlines and cleaning up comments). As usual the
details are in the commits below the ---.

I've also tweaked .git/config so get_maintainer.pl should ensure
direct delivery of the patches ;-)

Alex Bennée (9):
  docs: mention checkpatch in the README
  arm/flat.lds: don't drop debug during link
  Makefile: add GNU global tags support
  lib: add isaac prng library from CCAN
  arm/tlbflush-code: TLB flush during code execution
  arm/locking-tests: add comprehensive locking test
  arm/barrier-litmus-tests: add simple mp and sal litmus tests
  arm/run: use separate --accel form
  arm/tcg-test: some basic TCG exercising tests

 arm/run   |   4 +-
 Makefile  |   5 +-
 arm/Makefile.arm  |   2 +
 arm/Makefile.arm64|   2 +
 arm/Makefile.common   |   6 +-
 lib/arm/asm/barrier.h |  19 ++
 lib/arm64/asm/barrier.h   |  50 +
 lib/prng.h|  82 +++
 lib/prng.c| 162 ++
 arm/flat.lds  |   1 -
 arm/tcg-test-asm.S| 171 +++
 arm/tcg-test-asm64.S  | 170 ++
 arm/barrier-litmus-test.c | 450 ++
 arm/locking-test.c| 322 +++
 arm/spinlock-test.c   |  87 
 arm/tcg-test.c| 338 
 arm/tlbflush-code.c   | 209 ++
 arm/unittests.cfg | 170 ++
 README.md |   3 +
 19 files changed, 2161 insertions(+), 92 deletions(-)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/barrier-litmus-test.c
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c
 create mode 100644 arm/tcg-test.c
 create mode 100644 arm/tlbflush-code.c

-- 
2.30.2



[kvm-unit-tests PATCH v9 3/9] Makefile: add GNU global tags support

2021-12-02 Thread Alex Bennée
If you have ctags you might as well offer gtags as a target.

Signed-off-by: Alex Bennée 
Message-Id: <2028184650.661575-4-alex.ben...@linaro.org>
---
 Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index b80c31f8..0b7c03ac 100644
--- a/Makefile
+++ b/Makefile
@@ -122,6 +122,9 @@ cscope:
	-name '*.[chsS]' -exec realpath --relative-base=$(CURDIR) {} \; | sort -u > ./cscope.files
cscope -bk
 
-.PHONY: tags
+.PHONY: tags gtags
 tags:
ctags -R
+
+gtags:
+   gtags
-- 
2.30.2



[kvm-unit-tests PATCH v9 2/9] arm/flat.lds: don't drop debug during link

2021-12-02 Thread Alex Bennée
It is useful to keep the debug in the .elf file so we can debug and it
doesn't get copied across to the final .flat file. Of course we still
need to ensure we apply the offset when we load the symbols based on
where QEMU decided to load the kernel.

  (gdb) symbol-file ./builds/arm64/arm/tlbflush-data.elf -o 0x4008

Signed-off-by: Alex Bennée 
Message-Id: <2028184650.661575-3-alex.ben...@linaro.org>
---
 arm/flat.lds | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arm/flat.lds b/arm/flat.lds
index 6fb459ef..47fcb649 100644
--- a/arm/flat.lds
+++ b/arm/flat.lds
@@ -62,7 +62,6 @@ SECTIONS
 /DISCARD/ : {
 *(.note*)
 *(.interp)
-*(.debug*)
 *(.comment)
 *(.dynamic)
 }
-- 
2.30.2



[kvm-unit-tests PATCH v9 1/9] docs: mention checkpatch in the README

2021-12-02 Thread Alex Bennée
Signed-off-by: Alex Bennée 
Acked-by: Andrew Jones 
Message-Id: <2028184650.661575-2-alex.ben...@linaro.org>
Acked-by: Thomas Huth 

---
v9
  - slightly more weaselly statement about checkpatch
---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 6e6a9d04..245052b4 100644
--- a/README.md
+++ b/README.md
@@ -182,3 +182,6 @@ the code files.  We also start with common code and finish with unit test
 code. git-diff's orderFile feature allows us to specify the order in a
 file.  The orderFile we use is `scripts/git.difforder`; adding the config
 with `git config diff.orderFile scripts/git.difforder` enables it.
+
+We strive to follow the Linux kernel's coding style so it's recommended
+to run the kernel's ./scripts/checkpatch.pl on new patches.
-- 
2.30.2



Re: [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set

2021-12-01 Thread Alex Bennée

Andrew Jones  writes:

> On Wed, Dec 01, 2021 at 04:20:02PM +0000, Alex Bennée wrote:
>> 
>> Andrew Jones  writes:
>> 
>> > On Thu, Nov 18, 2021 at 06:46:44PM +0000, Alex Bennée wrote:
>> >> The upcoming MTTCG tests don't need to be run for normal KVM unit
>> >> tests so let's add the facility to have a custom set of tests.
>> >
>> > I think an environment variable override would be better than this command
>> > line override, because then we could also get mkstandalone to work with
>> > the new unittests.cfg files. Or, it may be better to just add them to
>> > the main unittests.cfg with lines like these
>> >
>> > groups = nodefault mttcg
>> > accel = tcg
>> >
>> > That'll "dirty" the logs with SKIP ... (test marked as manual run only)
>> > for each one, but at least we won't easily forget about running them from
>> > time to time.
>> 
>> So what is the meaning of accel here? Is it:
>> 
>>   - this test only runs on accel FOO
>> 
>> or
>> 
>>   - this test defaults to running on accel FOO
>> 
>> because while the tests are for TCG I want to run them on KVM (so I can
>> validate the test on real HW). If I have accel=tcg then:
>> 
>>   env ACCEL=kvm QEMU=$HOME/lsrc/qemu.git/builds/all/qemu-system-aarch64 
>> ./run_tests.sh -g mttcg
>>   SKIP tlbflush-code::all_other (tcg only, but ACCEL=kvm)
>>   SKIP tlbflush-code::page_other (tcg only, but ACCEL=kvm)
>>   SKIP tlbflush-code::all_self (tcg only, but ACCEL=kvm)
>>   ...
>> 
>> so I can either drop the accel line and rely on nodefault to ensure it
>> doesn't run normally or make the env ACCEL processing less anal about
>> preventing me running TCG tests under KVM. What do you think?
>
> Just drop the 'accel = tcg' line. I only suggested it because I didn't
> know you also wanted to run the MTTCG "specific" tests under KVM. Now,
> that I do, I wonder why we wouldn't run them all the time, i.e. no
> nodefault group? Do the tests not exercise enough hypervisor code to
> be worth the energy used to run them?

I think in most cases if they fail under KVM it wouldn't be due to the
hypervisor being broken but the silicon not meeting its architectural
specification. I'm fine with them being nodefault for that.

I'm not sure how much the tlbflush code exercises on the host. There is
a WIP tlbflush-data which might make a case for being run more
regularly on KVM.

>
> Thanks,
> drew


-- 
Alex Bennée


Re: [kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set

2021-12-01 Thread Alex Bennée

Andrew Jones  writes:

> On Thu, Nov 18, 2021 at 06:46:44PM +0000, Alex Bennée wrote:
>> The upcoming MTTCG tests don't need to be run for normal KVM unit
>> tests so let's add the facility to have a custom set of tests.
>
> I think an environment variable override would be better than this command
> line override, because then we could also get mkstandalone to work with
> the new unittests.cfg files. Or, it may be better to just add them to
> the main unittests.cfg with lines like these
>
> groups = nodefault mttcg
> accel = tcg
>
> That'll "dirty" the logs with SKIP ... (test marked as manual run only)
> for each one, but at least we won't easily forget about running them from
> time to time.

So what is the meaning of accel here? Is it:

  - this test only runs on accel FOO

or

  - this test defaults to running on accel FOO

because while the tests are for TCG I want to run them on KVM (so I can
validate the test on real HW). If I have accel=tcg then:

  env ACCEL=kvm QEMU=$HOME/lsrc/qemu.git/builds/all/qemu-system-aarch64 
./run_tests.sh -g mttcg
  SKIP tlbflush-code::all_other (tcg only, but ACCEL=kvm)
  SKIP tlbflush-code::page_other (tcg only, but ACCEL=kvm)
  SKIP tlbflush-code::all_self (tcg only, but ACCEL=kvm)
  ...

so I can either drop the accel line and rely on nodefault to ensure it
doesn't run normally or make the env ACCEL processing less anal about
preventing me running TCG tests under KVM. What do you think?
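
For context, the kind of unittests.cfg stanza under discussion would
look something like this (illustrative):

  [tlbflush-code]
  file = tlbflush-code.flat
  smp = 4
  groups = nodefault mttcg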

-- 
Alex Bennée


Re: [kvm-unit-tests PATCH v3 0/3] GIC ITS tests

2021-11-30 Thread Alex Bennée

Andrew Jones  writes:

> On Fri, Nov 19, 2021 at 04:30:47PM +0000, Alex Bennée wrote:
>> 
>> Andrew Jones  writes:
>> 
>> > On Fri, Nov 12, 2021 at 02:08:01PM +0000, Alex Bennée wrote:
>> >> 
>> >> Andrew Jones  writes:
>> >> 
>> >> > On Fri, Nov 12, 2021 at 11:47:31AM +0000, Alex Bennée wrote:
>> >> >> Hi,
>> >> >> 
>> >> >> Sorry this has been sitting in my tree so long. The changes are fairly
>> >> >> minor from v2. I no longer split the tests up into TCG and KVM
>> >> >> versions and instead just ensure that ERRATA_FORCE is always set when
>> >> >> run under TCG.
>> >> >> 
>> >> >> Alex Bennée (3):
>> >> >>   arm64: remove invalid check from its-trigger test
>> >> >>   arm64: enable its-migration tests for TCG
>> >> >>   arch-run: do not process ERRATA when running under TCG
>> >> >> 
>> >> >>  scripts/arch-run.bash |  4 +++-
>> >> >>  arm/gic.c | 16 ++--
>> >> >>  arm/unittests.cfg |  3 ---
>> >> >>  3 files changed, 9 insertions(+), 14 deletions(-)
>> >> >> 
>> >> >> -- 
>> >> >> 2.30.2
>> >> >> 
>> >> >
>> >> > Hi Alex,
>> >> >
>> >> > Thanks for this. I've applied to arm/queue, but I see that
>> >> >
>> >> > FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 pending LPI is 
>> >> > received
>> >> >
>> >> > consistently fails for me. Is that expected? Does it work for you?
>> >> 
>> >> doh - looks like I cocked up the merge conflict...
>> >> 
>> >> Did it fail for TCG or for KVM (or both)?
>> >
>> > Just TCG, which was why I was wondering if it was expected. I've never run
>> > these tests with TCG before.
>> 
>> Hmm I think expecting the IRQ at all is broken so I think I should
>> delete the whole pending test.
>
> Feel free to repost. I'll update the patches in arm/queue before my next
> MR.

Actually I think the problem was with a regression in the TCG ITS
support (now fixed in master). So I believe as of v3 everything is
correct (and v4 should be ignored).

Are you happy to apply this series or do you want me to repost it as v5?

>
> Thanks,
> drew


-- 
Alex Bennée


Re: [kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README

2021-11-24 Thread Alex Bennée

Andrew Jones  writes:

> On Thu, Nov 18, 2021 at 06:46:41PM +0000, Alex Bennée wrote:
>> Signed-off-by: Alex Bennée 
>> ---
>>  README.md | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/README.md b/README.md
>> index b498aaf..5db48e5 100644
>> --- a/README.md
>> +++ b/README.md
>> @@ -182,3 +182,5 @@ the code files.  We also start with common code and finish with unit test
>>  code. git-diff's orderFile feature allows us to specify the order in a
>>  file.  The orderFile we use is `scripts/git.difforder`; adding the config
>>  with `git config diff.orderFile scripts/git.difforder` enables it.
>> +
>> +Please run the kernel's ./scripts/checkpatch.pl on new patches
>
> This is a bit of a problem for kvm-unit-tests code which still has a mix
> of styles since it was originally written with a strange tab and space
> mixed style. If somebody is patching one of those files we've usually
> tried to maintain the original style rather than reformat the whole
> thing (in hindsight maybe we should have just reformatted). We're also
> more flexible with line length than Linux, although Linux now only warns
> for anything over 80 as long as it's under 100, which is probably good
> enough for us too. Anyway, let's see what Paolo and Thomas say. Personally
> I wouldn't mind adding this line to the documentation, so I'll ack it.
> Anyway, we can also ignore our own advice when it suits us :-)
>
> Acked-by: Andrew Jones 

I can make the wording more weaselly:

 We strive to follow the Linux kernel's coding style so it's recommended
 to run the kernel's ./scripts/checkpatch.pl on new patches.

I added this reference because on the older iterations of these tests
divergence from the kernel coding style was pointed out and I've fixed
them in this iteration.

>
> Thanks,
> drew


-- 
Alex Bennée


[kvm-unit-tests PATCH v4 0/3] GIC ITS tests

2021-11-19 Thread Alex Bennée
Hi,

changes since v3:

  - dropped the pending LPI test altogether

Alex Bennée (3):
  arm64: remove invalid check from its-trigger test
  arm64: enable its-migration tests for TCG
  arch-run: do not process ERRATA when running under TCG

 scripts/arch-run.bash |  4 +++-
 arm/gic.c | 28 
 arm/unittests.cfg |  3 ---
 3 files changed, 11 insertions(+), 24 deletions(-)

-- 
2.30.2



[kvm-unit-tests PATCH v4 3/3] arch-run: do not process ERRATA when running under TCG

2021-11-19 Thread Alex Bennée
All the errata checking looks at the current host kernel version. For
TCG runs this is entirely irrelevant as the host kernel has no impact
on the behaviour of the guest. In fact we should set ERRATA_FORCE to
ensure we run those tests as QEMU doesn't attempt to model
non-conforming architectures.

Signed-off-by: Alex Bennée 
---
 scripts/arch-run.bash | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index 43da998..f1f4456 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -267,7 +267,9 @@ env_file ()
 
 env_errata ()
 {
-   if [ "$ERRATATXT" ] && [ ! -f "$ERRATATXT" ]; then
+   if [ "$ACCEL" = "tcg" ]; then
+   eval export "ERRATA_FORCE=y"
+   elif [ "$ERRATATXT" ] && [ ! -f "$ERRATATXT" ]; then
echo "$ERRATATXT not found. (ERRATATXT=$ERRATATXT)" >&2
return 2
elif [ "$ERRATATXT" ]; then
-- 
2.30.2



[kvm-unit-tests PATCH v4 2/3] arm64: enable its-migration tests for TCG

2021-11-19 Thread Alex Bennée
With the support for TCG emulated GIC we can also test these now.

Signed-off-by: Alex Bennée 
Reviewed-by: Eric Auger 
Reviewed-by: Andrew Jones 
Cc: Shashi Mallela 
Message-Id: <20210525172628.2088-4-alex.ben...@linaro.org>

---
v3
  - add its-migrate-unmapped-collection
---
 arm/unittests.cfg | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index f776b66..21474b8 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -194,7 +194,6 @@ arch = arm64
 [its-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migration'
 groups = its migration
 arch = arm64
@@ -202,7 +201,6 @@ arch = arm64
 [its-pending-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-pending-migration'
 groups = its migration
 arch = arm64
@@ -210,7 +208,6 @@ arch = arm64
 [its-migrate-unmapped-collection]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migrate-unmapped-collection'
 groups = its migration
 arch = arm64
-- 
2.30.2



[kvm-unit-tests PATCH v4 1/3] arm64: remove invalid check from its-trigger test

2021-11-19 Thread Alex Bennée
While an IRQ is not "guaranteed to be visible until an appropriate
invalidation" it doesn't stop the actual implementation delivering it
earlier if it wants to. This is the case for QEMU's TCG and as tests
should only be checking architectural compliance this check is
invalid.

Signed-off-by: Alex Bennée 
Reviewed-by: Eric Auger 
Cc: Shashi Mallela 
Message-Id: <20210525172628.2088-2-alex.ben...@linaro.org>

---
v4
  - drop the pending test altogether
v3
  - reflow the comment, drop "willingly do not call" as per Eric's suggestion
---
 arm/gic.c | 28 
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 98135ef..1b9ad06 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -732,34 +732,22 @@ static void test_its_trigger(void)
"dev2/eventid=20 does not trigger any LPI");
 
/*
-* re-enable the LPI but willingly do not call invall
-* so the change in config is not taken into account.
-* The LPI should not hit
+* re-enable the LPI. While "A change to the LPI configuration
+* is not guaranteed to be visible until an appropriate
+* invalidation operation has completed" hardware that doesn't
+* implement caches may have delivered the event at any point
+* after the enabling. Check the LPI has hit by the time the
+* invall is done.
 */
-   gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
-   stats_reset();
-   cpumask_clear(&mask);
-   its_send_int(dev2, 20);
-   wait_for_interrupts();
-   report(check_acked(&mask, -1, -1),
-   "dev2/eventid=20 still does not trigger any LPI");
-
-   /* Now call the invall and check the LPI hits */
    stats_reset();
-   cpumask_clear(&mask);
-   cpumask_set_cpu(3, &mask);
+   gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
    its_send_invall(col3);
-   wait_for_interrupts();
-   report(check_acked(&mask, 0, 8195),
-   "dev2/eventid=20 pending LPI is received");
-
-   stats_reset();
    cpumask_clear(&mask);
    cpumask_set_cpu(3, &mask);
    its_send_int(dev2, 20);
    wait_for_interrupts();
    report(check_acked(&mask, 0, 8195),
-   "dev2/eventid=20 now triggers an LPI");
+   "dev2/eventid=20 triggers an LPI");
 
report_prefix_pop();
 
-- 
2.30.2



Re: [kvm-unit-tests PATCH v3 0/3] GIC ITS tests

2021-11-19 Thread Alex Bennée

Andrew Jones  writes:

> On Fri, Nov 12, 2021 at 02:08:01PM +0000, Alex Bennée wrote:
>> 
>> Andrew Jones  writes:
>> 
>> >> >> On Fri, Nov 12, 2021 at 11:47:31AM +0000, Alex Bennée wrote:
>> >> Hi,
>> >> 
>> >> Sorry this has been sitting in my tree so long. The changes are fairly
>> >> minor from v2. I no longer split the tests up into TCG and KVM
>> >> versions and instead just ensure that ERRATA_FORCE is always set when
>> >> run under TCG.
>> >> 
>> >> Alex Bennée (3):
>> >>   arm64: remove invalid check from its-trigger test
>> >>   arm64: enable its-migration tests for TCG
>> >>   arch-run: do not process ERRATA when running under TCG
>> >> 
>> >>  scripts/arch-run.bash |  4 +++-
>> >>  arm/gic.c | 16 ++--
>> >>  arm/unittests.cfg |  3 ---
>> >>  3 files changed, 9 insertions(+), 14 deletions(-)
>> >> 
>> >> -- 
>> >> 2.30.2
>> >> 
>> >
>> > Hi Alex,
>> >
>> > Thanks for this. I've applied to arm/queue, but I see that
>> >
>> > FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 pending LPI is 
>> > received
>> >
>> > consistently fails for me. Is that expected? Does it work for you?
>> 
>> doh - looks like I cocked up the merge conflict...
>> 
>> Did it fail for TCG or for KVM (or both)?
>
> Just TCG, which was why I was wondering if it was expected. I've never run
> these tests with TCG before.

Hmm I think expecting the IRQ at all is broken so I think I should
delete the whole pending test.

>
> Thanks,
> drew


-- 
Alex Bennée


[kvm-unit-tests PATCH v8 10/10] arm/tcg-test: some basic TCG exercising tests

2021-11-18 Thread Alex Bennée
These tests are not really aimed at KVM at all but exist to stretch
QEMU's TCG code generator. In particular these exercise the ability of
the TCG to:

  * Chain TranslationBlocks together (tight)
  * Handle heavy usage of the tb_jump_cache (paged)
  * Pathological case of computed local jumps (computed)

In addition the tests can be varied by adding IPI IRQs or SMC sequences
into the mix to stress the tcg_exit and invalidation mechanisms.

To explicitly stress the tb_flush() mechanism you can use the mod/rounds
parameters to force more frequent tb invalidation, combined with setting
-tb-size 1 in QEMU to limit the code generation buffer size.

Signed-off-by: Alex Bennée 

---
v5
  - added armv8 version of the tcg tests
  - max out at -smp 4 in unittests.cfg
  - add up IRQs sent and delivered for PASS/FAIL
  - take into account error count
  - add "rounds=" parameter
  - tweak smc to tb-size=1
  - printf fmt fix
v7
  - merged in IRQ numerology
  - updated to latest IRQ API
v8
  - fix report usage
  - fix checkpatch errors
---
 arm/Makefile.arm |   2 +
 arm/Makefile.arm64   |   2 +
 arm/Makefile.common  |   1 +
 arm/tcg-test-asm.S   | 171 ++
 arm/tcg-test-asm64.S | 170 ++
 arm/tcg-test.c   | 338 +++
 arm/mttcgtests.cfg   |  84 +++
 7 files changed, 768 insertions(+)
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/tcg-test.c

diff --git a/arm/Makefile.arm b/arm/Makefile.arm
index 3a4cc6b..05e47f1 100644
--- a/arm/Makefile.arm
+++ b/arm/Makefile.arm
@@ -31,4 +31,6 @@ tests =
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm.o
+
 arch_clean: arm_clean
diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index e8a38d7..ac94f8e 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -34,5 +34,7 @@ tests += $(TEST_DIR)/cache.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o $(TEST_DIR)/tcg-test-asm64.o
+
 arch_clean: arm_clean
$(RM) lib/arm64/.*.d
diff --git a/arm/Makefile.common b/arm/Makefile.common
index 861e5c7..abb6948 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -14,6 +14,7 @@ tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
 tests-common += $(TEST_DIR)/barrier-litmus-test.flat
+tests-common += $(TEST_DIR)/tcg-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tcg-test-asm.S b/arm/tcg-test-asm.S
new file mode 100644
index 000..4ec4978
--- /dev/null
+++ b/arm/tcg-test-asm.S
@@ -0,0 +1,171 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * TCG Test assembler functions for armv7 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - r0, iterations
+ *  - r1, jump pattern
+ *  - r2-r3, scratch
+ *
+ * Returns r0
+ */
+
+.arm
+
+.section .text
+
+/* Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+subs r0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tightA
+b   tight_start
+
+tightA:
+subs r0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tightB
+b   tight_start
+
+tightB:
+subs r0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tight_start
+b   tightA
+
+.global tight_end
+tight_end:
+mov pc, lr
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will cause an exit for the main execution loop to look up the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+/* .align 13 == 8192 byte alignment */
+.align 13
+.global computed_start
+computed_start:
+subs r0, r0, #1
+beq computed_end
+
+/* Jump table */
+ror r1, r1, #1
+and r2, r1, #1
+adr r3, computed_jump_table
+ldr r2, [r3, r2, lsl #2]
+mov pc, r2
+
+b   computed_err
+
+computed_jump_table:
+.word   computed_start
+.word   computedA
+
+computedA:
+subs r0, r0, #1
+beq computed_end
+

[kvm-unit-tests PATCH v8 05/10] lib: add isaac prng library from CCAN

2021-11-18 Thread Alex Bennée
It's often useful to introduce some sort of random variation when
testing several racing CPU conditions. Instead of each test implementing
some half-arsed PRNG bring in a decent one which has good statistical
randomness. Obviously it is deterministic for a given seed value, which
is likely the behaviour you want.

I've pulled in the ISAAC library from CCAN:

http://ccodearchive.net/info/isaac.html

I shaved off the float related stuff which is less useful for unit
testing and re-indented to fit the style. The original license was
CC0 (Public Domain) which is compatible with the LGPL v2 of
kvm-unit-tests.

Signed-off-by: Alex Bennée 
CC: Timothy B. Terriberry 
Acked-by: Andrew Jones 
---
 arm/Makefile.common |   1 +
 lib/prng.h  |  82 ++
 lib/prng.c  | 162 
 3 files changed, 245 insertions(+)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 38385e0..99bcf3f 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -44,6 +44,7 @@ cflatobjs += lib/pci-testdev.o
 cflatobjs += lib/virtio.o
 cflatobjs += lib/virtio-mmio.o
 cflatobjs += lib/chr-testdev.o
+cflatobjs += lib/prng.o
 cflatobjs += lib/arm/io.o
 cflatobjs += lib/arm/setup.o
 cflatobjs += lib/arm/mmu.o
diff --git a/lib/prng.h b/lib/prng.h
new file mode 100644
index 000..bf5776d
--- /dev/null
+++ b/lib/prng.h
@@ -0,0 +1,82 @@
+/*
+ * PRNG Header
+ */
+#ifndef __PRNG_H__
+#define __PRNG_H__
+
+# include 
+
+
+
+typedef struct isaac_ctx isaac_ctx;
+
+
+
+/*This value may be lowered to reduce memory usage on embedded platforms, at
+  the cost of reducing security and increasing bias.
+  Quoting Bob Jenkins: "The current best guess is that bias is detectable after
+  2**37 values for [ISAAC_SZ_LOG]=3, 2**45 for 4, 2**53 for 5, 2**61 for 6,
+  2**69 for 7, and 2**77 values for [ISAAC_SZ_LOG]=8."*/
+#define ISAAC_SZ_LOG  (8)
+#define ISAAC_SZ  (1<<ISAAC_SZ_LOG)
+#define ISAAC_SEED_SZ_MAX (ISAAC_SZ<<2)
+
+/*ISAAC is the most advanced of a series of pseudo-random number generators
+  designed by Robert J. Jenkins Jr. in 1996.
+  http://www.burtleburtle.net/bob/rand/isaac.html
+  To quote:
+  No efficient method is known for deducing their internal states.
+  ISAAC requires an amortized 18.75 instructions to produce a 32-bit value.
+  There are no cycles in ISAAC shorter than 2**40 values.
+  The expected cycle length is 2**8295 values.*/
+struct isaac_ctx{
+   unsigned n;
+   uint32_t r[ISAAC_SZ];
+   uint32_t m[ISAAC_SZ];
+   uint32_t a;
+   uint32_t b;
+   uint32_t c;
+};
+
+
+/**
+ * isaac_init - Initialize an instance of the ISAAC random number generator.
+ * @_ctx:   The instance to initialize.
+ * @_seed:  The specified seed bytes.
+ *  This may be NULL if _nseed is less than or equal to zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *  If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *   ignored.
+ */
+void isaac_init(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+
+/**
+ * isaac_reseed - Mix a new batch of entropy into the current state.
+ * To reset ISAAC to a known state, call isaac_init() again instead.
+ * @_ctx:   The instance to reseed.
+ * @_seed:  The specified seed bytes.
+ *  This may be NULL if _nseed is zero.
+ * @_nseed: The number of bytes to use for the seed.
+ *  If this is greater than ISAAC_SEED_SZ_MAX, the extra bytes are
+ *   ignored.
+ */
+void isaac_reseed(isaac_ctx *_ctx,const unsigned char *_seed,int _nseed);
+/**
+ * isaac_next_uint32 - Return the next random 32-bit value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ */
+uint32_t isaac_next_uint32(isaac_ctx *_ctx);
+/**
+ * isaac_next_uint - Uniform random integer less than the given value.
+ * @_ctx: The ISAAC instance to generate the value with.
+ * @_n:   The upper bound on the range of numbers returned (not inclusive).
+ *This must be greater than zero and less than 2**32.
+ *To return integers in the full range 0...2**32-1, use
+ * isaac_next_uint32() instead.
+ * Return: An integer uniformly distributed between 0 and _n-1 (inclusive).
+ */
+uint32_t isaac_next_uint(isaac_ctx *_ctx,uint32_t _n);
+
+#endif
diff --git a/lib/prng.c b/lib/prng.c
new file mode 100644
index 000..ebd6df7
--- /dev/null
+++ b/lib/prng.c
@@ -0,0 +1,162 @@
+/*
+ * Pseudo Random Number Generator
+ *
+ * Lifted from ccan modules ilog/isaac under CC0
+ *   - http://ccodearchive.net/info/isaac.html
+ *   - http://ccodearchive.net/info/ilog.html
+ *
+ * And lightly hacked to compile under the KVM unit test environment.
+ * This provides a handy RNG for torture tests that want to vary
+ * delays and the like.
+ *
+ */
+
+/*Written by Timothy B. Terriberry (tterr...@xiph.org) 1999-2009.
+  CC0 (Public domain) - see LICENSE file for details
+  Based on the public domain implementation by Robert J. Jenkins Jr.*/
+
+#include "libcflat.h"
+
+#include <string.h>
+#include "prng.h"
+
+#define ISAAC_MASK (0xffffffffU)
+
+/* Extract ISA

[kvm-unit-tests PATCH v8 09/10] arm/run: use separate --accel form

2021-11-18 Thread Alex Bennée
This will allow TCG tests to alter things such as tb-size.
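
For example, this lets a test run pass accelerator properties through in
one go (an illustrative invocation, not taken from the tree):

  ACCEL="tcg,thread=single,tb-size=512" ./arm/run arm/locking-test.flat -smp 4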

Signed-off-by: Alex Bennée 
---
 arm/run | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arm/run b/arm/run
index a390ca5..73c6c83 100755
--- a/arm/run
+++ b/arm/run
@@ -58,8 +58,8 @@ if $qemu $M -device '?' 2>&1 | grep pci-testdev > /dev/null; then
pci_testdev="-device pci-testdev"
 fi
 
-M+=",accel=$ACCEL"
-command="$qemu -nodefaults $M -cpu $processor $chr_testdev $pci_testdev"
+A="-accel $ACCEL"
+command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev"
 command+=" -display none -serial stdio -kernel"
 command="$(migration_cmd) $(timeout_cmd) $command"
 
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v8 08/10] arm/barrier-litmus-tests: add simple mp and sal litmus tests

2021-11-18 Thread Alex Bennée
This adds a framework for adding simple barrier litmus tests for ARM.
The litmus tests aren't as comprehensive as the academic exercises,
which attempt all sorts of tricks to keep the racing CPUs in step.
These tests do honour the "sync" parameter as a poor man's equivalent.

The two litmus tests are:
  - message passing
  - store-after-load

They both have a case that should fail (although it won't on
single-threaded TCG setups). If barriers aren't working properly the
store-after-load test will fail even on an x86 backend, as x86 allows
re-ordering of non-aliased stores.

I've imported a few more of the barrier primitives from the Linux source
tree so we consistently use macros.

The arm64 barrier primitives trip up on -Wstrict-aliasing so this is
disabled in the Makefile.
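
For reference, the message-passing pattern being tested is the classic
two-CPU pair below. This is only a sketch using the barrier.h macros
added here, not the exact test code:

  static int flag, shared_data;

  /* CPU0: the writer half of the message-passing pair */
  static void mp_writer(void)
  {
          shared_data = 42;
          smp_wmb();              /* data must be visible before the flag */
          WRITE_ONCE(flag, 1);
  }

  /* CPU1: the reader half - with both barriers this can never fail */
  static void mp_reader(void)
  {
          while (!READ_ONCE(flag))
                  ;
          smp_rmb();              /* flag read must happen before data read */
          report(shared_data == 42, "mp: data visible after flag");
  }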

Signed-off-by: Alex Bennée 
CC: Will Deacon 

---
v8
  - move to mttcgtests.cfg
  - fix checkpatch issues
  - fix report usage
v7
  - merge in store-after-load
  - clean-up sync-up code
  - use new counter api
  - fix xfail for sal test
v6
  - add a unittest.cfg
  - -fno-strict-aliasing
---
 arm/Makefile.common   |   1 +
 lib/arm/asm/barrier.h |  61 ++
 lib/arm64/asm/barrier.h   |  50 +
 arm/barrier-litmus-test.c | 450 ++
 arm/mttcgtests.cfg|  33 +++
 5 files changed, 595 insertions(+)
 create mode 100644 arm/barrier-litmus-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index f905971..861e5c7 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -13,6 +13,7 @@ tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/locking-test.flat
+tests-common += $(TEST_DIR)/barrier-litmus-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/lib/arm/asm/barrier.h b/lib/arm/asm/barrier.h
index 7f86831..2870080 100644
--- a/lib/arm/asm/barrier.h
+++ b/lib/arm/asm/barrier.h
@@ -8,6 +8,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2.
  */
 
+#include <stdint.h>
+
 #define sev()  asm volatile("sev" : : : "memory")
 #define wfe()  asm volatile("wfe" : : : "memory")
 #define wfi()  asm volatile("wfi" : : : "memory")
@@ -25,4 +27,63 @@
 #define smp_rmb()  smp_mb()
 #define smp_wmb()  dmb(ishst)
 
+extern void abort(void);
+
+static inline void __write_once_size(volatile void *p, void *res, int size)
+{
+   switch (size) {
+   case 1: *(volatile uint8_t *)p = *(uint8_t *)res; break;
+   case 2: *(volatile uint16_t *)p = *(uint16_t *)res; break;
+   case 4: *(volatile uint32_t *)p = *(uint32_t *)res; break;
+   case 8: *(volatile uint64_t *)p = *(uint64_t *)res; break;
+   default:
+   /* unhandled case */
+   abort();
+   }
+}
+
+#define WRITE_ONCE(x, val) \
+({ \
+   union { typeof(x) __val; char __c[1]; } __u =   \
+   { .__val = (typeof(x)) (val) }; \
+   __write_once_size(&(x), __u.__c, sizeof(x));\
+   __u.__val;  \
+})
+
+#define smp_store_release(p, v)					\
+do {   \
+   smp_mb();   \
+   WRITE_ONCE(*p, v);  \
+} while (0)
+
+
+static inline
+void __read_once_size(const volatile void *p, void *res, int size)
+{
+   switch (size) {
+   case 1: *(uint8_t *)res = *(volatile uint8_t *)p; break;
+   case 2: *(uint16_t *)res = *(volatile uint16_t *)p; break;
+   case 4: *(uint32_t *)res = *(volatile uint32_t *)p; break;
+   case 8: *(uint64_t *)res = *(volatile uint64_t *)p; break;
+   default:
+   /* unhandled case */
+   abort();
+   }
+}
+
+#define READ_ONCE(x)   \
+({ \
+   union { typeof(x) __val; char __c[1]; } __u;\
+   __read_once_size(&(x), __u.__c, sizeof(x)); \
+   __u.__val;  \
+})
+
+
+#define smp_load_acquire(p)\
+({ \
+   typeof(*p) ___p1 = READ_ONCE(*p);   \
+   smp_mb();   \
+   ___p1;  \
+})
+
 #endif /* _ASMARM_BARRIER_H_ */
diff --git a/lib/arm64/asm/barrier.h b/lib/arm64/asm/barrier.h
index 0e1904c..5e40519 100644
--- a/lib/arm64/asm/barrier.h
+++ b/lib/arm64/asm/barrier.h
@@ -24,4 

[kvm-unit-tests PATCH v8 07/10] arm/locking-tests: add comprehensive locking test

2021-11-18 Thread Alex Bennée
This test has been written mainly to stress multi-threaded TCG behaviour
but will demonstrate failure by default on real hardware. The test takes
the following parameters:

  - "lock" use GCC's locking semantics
  - "atomic" use GCC's __atomic primitives
  - "wfelock" use WaitForEvent sleep
  - "excl" use load/store exclusive semantics

Also two more options allow the test to be tweaked

  - "noshuffle" disables the memory shuffling
  - "count=%ld" set your own per-CPU increment count

Signed-off-by: Alex Bennée 

---
v2
  - Don't use thumb style strexeq stuff
  - Add atomic and wfelock tests
  - Add count/noshuffle test controls
  - Move barrier tests to separate test file
v4
  - fix up unittests.cfg to use correct test name
  - move into "locking" group, remove barrier tests
  - use a table to add tests, mark which are expected to work
  - correctly report XFAIL
v5
  - max out at -smp 4 in unittest.cfg
v7
  - make test control flags bools
  - default the count to 10 (so it doesn't timeout)
v8
  - rm spinlock test
  - fix checkpatch errors
  - fix report usage
---
 arm/Makefile.common |   2 +-
 arm/locking-test.c  | 322 
 arm/spinlock-test.c |  87 
 arm/mttcgtests.cfg  |  29 
 4 files changed, 352 insertions(+), 88 deletions(-)
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index e3f04f2..f905971 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -5,7 +5,6 @@
 #
 
 tests-common  = $(TEST_DIR)/selftest.flat
-tests-common += $(TEST_DIR)/spinlock-test.flat
 tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/pmu.flat
 tests-common += $(TEST_DIR)/gic.flat
@@ -13,6 +12,7 @@ tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
+tests-common += $(TEST_DIR)/locking-test.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/locking-test.c b/arm/locking-test.c
new file mode 100644
index 000..eab9497
--- /dev/null
+++ b/arm/locking-test.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Locking Test
+ *
+ * This test allows us to stress the various atomic primitives of a VM
+ * guest. A number of methods are available that use various patterns
+ * to implement a lock.
+ *
+ * Copyright (C) 2017 Linaro
+ * Author: Alex Bennée 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define MAX_CPUS 8
+
+/* Test definition structure
+ *
+ * A simple structure that describes the test name, expected pass and
+ * increment function.
+ */
+
+/* Function pointers for test */
+typedef void (*inc_fn)(int cpu);
+
+typedef struct {
+   const char *test_name;
+   bool  should_pass;
+   inc_fn main_fn;
+} test_descr_t;
+
+/* How many increments to do */
+static int increment_count = 100;
+static bool do_shuffle = true;
+
+/* Shared value all the tests attempt to safely increment using
+ * various forms of atomic locking and exclusive behaviour.
+ */
+static unsigned int shared_value;
+
+/* PAGE_SIZE * uint32_t means we span several pages */
+__attribute__((aligned(PAGE_SIZE))) static uint32_t memory_array[PAGE_SIZE];
+
+/* We use the alignment of the following to ensure accesses to locking
+ * and synchronisation primitives don't interfere with the page of the
+ * shared value
+ */
+__attribute__((aligned(PAGE_SIZE))) static unsigned int per_cpu_value[MAX_CPUS];
+__attribute__((aligned(PAGE_SIZE))) static cpumask_t smp_test_complete;
+__attribute__((aligned(PAGE_SIZE))) struct isaac_ctx prng_context[MAX_CPUS];
+
+/* Some of the approaches use a global lock to prevent contention. */
+static int global_lock;
+
+/* In any SMP setting this *should* fail due to cores stepping on
+ * each other updating the shared variable
+ */
+static void increment_shared(int cpu)
+{
+   (void)cpu;
+
+   shared_value++;
+}
+
+/* GCC __sync primitives are deprecated in favour of __atomic */
+static void increment_shared_with_lock(int cpu)
+{
+   (void)cpu;
+
+   while (__sync_lock_test_and_set(&global_lock, 1));
+
+   shared_value++;
+
+   __sync_lock_release(&global_lock);
+}
+
+/*
+ * In practice even __ATOMIC_RELAXED uses ARM's ldxr/stex exclusive
+ * semantics
+ */
+static void increment_shared_with_atomic(int cpu)
+{
+   (void)cpu;
+
+   __atomic_add_fetch(&shared_value, 1, __ATOMIC_SEQ_CST);
+}
+
+
+/*
+ * Load/store exclusive with WFE (wait-for-event)
+ *
+ * See ARMv8 ARM examples:
+ *   Use of Wait For Event (WFE) and Send Event (SEV) with locks
+ */
+
+static void incre

[kvm-unit-tests PATCH v8 06/10] arm/tlbflush-code: TLB flush during code execution

2021-11-18 Thread Alex Bennée
This adds a fairly brain dead torture test for TLB flushes intended
for stressing the MTTCG QEMU build. It takes the usual -smp option for
multiple CPUs.

By default CPU0 will do a TLBIALL flush after each cycle. You can
pass options via -append to control additional aspects of the test:

  - "page" flush each page in turn (one per function)
  - "self" do the flush after each computation cycle
  - "verbose" report progress on each computation cycle

Signed-off-by: Alex Bennée 
CC: Mark Rutland 

---
v2
  - rename to tlbflush-test
  - made makefile changes cleaner
  - added self/other flush mode
  - create specific prefix
  - whitespace fixes
v3
  - using new SMP framework for test running
v4
  - merge in the unittests.cfg
v5
  - max out at -smp 4
  - printf fmtfix
v7
  - rename to tlbflush-code
  - int -> bool flags
v8
  - kernel style fixes
  - move to separate mttcgtests.cfg
---
 arm/Makefile.common |   1 +
 arm/tlbflush-code.c | 209 
 arm/mttcgtests.cfg  |  30 +++
 3 files changed, 240 insertions(+)
 create mode 100644 arm/tlbflush-code.c
 create mode 100644 arm/mttcgtests.cfg

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 99bcf3f..e3f04f2 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -12,6 +12,7 @@ tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/psci.flat
 tests-common += $(TEST_DIR)/sieve.flat
 tests-common += $(TEST_DIR)/pl031.flat
+tests-common += $(TEST_DIR)/tlbflush-code.flat
 
 tests-all = $(tests-common) $(tests)
 all: directories $(tests-all)
diff --git a/arm/tlbflush-code.c b/arm/tlbflush-code.c
new file mode 100644
index 000..ca98c82
--- /dev/null
+++ b/arm/tlbflush-code.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * TLB Flush Race Tests
+ *
+ * These tests are designed to test for incorrect TLB flush semantics
+ * under emulation. The initial CPU will set all the others working a
+ * compuation task and will then trigger TLB flushes across the
+ * system. It doesn't actually need to re-map anything but the flushes
+ * themselves will trigger QEMU's TCG self-modifying code detection
+ * which will invalidate any generated  code causing re-translation.
+ * Eventually the code buffer will fill and a general tb_lush() will
+ * be triggered.
+ *
+ * Copyright (C) 2016-2021, Linaro, Alex Bennée 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SEQ_LENGTH 10
+#define SEQ_HASH 0x7cd707fe
+
+static cpumask_t smp_test_complete;
+static int flush_count = 100;
+static bool flush_self;
+static bool flush_page;
+static bool flush_verbose;
+
+/*
+ * Work functions
+ *
+ * These work functions need to be:
+ *
+ *  - page aligned, so we can flush one function at a time
+ *  - have branches, so QEMU TCG generates multiple basic blocks
+ *  - call across pages, so we exercise the TCG basic block slow path
+ */
+
+/* Adler32 */
+__attribute__((aligned(PAGE_SIZE))) static
+uint32_t hash_array(const void *buf, size_t buflen)
+{
+   const uint8_t *data = (uint8_t *) buf;
+   uint32_t s1 = 1;
+   uint32_t s2 = 0;
+
+   for (size_t n = 0; n < buflen; n++) {
+   s1 = (s1 + data[n]) % 65521;
+   s2 = (s2 + s1) % 65521;
+   }
+   return (s2 << 16) | s1;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void create_fib_sequence(int length, unsigned int *array)
+{
+   int i;
+
+   /* first two values */
+   array[0] = 0;
+   array[1] = 1;
+   for (i = 2; i < length; i++)
+   array[i] = array[i-2] + array[i-1];
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned long long factorial(unsigned int n)
+{
+   unsigned int i;
+   unsigned long long fac = 1;
+
+   for (i = 1; i <= n; i++)
+   fac = fac * i;
+   return fac;
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+void factorial_array(unsigned int n, unsigned int *input,
+unsigned long long *output)
+{
+   unsigned int i;
+
+   for (i = 0; i < n; i++)
+   output[i] = factorial(input[i]);
+}
+
+__attribute__((aligned(PAGE_SIZE))) static
+unsigned int do_computation(void)
+{
+   unsigned int fib_array[SEQ_LENGTH];
+   unsigned long long facfib_array[SEQ_LENGTH];
+   uint32_t fib_hash, facfib_hash;
+
+   create_fib_sequence(SEQ_LENGTH, &fib_array[0]);
+   fib_hash = hash_array(&fib_array[0], sizeof(fib_array));
+   factorial_array(SEQ_LENGTH, &fib_array[0], &facfib_array[0]);
+   facfib_hash = hash_array(&facfib_array[0], sizeof(facfib_array));
+
+   return (fib_hash ^ facfib_hash);
+}
+
+/* This provides a table of the work functions so we can flush each
+ * page individually
+ */
+static void *pages[] = {&hash_array, &create_fib_sequence, &factorial,
+   &factorial_array, &do_computation};
+
+static void do_flush(int i)
+{
+   if (flush_page)
+ 

[kvm-unit-tests PATCH v8 04/10] run_tests.sh: add --config option for alt test set

2021-11-18 Thread Alex Bennée
The upcoming MTTCG tests don't need to be run for normal KVM unit
tests so let's add the facility to have a custom set of tests.
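
With this in place something like the following selects the alternative
set (illustrative, assuming the mttcgtests.cfg added later in the series):

  ACCEL=tcg ./run_tests.sh --config arm/mttcgtests.cfg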

Signed-off-by: Alex Bennée 
---
 run_tests.sh | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 9f233c5..b1088d2 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -15,7 +15,7 @@ function usage()
 {
 cat <<EOF
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v8 01/10] docs: mention checkpatch in the README

2021-11-18 Thread Alex Bennée
Signed-off-by: Alex Bennée 
---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index b498aaf..5db48e5 100644
--- a/README.md
+++ b/README.md
@@ -182,3 +182,5 @@ the code files.  We also start with common code and finish with unit test
 code. git-diff's orderFile feature allows us to specify the order in a
 file.  The orderFile we use is `scripts/git.difforder`; adding the config
 with `git config diff.orderFile scripts/git.difforder` enables it.
+
+Please run the kernel's ./scripts/checkpatch.pl on new patches.
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v8 02/10] arm/flat.lds: don't drop debug during link

2021-11-18 Thread Alex Bennée
It is useful to keep the debug info in the .elf file so we can debug;
it doesn't get copied across to the final .flat file anyway. Of course we still
need to ensure we apply the offset when we load the symbols based on
where QEMU decided to load the kernel.

  (gdb) symbol-file ./builds/arm64/arm/tlbflush-data.elf -o 0x4008

Signed-off-by: Alex Bennée 
---
 arm/flat.lds | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arm/flat.lds b/arm/flat.lds
index 6fb459e..47fcb64 100644
--- a/arm/flat.lds
+++ b/arm/flat.lds
@@ -62,7 +62,6 @@ SECTIONS
 /DISCARD/ : {
 *(.note*)
 *(.interp)
-*(.debug*)
 *(.comment)
 *(.dynamic)
 }
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v8 03/10] Makefile: add GNU global tags support

2021-11-18 Thread Alex Bennée
If you have ctags you might as well offer gtags as a target.

Signed-off-by: Alex Bennée 
---
 Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index b80c31f..0b7c03a 100644
--- a/Makefile
+++ b/Makefile
@@ -122,6 +122,9 @@ cscope:
	-name '*.[chsS]' -exec realpath --relative-base=$(CURDIR) {} \; | sort -u > ./cscope.files
cscope -bk
 
-.PHONY: tags
+.PHONY: tags gtags
 tags:
ctags -R
+
+gtags:
+   gtags
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v8 00/10] MTTCG sanity tests for ARM

2021-11-18 Thread Alex Bennée
Hi,

It's been a long time since I last posted these but I'd like to
incorporate some MTTCG tests into QEMU's upstream acceptance tests and
a first step is getting these up-streamed. Most of the changes are
fixing up the numerous checkpatch failures (although isaac remains
unchanged and some warnings make no sense for kvm-unit-tests).

I dropped an additional test which attempts to test for data flush
behaviour but it still needs some work:

  
https://github.com/stsquad/kvm-unit-tests/commit/712eb3a287df24cdeff00ef966d68aef6ff2b8eb

Alex Bennée (10):
  docs: mention checkpatch in the README
  arm/flat.lds: don't drop debug during link
  Makefile: add GNU global tags support
  run_tests.sh: add --config option for alt test set
  lib: add isaac prng library from CCAN
  arm/tlbflush-code: TLB flush during code execution
  arm/locking-tests: add comprehensive locking test
  arm/barrier-litmus-tests: add simple mp and sal litmus tests
  arm/run: use separate --accel form
  arm/tcg-test: some basic TCG exercising tests

 arm/run   |   4 +-
 run_tests.sh  |  11 +-
 Makefile  |   5 +-
 arm/Makefile.arm  |   2 +
 arm/Makefile.arm64|   2 +
 arm/Makefile.common   |   6 +-
 lib/arm/asm/barrier.h |  61 ++
 lib/arm64/asm/barrier.h   |  50 +
 lib/prng.h|  82 +++
 lib/prng.c| 162 ++
 arm/flat.lds  |   1 -
 arm/tcg-test-asm.S| 171 +++
 arm/tcg-test-asm64.S  | 170 ++
 arm/barrier-litmus-test.c | 450 ++
 arm/locking-test.c| 322 +++
 arm/spinlock-test.c   |  87 
 arm/tcg-test.c| 338 
 arm/tlbflush-code.c   | 209 ++
 arm/mttcgtests.cfg| 176 +++
 README.md |   2 +
 20 files changed, 2216 insertions(+), 95 deletions(-)
 create mode 100644 lib/prng.h
 create mode 100644 lib/prng.c
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/barrier-litmus-test.c
 create mode 100644 arm/locking-test.c
 delete mode 100644 arm/spinlock-test.c
 create mode 100644 arm/tcg-test.c
 create mode 100644 arm/tlbflush-code.c
 create mode 100644 arm/mttcgtests.cfg

-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v3 0/3] GIC ITS tests

2021-11-12 Thread Alex Bennée

Andrew Jones  writes:

> On Fri, Nov 12, 2021 at 11:47:31AM +0000, Alex Bennée wrote:
>> Hi,
>> 
>> Sorry this has been sitting in my tree so long. The changes are fairly
>> minor from v2. I no longer split the tests up into TCG and KVM
>> versions and instead just ensure that ERRATA_FORCE is always set when
>> run under TCG.
>> 
>> Alex Bennée (3):
>>   arm64: remove invalid check from its-trigger test
>>   arm64: enable its-migration tests for TCG
>>   arch-run: do not process ERRATA when running under TCG
>> 
>>  scripts/arch-run.bash |  4 +++-
>>  arm/gic.c | 16 ++--
>>  arm/unittests.cfg |  3 ---
>>  3 files changed, 9 insertions(+), 14 deletions(-)
>> 
>> -- 
>> 2.30.2
>> 
>> ___
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>
> Hi Alex,
>
> Thanks for this. I've applied to arm/queue, but I see that
>
> FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 pending LPI is received
>
> consistently fails for me. Is that expected? Does it work for you?

Hmm so it works with KVM so I suspect it's showing up an error in the
TCG ITS implementation which I didn't see in earlier runs.

>
> Thanks,
> drew


-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v3 0/3] GIC ITS tests

2021-11-12 Thread Alex Bennée

Andrew Jones  writes:

> On Fri, Nov 12, 2021 at 11:47:31AM +0000, Alex Bennée wrote:
>> Hi,
>> 
>> Sorry this has been sitting in my tree so long. The changes are fairly
>> minor from v2. I no longer split the tests up into TCG and KVM
>> versions and instead just ensure that ERRATA_FORCE is always set when
>> run under TCG.
>> 
>> Alex Bennée (3):
>>   arm64: remove invalid check from its-trigger test
>>   arm64: enable its-migration tests for TCG
>>   arch-run: do not process ERRATA when running under TCG
>> 
>>  scripts/arch-run.bash |  4 +++-
>>  arm/gic.c | 16 ++--
>>  arm/unittests.cfg |  3 ---
>>  3 files changed, 9 insertions(+), 14 deletions(-)
>> 
>> -- 
>> 2.30.2
>> 
>> ___
>> kvmarm mailing list
>> kvmarm@lists.cs.columbia.edu
>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>
> Hi Alex,
>
> Thanks for this. I've applied to arm/queue, but I see that
>
> FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 pending LPI is received
>
> consistently fails for me. Is that expected? Does it work for you?

doh - looks like I cocked up the merge conflict...

Did it fail for TCG or for KVM (or both)?

-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v3 2/3] arm64: enable its-migration tests for TCG

2021-11-12 Thread Alex Bennée
With the support for TCG emulated GIC we can also test these now.

Signed-off-by: Alex Bennée 
Reviewed-by: Eric Auger 
Reviewed-by: Andrew Jones 
Cc: Shashi Mallela 
Message-Id: <20210525172628.2088-4-alex.ben...@linaro.org>

---
v3
  - add its-migrate-unmapped-collection
---
 arm/unittests.cfg | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index f776b66..21474b8 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -194,7 +194,6 @@ arch = arm64
 [its-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migration'
 groups = its migration
 arch = arm64
@@ -202,7 +201,6 @@ arch = arm64
 [its-pending-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-pending-migration'
 groups = its migration
 arch = arm64
@@ -210,7 +208,6 @@ arch = arm64
 [its-migrate-unmapped-collection]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migrate-unmapped-collection'
 groups = its migration
 arch = arm64
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v3 3/3] arch-run: do not process ERRATA when running under TCG

2021-11-12 Thread Alex Bennée
All the errata checking looks at the current host kernel version. For
TCG runs this is entirely irrelevant as the host kernel has no impact
on the behaviour of the guest. In fact we should set ERRATA_FORCE to
ensure we run those tests as QEMU doesn't attempt to model
non-conforming architectures.
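
For context, lib/errata.h checks ERRATA_FORCE ahead of the per-erratum
variables, roughly like this (paraphrased from memory, check the tree
for the authoritative version):

  static inline bool errata_force(void)
  {
          char *s = getenv("ERRATA_FORCE");

          return s && (*s == 'y' || *s == 'Y');
  }

  static inline bool errata(const char *erratum)
  {
          char *s;

          if (errata_force())
                  return true;

          s = getenv(erratum);
          return s && (*s == 'y' || *s == 'Y');
  }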

Signed-off-by: Alex Bennée 
---
 scripts/arch-run.bash | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index 43da998..f1f4456 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -267,7 +267,9 @@ env_file ()
 
 env_errata ()
 {
-   if [ "$ERRATATXT" ] && [ ! -f "$ERRATATXT" ]; then
+   if [ "$ACCEL" = "tcg" ]; then
+   eval export "ERRATA_FORCE=y"
+   elif [ "$ERRATATXT" ] && [ ! -f "$ERRATATXT" ]; then
echo "$ERRATATXT not found. (ERRATATXT=$ERRATATXT)" >&2
return 2
elif [ "$ERRATATXT" ]; then
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v3 1/3] arm64: remove invalid check from its-trigger test

2021-11-12 Thread Alex Bennée
While an IRQ is not "guaranteed to be visible until an appropriate
invalidation", that doesn't stop the actual implementation delivering it
earlier if it wants to. This is the case for QEMU's TCG, and as tests
should only be checking architectural compliance this check is
invalid.

Signed-off-by: Alex Bennée 
Reviewed-by: Eric Auger 
Cc: Shashi Mallela 
Message-Id: <20210525172628.2088-2-alex.ben...@linaro.org>

---
v3
  - reflow the comment, drop "willingly do not call" as per Eric's suggestion
---
 arm/gic.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 98135ef..1e3ea80 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -732,21 +732,17 @@ static void test_its_trigger(void)
"dev2/eventid=20 does not trigger any LPI");
 
/*
-* re-enable the LPI but willingly do not call invall
-* so the change in config is not taken into account.
-* The LPI should not hit
+* re-enable the LPI. While "A change to the LPI configuration
+* is not guaranteed to be visible until an appropriate
+* invalidation operation has completed" hardware that doesn't
+* implement caches may have delivered the event at any point
+* after the enabling. Check the LPI has hit by the time the
+* invall is done.
 */
gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
stats_reset();
	cpumask_clear(&mask);
	its_send_int(dev2, 20);
-   wait_for_interrupts();
-   report(check_acked(&mask, -1, -1),
-   "dev2/eventid=20 still does not trigger any LPI");
-
-   /* Now call the invall and check the LPI hits */
-   stats_reset();
-   cpumask_clear(&mask);
	cpumask_set_cpu(3, &mask);
its_send_invall(col3);
wait_for_interrupts();
-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v3 0/3] GIC ITS tests

2021-11-12 Thread Alex Bennée
Hi,

Sorry this has been sitting in my tree so long. The changes are fairly
minor from v2. I no longer split the tests up into TCG and KVM
versions and instead just ensure that ERRATA_FORCE is always set when
run under TCG.

Alex Bennée (3):
  arm64: remove invalid check from its-trigger test
  arm64: enable its-migration tests for TCG
  arch-run: do not process ERRATA when running under TCG

 scripts/arch-run.bash |  4 +++-
 arm/gic.c | 16 ++--
 arm/unittests.cfg |  3 ---
 3 files changed, 9 insertions(+), 14 deletions(-)

-- 
2.30.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v2 4/4] arm64: split its-migrate-unmapped-collection into KVM and TCG variants

2021-06-01 Thread Alex Bennée

Andrew Jones  writes:

> On Tue, Jun 01, 2021 at 05:49:01PM +0100, Alex Bennée wrote:
>> 
>> Auger Eric  writes:
>> 
>> > Hi Alex,
>> >
>> > On 5/25/21 7:26 PM, Alex Bennée wrote:
>> >> When running the test in TCG we are basically running on bare metal so
>> >> don't rely on having a particular kernel errata applied.
>> >> 
>> >> You might wonder why we handle this with a totally new test name
>> >> instead of adjusting the append to take an extra parameter? Well the
>> >> run_migration shell script uses eval "$@" which unwraps the -append
>> >> leading to any second parameter being split and leaving QEMU very
>> >> confused and the test hanging. This seemed simpler than re-writing all
>> >> the test running logic in something sane ;-)
>> >
>> > there is
>> > lib/s390x/vm.h:bool vm_is_tcg(void)
>> >
>> > but I don't see any particular ID we could use to differentiate both the
>> > KVM and the TCG mode, do you?
>> 
>> For -cpu max we do:
>> 
>> /*
>>  * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a 
>> real
>>  * one and try to apply errata workarounds or use impdef features we
>>  * don't provide.
>>  * An IMPLEMENTER field of 0 means "reserved for software use";
>>  * ARCHITECTURE must be 0xf indicating "v7 or later, check ID 
>> registers
>>  * to see which features are present";
>>  * the VARIANT, PARTNUM and REVISION fields are all implementation
>>  * defined and we choose to define PARTNUM just in case guest
>>  * code needs to distinguish this QEMU CPU from other software
>>  * implementations, though this shouldn't be needed.
>>  */
>> t = FIELD_DP64(0, MIDR_EL1, IMPLEMENTER, 0);
>> t = FIELD_DP64(t, MIDR_EL1, ARCHITECTURE, 0xf);
>> t = FIELD_DP64(t, MIDR_EL1, PARTNUM, 'Q');
>> t = FIELD_DP64(t, MIDR_EL1, VARIANT, 0);
>> t = FIELD_DP64(t, MIDR_EL1, REVISION, 0);
>> cpu->midr = t;
>> 
>> However for the default -cpu cortex-a57 we aim to look just like the
>> real thing - only without any annoying micro-architecture bugs ;-)
>> 
>> >
>> > without a more elegant solution,
>> 
>> I'll look into the suggestion made by Richard.
>
> Where did Richard make a suggestion? And what is it?

Sorry - I had a brain fart, I was of course referring to your
ERRATA_FORCE suggestion.

>
> Thanks,
> drew
>
>> 
>> > Reviewed-by: Eric Auger 
>> >
>> > Thanks
>> >
>> > Eric
>> >
>> >
>> >> 
>> >> Signed-off-by: Alex Bennée 
>> >> Cc: Shashi Mallela 
>> >> ---
>> >>  arm/gic.c |  8 +++-
>> >>  arm/unittests.cfg | 10 +-
>> >>  2 files changed, 16 insertions(+), 2 deletions(-)
>> >> 
>> >> diff --git a/arm/gic.c b/arm/gic.c
>> >> index bef061a..0fce2a4 100644
>> >> --- a/arm/gic.c
>> >> +++ b/arm/gic.c
>> >> @@ -36,6 +36,7 @@ static struct gic *gic;
>> >>  static int acked[NR_CPUS], spurious[NR_CPUS];
>> >>  static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
>> >>  static cpumask_t ready;
>> >> +static bool under_tcg;
>> >>  
>> >>  static void nr_cpu_check(int nr)
>> >>  {
>> >> @@ -834,7 +835,7 @@ static void test_migrate_unmapped_collection(void)
>> >>   goto do_migrate;
>> >>   }
>> >>  
>> >> - if (!errata(ERRATA_UNMAPPED_COLLECTIONS)) {
>> >> + if (!errata(ERRATA_UNMAPPED_COLLECTIONS) && !under_tcg) {
>> >>   report_skip("Skipping test, as this test hangs without the fix. 
>> >> "
>> >>   "Set %s=y to enable.", ERRATA_UNMAPPED_COLLECTIONS);
>> >>   test_skipped = true;
>> >> @@ -1005,6 +1006,11 @@ int main(int argc, char **argv)
>> >>   report_prefix_push(argv[1]);
>> >>   test_migrate_unmapped_collection();
>> >>   report_prefix_pop();
>> >> + } else if (!strcmp(argv[1], "its-migrate-unmapped-collection-tcg")) {
>> >> + under_tcg = true;
>> >> + report_prefix_push(argv[1]);
>> >> + test_migrate_unmapped_collection();
>> >> + 

Re: [kvm-unit-tests PATCH v2 4/4] arm64: split its-migrate-unmapped-collection into KVM and TCG variants

2021-06-01 Thread Alex Bennée

Auger Eric  writes:

> Hi Alex,
>
> On 5/25/21 7:26 PM, Alex Bennée wrote:
>> When running the test in TCG we are basically running on bare metal so
>> don't rely on having a particular kernel errata applied.
>> 
>> You might wonder why we handle this with a totally new test name
>> instead of adjusting the append to take an extra parameter? Well the
>> run_migration shell script uses eval "$@" which unwraps the -append
>> leading to any second parameter being split and leaving QEMU very
>> confused and the test hanging. This seemed simpler than re-writing all
>> the test running logic in something sane ;-)
>
> there is
> lib/s390x/vm.h:bool vm_is_tcg(void)
>
> but I don't see any particular ID we could use to differentiate both the
> KVM and the TCG mode, do you?

For -cpu max we do:

/*
 * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
 * one and try to apply errata workarounds or use impdef features we
 * don't provide.
 * An IMPLEMENTER field of 0 means "reserved for software use";
 * ARCHITECTURE must be 0xf indicating "v7 or later, check ID registers
 * to see which features are present";
 * the VARIANT, PARTNUM and REVISION fields are all implementation
 * defined and we choose to define PARTNUM just in case guest
 * code needs to distinguish this QEMU CPU from other software
 * implementations, though this shouldn't be needed.
 */
t = FIELD_DP64(0, MIDR_EL1, IMPLEMENTER, 0);
t = FIELD_DP64(t, MIDR_EL1, ARCHITECTURE, 0xf);
t = FIELD_DP64(t, MIDR_EL1, PARTNUM, 'Q');
t = FIELD_DP64(t, MIDR_EL1, VARIANT, 0);
t = FIELD_DP64(t, MIDR_EL1, REVISION, 0);
cpu->midr = t;

However for the default -cpu cortex-a57 we aim to look just like the
real thing - only without any annoying micro-architecture bugs ;-)

>
> without a more elegant solution,

I'll look into the suggestion made by Richard.

> Reviewed-by: Eric Auger 
>
> Thanks
>
> Eric
>
>
>> 
>> Signed-off-by: Alex Bennée 
>> Cc: Shashi Mallela 
>> ---
>>  arm/gic.c |  8 +++-
>>  arm/unittests.cfg | 10 +-
>>  2 files changed, 16 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arm/gic.c b/arm/gic.c
>> index bef061a..0fce2a4 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -36,6 +36,7 @@ static struct gic *gic;
>>  static int acked[NR_CPUS], spurious[NR_CPUS];
>>  static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
>>  static cpumask_t ready;
>> +static bool under_tcg;
>>  
>>  static void nr_cpu_check(int nr)
>>  {
>> @@ -834,7 +835,7 @@ static void test_migrate_unmapped_collection(void)
>>  goto do_migrate;
>>  }
>>  
>> -if (!errata(ERRATA_UNMAPPED_COLLECTIONS)) {
>> +if (!errata(ERRATA_UNMAPPED_COLLECTIONS) && !under_tcg) {
>>  report_skip("Skipping test, as this test hangs without the fix. 
>> "
>>  "Set %s=y to enable.", ERRATA_UNMAPPED_COLLECTIONS);
>>  test_skipped = true;
>> @@ -1005,6 +1006,11 @@ int main(int argc, char **argv)
>>  report_prefix_push(argv[1]);
>>  test_migrate_unmapped_collection();
>>  report_prefix_pop();
>> +} else if (!strcmp(argv[1], "its-migrate-unmapped-collection-tcg")) {
>> +under_tcg = true;
>> +report_prefix_push(argv[1]);
>> +test_migrate_unmapped_collection();
>> +report_prefix_pop();
>>  } else if (strcmp(argv[1], "its-introspection") == 0) {
>>  report_prefix_push(argv[1]);
>>  test_its_introspection();
>> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
>> index 1a39428..adc1bbf 100644
>> --- a/arm/unittests.cfg
>> +++ b/arm/unittests.cfg
>> @@ -205,7 +205,7 @@ extra_params = -machine gic-version=3 -append 
>> 'its-pending-migration'
>>  groups = its migration
>>  arch = arm64
>>  
>> -[its-migrate-unmapped-collection]
>> +[its-migrate-unmapped-collection-kvm]
>>  file = gic.flat
>>  smp = $MAX_SMP
>>  accel = kvm
>> @@ -213,6 +213,14 @@ extra_params = -machine gic-version=3 -append 
>> 'its-migrate-unmapped-collection'
>>  groups = its migration
>>  arch = arm64
>>  
>> +[its-migrate-unmapped-collection-tcg]
>> +file = gic.flat
>> +smp = $MAX_SMP
>> +accel = tcg
>> +extra_params = -machine gic-version=3 -append 
>> 'its-migrate-unmapped-collection-tcg'
>> +groups = its migration
>> +arch = arm64
>> +
>>  # Test PSCI emulation
>>  [psci]
>>  file = psci.flat
>> 


-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests RFC PATCH] arm64: split pmu tests into tcg/kvm variants

2021-05-25 Thread Alex Bennée
QEMU is able to give a counter for instructions retired under TCG but
you need to enable -icount for it to work. Split the tests into
kvm/tcg variants to support this.

[AJB: I wonder if the solution is to have a totally separate
unittests.cfg for TCG mode here?]

Signed-off-by: Alex Bennée 
---
 arm/unittests.cfg | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index adc1bbf..2c4cf41 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -66,24 +66,40 @@ file = pmu.flat
 groups = pmu
 extra_params = -append 'cycle-counter 0'
 
-[pmu-event-introspection]
+[pmu-event-introspection-kvm]
 file = pmu.flat
 groups = pmu
 arch = arm64
+accel = kvm
 extra_params = -append 'pmu-event-introspection'
 
+[pmu-event-introspection-tcg]
+file = pmu.flat
+groups = pmu
+arch = arm64
+accel = tcg
+extra_params = -append 'pmu-event-introspection' -icount shift=1
+
 [pmu-event-counter-config]
 file = pmu.flat
 groups = pmu
 arch = arm64
 extra_params = -append 'pmu-event-counter-config'
 
-[pmu-basic-event-count]
+[pmu-basic-event-count-kvm]
 file = pmu.flat
 groups = pmu
 arch = arm64
+accel = kvm
 extra_params = -append 'pmu-basic-event-count'
 
+[pmu-basic-event-count-tcg]
+file = pmu.flat
+groups = pmu
+arch = arm64
+accel = tcg
+extra_params = -append 'pmu-basic-event-count' -icount shift=1
+
 [pmu-mem-access]
 file = pmu.flat
 groups = pmu
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v2 1/4] arm64: remove invalid check from its-trigger test

2021-05-25 Thread Alex Bennée
While an IRQ is not "guaranteed to be visible until an appropriate
invalidation", that doesn't stop the actual implementation delivering it
earlier if it wants to. This is the case for QEMU's TCG, and as tests
should only be checking architectural compliance this check is
invalid.

Signed-off-by: Alex Bennée 
Cc: Shashi Mallela 
---
 arm/gic.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 98135ef..bef061a 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -732,21 +732,19 @@ static void test_its_trigger(void)
"dev2/eventid=20 does not trigger any LPI");
 
/*
-* re-enable the LPI but willingly do not call invall
-* so the change in config is not taken into account.
-* The LPI should not hit
+* re-enable the LPI but willingly do not call invall so the
+* change in config is not taken into account. While "A change
+* to the LPI configuration is not guaranteed to be visible
+* until an appropriate invalidation operation has completed"
+* hardware that doesn't implement caches may have delivered
+* the event at any point after the enabling.
 */
gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
stats_reset();
	cpumask_clear(&mask);
	its_send_int(dev2, 20);
-   wait_for_interrupts();
-   report(check_acked(&mask, -1, -1),
-   "dev2/eventid=20 still does not trigger any LPI");

	/* Now call the invall and check the LPI hits */
-   stats_reset();
-   cpumask_clear(&mask);
	cpumask_set_cpu(3, &mask);
its_send_invall(col3);
wait_for_interrupts();
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v2 0/4] enable LPI and ITS for TCG

2021-05-25 Thread Alex Bennée
This is a companion to Shashi's series enabling LPI and ITS features
for QEMU's TCG emulation. This is part of our push for a sbsa-ref
platform which needs a more modern set of features. Since the last
posting this is now at v3:

  From: Shashi Mallela 
  Subject: [PATCH v3 0/8] GICv3 LPI and ITS feature implementation
  Date: Thu, 29 Apr 2021 19:41:53 -0400
  Message-Id: <20210429234201.125565-1-shashi.mall...@linaro.org>

The only real change from the last version was to drop the IMPDEF
behaviour check instead of trying to work around it for the TCG case.

Please review.

Alex Bennée (4):
  arm64: remove invalid check from its-trigger test
  scripts/arch-run: don't use deprecated server/nowait options
  arm64: enable its-migration tests for TCG
  arm64: split its-migrate-unmapped-collection into KVM and TCG variants

 scripts/arch-run.bash |  4 ++--
 arm/gic.c | 22 +-
 arm/unittests.cfg | 12 +---
 3 files changed, 24 insertions(+), 14 deletions(-)

-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v2 2/4] scripts/arch-run: don't use deprecated server/nowait options

2021-05-25 Thread Alex Bennée
The very fact that QEMU drops the deprecation warning while running is
enough to confuse the its-migration test into failing. The boolean
options server and wait have accepted the long form options for a long
time.

Signed-off-by: Alex Bennée 
Cc: Shashi Mallela 
---
 scripts/arch-run.bash | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index 5997e38..70693f2 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -122,14 +122,14 @@ run_migration ()
trap 'kill 0; exit 2' INT TERM
trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT
 
-   eval "$@" -chardev socket,id=mon1,path=${qmp1},server,nowait \
+   eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \
-mon chardev=mon1,mode=control | tee ${migout1} &
 
# We have to use cat to open the named FIFO, because named FIFO's, 
unlike
# pipes, will block on open() until the other end is also opened, and 
that
# totally breaks QEMU...
mkfifo ${fifo}
-   eval "$@" -chardev socket,id=mon2,path=${qmp2},server,nowait \
+   eval "$@" -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \
-mon chardev=mon2,mode=control -incoming unix:${migsock} < 
<(cat ${fifo}) &
incoming_pid=`jobs -l %+ | awk '{print$2}'`
 
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v2 4/4] arm64: split its-migrate-unmapped-collection into KVM and TCG variants

2021-05-25 Thread Alex Bennée
When running the test in TCG we are basically running on bare metal so
don't rely on having a particular kernel errata applied.

You might wonder why we handle this with a totally new test name
instead of adjusting the append to take an extra parameter? Well the
run_migration shell script uses eval "$@" which unwraps the -append
leading to any second parameter being split and leaving QEMU very
confused and the test hanging. This seemed simpler than re-writing all
the test running logic in something sane ;-)
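
To illustrate the eval problem (a contrived shell snippet, not from the
scripts themselves):

  set -- qemu -append 'its-migration tcg'
  eval "$@"    # re-parses as: qemu -append its-migration tcg

The single quotes are gone by the time eval re-reads the words, so the
second token of -append becomes a stray argument.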

Signed-off-by: Alex Bennée 
Cc: Shashi Mallela 
---
 arm/gic.c |  8 +++-
 arm/unittests.cfg | 10 +-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index bef061a..0fce2a4 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -36,6 +36,7 @@ static struct gic *gic;
 static int acked[NR_CPUS], spurious[NR_CPUS];
 static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
 static cpumask_t ready;
+static bool under_tcg;
 
 static void nr_cpu_check(int nr)
 {
@@ -834,7 +835,7 @@ static void test_migrate_unmapped_collection(void)
goto do_migrate;
}
 
-   if (!errata(ERRATA_UNMAPPED_COLLECTIONS)) {
+   if (!errata(ERRATA_UNMAPPED_COLLECTIONS) && !under_tcg) {
report_skip("Skipping test, as this test hangs without the fix. 
"
"Set %s=y to enable.", ERRATA_UNMAPPED_COLLECTIONS);
test_skipped = true;
@@ -1005,6 +1006,11 @@ int main(int argc, char **argv)
report_prefix_push(argv[1]);
test_migrate_unmapped_collection();
report_prefix_pop();
+   } else if (!strcmp(argv[1], "its-migrate-unmapped-collection-tcg")) {
+   under_tcg = true;
+   report_prefix_push(argv[1]);
+   test_migrate_unmapped_collection();
+   report_prefix_pop();
} else if (strcmp(argv[1], "its-introspection") == 0) {
report_prefix_push(argv[1]);
test_its_introspection();
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 1a39428..adc1bbf 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -205,7 +205,7 @@ extra_params = -machine gic-version=3 -append 
'its-pending-migration'
 groups = its migration
 arch = arm64
 
-[its-migrate-unmapped-collection]
+[its-migrate-unmapped-collection-kvm]
 file = gic.flat
 smp = $MAX_SMP
 accel = kvm
@@ -213,6 +213,14 @@ extra_params = -machine gic-version=3 -append 
'its-migrate-unmapped-collection'
 groups = its migration
 arch = arm64
 
+[its-migrate-unmapped-collection-tcg]
+file = gic.flat
+smp = $MAX_SMP
+accel = tcg
+extra_params = -machine gic-version=3 -append 
'its-migrate-unmapped-collection-tcg'
+groups = its migration
+arch = arm64
+
 # Test PSCI emulation
 [psci]
 file = psci.flat
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v2 3/4] arm64: enable its-migration tests for TCG

2021-05-25 Thread Alex Bennée
With the support for TCG emulated GIC we can also test these now.

Signed-off-by: Alex Bennée 
Cc: Shashi Mallela 
---
 arm/unittests.cfg | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index f776b66..1a39428 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -194,7 +194,6 @@ arch = arm64
 [its-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migration'
 groups = its migration
 arch = arm64
@@ -202,7 +201,6 @@ arch = arm64
 [its-pending-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-pending-migration'
 groups = its migration
 arch = arm64
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v1 1/4] arm64: split its-trigger test into KVM and TCG variants

2021-04-28 Thread Alex Bennée

Alex Bennée  writes:

> Marc Zyngier  writes:
>
>> On Wed, 28 Apr 2021 15:00:15 +0100,
>> Alexandru Elisei  wrote:
>>> 
>>> I interpret that as that an INVALL guarantees that a change is
>>> visible, but it the change can become visible even without the
>>> INVALL.
>>
>> Yes. Expecting the LPI to be delivered or not in the absence of an
>> invalidate when its configuration has been altered is wrong. The
>> architecture doesn't guarantee anything of the sort.
>
> Is the underlying hypervisor allowed to invalidate and reload the
> configuration whenever it wants or should it only be driven by the
> guests requests?
>
> I did consider a more nuanced variant of the test that allowed for a
> delivery pre-inval and a pass for post-inval as long as it had been
> delivered one way or another:
>
> --8<---cut here---start->8---
> modified   arm/gic.c
> @@ -36,6 +36,7 @@ static struct gic *gic;
>  static int acked[NR_CPUS], spurious[NR_CPUS];
>  static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
>  static cpumask_t ready;
> +static bool under_tcg;
>  
>  static void nr_cpu_check(int nr)
>  {
> @@ -687,6 +688,7 @@ static void test_its_trigger(void)
>   struct its_collection *col3;
>   struct its_device *dev2, *dev7;
>   cpumask_t mask;
> + bool before, after;
>  
>   if (its_setup1())
>   return;
> @@ -734,15 +736,17 @@ static void test_its_trigger(void)
>   /*
>* re-enable the LPI but willingly do not call invall
>* so the change in config is not taken into account.
> -  * The LPI should not hit
> +  * The LPI should not hit. This does however depend on
> +  * implementation defined behaviour - under QEMU TCG emulation
> +  * it can quite correctly process the event directly.
>*/
>   gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
>   stats_reset();
> 	cpumask_clear(&mask);
> 	its_send_int(dev2, 20);
> 	wait_for_interrupts();
> - report(check_acked(&mask, -1, -1),
> - "dev2/eventid=20 still does not trigger any LPI");
> + before = check_acked(&mask, -1, -1);
> + report_xfail(under_tcg, before, "dev2/eventid=20 still may not trigger 
> any LPI");
>  
> 	/* Now call the invall and check the LPI hits */
> 	stats_reset();
> @@ -750,8 +754,8 @@ static void test_its_trigger(void)
> 	cpumask_set_cpu(3, &mask);
> 	its_send_invall(col3);
> 	wait_for_interrupts();
> - report(check_acked(&mask, 0, 8195),
> - "dev2/eventid=20 pending LPI is received");
> + after = check_acked(&mask, 0, 8195);
> + report(before != after, "dev2/eventid=20 pending LPI is
>   received");

Actually that should be:

 report(after || !before, "dev2/eventid=20 pending LPI is received");

so either the IRQ arrives after the flush or it had previously.

-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v1 1/4] arm64: split its-trigger test into KVM and TCG variants

2021-04-28 Thread Alex Bennée
44 valid=0
  INT dev_id=2 event_id=20
  PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
  SUMMARY: 7 tests, 1 unexpected failures, 1 expected failures

>> The test relies on the fact that changes to the LPI tables are not
>> visible *under KVM* until the INVALL command, but that's not
>> necessarily the case on real hardware. To match the spec, I think
>> the test "dev2/eventid=20 still does not trigger any LPI" should be
>> removed and the stats reset should take place before the
>> configuration for LPI 8195 is set to the default.
>
> If that's what the test expects (I haven't tried to investigate), it
> should be dropped completely, rather than trying to sidestep it for
> TCG.

All three parts of that section?

report(check_acked(&mask, -1, -1),
"dev2/eventid=20 still does not trigger any LPI");
report(check_acked(&mask, 0, 8195),
"dev2/eventid=20 pending LPI is received");
report(check_acked(&mask, 0, 8195),
"dev2/eventid=20 now triggers an LPI");


-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvm-unit-tests PATCH v1 1/4] arm64: split its-trigger test into KVM and TCG variants

2021-04-28 Thread Alex Bennée

Marc Zyngier  writes:

> On 2021-04-28 11:18, Alex Bennée wrote:
>> A few of the its-trigger tests rely on IMPDEF behaviour where caches
>> aren't flushed before invall events. However TCG emulation doesn't
>> model any invall behaviour and as we can't probe for it we need to be
>> told. Split the test into a KVM and TCG variant and skip the invall
>> tests when under TCG.
>> Signed-off-by: Alex Bennée 
>> Cc: Shashi Mallela 
>> ---
>>  arm/gic.c | 60 +++
>>  arm/unittests.cfg | 11 -
>>  2 files changed, 45 insertions(+), 26 deletions(-)
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 98135ef..96a329d 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -36,6 +36,7 @@ static struct gic *gic;
>>  static int acked[NR_CPUS], spurious[NR_CPUS];
>>  static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
>>  static cpumask_t ready;
>> +static bool under_tcg;
>>  static void nr_cpu_check(int nr)
>>  {
>> @@ -734,32 +735,38 @@ static void test_its_trigger(void)
>>  /*
>>   * re-enable the LPI but willingly do not call invall
>>   * so the change in config is not taken into account.
>> - * The LPI should not hit
>> + * The LPI should not hit. This does however depend on
>> + * implementation defined behaviour - under QEMU TCG emulation
>> + * it can quite correctly process the event directly.
>
> It looks to me that you are using an IMPDEF behaviour of *TCG*
> here. The programming model mandates that there is an invalidation
> if you change the configuration of the LPI.

But does it mandate that the LPI cannot be sent until the invalidation?

>
> M.


-- 
Alex Bennée
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v1 0/4] enable LPI and ITS for TCG

2021-04-28 Thread Alex Bennée
This is a companion to Shashi's series enabling LPI and ITS features
for QEMU's TCG emulation. This is part of our push for a sbsa-ref
platform which needs a more modern set of features.

  From: Shashi Mallela 
  Subject: [PATCH v2 0/8] GICv3 LPI and ITS feature implementation
  Date: Wed, 31 Mar 2021 22:41:44 -0400
  Message-Id: <20210401024152.203896-1-shashi.mall...@linaro.org>

Most of the changes are minor except the its-trigger test which skips
invall handling checks which I think is relying on IMPDEF behaviour
which we can't probe for. There is also a hilarious work around to
some limitations in the run_migration() script in the last patch.

Alex Bennée (4):
  arm64: split its-trigger test into KVM and TCG variants
  scripts/arch-run: don't use deprecated server/nowait options
  arm64: enable its-migration tests for TCG
  arm64: split its-migrate-unmapped-collection into KVM and TCG variants

 scripts/arch-run.bash |  4 +--
 arm/gic.c | 67 ++-
 arm/unittests.cfg | 23 ---
 3 files changed, 62 insertions(+), 32 deletions(-)

-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v1 3/4] arm64: enable its-migration tests for TCG

2021-04-28 Thread Alex Bennée
With the support for TCG emulated GIC we can also test these now. You
need to call run_tests.sh with -a to trigger the
its-migrate-unmapped-collection test which obviously doesn't need the
KVM errata to run in TCG system emulation mode.

Signed-off-by: Alex Bennée 
---
 arm/unittests.cfg | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index c72dc34..d4dbc8b 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -203,7 +203,6 @@ arch = arm64
 [its-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migration'
 groups = its migration
 arch = arm64
@@ -211,7 +210,6 @@ arch = arm64
 [its-pending-migration]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-pending-migration'
 groups = its migration
 arch = arm64
@@ -219,7 +217,6 @@ arch = arm64
 [its-migrate-unmapped-collection]
 file = gic.flat
 smp = $MAX_SMP
-accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migrate-unmapped-collection'
 groups = its migration
 arch = arm64
-- 
2.20.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[kvm-unit-tests PATCH v1 4/4] arm64: split its-migrate-unmapped-collection into KVM and TCG variants

2021-04-28 Thread Alex Bennée
When running the test in TCG we are basically running on bare metal so
don't rely on having a particular kernel errata applied.

You might wonder why we handle this with a totally new test name
instead of adjusting the append as we have before? Well the
run_migration shell script uses eval "$@" which unwraps the -append
leading to any second parameter being split and leaving QEMU very
confused and the test hanging. This seemed simpler than re-writing all
the test running logic in something sane ;-)

Signed-off-by: Alex Bennée 
Cc: Shashi Mallela 
---
 arm/gic.c |  7 ++-
 arm/unittests.cfg | 11 ++-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 96a329d..3bc7477 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -843,7 +843,7 @@ static void test_migrate_unmapped_collection(void)
goto do_migrate;
}
 
-   if (!errata(ERRATA_UNMAPPED_COLLECTIONS)) {
+   if (!errata(ERRATA_UNMAPPED_COLLECTIONS) && !under_tcg) {
report_skip("Skipping test, as this test hangs without the fix. 
"
"Set %s=y to enable.", ERRATA_UNMAPPED_COLLECTIONS);
test_skipped = true;
@@ -1017,6 +1017,11 @@ int main(int argc, char **argv)
report_prefix_push(argv[1]);
test_migrate_unmapped_collection();
report_prefix_pop();
+   } else if (!strcmp(argv[1], "its-migrate-unmapped-collection-tcg")) {
+   under_tcg = true;
+   report_prefix_push(argv[1]);
+   test_migrate_unmapped_collection();
+   report_prefix_pop();
} else if (strcmp(argv[1], "its-introspection") == 0) {
report_prefix_push(argv[1]);
test_its_introspection();
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index d4dbc8b..e8f2e74 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -214,13 +214,22 @@ extra_params = -machine gic-version=3 -append 
'its-pending-migration'
 groups = its migration
 arch = arm64
 
-[its-migrate-unmapped-collection]
+[its-migrate-unmapped-collection-kvm]
 file = gic.flat
 smp = $MAX_SMP
+accel = kvm
 extra_params = -machine gic-version=3 -append 'its-migrate-unmapped-collection'
 groups = its migration
 arch = arm64
 
+[its-migrate-unmapped-collection-tcg]
+file = gic.flat
+smp = $MAX_SMP
+accel = tcg
+extra_params = -machine gic-version=3 -append 'its-migrate-unmapped-collection-tcg'
+groups = its migration
+arch = arm64
+
 # Test PSCI emulation
 [psci]
 file = psci.flat
-- 
2.20.1



[kvm-unit-tests PATCH v1 1/4] arm64: split its-trigger test into KVM and TCG variants

2021-04-28 Thread Alex Bennée
A few of the its-trigger tests rely on IMPDEF behaviour where caches
aren't flushed before invall events. However, TCG emulation doesn't
model any invall behaviour, and as we can't probe for it we need to be
told. Split the test into KVM and TCG variants and skip the invall
tests when running under TCG.

Signed-off-by: Alex Bennée 
Cc: Shashi Mallela 
---
 arm/gic.c | 60 +++
 arm/unittests.cfg | 11 -
 2 files changed, 45 insertions(+), 26 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 98135ef..96a329d 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -36,6 +36,7 @@ static struct gic *gic;
 static int acked[NR_CPUS], spurious[NR_CPUS];
 static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
 static cpumask_t ready;
+static bool under_tcg;
 
 static void nr_cpu_check(int nr)
 {
@@ -734,32 +735,38 @@ static void test_its_trigger(void)
/*
 * re-enable the LPI but willingly do not call invall
 * so the change in config is not taken into account.
-* The LPI should not hit
+* The LPI should not hit. This does however depend on
+* implementation defined behaviour - under QEMU TCG emulation
+* it can quite correctly process the event directly.
 */
-   gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
-   stats_reset();
-   cpumask_clear(&mask);
-   its_send_int(dev2, 20);
-   wait_for_interrupts(&mask);
-   report(check_acked(&mask, -1, -1),
-   "dev2/eventid=20 still does not trigger any LPI");
-
-   /* Now call the invall and check the LPI hits */
-   stats_reset();
-   cpumask_clear(&mask);
-   cpumask_set_cpu(3, &mask);
-   its_send_invall(col3);
-   wait_for_interrupts(&mask);
-   report(check_acked(&mask, 0, 8195),
-   "dev2/eventid=20 pending LPI is received");
-
-   stats_reset();
-   cpumask_clear(&mask);
-   cpumask_set_cpu(3, &mask);
-   its_send_int(dev2, 20);
-   wait_for_interrupts(&mask);
-   report(check_acked(&mask, 0, 8195),
-   "dev2/eventid=20 now triggers an LPI");
+   if (under_tcg) {
+   report_skip("checking LPI triggers without invall");
+   } else {
+   gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
+   stats_reset();
+   cpumask_clear(&mask);
+   its_send_int(dev2, 20);
+   wait_for_interrupts(&mask);
+   report(check_acked(&mask, -1, -1),
+  "dev2/eventid=20 still does not trigger any LPI");
+
+   /* Now call the invall and check the LPI hits */
+   stats_reset();
+   cpumask_clear(&mask);
+   cpumask_set_cpu(3, &mask);
+   its_send_invall(col3);
+   wait_for_interrupts(&mask);
+   report(check_acked(&mask, 0, 8195),
+  "dev2/eventid=20 pending LPI is received");
+
+   stats_reset();
+   cpumask_clear(&mask);
+   cpumask_set_cpu(3, &mask);
+   its_send_int(dev2, 20);
+   wait_for_interrupts(&mask);
+   report(check_acked(&mask, 0, 8195),
+  "dev2/eventid=20 now triggers an LPI");
+   }
 
report_prefix_pop();
 
@@ -981,6 +988,9 @@ int main(int argc, char **argv)
if (argc < 2)
report_abort("no test specified");
 
+   if (argc == 3 && strcmp(argv[2], "tcg") == 0)
+   under_tcg = true;
+
if (strcmp(argv[1], "ipi") == 0) {
report_prefix_push(argv[1]);
nr_cpu_check(2);
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index f776b66..c72dc34 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -184,13 +184,22 @@ extra_params = -machine gic-version=3 -append 'its-introspection'
 groups = its
 arch = arm64
 
-[its-trigger]
+[its-trigger-kvm]
 file = gic.flat
 smp = $MAX_SMP
+accel = kvm
 extra_params = -machine gic-version=3 -append 'its-trigger'
 groups = its
 arch = arm64
 
+[its-trigger-tcg]
+file = gic.flat
+smp = $MAX_SMP
+accel = tcg
+extra_params = -machine gic-version=3 -append 'its-trigger tcg'
+groups = its
+arch = arm64
+
 [its-migration]
 file = gic.flat
 smp = $MAX_SMP
-- 
2.20.1



[kvm-unit-tests PATCH v1 2/4] scripts/arch-run: don't use deprecated server/nowait options

2021-04-28 Thread Alex Bennée
The mere fact that QEMU emits a deprecation warning while running is
enough to confuse the its-migration test into failing. The boolean
server and wait options have accepted the long-form spellings for a
long time.

Signed-off-by: Alex Bennée 
---
 scripts/arch-run.bash | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index 5997e38..70693f2 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -122,14 +122,14 @@ run_migration ()
trap 'kill 0; exit 2' INT TERM
trap 'rm -f ${migout1} ${migsock} ${qmp1} ${qmp2} ${fifo}' RETURN EXIT
 
-   eval "$@" -chardev socket,id=mon1,path=${qmp1},server,nowait \
+   eval "$@" -chardev socket,id=mon1,path=${qmp1},server=on,wait=off \
-mon chardev=mon1,mode=control | tee ${migout1} &
 
	# We have to use cat to open the named FIFO, because named FIFO's, unlike
	# pipes, will block on open() until the other end is also opened, and that
	# totally breaks QEMU...
	mkfifo ${fifo}
-   eval "$@" -chardev socket,id=mon2,path=${qmp2},server,nowait \
+   eval "$@" -chardev socket,id=mon2,path=${qmp2},server=on,wait=off \
		-mon chardev=mon2,mode=control -incoming unix:${migsock} < <(cat ${fifo}) &
incoming_pid=`jobs -l %+ | awk '{print$2}'`
 
-- 
2.20.1



Re: [PATCH v1 3/3] kernel/configs: don't include PCI_QUIRKS in KVM guest configs

2020-08-04 Thread Alex Bennée

Marc Zyngier  writes:

> On 2020-08-04 15:44, Alex Bennée wrote:
>> Marc Zyngier  writes:
>> 
>>> On 2020-08-04 13:44, Alex Bennée wrote:
>>>> The VIRTIO_PCI support is an idealised PCI bus, we don't need a bunch
>>>> of bloat for real world hardware for a VirtIO guest.
>>> 
>>> Who says this guest will only have virtio devices?
>> 
>> This is true - although what is the point of kvm_guest.config? We
>> certainly turn on a whole bunch of virt optimised pathways with 
>> PARAVIRT
>> and HYPERVISOR_GUEST along with the rest of VirtIO.
>
> Most of which actually qualifies as bloat itself as far as KVM/arm64
> is concerned...

So here is the question - does the kernel care about having a blessed
config for a minimal viable guest? They are certainly used in the cloud
but I understand the kernel is trying to get away from having a zoo of
configs. What is the actual point of kvm_guest.config? Just an easy
enabling for developers?

>
>   M.


-- 
Alex Bennée


Re: [RFC PATCH v1 0/3] put arm64 kvm_config on a diet

2020-08-04 Thread Alex Bennée

Ard Biesheuvel  writes:

> On Tue, 4 Aug 2020 at 14:45, Alex Bennée  wrote:
>>
>> Hi,
>>
>> When building guest kernels for virtualisation we were bringing in a
>> bunch of stuff from physical hardware which we don't need for our
>> idealised fixable virtual PCI devices. This series makes some Kconfig
>> changes to allow the ThunderX and XGene PCI drivers to be compiled
>> out. It also drops PCI_QUIRKS from the KVM guest build as a virtual
>> PCI device should be quirk free.
>>
>
> What about PCI passthrough?

That is a good point - how much of the host PCI controller is visible to
a pass-through guest?

AIUI in passthrough the driver only interacts with the particular card's
IO window. How many quirks are visible just at the device level (rather
than at the bus itself)?

That said I think the last patch might get dropped as long as the user
has the option to slim down their kernel with the first two.

>
>> This is my first time hacking around Kconfig so I hope I've got the
>> balance between depends and selects right but please let me know if it
>> could be specified in a cleaner way.
>>
>> Alex Bennée (3):
>>   arm64: allow de-selection of ThunderX PCI controllers
>>   arm64: gate the whole of pci-xgene on CONFIG_PCI_XGENE
>>   kernel/configs: don't include PCI_QUIRKS in KVM guest configs
>>
>>  arch/arm64/Kconfig.platforms| 2 ++
>>  arch/arm64/configs/defconfig| 1 +
>>  drivers/pci/controller/Kconfig  | 7 +++
>>  drivers/pci/controller/Makefile | 8 +++-
>>  kernel/configs/kvm_guest.config | 1 +
>>  5 files changed, 14 insertions(+), 5 deletions(-)
>>
>> --
>> 2.20.1
>>
>>


-- 
Alex Bennée


Re: [PATCH v1 3/3] kernel/configs: don't include PCI_QUIRKS in KVM guest configs

2020-08-04 Thread Alex Bennée

Marc Zyngier  writes:

> On 2020-08-04 13:44, Alex Bennée wrote:
>> The VIRTIO_PCI support is an idealised PCI bus, we don't need a bunch
>> of bloat for real world hardware for a VirtIO guest.
>
> Who says this guest will only have virtio devices?

This is true - although what is the point of kvm_guest.config? We
certainly turn on a whole bunch of virt optimised pathways with PARAVIRT
and HYPERVISOR_GUEST along with the rest of VirtIO.

> Or even, virtio devices without bugs? Given that said device can
> come from any VMM, I'm not sure this is the right thing to do.

Perhaps this patch is one too far. I don't mind dropping it as long as I
can still slim down the kernels I know don't need the extra bloat.

>
> Thanks,
>
>      M.
>
>> 
>> Signed-off-by: Alex Bennée 
>> ---
>>  kernel/configs/kvm_guest.config | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/kernel/configs/kvm_guest.config 
>> b/kernel/configs/kvm_guest.config
>> index 208481d91090..672863a2fdf1 100644
>> --- a/kernel/configs/kvm_guest.config
>> +++ b/kernel/configs/kvm_guest.config
>> @@ -13,6 +13,7 @@ CONFIG_IP_PNP_DHCP=y
>>  CONFIG_BINFMT_ELF=y
>>  CONFIG_PCI=y
>>  CONFIG_PCI_MSI=y
>> +CONFIG_PCI_QUIRKS=n
>>  CONFIG_DEBUG_KERNEL=y
>>  CONFIG_VIRTUALIZATION=y
>>  CONFIG_HYPERVISOR_GUEST=y


-- 
Alex Bennée


[PATCH v1 2/3] arm64: gate the whole of pci-xgene on CONFIG_PCI_XGENE

2020-08-04 Thread Alex Bennée
This is a little weirder as bits of the file are already conditioned
on the existing symbol. Either way they are not actually needed for
non-xgene machines, saving another 12k:

-rwxr-xr-x 1 alex alex  86033880 Aug  3 16:39 vmlinux.orig*
-rwxr-xr-x 1 alex alex  85652472 Aug  3 16:54 vmlinux.rm-thunder*
-rwxr-xr-x 1 alex alex  85639808 Aug  3 17:12 vmlinux*

Signed-off-by: Alex Bennée 
---
 drivers/pci/controller/Makefile | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
index 8fad4781a5d3..3b9b72f5773a 100644
--- a/drivers/pci/controller/Makefile
+++ b/drivers/pci/controller/Makefile
@@ -47,6 +47,4 @@ obj-y += mobiveil/
 
 obj-$(CONFIG_PCI_THUNDER) += pci-thunder-ecam.o
 obj-$(CONFIG_PCI_THUNDER) += pci-thunder-pem.o
-ifdef CONFIG_PCI
-obj-$(CONFIG_ARM64) += pci-xgene.o
-endif
+obj-$(CONFIG_PCI_XGENE) += pci-xgene.o
-- 
2.20.1



[PATCH v1 3/3] kernel/configs: don't include PCI_QUIRKS in KVM guest configs

2020-08-04 Thread Alex Bennée
The VIRTIO_PCI support is an idealised PCI bus; we don't need a bunch
of bloat for real-world hardware in a VirtIO guest.

Signed-off-by: Alex Bennée 
---
 kernel/configs/kvm_guest.config | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/configs/kvm_guest.config b/kernel/configs/kvm_guest.config
index 208481d91090..672863a2fdf1 100644
--- a/kernel/configs/kvm_guest.config
+++ b/kernel/configs/kvm_guest.config
@@ -13,6 +13,7 @@ CONFIG_IP_PNP_DHCP=y
 CONFIG_BINFMT_ELF=y
 CONFIG_PCI=y
 CONFIG_PCI_MSI=y
+CONFIG_PCI_QUIRKS=n
 CONFIG_DEBUG_KERNEL=y
 CONFIG_VIRTUALIZATION=y
 CONFIG_HYPERVISOR_GUEST=y
-- 
2.20.1



[PATCH v1 1/3] arm64: allow de-selection of ThunderX PCI controllers

2020-08-04 Thread Alex Bennée
For a pure VirtIO guest, bringing in all the PCI quirk handling adds a
significant amount of bloat to the kernel that we don't need. Solve this
by adding a CONFIG symbol for the ThunderX PCI devices and allowing it
to be turned off, saving over 300k from the uncompressed vmlinux:

  -rwxr-xr-x 1 alex alex  85652472 Aug  3 16:48 vmlinux*
  -rwxr-xr-x 1 alex alex  86033880 Aug  3 16:39 vmlinux.orig*

Signed-off-by: Alex Bennée 
Cc: Robert Richter 
Cc: linux-...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
---
 arch/arm64/Kconfig.platforms| 2 ++
 arch/arm64/configs/defconfig| 1 +
 drivers/pci/controller/Kconfig  | 7 +++
 drivers/pci/controller/Makefile | 4 ++--
 4 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index 8dd05b2a925c..a328eebdaa59 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -253,12 +253,14 @@ config ARCH_SPRD
 
 config ARCH_THUNDER
bool "Cavium Inc. Thunder SoC Family"
+select PCI_THUNDER
help
  This enables support for Cavium's Thunder Family of SoCs.
 
 config ARCH_THUNDER2
bool "Cavium ThunderX2 Server Processors"
select GPIOLIB
+select PCI_THUNDER
help
  This enables support for Cavium's ThunderX2 CN99XX family of
  server processors.
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 2ca7ba69c318..d840cba99941 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -199,6 +199,7 @@ CONFIG_PCI_HOST_GENERIC=y
 CONFIG_PCI_XGENE=y
 CONFIG_PCIE_ALTERA=y
 CONFIG_PCIE_ALTERA_MSI=y
+CONFIG_PCI_THUNDER=y
 CONFIG_PCI_HOST_THUNDER_PEM=y
 CONFIG_PCI_HOST_THUNDER_ECAM=y
 CONFIG_PCIE_ROCKCHIP_HOST=m
diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index adddf21fa381..28335ffa5d48 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -286,6 +286,13 @@ config PCI_LOONGSON
  Say Y here if you want to enable PCI controller support on
  Loongson systems.
 
+config PCI_THUNDER
+   bool "Thunder X PCIE controllers"
+   depends on ARM64
+   select PCI_QUIRKS
+   help
+  Say Y here to enable ThunderX ECAM and PEM PCI controllers.
+
 source "drivers/pci/controller/dwc/Kconfig"
 source "drivers/pci/controller/mobiveil/Kconfig"
 source "drivers/pci/controller/cadence/Kconfig"
diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
index efd9733ead26..8fad4781a5d3 100644
--- a/drivers/pci/controller/Makefile
+++ b/drivers/pci/controller/Makefile
@@ -45,8 +45,8 @@ obj-y += mobiveil/
 # ARM64 and use internal ifdefs to only build the pieces we need
 # depending on whether ACPI, the DT driver, or both are enabled.
 
+obj-$(CONFIG_PCI_THUNDER) += pci-thunder-ecam.o
+obj-$(CONFIG_PCI_THUNDER) += pci-thunder-pem.o
 ifdef CONFIG_PCI
-obj-$(CONFIG_ARM64) += pci-thunder-ecam.o
-obj-$(CONFIG_ARM64) += pci-thunder-pem.o
 obj-$(CONFIG_ARM64) += pci-xgene.o
 endif
-- 
2.20.1



[RFC PATCH v1 0/3] put arm64 kvm_config on a diet

2020-08-04 Thread Alex Bennée
Hi,

When building guest kernels for virtualisation we were bringing in a
bunch of stuff from physical hardware which we don't need for our
idealised fixable virtual PCI devices. This series makes some Kconfig
changes to allow the ThunderX and XGene PCI drivers to be compiled
out. It also drops PCI_QUIRKS from the KVM guest build as a virtual
PCI device should be quirk free.

This is my first time hacking around Kconfig so I hope I've got the
balance between depends and selects right but please let me know if it
could be specified in a cleaner way.

Alex Bennée (3):
  arm64: allow de-selection of ThunderX PCI controllers
  arm64: gate the whole of pci-xgene on CONFIG_PCI_XGENE
  kernel/configs: don't include PCI_QUIRKS in KVM guest configs

 arch/arm64/Kconfig.platforms| 2 ++
 arch/arm64/configs/defconfig| 1 +
 drivers/pci/controller/Kconfig  | 7 +++
 drivers/pci/controller/Makefile | 8 +++-
 kernel/configs/kvm_guest.config | 1 +
 5 files changed, 14 insertions(+), 5 deletions(-)

-- 
2.20.1



Re: [kvm-unit-tests PATCH] arm: expand the timer tests

2020-01-13 Thread Alex Bennée

Alexandru Elisei  writes:

> Hi,
>
> On 1/10/20 4:05 PM, Alex Bennée wrote:
>> This was an attempt to replicate a QEMU bug. However to trigger the
>> bug you need to have an offset set in EL2 which kvm-unit-tests is
>> unable to do. However it does exercise some more corner cases.
>>
>> Bug: https://bugs.launchpad.net/bugs/1859021
>
> I'm not aware of any Bug: tags in the Linux kernel. If you want people to 
> follow
> the link to the bug, how about referencing something like this:
>
> "This was an attempt to replicate a QEMU bug [1]. [..]
>
> [1] https://bugs.launchpad.net/qemu/+bug/1859021"

OK, I'll fix that in v2.

>
> Also, are launchpad bug reports permanent? Will the link still work in
> a year's time?

They should be - they are a unique id and we use them in the QEMU source
tree.

>
>> Signed-off-by: Alex Bennée 
>> ---
>>  arm/timer.c | 27 ++-
>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/arm/timer.c b/arm/timer.c
>> index f390e8e..ae1d299 100644
>> --- a/arm/timer.c
>> +++ b/arm/timer.c
>> @@ -214,21 +214,46 @@ static void test_timer(struct timer_info *info)
>>   * still read the pending state even if it's disabled. */
>>  set_timer_irq_enabled(info, false);
>>  
>> +/* Verify count goes up */
>> +report(info->read_counter() >= now, "counter increments");
>> +
>>  /* Enable the timer, but schedule it for much later */
>>  info->write_cval(later);
>>  info->write_ctl(ARCH_TIMER_CTL_ENABLE);
>>  isb();
>> -report(!gic_timer_pending(info), "not pending before");
>> +report(!gic_timer_pending(info), "not pending before 10s");
>> +
>> +/* Check with a maximum possible cval */
>> +info->write_cval(UINT64_MAX);
>> +isb();
>> +report(!gic_timer_pending(info), "not pending before UINT64_MAX");
>> +
>> +/* also by setting tval */
>
> All the comments in this file seem to start with a capital letter.
>
>> +info->write_tval(time_10s);
>> +isb();
>> +report(!gic_timer_pending(info), "not pending before 10s (via tval)");
>
> You can remove the "(via tval)" part - the message is unique enough to figure 
> out
> which part of the test it refers to.

I added it to differentiate it from the similar message a little further above.

>> +report_info("TVAL is %d (delta CVAL %ld) ticks",
>> +info->read_tval(), info->read_cval() - 
>> info->read_counter());
>
> I'm not sure what you are trying to achieve with this. You can transform it to
> check that TVAL is indeed positive and (almost) equal to cval - cntpct, 
> something
> like this:
>
> + s32 tval = info->read_tval();
> + report(tval > 0 && tval <= info->read_cval() -
> info->read_counter(), "TVAL measures time to next interrupt");

Yes it was purely informational to say tval decrements towards the next
IRQ. I can make it a pure test.

>
>>  
>> +/* check pending once cval is before now */
>
> This comment adds nothing to the test.

dropped.

>
>>  info->write_cval(now - 1);
>>  isb();
>>  report(gic_timer_pending(info), "interrupt signal pending");
>> +report_info("TVAL is %d ticks", info->read_tval());
>
> You can test that TVAL is negative here instead of printing the value.

ok.

>
>>  
>>  /* Disable the timer again and prepare to take interrupts */
>>  info->write_ctl(0);
>>  set_timer_irq_enabled(info, true);
>>  report(!gic_timer_pending(info), "interrupt signal no longer pending");
>>  
>> +/* QEMU bug when cntvoff_el2 > 0
>> + * https://bugs.launchpad.net/bugs/1859021 */
>
> This looks confusing to me. From the commit message, I got that kvm-unit-tests
> needs qemu to set a special value for CNTVOFF_EL2. But the comments seems to
> suggest that kvm-unit-tests can trigger the bug without qemu doing anything
> special. Can you elaborate under which condition kvm-unit-tests can
> trigger the bug?

It can't without some sort of mechanism to set the hypervisor registers
before running the test. The QEMU bug is an overflow when a cval of
UINT64_MAX is combined with a non-zero CNTVOFF_EL2.

Running under KVM the host kernel will have likely set CNTVOFF_EL2 to
some sort of value with:

update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
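
A minimal sketch of the failure mode (an illustration of the arithmetic,
not QEMU's actual code): the virtual timer fires once CNTPCT - CNTVOFF >=
CVAL, i.e. at a physical deadline of CVAL + CNTVOFF, and that addition
wraps when CVAL is UINT64_MAX:

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint64_t cval = UINT64_MAX;         /* guest asks to "never fire" */
          uint64_t cntvoff = 0x1000;          /* hypothetical non-zero EL2 offset */
          uint64_t deadline = cval + cntvoff; /* wraps around to 0xfff */

          /* any current count now compares >= the wrapped deadline, so a
           * naive emulation asserts the timer interrupt immediately */
          printf("deadline = 0x%" PRIx64 "\n", deadline);
          return 0;
  }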

>
>> +info->write_ctl(ARCH_TIMER_CTL_ENABLE);
>> +info->write_cval(UINT64_MAX);
>
> The order

[kvm-unit-tests PATCH] arm: expand the timer tests

2020-01-10 Thread Alex Bennée
This was an attempt to replicate a QEMU bug. However, to trigger the
bug you need to have an offset set in EL2, which kvm-unit-tests is
unable to do. It does, however, exercise some more corner cases.

Bug: https://bugs.launchpad.net/bugs/1859021
Signed-off-by: Alex Bennée 
---
 arm/timer.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/arm/timer.c b/arm/timer.c
index f390e8e..ae1d299 100644
--- a/arm/timer.c
+++ b/arm/timer.c
@@ -214,21 +214,46 @@ static void test_timer(struct timer_info *info)
 * still read the pending state even if it's disabled. */
set_timer_irq_enabled(info, false);
 
+   /* Verify count goes up */
+   report(info->read_counter() >= now, "counter increments");
+
/* Enable the timer, but schedule it for much later */
info->write_cval(later);
info->write_ctl(ARCH_TIMER_CTL_ENABLE);
isb();
-   report(!gic_timer_pending(info), "not pending before");
+   report(!gic_timer_pending(info), "not pending before 10s");
+
+   /* Check with a maximum possible cval */
+   info->write_cval(UINT64_MAX);
+   isb();
+   report(!gic_timer_pending(info), "not pending before UINT64_MAX");
+
+   /* also by setting tval */
+   info->write_tval(time_10s);
+   isb();
+   report(!gic_timer_pending(info), "not pending before 10s (via tval)");
+   report_info("TVAL is %d (delta CVAL %ld) ticks",
+   info->read_tval(), info->read_cval() - 
info->read_counter());
 
+/* check pending once cval is before now */
info->write_cval(now - 1);
isb();
report(gic_timer_pending(info), "interrupt signal pending");
+   report_info("TVAL is %d ticks", info->read_tval());
 
/* Disable the timer again and prepare to take interrupts */
info->write_ctl(0);
set_timer_irq_enabled(info, true);
report(!gic_timer_pending(info), "interrupt signal no longer pending");
 
+   /* QEMU bug when cntvoff_el2 > 0
+* https://bugs.launchpad.net/bugs/1859021 */
+   info->write_ctl(ARCH_TIMER_CTL_ENABLE);
+   info->write_cval(UINT64_MAX);
+   isb();
+   report(!gic_timer_pending(info), "not pending before UINT64_MAX (irqs on)");
+   info->write_ctl(0);
+
report(test_cval_10msec(info), "latency within 10 ms");
report(info->irq_received, "interrupt received");
 
-- 
2.20.1



Re: [PATCH v2 07/14] KVM: arm64/sve: Make register ioctl access errors more consistent

2019-04-25 Thread Alex Bennée

Dave Martin  writes:

> On Thu, Apr 25, 2019 at 01:30:29PM +0100, Alex Bennée wrote:
>>
>> Dave Martin  writes:
>>
>> > Currently, the way error codes are generated when processing the
>> > SVE register access ioctls in a bit haphazard.
>> >
>> > This patch refactors the code so that the behaviour is more
>> > consistent: now, -EINVAL should be returned only for unrecognised
>> > register IDs or when some other runtime error occurs.  -ENOENT is
>> > returned for register IDs that are recognised, but whose
>> > corresponding register (or slice) does not exist for the vcpu.
>> >
>> > To this end, in {get,set}_sve_reg() we now delegate the
>> > vcpu_has_sve() check down into {get,set}_sve_vls() and
>> > sve_reg_to_region().  The KVM_REG_ARM64_SVE_VLS special case is
>> > picked off first, then sve_reg_to_region() plays the role of
>> > exhaustively validating or rejecting the register ID and (where
>> > accepted) computing the applicable register region as before.
>> >
>> > sve_reg_to_region() is rearranged so that -ENOENT or -EPERM is not
>> > returned prematurely, before checking whether reg->id is in a
>> > recognised range.
>> >
>> > -EPERM is now only returned when an attempt is made to access an
>> > actually existing register slice on an unfinalized vcpu.
>> >
>> > Fixes: e1c9c98345b3 ("KVM: arm64/sve: Add SVE support to register access 
>> > ioctl interface")
>> > Fixes: 9033bba4b535 ("KVM: arm64/sve: Add pseudo-register for the guest's 
>> > vector lengths")
>> > Suggested-by: Andrew Jones 
>> > Signed-off-by: Dave Martin 
>> > Reviewed-by: Andrew Jones 
>
> [...]
>
>> > @@ -335,25 +344,30 @@ static int sve_reg_to_region(struct 
>> > sve_state_reg_region *region,
>> >/* Verify that we match the UAPI header: */
>> >BUILD_BUG_ON(SVE_NUM_SLICES != KVM_ARM64_SVE_MAX_SLICES);
>> >
>> > -  if ((reg->id & SVE_REG_SLICE_MASK) > 0)
>> > -  return -ENOENT;
>> > -
>> > -  vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
>> > -
>> >reg_num = (reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
>> >
>> >if (reg->id >= zreg_id_min && reg->id <= zreg_id_max) {
>> > +  if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0)
>> > +  return -ENOENT;
>> > +
>> > +  vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
>> > +
>> >reqoffset = SVE_SIG_ZREG_OFFSET(vq, reg_num) -
>> >SVE_SIG_REGS_OFFSET;
>> >reqlen = KVM_SVE_ZREG_SIZE;
>> >maxlen = SVE_SIG_ZREG_SIZE(vq);
>> >} else if (reg->id >= preg_id_min && reg->id <= preg_id_max) {
>> > +  if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0)
>> > +  return -ENOENT;
>> > +
>> > +  vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
>> > +
>>
>> I suppose you could argue for a:
>>
>>  if (reg->id >= zreg_id_min && reg->id <= preg_id_max) {
>>  if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0)
>>  return -ENOENT;
>>
>>  vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
>>
>> if (reg->id <= zreg_id_max) {
>>  reqoffset = SVE_SIG_ZREG_OFFSET(vq, reg_num) -
>>  SVE_SIG_REGS_OFFSET;
>>  reqlen = KVM_SVE_ZREG_SIZE;
>>  maxlen = SVE_SIG_ZREG_SIZE(vq);
>> } else {
>>  reqoffset = SVE_SIG_PREG_OFFSET(vq, reg_num) -
>>  SVE_SIG_REGS_OFFSET;
>>  reqlen = KVM_SVE_PREG_SIZE;
>>  maxlen = SVE_SIG_PREG_SIZE(vq);
>>  }
>>  } else {
>>      return -EINVAL;
>>  }
>>
>> but only for minimal DRY reasons.
>
> Agreed, but that bakes in another assumption: that the ZREG and PREG ID
> ranges are contiguous.

Ahh I'd misread:

  /* reg ID ranges for P- registers and FFR (which are contiguous) */

However these are defined in the UABI:

  /* Z- and P-regs occupy blocks at the following offsets within this range: */
  #define KVM_REG_ARM64_SVE_ZREG_BASE   0
  #define KVM_REG_ARM64_SVE_PREG_BASE   0x400
  #define KVM_REG_ARM64_SVE_FFR_BASE0x600

so their position is pretty fixed now, right?

> I preferred to keep the number of assumptions down.
>
> Althoug the resulting code wasn't ideal, the actual amount of
> duplication that I ended up with here seemed low enough as to be
> acceptable (though opinions can differ on that).

It's no biggie ;-)

--
Alex Bennée


Re: [PATCH v2 00/14] KVM: arm64: SVE cleanups

2019-04-25 Thread Alex Bennée

Dave Martin  writes:

> This series contains some cleanups applicable to the SVE KVM support
> patches merged into kvmarm/next.  These arose from Andrew Jones'
> review.
>
> Apart from some minor changes to error codes and checking, these are
> mostly cosmetic / stylistic changes only.
>
> The patches are based on
> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git next
> 5d8d4af24460 ("arm64: KVM: Fix system register enumeration")
>
> This series in git:
> git://linux-arm.org/linux-dm.git sve-kvm-fixes/v2/head
> http://linux-arm.org/git?p=linux-dm.git;a=shortlog;h=refs/heads/sve-kvm-fixes/v2/head
>
> Tested with qemu and kvmtool on ThunderX2, and with kvmtool on the Arm
> Fast model (to exercise SVE support).

These all look good to me:

Reviewed-by: Alex Bennée 

--
Alex Bennée


Re: [PATCH v2 07/14] KVM: arm64/sve: Make register ioctl access errors more consistent

2019-04-25 Thread Alex Bennée
 reqlen = KVM_SVE_ZREG_SIZE;
maxlen = SVE_SIG_ZREG_SIZE(vq);
} else {
reqoffset = SVE_SIG_PREG_OFFSET(vq, reg_num) -
SVE_SIG_REGS_OFFSET;
reqlen = KVM_SVE_PREG_SIZE;
maxlen = SVE_SIG_PREG_SIZE(vq);
}
} else {
return -EINVAL;
}

but only for minimal DRY reasons.

>   reqoffset = SVE_SIG_PREG_OFFSET(vq, reg_num) -
>   SVE_SIG_REGS_OFFSET;
>   reqlen = KVM_SVE_PREG_SIZE;
>   maxlen = SVE_SIG_PREG_SIZE(vq);
>   } else {
> - return -ENOENT;
> + return -EINVAL;
>   }
>
>   sve_state_size = vcpu_sve_state_size(vcpu);
> @@ -369,24 +383,22 @@ static int sve_reg_to_region(struct 
> sve_state_reg_region *region,
>
>  static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> + int ret;
>   struct sve_state_reg_region region;
>   char __user *uptr = (char __user *)reg->addr;
>
> - if (!vcpu_has_sve(vcpu))
> - return -ENOENT;
> -
>   /* Handle the KVM_REG_ARM64_SVE_VLS pseudo-reg as a special case: */
>   if (reg->id == KVM_REG_ARM64_SVE_VLS)
>   return get_sve_vls(vcpu, reg);
>
> - /* Otherwise, reg is an architectural SVE register... */
> + /* Try to interpret reg ID as an architectural SVE register... */
> + ret = sve_reg_to_region(&region, vcpu, reg);
> + if (ret)
> + return ret;
>
>   if (!kvm_arm_vcpu_sve_finalized(vcpu))
>   return -EPERM;
>
> - if (sve_reg_to_region(&region, vcpu, reg))
> - return -ENOENT;
> -
>   if (copy_to_user(uptr, vcpu->arch.sve_state + region.koffset,
>region.klen) ||
>   clear_user(uptr + region.klen, region.upad))
> @@ -397,24 +409,22 @@ static int get_sve_reg(struct kvm_vcpu *vcpu, const 
> struct kvm_one_reg *reg)
>
>  static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> + int ret;
>   struct sve_state_reg_region region;
>   const char __user *uptr = (const char __user *)reg->addr;
>
> - if (!vcpu_has_sve(vcpu))
> - return -ENOENT;
> -
>   /* Handle the KVM_REG_ARM64_SVE_VLS pseudo-reg as a special case: */
>       if (reg->id == KVM_REG_ARM64_SVE_VLS)
>   return set_sve_vls(vcpu, reg);
>
> - /* Otherwise, reg is an architectural SVE register... */
> + /* Try to interpret reg ID as an architectural SVE register... */
> + ret = sve_reg_to_region(&region, vcpu, reg);
> + if (ret)
> + return ret;
>
>   if (!kvm_arm_vcpu_sve_finalized(vcpu))
>   return -EPERM;
>
> - if (sve_reg_to_region(&region, vcpu, reg))
> - return -ENOENT;
> -
>   if (copy_from_user(vcpu->arch.sve_state + region.koffset, uptr,
>  region.klen))
>   return -EFAULT;


--
Alex Bennée


Re: [PATCH v7 00/27] KVM: arm64: SVE guest support

2019-04-25 Thread Alex Bennée

Dave Martin  writes:

> This series implements support for allowing KVM guests to use the Arm
> Scalable Vector Extension (SVE), superseding the previous v6 series [1].
>
> The patches are also available on a branch for reviewer convenience. [2]
>
> The patches are based on v5.1-rc2.
>
> This series addresses a couple of minor review comments received on v6
> and otherwise applies reviewer tags only.  The code differences
> between v6 and this series consist of minor cosmetic fixups only.
>
> Draft kvmtool patches were posted separately [3], [4].
>
> For a description of minor updates, see the individual patches.
>
>
> Thanks go to Julien Thierry and Julien Grall for their review efforts,
> and to Zhang Lei for testing the series -- many thanks for their help
> in getting the series to this point!
>
>
> Reviewers' attention is drawn to the following patches, which have no
> Reviewed-by/Acked-by.  Please take a look if you have a moment.
>
>  * Patch 11 (KVM: arm64: Support runtime sysreg visibility filtering)
>
>Previously Reviewed-by Julien Thierry, but this version of the
>patch contains some minor rework suggested by Mark Rutland during
>the v5 review [5].
>
>  * Patch 15 (KVM: arm64: Add missing #include of 
>in guest.c)
>
>(Trivial patch.)
>
>  * Patch 26: (KVM: Document errors for KVM_GET_ONE_REG and
>KVM_SET_ONE_REG)
>
>(Documentation only.)
>
>  * Patch 27: KVM: arm64/sve: Document KVM API extensions for SVE
>
>(Documentation only.)

I've finished going through the series. Aside from a few minor nits and
the discussion you've already had with drew, I'm happy with it. Have a
general:

Reviewed-by: Alex Bennée 

for the mail record...
>
> Known issues: none
>
>
> Testing status:
>
>  * Lightweight testing on the Arm Fast Model, primarily to exercise the
>new vcpu finalization API.
>
>Ran sve-stress testing for several days on v6 on the Arm Fast Model,
>with no errors observed.

Once we have VHE support in QEMU I intend to give it a bit more testing.

--
Alex Bennée


Re: [PATCH v7 12/27] KVM: arm64/sve: System register context switch and access support

2019-04-24 Thread Alex Bennée
t; + if (val != guest_id_aa64zfr0_el1(vcpu))
> + return -EINVAL;
> +
> + return 0;
> +}
> +
>  /*
>   * cpufeature ID register user accessors
>   *
> @@ -1346,7 +1418,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>   ID_SANITISED(ID_AA64PFR1_EL1),
>   ID_UNALLOCATED(4,2),
>   ID_UNALLOCATED(4,3),
> - ID_UNALLOCATED(4,4),
> + { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = 
> get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .visibility = 
> sve_id_visibility },

long line...

>   ID_UNALLOCATED(4,5),
>   ID_UNALLOCATED(4,6),
>   ID_UNALLOCATED(4,7),
> @@ -1383,6 +1455,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>
>   { SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 
> 0x00C50078 },
>   { SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
> + { SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = 
> sve_visibility },
>   { SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
>   { SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
>   { SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },

Minor nits aside:

Reviewed-by: Alex Bennée 

--
Alex Bennée


Re: [PATCH v7 13/27] KVM: arm64/sve: Context switch the SVE registers

2019-04-24 Thread Alex Bennée

Dave Martin  writes:

> On Thu, Apr 04, 2019 at 10:35:02AM +0200, Andrew Jones wrote:
>> On Thu, Apr 04, 2019 at 09:10:08AM +0100, Dave Martin wrote:
>> > On Wed, Apr 03, 2019 at 10:01:45PM +0200, Andrew Jones wrote:
>> > > On Fri, Mar 29, 2019 at 01:00:38PM +, Dave Martin wrote:
>> > > > In order to give each vcpu its own view of the SVE registers, this
>> > > > patch adds context storage via a new sve_state pointer in struct
>> > > > vcpu_arch.  An additional member sve_max_vl is also added for each
>> > > > vcpu, to determine the maximum vector length visible to the guest
>> > > > and thus the value to be configured in ZCR_EL2.LEN while the vcpu
>> > > > is active.  This also determines the layout and size of the storage
>> > > > in sve_state, which is read and written by the same backend
>> > > > functions that are used for context-switching the SVE state for
>> > > > host tasks.
>> > > >
>> > > > On SVE-enabled vcpus, SVE access traps are now handled by switching
>> > > > in the vcpu's SVE context and disabling the trap before returning
>> > > > to the guest.  On other vcpus, the trap is not handled and an exit
>> > > > back to the host occurs, where the handle_sve() fallback path
>> > > > reflects an undefined instruction exception back to the guest,
>> > > > consistently with the behaviour of non-SVE-capable hardware (as was
>> > > > done unconditionally prior to this patch).
>> > > >
>> > > > No SVE handling is added on non-VHE-only paths, since VHE is an
>> > > > architectural and Kconfig prerequisite of SVE.
>> > > >
>> > > > Signed-off-by: Dave Martin 
>> > > > Reviewed-by: Julien Thierry 
>> > > > Tested-by: zhang.lei 
>> > > >
>> > > > ---
>> > > >
>> > > > Changes since v5:
>> > > >
>> > > >  * [Julien Thierry, Julien Grall] Commit message typo fixes
>> > > >
>> > > >  * [Mark Rutland] Rename trap_class to hsr_ec, for consistency with
>> > > >existing code.
>> > > >
>> > > >  * [Mark Rutland] Simplify condition for refusing to handle an
>> > > >FPSIMD/SVE trap, using multiple if () statements for clarity.  The
>> > > >previous condition was a bit tortuous, and how that the static_key
>> > > >checks have been hoisted out, it makes little difference to the
>> > > >compiler how we express the condition here.
>> > > > ---
>> > > >  arch/arm64/include/asm/kvm_host.h |  6 
>> > > >  arch/arm64/kvm/fpsimd.c   |  5 +--
>> > > >  arch/arm64/kvm/hyp/switch.c   | 75 
>> > > > +--
>> > > >  3 files changed, 66 insertions(+), 20 deletions(-)
>> > > >
>> > > > diff --git a/arch/arm64/include/asm/kvm_host.h 
>> > > > b/arch/arm64/include/asm/kvm_host.h
>> > > > index 22cf484..4fabfd2 100644
>> > > > --- a/arch/arm64/include/asm/kvm_host.h
>> > > > +++ b/arch/arm64/include/asm/kvm_host.h
>> > > > @@ -228,6 +228,8 @@ struct vcpu_reset_state {
>> > > >
>> > > >  struct kvm_vcpu_arch {
>> > > >struct kvm_cpu_context ctxt;
>> > > > +  void *sve_state;
>> > > > +  unsigned int sve_max_vl;
>> > > >
>> > > >/* HYP configuration */
>> > > >u64 hcr_el2;
>> > > > @@ -323,6 +325,10 @@ struct kvm_vcpu_arch {
>> > > >bool sysregs_loaded_on_cpu;
>> > > >  };
>> > > >
>> > > > +/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
>> > > > +#define vcpu_sve_pffr(vcpu) ((void *)((char 
>> > > > *)((vcpu)->arch.sve_state) + \
>> > > > +
>> > > > sve_ffr_offset((vcpu)->arch.sve_max_vl)))
>> > >
>> > > Maybe an inline function instead?
>> >
>> > I tried, but that requires the definition of struct kvm_vcpu to be
>> > visible.  I failed to get that here without circular #include problems,
>> > and it looked tricky to fix.
>>
>> Ah, OK
>>
>> >
>> > Since this is a small bit of code which is unlikely to get used by
>> > accident, I decided it was OK to keep it as a macro.
>> >
>> > Can you see another way around this?
>>
>> Nope
>
> OK.  If someone eventually solves this, I'd be happy to change to an
> inline function.

Is the function intended to be used by more call sites? Currently in the
tree with this plus the v2 fixups I can only see:

  arch/arm64/include/asm/kvm_host.h:333:#define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
  arch/arm64/kvm/hyp/switch.c:388:		sve_load_state(vcpu_sve_pffr(vcpu),
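
For reference, the inline-function form would be something like this
sketch (it needs the full definition of struct kvm_vcpu visible at the
point of definition, which is exactly what creates the header cycle):

	static inline void *vcpu_sve_pffr(const struct kvm_vcpu *vcpu)
	{
		return (char *)vcpu->arch.sve_state +
			sve_ffr_offset(vcpu->arch.sve_max_vl);
	}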

--
Alex Bennée


Re: [PATCH v7 17/27] KVM: arm64: Reject ioctl access to FPSIMD V-regs on SVE vcpus

2019-04-24 Thread Alex Bennée

Dave Martin  writes:

> In order to avoid the pointless complexity of maintaining two ioctl
> register access views of the same data, this patch blocks ioctl
> access to the FPSIMD V-registers on vcpus that support SVE.
>
> This will make it more straightforward to add SVE register access
> support.
>
> Since SVE is an opt-in feature for userspace, this will not affect
> existing users.
>
> Signed-off-by: Dave Martin 
> Reviewed-by: Julien Thierry 
> Tested-by: zhang.lei 

Reviewed-by: Alex Bennée 

>
> ---
>
> Changes since v5:
>
>  * Refactored to cope with the removal of core_reg_size_from_offset()
>(which was added by another series which will now be handled
>independently).
>
>This leaves some duplication in that we still filter the V-regs out
>in two places, but this no worse than other existing code in guest.c.
>I plan to tidy this up independently later on.
> ---
>  arch/arm64/kvm/guest.c | 48 
>  1 file changed, 36 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index a391a61..756d0d6 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -54,12 +54,19 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
>   return 0;
>  }
>
> +static bool core_reg_offset_is_vreg(u64 off)
> +{
> + return off >= KVM_REG_ARM_CORE_REG(fp_regs.vregs) &&
> + off < KVM_REG_ARM_CORE_REG(fp_regs.fpsr);
> +}
> +
>  static u64 core_reg_offset_from_id(u64 id)
>  {
>   return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
>  }
>
> -static int validate_core_offset(const struct kvm_one_reg *reg)
> +static int validate_core_offset(const struct kvm_vcpu *vcpu,
> + const struct kvm_one_reg *reg)
>  {
>   u64 off = core_reg_offset_from_id(reg->id);
>   int size;
> @@ -91,11 +98,19 @@ static int validate_core_offset(const struct kvm_one_reg 
> *reg)
>   return -EINVAL;
>   }
>
> - if (KVM_REG_SIZE(reg->id) == size &&
> - IS_ALIGNED(off, size / sizeof(__u32)))
> - return 0;
> + if (KVM_REG_SIZE(reg->id) != size ||
> + !IS_ALIGNED(off, size / sizeof(__u32)))
> + return -EINVAL;
>
> - return -EINVAL;
> + /*
> +  * The KVM_REG_ARM64_SVE regs must be used instead of
> +  * KVM_REG_ARM_CORE for accessing the FPSIMD V-registers on
> +  * SVE-enabled vcpus:
> +  */
> + if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(off))
> + return -EINVAL;
> +
> + return 0;
>  }
>
>  static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> @@ -117,7 +132,7 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const 
> struct kvm_one_reg *reg)
>   (off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
>   return -ENOENT;
>
> - if (validate_core_offset(reg))
> + if (validate_core_offset(vcpu, reg))
>   return -EINVAL;
>
>   if (copy_to_user(uaddr, ((u32 *)regs) + off, KVM_REG_SIZE(reg->id)))
> @@ -142,7 +157,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const 
> struct kvm_one_reg *reg)
>   (off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
>   return -ENOENT;
>
> - if (validate_core_offset(reg))
> + if (validate_core_offset(vcpu, reg))
>   return -EINVAL;
>
>   if (KVM_REG_SIZE(reg->id) > sizeof(tmp))
> @@ -195,13 +210,22 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, 
> struct kvm_regs *regs)
>   return -EINVAL;
>  }
>
> -static int kvm_arm_copy_core_reg_indices(u64 __user *uindices)
> +static int copy_core_reg_indices(const struct kvm_vcpu *vcpu,
> +  u64 __user *uindices)
>  {
>   unsigned int i;
>   int n = 0;
>   const u64 core_reg = KVM_REG_ARM64 | KVM_REG_SIZE_U64 | 
> KVM_REG_ARM_CORE;
>
>   for (i = 0; i < sizeof(struct kvm_regs) / sizeof(__u32); i++) {
> + /*
> +  * The KVM_REG_ARM64_SVE regs must be used instead of
> +  * KVM_REG_ARM_CORE for accessing the FPSIMD V-registers on
> +  * SVE-enabled vcpus:
> +  */
> + if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(i))
> + continue;
> +
>   if (uindices) {
>   if (put_user(core_reg | i, uindices))
>   return -EFAULT;
> @@ -214,9 +238,9 @@ static int kvm_arm_copy_core_reg_indices(u64 __user 
> *uind

Re: [PATCH v7 11/27] KVM: arm64: Support runtime sysreg visibility filtering

2019-04-24 Thread Alex Bennée

Dave Martin  writes:

> Some optional features of the Arm architecture add new system
> registers that are not present in the base architecture.
>
> Where these features are optional for the guest, the visibility of
> these registers may need to depend on some runtime configuration,
> such as a flag passed to KVM_ARM_VCPU_INIT.

This combined with...
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -64,8 +64,15 @@ struct sys_reg_desc {
>   const struct kvm_one_reg *reg, void __user *uaddr);
>   int (*set_user)(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>   const struct kvm_one_reg *reg, void __user *uaddr);
> +
> + /* Return mask of REG_* runtime visibility overrides */
> + unsigned int (*visibility)(const struct kvm_vcpu *vcpu,
> +const struct sys_reg_desc *rd);
>  };

this makes me wonder what sort of machines will see different register
visibility depending on which vcpu you are running on?

Otherwise it looks good to me:

Reviewed-by: Alex Bennée 

--
Alex Bennée


Re: [PATCH v7 01/27] KVM: Documentation: Document arm64 core registers in detail

2019-04-24 Thread Alex Bennée

Dave Martin  writes:

> Since the sizes of individual members of the core arm64
> registers vary, the list of register encodings that make sense is
> not a simple linear sequence.
>
> To clarify which encodings to use, this patch adds a brief list
> to the documentation.
>
> Signed-off-by: Dave Martin 
> Reviewed-by: Julien Grall 
> Reviewed-by: Peter Maydell 

Reviewed-by: Alex Bennée 

> ---
>  Documentation/virtual/kvm/api.txt | 24 
>  1 file changed, 24 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index 7de9eee..2d4f7ce 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2107,6 +2107,30 @@ contains elements ranging from 32 to 128 bits. The 
> index is a 32bit
>  value in the kvm_regs structure seen as a 32bit array.
>  0x60x0 0000 0010 <index into the kvm_regs struct:16>
>
> +Specifically:
> +    Encoding            Register  Bits  kvm_regs member
> +
> +  0x6030 0000 0010 0000 X0       64  regs.regs[0]
> +  0x6030 0000 0010 0002 X1       64  regs.regs[1]
> +...
> +  0x6030 0000 0010 003c X30      64  regs.regs[30]
> +  0x6030 0000 0010 003e SP       64  regs.sp
> +  0x6030 0000 0010 0040 PC       64  regs.pc
> +  0x6030 0000 0010 0042 PSTATE   64  regs.pstate
> +  0x6030 0000 0010 0044 SP_EL1   64  sp_el1
> +  0x6030 0000 0010 0046 ELR_EL1  64  elr_el1
> +  0x6030 0000 0010 0048 SPSR_EL1 64  spsr[KVM_SPSR_EL1] (alias SPSR_SVC)
> +  0x6030 0000 0010 004a SPSR_ABT 64  spsr[KVM_SPSR_ABT]
> +  0x6030 0000 0010 004c SPSR_UND 64  spsr[KVM_SPSR_UND]
> +  0x6030 0000 0010 004e SPSR_IRQ 64  spsr[KVM_SPSR_IRQ]
> +  0x6060 0000 0010 0050 SPSR_FIQ 64  spsr[KVM_SPSR_FIQ]
> +  0x6040 0000 0010 0054 V0      128  fp_regs.vregs[0]
> +  0x6040 0000 0010 0058 V1      128  fp_regs.vregs[1]
> +...
> +  0x6040 0000 0010 00d0 V31     128  fp_regs.vregs[31]
> +  0x6020 0000 0010 00d4 FPSR     32  fp_regs.fpsr
> +  0x6020 0000 0010 00d5 FPCR     32  fp_regs.fpcr
> +
>  arm64 CCSIDR registers are demultiplexed by CSSELR value:
>  0x6020 0000 0011 00 <csselr:8>


--
Alex Bennée


Re: [PATCH v2 00/14] KVM: arm64: SVE cleanups

2019-04-24 Thread Alex Bennée

Dave Martin  writes:

> This series contains some cleanups applicable to the SVE KVM support
> patches merged into kvmarm/next.  These arose from Andrew Jones'
> review.

Does this mean these won't get merged into the original series before
the final merging upstream?



--
Alex Bennée


Kernel boot regression with PAuth and aarch64-softmmu -cpu max and el2 enabled

2019-01-29 Thread Alex Bennée

Hi,

Following up on yesterday's discussion on IRC I thought I'd better
report on my findings in the permanent record so things don't get lost.

As I tend to periodically rebuild my test kernels from the current
state of linux.git I occasionally run into these things. My test
invocation is:

  qemu-system-aarch64 -machine type=virt,virtualization=on \
  -display none -m 4096 -serial mon:stdio \
  -kernel ../../kernel-v8-plain.build/arch/arm64/boot/Image 
\
  -append 'console=ttyAMA0 panic=-1' -no-reboot -cpu max

The kernel is essentially a defconfig kernel with a bunch of the VIRTIO
device drivers built-in for when I actually boot a more complex setup
with disks and drives. However this is a boot test so it doesn't really
matter.

The -machine type=virt,virtualization=on enables our virt machine model
with EL2 turned on. As there is no BIOS involved the kernel is invoked
directly at EL2.

The -cpu max enables a cortex-a57 + whatever extra features we've
enabled in QEMU so far. It won't match any "real" CPU but it should be
architecturally correct, in so far as we implement the prerequisite
features for any given feature. The cpuid feature bits should also be
correct as we test them internally in QEMU to enable features.

The breakage is the kernel never boots (no output on serial port) and on
attaching with gdb I found it stuck in:

  (gdb) bt
  #0  0xff8010a9e480 in overflow_stack ()
  Backtrace stopped: not enough registers or memory available to unwind further

If I turn on exception tracing it looks like we go into an exception
loop.

On the QEMU side this breakage comes in at:

  commit 1ce32e47db52e3511132c7104770eae65d412144 (HEAD, refs/bisect/bad)
  Author: Richard Henderson 
  Date:   Mon Jan 21 10:23:13 2019 +

  target/arm: Enable PAuth for -cpu max

  Reviewed-by: Peter Maydell 
  Signed-off-by: Richard Henderson 
  Message-id: 20190108223129.5570-30-richard.hender...@linaro.org
  Signed-off-by: Peter Maydell 

and as you would expect the system boots fine with -cpu cortex-a57

On the kernel side it breaks at:

  commit 04ca3204fa09f5f55c8f113b0072004a7b364ff4
  Author: Mark Rutland 
  Date:   Fri Dec 7 18:39:30 2018 +

  arm64: enable pointer authentication

  Now that all the necessary bits are in place for userspace, add the
  necessary Kconfig logic to allow this to be enabled.

  Signed-off-by: Mark Rutland 
  Signed-off-by: Kristina Martsenko 
  Cc: Catalin Marinas 
  Cc: Will Deacon 
  Signed-off-by: Will Deacon 

So predictably we failed at enabling PAuth somewhere between the kernel
and QEMU.

I'm guessing the kernel so far has been tested on the fast model with a
full chain of TF, UEFI and kernel?

I think Richard's tests were without EL2 enabled.

So in the case where the kernel boots at EL2, is it expecting anyone
else to deal with PAuth exceptions, or should it be able to cope with
PAuth enabled but no firmware underneath it?

Either we've got something wrong or we'll need to rethink what features
the user can have enabled by -cpu max on a direct kernel boot.

--
Alex Bennée


Re: [RFC PATCH v2 00/23] KVM: arm64: Initial support for SVE guests

2018-11-22 Thread Alex Bennée

Dave Martin  writes:

> This series implements basic support for allowing KVM guests to use the
> Arm Scalable Vector Extension (SVE).
>
> The patches are based on v4.19-rc5.
>
> The patches are also available on a branch for reviewer convenience. [1]
>
> This is a significant overhaul of the previous preliminary series [2],
> with the major changes outlined below, and additional minor updates in
> response to review feedback (see the individual patches for those).
>
> In the interest of getting this series out for review,
> This series is **completely untested**.

Richard is currently working on VHE support for QEMU and we already have
SVE system emulation support as of 3.1 so hopefully QEMU will be able to
test this soon (and probably shake out a few of our own bugs ;-).

> Reviewers should focus on the proposed API (but any other comments are
> of course welcome!)


I've finished my pass for this revision. Sorry it took so long to get to
it.

--
Alex Bennée


Re: [RFC PATCH v2 23/23] KVM: arm64/sve: Document KVM API extensions for SVE

2018-11-22 Thread Alex Bennée
d: */
> +#define KVM_ARM_SVE_CONFIG_QUERY 0 /* query what the host can support */
> +#define KVM_ARM_SVE_CONFIG_SET   1 /* enable SVE for vcpu and 
> set VLs */
> +#define KVM_ARM_SVE_CONFIG_GET   2 /* read the set of VLs for a 
> vcpu */
> +
> +Subcommand details:
> +
> +4.116.1 KVM_ARM_SVE_CONFIG_QUERY
> +Type: vm and vcpu
> +
> +Retrieve the full set of SVE vector lengths available for use by KVM
> +guests on this host.  The result is independent of which vcpu this
> +command is invoked on.  As a convenience, it may also be invoked on a
> +vm file descriptor, eliminating the need to create a vcpu first.
> +
> +4.116.2 KVM_ARM_SVE_CONFIG_SET
> +Type: vcpu only
> +
> +Enables SVE for the vcpu and sets the set of SVE vector lengths that
> +will be visible to the guest.
> +
> +This is the only way to enable SVE for a vcpu: if this command is not
> +invoked for a vcpu then SVE will not be available to the guest on this
> +vcpu.
> +
> +This subcommand is only permitted once per vcpu, before KVM_RUN has been
> +invoked for the vcpu for the first time.  Otherwise, the command fails
> +with -EBADFD and the state of the vcpu is not modified.
> +
> +In typical use, the user should call KVM_ARM_SVE_CONFIG_QUERY first to
> +populate a struct kvm_sve_vls with the full set of vector lengths
> +available on the host, then set cmd = KVM_ARM_SVE_CONFIG_SET and
> +re-issue the KVM_ARM_SVE_CONFIG ioctl on the desired vcpu.  This will
> +configure the best set of vector lengths available.  When following this
> +approach, the maximum available vector length can also be restricted by
> +reducing the value of max_vq before invoking KVM_ARM_SVE_CONFIG_SET.
> +
> +Every requested vector length in the struct kvm_sve_vls argument must be
> +supported by the hardware.  In addition, except for vector lengths
> +greater than the maximum requested vector length, every vector length
> +not requested must *not* be supported by the hardware.  (The latter
> +restriction may be relaxed in the future.)  If the requested set of
> +vector lengths is not supportable, the command fails with -EINVAL and
> +the state of the vcpu is not modified.
> +
> +Different vcpus of a vm may be configured with different sets of vector
> +lengths.  Equally, some vcpus may have SVE enabled and some not.
> +However, such configurations are not recommended except for testing and
> +experimentation purposes.  Architecturally compliant guest OSes will
> +work, but may or may not make effective use of the resulting
> +configuration.
> +
> +After a successful KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET can be
> +used to retrieve the configured set of vector lengths.
> +
> +4.116.3 KVM_ARM_SVE_CONFIG_GET
> +Type: vcpu only
> +
> +This subcommand returns the set of vector lengths enabled for the vcpu.
> +SVE must have been enabled and configured for this vcpu by a successful
> +prior KVM_ARM_SVE_CONFIG_SET call.  Otherwise, -EBADFD is returned.
> +
> +The state of the vcpu is unchanged.
> +
> +
>  5. The kvm_run structure
>  


--
Alex Bennée


Re: [RFC PATCH v2 21/23] KVM: arm64/sve: allow KVM_ARM_SVE_CONFIG_QUERY on vm fd

2018-11-22 Thread Alex Bennée

Dave Martin  writes:

> Since userspace may need to decide on the set of vector lengths for
> the guest before setting up a vm, it is onerous to require a vcpu
> fd to be available first.  KVM_ARM_SVE_CONFIG_QUERY is not
> vcpu-dependent anyway, so this patch wires up KVM_ARM_SVE_CONFIG to
> be usable on a vm fd where appropriate.
>
> Subcommands that are vcpu-dependent (currently
> KVM_ARM_SVE_CONFIG_SET, KVM_ARM_SVE_CONFIG_GET) will return -EINVAL
> if invoked on a vm fd.
>
> Signed-off-by: Dave Martin 

Apropos comments on last patch, this could go away if we went with just
reporting the max VL via the SVE capability probe which works on the
vmfd.

> ---
>  arch/arm64/kvm/guest.c | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index f066b17..2313c22 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -574,6 +574,9 @@ static int kvm_vcpu_set_sve_vls(struct kvm_vcpu *vcpu, 
> struct kvm_sve_vls *vls,
>   unsigned int vq, max_vq;
>   int ret;
>
> + if (!vcpu)
> + return -EINVAL; /* per-vcpu operation on vm fd */
> +
>   if (vcpu->arch.has_run_once || vcpu_has_sve(vcpu))
>   return -EBADFD; /* too late, or already configured */
>
> @@ -659,6 +662,9 @@ static int kvm_vcpu_query_sve_vls(struct kvm_vcpu *vcpu, 
> struct kvm_sve_vls *vls
>  static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, struct kvm_sve_vls 
> *vls,
>   struct kvm_sve_vls __user *userp)
>  {
> + if (!vcpu)
> + return -EINVAL; /* per-vcpu operation on vm fd */
> +
>   if (!vcpu_has_sve(vcpu))
>   return -EBADFD; /* not configured yet */
>
> @@ -668,6 +674,7 @@ static int kvm_vcpu_get_sve_vls(struct kvm_vcpu *vcpu, 
> struct kvm_sve_vls *vls,
>   sve_vq_from_vl(vcpu->arch.sve_max_vl), userp);
>  }
>
> +/* vcpu may be NULL if this is called via a vm fd */
>  static int kvm_vcpu_sve_config(struct kvm_vcpu *vcpu,
>  struct kvm_sve_vls __user *userp)
>  {
> @@ -717,7 +724,15 @@ int kvm_arm_arch_vcpu_ioctl(struct kvm_vcpu *vcpu,
>  int kvm_arm_arch_vm_ioctl(struct kvm *kvm,
> unsigned int ioctl, unsigned long arg)
>  {
> - return -EINVAL;
> + void __user *userp = (void __user *)arg;
> +
> + switch (ioctl) {
> + case KVM_ARM_SVE_CONFIG:
> + return kvm_vcpu_sve_config(NULL, userp);
> +
> + default:
> + return -EINVAL;
> + }
>  }
>
>  int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)


--
Alex Bennée


Re: [RFC PATCH v2 19/23] KVM: arm64/sve: Report and enable SVE API extensions for userspace

2018-11-22 Thread Alex Bennée
Using WARN_ON for side effects seems sketchy, but while BUG_ON can
compile away to nothing, WARN_ON seems to have been designed to always
give you the result of the condition, so never mind...

> +
> + memset(vcpu->arch.sve_state, 0,
> +SVE_SIG_REGS_SIZE(sve_vq_from_vl(vcpu->arch.sve_max_vl)));
> +
> + return 0;
> +}
> +
>  /**
>   * kvm_reset_vcpu - sets core registers and sys_regs to reset value
>   * @vcpu: The VCPU pointer
> @@ -103,6 +148,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, 
> long ext)
>  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>  {
>   const struct kvm_regs *cpu_reset;
> + int ret;
>
>   switch (vcpu->arch.target) {
>   default:
> @@ -120,6 +166,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>   /* Reset core registers */
>   memcpy(vcpu_gp_regs(vcpu), cpu_reset, sizeof(*cpu_reset));
>
> + ret = kvm_reset_sve(vcpu);
> + if (ret)
> + return ret;
> +
>   /* Reset system registers */
>   kvm_reset_sys_regs(vcpu);
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 7c3c5cc..488ca56 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -953,6 +953,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_NESTED_STATE 157
>  #define KVM_CAP_ARM_INJECT_SERROR_ESR 158
>  #define KVM_CAP_MSR_PLATFORM_INFO 159
> +#define KVM_CAP_ARM_SVE 160
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> @@ -1400,6 +1401,9 @@ struct kvm_enc_region {
>  #define KVM_GET_NESTED_STATE _IOWR(KVMIO, 0xbe, struct 
> kvm_nested_state)
>  #define KVM_SET_NESTED_STATE _IOW(KVMIO,  0xbf, struct 
> kvm_nested_state)
>
> +/* Available with KVM_CAP_ARM_SVE */
> +#define KVM_ARM_SVE_CONFIG _IOWR(KVMIO,  0xc0, struct kvm_sve_vls)
> +
>  /* Secure Encrypted Virtualization command */
>  enum sev_cmd_id {
>   /* Guest initialization commands */


--
Alex Bennée


Re: [RFC PATCH v2 11/23] KVM: arm64: Support runtime sysreg filtering for KVM_GET_REG_LIST

2018-11-22 Thread Alex Bennée

Christoffer Dall  writes:

> [Adding Peter and Alex for their view on the QEMU side]
>
> On Thu, Nov 15, 2018 at 05:27:11PM +, Dave Martin wrote:
>> On Fri, Nov 02, 2018 at 09:16:25AM +0100, Christoffer Dall wrote:
>> > On Fri, Sep 28, 2018 at 02:39:15PM +0100, Dave Martin wrote:
>> > > KVM_GET_REG_LIST should only enumerate registers that are actually
>> > > accessible, so it is necessary to filter out any register that is
>> > > not exposed to the guest.  For features that are configured at
>> > > runtime, this will require a dynamic check.
>> > >
>> > > For example, ZCR_EL1 and ID_AA64ZFR0_EL1 would need to be hidden
>> > > if SVE is not enabled for the guest.
>> >
>> > This implies that userspace can never access this interface for a vcpu
>> > before having decided whether such features are enabled for the guest or
>> > not, since otherwise userspace will see different states for a VCPU
>> > depending on sequencing of the API, which sounds fragile to me.
>> >
>> > That should probably be documented somewhere, and I hope the
>> > enable/disable API for SVE in guests already takes that into account.
>> >
>> > Not sure if there's an action to take here, but it was the best place I
>> > could raise this concern.
>>
>> Fair point.  I struggled to come up with something better that solves
>> all problems.
>>
>> My expectation is that KVM_ARM_SVE_CONFIG_SET is considered part of
>> creating the vcpu, so that if issued at all for a vcpu, it is issued
>> very soon after KVM_VCPU_INIT.
>>
>> I think this worked OK with the current structure of kvmtool and I
>> seem to remember discussing this with Peter Maydell re qemu -- but
>> it sounds like I should double-check.
>
> QEMU does something around enumerating all the system registers exposed
> by KVM and saving/restoring them as part of its startup, but I don't
> remember the exact sequence.

QEMU does this for each vCPU as part of its start-up sequence:

  kvm_init_vcpu
kvm_get_cpu (-> KVM_CREATE_VCPU)
KVM_GET_VCPU_MMAP_SIZE
kvm_arch_init_vcpu
  kvm_arm_vcpu_init (-> KVM_ARM_VCPU_INIT)
  kvm_get_one_reg(ARM_CPU_ID_MPIDR)
  kvm_arm_init_debug (chk for KVM_CAP 
SET_GUEST_DEBUG/GUEST_DEBUG_HW_WPS/BPS)
  kvm_arm_init_serror_injection (chk KVM_CAP_ARM_INJECT_SERROR_ESR)
  kvm_arm_init_cpreg_list (KVM_GET_REG_LIST)

At this point we have the register list we need for
kvm_arch_get_registers which is what we call every time we want to
synchronise state. We only really do this for debug events, crashes and
at some point when migrating.
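
The kvm_arm_init_cpreg_list step uses the usual two-call pattern for
KVM_GET_REG_LIST, roughly as below (error handling trimmed; the first
call is expected to fail with E2BIG while filling in the count):

  /* Assumes <stdlib.h>, <err.h>, <sys/ioctl.h> and <linux/kvm.h>. */
  struct kvm_reg_list probe = { .n = 0 };
  struct kvm_reg_list *list;

  /* First call fails with E2BIG but tells us how many IDs there are: */
  ioctl(vcpu_fd, KVM_GET_REG_LIST, &probe);

  list = calloc(1, sizeof(*list) + probe.n * sizeof(__u64));
  list->n = probe.n;
  if (ioctl(vcpu_fd, KVM_GET_REG_LIST, list) < 0)
          err(1, "KVM_GET_REG_LIST");

  /* list->reg[] now holds every register ID the vcpu exposes, which is
   * why the SVE configuration has to be fixed before this point. */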

>
>>
>> Either way, you're right, this needs to be clearly documented.
>>
>>
>> If we want to be more robust, maybe we should add a capability too,
>> so that userspace that enables this capability promises to call
>> KVM_ARM_SVE_CONFIG_SET for each vcpu, and affected ioctls (KVM_RUN,
>> KVM_GET_REG_LIST etc.) are forbidden until that is done?
>>
>> That should help avoid accidents.
>>
>> I could add a special meaning for an empty kvm_sve_vls, such that
>> it doesn't enable SVE on the affected vcpu.  That retains the ability
>> to create heterogeneous guests while still following the above flow.
>>
> I think making sure that userspace can ever only see the same list of
> available system regiters is going to cause us less pain going forward.
>
> If the separate ioctl and capability check is the easiest way of doing
> that, then I think that sounds good.  (I had wished we could have just
> added some data to KVM_CREATE_VCPU, but that doesn't seem to be the
> case.)
>
>
> Thanks,
>
> Christoffer


--
Alex Bennée


Re: [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST

2018-11-21 Thread Alex Bennée

Dave Martin  writes:

> On Wed, Nov 21, 2018 at 04:09:03PM +0000, Alex Bennée wrote:
>>
>> Dave Martin  writes:
>>
>> > This patch includes the SVE register IDs in the list returned by
>> > KVM_GET_REG_LIST, as appropriate.
>> >
>> > On a non-SVE-enabled vcpu, no extra IDs are added.
>> >
>> > On an SVE-enabled vcpu, the appropriate number of slice IDs are
>> > enumerated for each SVE register, depending on the maximum vector
>> > length for the vcpu.
>> >
>> > Signed-off-by: Dave Martin 
>> > ---
>> >
>> > Changes since RFCv1:
>> >
>> >  * Simplify enumerate_sve_regs() based on Andrew Jones' approach.
>> >
>> >  * Reg copying loops are inverted for brevity, since the order we
>> >spit out the regs in doesn't really matter.
>> >
>> > (I tried to keep part of my approach to avoid the duplicate logic
>> > between num_sve_regs() and copy_sve_reg_indices(), but although
>> > it works in principle, gcc fails to fully collapse the num_regs()
>> > case... so I gave up.  The two functions need to be manually kept
>> > consistent, but hopefully that's fairly straightforward.)
>> > ---
>> >  arch/arm64/kvm/guest.c | 45 +
>> >  1 file changed, 45 insertions(+)
>> >
>> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> > index 320db0f..89eab68 100644
>> > --- a/arch/arm64/kvm/guest.c
>> > +++ b/arch/arm64/kvm/guest.c
>> > @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const 
>> > struct kvm_one_reg *reg)
>> >return copy_to_user(uaddr, , KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
>> >  }
>> >
>> > +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
>> > +{
>> > +  const unsigned int slices = DIV_ROUND_UP(
>> > +  vcpu->arch.sve_max_vl,
>> > +  KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
>>
>> Having seen this formulation come up several times now I wonder if there
>> should be a kernel private define, KVM_SVE_ZREG/PREG_SIZE to avoid this
>> clumsiness.
>
> I agree it's a bit awkward.  Previously I spelled this "0x100", which
> was terse but more sensitive to typos and other screwups than I
> liked.
>
>> You could still use the KVM_REG_SIZE to extract it as I guess this is to
>> make changes simpler if/when the SVE reg size gets bumped up.
>
> That might be more challenging to determine at compile time.
>
> I'm not sure how good GCC is at doing const-propagation between related
> (but different) expressions, so I preferred to go for something that
> is clearly compiletime constant rather than extracting it from the
> register ID that came from userspace.
>
> So, I'd prefer not to use KVM_REG_SIZE() for this, but I'm happy to add
> a private #define to hide this cumbersome construct.  That would
> certainly make the code more readable.
>
> (Of course, the actual runtime cost is trivial either way, but I felt
> it was easier to reason about correctness if this is really a constant.)
>
>
> Sound OK?

Yes.

I'd almost suggested just using KVM_REG_SIZE(KVM_REG_SIZE_U2048)
earlier, until I realised this might be forward looking.
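
Something like this would do for the private #define, purely as a
sketch (names hypothetical):

  /* Size of one 2048-bit register slice as seen via the ioctl interface: */
  #define KVM_SVE_ZREG_SIZE  KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0))
  #define KVM_SVE_PREG_SIZE  KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0))

  static unsigned int vcpu_sve_slices(const struct kvm_vcpu *vcpu)
  {
          return DIV_ROUND_UP(vcpu->arch.sve_max_vl, KVM_SVE_ZREG_SIZE);
  }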

--
Alex Bennée


Re: [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface

2018-11-21 Thread Alex Bennée

Dave Martin  writes:

> On Wed, Nov 21, 2018 at 04:16:42PM +0000, Alex Bennée wrote:
>>
>> Dave Martin  writes:
>>
>> > KVM will need to interrogate the set of SVE vector lengths
>> > available on the system.
>> >
>> > This patch exposes the relevant bits to the kernel, along with a
>> > sve_vq_available() helper to check whether a particular vector
>> > length is supported.
>> >
>> > vq_to_bit() and bit_to_vq() are not intended for use outside these
>> > functions, so they are given a __ prefix to warn people not to use
>> > them unless they really know what they are doing.
>>
>> Personally I wouldn't have bothered with the __ but whatever:
>>
>> Reviewed-by: Alex Bennée 
>
> OK, thanks
>
> I'll probably keep the __ unless somebody else objects, but if you feel
> strongly I could get rid of it.

nah - it's just a personal opinion...

> Perhaps I simply shouldn't have called attention to it in the commit
> message ;)

Psychological priming ;-)

>
> Cheers
> ---Dave
>
>>
>> >
>> > Signed-off-by: Dave Martin 
>> > ---
>> >  arch/arm64/include/asm/fpsimd.h | 29 +
>> >  arch/arm64/kernel/fpsimd.c  | 35 ---
>> >  2 files changed, 37 insertions(+), 27 deletions(-)
>> >
>> > diff --git a/arch/arm64/include/asm/fpsimd.h 
>> > b/arch/arm64/include/asm/fpsimd.h
>> > index df7a143..ad6d2e4 100644
>> > --- a/arch/arm64/include/asm/fpsimd.h
>> > +++ b/arch/arm64/include/asm/fpsimd.h
>> > @@ -24,10 +24,13 @@
>> >
>> >  #ifndef __ASSEMBLY__
>> >
>> > +#include 
>> >  #include 
>> > +#include 
>> >  #include 
>> >  #include 
>> >  #include 
>> > +#include 
>> >
>> >  #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
>> >  /* Masks for extracting the FPSR and FPCR from the FPSCR */
>> > @@ -89,6 +92,32 @@ extern u64 read_zcr_features(void);
>> >
>> >  extern int __ro_after_init sve_max_vl;
>> >  extern int __ro_after_init sve_max_virtualisable_vl;
>> > +/* Set of available vector lengths, as vq_to_bit(vq): */
>> > +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
>> > +
>> > +/*
>> > + * Helpers to translate bit indices in sve_vq_map to VQ values (and
>> > + * vice versa).  This allows find_next_bit() to be used to find the
>> > + * _maximum_ VQ not exceeding a certain value.
>> > + */
>> > +static inline unsigned int __vq_to_bit(unsigned int vq)
>> > +{
>> > +  return SVE_VQ_MAX - vq;
>> > +}
>> > +
>> > +static inline unsigned int __bit_to_vq(unsigned int bit)
>> > +{
>
> [...]


--
Alex Bennée


Re: [RFC PATCH v2 17/23] arm64/sve: In-kernel vector length availability query interface

2018-11-21 Thread Alex Bennée

Dave Martin  writes:

> KVM will need to interrogate the set of SVE vector lengths
> available on the system.
>
> This patch exposes the relevant bits to the kernel, along with a
> sve_vq_available() helper to check whether a particular vector
> length is supported.
>
> vq_to_bit() and bit_to_vq() are not intended for use outside these
> functions, so they are given a __ prefix to warn people not to use
> them unless they really know what they are doing.

Personally I wouldn't have bothered with the __ but whatever:

Reviewed-by: Alex Bennée 

>
> Signed-off-by: Dave Martin 
> ---
>  arch/arm64/include/asm/fpsimd.h | 29 +
>  arch/arm64/kernel/fpsimd.c  | 35 ---
>  2 files changed, 37 insertions(+), 27 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index df7a143..ad6d2e4 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -24,10 +24,13 @@
>
>  #ifndef __ASSEMBLY__
>
> +#include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
>  /* Masks for extracting the FPSR and FPCR from the FPSCR */
> @@ -89,6 +92,32 @@ extern u64 read_zcr_features(void);
>
>  extern int __ro_after_init sve_max_vl;
>  extern int __ro_after_init sve_max_virtualisable_vl;
> +/* Set of available vector lengths, as vq_to_bit(vq): */
> +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> +
> +/*
> + * Helpers to translate bit indices in sve_vq_map to VQ values (and
> + * vice versa).  This allows find_next_bit() to be used to find the
> + * _maximum_ VQ not exceeding a certain value.
> + */
> +static inline unsigned int __vq_to_bit(unsigned int vq)
> +{
> + return SVE_VQ_MAX - vq;
> +}
> +
> +static inline unsigned int __bit_to_vq(unsigned int bit)
> +{
> + if (WARN_ON(bit >= SVE_VQ_MAX))
> + bit = SVE_VQ_MAX - 1;
> +
> + return SVE_VQ_MAX - bit;
> +}
> +
> +/* Ensure vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX before calling this function 
> */
> +static inline bool sve_vq_available(unsigned int vq)
> +{
> + return test_bit(__vq_to_bit(vq), sve_vq_map);
> +}
>
>  #ifdef CONFIG_ARM64_SVE
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 60c5e28..cc5a495 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -136,7 +136,7 @@ static int sve_default_vl = -1;
>  int __ro_after_init sve_max_vl = SVE_VL_MIN;
>  int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
>  /* Set of available vector lengths, as vq_to_bit(vq): */
> -static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
> +__ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
>  /* Set of vector lengths present on at least one cpu: */
>  static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
>  static void __percpu *efi_sve_state;
> @@ -270,25 +270,6 @@ void fpsimd_save(void)
>  }
>
>  /*
> - * Helpers to translate bit indices in sve_vq_map to VQ values (and
> - * vice versa).  This allows find_next_bit() to be used to find the
> - * _maximum_ VQ not exceeding a certain value.
> - */
> -
> -static unsigned int vq_to_bit(unsigned int vq)
> -{
> - return SVE_VQ_MAX - vq;
> -}
> -
> -static unsigned int bit_to_vq(unsigned int bit)
> -{
> - if (WARN_ON(bit >= SVE_VQ_MAX))
> - bit = SVE_VQ_MAX - 1;
> -
> - return SVE_VQ_MAX - bit;
> -}
> -
> -/*
>   * All vector length selection from userspace comes through here.
>   * We're on a slow path, so some sanity-checks are included.
>   * If things go wrong there's a bug somewhere, but try to fall back to a
> @@ -309,8 +290,8 @@ static unsigned int find_supported_vector_length(unsigned 
> int vl)
>   vl = max_vl;
>
>   bit = find_next_bit(sve_vq_map, SVE_VQ_MAX,
> - vq_to_bit(sve_vq_from_vl(vl)));
> - return sve_vl_from_vq(bit_to_vq(bit));
> + __vq_to_bit(sve_vq_from_vl(vl)));
> + return sve_vl_from_vq(__bit_to_vq(bit));
>  }
>
>  #ifdef CONFIG_SYSCTL
> @@ -651,7 +632,7 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
>   write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */
>   vl = sve_get_vl();
>   vq = sve_vq_from_vl(vl); /* skip intervening lengths */
> - set_bit(vq_to_bit(vq), map);
> + set_bit(__vq_to_bit(vq), map);
>   }
>  }
>
> @@ -712,7 +693,7 @@ int sve_verify_vq_map(void)

Re: [RFC PATCH v2 16/23] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST

2018-11-21 Thread Alex Bennée

Dave Martin  writes:

> This patch includes the SVE register IDs in the list returned by
> KVM_GET_REG_LIST, as appropriate.
>
> On a non-SVE-enabled vcpu, no extra IDs are added.
>
> On an SVE-enabled vcpu, the appropriate number of slice IDs are
> enumerated for each SVE register, depending on the maximum vector
> length for the vcpu.
>
> Signed-off-by: Dave Martin 
> ---
>
> Changes since RFCv1:
>
>  * Simplify enumerate_sve_regs() based on Andrew Jones' approach.
>
>  * Reg copying loops are inverted for brevity, since the order we
>spit out the regs in doesn't really matter.
>
> (I tried to keep part of my approach to avoid the duplicate logic
> between num_sve_regs() and copy_sve_reg_indices(), but although
> it works in principle, gcc fails to fully collapse the num_regs()
> case... so I gave up.  The two functions need to be manually kept
> consistent, but hopefully that's fairly straightforward.)
> ---
>  arch/arm64/kvm/guest.c | 45 +
>  1 file changed, 45 insertions(+)
>
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 320db0f..89eab68 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -323,6 +323,46 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const 
> struct kvm_one_reg *reg)
>   return copy_to_user(uaddr, , KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
>  }
>
> +static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
> +{
> + const unsigned int slices = DIV_ROUND_UP(
> + vcpu->arch.sve_max_vl,
> + KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));

Having seen this formulation come up several times now I wonder if there
should be a kernel private define, KVM_SVE_ZREG/PREG_SIZE to avoid this
clumsiness.

You could still use the KVM_REG_SIZE to extract it as I guess this is to
make changes simpler if/when the SVE reg size gets bumped up.

> +
> + if (!vcpu_has_sve(vcpu))
> + return 0;
> +
> + return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */);
> +}
> +
> +static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu, u64 __user 
> **uind)
> +{
> + const unsigned int slices = DIV_ROUND_UP(
> + vcpu->arch.sve_max_vl,
> + KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0)));
> + unsigned int i, n;
> +
> + if (!vcpu_has_sve(vcpu))
> + return 0;
> +
> + for (i = 0; i < slices; i++) {
> + for (n = 0; n < SVE_NUM_ZREGS; n++) {
> + if (put_user(KVM_REG_ARM64_SVE_ZREG(n, i), (*uind)++))
> + return -EFAULT;
> + }
> +
> + for (n = 0; n < SVE_NUM_PREGS; n++) {
> + if (put_user(KVM_REG_ARM64_SVE_PREG(n, i), (*uind)++))
> + return -EFAULT;
> + }
> +
> + if (put_user(KVM_REG_ARM64_SVE_FFR(i), (*uind)++))
> + return -EFAULT;
> + }
> +
> + return 0;
> +}
> +
>  /**
>   * kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
>   *
> @@ -333,6 +373,7 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
>   unsigned long res = 0;
>
>   res += num_core_regs();
> + res += num_sve_regs(vcpu);
>   res += kvm_arm_num_sys_reg_descs(vcpu);
>   res += kvm_arm_get_fw_num_regs(vcpu);
>   res += NUM_TIMER_REGS;
> @@ -357,6 +398,10 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 
> __user *uindices)
>   uindices++;
>   }
>
> + ret = copy_sve_reg_indices(vcpu, );
> + if (ret)
> + return ret;
> +
>   ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
>   if (ret)
>   return ret;

Otherwise:

Reviewed-by: Alex Bennée 

--
Alex Bennée


Re: [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface

2018-11-21 Thread Alex Bennée
> + const unsigned int slice_num =
> + (reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
> +
> + unsigned int slice_size, offset, limit;
> +
> + if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> + reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
> +   SVE_NUM_SLICES - 1)) {
> + slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0));
> +
> + /* Compute start and end of the register: */
> + offset = SVE_SIG_ZREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> + limit = offset + SVE_SIG_ZREG_SIZE(vq);
> +
> + offset += slice_size * slice_num; /* start of requested slice */
> +
> + } else if (reg->id >= KVM_REG_ARM64_SVE_PREG(0, 0) &&
> +reg->id <= KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1)) {
> + /* (FFR is P16 for our purposes) */
> +
> + slice_size = KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0));
> +
> + /* Compute start and end of the register: */
> + offset = SVE_SIG_PREG_OFFSET(vq, reg_num) - SVE_SIG_REGS_OFFSET;
> + limit = offset + SVE_SIG_PREG_SIZE(vq);
> +
> + offset += slice_size * slice_num; /* start of requested slice */
> +
> + } else {
> + return -ENOENT;
> + }
> +
> + b->kptr = (char *)vcpu->arch.sve_state + offset;
> +
> + /*
> +  * If the slice starts after the end of the reg, just pad.
> +  * Otherwise, copy as much as possible up to slice_size and pad
> +  * the remainder:
> +  */
> + b->size = offset >= limit ? 0 : min(limit - offset, slice_size);
> + b->zeropad = slice_size - b->size;
> +
> + return 0;
> +}
> +
> +static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + struct kreg_region kreg;
> + char __user *uptr = (char __user *)reg->addr;
> +
> + if (!vcpu_has_sve(vcpu) || sve_reg_region(, vcpu, reg))
> + return -ENOENT;
> +
> + if (copy_to_user(uptr, kreg.kptr, kreg.size) ||
> + clear_user(uptr + kreg.size, kreg.zeropad))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + struct kreg_region kreg;
> + char __user *uptr = (char __user *)reg->addr;
> +
> + if (!vcpu_has_sve(vcpu) || sve_reg_region(, vcpu, reg))
> + return -ENOENT;
> +
> + if (copy_from_user(kreg.kptr, uptr, kreg.size))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
>  int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs 
> *regs)
>  {
>   return -EINVAL;
> @@ -251,12 +376,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct 
> kvm_one_reg *reg)
>   if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>   return -EINVAL;
>
> - /* Register group 16 means we want a core register. */
> - if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> - return get_core_reg(vcpu, reg);
> -
> - if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> - return kvm_arm_get_fw_reg(vcpu, reg);
> + switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> + case KVM_REG_ARM_CORE:  return get_core_reg(vcpu, reg);
> + case KVM_REG_ARM_FW:return kvm_arm_get_fw_reg(vcpu, reg);
> + case KVM_REG_ARM64_SVE: return get_sve_reg(vcpu, reg);
> + }
>
>   if (is_timer_reg(reg->id))
>   return get_timer_reg(vcpu, reg);
> @@ -270,12 +394,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct 
> kvm_one_reg *reg)
>   if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
>   return -EINVAL;
>
> - /* Register group 16 means we set a core register. */
> - if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
> - return set_core_reg(vcpu, reg);
> -
> - if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
> - return kvm_arm_set_fw_reg(vcpu, reg);
> + switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
> + case KVM_REG_ARM_CORE:  return set_core_reg(vcpu, reg);
> + case KVM_REG_ARM_FW:return kvm_arm_set_fw_reg(vcpu, reg);
> + case KVM_REG_ARM64_SVE: return set_sve_reg(vcpu, reg);
> + }
>
>   if (is_timer_reg(reg->id))
>   return set_timer_reg(vcpu, reg);

The kernel coding-style.rst seems mute on the subject of default
handling in switch but it's probably worth having a:

  default: break; /* falls through */

to be explicit.
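
i.e.:

  switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
  case KVM_REG_ARM_CORE:  return get_core_reg(vcpu, reg);
  case KVM_REG_ARM_FW:    return kvm_arm_get_fw_reg(vcpu, reg);
  case KVM_REG_ARM64_SVE: return get_sve_reg(vcpu, reg);
  default: break; /* fall through to the timer/demux handling below */
  }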

It's out of scope for this review, but I did get a bit confused, as the
KVM_REG_ARM_COPROC_SHIFT registers seem to be fairly spread out across
the files. We have demux_c15_get/set in sys_regs, but it doesn't look as
though it touches the rest of the emulation logic, and we have
kvm_arm_get/set_fw_reg, which are "special" PSCI registers. I guess this
is because COPROC_SHIFT has been used for a bunch of disparate core,
non-core, and special registers.

--
Alex Bennée


Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers

2018-11-20 Thread Alex Bennée
>> Putting the guest_has_sve calculation at the top of __hyp_switch_fpsimd
>> makes most of that go away and just moves things around a little bit. So
>> I guess it could make sense for the fast(ish) path, although I'd be
>> interested in knowing if it made any real difference to the numbers.
>> After all, the first read should be well cached, and moving it through
>> the stack is just additional memory and register pressure.
>
> Hmmm, I will have a think about this when I respin.
>
> Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> optimise.
>
> We might be able to mark it as __pure or __attribute_const__ to enable
> the compiler to decide whether to cache the result, but this may not be
> 100% safe.
>
> Part of me would prefer to leave things as they are to avoid the risk of
> breaking the code again...

Given that the only place you call __hyp_switch_fpsimd is here, you could
just roll it into __hyp_trap_is_fpsimd and have:

  if (__hyp_trap_is_fpsimd(vcpu))
          return true;
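
Something like the following, say; just a sketch of the merge, reusing
the names and signatures from the patch above:

  static bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu)
  {
          bool guest_has_sve = vcpu_has_sve(vcpu);
          u8 trap_class;

          if (!system_supports_fpsimd())
                  return false;

          trap_class = kvm_vcpu_trap_get_class(vcpu);
          if (trap_class != ESR_ELx_EC_FP_ASIMD &&
              !(guest_has_sve && trap_class == ESR_ELx_EC_SVE))
                  return false;

          /* Handle the trap here as well, so the caller just returns: */
          return __hyp_switch_fpsimd(vcpu, guest_has_sve);
  }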

--
Alex Bennée


Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers

2018-11-20 Thread Alex Bennée

Dave Martin  writes:

> On Mon, Nov 19, 2018 at 04:36:01PM +0000, Alex Bennée wrote:
>>
>> Dave Martin  writes:
>>
>> > In order to give each vcpu its own view of the SVE registers, this
>> > patch adds context storage via a new sve_state pointer in struct
>> > vcpu_arch.  An additional member sve_max_vl is also added for each
>> > vcpu, to determine the maximum vector length visible to the guest
>> > and thus the value to be configured in ZCR_EL2.LEN while the is
>> > active.  This also determines the layout and size of the storage in
>> > sve_state, which is read and written by the same backend functions
>> > that are used for context-switching the SVE state for host tasks.
>> >
>> > On SVE-enabled vcpus, SVE access traps are now handled by switching
>> > in the vcpu's SVE context and disabling the trap before returning
>> > to the guest.  On other vcpus, the trap is not handled and an exit
>> > back to the host occurs, where the handle_sve() fallback path
>> > reflects an undefined instruction exception back to the guest,
>> > consistently with the behaviour of non-SVE-capable hardware (as was
>> > done unconditionally prior to this patch).
>> >
>> > No SVE handling is added on non-VHE-only paths, since VHE is an
>> > architectural and Kconfig prerequisite of SVE.
>> >
>> > Signed-off-by: Dave Martin 
>
> [...]
>
>> > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
>> > index 085ed06..9941349 100644
>> > --- a/arch/arm64/kvm/hyp/switch.c
>> > +++ b/arch/arm64/kvm/hyp/switch.c
>
> [...]
>
>> > @@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct 
>> > kvm_vcpu *vcpu)
>> >return true;
>> >  }
>> >
>> > +static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
>> > + bool guest_has_sve)
>> > +{
>> > +
>> > +  u8 trap_class;
>> > +
>> > +  if (!system_supports_fpsimd())
>> > +  return false;
>> > +
>> > +  trap_class = kvm_vcpu_trap_get_class(vcpu);
>> > +
>> > +  if (trap_class == ESR_ELx_EC_FP_ASIMD)
>> > +  return true;
>> > +
>> > +  if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
>> > +  return true;
>>
>> Do we really need to check the guest has SVE before believing what the
>> hardware is telling us? According to the ARM ARM:
>>
>> For ESR_ELx_EC_FP_ASIMD
>>
>>   Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE 
>> is
>>   1, or because SVE or Advanced SIMD and floating-point are not implemented. 
>> These
>>   are reported with EC value 0b00
>>
>> But also for ESR_ELx_EC_SVE
>>
>>   Access to SVE functionality trapped as a result of CPACR_EL1.ZEN,
>>   CPTR_EL2.ZEN, CPTR_EL2.TZ, or CPTR_EL3.EZ, that is not reported using EC
>>   0b00. This EC is defined only if SVE is implemented
>>
>> Given I got confused maybe we need a comment for clarity?
>
> This is not about not trusting the value ESR_ELx_EC_SVE on older
> hardware: in effect it is retrospectively reserved for this purpose on
> all older arch versions, so there is no ambiguity about what it means.
> It should never be observed on hardware that doesn't have SVE.
>
> Rather, how we handle this trap differs depending on whether the guest
> is SVE-enabled or not.  If not, then this trap is handled by the generic
> fallback path for unhandled guest traps, so we don't check for this
> particular EC value explicitly in that case.
>
>>   /* Catch guests without SVE enabled running on SVE capable hardware */
>
> I might write something like:
>
>   /*
>* For sve-enmabled guests only, handle SVE access via FPSIMD
>* context handling code.
>*/
>
> Does that make sense?  I may have misunderstood your concern here.

s/enmabled/enabled/ but yeah that's fine.

>
> [...]
>
>> > @@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct 
>> > kvm_vcpu *vcpu)
>> >   */
>> >  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 
>> > *exit_code)
>> >  {
>> > +  bool guest_has_sve;
>> > +
>> >if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
>> >vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
>> >
>> > @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(s

Re: [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface

2018-11-20 Thread Alex Bennée

Dave Martin  writes:

> On Mon, Nov 19, 2018 at 04:48:36PM +0000, Alex Bennée wrote:
>>
>> Dave Martin  writes:
>>
>> > The Arm SVE architecture defines registers that are up to 2048 bits
>> > in size (with some possibility of further future expansion).
>> >
>> > In order to avoid the need for an excessively large number of
>> > ioctls when saving and restoring a vcpu's registers, this patch
>> > adds a #define to make support for individual 2048-bit registers
>> > through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
>> > will allow each SVE register to be accessed in a single call.
>> >
>> > There are sufficient spare bits in the register id size field for
>> > this change, so there is no ABI impact providing that
>> > KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
>> > userspace explicitly opts in to the relevant architecture-specific
>> > features.
>> >
>> > Signed-off-by: Dave Martin 
>> > ---
>> >  include/uapi/linux/kvm.h | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> > index 251be35..7c3c5cc 100644
>> > --- a/include/uapi/linux/kvm.h
>> > +++ b/include/uapi/linux/kvm.h
>> > @@ -1110,6 +1110,7 @@ struct kvm_dirty_tlb {
>> >  #define KVM_REG_SIZE_U256 0x0050ULL
>> >  #define KVM_REG_SIZE_U512 0x0060ULL
>> >  #define KVM_REG_SIZE_U10240x0070ULL
>> > +#define KVM_REG_SIZE_U20480x0080ULL
>>
>> Yeah OK I guess - but it does make me question if KVM_REG_SIZE_MASK is
>> part of the ABI because although we have space for another few bits that
>> is the last one without changing the mask.
>
> Debatable, but KVM_REG_SIZE_MASK is UAPI and suggests a clear intent not
> to recycle bit 55 for another purpose.  This allows for reg sizes up to
> 262144 bits which is hopefully more than enough for the foreseeable
> future.
>
> Even if bits 56-59 are currently always 0, KVM_REG_ARCH_MASK suggests
> that these bits aren't going to be used for size field bits.
>
>
> Or am I missing something?

No you are quite right - I thought I was watching an incrementing bit
position not an incrementing number. Too much staring at defines, carry
on ;-)
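
Concretely, the size field decodes as a power of two (per the
KVM_REG_SIZE macro in include/uapi/linux/kvm.h), which is where the
262144-bit ceiling comes from:

  /*
   * KVM_REG_SIZE(id) = 1 << ((id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)
   *
   *   KVM_REG_SIZE_U64   -> field 3  -> 1 << 3  ==     8 bytes
   *   KVM_REG_SIZE_U2048 -> field 8  -> 1 << 8  ==   256 bytes
   *   field maximum 15   -> 1 << 15  == 32768 bytes == 262144 bits
   */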

>
>> Reviewed-by: Alex Bennée 
>
> Thanks
> ---Dave


--
Alex Bennée


Re: [RFC PATCH v2 14/23] KVM: Allow 2048-bit register access via ioctl interface

2018-11-19 Thread Alex Bennée

Dave Martin  writes:

> The Arm SVE architecture defines registers that are up to 2048 bits
> in size (with some possibility of further future expansion).
>
> In order to avoid the need for an excessively large number of
> ioctls when saving and restoring a vcpu's registers, this patch
> adds a #define to make support for individual 2048-bit registers
> through the KVM_{GET,SET}_ONE_REG ioctl interface official.  This
> will allow each SVE register to be accessed in a single call.
>
> There are sufficient spare bits in the register id size field for
> this change, so there is no ABI impact providing that
> KVM_GET_REG_LIST does not enumerate any 2048-bit register unless
> userspace explicitly opts in to the relevant architecture-specific
> features.
>
> Signed-off-by: Dave Martin 
> ---
>  include/uapi/linux/kvm.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 251be35..7c3c5cc 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1110,6 +1110,7 @@ struct kvm_dirty_tlb {
>  #define KVM_REG_SIZE_U2560x0050ULL
>  #define KVM_REG_SIZE_U5120x0060ULL
>  #define KVM_REG_SIZE_U1024   0x0070ULL
> +#define KVM_REG_SIZE_U2048   0x0080ULL

Yeah OK I guess - but it does make me question if KVM_REG_SIZE_MASK is
part of the ABI because although we have space for another few bits that
is the last one without changing the mask.

Reviewed-by: Alex Bennée 

>
>  struct kvm_reg_list {
>   __u64 n; /* number of regs */


--
Alex Bennée


Re: [RFC PATCH v2 13/23] KVM: arm64/sve: Context switch the SVE registers

2018-11-19 Thread Alex Bennée
> + * system_supports_sve() is resolved at build time or via a static key.)
> + */
> +#define if_sve(cond) if (system_supports_sve() && (cond))
> +
> +static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu,
> +bool guest_has_sve)
>  {
>   struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
>
> - if (has_vhe())
> - write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
> -  cpacr_el1);
> - else
> + if (has_vhe()) {
> + u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
> +
> + if_sve (guest_has_sve)
> + reg |= CPACR_EL1_ZEN;
> +
> + write_sysreg(reg, cpacr_el1);
> + } else {
>   write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
>cptr_el2);
> + }
>
>   isb();
>
> @@ -350,8 +366,7 @@ static bool __hyp_text __hyp_switch_fpsimd(struct 
> kvm_vcpu *vcpu)
>* In the SVE case, VHE is assumed: it is enforced by
>* Kconfig and kvm_arch_init().
>*/
> - if (system_supports_sve() &&
> - (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
> + if_sve (vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE) {
>   struct thread_struct *thread = container_of(
>   host_fpsimd,
>   struct thread_struct, uw.fpsimd_state);
> @@ -364,11 +379,14 @@ static bool __hyp_text __hyp_switch_fpsimd(struct 
> kvm_vcpu *vcpu)
>   vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
>   }
>
> - __fpsimd_restore_state(>arch.ctxt.gp_regs.fp_regs);
> -
> - if (system_supports_sve() &&
> - vcpu->arch.flags & KVM_ARM64_GUEST_HAS_SVE)
> + if_sve (guest_has_sve) {
> + sve_load_state(vcpu_sve_pffr(vcpu),
> +>arch.ctxt.gp_regs.fp_regs.fpsr,
> +sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
>   write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);
> + } else {
> + __fpsimd_restore_state(>arch.ctxt.gp_regs.fp_regs);
> + }
>
>   /* Skip restoring fpexc32 for AArch64 guests */
>   if (!(read_sysreg(hcr_el2) & HCR_RW))
> @@ -380,6 +398,26 @@ static bool __hyp_text __hyp_switch_fpsimd(struct 
> kvm_vcpu *vcpu)
>   return true;
>  }
>
> +static inline bool __hyp_text __hyp_trap_is_fpsimd(struct kvm_vcpu *vcpu,
> +bool guest_has_sve)
> +{
> +
> + u8 trap_class;
> +
> + if (!system_supports_fpsimd())
> + return false;
> +
> + trap_class = kvm_vcpu_trap_get_class(vcpu);
> +
> + if (trap_class == ESR_ELx_EC_FP_ASIMD)
> + return true;
> +
> + if_sve (guest_has_sve && trap_class == ESR_ELx_EC_SVE)
> + return true;

Do we really need to check the guest has SVE before believing what the
hardware is telling us? According to the ARM ARM:

For ESR_ELx_EC_FP_ASIMD

  Excludes exceptions resulting from CPACR_EL1 when the value of HCR_EL2.TGE is
  1, or because SVE or Advanced SIMD and floating-point are not implemented. 
These
  are reported with EC value 0b00

But also for ESR_ELx_EC_SVE

  Access to SVE functionality trapped as a result of CPACR_EL1.ZEN,
  CPTR_EL2.ZEN, CPTR_EL2.TZ, or CPTR_EL3.EZ, that is not reported using EC
  0b00. This EC is defined only if SVE is implemented

Given I got confused maybe we need a comment for clarity?

  /* Catch guests without SVE enabled running on SVE capable hardware */

> +
> + return false;
> +}
> +
>  /*
>   * Return true when we were able to fixup the guest exit and should return to
>   * the guest, false when we should restore the host state and return to the
> @@ -387,6 +425,8 @@ static bool __hyp_text __hyp_switch_fpsimd(struct 
> kvm_vcpu *vcpu)
>   */
>  static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 
> *exit_code)
>  {
> + bool guest_has_sve;
> +
>   if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)
>   vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
>
> @@ -404,10 +444,11 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu 
> *vcpu, u64 *exit_code)
>* and restore the guest context lazily.
>* If FP/SIMD is not implemented, handle the trap and inject an
>* undefined instruction exception to the guest.
> +  * Similarly for trapped SVE accesses.
>*/
> - if (system_supports_fpsimd() &&
> - kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
> - return __hyp_switch_fpsimd(vcpu);
> + guest_has_sve = vcpu_has_sve(vcpu);

I'm not sure if it's worth fishing this out here given you are already
passing vcpu down the chain.

> + if (__hyp_trap_is_fpsimd(vcpu, guest_has_sve))
> + return __hyp_switch_fpsimd(vcpu, guest_has_sve);
>
>   if (!__populate_fault_info(vcpu))
>   return true;

Otherwise:

Reviewed-by: Alex Bennée 

--
Alex Bennée


Re: [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability

2018-11-16 Thread Alex Bennée

Dave Martin  writes:

> On Thu, Nov 15, 2018 at 03:39:01PM +0000, Alex Bennée wrote:
>>
>> Dave Martin  writes:
>>
>> > Due to the way the effective SVE vector length is controlled and
>> > trapped at different exception levels, certain mismatches in the
>> > sets of vector lengths supported by different physical CPUs in the
>> > system may prevent straightforward virtualisation of SVE at parity
>> > with the host.
>> >
>> > This patch analyses the extent to which SVE can be virtualised
>> > safely without interfering with migration of vcpus between physical
>> > CPUs, and rejects late secondary CPUs that would erode the
>> > situation further.
>> >
>> > It is left up to KVM to decide what to do with this information.
>> >
>> > Signed-off-by: Dave Martin 
>> > ---
>> >
>> > Changes since RFCv1:
>> >
>> >  * The analysis done by this patch is the same as in the previous
>> >version, but the commit message the printks etc. have been reworded
>> >to avoid the suggestion that KVM is expected to work on a system with
>> >mismatched SVE implementations.
>> > ---
>> >  arch/arm64/include/asm/fpsimd.h |  1 +
>> >  arch/arm64/kernel/cpufeature.c  |  2 +-
>> >  arch/arm64/kernel/fpsimd.c  | 87 
>> > +++--
>> >  3 files changed, 76 insertions(+), 14 deletions(-)
>> >
>
> [...]
>
>> > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
>
> [...]
>
>> > @@ -623,11 +629,8 @@ int sve_get_current_vl(void)
>
> [...]
>
>> > +/* Bitmaps for temporary storage during manipulation of vector length 
>> > sets */
>> > +static DECLARE_BITMAP(sve_tmp_vq_map, SVE_VQ_MAX);
>>
>> This seems odd as a local global, why not declared locally when used?
>
> Could do.
>
> My original concern was that this is "big" and therefore it's impolite
> to allocate it on the stack.
>
> But on reflection, 64 bytes of stack is no big deal for a 64-bit
> architecture.  The affected functions probably spill more than than
> already, and these functions are called on well-defined paths which
> shouldn't have super-deep stacks already.
>
> [...]
>
>> > @@ -658,24 +662,60 @@ void __init sve_init_vq_map(void)
>> >   */
>> >  void sve_update_vq_map(void)
>> >  {
>> > -  sve_probe_vqs(sve_secondary_vq_map);
>> > -  bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
>> > +  sve_probe_vqs(sve_tmp_vq_map);
>> > +  bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
>> > + SVE_VQ_MAX);
>> > +  bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
>> > +SVE_VQ_MAX);
>> >  }
>>
>> I'm not quite following what's going on here. This is tracking both the
>> vector lengths available on all CPUs and the ones available on at least
>> one CPU? This raises some questions:
>>
>>   - do such franken-machines exist or are they expected to exist?
>
> no, and yes respectively (Linux does not endorse the latter for now,
> since it results in a non-SMP system: we hide the asymmetries where
> possible by clamping the set of available vector lengths, but for
> KVM it's too hard and we don't aim to support it at all).
>
> Even if we don't recommend deploying a general-purpose OS on such a
> system, people will eventually try it.  So it's better to fail safe
> rather than silently doing the wrong thing.
>
>>   - how do we ensure this is always up to date?
>
> This gets updated for each early secondary CPU that comes up.  (Early
> secondaries' boot is serialised, so we shouldn't have to worry about
> races here.)
>
> The configuration is frozen by the time we enter userspace (hence
> __ro_after_init).
>
> Once all the early secondaries have come up, we commit to the best
> possible set of vector lengths for the CPUs that we know about, and we
> don't call this path any more: instead, each late secondary goes into
> sve_verify_vq_map() instead to check that those CPUs are compatible
> with the configuration we committed to.
>
> For context, take a look at
> arch/arm64/kernel/cpufeature.c:check_local_cpu_capabilities(), which is
> the common entry point for all secondary CPUs: that splits into
> update_cpu_capabilities() and verify_local_cpu_capabilities() paths for
> the two cases described above, calling down into sve_update_vq_map()
> and sve_verify_vq_map() as appropriate.
>
>>   - what happens when we hotplug a new CPU with less available VQ?
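
For anyone else following along, the update/verify split Dave describes
is in arch/arm64/kernel/cpufeature.c and looks roughly like this
(condensed; comments mine, details approximate):

  static void check_local_cpu_capabilities(void)
  {
          /*
           * Early secondaries boot before sys_caps_initialised is set
           * and take the update path (-> sve_update_vq_map()); late
           * (e.g. hotplugged) CPUs take the verify path
           * (-> sve_verify_vq_map()) and can be rejected.
           */
          if (!sys_caps_initialised)
                  update_cpu_capabilities(SCOPE_LOCAL_CPU);
          else
                  verify_local_cpu_capabilities();
  }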

Re: [RFC PATCH v2 12/23] KVM: arm64/sve: System register context switch and access support

2018-11-15 Thread Alex Bennée
> + const u64 id = sys_reg_to_index(rd);
> + int err;
> + u64 val;
> +
> + if (!vcpu_has_sve(vcpu))
> + return -ENOENT;
> +
> + err = reg_from_user(, uaddr, id);
> + if (err)
> + return err;
> +
> + /* This is what we mean by invariant: you can't change it. */
> + if (val != guest_id_aa64zfr0_el1(vcpu))
> + return -EINVAL;
> +
> + return 0;
> +}
> +#endif /* CONFIG_ARM64_SVE */
> +
>  /*
>   * cpufeature ID register user accessors
>   *
> @@ -1270,7 +1366,11 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>   ID_SANITISED(ID_AA64PFR1_EL1),
>   ID_UNALLOCATED(4,2),
>   ID_UNALLOCATED(4,3),
> +#ifdef CONFIG_ARM64_SVE
> + { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = 
> get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .check_present = 
> sve_check_present },
> +#else
>   ID_UNALLOCATED(4,4),
> +#endif
>   ID_UNALLOCATED(4,5),
>   ID_UNALLOCATED(4,6),
>   ID_UNALLOCATED(4,7),
> @@ -1307,6 +1407,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>
>   { SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 
> 0x00C50078 },
>   { SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
> +#ifdef CONFIG_ARM64_SVE
> + { SYS_DESC(SYS_ZCR_EL1), access_zcr_el1, reset_unknown, ZCR_EL1, 
> ~0xfUL, .get_user = get_zcr_el1, .set_user = set_zcr_el1, .check_present = 
> sve_check_present },
> +#endif
>   { SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
>   { SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
>   { SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },

Overlong lines.


--
Alex Bennée


Re: [RFC PATCH v2 09/23] KVM: arm64: Propagate vcpu into read_id_reg()

2018-11-15 Thread Alex Bennée

Dave Martin  writes:

> Architecture features that are conditionally visible to the guest
> will require run-time checks in the ID register accessor functions.
> In particular, read_id_reg() will need to perform checks in order
> to generate the correct emulated value for certain ID register
> fields such as ID_AA64PFR0_EL1.SVE for example.
>
> This patch propagates vcpu into read_id_reg() so that future
> patches can add run-time checks on the guest configuration here.
>
> For now, there is no functional change.
>
> Signed-off-by: Dave Martin 

Reviewed-by: Alex Bennée 

> ---
>  arch/arm64/kvm/sys_regs.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 22fbbdb..0dfd064 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1029,7 +1029,8 @@ static bool access_cntp_cval(struct kvm_vcpu *vcpu,
>  }
>
>  /* Read a sanitised cpufeature ID register by sys_reg_desc */
> -static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> + struct sys_reg_desc const *r, bool raz)
>  {
>   u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
>(u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
> @@ -1060,7 +1061,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
>   if (p->is_write)
>   return write_to_read_only(vcpu, p, r);
>
> - p->regval = read_id_reg(r, raz);
> + p->regval = read_id_reg(vcpu, r, raz);
>   return true;
>  }
>
> @@ -1089,16 +1090,18 @@ static u64 sys_reg_to_index(const struct sys_reg_desc 
> *reg);
>   * are stored, and for set_id_reg() we don't allow the effective value
>   * to be changed.
>   */
> -static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
> +static int __get_id_reg(const struct kvm_vcpu *vcpu,
> + const struct sys_reg_desc *rd, void __user *uaddr,
>   bool raz)
>  {
>   const u64 id = sys_reg_to_index(rd);
> - const u64 val = read_id_reg(rd, raz);
> + const u64 val = read_id_reg(vcpu, rd, raz);
>
>   return reg_to_user(uaddr, , id);
>  }
>
> -static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
> +static int __set_id_reg(const struct kvm_vcpu *vcpu,
> + const struct sys_reg_desc *rd, void __user *uaddr,
>   bool raz)
>  {
>   const u64 id = sys_reg_to_index(rd);
> @@ -1110,7 +1113,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, 
> void __user *uaddr,
>   return err;
>
>   /* This is what we mean by invariant: you can't change it. */
> - if (val != read_id_reg(rd, raz))
> + if (val != read_id_reg(vcpu, rd, raz))
>   return -EINVAL;
>
>   return 0;
> @@ -1119,25 +1122,25 @@ static int __set_id_reg(const struct sys_reg_desc 
> *rd, void __user *uaddr,
>  static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> - return __get_id_reg(rd, uaddr, false);
> + return __get_id_reg(vcpu, rd, uaddr, false);
>  }
>
>  static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> - return __set_id_reg(rd, uaddr, false);
> + return __set_id_reg(vcpu, rd, uaddr, false);
>  }
>
>  static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc 
> *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> - return __get_id_reg(rd, uaddr, true);
> + return __get_id_reg(vcpu, rd, uaddr, true);
>  }
>
>  static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc 
> *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> - return __set_id_reg(rd, uaddr, true);
> + return __set_id_reg(vcpu, rd, uaddr, true);
>  }
>
>  /* sys_reg_desc initialiser for known cpufeature ID registers */


--
Alex Bennée


Re: [RFC PATCH v2 08/23] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest

2018-11-15 Thread Alex Bennée

Dave Martin  writes:

> Since SVE will be enabled or disabled on a per-vcpu basis, a flag
> is needed in order to track which vcpus have it enabled.
>
> This patch adds a suitable flag and a helper for checking it.
>
> Signed-off-by: Dave Martin 

Reviewed-by: Alex Bennée 

> ---
>
> Changes since RFCv1:
>
>  * Convert vcpu_has_sve() to a macro so that it can operate on a vcpu
>without circular header dependency problems.
>
>This avoids the helper requiring a vcpu_arch argument, which was
>a little ugly.
> ---
>  arch/arm64/include/asm/kvm_host.h | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index d4b65414..20baf4a 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -307,6 +307,10 @@ struct kvm_vcpu_arch {
>  #define KVM_ARM64_FP_HOST(1 << 2) /* host FP regs loaded */
>  #define KVM_ARM64_HOST_SVE_IN_USE(1 << 3) /* backup for host TIF_SVE */
>  #define KVM_ARM64_HOST_SVE_ENABLED   (1 << 4) /* SVE enabled for EL0 */
> +#define KVM_ARM64_GUEST_HAS_SVE  (1 << 5) /* SVE exposed to 
> guest */
> +
> +#define vcpu_has_sve(vcpu) (system_supports_sve() && \
> +     ((vcpu)->arch.flags & KVM_ARM64_GUEST_HAS_SVE))
>
>  #define vcpu_gp_regs(v)  (&(v)->arch.ctxt.gp_regs)


--
Alex Bennée


Re: [RFC PATCH v2 06/23] arm64/sve: Check SVE virtualisability

2018-11-15 Thread Alex Bennée
> + bitmap_and(sve_vq_map, sve_vq_map, sve_tmp_vq_map,
> +SVE_VQ_MAX);
> + bitmap_or(sve_vq_partial_map, sve_vq_partial_map, sve_tmp_vq_map,
> +   SVE_VQ_MAX);
>  }

I'm not quite following what's going on here. This is tracking both the
vector lengths available on all CPUs and the ones available on at least
one CPU? This raises some questions:

  - do such franken-machines exist or are they expected to exist?
  - how do we ensure this is always up to date?
  - what happens when we hotplug a new CPU with less available VQ?

>
>  /* Check whether the current CPU supports all VQs in the committed set */
>  int sve_verify_vq_map(void)
>  {
> - int ret = 0;
> + int ret = -EINVAL;
> + unsigned long b;
>
> - sve_probe_vqs(sve_secondary_vq_map);
> - bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
> -   SVE_VQ_MAX);
> - if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
> + sve_probe_vqs(sve_tmp_vq_map);
> +
> + bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
> + if (bitmap_intersects(sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX)) {
>   pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
>   smp_processor_id());
> - ret = -EINVAL;
> + goto error;

The use of goto seems a little premature considering we don't have any
clean-up to do.

> + }
> +
> + if (!IS_ENABLED(CONFIG_KVM) || !is_hyp_mode_available())
> + goto ok;
> +
> + /*
> +  * For KVM, it is necessary to ensure that this CPU doesn't
> +  * support any vector length that guests may have probed as
> +  * unsupported.
> +  */
> +
> + /* Recover the set of supported VQs: */
> + bitmap_complement(sve_tmp_vq_map, sve_tmp_vq_map, SVE_VQ_MAX);
> + /* Find VQs supported that are not globally supported: */
> + bitmap_andnot(sve_tmp_vq_map, sve_tmp_vq_map, sve_vq_map, SVE_VQ_MAX);
> +
> + /* Find the lowest such VQ, if any: */
> + b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
> + if (b >= SVE_VQ_MAX)
> + goto ok; /* no mismatches */
> +
> + /*
> +  * Mismatches above sve_max_virtualisable_vl are fine, since
> +  * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
> +  */
> + if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
> + pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
> + smp_processor_id());
> + goto error;
>   }
>
> +ok:
> + ret = 0;
> +error:
>   return ret;
>  }
>
> @@ -743,6 +783,7 @@ u64 read_zcr_features(void)
>  void __init sve_setup(void)
>  {
>   u64 zcr;
> + unsigned long b;
>
>   if (!system_supports_sve())
>   return;
> @@ -771,11 +812,31 @@ void __init sve_setup(void)
>*/
>   sve_default_vl = find_supported_vector_length(64);
>
> + bitmap_andnot(sve_tmp_vq_map, sve_vq_partial_map, sve_vq_map,
> +   SVE_VQ_MAX);
> +
> + b = find_last_bit(sve_tmp_vq_map, SVE_VQ_MAX);
> + if (b >= SVE_VQ_MAX)
> + /* No non-virtualisable VLs found */
> + sve_max_virtualisable_vl = SVE_VQ_MAX;
> + else if (WARN_ON(b == SVE_VQ_MAX - 1))
> + /* No virtualisable VLs?  This is architecturally forbidden. */
> + sve_max_virtualisable_vl = SVE_VQ_MIN;
> + else /* b + 1 < SVE_VQ_MAX */
> + sve_max_virtualisable_vl = sve_vl_from_vq(bit_to_vq(b + 1));
> +
> + if (sve_max_virtualisable_vl > sve_max_vl)
> + sve_max_virtualisable_vl = sve_max_vl;
> +
>   pr_info("SVE: maximum available vector length %u bytes per vector\n",
>   sve_max_vl);
>   pr_info("SVE: default vector length %u bytes per vector\n",
>   sve_default_vl);
>
> + /* KVM decides whether to support mismatched systems. Just warn here: */
> + if (sve_max_virtualisable_vl < sve_max_vl)
> + pr_info("SVE: unvirtualisable vector lengths present\n");
> +
>   sve_efi_setup();
>  }


--
Alex Bennée


Re: [PATCH 2/2] kvm/arm: consistently advance singlestep when emulating instructions

2018-11-09 Thread Alex Bennée

Mark Rutland  writes:

> When we emulate a guest instruction, we don't advance the hardware
> singlestep state machine, and thus the guest will receive a software
> step exception after a next instruction which is not emulated by the
> host.
>
> We bodge around this in an ad-hoc fashion. Sometimes we explicitly check
> whether userspace requested a single step, and fake a debug exception
> from within the kernel. Other times, we advance the HW singlestep state
> and rely on the HW to generate the exception for us. Thus, the observed
> step behaviour differs for host and guest.
>
> Let's make this simpler and consistent by always advancing the HW
> singlestep state machine when we skip an instruction. Thus we can rely
> on the hardware to generate the singlestep exception for us, and never
> need to explicitly check for an active-pending step, nor do we need to
> fake a debug exception from the guest.
>
> Signed-off-by: Mark Rutland 
> Cc: Alex Bennée 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Peter Maydell 

Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 

For reference I'm leaving this kernel boot soaking overnight. In theory
there may be a branch to self but we shall see:

  https://gist.github.com/stsquad/ddfb00787cb133b4b658756cb6c47f63

> ---
>  arch/arm/include/asm/kvm_host.h  |  5 
>  arch/arm64/include/asm/kvm_emulate.h | 35 --
>  arch/arm64/include/asm/kvm_host.h|  1 -
>  arch/arm64/kvm/debug.c   | 21 
>  arch/arm64/kvm/handle_exit.c | 14 +--
>  arch/arm64/kvm/hyp/switch.c  | 43 
> +++-
>  arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 12 ++---
>  virt/kvm/arm/arm.c   |  2 --
>  virt/kvm/arm/hyp/vgic-v3-sr.c|  6 -
>  9 files changed, 46 insertions(+), 93 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 5ca5d9af0c26..c5634c6ffcea 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -296,11 +296,6 @@ static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
> -static inline bool kvm_arm_handle_step_debug(struct kvm_vcpu *vcpu,
> -  struct kvm_run *run)
> -{
> - return false;
> -}
>
>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>  struct kvm_device_attr *attr);
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 21247870def7..506386a3edde 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -24,6 +24,7 @@
>
>  #include 
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -147,14 +148,6 @@ static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
>   return true;
>  }
>
> -static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
> -{
> - if (vcpu_mode_is_32bit(vcpu))
> - kvm_skip_instr32(vcpu, is_wide_instr);
> - else
> - *vcpu_pc(vcpu) += 4;
> -}
> -
>  static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
>  {
>   *vcpu_cpsr(vcpu) |= PSR_AA32_T_BIT;
> @@ -424,4 +417,30 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
>   return data;/* Leave LE untouched */
>  }
>
> +static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
> +{
> + if (vcpu_mode_is_32bit(vcpu))
> + kvm_skip_instr32(vcpu, is_wide_instr);
> + else
> + *vcpu_pc(vcpu) += 4;
> +
> + /* advance the singlestep state machine */
> + *vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
> +}
> +
> +/*
> + * Skip an instruction which has been emulated at hyp while most guest sysregs
> + * are live.
> + */
> +static inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu)
> +{
> + *vcpu_pc(vcpu) = read_sysreg_el2(elr);
> + vcpu->arch.ctxt.gp_regs.regs.pstate = read_sysreg_el2(spsr);
> +
> + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +
> + write_sysreg_el2(vcpu->arch.ctxt.gp_regs.regs.pstate, spsr);
> + write_sysreg_el2(*vcpu_pc(vcpu), elr);
> +}
> +
>  #endif /* __ARM64_KVM_EMULATE_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 52fbc823ff8c..7a5035f9c5c3 100644
> --- a/arch/arm64/in

Re: [PATCH 1/2] kvm/arm: skip MMIO insn after emulation

2018-11-09 Thread Alex Bennée

Mark Rutland  writes:

> When we emulate an MMIO instruction, we advance the CPU state within
> decode_hsr(), before emulating the instruction effects.
>
> Having this logic in decode_hsr() is opaque, and advancing the state
> before emulation is problematic. It gets in the way of applying
> consistent single-step logic, and it prevents us from being able to fail
> an MMIO instruction with a synchronous exception.
>
> Clean this up by only advancing the CPU state *after* the effects of the
> instruction are emulated.
>
> Signed-off-by: Mark Rutland 
> Cc: Alex Bennée 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Peter Maydell 

Reviewed-by: Alex Bennée 

> ---
>  virt/kvm/arm/mmio.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
> index dac7ceb1a677..08443a15e6be 100644
> --- a/virt/kvm/arm/mmio.c
> +++ b/virt/kvm/arm/mmio.c
> @@ -117,6 +117,12 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
>   }
>
> + /*
> +  * The MMIO instruction is emulated and should not be re-executed
> +  * in the guest.
> +  */
> + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> +
>   return 0;
>  }
>
> @@ -144,11 +150,6 @@ static int decode_hsr(struct kvm_vcpu *vcpu, bool *is_write, int *len)
>   vcpu->arch.mmio_decode.sign_extend = sign_extend;
>   vcpu->arch.mmio_decode.rt = rt;
>
> - /*
> -  * The MMIO instruction is emulated and should not be re-executed
> -  * in the guest.
> -  */
> - kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>   return 0;
>  }
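
As I read it, the resulting flow for an MMIO access becomes (a sketch,
ignoring the in-kernel emulation case):

  guest insn traps (data abort)
    -> decode_hsr(): record rt/len/sign_extend only, PC untouched
    -> exit to userspace (KVM_EXIT_MMIO), access emulated there
    -> KVM_RUN: kvm_handle_mmio_return() writes back the data
    -> kvm_skip_instr() advances the PC (and, with patch 2/2, the
       step state machine) only once the insn's effects are visible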


--
Alex Bennée


Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-09 Thread Alex Bennée

Mark Rutland  writes:

> On Thu, Nov 08, 2018 at 02:38:43PM +, Peter Maydell wrote:
>> On 8 November 2018 at 14:28, Alex Bennée  wrote:
>> >
>> > Mark Rutland  writes:
>> >> One problem is that I couldn't spot when we advance the PC for an MMIO
>> >> trap. I presume we do that in the kernel, *after* the MMIO trap, but I
>> >> can't see where that happens.
>> >
>> > Nope it gets done before during decode_hsr in mmio.c:
>> >
>> > /*
>> >  * The MMIO instruction is emulated and should not be re-executed
>> >  * in the guest.
>> >  */
>> > kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>>
>> I think that this attempt to do the PC-advance early is
>> probably an underlying problem that is not helping the
>> code structure here.
>>
>> An enhancement that's been floated previously is that the
>> MMIO emulation in userspace should be able to report back
>> to KVM "nope, that access should generate a guest synchronous
>> external abort (with ESR_EL1.EA = 0/1)".
>> If we have that, then we definitely need to not advance the
>> PC until after userspace has done the emulation and told
>> us whether the memory access succeeded or not...
>
> Yup.
>
> I think that we absolutely want to do all the CPU state advancement (PC,
> SS bit, etc) at the point we apply the effects of the instruction. Not
> before we emulate the instruction, and not higher/lower in the call
> stack.

There is certainly an argument to abstract some of the advancement logic
so we can make debug-related decisions in one place. I don't know how
much churn we would need to get there.

Currently most of the guest debug decisions are made in one place as we
head into the guest run. Generally I don't think the emulation code wants
to know or care about the SS bit or what debug is currently happening,
although I guess the presence of the SS bit could be used to decide on
exactly what exit type you are going to do - back to guest or out to
user space. Currently kvm_arm_handle_step_debug only cares about
host-directed debug, but we could expand it to raise the appropriate
guest exception if required.

> We have a big problem in that guest-directed singlestep and
> host-directed singlestep don't compose, and given that host-directed
> singlestep doesn't work reliably today I'd be tempted to rip that out
> until we've fixed guest-directed singlestep.

Getting host and guest debug to run at the same time is tricky, given
we end up subverting guest state when host debug is in control. It did
make my head spin when I worked on it originally, which led to the
acceptance that guest debug couldn't be expected to work transparently
while host-directed debug was in effect. If we can support it without
adding complexity then that would be great, but it's a pretty niche use
case.

I'd be loath to rip out the current single-step support as it does
actually work pretty well - it's just corner cases with emulated MMIO
and the first step that are failing. Last I looked these were cases x86
didn't even get right, and no one has called to remove its single-step
support AFAIK.

> We should have a story for how host-directed debug is handled
> transparently from the PoV of a guest using guest-directed debug.

What's the use case for this apart from having a cleaner abstraction? Do
users really spend time running multiple gdbs at different levels in
the stack?

>
> Thanks,
> Mark.


--
Alex Bennée


Re: [RFC PATCH] KVM: arm64: don't single-step for non-emulated faults

2018-11-08 Thread Alex Bennée

Mark Rutland  writes:

> On Thu, Nov 08, 2018 at 12:40:11PM +0000, Alex Bennée wrote:
>> Mark Rutland  writes:
>> > On Wed, Nov 07, 2018 at 06:01:20PM +, Mark Rutland wrote:
>> >> On Wed, Nov 07, 2018 at 05:10:31PM +, Alex Bennée wrote:
>> >> > Not all faults handled by handle_exit are instruction emulations. For
>> >> > example a ESR_ELx_EC_IABT will result in the page tables being updated
>> >> > but the instruction that triggered the fault hasn't actually executed
>> >> > yet. We use the simple heuristic of checking for a changed PC before
>> >> > seeing if kvm_arm_handle_step_debug wants to claim we stepped an
>> >> > instruction.
>> >> >
>> >> > Signed-off-by: Alex Bennée 

>> >> > @@ -233,7 +234,8 @@ static int handle_trap_exceptions(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> >> >  * kvm_arm_handle_step_debug() sets the exit_reason on the kvm_run
>> >> >  * structure if we need to return to userspace.
>> >> >  */
>> >> > -   if (handled > 0 && kvm_arm_handle_step_debug(vcpu, run))
>> >> > +   if (handled > 0 && *vcpu_pc(vcpu) != old_pc &&
>> >>

>> >> When are we failing to advance the single-step state machine
>> >> correctly?
>>
When the trap is not actually an instruction emulation - e.g. setting up
the page tables on a fault - the instruction hasn't executed yet. Because
we are in the act of single-stepping an instruction that didn't actually
execute, we erroneously return to userspace claiming we stepped it when
we shouldn't.
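
To illustrate (my rough taxonomy, not exhaustive):

  ESR_ELx_EC_WFx       -> emulated and skipped: a completed step
  ESR_ELx_EC_SYS64     -> emulated and skipped: a completed step
  ESR_ELx_EC_IABT_LOW  -> stage-2 fault fixed up, insn NOT yet executed
  ESR_ELx_EC_DABT_LOW  -> MMIO: emulated; plain stage-2 fault: not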
>
> I think one problem here is that we're trying to use one bit of state
> (the KVM_GUESTDBG_SINGLESTEP) when we actually need two.
>
> I had expected that we'd follow the architectural single-step state
> machine, and have three states:
>
> * inactive/disabled: not single stepping
>
> * active-not-pending: the current instruction will be stepped, and we'll
>   transition to active-pending before executing the next instruction.
>
> * active-pending: the current instruction will raise a software step
>   debug exception, before being executed.
>
> For that to work, all we have to do is advance the state machine when we
> emulate/skip an instruction, and the HW will raise the exception for us
> when we enter the guest (which is the only place we have to handle the
> step exception).

We also elide the fact that single-stepping is happening from the guest
here by piggybacking the step bit onto cpsr() as we enter KVM rather
than just tracking the state of the bit.

The current flow of guest debug is very much "as I enter what do I need
to set" rather than tracking state between VCPU_RUN events.

> We need two bits of internal state for that, but KVM only gives us a
> single KVM_GUESTDBG_SINGLESTEP flag, and we might exit to userspace
> mid-emulation (e.g. for MMIO). To avoid that resulting in skipping two
> instructions at a time, we currently add explicit
> kvm_arm_handle_step_debug() checks everywhere after we've (possibly)
> emulated an instruction, but these seem to hit too often.

Yes - treating all exits as potential emulations is problematic, and we
are adding complexity to track which exits are and aren't actual
*completed* instruction emulations, which can themselves be multi-stage
things split between userspace and the kernel.

> One problem is that I couldn't spot when we advance the PC for an MMIO
> trap. I presume we do that in the kernel, *after* the MMIO trap, but I
> can't see where that happens.

Nope, it gets done beforehand, during decode_hsr in mmio.c:

/*
 * The MMIO instruction is emulated and should not be re-executed
 * in the guest.
 */
kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));

That is a little non-obvious, but before guest debug support was added
it made sense, as the whole trap->kernel->user->kernel->guest cycle is
"atomic" w.r.t. the guest. It's also common code for in-kernel and
in-userspace emulation.

For single-step we just built on that and completed the single-step
after mmio was complete.
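
So today the sequence is (sketch):

  guest insn traps (MMIO data abort)
    -> decode_hsr(): rt/len recorded, kvm_skip_instr() advances PC now
    -> exit to userspace (KVM_EXIT_MMIO), access emulated there
    -> KVM_RUN: kvm_handle_mmio_return() writes back the data
    -> if single-stepping, exit with KVM_EXIT_DEBUG instead of
       re-entering the guest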

>
> Thanks,
> Mark.


--
Alex Bennée

