Re: [PATCH 17/21] s390x: Fix latent query-cpu-model-FOO error handling bugs

2019-12-02 Thread Markus Armbruster
Cornelia Huck  writes:

> On Sat, 30 Nov 2019 20:42:36 +0100
> Markus Armbruster  wrote:
>
> I don't really want to restart the discussion :), but what about:
>
>> cpu_model_from_info() is a helper for qmp_query_cpu_model_expansion(),
>> qmp_query_cpu_model_comparison(), qmp_query_cpu_model_baseline().  It
>> crashes when the visitor or the QOM setter fails, and its @errp
>> argument is null. 
>
> "It would crash when the visitor or the QOM setter fails if its @errp
> argument were NULL." ?
>
> (Hope I got my conditionals right...)

I don't think this is an improvement.

The commit message matches a pattern "what's wrong, since when, impact,
how is it fixed".  The pattern has become habit for me.  Its "what's
wrong" part is strictly local.  The non-local argument comes in only
when we assess impact.

Use of "would" in the what part's conditional signals the condition is
unlikely.  True (it's actually impossible), but distracting (because it
involves the non-local argument I'm not yet ready to make).

Let me try a different phrasing below.

>> Messed up in commit 137974cea3 's390x/cpumodel:
>
> I agree that "Introduced" is a bit nicer than "Messed up".

Works fine for me.  I didn't mean any disrespect --- I'd have to
disrespect myself: the mess corrected by PATCH 10 is mine.

>> implement QMP interface "query-cpu-model-expansion"'.
>> 
>> Its three callers have the same bug.  Messed up in commit 4e82ef0502

Feel free to call it "issue" rather than "bug".  I don't care, but David
might.

>> 's390x/cpumodel: implement QMP interface "query-cpu-model-comparison"'
>> and commit f1a47d08ef 's390x/cpumodel: implement QMP interface
>> "query-cpu-model-baseline"'.
>
> If we agree, I can tweak the various commit messages for the s390x
> patches and apply them.

Tweaking the non-s390x commit messages as well would be nicer, but
requires a respin.

Let's try to craft a mutually agreeable commit message for this patch.
Here's my attempt:

s390x: Fix query-cpu-model-FOO error API violations

cpu_model_from_info() is a helper for qmp_query_cpu_model_expansion(),
qmp_query_cpu_model_comparison(), qmp_query_cpu_model_baseline().  It
dereferences @errp when the visitor or the QOM setter fails.  That's
wrong; see the big comment in error.h.  Introduced in commit
137974cea3 's390x/cpumodel: implement QMP interface
"query-cpu-model-expansion"'.

Its three callers have the same issue.  Introduced in commit
4e82ef0502 's390x/cpumodel: implement QMP interface
"query-cpu-model-comparison"' and commit f1a47d08ef 's390x/cpumodel:
implement QMP interface "query-cpu-model-baseline"'.

No caller actually passes null.  To fix, splice in a local Error *err,
and error_propagate().

Cc: David Hildenbrand 
Cc: Cornelia Huck 
Signed-off-by: Markus Armbruster 

Adapting it to other patches should be straightforward.

>> The bugs can't bite as no caller actually passes null.  Fix them
>> anyway.
>> 
>> Cc: David Hildenbrand 
>> Cc: Cornelia Huck 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  target/s390x/cpu_models.c | 43 +++++++++++++++++++++++++++----------------
>>  1 file changed, 27 insertions(+), 16 deletions(-)
>
> David, I don't think you gave a R-b for that one yet?




Re: [PATCH] hw/ppc/prep: Remove the deprecated "prep" machine and the OpenHackware BIOS

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 8:29 AM, Thomas Huth wrote:

It's been deprecated since QEMU v3.1. The 40p machine should be
used nowadays instead.

Signed-off-by: Thomas Huth 
---
  .gitmodules|   3 -
  MAINTAINERS|   1 -
  Makefile   |   2 +-
  docs/interop/firmware.json |   3 +-
  hw/ppc/ppc.c   |  18 --
  hw/ppc/prep.c  | 384 +
  include/hw/ppc/ppc.h   |   1 -
  pc-bios/README |   3 -
  pc-bios/ppc_rom.bin| Bin 1048576 -> 0 bytes
  qemu-deprecated.texi   |   6 -
  qemu-doc.texi  |  15 +-
  roms/openhackware  |   1 -
  tests/boot-order-test.c|  25 ---
  tests/cdrom-test.c |   2 +-
  tests/endianness-test.c|   2 +-
  15 files changed, 10 insertions(+), 456 deletions(-)
  delete mode 100644 pc-bios/ppc_rom.bin
  delete mode 16 roms/openhackware

[...]

diff --git a/tests/boot-order-test.c b/tests/boot-order-test.c
index a725bce729..4a6218a516 100644
--- a/tests/boot-order-test.c
+++ b/tests/boot-order-test.c
@@ -108,30 +108,6 @@ static void test_pc_boot_order(void)
  test_boot_orders(NULL, read_boot_order_pc, test_cases_pc);
  }
  
-static uint8_t read_m48t59(QTestState *qts, uint64_t addr, uint16_t reg)

-{
-qtest_writeb(qts, addr, reg & 0xff);
-qtest_writeb(qts, addr + 1, reg >> 8);
-return qtest_readb(qts, addr + 3);
-}
-
-static uint64_t read_boot_order_prep(QTestState *qts)
-{
-return read_m48t59(qts, 0x8000 + 0x74, 0x34);


I'd rather keep this generic mmio-mapped ISA test.
Maybe run it with the 40p machine?

Maybe we can rename this as read_boot_order_mm, and the previous 
read_boot_order_pc as read_boot_order_io.


Apart from this comment, thanks for this cleanup.

Ideally keeping read_boot_order_io test:
Reviewed-by: Philippe Mathieu-Daudé 


-}
-
-static const boot_order_test test_cases_prep[] = {
-{ "", 'c', 'c' },
-{ "-boot c", 'c', 'c' },
-{ "-boot d", 'd', 'd' },
-{}
-};
-
-static void test_prep_boot_order(void)
-{
-test_boot_orders("prep", read_boot_order_prep, test_cases_prep);
-}
-
  static uint64_t read_boot_order_pmac(QTestState *qts)
  {
  QFWCFG *fw_cfg = mm_fw_cfg_init(qts, 0xf510);
@@ -190,7 +166,6 @@ int main(int argc, char *argv[])
  if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
  qtest_add_func("boot-order/pc", test_pc_boot_order);
  } else if (strcmp(arch, "ppc") == 0 || strcmp(arch, "ppc64") == 0) {
-qtest_add_func("boot-order/prep", test_prep_boot_order);
  qtest_add_func("boot-order/pmac_oldworld",
 test_pmac_oldworld_boot_order);
  qtest_add_func("boot-order/pmac_newworld",
diff --git a/tests/cdrom-test.c b/tests/cdrom-test.c
index 34e9974634..006044f48a 100644
--- a/tests/cdrom-test.c
+++ b/tests/cdrom-test.c
@@ -189,7 +189,7 @@ int main(int argc, char **argv)
  add_s390x_tests();
  } else if (g_str_equal(arch, "ppc64")) {
  const char *ppcmachines[] = {
-"pseries", "mac99", "g3beige", "40p", "prep", NULL
+"pseries", "mac99", "g3beige", "40p", NULL
  };
  add_cdrom_param_tests(ppcmachines);
  } else if (g_str_equal(arch, "sparc")) {
diff --git a/tests/endianness-test.c b/tests/endianness-test.c
index 58527952a5..2798802c63 100644
--- a/tests/endianness-test.c
+++ b/tests/endianness-test.c
@@ -35,7 +35,7 @@ static const TestCase test_cases[] = {
  { "mips64", "malta", 0x1000, .bswap = true },
  { "mips64el", "fulong2e", 0x1fd0 },
  { "ppc", "g3beige", 0xfe00, .bswap = true, .superio = "i82378" },
-{ "ppc", "prep", 0x8000, .bswap = true },
+{ "ppc", "40p", 0x8000, .bswap = true },
  { "ppc", "bamboo", 0xe800, .bswap = true, .superio = "i82378" },
  { "ppc64", "mac99", 0xf200, .bswap = true, .superio = "i82378" },
  { "ppc64", "pseries", (1ULL << 45), .bswap = true, .superio = "i82378" },






Re: [PATCH] monitor: Fix slow reading

2019-12-02 Thread Denis V. Lunev
On 12/2/19 11:49 PM, Markus Armbruster wrote:
> Yury Kotov  writes:
>
>> Hi!
>>
>> 29.11.2019, 11:22, "Markus Armbruster" :
>>> Yury Kotov  writes:
>>>
  The monitor_can_read (as a callback of qemu_chr_fe_set_handlers)
  should return the size of the buffer which monitor_qmp_read or
  monitor_read can process.
  Currently, monitor_can_read returns 1 as the result of a logical not.
  Thus, len(command) iterations of the main loop are required to
  handle a single QMP command.
  In fact, both of these functions can process any buffer size.
  So, return 1024 as a reasonable size: big enough to process most QMP
  commands in one pass, but not so big as to block the main loop for
  a long time.

  Signed-off-by: Yury Kotov 
  ---
   monitor/monitor.c | 9 ++++++++-
   1 file changed, 8 insertions(+), 1 deletion(-)

  diff --git a/monitor/monitor.c b/monitor/monitor.c
  index 12898b6448..cac3f39727 100644
  --- a/monitor/monitor.c
  +++ b/monitor/monitor.c
  @@ -50,6 +50,13 @@ typedef struct {
   int64_t rate; /* Minimum time (in ns) between two events */
   } MonitorQAPIEventConf;

  +/*
  + * The maximum buffer size which the monitor can process in one iteration
  + * of the main loop. We don't want to block the loop for a long time
  + * because of the JSON parser, so use a reasonable value.
  + */
  +#define MONITOR_READ_LEN_MAX 1024
  +
   /* Shared monitor I/O thread */
   IOThread *mon_iothread;

  @@ -498,7 +505,7 @@ int monitor_can_read(void *opaque)
   {
   Monitor *mon = opaque;

  - return !atomic_mb_read(&mon->suspend_cnt);
  + return atomic_mb_read(&mon->suspend_cnt) ? 0 : MONITOR_READ_LEN_MAX;
   }

   void monitor_list_append(Monitor *mon)
>>> Prior attempt:
>>> [PATCH 1/1] monitor: increase amount of data for monitor to read
>>> Message-Id: <1493732857-10721-1-git-send-email-...@openvz.org>
>>> https://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg00206.html
>>>
>>> Review concluded that it breaks HMP command migrate without -d. QMP is
>>> probably okay. Sadly, no v2.
>>>
>>> Next one:
>>> Subject: [PATCH] monitor: increase amount of data for monitor to read
>>> Message-Id: <20190610105906.28524-1-dplotni...@virtuozzo.com>
>>> https://lists.nongnu.org/archive/html/qemu-devel/2019-06/msg01912.html
>>>
>>> Same patch, with a second, suspicious-looking hunk thrown in. I didn't
>>> make the connection to the prior attempt back then. I wrote "I think I
>>> need to (re-)examine how QMP reads input, with special consideration to
>>> its OOB feature."
>>>
>>> This patch is a cleaner variation on the same theme. Its ramifications
>>> are as unobvious as ever.
>>>
>>> I figure the HMP situation is unchanged: not safe, although we could
>>> probably make it safe if we wanted to (Daniel sketched how). My simpler
>>> suggestion stands: separate f_can_read() callbacks for HMP and QMP
>>> [PATCH 1], then change only the one for QMP [PATCH 2].
>>>
>>> The QMP situation is also unchanged: we still need to think through how
>>> this affects reading of QMP input, in particular OOB.
>> I've read the discussion around patches:
>> "monitor: increase amount of data for monitor to read"
>> and realized the problem.
>>
>> It seems that my patch actually has some bugs with HMP and OOB
>> because of suspend/resume.
> For HMP we're sure, for OOB we don't know.
>
>> IIUC there are some approaches to fix them:
>>
>> 1) Input buffer
>>   1. Add input buffer for Monitor struct
>>   2. Handle commands from monitor_xxx_read callbacks one by one
>>   3. Schedule BH to handle remaining bytes in the buffer
>>
>> 2) Input buffer for suspend/resume
>>   1. Add input buffer for Monitor struct
>>   2. Handle multiple commands until monitor is not suspended
>>   3. If monitor suspended, put remaining data to the buffer
>>   4. Handle remaining data in the buffer when we get resume
>>
>> We use QEMU 2.12 which doesn't have the full support of OOB and for
>> which it's enough to fix the HMP case by separating can_read callbacks.
>> But those who use a newer version of QEMU can use the OOB feature to
>> improve HMP/QMP performance.
> OOB isn't for monitor performance, it's for monitor availability.
>
> QMP executes one command after the other.  While a command executes, the
> monitor is effectively unavailable.  This can be a problem.  OOB
> execution lets you execute a few special commands right away, without
> waiting for prior commands to complete.
>
>> So, I'm not sure there's a big sense in introducing some buffers.
> Reading byte-wise is pretty pathetic, but it works.  I'm not sure how
> much performance buffers can gain us, and whether it's worth the
> necessary review effort.  How QMP reads input is not trivial, thanks to
> OOB.
>
> Have you measured the improvement?
>
We have measured it in the past.

The effect is pretty visible in two cases:
1. 100+ idle VMs on host. CPU load drops by 

[PATCH] hw/ppc/prep: Remove the deprecated "prep" machine and the OpenHackware BIOS

2019-12-02 Thread Thomas Huth
It's been deprecated since QEMU v3.1. The 40p machine should be
used nowadays instead.

Signed-off-by: Thomas Huth 
---
 .gitmodules|   3 -
 MAINTAINERS|   1 -
 Makefile   |   2 +-
 docs/interop/firmware.json |   3 +-
 hw/ppc/ppc.c   |  18 --
 hw/ppc/prep.c  | 384 +
 include/hw/ppc/ppc.h   |   1 -
 pc-bios/README |   3 -
 pc-bios/ppc_rom.bin| Bin 1048576 -> 0 bytes
 qemu-deprecated.texi   |   6 -
 qemu-doc.texi  |  15 +-
 roms/openhackware  |   1 -
 tests/boot-order-test.c|  25 ---
 tests/cdrom-test.c |   2 +-
 tests/endianness-test.c|   2 +-
 15 files changed, 10 insertions(+), 456 deletions(-)
 delete mode 100644 pc-bios/ppc_rom.bin
 delete mode 16 roms/openhackware

diff --git a/.gitmodules b/.gitmodules
index 19792c9a11..9c0501a4d4 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -10,9 +10,6 @@
 [submodule "roms/openbios"]
path = roms/openbios
url = https://git.qemu.org/git/openbios.git
-[submodule "roms/openhackware"]
-   path = roms/openhackware
-   url = https://git.qemu.org/git/openhackware.git
 [submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
url = https://git.qemu.org/git/qemu-palcode.git
diff --git a/MAINTAINERS b/MAINTAINERS
index 5e5e3e52d6..9a9dee0fa4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1081,7 +1081,6 @@ F: hw/dma/i82374.c
 F: hw/rtc/m48t59-isa.c
 F: include/hw/isa/pc87312.h
 F: include/hw/rtc/m48t59.h
-F: pc-bios/ppc_rom.bin
 F: tests/acceptance/ppc_prep_40p.py
 
 sPAPR
diff --git a/Makefile b/Makefile
index b437a346d7..86476d1d9e 100644
--- a/Makefile
+++ b/Makefile
@@ -776,7 +776,7 @@ ifdef INSTALL_BLOBS
BLOBS=bios.bin bios-256k.bin bios-microvm.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
 vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin vgabios-virtio.bin \
 vgabios-ramfb.bin vgabios-bochs-display.bin vgabios-ati.bin \
-ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
+openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
 pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
 pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
 efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 8ffb7856d2..240f565397 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -27,8 +27,7 @@
 #
 # @openfirmware: The interface is defined by the (historical) IEEE
 #1275-1994 standard. Examples for firmware projects that
-#provide this interface are: OpenBIOS, OpenHackWare,
-#SLOF.
+#provide this interface are: OpenBIOS and SLOF.
 #
 # @uboot: Firmware interface defined by the U-Boot project.
 #
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 52a18eb7d7..13b6cf85c0 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -1476,24 +1476,6 @@ int ppc_dcr_init (CPUPPCState *env, int (*read_error)(int dcrn),
 }
 
 /*/
-/* Debug port */
-void PPC_debug_write (void *opaque, uint32_t addr, uint32_t val)
-{
-addr &= 0xF;
-switch (addr) {
-case 0:
-printf("%c", val);
-break;
-case 1:
-printf("\n");
-fflush(stdout);
-break;
-case 2:
-printf("Set loglevel to %04" PRIx32 "\n", val);
-qemu_set_log(val | 0x100);
-break;
-}
-}
 
 PowerPCCPU *ppc_get_vcpu_by_pir(int pir)
 {
diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 862345c2ac..111cc80867 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -42,7 +42,7 @@
 #include "hw/loader.h"
 #include "hw/rtc/mc146818rtc.h"
 #include "hw/isa/pc87312.h"
-#include "hw/net/ne2000-isa.h"
+#include "hw/qdev-properties.h"
 #include "sysemu/arch_init.h"
 #include "sysemu/kvm.h"
 #include "sysemu/qtest.h"
@@ -60,178 +60,9 @@
 
 #define CFG_ADDR 0xf510
 
-#define BIOS_SIZE (1 * MiB)
-#define BIOS_FILENAME "ppc_rom.bin"
 #define KERNEL_LOAD_ADDR 0x0100
 #define INITRD_LOAD_ADDR 0x0180
 
-/* Constants for devices init */
-static const int ide_iobase[2] = { 0x1f0, 0x170 };
-static const int ide_iobase2[2] = { 0x3f6, 0x376 };
-static const int ide_irq[2] = { 13, 13 };
-
-#define NE2000_NB_MAX 6
-
-static uint32_t ne2000_io[NE2000_NB_MAX] = { 0x300, 0x320, 0x340, 0x360, 0x280, 0x380 };
-static int ne2000_irq[NE2000_NB_MAX] = { 9, 10, 11, 3, 4, 5 };
-
-/* ISA IO ports bridge */
-#define PPC_IO_BASE 0x8000
-
-/* Fake super-io ports for PREP platform (Intel 82378ZB) */
-typedef struct sysctrl_t {
-qemu_irq reset_irq;
-Nvram *nvram;
-uint8_t state;
-uint8_t syscontrol;
-int contiguous_map;
-qemu_irq contiguous_map_irq;
-int endian;
-} sysctrl_t;
-
-enum {
-STATE_HARDFILE = 0x01,
-};
-
-static sysctrl_t *sysctrl;
-
-static void PREP_io_800_writeb (void 

Re: [PATCH 17/21] s390x: Fix latent query-cpu-model-FOO error handling bugs

2019-12-02 Thread Markus Armbruster
David Hildenbrand  writes:

> [...]
>
>> First search hit.  Here's my second one:
>> 
>> Q: What are latent bugs?
>> 
>> A: These bugs do not cause problems today. However, they are lurking
>> just waiting to reveal themselves later.  The Ariane 5 rocket
>> failure was caused by a float->int conversion error that lay dormant
>> when previous rockets were slower; but the faster Ariane 5 triggered
>> the problem.  The Millennium bug is another example of a latent bug
>> that came to light when circumstances changed.  Latent bugs are much
>> harder to test using conventional testing techniques, and finding
>> them requires someone with foresight to ask.
>> 
>> http://www.geekinterview.com/question_details/36689
>
> Google search "latent software BUG"

I think this argument has run its course.  Let's agree to differ on the
finer meaning and possible uses of "latent", and craft a mutually
agreeable commit message instead.  I'll reply to Cornelia's message.

[...]




Re: [PATCH 15/21] s390x/cpu_models: Fix latent feature property error handling bugs

2019-12-02 Thread Markus Armbruster
David Hildenbrand  writes:

> On 30.11.19 20:42, Markus Armbruster wrote:
>> s390x-cpu property setters set_feature() and set_feature_group() crash
>> when the visitor fails and its @errp argument is null.  Messed up in
>> commit 0754f60429 "s390x/cpumodel: expose features and feature groups
>> as properties".
>
> Same comment as to the other patches :)
>
> I think we usually use "s390x/cpumodels", but that's just nitpicking.

$ git-log --oneline target/s390x/cpu_models.c | awk '$2 ~ /:$/ { print $2 }' | sort | uniq -c
  1 S390:
  6 qapi:
  1 qemu-common:
  1 qmp:
  2 qobject:
  1 qom:
  1 s390/cpumodel:
  1 s390x/ccw:
 21 s390x/cpumodel:
  1 s390x/cpumodels:
  1 s390x/kvm:
  4 s390x/tcg:
  7 s390x:
  1 target/s390x/cpu_models:
 17 target/s390x:
  1 target:

You're right, except for the plural vs. singular.  I should've browsed
git-log.

> Reviewed-by: David Hildenbrand 

Thanks!




Re: [PATCH] docker: remove libcap development packages

2019-12-02 Thread Alex Bennée


Greg Kurz  writes:

> On Fri, 29 Nov 2019 16:08:01 +0100
> Paolo Bonzini  wrote:
>
>> Libcap was dropped from virtio-9p, so remove it from the dockerfiles as well.
>> 
>> Signed-off-by: Paolo Bonzini 
>> ---
>
> Similarly to what was discussed in these threads:
>
> 20191129111632.22840-2-pbonz...@redhat.com
>
> 20191129142126.32967-1-dgilb...@redhat.com
>
> I'm ok to take this one in my tree as well if I get an ack from Alex
> or Fam.

Acked-by: Alex Bennée 

>
>>  tests/docker/dockerfiles/fedora.docker | 1 -
>>  tests/docker/dockerfiles/ubuntu.docker | 1 -
>>  tests/docker/dockerfiles/ubuntu1804.docker | 1 -
>>  3 files changed, 3 deletions(-)
>> 
>> diff --git a/tests/docker/dockerfiles/fedora.docker 
>> b/tests/docker/dockerfiles/fedora.docker
>> index 4ddc7dd112..47732fc5d5 100644
>> --- a/tests/docker/dockerfiles/fedora.docker
>> +++ b/tests/docker/dockerfiles/fedora.docker
>> @@ -25,7 +25,6 @@ ENV PACKAGES \
>>  libasan \
>>  libattr-devel \
>>  libblockdev-mpath-devel \
>> -libcap-devel \
>>  libcap-ng-devel \
>>  libcurl-devel \
>>  libfdt-devel \
>> diff --git a/tests/docker/dockerfiles/ubuntu.docker 
>> b/tests/docker/dockerfiles/ubuntu.docker
>> index f486492224..ecea155646 100644
>> --- a/tests/docker/dockerfiles/ubuntu.docker
>> +++ b/tests/docker/dockerfiles/ubuntu.docker
>> @@ -23,7 +23,6 @@ ENV PACKAGES flex bison \
>>  libbrlapi-dev \
>>  libbz2-dev \
>>  libcacard-dev \
>> -libcap-dev \
>>  libcap-ng-dev \
>>  libcurl4-gnutls-dev \
>>  libdrm-dev \
>> diff --git a/tests/docker/dockerfiles/ubuntu1804.docker 
>> b/tests/docker/dockerfiles/ubuntu1804.docker
>> index 3cc4f492c4..32a607471a 100644
>> --- a/tests/docker/dockerfiles/ubuntu1804.docker
>> +++ b/tests/docker/dockerfiles/ubuntu1804.docker
>> @@ -12,7 +12,6 @@ ENV PACKAGES flex bison \
>>  libbrlapi-dev \
>>  libbz2-dev \
>>  libcacard-dev \
>> -libcap-dev \
>>  libcap-ng-dev \
>>  libcurl4-gnutls-dev \
>>  libdrm-dev \


-- 
Alex Bennée



Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Markus Armbruster
"Michael S. Tsirkin"  writes:

> On Tue, Dec 03, 2019 at 07:00:53AM +0100, Markus Armbruster wrote:
>> "Michael S. Tsirkin"  writes:
>> 
>> > On Tue, Dec 03, 2019 at 08:53:30AM +0800, Tao Xu wrote:
>> >> Hi Michael,
>> >> 
>> >> Could this patch series be queued?
>> >> Thank you very much!
>> >> 
>> >> Tao
>> >
>> > QEMU is in freeze, so not yet. Please ping after the release.
>> 
>> Just to avoid confusion: it's Michael's personal preference not to
>> process patches for the next version during freeze.  Other maintainers
>> do, and that's actually the project's policy:
>> 
>> Subject: QEMU Summit 2017: minutes
>> Message-ID: 
>> 
>> https://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg04453.html
>> 
>> qemu-next:
>>  * Problem 1: Contributors cannot get patches merged during freeze
>>(bad experience)
>>  [...]
>>  * Markus Armbruster: Problem 1 is solved if maintainers keep their own
>>-next trees
>>  * Paolo Bonzini: Maintaining -next could slow down or create work for
>>-freeze (e.g. who does backports)
>>  * Action: Maintainers mustn't tell submitters to go away just because
>>we're in a release freeze (it's up to them whether they prefer to
>>maintain a "-next" tree for their subsystem with patches queued for
>>the following release, or track which patches they've accepted
>>some other way)
>>  * We're not going to have an official project-wide "-next" tree, though
>> 
>> Michael, would queuing up patches in a -next branch really be too much
>> trouble for you?
>
> Thanks for pointing this out!
>
> I stopped asking for re-posts a while ago.  I don't queue patches in
> a public tree but I do review and do keep track of pending patches.
>
> I tend to ask contributors to also ping because sometimes there's a
> problem with rebase, I drop the patch but forget to tell the
> contributor, and it tends to happen more with big patchsets posted during
> freeze as there's a rush to merge changes right after that.
> I usually don't bother people with this for small patches though.
>
> I'll try to be clearer in my communication so contributors don't feel
> stressed.
>
> Would something like:
>
> "I'll queue it for merge after the release. If possible please ping me
> after the release to help make sure it didn't get dropped."
>
> be clearer?

Yes, that's both clearer and friendlier.  Thank you!

> Hopefully windows CI efforts will soon bear fruit to the point where
> they stress PCI enough to make maintaining next worth the effort.

CI++ :)




Re: [RFC][PATCH 0/3] IVSHMEM version 2 device for QEMU

2019-12-02 Thread Jan Kiszka
On 03.12.19 06:53, Liang Yan wrote:
> 
> On 12/2/19 1:16 AM, Jan Kiszka wrote:
>> On 27.11.19 18:19, Jan Kiszka wrote:
>>> Hi Liang,
>>>
>>> On 27.11.19 16:28, Liang Yan wrote:


 On 11/11/19 7:57 AM, Jan Kiszka wrote:
> To get the ball rolling after my presentation of the topic at KVM Forum
> [1] and many fruitful discussions around it, this is a first concrete
> code series. As discussed, I'm starting with the IVSHMEM implementation
> of a QEMU device and server. It's RFC because, besides specification
> and
> implementation details, there will still be some decisions needed about
> how to integrate the new version best into the existing code bases.
>
> If you want to play with this, the basic setup of the shared memory
> device is described in patch 1 and 3. UIO driver and also the
> virtio-ivshmem prototype can be found at
>
> 
> http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2
>
>
> Accessing the device via UIO is trivial enough. If you want to use it
> for virtio, this is additionally to the description in patch 3
> needed on
> the virtio console backend side:
>
>  modprobe uio_ivshmem
>  echo "1af4 1110 1af4 1100 ffc003 ff" >
> /sys/bus/pci/drivers/uio_ivshmem/new_id
>  linux/tools/virtio/virtio-ivshmem-console /dev/uio0
>
> And for virtio block:
>
>  echo "1af4 1110 1af4 1100 ffc002 ff" >
> /sys/bus/pci/drivers/uio_ivshmem/new_id
>  linux/tools/virtio/virtio-ivshmem-console /dev/uio0
> /path/to/disk.img
>
> After that, you can start the QEMU frontend instance with the
> virtio-ivshmem driver installed which can use the new /dev/hvc* or
> /dev/vda* as usual.
>
> Any feedback welcome!

 Hi, Jan,

 I have been playing your code for last few weeks, mostly study and test,
 of course. Really nice work. I have a few questions here:

 First, the qemu part looks good. I tried tests between a couple of VMs,
 and the device could pop up correctly for all of them, but I had some
 problems when trying to load the drivers. For example, if I set up two
 VMs, vm1 and vm2, and start the ivshmem server as you suggested, vm1
 could load uio_ivshmem and virtio_ivshmem correctly, while vm2 could
 load uio_ivshmem but "/dev/uio0" does not show up, and virtio_ivshmem
 could not be loaded at all. These issues persist even if I switch the
 load sequence of vm1 and vm2, and sometimes resetting "virtio_ivshmem"
 could crash both vm1 and vm2. Not quite sure whether this is a bug or
 an "Ivshmem Mode" issue, but I went through the ivshmem-server code
 and did not find related information.
>>>
>>> If we are only talking about one ivshmem link and vm1 is the master,
>>> there is not role for virtio_ivshmem on it as backend. That is purely
>>> a frontend driver. Vice versa for vm2: If you want to use its ivshmem
>>> instance as virtio frontend, uio_ivshmem plays no role.
>>>
> Hi, Jan,
> 
> Sorry for the late response. Just came back from Thanksgiving holiday.
> 
> I have a few questions here.
> First, how to decide master/slave node? I used two VMs here, and they
> did not show the same behavior even when I changed the boot sequence.

The current mechanism works by selecting the VM that gets ID 0 as the
backend, thus also sending it a different protocol ID than the frontend
gets. This could possibly be improved by allowing selection also on the
VM side (QEMU command line parameter etc.).

Inherently, this only affects virtio over ivshmem. Other, symmetric
protocols do not need this differentiation.

> 
> Second, in order to run the virtio-ivshmem-console demo, VM1 connects
> to VM2's console. So I need to install the virtio frontend driver in
> VM2, then install the uio frontend driver in VM1 to get "/dev/uio0",
> then run the demo, right? Could you share your procedure?
> 
> Also, I could not get "/dev/uio0" all the time.

OK, should have collected this earlier. This is how I start the console
demo right now:

- ivshmem2-server -F -l 64K -n 2 -V 3 -P 0x8003
- start backend qemu with something like
  "-chardev socket,path=/tmp/ivshmem_socket,id=ivshm
  -device ivshmem,chardev=ivshm" in its command line
- inside that VM
   - modprobe uio_ivshmem
   - echo "110a 4106 1af4 1100 ffc003 ff" > \
 /sys/bus/pci/drivers/uio_ivshmem/new_id
   - virtio-ivshmem-console /dev/uio0
- start frontend qemu (can be identical options)

Now the frontend VM should see the ivshmem-virtio transport device and
attach a virtio console driver to it (/dev/hvc0). If you build the
transport into the kernel, you can even do "console=hvc0".

> 
> 
>>> The "crash" is would be interesting to understand: Do you see kernel
>>> panics of the guests? Or are they stuck? Or are the QEMU instances
>>> stuck? Do you know that you can debug the guest kernels via gdb (and
>>> gdb-scripts of the kernel)?
>>>
> 
> They are stuck, no kernel panics, It's like during console connection, I
> try to 

Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Tao Xu

On 12/3/2019 2:25 PM, Michael S. Tsirkin wrote:

On Tue, Dec 03, 2019 at 07:00:53AM +0100, Markus Armbruster wrote:

"Michael S. Tsirkin"  writes:


On Tue, Dec 03, 2019 at 08:53:30AM +0800, Tao Xu wrote:

Hi Michael,

Could this patch series be queued?
Thank you very much!

Tao


QEMU is in freeze, so not yet. Please ping after the release.


Just to avoid confusion: it's Michael's personal preference not to
process patches for the next version during freeze.  Other maintainers
do, and that's actually the project's policy:

Subject: QEMU Summit 2017: minutes
Message-ID: 
https://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg04453.html

 qemu-next:
  * Problem 1: Contributors cannot get patches merged during freeze
(bad experience)
  [...]
  * Markus Armbruster: Problem 1 is solved if maintainers keep their own
-next trees
  * Paolo Bonzini: Maintaining -next could slow down or create work for
-freeze (e.g. who does backports)
  * Action: Maintainers mustn't tell submitters to go away just because
we're in a release freeze (it's up to them whether they prefer to
maintain a "-next" tree for their subsystem with patches queued for
the following release, or track which patches they've accepted
some other way)
  * We're not going to have an official project-wide "-next" tree, though

Michael, would queuing up patches in a -next branch really be too much
trouble for you?


Thanks for pointing this out!

I stopped asking for re-posts a while ago.  I don't queue patches in
a public tree but I do review and do keep track of pending patches.

I tend to ask contributors to also ping because sometimes there's a
problem with rebase, I drop the patch but forget to tell the
contributor, and it tends to happen more with big patchsets posted during
freeze as there's a rush to merge changes right after that.
I usually don't bother people with this for small patches though.

I'll try to be clearer in my communication so contributors don't feel
stressed.

Would something like:

"I'll queue it for merge after the release. If possible please ping me
after the release to help make sure it didn't get dropped."

be clearer?

Hopefully windows CI efforts will soon bear fruit to the point where
they stress PCI enough to make maintaining next worth the effort.

I see. Thanks for Markus and Michael's kind responses. I feel happy 
rather than stressed in the QEMU community :)




Re: [PATCH v4 39/40] target/arm: Use bool for unmasked in arm_excp_unmasked

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 3:29 AM, Richard Henderson wrote:

The value computed is fully boolean; using int8_t is odd.

Signed-off-by: Richard Henderson 


Reviewed-by: Philippe Mathieu-Daudé 


---
  target/arm/cpu.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7a1177b883..a366448c6d 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -417,7 +417,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  {
  CPUARMState *env = cs->env_ptr;
  bool pstate_unmasked;
-int8_t unmasked = 0;
+bool unmasked = false;
  
  /*

   * Don't take exceptions if they target a lower EL.
@@ -468,7 +468,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
   * don't affect the masking logic, only the interrupt routing.
   */
  if (target_el == 3 || !secure) {
-unmasked = 1;
+unmasked = true;
  }
  } else {
  /*
@@ -514,7 +514,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  }
  
  if ((scr || hcr) && !secure) {

-unmasked = 1;
+unmasked = true;
  }
  }
  }






Re: [PATCH v4 38/40] target/arm: Pass more cpu state to arm_excp_unmasked

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 3:29 AM, Richard Henderson wrote:

Avoid redundant computation of cpu state by passing it in
from the caller, which has already computed it for itself.

Signed-off-by: Richard Henderson 


Reviewed-by: Philippe Mathieu-Daudé 


---
  target/arm/cpu.c | 22 --
  1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index a36344d4c7..7a1177b883 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -411,14 +411,13 @@ static void arm_cpu_reset(CPUState *s)
  }
  
  static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,

- unsigned int target_el)
+ unsigned int target_el,
+ unsigned int cur_el, bool secure,
+ uint64_t hcr_el2)
  {
  CPUARMState *env = cs->env_ptr;
-unsigned int cur_el = arm_current_el(env);
-bool secure = arm_is_secure(env);
  bool pstate_unmasked;
  int8_t unmasked = 0;
-uint64_t hcr_el2;
  
  /*

   * Don't take exceptions if they target a lower EL.
@@ -429,8 +428,6 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned 
int excp_idx,
  return false;
  }
  
-hcr_el2 = arm_hcr_el2_eff(env);

-
  switch (excp_idx) {
  case EXCP_FIQ:
  pstate_unmasked = !(env->daif & PSTATE_F);
@@ -535,6 +532,7 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
  CPUARMState *env = cs->env_ptr;
  uint32_t cur_el = arm_current_el(env);
  bool secure = arm_is_secure(env);
+uint64_t hcr_el2 = arm_hcr_el2_eff(env);
  uint32_t target_el;
  uint32_t excp_idx;
  bool ret = false;
@@ -542,7 +540,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
  if (interrupt_request & CPU_INTERRUPT_FIQ) {
  excp_idx = EXCP_FIQ;
  target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
  cs->exception_index = excp_idx;
  env->exception.target_el = target_el;
  cc->do_interrupt(cs);
@@ -552,7 +551,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
  if (interrupt_request & CPU_INTERRUPT_HARD) {
  excp_idx = EXCP_IRQ;
  target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
  cs->exception_index = excp_idx;
  env->exception.target_el = target_el;
  cc->do_interrupt(cs);
@@ -562,7 +562,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
  if (interrupt_request & CPU_INTERRUPT_VIRQ) {
  excp_idx = EXCP_VIRQ;
  target_el = 1;
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
  cs->exception_index = excp_idx;
  env->exception.target_el = target_el;
  cc->do_interrupt(cs);
@@ -572,7 +573,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
  if (interrupt_request & CPU_INTERRUPT_VFIQ) {
  excp_idx = EXCP_VFIQ;
  target_el = 1;
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
  cs->exception_index = excp_idx;
  env->exception.target_el = target_el;
  cc->do_interrupt(cs);






Re: [PATCH v4 37/40] target/arm: Move arm_excp_unmasked to cpu.c

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 3:29 AM, Richard Henderson wrote:

This inline function has one user in cpu.c, and need not be exposed
otherwise.  Code movement only, with fixups for checkpatch.

Signed-off-by: Richard Henderson 


Reviewed-by: Philippe Mathieu-Daudé 


---
  target/arm/cpu.h | 111 ---
  target/arm/cpu.c | 119 +++
  2 files changed, 119 insertions(+), 111 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 8e5aaaf415..22935e4433 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2673,117 +2673,6 @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
  #define ARM_CPUID_TI915T  0x54029152
  #define ARM_CPUID_TI925T  0x54029252
  
-static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,

- unsigned int target_el)
-{
-CPUARMState *env = cs->env_ptr;
-unsigned int cur_el = arm_current_el(env);
-bool secure = arm_is_secure(env);
-bool pstate_unmasked;
-int8_t unmasked = 0;
-uint64_t hcr_el2;
-
-/* Don't take exceptions if they target a lower EL.
- * This check should catch any exceptions that would not be taken but left
- * pending.
- */
-if (cur_el > target_el) {
-return false;
-}
-
-hcr_el2 = arm_hcr_el2_eff(env);
-
-switch (excp_idx) {
-case EXCP_FIQ:
-pstate_unmasked = !(env->daif & PSTATE_F);
-break;
-
-case EXCP_IRQ:
-pstate_unmasked = !(env->daif & PSTATE_I);
-break;
-
-case EXCP_VFIQ:
-if (secure || !(hcr_el2 & HCR_FMO) || (hcr_el2 & HCR_TGE)) {
-/* VFIQs are only taken when hypervized and non-secure.  */
-return false;
-}
-return !(env->daif & PSTATE_F);
-case EXCP_VIRQ:
-if (secure || !(hcr_el2 & HCR_IMO) || (hcr_el2 & HCR_TGE)) {
-/* VIRQs are only taken when hypervized and non-secure.  */
-return false;
-}
-return !(env->daif & PSTATE_I);
-default:
-g_assert_not_reached();
-}
-
-/* Use the target EL, current execution state and SCR/HCR settings to
- * determine whether the corresponding CPSR bit is used to mask the
- * interrupt.
- */
-if ((target_el > cur_el) && (target_el != 1)) {
-/* Exceptions targeting a higher EL may not be maskable */
-if (arm_feature(env, ARM_FEATURE_AARCH64)) {
-/* 64-bit masking rules are simple: exceptions to EL3
- * can't be masked, and exceptions to EL2 can only be
- * masked from Secure state. The HCR and SCR settings
- * don't affect the masking logic, only the interrupt routing.
- */
-if (target_el == 3 || !secure) {
-unmasked = 1;
-}
-} else {
-/* The old 32-bit-only environment has a more complicated
- * masking setup. HCR and SCR bits not only affect interrupt
- * routing but also change the behaviour of masking.
- */
-bool hcr, scr;
-
-switch (excp_idx) {
-case EXCP_FIQ:
-/* If FIQs are routed to EL3 or EL2 then there are cases where
- * we override the CPSR.F in determining if the exception is
- * masked or not. If neither of these are set then we fall back
- * to the CPSR.F setting otherwise we further assess the state
- * below.
- */
-hcr = hcr_el2 & HCR_FMO;
-scr = (env->cp15.scr_el3 & SCR_FIQ);
-
-/* When EL3 is 32-bit, the SCR.FW bit controls whether the
- * CPSR.F bit masks FIQ interrupts when taken in non-secure
- * state. If SCR.FW is set then FIQs can be masked by CPSR.F
- * when non-secure but only when FIQs are only routed to EL3.
- */
-scr = scr && !((env->cp15.scr_el3 & SCR_FW) && !hcr);
-break;
-case EXCP_IRQ:
-/* When EL3 execution state is 32-bit, if HCR.IMO is set then
- * we may override the CPSR.I masking when in non-secure state.
- * The SCR.IRQ setting has already been taken into 
consideration
- * when setting the target EL, so it does not have a further
- * affect here.
- */
-hcr = hcr_el2 & HCR_IMO;
-scr = false;
-break;
-default:
-g_assert_not_reached();
-}
-
-if ((scr || hcr) && !secure) {
-unmasked = 1;
-}
-}
-}
-
-/* The PSTATE bits only mask the interrupt if we have not overriden the
- * ability above.
- */
-return unmasked || pstate_unmasked;
-}
-
  #define ARM_CPU_TYPE_SUFFIX "-" TYPE_ARM_CPU
  #define ARM_CPU_TYPE_NAME(name) 

Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Michael S. Tsirkin
On Tue, Dec 03, 2019 at 07:00:53AM +0100, Markus Armbruster wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Tue, Dec 03, 2019 at 08:53:30AM +0800, Tao Xu wrote:
> >> Hi Michael,
> >> 
> >> Could this patch series be queued?
> >> Thank you very much!
> >> 
> >> Tao
> >
> > QEMU is in freeze, so not yet. Please ping after the release.
> 
> Just to avoid confusion: it's Michael's personal preference not to
> process patches for the next version during freeze.  Other maintainers
> do, and that's actually the project's policy:
> 
> Subject: QEMU Summit 2017: minutes
> Message-ID: 
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg04453.html
> 
> qemu-next:
>  * Problem 1: Contributors cannot get patches merged during freeze
>(bad experience)
>  [...]
>  * Markus Armbruster: Problem 1 is solved if maintainers keep their own
>-next trees
>  * Paolo Bonzini: Maintaining -next could slow down or create work for
>-freeze (e.g. who does backports)
>  * Action: Maintainers mustn't tell submitters to go away just because
>we're in a release freeze (it's up to them whether they prefer to
>maintain a "-next" tree for their subsystem with patches queued for
>the following release, or track which patches they've accepted
>some other way)
>  * We're not going to have an official project-wide "-next" tree, though
> 
> Michael, would queuing up patches in a -next branch really be too much
> trouble for you?

Thanks for pointing this out!

I stopped asking for re-posts a while ago.  I don't queue patches in
a public tree but I do review and do keep track of pending patches.

I tend to ask contributors to also ping because sometimes there's a
problem with rebase, I drop the patch but forget to tell the
contributor, and it tends to happen more with big patchsets posted during
freeze as there's a rush to merge changes right after that.
I usually don't bother people with this for small patches though.

I'll try to be clearer in my communication so contributors don't feel
stressed.

Would something like:

"I'll queue it for merge after the release. If possible please ping me
after the release to help make sure it didn't get dropped."

be clearer?

Hopefully Windows CI efforts will soon bear fruit to the point where
they stress PCI enough to make maintaining -next worth the effort.

-- 
MST




Re: [PATCH v4 17/40] target/arm: Tidy ARMMMUIdx m-profile definitions

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 3:29 AM, Richard Henderson wrote:

Replace the magic numbers with the relevant ARM_MMU_IDX_M_* constants.
Keep the definitions short by referencing previous symbols.


Nice trick :)

Reviewed-by: Philippe Mathieu-Daudé 



Signed-off-by: Richard Henderson 
---
  target/arm/cpu.h | 16 
  1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 6ba5126852..015301e93a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2871,14 +2871,14 @@ typedef enum ARMMMUIdx {
  ARMMMUIdx_SE0 = 4 | ARM_MMU_IDX_A,
  ARMMMUIdx_SE1 = 5 | ARM_MMU_IDX_A,
  ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
-ARMMMUIdx_MUser = 0 | ARM_MMU_IDX_M,
-ARMMMUIdx_MPriv = 1 | ARM_MMU_IDX_M,
-ARMMMUIdx_MUserNegPri = 2 | ARM_MMU_IDX_M,
-ARMMMUIdx_MPrivNegPri = 3 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSUser = 4 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSPriv = 5 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSUserNegPri = 6 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSPrivNegPri = 7 | ARM_MMU_IDX_M,
+ARMMMUIdx_MUser = ARM_MMU_IDX_M,
+ARMMMUIdx_MPriv = ARM_MMU_IDX_M | ARM_MMU_IDX_M_PRIV,
+ARMMMUIdx_MUserNegPri = ARMMMUIdx_MUser | ARM_MMU_IDX_M_NEGPRI,
+ARMMMUIdx_MPrivNegPri = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_NEGPRI,
+ARMMMUIdx_MSUser = ARMMMUIdx_MUser | ARM_MMU_IDX_M_S,
+ARMMMUIdx_MSPriv = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_S,
+ARMMMUIdx_MSUserNegPri = ARMMMUIdx_MUserNegPri | ARM_MMU_IDX_M_S,
+ARMMMUIdx_MSPrivNegPri = ARMMMUIdx_MPrivNegPri | ARM_MMU_IDX_M_S,
  /* Indexes below here don't have TLBs and are used only for AT system
   * instructions or for the first stage of an S12 page table walk.
   */






Re: [PATCH v4 06/40] target/arm: Split out vae1_tlbmask, vmalle1_tlbmask

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 3:29 AM, Richard Henderson wrote:

No functional change, but unify code sequences.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 


Easier to review in 2 patches: vae1_tlbmask first, then vmalle1_tlbmask.

If you need to respin, the 2 patches are welcome. Regardless:
Reviewed-by: Philippe Mathieu-Daudé 


---
  target/arm/helper.c | 118 ++--
  1 file changed, 37 insertions(+), 81 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 731507a82f..0b0130d814 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3890,70 +3890,61 @@ static CPAccessResult aa64_cacheop_access(CPUARMState 
*env,
   * Page D4-1736 (DDI0487A.b)
   */
  
+static int vae1_tlbmask(CPUARMState *env)

+{
+if (arm_is_secure_below_el3(env)) {
+return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+} else {
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+}
+}
+
  static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo 
*ri,
uint64_t value)
  {
  CPUState *cs = env_cpu(env);
-bool sec = arm_is_secure_below_el3(env);
+int mask = vae1_tlbmask(env);
  
-if (sec) {

-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
-}
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
  }
  
  static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,

  uint64_t value)
  {
  CPUState *cs = env_cpu(env);
+int mask = vae1_tlbmask(env);
  
  if (tlb_force_broadcast(env)) {

  tlbi_aa64_vmalle1is_write(env, NULL, value);
  return;
  }
  
+tlb_flush_by_mmuidx(cs, mask);

+}
+
+static int vmalle1_tlbmask(CPUARMState *env)
+{
+/*
+ * Note that the 'ALL' scope must invalidate both stage 1 and
+ * stage 2 translations, whereas most other scopes only invalidate
+ * stage 1 translations.
+ */
  if (arm_is_secure_below_el3(env)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
+return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+} else if (arm_feature(env, ARM_FEATURE_EL2)) {
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0 | ARMMMUIdxBit_S2NS;
  } else {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
  }
  }
  
  static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,

uint64_t value)
  {
-/* Note that the 'ALL' scope must invalidate both stage 1 and
- * stage 2 translations, whereas most other scopes only invalidate
- * stage 1 translations.
- */
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vmalle1_tlbmask(env);
  
-if (arm_is_secure_below_el3(env)) {

-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else {
-if (arm_feature(env, ARM_FEATURE_EL2)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
-ARMMMUIdxBit_S2NS);
-} else {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
-}
-}
+tlb_flush_by_mmuidx(cs, mask);
  }
  
  static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,

@@ -3977,28 +3968,10 @@ static void tlbi_aa64_alle3_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
  static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
  uint64_t value)
  {
-/* Note that the 'ALL' scope must invalidate both stage 1 and
- * stage 2 translations, whereas most other scopes only invalidate
- * stage 1 translations.
- */
  CPUState *cs = env_cpu(env);
-bool sec = arm_is_secure_below_el3(env);
-bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
+int mask = vmalle1_tlbmask(env);
  
-if (sec) {

-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else if (has_el2) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
- 

Re: [PATCH v2 4/4] ast2600: Configure CNTFRQ at 1125MHz

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 5:14 AM, Andrew Jeffery wrote:

This matches the configuration set by u-boot on the AST2600.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
Reviewed-by: Cédric Le Goater 
---
  hw/arm/aspeed_ast2600.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c
index 931887ac681f..5aecc3b3caec 100644
--- a/hw/arm/aspeed_ast2600.c
+++ b/hw/arm/aspeed_ast2600.c
@@ -259,6 +259,9 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, 
Error **errp)
object_property_set_int(OBJECT(&s->cpu[i]), aspeed_calc_affinity(i),
"mp-affinity", &error_abort);
  
+object_property_set_int(OBJECT(&s->cpu[i]), 1125000000, "cntfrq",
+&error_abort);
+
  /*
   * TODO: the secondary CPUs are started and a boot helper
   * is needed when using -kernel



Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v2 3/4] target/arm: Prepare generic timer for per-platform CNTFRQ

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 5:14 AM, Andrew Jeffery wrote:

The ASPEED AST2600 clocks the generic timer at the rate of HPLL. On
recent firmwares this is at 1125MHz, which is considerably quicker than
the assumed 62.5MHz of the current generic timer implementation. The
delta between the value as read from CNTFRQ and the true rate of the
underlying QEMUTimer leads to sticky behaviour in AST2600 guests.

Add a feature-gated property exposing CNTFRQ for ARM CPUs providing the
generic timer. This allows platforms to configure CNTFRQ (and the
associated QEMUTimer) to the appropriate frequency prior to starting the
guest.

As the platform can now determine the rate of CNTFRQ we're exposed to
limitations of QEMUTimer that didn't previously materialise: In the
course of emulation we need to arbitrarily and accurately convert
between guest ticks and time, but we're constrained by QEMUTimer's use
of an integer scaling factor. The effect is QEMUTimer cannot exactly
capture the period of frequencies that do not cleanly divide
NANOSECONDS_PER_SECOND for scaling ticks to time. As such, provide an
equally inaccurate scaling factor for scaling time to ticks so at least
a self-consistent inverse relationship holds.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
---
  target/arm/cpu.c| 43 +--
  target/arm/cpu.h| 18 ++
  target/arm/helper.c |  9 -
  3 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5698a74061bb..f186019a77fd 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -974,10 +974,12 @@ static void arm_cpu_initfn(Object *obj)
  if (tcg_enabled()) {
  cpu->psci_version = 2; /* TCG implements PSCI 0.2 */
  }
-
-cpu->gt_cntfrq = NANOSECONDS_PER_SECOND / GTIMER_SCALE;
  }
  
+static Property arm_cpu_gt_cntfrq_property =

+DEFINE_PROP_UINT64("cntfrq", ARMCPU, gt_cntfrq,
+   NANOSECONDS_PER_SECOND / GTIMER_SCALE);
+
  static Property arm_cpu_reset_cbar_property =
  DEFINE_PROP_UINT64("reset-cbar", ARMCPU, reset_cbar, 0);
  
@@ -1174,6 +1176,11 @@ void arm_cpu_post_init(Object *obj)
  
qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
 &error_abort);
+
+if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
+qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property,
+ &error_abort);
+}
  }
  
  static void arm_cpu_finalizefn(Object *obj)

@@ -1253,14 +1260,30 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
  }
  }
  
-cpu->gt_timer[GTIMER_PHYS] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,

-   arm_gt_ptimer_cb, cpu);
-cpu->gt_timer[GTIMER_VIRT] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-   arm_gt_vtimer_cb, cpu);
-cpu->gt_timer[GTIMER_HYP] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-  arm_gt_htimer_cb, cpu);
-cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-  arm_gt_stimer_cb, cpu);
+
+{
+uint64_t scale;


Apparently you have to use this odd indent due to the '#ifndef 
CONFIG_USER_ONLY'. Well, acceptable.



+
+if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
+if (!cpu->gt_cntfrq) {
+error_setg(errp, "Invalid CNTFRQ: %"PRId64"Hz",
+   cpu->gt_cntfrq);
+return;
+}
+scale = gt_cntfrq_period_ns(cpu);
+} else {
+scale = GTIMER_SCALE;
+}
+
+cpu->gt_timer[GTIMER_PHYS] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+   arm_gt_ptimer_cb, cpu);
+cpu->gt_timer[GTIMER_VIRT] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+   arm_gt_vtimer_cb, cpu);
+cpu->gt_timer[GTIMER_HYP] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+  arm_gt_htimer_cb, cpu);
+cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+  arm_gt_stimer_cb, cpu);
+}
  #endif
  
cpu_exec_realizefn(cs, &local_err);

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 666c03871fdf..0bcd13dcac81 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -939,6 +939,24 @@ struct ARMCPU {
  
  static inline unsigned int gt_cntfrq_period_ns(ARMCPU *cpu)

  {
+/*
+ * The exact approach to calculating guest ticks is:
+ *
+ * muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL), cpu->gt_cntfrq,
+ *  NANOSECONDS_PER_SECOND);
+ *
+ * We don't do that. Rather we intentionally use integer division
+ * truncation below and in the caller for the conversion of host monotonic
+ * time to 

Re: [PATCH] virtio-serial-bus: fix memory leak while attach virtio-serial-bus

2019-12-02 Thread pannengyuan



On 2019/12/3 13:37, Michael S. Tsirkin wrote:
> On Tue, Dec 03, 2019 at 08:53:42AM +0800, pannengyuan wrote:
>>
>>
>> On 2019/12/2 21:58, Laurent Vivier wrote:
>>> On 02/12/2019 12:15, pannengy...@huawei.com wrote:
 From: PanNengyuan 

 ivqs/ovqs/c_ivq/c_ovq are not cleaned up in
 virtio_serial_device_unrealize; the memory leak stack is as below:

 Direct leak of 1290240 byte(s) in 180 object(s) allocated from:
 #0 0x7fc9bfc27560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
 #1 0x7fc9bed6f015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
 #2 0x5650e02b83e7 in virtio_add_queue 
 /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:2327
 #3 0x5650e02847b5 in virtio_serial_device_realize 
 /mnt/sdb/qemu-4.2.0-rc0/hw/char/virtio-serial-bus.c:1089
 #4 0x5650e02b56a7 in virtio_device_realize 
 /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:3504
 #5 0x5650e03bf031 in device_set_realized 
 /mnt/sdb/qemu-4.2.0-rc0/hw/core/qdev.c:876
 #6 0x5650e0531efd in property_set_bool 
 /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:2080
 #7 0x5650e053650e in object_property_set_qobject 
 /mnt/sdb/qemu-4.2.0-rc0/qom/qom-qobject.c:26
 #8 0x5650e0533e14 in object_property_set_bool 
 /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:1338
 #9 0x5650e04c0e37 in virtio_pci_realize 
 /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio-pci.c:1801

 Reported-by: Euler Robot 
 Signed-off-by: PanNengyuan 
 ---
  hw/char/virtio-serial-bus.c | 6 ++
  1 file changed, 6 insertions(+)

 diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
 index 3325904..da9019a 100644
 --- a/hw/char/virtio-serial-bus.c
 +++ b/hw/char/virtio-serial-bus.c
 @@ -1126,9 +1126,15 @@ static void 
 virtio_serial_device_unrealize(DeviceState *dev, Error **errp)
  {
  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
  VirtIOSerial *vser = VIRTIO_SERIAL(dev);
 +int i;
  
  QLIST_REMOVE(vser, next);
  
 +for (i = 0; i <= vser->bus.max_nr_ports; i++) {
 +virtio_del_queue(vdev, 2 * i);
 +virtio_del_queue(vdev, 2 * i + 1);
 +}
 +
>>>
>>> According to virtio_serial_device_realize() and the number of
>>> virtio_add_queue(), I think you have more queues to delete:
>>>
>>>   4 + 2 * vser->bus.max_nr_ports
>>>
>>> (for vser->ivqs[0], vser->ovqs[0], vser->c_ivq, vser->c_ovq,
>>> vser->ivqs[i], vser->ovqs[i]).
>>>
>>> Thanks,
>>> Laurent
>>>
>>>
>> Thanks, but I think the queue count is correct; the queues in
>> virtio_serial_device_realize are as follows:
>>
>> // here is 2
>> vser->ivqs[0] = virtio_add_queue(vdev, 128, handle_input);
>> vser->ovqs[0] = virtio_add_queue(vdev, 128, handle_output);
>>
>> // here is 2
>> vser->c_ivq = virtio_add_queue(vdev, 32, control_in);
>> vser->c_ovq = virtio_add_queue(vdev, 32, control_out);
>>
>> // here 2 * (max_nr_ports - 1)  - i is from 1 to max_nr_ports - 1
>> for (i = 1; i < vser->bus.max_nr_ports; i++) {
>> vser->ivqs[i] = virtio_add_queue(vdev, 128, handle_input);
>> vser->ovqs[i] = virtio_add_queue(vdev, 128, handle_output);
>> }
>>
>> so the total number of queues is: 2 * (vser->bus.max_nr_ports + 1)
> 
> Rather than worry about this, I posted a patch adding virtio_delete_queue.
> How about reusing that, and just using ivqs/ovqs pointers?
> 
Ok, I will reuse it in next version.

Thanks.




Re: [PATCH] virtio-balloon: fix memory leak while attach virtio-balloon device

2019-12-02 Thread pannengyuan



On 2019/12/3 13:34, Michael S. Tsirkin wrote:
> On Tue, Dec 03, 2019 at 09:44:19AM +0800, pannengy...@huawei.com wrote:
>> From: PanNengyuan 
>>
>> ivq/dvq/svq/free_page_vq are not cleaned up in
>> virtio_balloon_device_unrealize; the memory leak stack is as follows:
>>
>> Direct leak of 14336 byte(s) in 2 object(s) allocated from:
>> #0 0x7f99fd9d8560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
>> #1 0x7f99fcb20015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
>> #2 0x557d90638437 in virtio_add_queue 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:2327
>> #3 0x557d9064401d in virtio_balloon_device_realize 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio-balloon.c:793
>> #4 0x557d906356f7 in virtio_device_realize 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:3504
>> #5 0x557d9073f081 in device_set_realized 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/core/qdev.c:876
>> #6 0x557d908b1f4d in property_set_bool 
>> /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:2080
>> #7 0x557d908b655e in object_property_set_qobject 
>> /mnt/sdb/qemu-4.2.0-rc0/qom/qom-qobject.c:26
>>
>> Reported-by: Euler Robot 
>> Signed-off-by: PanNengyuan 
>> ---
>>  hw/virtio/virtio-balloon.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 40b04f5..5329c65 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -831,6 +831,13 @@ static void virtio_balloon_device_unrealize(DeviceState 
>> *dev, Error **errp)
>>  }
>>  balloon_stats_destroy_timer(s);
>>  qemu_remove_balloon_handler(s);
>> +
>> +virtio_del_queue(vdev, 0);
>> +virtio_del_queue(vdev, 1);
>> +virtio_del_queue(vdev, 2);
>> +if (s->free_page_vq) {
>> +virtio_del_queue(vdev, 3);
>> +}
>>  virtio_cleanup(vdev);
>>  }
> 
> Hmm ok, but how about just doing it through a vq pointer then?
> Seems cleaner. E.g. use patch below and add your on top
> using the new virtio_delete_queue?
> 

ok, It seems more cleaner, I will send a new version later.

Thanks.

> -->
> virtio: add ability to delete vq through a pointer
> 
> Devices tend to maintain vq pointers, allow deleting them like this.
> 
> Signed-off-by: Michael S. Tsirkin 
> 
> --
> 
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index c32a815303..e18756d50d 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -183,6 +183,8 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int 
> queue_size,
>  
>  void virtio_del_queue(VirtIODevice *vdev, int n);
>  
> +void virtio_delete_queue(VirtQueue *vq);
> +
>  void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>  unsigned int len);
>  void virtqueue_flush(VirtQueue *vq, unsigned int count);
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 04716b5f6c..31dd140990 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -2330,17 +2330,22 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int 
> queue_size,
>  return &vdev->vq[i];
>  }
>  
> +void virtio_delete_queue(VirtQueue *vq)
> +{
> +vq->vring.num = 0;
> +vq->vring.num_default = 0;
> +vq->handle_output = NULL;
> +vq->handle_aio_output = NULL;
> +g_free(vq->used_elems);
> +}
> +
>  void virtio_del_queue(VirtIODevice *vdev, int n)
>  {
>  if (n < 0 || n >= VIRTIO_QUEUE_MAX) {
>  abort();
>  }
>  
> -vdev->vq[n].vring.num = 0;
> -vdev->vq[n].vring.num_default = 0;
> -vdev->vq[n].handle_output = NULL;
> -vdev->vq[n].handle_aio_output = NULL;
> -g_free(vdev->vq[n].used_elems);
> +virtio_delete_queue(&vdev->vq[n]);
>  }
>  
>  static void virtio_set_isr(VirtIODevice *vdev, int value)
> 
> 
> .
> 




Re: [PATCH v2 2/4] target/arm: Abstract the generic timer frequency

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 5:14 AM, Andrew Jeffery wrote:

Prepare for SoCs such as the ASPEED AST2600 whose firmware configures
CNTFRQ to values significantly larger than the static 62.5MHz value
currently derived from GTIMER_SCALE. As the OS potentially derives its
timer periods from the CNTFRQ value the lack of support for running
QEMUTimers at the appropriate rate leads to sticky behaviour in the
guest.

Substitute the GTIMER_SCALE constant with use of a helper to derive the
period from gt_cntfrq stored in struct ARMCPU. Initially set gt_cntfrq
to the frequency associated with GTIMER_SCALE so current behaviour is
maintained.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
---
  target/arm/cpu.c|  2 ++
  target/arm/cpu.h| 10 ++
  target/arm/helper.c | 10 +++---
  3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7a4ac9339bf9..5698a74061bb 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -974,6 +974,8 @@ static void arm_cpu_initfn(Object *obj)
  if (tcg_enabled()) {
  cpu->psci_version = 2; /* TCG implements PSCI 0.2 */
  }
+
+cpu->gt_cntfrq = NANOSECONDS_PER_SECOND / GTIMER_SCALE;
  }
  
  static Property arm_cpu_reset_cbar_property =

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 83a809d4bac4..666c03871fdf 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -932,8 +932,18 @@ struct ARMCPU {
   */
  DECLARE_BITMAP(sve_vq_map, ARM_MAX_VQ);
  DECLARE_BITMAP(sve_vq_init, ARM_MAX_VQ);
+
+/* Generic timer counter frequency, in Hz */
+uint64_t gt_cntfrq;


You can also make the unit explicit by calling it 'gt_cntfrq_hz'.


  };
  
+static inline unsigned int gt_cntfrq_period_ns(ARMCPU *cpu)

+{
+/* XXX: Could include qemu/timer.h to get NANOSECONDS_PER_SECOND? */


Why inline this call? I doubt there is a significant performance gain.


+const unsigned int ns_per_s = 1000 * 1000 * 1000;
+return ns_per_s > cpu->gt_cntfrq ? ns_per_s / cpu->gt_cntfrq : 1;
+}
+
  void arm_cpu_post_init(Object *obj);
  
  uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz);

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 65c4441a3896..2622a9a8d02f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2409,7 +2409,9 @@ static CPAccessResult gt_stimer_access(CPUARMState *env,
  
  static uint64_t gt_get_countervalue(CPUARMState *env)

  {
-return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) / GTIMER_SCALE;
+ARMCPU *cpu = env_archcpu(env);
+
+return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) / gt_cntfrq_period_ns(cpu);
  }
  
  static void gt_recalc_timer(ARMCPU *cpu, int timeridx)

@@ -2445,7 +2447,7 @@ static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
   * set the timer for as far in the future as possible. When the
   * timer expires we will reset the timer for any remaining period.
   */
-if (nexttick > INT64_MAX / GTIMER_SCALE) {
+if (nexttick > INT64_MAX / gt_cntfrq_period_ns(cpu)) {
  timer_mod_ns(cpu->gt_timer[timeridx], INT64_MAX);
  } else {
  timer_mod(cpu->gt_timer[timeridx], nexttick);
@@ -2874,11 +2876,13 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
  
  static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)

  {
+ARMCPU *cpu = env_archcpu(env);
+
  /* Currently we have no support for QEMUTimer in linux-user so we
   * can't call gt_get_countervalue(env), instead we directly
   * call the lower level functions.
   */
-return cpu_get_clock() / GTIMER_SCALE;
+return cpu_get_clock() / gt_cntfrq_period_ns(cpu);
  }
  
  static const ARMCPRegInfo generic_timer_cp_reginfo[] = {







Re: [PATCH v2 0/4] Expose GT CNTFRQ as a CPU property to support AST2600

2019-12-02 Thread Philippe Mathieu-Daudé

On 12/3/19 5:14 AM, Andrew Jeffery wrote:

Hello,

This is a v2 of the belated follow-up from a few of my earlier attempts to fix
up the ARM generic timer for correct behaviour on the ASPEED AST2600 SoC. The
AST2600 clocks the generic timer at the rate of HPLL, which is configured to
1125MHz.  This is significantly quicker than the currently hard-coded generic
timer rate of 62.5MHz and so we see "sticky" behaviour in the guest.


Glad you fixed this! I hit the same problem with the Raspi4.




Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Markus Armbruster
"Michael S. Tsirkin"  writes:

> On Tue, Dec 03, 2019 at 08:53:30AM +0800, Tao Xu wrote:
>> Hi Michael,
>> 
>> Could this patch series be queued?
>> Thank you very much!
>> 
>> Tao
>
> QEMU is in freeze, so not yet. Please ping after the release.

Just to avoid confusion: it's Michael's personal preference not to
process patches for the next version during freeze.  Other maintainers
do, and that's actually the project's policy:

Subject: QEMU Summit 2017: minutes
Message-ID: 
https://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg04453.html

qemu-next:
 * Problem 1: Contributors cannot get patches merged during freeze
   (bad experience)
 [...]
 * Markus Armbruster: Problem 1 is solved if maintainers keep their own
   -next trees
 * Paolo Bonzini: Maintaining -next could slow down or create work for
   -freeze (e.g. who does backports)
 * Action: Maintainers mustn't tell submitters to go away just because
   we're in a release freeze (it's up to them whether they prefer to
   maintain a "-next" tree for their subsystem with patches queued for
   the following release, or track which patches they've accepted
   some other way)
 * We're not going to have an official project-wide "-next" tree, though

Michael, would queuing up patches in a -next branch really be too much
trouble for you?




Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Tao Xu

On 12/3/2019 1:35 PM, Michael S. Tsirkin wrote:

On Tue, Dec 03, 2019 at 08:53:30AM +0800, Tao Xu wrote:

Hi Michael,

Could this patch series be queued?
Thank you very much!

Tao


QEMU is in freeze, so not yet. Please ping after the release.


OK, Thank you!



Re: [PATCH] virtio-serial-bus: fix memory leak while attach virtio-serial-bus

2019-12-02 Thread Michael S. Tsirkin
On Tue, Dec 03, 2019 at 08:53:42AM +0800, pannengyuan wrote:
> 
> 
> On 2019/12/2 21:58, Laurent Vivier wrote:
> > On 02/12/2019 12:15, pannengy...@huawei.com wrote:
> >> From: PanNengyuan 
> >>
> >> ivqs/ovqs/c_ivq/c_ovq are not cleaned up in
> >> virtio_serial_device_unrealize; the memory leak stack is as below:
> >>
> >> Direct leak of 1290240 byte(s) in 180 object(s) allocated from:
> >> #0 0x7fc9bfc27560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
> >> #1 0x7fc9bed6f015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
> >> #2 0x5650e02b83e7 in virtio_add_queue 
> >> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:2327
> >> #3 0x5650e02847b5 in virtio_serial_device_realize 
> >> /mnt/sdb/qemu-4.2.0-rc0/hw/char/virtio-serial-bus.c:1089
> >> #4 0x5650e02b56a7 in virtio_device_realize 
> >> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:3504
> >> #5 0x5650e03bf031 in device_set_realized 
> >> /mnt/sdb/qemu-4.2.0-rc0/hw/core/qdev.c:876
> >> #6 0x5650e0531efd in property_set_bool 
> >> /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:2080
> >> #7 0x5650e053650e in object_property_set_qobject 
> >> /mnt/sdb/qemu-4.2.0-rc0/qom/qom-qobject.c:26
> >> #8 0x5650e0533e14 in object_property_set_bool 
> >> /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:1338
> >> #9 0x5650e04c0e37 in virtio_pci_realize 
> >> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio-pci.c:1801
> >>
> >> Reported-by: Euler Robot 
> >> Signed-off-by: PanNengyuan 
> >> ---
> >>  hw/char/virtio-serial-bus.c | 6 ++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> >> index 3325904..da9019a 100644
> >> --- a/hw/char/virtio-serial-bus.c
> >> +++ b/hw/char/virtio-serial-bus.c
> >> @@ -1126,9 +1126,15 @@ static void 
> >> virtio_serial_device_unrealize(DeviceState *dev, Error **errp)
> >>  {
> >>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >>  VirtIOSerial *vser = VIRTIO_SERIAL(dev);
> >> +int i;
> >>  
> >>  QLIST_REMOVE(vser, next);
> >>  
> >> +for (i = 0; i <= vser->bus.max_nr_ports; i++) {
> >> +virtio_del_queue(vdev, 2 * i);
> >> +virtio_del_queue(vdev, 2 * i + 1);
> >> +}
> >> +
> > 
> > According to virtio_serial_device_realize() and the number of
> > virtio_add_queue(), I think you have more queues to delete:
> > 
> >   4 + 2 * vser->bus.max_nr_ports
> > 
> > (for vser->ivqs[0], vser->ovqs[0], vser->c_ivq, vser->c_ovq,
> > vser->ivqs[i], vser->ovqs[i]).
> > 
> > Thanks,
> > Laurent
> > 
> > 
> Thanks, but I think the queue count is correct; the queues in
> virtio_serial_device_realize are as follows:
> 
> // here is 2
> vser->ivqs[0] = virtio_add_queue(vdev, 128, handle_input);
> vser->ovqs[0] = virtio_add_queue(vdev, 128, handle_output);
> 
> // here is 2
> vser->c_ivq = virtio_add_queue(vdev, 32, control_in);
> vser->c_ovq = virtio_add_queue(vdev, 32, control_out);
> 
> // here 2 * (max_nr_ports - 1)  - i is from 1 to max_nr_ports - 1
> for (i = 1; i < vser->bus.max_nr_ports; i++) {
> vser->ivqs[i] = virtio_add_queue(vdev, 128, handle_input);
> vser->ovqs[i] = virtio_add_queue(vdev, 128, handle_output);
> }
> 
> so the total number of queues is:  2 * (vser->bus.max_nr_ports + 1)

Rather than worry about this, I posted a patch adding virtio_delete_queue.
How about reusing that, and just using ivqs/ovqs pointers?

-- 
MST
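For reference, the queue accounting above can be checked with a tiny standalone sketch (function names are illustrative, not QEMU API): realize adds 2 + 2 + 2 * (max_nr_ports - 1) virtqueues, and the proposed unrealize loop over i in [0, max_nr_ports], deleting queues 2*i and 2*i+1, removes exactly the same number.

```c
#include <assert.h>

/* Illustrative only: counts the virtqueues virtio_serial_device_realize
 * creates (ivqs[0], ovqs[0], c_ivq, c_ovq, then ivqs[i]/ovqs[i] for
 * i in [1, max_nr_ports)). */
static unsigned serial_vqs_added(unsigned max_nr_ports)
{
    return 2 + 2 + 2 * (max_nr_ports - 1);
}

/* Queues removed by the proposed unrealize loop:
 * for (i = 0; i <= max_nr_ports; i++) delete queues 2*i and 2*i+1. */
static unsigned serial_vqs_deleted(unsigned max_nr_ports)
{
    unsigned i, n = 0;
    for (i = 0; i <= max_nr_ports; i++) {
        n += 2;
    }
    return n;
}
```

Both expressions reduce to 2 * (max_nr_ports + 1), e.g. 64 queues for the default of 31 ports.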




Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Michael S. Tsirkin
On Tue, Dec 03, 2019 at 08:53:30AM +0800, Tao Xu wrote:
> Hi Michael,
> 
> Could this patch series be queued?
> Thank you very much!
> 
> Tao

QEMU is in freeze, so not yet. Please ping after the release.

-- 
MST




Re: [PATCH] virtio-balloon: fix memory leak while attach virtio-balloon device

2019-12-02 Thread Michael S. Tsirkin
On Tue, Dec 03, 2019 at 09:44:19AM +0800, pannengy...@huawei.com wrote:
> From: PanNengyuan 
> 
> ivq/dvq/svq/free_page_vq are not cleaned up in
> virtio_balloon_device_unrealize; the memory leak stack is as follows:
> 
> Direct leak of 14336 byte(s) in 2 object(s) allocated from:
> #0 0x7f99fd9d8560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
> #1 0x7f99fcb20015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
> #2 0x557d90638437 in virtio_add_queue 
> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:2327
> #3 0x557d9064401d in virtio_balloon_device_realize 
> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio-balloon.c:793
> #4 0x557d906356f7 in virtio_device_realize 
> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:3504
> #5 0x557d9073f081 in device_set_realized 
> /mnt/sdb/qemu-4.2.0-rc0/hw/core/qdev.c:876
> #6 0x557d908b1f4d in property_set_bool 
> /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:2080
> #7 0x557d908b655e in object_property_set_qobject 
> /mnt/sdb/qemu-4.2.0-rc0/qom/qom-qobject.c:26
> 
> Reported-by: Euler Robot 
> Signed-off-by: PanNengyuan 
> ---
>  hw/virtio/virtio-balloon.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 40b04f5..5329c65 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -831,6 +831,13 @@ static void virtio_balloon_device_unrealize(DeviceState 
> *dev, Error **errp)
>  }
>  balloon_stats_destroy_timer(s);
>  qemu_remove_balloon_handler(s);
> +
> +virtio_del_queue(vdev, 0);
> +virtio_del_queue(vdev, 1);
> +virtio_del_queue(vdev, 2);
> +if (s->free_page_vq) {
> +virtio_del_queue(vdev, 3);
> +}
>  virtio_cleanup(vdev);
>  }

Hmm ok, but how about just doing it through a vq pointer then?
Seems cleaner. E.g. use patch below and add your on top
using the new virtio_delete_queue?

-->
virtio: add ability to delete vq through a pointer

Devices tend to maintain vq pointers, allow deleting them like this.

Signed-off-by: Michael S. Tsirkin 

--

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c32a815303..e18756d50d 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -183,6 +183,8 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int 
queue_size,
 
 void virtio_del_queue(VirtIODevice *vdev, int n);
 
+void virtio_delete_queue(VirtQueue *vq);
+
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
 unsigned int len);
 void virtqueue_flush(VirtQueue *vq, unsigned int count);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 04716b5f6c..31dd140990 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2330,17 +2330,22 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int 
queue_size,
 return &vdev->vq[i];
 }
 
+void virtio_delete_queue(VirtQueue *vq)
+{
+vq->vring.num = 0;
+vq->vring.num_default = 0;
+vq->handle_output = NULL;
+vq->handle_aio_output = NULL;
+g_free(vq->used_elems);
+}
+
 void virtio_del_queue(VirtIODevice *vdev, int n)
 {
 if (n < 0 || n >= VIRTIO_QUEUE_MAX) {
 abort();
 }
 
-vdev->vq[n].vring.num = 0;
-vdev->vq[n].vring.num_default = 0;
-vdev->vq[n].handle_output = NULL;
-vdev->vq[n].handle_aio_output = NULL;
-g_free(vdev->vq[n].used_elems);
+virtio_delete_queue(&vdev->vq[n]);
 }
 
 static void virtio_set_isr(VirtIODevice *vdev, int value)




Re: [for-5.0 3/4] spapr: Clean up RMA size calculation

2019-12-02 Thread Alexey Kardashevskiy



On 03/12/2019 14:44, Alexey Kardashevskiy wrote:
> 
> 
> On 29/11/2019 12:35, David Gibson wrote:
>> Move the calculation of the Real Mode Area (RMA) size into a helper
>> function.  While we're there clean it up and correct it in a few ways:
>>   * Add comments making it clearer where the various constraints come from
>>   * Remove a pointless check that the RMA fits within Node 0 (we've just
>> clamped it so that it does)
>>   * The 16GiB limit we apply is only correct for POWER8, but there is also
>> a 1TiB limit that applies on POWER9.
>>
>> Signed-off-by: David Gibson 
>> ---
>>  hw/ppc/spapr.c | 57 +++---
>>  1 file changed, 35 insertions(+), 22 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 52c39daa99..7efd4f2b85 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -2664,6 +2664,40 @@ static PCIHostState *spapr_create_default_phb(void)
>>  return PCI_HOST_BRIDGE(dev);
>>  }
>>  
>> +static hwaddr spapr_rma_size(SpaprMachineState *spapr, Error **errp)
>> +{
>> +MachineState *machine = MACHINE(spapr);
>> +hwaddr rma_size = machine->ram_size;
>> +hwaddr node0_size = spapr_node0_size(machine);
>> +
>> +/* RMA has to fit in the first NUMA node */
>> +rma_size = MIN(rma_size, node0_size);
>> +
>> +/*
>> + * VRMA access is via a special 1TiB SLB mapping, so the RMA can
>> + * never exceed that
>> + */
>> +rma_size = MIN(rma_size, TiB);
>> +
>> +/*
>> + * RMA size is controlled in hardware by LPCR[RMLS].  On POWER8
> 
> 
> RMA is controlled by LPCR on P8 but the RMLS bits on P9 are reserved
> (also reserved in PowerISA 3.0).
> 
> 
>> + * the largest RMA that can be specified there is 16GiB
> 
> 
> The P8 user manual says:
> ===
> The following RMO sizes are available for the POWER8 processor.
> The RMLS[34:37] field in the LPCR defines the RMO sizes, as described below.
> 1000 - 32 MB
> 0011 - 64 MB
> 0111 - 128 MB
> 0100 - 256 MB
> 0010 - 1 GB
> 0001 - 16 GB
0000 - 256 GB
> ===
> 
> The maximum seems to be 256GiB.


Ah, update from Paul - we do not actually use what LPCR[RMLS] controls -
Real Mode Offset Register (RMOR).



> 
> 
>> + */
>> +if (!ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
>> +   0, spapr->max_compat_pvr)) {
>> +rma_size = MIN(rma_size, 16 * GiB);
>> +}
>> +
>> +if (rma_size < (MIN_RMA_SLOF * MiB)) {
> 
> 
> nit: it is time to redefine MIN_RMA_SLOF to use MiBs imho :)
> 
> 
>> +error_setg(errp,
>> +"pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area)",
>> +   MIN_RMA_SLOF);
> 
> Something went wrong with formatting here.
> 
> Otherwise looks good. Thanks,
> 
> 
> 
>> +return -1;
>> +}
>> +
>> +return rma_size;
>> +}
>> +
>>  /* pSeries LPAR / sPAPR hardware init */
>>  static void spapr_machine_init(MachineState *machine)
>>  {
>> @@ -2675,7 +2709,6 @@ static void spapr_machine_init(MachineState *machine)
>>  int i;
>>  MemoryRegion *sysmem = get_system_memory();
>>  MemoryRegion *ram = g_new(MemoryRegion, 1);
>> -hwaddr node0_size = spapr_node0_size(machine);
>>  long load_limit, fw_size;
>>  char *filename;
>>  Error *resize_hpt_err = NULL;
>> @@ -2715,20 +2748,7 @@ static void spapr_machine_init(MachineState *machine)
>>  exit(1);
>>  }
>>  
>> -spapr->rma_size = node0_size;
>> -
>> -/* Actually we don't support unbounded RMA anymore since we added
>> - * proper emulation of HV mode. The max we can get is 16G which
>> - * also happens to be what we configure for PAPR mode so make sure
>> - * we don't do anything bigger than that
>> - */
> >> -spapr->rma_size = MIN(spapr->rma_size, 0x400000000ull);
>> -
>> -if (spapr->rma_size > node0_size) {
>> -error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
>> - spapr->rma_size);
>> -exit(1);
>> -}
> >> +spapr->rma_size = spapr_rma_size(spapr, &error_fatal);
>>  
>>  /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
>>  load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
>> @@ -2956,13 +2976,6 @@ static void spapr_machine_init(MachineState *machine)
>>  }
>>  }
>>  
>> -if (spapr->rma_size < (MIN_RMA_SLOF * MiB)) {
>> -error_report(
>> -"pSeries SLOF firmware requires >= %ldM guest RMA (Real Mode 
>> Area memory)",
>> -MIN_RMA_SLOF);
>> -exit(1);
>> -}
>> -
>>  if (kernel_filename) {
>>  uint64_t lowaddr = 0;
>>  
>>
> 

-- 
Alexey
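The clamping order discussed in this thread (fit in node 0, 1 TiB VRMA limit, 16 GiB on pre-POWER9, then round down to a power of two and check the SLOF minimum) can be sketched standalone. This is a hedged model of the logic, not the actual spapr_rma_size() signature; MIN and pow2floor are stand-ins for QEMU's macros/helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Round down to a power of two (stand-in for qemu's pow2floor()). */
static uint64_t pow2floor(uint64_t v)
{
    while (v & (v - 1)) {
        v &= v - 1;                /* clear lowest set bit until one remains */
    }
    return v;
}

/* Sketch of the clamping order discussed above. */
static uint64_t rma_size(uint64_t ram, uint64_t node0, bool pre_power9)
{
    uint64_t rma = ram;

    rma = MIN(rma, node0);         /* RMA has to fit in NUMA node 0 */
    rma = MIN(rma, TiB);           /* VRMA uses a special 1 TiB SLB mapping */
    if (pre_power9) {
        rma = MIN(rma, 16 * GiB);  /* POWER8 LPCR[RMLS] limit, per the patch */
    }
    return pow2floor(rma);         /* RMA size must be a power of 2 */
}
```

For example, 6 GiB of RAM with a 5 GiB node 0 on POWER8 yields a 4 GiB RMA after the power-of-two rounding.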



Re: [for-5.0 0/4] spapr: Improvements to CAS feature negotiation

2019-12-02 Thread David Gibson
On Mon, Dec 02, 2019 at 08:05:13AM +0100, Cédric Le Goater wrote:
> On 29/11/2019 06:33, David Gibson wrote:
> > This series contains several cleanups to the handling of the
> > ibm,client-architecture-support firmware call used for boot time
> > feature negotiation between the guest OS and the firmware &
> > hypervisor.
> > 
> > Mostly it's just internal polish, but one significant user visible
> > change is that we no longer generate an extra CAS reboot to switch
> > between XICS and XIVE interrupt modes (by far the most common cause of
> > CAS reboots in practice).
> 
> 
> I love it. thanks for removing this extra reboot.

Glad you like it.  I've folded this into ppc-for-5.0 now.

> 
> C. 
> 
> 
> > 
> > David Gibson (4):
> >   spapr: Don't trigger a CAS reboot for XICS/XIVE mode changeover
> >   spapr: Improve handling of fdt buffer size
> >   spapr: Fold h_cas_compose_response() into
> > h_client_architecture_support()
> >   spapr: Simplify ovec diff
> > 
> >  hw/ppc/spapr.c  | 92 +++--
> >  hw/ppc/spapr_hcall.c| 90 +---
> >  hw/ppc/spapr_ovec.c | 30 
> >  include/hw/ppc/spapr.h  |  4 +-
> >  include/hw/ppc/spapr_ovec.h |  4 +-
> >  5 files changed, 83 insertions(+), 137 deletions(-)
> > 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [for-5.0 4/4] spapr: Correct clamping of RMA to Node 0 size

2019-12-02 Thread Alexey Kardashevskiy



On 29/11/2019 12:35, David Gibson wrote:
> The Real Mode Area (RMA) needs to fit within Node 0 in NUMA configurations.
> We use a helper function spapr_node0_size() to calculate this.
> 
> But that function doesn't actually get the size of Node 0, it gets the
> minimum size of all nodes, ever since b082d65a300 "spapr: Add a helper for
> node0_size calculation".  That was added, apparently, because Node 0 in
> qemu's terms might not have corresponded to Node 0 in PAPR terms (i.e. the
> node with memory at address 0).


After looking at this closely, I think the idea was that the first
node(s) may have only CPUs but not memory, in this case
node#0.node_mem==0 and things crash:


/home/aik/pbuild/qemu-garrison2-ppc64/ppc64-softmmu/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-enable-kvm \
-m 2G \
-kernel /home/aik/t/vml4120le \
-initrd /home/aik/t/le.cpio \
-object memory-backend-ram,id=memdev1,size=2G \
-numa node,nodeid=0 \
-numa node,nodeid=1,memdev=memdev1 \
-numa cpu,node-id=0 \
-smp 8,threads=8 \
-machine pseries \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.ssh37742 \
-mon chardev=SOCKET0,mode=control
QEMU PID = 12377
qemu-system-ppc64:qemu_trace_events:80: warning: trace event
'vfio_ram_register' does not exist
qemu-system-ppc64: pSeries SLOF firmware requires >= 128MiB guest RMA
(Real Mode Area)





> That might not have been the case at the time, but it *is* the case now
> that qemu node 0 must have the lowest address, which is the node we need.
> So, we can simplify this logic, folding it into spapr_rma_size(), the only
> remaining caller.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c | 26 ++
>  1 file changed, 10 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 7efd4f2b85..6611f75bdf 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -295,20 +295,6 @@ static void spapr_populate_pa_features(SpaprMachineState 
> *spapr,
>  _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, 
> pa_size)));
>  }
>  
> -static hwaddr spapr_node0_size(MachineState *machine)
> -{
> -if (machine->numa_state->num_nodes) {
> -int i;
> -for (i = 0; i < machine->numa_state->num_nodes; ++i) {
> -if (machine->numa_state->nodes[i].node_mem) {
> -return MIN(pow2floor(machine->numa_state->nodes[i].node_mem),
> -   machine->ram_size);
> -}
> -}
> -}
> -return machine->ram_size;
> -}
> -
>  static void add_str(GString *s, const gchar *s1)
>  {
>  g_string_append_len(s, s1, strlen(s1) + 1);
> @@ -2668,10 +2654,13 @@ static hwaddr spapr_rma_size(SpaprMachineState 
> *spapr, Error **errp)
>  {
>  MachineState *machine = MACHINE(spapr);
>  hwaddr rma_size = machine->ram_size;
> -hwaddr node0_size = spapr_node0_size(machine);
>  
>  /* RMA has to fit in the first NUMA node */
> -rma_size = MIN(rma_size, node0_size);
> +if (machine->numa_state->num_nodes) {
> +hwaddr node0_size = machine->numa_state->nodes[0].node_mem;
> +
> +rma_size = MIN(rma_size, node0_size);
> +}
>  
>  /*
>   * VRMA access is via a special 1TiB SLB mapping, so the RMA can
> @@ -2688,6 +2677,11 @@ static hwaddr spapr_rma_size(SpaprMachineState *spapr, 
> Error **errp)
>  rma_size = MIN(rma_size, 16 * GiB);
>  }
>  
> +/*
> + * RMA size must be a power of 2
> + */
> +rma_size = pow2floor(rma_size);
> +
>  if (rma_size < (MIN_RMA_SLOF * MiB)) {
>  error_setg(errp,
>  "pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area)",
> 

-- 
Alexey



[PATCH v2 2/4] target/arm: Abstract the generic timer frequency

2019-12-02 Thread Andrew Jeffery
Prepare for SoCs such as the ASPEED AST2600 whose firmware configures
CNTFRQ to values significantly larger than the static 62.5MHz value
currently derived from GTIMER_SCALE. As the OS potentially derives its
timer periods from the CNTFRQ value the lack of support for running
QEMUTimers at the appropriate rate leads to sticky behaviour in the
guest.

Substitute the GTIMER_SCALE constant with use of a helper to derive the
period from gt_cntfrq stored in struct ARMCPU. Initially set gt_cntfrq
to the frequency associated with GTIMER_SCALE so current behaviour is
maintained.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
---
 target/arm/cpu.c|  2 ++
 target/arm/cpu.h| 10 ++
 target/arm/helper.c | 10 +++---
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7a4ac9339bf9..5698a74061bb 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -974,6 +974,8 @@ static void arm_cpu_initfn(Object *obj)
 if (tcg_enabled()) {
 cpu->psci_version = 2; /* TCG implements PSCI 0.2 */
 }
+
+cpu->gt_cntfrq = NANOSECONDS_PER_SECOND / GTIMER_SCALE;
 }
 
 static Property arm_cpu_reset_cbar_property =
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 83a809d4bac4..666c03871fdf 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -932,8 +932,18 @@ struct ARMCPU {
  */
 DECLARE_BITMAP(sve_vq_map, ARM_MAX_VQ);
 DECLARE_BITMAP(sve_vq_init, ARM_MAX_VQ);
+
+/* Generic timer counter frequency, in Hz */
+uint64_t gt_cntfrq;
 };
 
+static inline unsigned int gt_cntfrq_period_ns(ARMCPU *cpu)
+{
+/* XXX: Could include qemu/timer.h to get NANOSECONDS_PER_SECOND? */
+const unsigned int ns_per_s = 1000 * 1000 * 1000;
+return ns_per_s > cpu->gt_cntfrq ? ns_per_s / cpu->gt_cntfrq : 1;
+}
+
 void arm_cpu_post_init(Object *obj);
 
 uint64_t arm_cpu_mp_affinity(int idx, uint8_t clustersz);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 65c4441a3896..2622a9a8d02f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2409,7 +2409,9 @@ static CPAccessResult gt_stimer_access(CPUARMState *env,
 
 static uint64_t gt_get_countervalue(CPUARMState *env)
 {
-return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) / GTIMER_SCALE;
+ARMCPU *cpu = env_archcpu(env);
+
+return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) / gt_cntfrq_period_ns(cpu);
 }
 
 static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
@@ -2445,7 +2447,7 @@ static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
  * set the timer for as far in the future as possible. When the
  * timer expires we will reset the timer for any remaining period.
  */
-if (nexttick > INT64_MAX / GTIMER_SCALE) {
+if (nexttick > INT64_MAX / gt_cntfrq_period_ns(cpu)) {
 timer_mod_ns(cpu->gt_timer[timeridx], INT64_MAX);
 } else {
 timer_mod(cpu->gt_timer[timeridx], nexttick);
@@ -2874,11 +2876,13 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 
 static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
+ARMCPU *cpu = env_archcpu(env);
+
 /* Currently we have no support for QEMUTimer in linux-user so we
  * can't call gt_get_countervalue(env), instead we directly
  * call the lower level functions.
  */
-return cpu_get_clock() / GTIMER_SCALE;
+return cpu_get_clock() / gt_cntfrq_period_ns(cpu);
 }
 
 static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
-- 
2.20.1
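The helper's behaviour at the two frequencies of interest can be sanity-checked numerically with a standalone copy (hedged: this mirrors the patch's gt_cntfrq_period_ns(), outside the QEMU build). At the legacy 62.5MHz the period is 16ns, matching GTIMER_SCALE; at the AST2600's 1125MHz the exact period (~0.889ns) is below the 1ns resolution, so the helper clamps to 1.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of gt_cntfrq_period_ns() from this patch: integer nanosecond
 * period, clamped to 1ns once the frequency exceeds 1GHz. */
static unsigned int period_ns(uint64_t gt_cntfrq)
{
    const unsigned int ns_per_s = 1000 * 1000 * 1000;
    return ns_per_s > gt_cntfrq ? ns_per_s / gt_cntfrq : 1;
}
```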




[PATCH v2 3/4] target/arm: Prepare generic timer for per-platform CNTFRQ

2019-12-02 Thread Andrew Jeffery
The ASPEED AST2600 clocks the generic timer at the rate of HPLL. On
recent firmwares this is at 1125MHz, which is considerably quicker than
the assumed 62.5MHz of the current generic timer implementation. The
delta between the value as read from CNTFRQ and the true rate of the
underlying QEMUTimer leads to sticky behaviour in AST2600 guests.

Add a feature-gated property exposing CNTFRQ for ARM CPUs providing the
generic timer. This allows platforms to configure CNTFRQ (and the
associated QEMUTimer) to the appropriate frequency prior to starting the
guest.

As the platform can now determine the rate of CNTFRQ we're exposed to
limitations of QEMUTimer that didn't previously materialise: In the
course of emulation we need to arbitrarily and accurately convert
between guest ticks and time, but we're constrained by QEMUTimer's use
of an integer scaling factor. The effect is QEMUTimer cannot exactly
capture the period of frequencies that do not cleanly divide
NANOSECONDS_PER_SECOND for scaling ticks to time. As such, provide an
equally inaccurate scaling factor for scaling time to ticks so at least
a self-consistent inverse relationship holds.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
---
 target/arm/cpu.c| 43 +--
 target/arm/cpu.h| 18 ++
 target/arm/helper.c |  9 -
 3 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5698a74061bb..f186019a77fd 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -974,10 +974,12 @@ static void arm_cpu_initfn(Object *obj)
 if (tcg_enabled()) {
 cpu->psci_version = 2; /* TCG implements PSCI 0.2 */
 }
-
-cpu->gt_cntfrq = NANOSECONDS_PER_SECOND / GTIMER_SCALE;
 }
 
+static Property arm_cpu_gt_cntfrq_property =
+DEFINE_PROP_UINT64("cntfrq", ARMCPU, gt_cntfrq,
+   NANOSECONDS_PER_SECOND / GTIMER_SCALE);
+
 static Property arm_cpu_reset_cbar_property =
 DEFINE_PROP_UINT64("reset-cbar", ARMCPU, reset_cbar, 0);
 
@@ -1174,6 +1176,11 @@ void arm_cpu_post_init(Object *obj)
 
 qdev_property_add_static(DEVICE(obj), &arm_cpu_cfgend_property,
  &error_abort);
+
+if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
+qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property,
+ &error_abort);
+}
 }
 
 static void arm_cpu_finalizefn(Object *obj)
@@ -1253,14 +1260,30 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
-cpu->gt_timer[GTIMER_PHYS] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-   arm_gt_ptimer_cb, cpu);
-cpu->gt_timer[GTIMER_VIRT] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-   arm_gt_vtimer_cb, cpu);
-cpu->gt_timer[GTIMER_HYP] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-  arm_gt_htimer_cb, cpu);
-cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
-  arm_gt_stimer_cb, cpu);
+
+{
+uint64_t scale;
+
+if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
+if (!cpu->gt_cntfrq) {
+error_setg(errp, "Invalid CNTFRQ: %"PRId64"Hz",
+   cpu->gt_cntfrq);
+return;
+}
+scale = gt_cntfrq_period_ns(cpu);
+} else {
+scale = GTIMER_SCALE;
+}
+
+cpu->gt_timer[GTIMER_PHYS] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+   arm_gt_ptimer_cb, cpu);
+cpu->gt_timer[GTIMER_VIRT] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+   arm_gt_vtimer_cb, cpu);
+cpu->gt_timer[GTIMER_HYP] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+  arm_gt_htimer_cb, cpu);
+cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
+  arm_gt_stimer_cb, cpu);
+}
 #endif
 
 cpu_exec_realizefn(cs, &local_err);
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 666c03871fdf..0bcd13dcac81 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -939,6 +939,24 @@ struct ARMCPU {
 
 static inline unsigned int gt_cntfrq_period_ns(ARMCPU *cpu)
 {
+/*
+ * The exact approach to calculating guest ticks is:
+ *
+ * muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL), cpu->gt_cntfrq,
+ *  NANOSECONDS_PER_SECOND);
+ *
+ * We don't do that. Rather we intentionally use integer division
+ * truncation below and in the caller for the conversion of host monotonic
+ * time to guest ticks to provide the exact inverse for the semantics of
+ * the QEMUTimer scale factor. QEMUTimer's scale factor is an integer, so
+ * it loses precision when representing 
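The precision argument in the comment above can be demonstrated in isolation (a hedged sketch with illustrative names; the 300MHz example frequency is an assumption chosen because it does not divide NANOSECONDS_PER_SECOND cleanly). Scaling ticks to time with the truncated integer period and back with truncating division round-trips exactly, whereas converting that time back with the "exact" muldiv-style formula does not.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/* QEMUTimer-style scaling: ticks to nanoseconds via the integer period. */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t period)
{
    return ticks * period;
}

/* Intentionally truncating inverse, self-consistent with ticks_to_ns(). */
static uint64_t ns_to_ticks_trunc(uint64_t ns, uint64_t period)
{
    return ns / period;
}

/* "Exact" conversion from the frequency, muldiv64-style. */
static uint64_t ns_to_ticks_exact(uint64_t ns, uint64_t freq)
{
    return ns * freq / NS_PER_S;
}
```

With freq = 300MHz the exact period 3.33ns truncates to 3ns: 1000 ticks become 3000ns, the truncating inverse recovers 1000 ticks, but the exact conversion yields only 900.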

[PATCH v2 4/4] ast2600: Configure CNTFRQ at 1125MHz

2019-12-02 Thread Andrew Jeffery
This matches the configuration set by u-boot on the AST2600.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
Reviewed-by: Cédric Le Goater 
---
 hw/arm/aspeed_ast2600.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c
index 931887ac681f..5aecc3b3caec 100644
--- a/hw/arm/aspeed_ast2600.c
+++ b/hw/arm/aspeed_ast2600.c
@@ -259,6 +259,9 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, 
Error **errp)
 object_property_set_int(OBJECT(&s->cpu[i]), aspeed_calc_affinity(i),
 "mp-affinity", &error_abort);
 
+object_property_set_int(OBJECT(&s->cpu[i]), 1125000000, "cntfrq",
+&error_abort);
+
 /*
  * TODO: the secondary CPUs are started and a boot helper
  * is needed when using -kernel
-- 
2.20.1




[PATCH v2 0/4] Expose GT CNTFRQ as a CPU property to support AST2600

2019-12-02 Thread Andrew Jeffery
Hello,

This is a v2 of the belated follow-up from a few of my earlier attempts to fix
up the ARM generic timer for correct behaviour on the ASPEED AST2600 SoC. The
AST2600 clocks the generic timer at the rate of HPLL, which is configured to
1125MHz.  This is significantly quicker than the currently hard-coded generic
timer rate of 62.5MHz and so we see "sticky" behaviour in the guest.

v1 can be found here:

https://patchwork.ozlabs.org/cover/1201887/

Changes since v1:

* Fix a user mode build failure from partial renaming of gt_cntfrq_period_ns()
* Add tags from Cedric and Richard

Please review.

Andrew

Andrew Jeffery (4):
  target/arm: Remove redundant scaling of nexttick
  target/arm: Abstract the generic timer frequency
  target/arm: Prepare generic timer for per-platform CNTFRQ
  ast2600: Configure CNTFRQ at 1125MHz

 hw/arm/aspeed_ast2600.c |  3 +++
 target/arm/cpu.c| 41 +
 target/arm/cpu.h| 28 
 target/arm/helper.c | 24 ++--
 4 files changed, 82 insertions(+), 14 deletions(-)

-- 
2.20.1




[PATCH v2 1/4] target/arm: Remove redundant scaling of nexttick

2019-12-02 Thread Andrew Jeffery
The corner-case codepath was adjusting nexttick such that overflow
wouldn't occur when timer_mod() scaled the value back up. Remove a use
of GTIMER_SCALE and avoid unnecessary operations by calling
timer_mod_ns() directly.

Signed-off-by: Andrew Jeffery 
Reviewed-by: Richard Henderson 
Reviewed-by: Cédric Le Goater 
---
 target/arm/helper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index a089fb5a6909..65c4441a3896 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2446,9 +2446,10 @@ static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
  * timer expires we will reset the timer for any remaining period.
  */
 if (nexttick > INT64_MAX / GTIMER_SCALE) {
-nexttick = INT64_MAX / GTIMER_SCALE;
+timer_mod_ns(cpu->gt_timer[timeridx], INT64_MAX);
+} else {
+timer_mod(cpu->gt_timer[timeridx], nexttick);
 }
-timer_mod(cpu->gt_timer[timeridx], nexttick);
 trace_arm_gt_recalc(timeridx, irqstate, nexttick);
 } else {
 /* Timer disabled: ISTATUS and timer output always clear */
-- 
2.20.1
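The clamp this patch introduces can be modelled standalone (a sketch with an illustrative name, not the QEMU function): if scaling nexttick back up by the timer period would overflow int64_t, the deadline is pinned at INT64_MAX, which is what passing INT64_MAX to timer_mod_ns() achieves.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the clamping in gt_recalc_timer(): nexttick is in timer
 * ticks; timer_mod() would multiply it by the scale to get nanoseconds,
 * so clamp before that multiplication can overflow. */
static int64_t gt_deadline_ns(int64_t nexttick, int64_t scale)
{
    if (nexttick > INT64_MAX / scale) {
        return INT64_MAX;          /* as far in the future as possible */
    }
    return nexttick * scale;       /* the scaling timer_mod() applies */
}
```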




Re: [for-5.0 3/4] spapr: Clean up RMA size calculation

2019-12-02 Thread Alexey Kardashevskiy



On 29/11/2019 12:35, David Gibson wrote:
> Move the calculation of the Real Mode Area (RMA) size into a helper
> function.  While we're there clean it up and correct it in a few ways:
>   * Add comments making it clearer where the various constraints come from
>   * Remove a pointless check that the RMA fits within Node 0 (we've just
> clamped it so that it does)
>   * The 16GiB limit we apply is only correct for POWER8, but there is also
> a 1TiB limit that applies on POWER9.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c | 57 +++---
>  1 file changed, 35 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 52c39daa99..7efd4f2b85 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2664,6 +2664,40 @@ static PCIHostState *spapr_create_default_phb(void)
>  return PCI_HOST_BRIDGE(dev);
>  }
>  
> +static hwaddr spapr_rma_size(SpaprMachineState *spapr, Error **errp)
> +{
> +MachineState *machine = MACHINE(spapr);
> +hwaddr rma_size = machine->ram_size;
> +hwaddr node0_size = spapr_node0_size(machine);
> +
> +/* RMA has to fit in the first NUMA node */
> +rma_size = MIN(rma_size, node0_size);
> +
> +/*
> + * VRMA access is via a special 1TiB SLB mapping, so the RMA can
> + * never exceed that
> + */
> +rma_size = MIN(rma_size, TiB);
> +
> +/*
> + * RMA size is controlled in hardware by LPCR[RMLS].  On POWER8


RMA is controlled by LPCR on P8 but the RMLS bits on P9 are reserved
(also reserved in PowerISA 3.0).


> + * the largest RMA that can be specified there is 16GiB


The P8 user manual says:
===
The following RMO sizes are available for the POWER8 processor.
The RMLS[34:37] field in the LPCR defines the RMO sizes, as described below.
1000 - 32 MB
0011 - 64 MB
0111 - 128 MB
0100 - 256 MB
0010 - 1 GB
0001 - 16 GB
0000 - 256 GB
===

The maximum seems to be 256GiB.


> + */
> +if (!ppc_type_check_compat(machine->cpu_type, CPU_POWERPC_LOGICAL_3_00,
> +   0, spapr->max_compat_pvr)) {
> +rma_size = MIN(rma_size, 16 * GiB);
> +}
> +
> +if (rma_size < (MIN_RMA_SLOF * MiB)) {


nit: it is time to redefine MIN_RMA_SLOF to use MiBs imho :)


> +error_setg(errp,
> +"pSeries SLOF firmware requires >= %ldMiB guest RMA (Real Mode Area)",
> +   MIN_RMA_SLOF);

Something went wrong with formatting here.

Otherwise looks good. Thanks,



> +return -1;
> +}
> +
> +return rma_size;
> +}
> +
>  /* pSeries LPAR / sPAPR hardware init */
>  static void spapr_machine_init(MachineState *machine)
>  {
> @@ -2675,7 +2709,6 @@ static void spapr_machine_init(MachineState *machine)
>  int i;
>  MemoryRegion *sysmem = get_system_memory();
>  MemoryRegion *ram = g_new(MemoryRegion, 1);
> -hwaddr node0_size = spapr_node0_size(machine);
>  long load_limit, fw_size;
>  char *filename;
>  Error *resize_hpt_err = NULL;
> @@ -2715,20 +2748,7 @@ static void spapr_machine_init(MachineState *machine)
>  exit(1);
>  }
>  
> -spapr->rma_size = node0_size;
> -
> -/* Actually we don't support unbounded RMA anymore since we added
> - * proper emulation of HV mode. The max we can get is 16G which
> - * also happens to be what we configure for PAPR mode so make sure
> - * we don't do anything bigger than that
> - */
> -spapr->rma_size = MIN(spapr->rma_size, 0x400000000ull);
> -
> -if (spapr->rma_size > node0_size) {
> -error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
> - spapr->rma_size);
> -exit(1);
> -}
> +spapr->rma_size = spapr_rma_size(spapr, &error_fatal);
>  
>  /* Setup a load limit for the ramdisk leaving room for SLOF and FDT */
>  load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
> @@ -2956,13 +2976,6 @@ static void spapr_machine_init(MachineState *machine)
>  }
>  }
>  
> -if (spapr->rma_size < (MIN_RMA_SLOF * MiB)) {
> -error_report(
> -"pSeries SLOF firmware requires >= %ldM guest RMA (Real Mode 
> Area memory)",
> -MIN_RMA_SLOF);
> -exit(1);
> -}
> -
>  if (kernel_filename) {
>  uint64_t lowaddr = 0;
>  
> 

-- 
Alexey



Re: [RESEND PATCH v21 5/6] target-arm: kvm64: handle SIGBUS signal from kernel or KVM

2019-12-02 Thread Xiang Zheng


On 2019/11/27 22:17, Beata Michalska wrote:
> On Wed, 27 Nov 2019 at 13:03, Igor Mammedov  wrote:
>>
>> On Wed, 27 Nov 2019 20:47:15 +0800
>> Xiang Zheng  wrote:
>>
>>> Hi Beata,
>>>
>>> Thanks for your review!
>>>
>>> On 2019/11/22 23:47, Beata Michalska wrote:
 Hi,

 On Mon, 11 Nov 2019 at 01:48, Xiang Zheng  wrote:
>
> From: Dongjiu Geng 
>
> Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
> translates the host VA delivered by host to guest PA, then fills this PA
> to guest APEI GHES memory, then notifies guest according to the SIGBUS
> type.
>
> When guest accesses the poisoned memory, it will generate a Synchronous
> External Abort(SEA). Then host kernel gets an APEI notification and calls
> memory_failure() to unmap the affected page in stage 2, finally
> returns to guest.
>
> Guest continues to access the PG_hwpoison page, it will trap to KVM as
> stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
> Qemu, Qemu records this error address into guest APEI GHES memory and
> notifies guest using Synchronous-External-Abort(SEA).
>
> In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
> in which we can setup the type of exception and the syndrome information.
> When switching to guest, the target vcpu will jump to the synchronous
> external abort vector table entry.
>
> The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
> ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
> not valid and hold an UNKNOWN value. These values will be set to KVM
> register structures through KVM_SET_ONE_REG IOCTL.
>
> Signed-off-by: Dongjiu Geng 
> Signed-off-by: Xiang Zheng 
> Reviewed-by: Michael S. Tsirkin 
> ---
>> [...]
> diff --git a/include/hw/acpi/acpi_ghes.h b/include/hw/acpi/acpi_ghes.h
> index cb62ec9c7b..8e3c5b879e 100644
> --- a/include/hw/acpi/acpi_ghes.h
> +++ b/include/hw/acpi/acpi_ghes.h
> @@ -24,6 +24,9 @@
>
>  #include "hw/acpi/bios-linker-loader.h"
>
> +#define ACPI_GHES_CPER_OK   1
> +#define ACPI_GHES_CPER_FAIL 0
> +

 Is there really a need to introduce those ?

>>>
>>> Don't you think it's more clear than using "1" or "0"? :)
>>
>> or maybe just reuse default libc return convention: 0 - ok, -1 - fail
>> and drop custom macros
>>
> 
> Totally agree.
> 

OK, let's reuse default libc return convention.

-- 

Thanks,
Xiang




Re: [RESEND PATCH v21 5/6] target-arm: kvm64: handle SIGBUS signal from kernel or KVM

2019-12-02 Thread Xiang Zheng



On 2019/11/27 22:17, Beata Michalska wrote:
> Hi
> 
> On Wed, 27 Nov 2019 at 12:47, Xiang Zheng  wrote:
>>
>> Hi Beata,
>>
>> Thanks for your review!
>>
> YAW
> 
>> On 2019/11/22 23:47, Beata Michalska wrote:
>>> Hi,
>>>
>>> On Mon, 11 Nov 2019 at 01:48, Xiang Zheng  wrote:

 From: Dongjiu Geng 

 Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
 translates the host VA delivered by host to guest PA, then fills this PA
 to guest APEI GHES memory, then notifies guest according to the SIGBUS
 type.

 When guest accesses the poisoned memory, it will generate a Synchronous
 External Abort(SEA). Then host kernel gets an APEI notification and calls
memory_failure() to unmap the affected page in stage 2, finally
 returns to guest.

 Guest continues to access the PG_hwpoison page, it will trap to KVM as
 stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
 Qemu, Qemu records this error address into guest APEI GHES memory and
notifies guest using Synchronous-External-Abort(SEA).

 In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
 in which we can setup the type of exception and the syndrome information.
 When switching to guest, the target vcpu will jump to the synchronous
 external abort vector table entry.

 The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
 ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
 not valid and hold an UNKNOWN value. These values will be set to KVM
 register structures through KVM_SET_ONE_REG IOCTL.

 Signed-off-by: Dongjiu Geng 
 Signed-off-by: Xiang Zheng 
 Reviewed-by: Michael S. Tsirkin 
 ---
  hw/acpi/acpi_ghes.c | 297 
  include/hw/acpi/acpi_ghes.h |   4 +
  include/sysemu/kvm.h|   3 +-
  target/arm/cpu.h|   4 +
  target/arm/helper.c |   2 +-
  target/arm/internals.h  |   5 +-
  target/arm/kvm64.c  |  64 
  target/arm/tlb_helper.c |   2 +-
  target/i386/cpu.h   |   2 +
  9 files changed, 377 insertions(+), 6 deletions(-)

 diff --git a/hw/acpi/acpi_ghes.c b/hw/acpi/acpi_ghes.c
 index 42c00ff3d3..f5b54990c0 100644
 --- a/hw/acpi/acpi_ghes.c
 +++ b/hw/acpi/acpi_ghes.c
 @@ -39,6 +39,34 @@
  /* The max size in bytes for one error block */
  #define ACPI_GHES_MAX_RAW_DATA_LENGTH   0x1000

 +/*
 + * The total size of Generic Error Data Entry
 + * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
 + * Table 18-343 Generic Error Data Entry
 + */
 +#define ACPI_GHES_DATA_LENGTH   72
 +
 +/*
 + * The memory section CPER size,
 + * UEFI 2.6: N.2.5 Memory Error Section
 + */
 +#define ACPI_GHES_MEM_CPER_LENGTH   80
 +
 +/*
 + * Masks for block_status flags
 + */
 +#define ACPI_GEBS_UNCORRECTABLE 1
>>>
>>> Why not listing all supported statuses ? Similar to error severity below ?
>>>
>>
>> We now only use the first bit for uncorrectable error. The correctable errors
>> are handled in host and would not be delivered to QEMU.
>>
>> I think it's unnecessary to list all the bit masks.
> 
> I'm not sure we are using all the error severity types either, but fair enough.
>>
 +
 +/*
 + * Values for error_severity field
 + */
 +enum AcpiGenericErrorSeverity {
 +ACPI_CPER_SEV_RECOVERABLE,
 +ACPI_CPER_SEV_FATAL,
 +ACPI_CPER_SEV_CORRECTED,
 +ACPI_CPER_SEV_NONE,
 +};
 +
  /*
   * Now only support ARMv8 SEA notification type error source
   */
 @@ -49,6 +77,16 @@
   */
  #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10

 +#define UUID_BE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)\
 +{{{ ((a) >> 24) & 0xff, ((a) >> 16) & 0xff, ((a) >> 8) & 0xff, (a) & 0xff, \
 +((b) >> 8) & 0xff, (b) & 0xff,   \
 +((c) >> 8) & 0xff, (c) & 0xff,\
 +(d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } } }
 +
 +#define UEFI_CPER_SEC_PLATFORM_MEM   \
 +UUID_BE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
 +0xED, 0x7C, 0x83, 0xB1)
 +
  /*
> 
> As suggested in different thread - could this be also made common with
> NVMe code ?

Sure, I will make it common in a separate patch.

 @@ -1036,6 +1062,44 @@ int kvm_arch_get_registers(CPUState *cs)
  return ret;
  }

 +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 +{
 +ram_addr_t ram_addr;
 +hwaddr paddr;
 +
 +assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 +
 +if (acpi_enabled && addr &&
 +object_property_get_bool(qdev_get_machine(), "ras", 

Re: [for-5.0 2/4] spapr: Don't attempt to clamp RMA to VRMA constraint

2019-12-02 Thread Alexey Kardashevskiy



On 29/11/2019 12:35, David Gibson wrote:
> The Real Mode Area (RMA) is the part of memory which a guest can access
> when in real (MMU off) mode.  Of course, for a guest under KVM, the MMU
> isn't really turned off, it's just in a special translation mode - Virtual
> Real Mode Area (VRMA) - which looks like real mode in guest mode.
> 
> The mechanics of how this works when in Hashed Page Table (HPT) mode, put
> a constraint on the size of the RMA, which depends on the size of the HPT.
> So, the latter part of spapr_setup_hpt_and_vrma() clamps the RMA we
> advertise to the guest based on this VRMA limit.
> 
> There are several things wrong with this:
>  1) spapr_setup_hpt_and_vrma() doesn't actually clamp, it takes the minimum
> of Node 0 memory size and the VRMA limit.  That will *often* work the
> same as clamping, but there can be other constraints on RMA size which
> supersede Node 0 memory size.  We have real bugs caused by this
> (currently worked around in the guest kernel)
>  2) Some callers of spapr_setup_hpt_and_vrma() are in a situation where
> we're past the point that we can actually advertise an RMA limit to the
> guest
>  3) But most fundamentally, the VRMA limit depends on host configuration
> (page size) which shouldn't be visible to the guest, but this partially
> exposes it.  This can cause problems with migration in certain edge
> cases, although we will mostly get away with it.
> 
> In practice, this clamping is almost never applied anyway.  With 64kiB
> pages and the normal rules for sizing of the HPT, the theoretical VRMA
> limit will be 4x(guest memory size) and so never hit.  It will hit with
> 4kiB pages, where it will be (guest memory size)/4.  However all mainstream
> distro kernels for POWER have used a 64kiB page size for at least 10 years.
> 
> So, simply replace this logic with a check that the RMA we've calculated
> based only on guest visible configuration will fit within the host implied
> VRMA limit.  This can break if running HPT guests on a host kernel with
> 4kiB page size.  As noted that's very rare.  There also exist several
> possible workarounds:
>   * Change the host kernel to use 64kiB pages
>   * Use radix MMU (RPT) guests instead of HPT
>   * Use 64kiB hugepages on the host to back guest memory
>   * Increase the guest memory size so that the RMA hits one of the fixed
> limits before the RMA limit.  This is relatively easy on POWER8 which
> has a 16GiB limit, harder on POWER9 which has a 1TiB limit.
>   * Decrease guest memory size so that it's below the lower bound on VRMA
> limit (minimum HPT size is 256kiB, giving a minimum VRMA of 8MiB).
> Difficult in practice since modern guests tend to want 1-2GiB.
>   * Use a guest NUMA configuration which artificially constrains the RMA
> within the VRMA limit (the RMA must always fit within Node 0).
> 
> Previously, on KVM, we also temporarily reduced the rma_size to 256M so
> that the we'd load the kernel and initrd safely, regardless of the VRMA
> limit.  This was a) confusing, b) could significantly limit the size of
> images we could load and c) introduced a behavioural difference between
> KVM and TCG.  So we remove that as well.
> 
> Signed-off-by: David Gibson 



Reviewed-by: Alexey Kardashevskiy 



> ---
>  hw/ppc/spapr.c | 28 ++--
>  hw/ppc/spapr_hcall.c   |  4 ++--
>  include/hw/ppc/spapr.h |  3 +--
>  3 files changed, 13 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 069bd04a8d..52c39daa99 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1618,7 +1618,7 @@ void spapr_reallocate_hpt(SpaprMachineState *spapr, int shift,
>  spapr_set_all_lpcrs(0, LPCR_HR | LPCR_UPRT);
>  }
>  
> -void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
> +void spapr_setup_hpt(SpaprMachineState *spapr)
>  {
>  int hpt_shift;
>  
> @@ -1634,10 +1634,16 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
>  }
>  spapr_reallocate_hpt(spapr, hpt_shift, &error_fatal);
>  
> -if (spapr->vrma_adjust) {
> +if (kvm_enabled()) {
>  hwaddr vrma_limit = kvmppc_vrma_limit(spapr->htab_shift);
>  
> -spapr->rma_size = MIN(spapr_node0_size(MACHINE(spapr)), vrma_limit);
> +/* Check our RMA fits in the possible VRMA */
> +if (vrma_limit < spapr->rma_size) {
> +error_report("Unable to create %" HWADDR_PRIu
> + "MiB RMA (VRMA only allows %" HWADDR_PRIu "MiB",
> + spapr->rma_size / MiB, vrma_limit / MiB);
> +exit(EXIT_FAILURE);
> +}
>  }
>  }
>  
> @@ -1676,7 +1682,7 @@ static void spapr_machine_reset(MachineState *machine)
>  spapr->patb_entry = PATE1_GR;
>  spapr_set_all_lpcrs(LPCR_HR | LPCR_UPRT, LPCR_HR | LPCR_UPRT);
>  } else {
> -spapr_setup_hpt_and_vrma(spapr);
> +spapr_setup_hpt(spapr);
>  }
>  
>  

Re: [for-5.0 1/4] spapr,ppc: Simplify signature of kvmppc_rma_size()

2019-12-02 Thread Alexey Kardashevskiy



On 29/11/2019 12:35, David Gibson wrote:
> This function calculates the maximum size of the RMA as implied by the
> host's page size of structure of the VRMA (there are a number of other
> constraints on the RMA size which will supersede this one in many
> circumstances).
> 
> The current interface takes the current RMA size estimate, and clamps it
> to the VRMA derived size.  The only current caller passes in an arguably
> wrong value (it will match the current RMA estimate in some but not all
> cases).
> 
> We want to fix that, but for now just keep concerns separated by having the
> KVM helper function just return the VRMA derived limit, and let the caller
> combine it with other constraints.  We call the new function
> kvmppc_vrma_limit() to more clearly indicate its limited responsibility.
> 
> The helper should only ever be called in the KVM enabled case, so replace
> its !CONFIG_KVM stub with an assert() rather than a dummy value.
> 
> Signed-off-by: David Gibson 


Besides not compiling, otherwise looks good.


Reviewed-by: Alexey Kardashevskiy 




> ---
>  hw/ppc/spapr.c   | 5 +++--
>  target/ppc/kvm.c | 5 ++---
>  target/ppc/kvm_ppc.h | 7 +++
>  3 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d9c9a2bcee..069bd04a8d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1635,8 +1635,9 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
>  spapr_reallocate_hpt(spapr, hpt_shift, &error_fatal);
>  
>  if (spapr->vrma_adjust) {
> -spapr->rma_size = kvmppc_rma_size(spapr_node0_size(MACHINE(spapr)),
> -  spapr->htab_shift);
> +hwaddr vrma_limit = kvmppc_vrma_limit(spapr->htab_shift);
> +
> +spapr->rma_size = MIN(spapr_node0_size(MACHINE(spapr)), vrma_limit);
>  }
>  }
>  
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index c77f9848ec..09b3bd6443 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -2101,7 +2101,7 @@ void kvmppc_hint_smt_possible(Error **errp)
>  
>  
>  #ifdef TARGET_PPC64
> -uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
> +uint64_t kvmppc_vrma_limit(unsigned int hash_shift)
>  {
>  struct kvm_ppc_smmu_info info;
>  long rampagesize, best_page_shift;
> @@ -2128,8 +2128,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift)
>  }
>  }
>  
> -return MIN(current_size,
> -   1ULL << (best_page_shift + hash_shift - 7));
> +return 1ULL << (best_page_shift + hash_shift - 7));
>  }
>  #endif
>  
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 98bd7d5da6..4f0eec4c1b 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -45,7 +45,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
>int *pfd, bool need_vfio);
>  int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);
>  int kvmppc_reset_htab(int shift_hint);
> -uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
> +uint64_t kvmppc_vrma_limit(unsigned int hash_shift);
>  bool kvmppc_has_cap_spapr_vfio(void);
>  #endif /* !CONFIG_USER_ONLY */
>  bool kvmppc_has_cap_epr(void);
> @@ -241,10 +241,9 @@ static inline int kvmppc_reset_htab(int shift_hint)
>  return 0;
>  }
>  
> -static inline uint64_t kvmppc_rma_size(uint64_t current_size,
> -   unsigned int hash_shift)
> +static inline uint64_t kvmppc_vrma_limit(unsigned int hash_shift)
>  {
> -return ram_size;
> +g_assert_not_reached();
>  }
>  
>  static inline bool kvmppc_hpt_needs_host_contiguous_pages(void)
> 

-- 
Alexey



[PATCH v4 33/40] target/arm: check TGE and E2H flags for EL0 pauth traps

2019-12-02 Thread Richard Henderson
From: Alex Bennée 

According to ARM ARM we should only trap from the EL1&0 regime.

Signed-off-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/pauth_helper.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index 42c9141bb7..c3cb7c8d52 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -371,7 +371,10 @@ static void pauth_check_trap(CPUARMState *env, int el, uintptr_t ra)
 if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
 uint64_t hcr = arm_hcr_el2_eff(env);
 bool trap = !(hcr & HCR_API);
-/* FIXME: ARMv8.1-VHE: trap only applies to EL1&0 regime.  */
+if (el == 0) {
+/* Trap only applies to EL1&0 regime.  */
+trap &= (hcr & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE);
+}
 /* FIXME: ARMv8.3-NV: HCR_NV trap takes precedence for ERETA[AB].  */
 if (trap) {
 pauth_trap(env, 2, ra);
-- 
2.17.1




[PATCH v4 40/40] target/arm: Raise only one interrupt in arm_cpu_exec_interrupt

2019-12-02 Thread Richard Henderson
The fall through organization of this function meant that we
would raise an interrupt, then might overwrite that with another.
Since interrupt prioritization is IMPLEMENTATION DEFINED, we
can recognize these in any order we choose.

Unify the code to raise the interrupt in a block at the end.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.c | 30 --
 1 file changed, 12 insertions(+), 18 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index a366448c6d..f3360dbb98 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -535,17 +535,15 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 uint64_t hcr_el2 = arm_hcr_el2_eff(env);
 uint32_t target_el;
 uint32_t excp_idx;
-bool ret = false;
+
+/* The prioritization of interrupts is IMPLEMENTATION DEFINED. */
 
 if (interrupt_request & CPU_INTERRUPT_FIQ) {
 excp_idx = EXCP_FIQ;
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
 if (arm_excp_unmasked(cs, excp_idx, target_el,
   cur_el, secure, hcr_el2)) {
-cs->exception_index = excp_idx;
-env->exception.target_el = target_el;
-cc->do_interrupt(cs);
-ret = true;
+goto found;
 }
 }
 if (interrupt_request & CPU_INTERRUPT_HARD) {
@@ -553,10 +551,7 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
 if (arm_excp_unmasked(cs, excp_idx, target_el,
   cur_el, secure, hcr_el2)) {
-cs->exception_index = excp_idx;
-env->exception.target_el = target_el;
-cc->do_interrupt(cs);
-ret = true;
+goto found;
 }
 }
 if (interrupt_request & CPU_INTERRUPT_VIRQ) {
@@ -564,10 +559,7 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 target_el = 1;
 if (arm_excp_unmasked(cs, excp_idx, target_el,
   cur_el, secure, hcr_el2)) {
-cs->exception_index = excp_idx;
-env->exception.target_el = target_el;
-cc->do_interrupt(cs);
-ret = true;
+goto found;
 }
 }
 if (interrupt_request & CPU_INTERRUPT_VFIQ) {
@@ -575,14 +567,16 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 target_el = 1;
 if (arm_excp_unmasked(cs, excp_idx, target_el,
   cur_el, secure, hcr_el2)) {
-cs->exception_index = excp_idx;
-env->exception.target_el = target_el;
-cc->do_interrupt(cs);
-ret = true;
+goto found;
 }
 }
+return false;
 
-return ret;
+ found:
+cs->exception_index = excp_idx;
+env->exception.target_el = target_el;
+cc->do_interrupt(cs);
+return true;
 }
 
 #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
-- 
2.17.1




[PATCH v4 36/40] target/arm: Enable ARMv8.1-VHE in -cpu max

2019-12-02 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index a39d6fcea3..009411813f 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -670,6 +670,7 @@ static void aarch64_max_initfn(Object *obj)
 t = cpu->isar.id_aa64mmfr1;
 t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1); /* HPD */
 t = FIELD_DP64(t, ID_AA64MMFR1, LO, 1);
+t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
 cpu->isar.id_aa64mmfr1 = t;
 
 /* Replicate the same data to the 32-bit id registers.  */
-- 
2.17.1




[PATCH v4 18/40] target/arm: Reorganize ARMMMUIdx

2019-12-02 Thread Richard Henderson
Prepare for, but do not yet implement, the EL2&0 regime.
This involves adding the new MMUIdx enumerators and adjusting
some of the MMUIdx related predicates to match.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu-param.h |   2 +-
 target/arm/cpu.h   | 128 ++---
 target/arm/internals.h |  37 +++-
 target/arm/helper.c|  66 ++---
 target/arm/translate.c |   1 -
 5 files changed, 150 insertions(+), 84 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index 6e6948e960..18ac562346 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -29,6 +29,6 @@
 # define TARGET_PAGE_BITS_MIN  10
 #endif
 
-#define NB_MMU_MODES 8
+#define NB_MMU_MODES 9
 
 #endif
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 015301e93a..bf8eb57e3a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2778,7 +2778,9 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  *  + NonSecure EL1 & 0 stage 1
  *  + NonSecure EL1 & 0 stage 2
  *  + NonSecure EL2
- *  + Secure EL1 & EL0
+ *  + NonSecure EL2 & 0   (ARMv8.1-VHE)
+ *  + Secure EL0
+ *  + Secure EL1
  *  + Secure EL3
  * If EL3 is 32-bit:
  *  + NonSecure PL1 & 0 stage 1
@@ -2788,8 +2790,9 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  * (reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)
  *
  * For QEMU, an mmu_idx is not quite the same as a translation regime because:
- *  1. we need to split the "EL1 & 0" regimes into two mmu_idxes, because they
- * may differ in access permissions even if the VA->PA map is the same
+ *  1. we need to split the "EL1 & 0" and "EL2 & 0" regimes into two mmu_idxes,
+ * because they may differ in access permissions even if the VA->PA map is
+ * the same
  *  2. we want to cache in our TLB the full VA->IPA->PA lookup for a stage 1+2
  * translation, which means that we have one mmu_idx that deals with two
  * concatenated translation regimes [this sort of combined s1+2 TLB is
@@ -2801,19 +2804,23 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
  * translation regimes, because they map reasonably well to each other
  * and they can't both be active at the same time.
- * This gives us the following list of mmu_idx values:
+ *  5. we want to be able to use the TLB for accesses done as part of a
+ * stage1 page table walk, rather than having to walk the stage2 page
+ * table over and over.
  *
- * NS EL0 (aka NS PL0) stage 1+2
- * NS EL1 (aka NS PL1) stage 1+2
+ * This gives us the following list of cases:
+ *
+ * NS EL0 (aka NS PL0) EL1&0 stage 1+2
+ * NS EL1 (aka NS PL1) EL1&0 stage 1+2
+ * NS EL0 EL2&0
+ * NS EL2 EL2&0
  * NS EL2 (aka NS PL2)
- * S EL3 (aka S PL1)
  * S EL0 (aka S PL0)
  * S EL1 (not used if EL3 is 32 bit)
- * NS EL0+1 stage 2
+ * S EL3 (aka S PL1)
+ * NS EL0&1 stage 2
  *
- * (The last of these is an mmu_idx because we want to be able to use the TLB
- * for the accesses done as part of a stage 1 page table walk, rather than
- * having to walk the stage 2 page table over and over.)
+ * for a total of 9 different mmu_idx.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -2851,26 +2858,47 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  * For M profile we arrange them to have a bit for priv, a bit for negpri
  * and a bit for secure.
  */
-#define ARM_MMU_IDX_A 0x10 /* A profile */
-#define ARM_MMU_IDX_NOTLB 0x20 /* does not have a TLB */
-#define ARM_MMU_IDX_M 0x40 /* M profile */
+#define ARM_MMU_IDX_A 0x10  /* A profile */
+#define ARM_MMU_IDX_NOTLB 0x20  /* does not have a TLB */
+#define ARM_MMU_IDX_M 0x40  /* M profile */
 
-/* meanings of the bits for M profile mmu idx values */
-#define ARM_MMU_IDX_M_PRIV 0x1
+/* Meanings of the bits for M profile mmu idx values */
+#define ARM_MMU_IDX_M_PRIV   0x1
 #define ARM_MMU_IDX_M_NEGPRI 0x2
-#define ARM_MMU_IDX_M_S 0x4
+#define ARM_MMU_IDX_M_S  0x4  /* Secure */
 
-#define ARM_MMU_IDX_TYPE_MASK (~0x7)
-#define ARM_MMU_IDX_COREIDX_MASK 0x7
+#define ARM_MMU_IDX_TYPE_MASK \
+(ARM_MMU_IDX_A | ARM_MMU_IDX_M | ARM_MMU_IDX_NOTLB)
+#define ARM_MMU_IDX_COREIDX_MASK 0xf
 
 typedef enum ARMMMUIdx {
+/*
+ * A-profile.
+ */
 ARMMMUIdx_EL10_0 = 0 | ARM_MMU_IDX_A,
-ARMMMUIdx_EL10_1 = 1 | ARM_MMU_IDX_A,
-ARMMMUIdx_E2 = 2 | ARM_MMU_IDX_A,
-ARMMMUIdx_SE3 = 3 | ARM_MMU_IDX_A,
-ARMMMUIdx_SE0 = 4 | ARM_MMU_IDX_A,
-ARMMMUIdx_SE1 = 5 | ARM_MMU_IDX_A,
-ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
+ARMMMUIdx_EL20_0 = 1 | ARM_MMU_IDX_A,
+
+ARMMMUIdx_EL10_1 = 2 | ARM_MMU_IDX_A,
+
+ARMMMUIdx_E2 = 3 | ARM_MMU_IDX_A,
+ARMMMUIdx_EL20_2 = 4 | ARM_MMU_IDX_A,
+
+ARMMMUIdx_SE0 =5 | ARM_MMU_IDX_A,

[PATCH v4 37/40] target/arm: Move arm_excp_unmasked to cpu.c

2019-12-02 Thread Richard Henderson
This inline function has one user in cpu.c, and need not be exposed
otherwise.  Code movement only, with fixups for checkpatch.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 111 ---
 target/arm/cpu.c | 119 +++
 2 files changed, 119 insertions(+), 111 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 8e5aaaf415..22935e4433 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2673,117 +2673,6 @@ bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
 #define ARM_CPUID_TI915T  0x54029152
 #define ARM_CPUID_TI925T  0x54029252
 
-static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
- unsigned int target_el)
-{
-CPUARMState *env = cs->env_ptr;
-unsigned int cur_el = arm_current_el(env);
-bool secure = arm_is_secure(env);
-bool pstate_unmasked;
-int8_t unmasked = 0;
-uint64_t hcr_el2;
-
-/* Don't take exceptions if they target a lower EL.
- * This check should catch any exceptions that would not be taken but left
- * pending.
- */
-if (cur_el > target_el) {
-return false;
-}
-
-hcr_el2 = arm_hcr_el2_eff(env);
-
-switch (excp_idx) {
-case EXCP_FIQ:
-pstate_unmasked = !(env->daif & PSTATE_F);
-break;
-
-case EXCP_IRQ:
-pstate_unmasked = !(env->daif & PSTATE_I);
-break;
-
-case EXCP_VFIQ:
-if (secure || !(hcr_el2 & HCR_FMO) || (hcr_el2 & HCR_TGE)) {
-/* VFIQs are only taken when hypervized and non-secure.  */
-return false;
-}
-return !(env->daif & PSTATE_F);
-case EXCP_VIRQ:
-if (secure || !(hcr_el2 & HCR_IMO) || (hcr_el2 & HCR_TGE)) {
-/* VIRQs are only taken when hypervized and non-secure.  */
-return false;
-}
-return !(env->daif & PSTATE_I);
-default:
-g_assert_not_reached();
-}
-
-/* Use the target EL, current execution state and SCR/HCR settings to
- * determine whether the corresponding CPSR bit is used to mask the
- * interrupt.
- */
-if ((target_el > cur_el) && (target_el != 1)) {
-/* Exceptions targeting a higher EL may not be maskable */
-if (arm_feature(env, ARM_FEATURE_AARCH64)) {
-/* 64-bit masking rules are simple: exceptions to EL3
- * can't be masked, and exceptions to EL2 can only be
- * masked from Secure state. The HCR and SCR settings
- * don't affect the masking logic, only the interrupt routing.
- */
-if (target_el == 3 || !secure) {
-unmasked = 1;
-}
-} else {
-/* The old 32-bit-only environment has a more complicated
- * masking setup. HCR and SCR bits not only affect interrupt
- * routing but also change the behaviour of masking.
- */
-bool hcr, scr;
-
-switch (excp_idx) {
-case EXCP_FIQ:
-/* If FIQs are routed to EL3 or EL2 then there are cases where
- * we override the CPSR.F in determining if the exception is
- * masked or not. If neither of these are set then we fall back
- * to the CPSR.F setting otherwise we further assess the state
- * below.
- */
-hcr = hcr_el2 & HCR_FMO;
-scr = (env->cp15.scr_el3 & SCR_FIQ);
-
-/* When EL3 is 32-bit, the SCR.FW bit controls whether the
- * CPSR.F bit masks FIQ interrupts when taken in non-secure
- * state. If SCR.FW is set then FIQs can be masked by CPSR.F
- * when non-secure but only when FIQs are only routed to EL3.
- */
-scr = scr && !((env->cp15.scr_el3 & SCR_FW) && !hcr);
-break;
-case EXCP_IRQ:
-/* When EL3 execution state is 32-bit, if HCR.IMO is set then
- * we may override the CPSR.I masking when in non-secure state.
- * The SCR.IRQ setting has already been taken into consideration
- * when setting the target EL, so it does not have a further
- * affect here.
- */
-hcr = hcr_el2 & HCR_IMO;
-scr = false;
-break;
-default:
-g_assert_not_reached();
-}
-
-if ((scr || hcr) && !secure) {
-unmasked = 1;
-}
-}
-}
-
-/* The PSTATE bits only mask the interrupt if we have not overridden the
- * ability above.
- */
-return unmasked || pstate_unmasked;
-}
-
 #define ARM_CPU_TYPE_SUFFIX "-" TYPE_ARM_CPU
 #define ARM_CPU_TYPE_NAME(name) (name ARM_CPU_TYPE_SUFFIX)
 #define CPU_RESOLVING_TYPE TYPE_ARM_CPU
diff --git 

[PATCH v4 31/40] target/arm: Update arm_phys_excp_target_el for TGE

2019-12-02 Thread Richard Henderson
The TGE bit routes all asynchronous exceptions to EL2.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index b059d9f81a..e0b8c81c5f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8316,6 +8316,12 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
 break;
 };
 
+/*
+ * For these purposes, TGE and AMO/IMO/FMO both force the
+ * interrupt to EL2.  Fold TGE into the bit extracted above.
+ */
+hcr |= (hcr_el2 & HCR_TGE) != 0;
+
 /* Perform a table-lookup for the target EL given the current state */
 target_el = target_el_table[is64][scr][rw][hcr][secure][cur_el];
 
-- 
2.17.1




[PATCH v4 34/40] target/arm: Update get_a64_user_mem_index for VHE

2019-12-02 Thread Richard Henderson
The EL2&0 translation regime is affected by Load Register (unpriv).

The code structure used here will facilitate later changes in this
area for implementing UAO and NV.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  1 +
 target/arm/translate.h |  2 ++
 target/arm/helper.c| 22 +++
 target/arm/translate-a64.c | 44 --
 4 files changed, 53 insertions(+), 16 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bb5a72520e..8e5aaaf415 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3239,6 +3239,7 @@ FIELD(TBFLAG_A64, PAUTH_ACTIVE, 8, 1)
 FIELD(TBFLAG_A64, BT, 9, 1)
 FIELD(TBFLAG_A64, BTYPE, 10, 2) /* Not cached. */
 FIELD(TBFLAG_A64, TBID, 12, 2)
+FIELD(TBFLAG_A64, UNPRIV, 14, 1)
 
 static inline bool bswap_code(bool sctlr_b)
 {
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 3760159661..d31d9ad858 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -73,6 +73,8 @@ typedef struct DisasContext {
  * ie A64 LDX*, LDAX*, A32/T32 LDREX*, LDAEX*.
  */
 bool is_ldex;
+/* True if AccType_UNPRIV should be used for LDTR et al */
+bool unpriv;
 /* True if v8.3-PAuth is active.  */
 bool pauth_active;
 /* True with v8.5-BTI and SCTLR_ELx.BT* set.  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3e025eb22e..f2d18bd51a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11879,6 +11879,28 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
 }
 }
 
+/* Compute the condition for using AccType_UNPRIV for LDTR et al. */
+/* TODO: ARMv8.2-UAO */
+switch (mmu_idx) {
+case ARMMMUIdx_EL10_1:
+case ARMMMUIdx_SE1:
+/* TODO: ARMv8.3-NV */
+flags = FIELD_DP32(flags, TBFLAG_A64, UNPRIV, 1);
+break;
+case ARMMMUIdx_EL20_2:
+/* TODO: ARMv8.4-SecEL2 */
+/*
+ * Note that EL20_2 is gated by HCR_EL2.E2H == 1, but EL20_0 is
+ * gated by HCR_EL2.<E2H,TGE> == '11', and so is LDTR.
+ */
+if (env->cp15.hcr_el2 & HCR_TGE) {
+flags = FIELD_DP32(flags, TBFLAG_A64, UNPRIV, 1);
+}
+break;
+default:
+break;
+}
+
 return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
 }
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d0b65c49e2..fe492bea90 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -105,25 +105,36 @@ void a64_translate_init(void)
 offsetof(CPUARMState, exclusive_high), "exclusive_high");
 }
 
-static inline int get_a64_user_mem_index(DisasContext *s)
+/*
+ * Return the core mmu_idx to use for A64 "unprivileged load/store" insns
+ */
+static int get_a64_user_mem_index(DisasContext *s)
 {
-/* Return the core mmu_idx to use for A64 "unprivileged load/store" insns:
- *  if EL1, access as if EL0; otherwise access at current EL
+/*
+ * If AccType_UNPRIV is not used, the insn uses AccType_NORMAL,
+ * which is the usual mmu_idx for this cpu state.
  */
-ARMMMUIdx useridx;
+ARMMMUIdx useridx = s->mmu_idx;
 
-switch (s->mmu_idx) {
-case ARMMMUIdx_EL10_1:
-useridx = ARMMMUIdx_EL10_0;
-break;
-case ARMMMUIdx_SE1:
-useridx = ARMMMUIdx_SE0;
-break;
-case ARMMMUIdx_Stage2:
-g_assert_not_reached();
-default:
-useridx = s->mmu_idx;
-break;
+if (s->unpriv) {
+/*
+ * We have pre-computed the condition for AccType_UNPRIV.
+ * Therefore we should never get here with a mmu_idx for
+ * which we do not know the corresponding user mmu_idx.
+ */
+switch (useridx) {
+case ARMMMUIdx_EL10_1:
+useridx = ARMMMUIdx_EL10_0;
+break;
+case ARMMMUIdx_EL20_2:
+useridx = ARMMMUIdx_EL20_0;
+break;
+case ARMMMUIdx_SE1:
+useridx = ARMMMUIdx_SE0;
+break;
+default:
+g_assert_not_reached();
+}
 }
 return arm_to_core_mmu_idx(useridx);
 }
@@ -14169,6 +14180,7 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
 dc->pauth_active = FIELD_EX32(tb_flags, TBFLAG_A64, PAUTH_ACTIVE);
 dc->bt = FIELD_EX32(tb_flags, TBFLAG_A64, BT);
 dc->btype = FIELD_EX32(tb_flags, TBFLAG_A64, BTYPE);
+dc->unpriv = FIELD_EX32(tb_flags, TBFLAG_A64, UNPRIV);
 dc->vec_len = 0;
 dc->vec_stride = 0;
 dc->cp_regs = arm_cpu->cp_regs;
-- 
2.17.1




[PATCH v4 39/40] target/arm: Use bool for unmasked in arm_excp_unmasked

2019-12-02 Thread Richard Henderson
The value computed is fully boolean; using int8_t is odd.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7a1177b883..a366448c6d 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -417,7 +417,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
 {
 CPUARMState *env = cs->env_ptr;
 bool pstate_unmasked;
-int8_t unmasked = 0;
+bool unmasked = false;
 
 /*
  * Don't take exceptions if they target a lower EL.
@@ -468,7 +468,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
  * don't affect the masking logic, only the interrupt routing.
  */
 if (target_el == 3 || !secure) {
-unmasked = 1;
+unmasked = true;
 }
 } else {
 /*
@@ -514,7 +514,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
 }
 
 if ((scr || hcr) && !secure) {
-unmasked = 1;
+unmasked = true;
 }
 }
 }
-- 
2.17.1




[PATCH v4 32/40] target/arm: Update {fp,sve}_exception_el for VHE

2019-12-02 Thread Richard Henderson
When TGE+E2H are both set, CPACR_EL1 is ignored.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 53 -
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index e0b8c81c5f..3e025eb22e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5743,7 +5743,9 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
 int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
-if (el <= 1) {
+uint64_t hcr_el2 = arm_hcr_el2_eff(env);
+
+if (el <= 1 && (hcr_el2 & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
 bool disabled = false;
 
 /* The CPACR.ZEN controls traps to EL1:
@@ -5758,8 +5760,7 @@ int sve_exception_el(CPUARMState *env, int el)
 }
 if (disabled) {
 /* route_to_el2 */
-return (arm_feature(env, ARM_FEATURE_EL2)
-&& (arm_hcr_el2_eff(env) & HCR_TGE) ? 2 : 1);
+return hcr_el2 & HCR_TGE ? 2 : 1;
 }
 
 /* Check CPACR.FPEN.  */
@@ -11565,8 +11566,6 @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
 int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
-int fpen;
-
 /* CPACR and the CPTR registers don't exist before v6, so FP is
  * always accessible
  */
@@ -11594,30 +11593,34 @@ int fp_exception_el(CPUARMState *env, int cur_el)
  * 0, 2 : trap EL0 and EL1/PL1 accesses
  * 1: trap only EL0 accesses
  * 3: trap no accesses
+ * This register is ignored if E2H+TGE are both set.
  */
-fpen = extract32(env->cp15.cpacr_el1, 20, 2);
-switch (fpen) {
-case 0:
-case 2:
-if (cur_el == 0 || cur_el == 1) {
-/* Trap to PL1, which might be EL1 or EL3 */
-if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+if ((arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
+int fpen = extract32(env->cp15.cpacr_el1, 20, 2);
+
+switch (fpen) {
+case 0:
+case 2:
+if (cur_el == 0 || cur_el == 1) {
+/* Trap to PL1, which might be EL1 or EL3 */
+if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+return 3;
+}
+return 1;
+}
+if (cur_el == 3 && !is_a64(env)) {
+/* Secure PL1 running at EL3 */
 return 3;
 }
-return 1;
+break;
+case 1:
+if (cur_el == 0) {
+return 1;
+}
+break;
+case 3:
+break;
 }
-if (cur_el == 3 && !is_a64(env)) {
-/* Secure PL1 running at EL3 */
-return 3;
-}
-break;
-case 1:
-if (cur_el == 0) {
-return 1;
-}
-break;
-case 3:
-break;
 }
 
 /*
-- 
2.17.1




[PATCH v4 29/40] target/arm: Flush tlb for ASID changes in EL2&0 translation regime

2019-12-02 Thread Richard Henderson
Since we only support a single ASID, flush the tlb when it changes.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 9df55a8d6b..2a4d4c2c0d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3740,6 +3740,15 @@ static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void vmsa_tcr_ttbr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
+/*
+ * If we are running with E2&0 regime, then the ASID is active.
+ * Flush if that changes.
+ */
+if ((arm_hcr_el2_eff(env) & HCR_E2H) &&
+extract64(raw_read(env, ri) ^ value, 48, 16)) {
+tlb_flush_by_mmuidx(env_cpu(env),
+ARMMMUIdxBit_EL20_2 | ARMMMUIdxBit_EL20_0);
+}
 raw_write(env, ri, value);
 }
 
-- 
2.17.1




[PATCH v4 27/40] target/arm: Add VHE system register redirection and aliasing

2019-12-02 Thread Richard Henderson
Several of the EL1/0 registers are redirected to the EL2 version when in
EL2 and HCR_EL2.E2H is set.  Many of these registers have side effects.
Link together the two ARMCPRegInfo structures after they have been
properly instantiated.  Install common dispatch routines to all of the
relevant registers.

The same set of registers that are redirected also have additional
EL12/EL02 aliases created to access the original register that was
redirected.

Omit the generic timer registers from redirection here, because we'll
need multiple kinds of redirection from both EL0 and EL2.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h|  44 
 target/arm/helper.c | 162 
 2 files changed, 193 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4bd1bf915c..bb5a72520e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2488,19 +2488,6 @@ struct ARMCPRegInfo {
  */
 ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
 
-/* Offsets of the secure and non-secure fields in CPUARMState for the
- * register if it is banked.  These fields are only used during the static
- * registration of a register.  During hashing the bank associated
- * with a given security state is copied to fieldoffset which is used from
- * there on out.
- *
- * It is expected that register definitions use either fieldoffset or
- * bank_fieldoffsets in the definition but not both.  It is also expected
- * that both bank offsets are set when defining a banked register.  This
- * use indicates that a register is banked.
- */
-ptrdiff_t bank_fieldoffsets[2];
-
 /* Function for making any access checks for this register in addition to
  * those specified by the 'access' permissions bits. If NULL, no extra
  * checks required. The access check is performed at runtime, not at
@@ -2535,6 +2522,37 @@ struct ARMCPRegInfo {
  * fieldoffset is 0 then no reset will be done.
  */
 CPResetFn *resetfn;
+
+union {
+/*
+ * Offsets of the secure and non-secure fields in CPUARMState for
+ * the register if it is banked.  These fields are only used during
+ * the static registration of a register.  During hashing the bank
+ * associated with a given security state is copied to fieldoffset
+ * which is used from there on out.
+ *
+ * It is expected that register definitions use either fieldoffset
+ * or bank_fieldoffsets in the definition but not both.  It is also
+ * expected that both bank offsets are set when defining a banked
+ * register.  This use indicates that a register is banked.
+ */
+ptrdiff_t bank_fieldoffsets[2];
+
+/*
+ * "Original" writefn and readfn.
+ * For ARMv8.1-VHE register aliases, we overwrite the read/write
+ * accessor functions of various EL1/EL0 to perform the runtime
+ * check for which sysreg should actually be modified, and then
+ * forwards the operation.  Before overwriting the accessors,
+ * the original function is copied here, so that accesses that
+ * really do go to the EL1/EL0 version proceed normally.
+ * (The corresponding EL2 register is linked via opaque.)
+ */
+struct {
+CPReadFn *orig_readfn;
+CPWriteFn *orig_writefn;
+};
+};
 };
 
 /* Macros which are lvalues for the field in CPUARMState for the
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 1812588fa1..0baf188078 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5306,6 +5306,158 @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
 REGINFO_SENTINEL
 };
 
+#ifndef CONFIG_USER_ONLY
+/* Test if system register redirection is to occur in the current state.  */
+static bool redirect_for_e2h(CPUARMState *env)
+{
+return arm_current_el(env) == 2 && (arm_hcr_el2_eff(env) & HCR_E2H);
+}
+
+static uint64_t el2_e2h_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+CPReadFn *readfn;
+
+if (redirect_for_e2h(env)) {
+/* Switch to the saved EL2 version of the register.  */
+ri = ri->opaque;
+readfn = ri->readfn;
+} else {
+readfn = ri->orig_readfn;
+}
+if (readfn == NULL) {
+readfn = raw_read;
+}
+return readfn(env, ri);
+}
+
+static void el2_e2h_write(CPUARMState *env, const ARMCPRegInfo *ri,
+  uint64_t value)
+{
+CPWriteFn *writefn;
+
+if (redirect_for_e2h(env)) {
+/* Switch to the saved EL2 version of the register.  */
+ri = ri->opaque;
+writefn = ri->writefn;
+} else {
+writefn = ri->orig_writefn;
+}
+if (writefn == NULL) {
+writefn = raw_write;
+}
+writefn(env, ri, value);
+}
+
+static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
+{
+struct E2HAlias {
+uint32_t 

[PATCH v4 35/40] target/arm: Update arm_cpu_do_interrupt_aarch64 for VHE

2019-12-02 Thread Richard Henderson
When VHE is enabled, we need to take the aa32-ness of EL0
from PSTATE not HCR_EL2, which is controlling EL1.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index f2d18bd51a..f3785d5ad6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8887,14 +8887,19 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
  * immediately lower than the target level is using AArch32 or AArch64
  */
 bool is_aa64;
+uint64_t hcr;
 
 switch (new_el) {
 case 3:
 is_aa64 = (env->cp15.scr_el3 & SCR_RW) != 0;
 break;
 case 2:
-is_aa64 = (env->cp15.hcr_el2 & HCR_RW) != 0;
-break;
+hcr = arm_hcr_el2_eff(env);
+if ((hcr & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
+is_aa64 = (hcr & HCR_RW) != 0;
+break;
+}
+/* fall through */
 case 1:
 is_aa64 = is_a64(env);
 break;
-- 
2.17.1




[PATCH v4 38/40] target/arm: Pass more cpu state to arm_excp_unmasked

2019-12-02 Thread Richard Henderson
Avoid redundant computation of cpu state by passing it in
from the caller, which has already computed it for itself.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index a36344d4c7..7a1177b883 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -411,14 +411,13 @@ static void arm_cpu_reset(CPUState *s)
 }
 
 static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
- unsigned int target_el)
+ unsigned int target_el,
+ unsigned int cur_el, bool secure,
+ uint64_t hcr_el2)
 {
 CPUARMState *env = cs->env_ptr;
-unsigned int cur_el = arm_current_el(env);
-bool secure = arm_is_secure(env);
 bool pstate_unmasked;
 int8_t unmasked = 0;
-uint64_t hcr_el2;
 
 /*
  * Don't take exceptions if they target a lower EL.
@@ -429,8 +428,6 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
 return false;
 }
 
-hcr_el2 = arm_hcr_el2_eff(env);
-
 switch (excp_idx) {
 case EXCP_FIQ:
 pstate_unmasked = !(env->daif & PSTATE_F);
@@ -535,6 +532,7 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 CPUARMState *env = cs->env_ptr;
 uint32_t cur_el = arm_current_el(env);
 bool secure = arm_is_secure(env);
+uint64_t hcr_el2 = arm_hcr_el2_eff(env);
 uint32_t target_el;
 uint32_t excp_idx;
 bool ret = false;
@@ -542,7 +540,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 if (interrupt_request & CPU_INTERRUPT_FIQ) {
 excp_idx = EXCP_FIQ;
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
 cs->exception_index = excp_idx;
 env->exception.target_el = target_el;
 cc->do_interrupt(cs);
@@ -552,7 +551,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 if (interrupt_request & CPU_INTERRUPT_HARD) {
 excp_idx = EXCP_IRQ;
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
 cs->exception_index = excp_idx;
 env->exception.target_el = target_el;
 cc->do_interrupt(cs);
@@ -562,7 +562,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 if (interrupt_request & CPU_INTERRUPT_VIRQ) {
 excp_idx = EXCP_VIRQ;
 target_el = 1;
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
 cs->exception_index = excp_idx;
 env->exception.target_el = target_el;
 cc->do_interrupt(cs);
@@ -572,7 +573,8 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 if (interrupt_request & CPU_INTERRUPT_VFIQ) {
 excp_idx = EXCP_VFIQ;
 target_el = 1;
-if (arm_excp_unmasked(cs, excp_idx, target_el)) {
+if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
 cs->exception_index = excp_idx;
 env->exception.target_el = target_el;
 cc->do_interrupt(cs);
-- 
2.17.1




[PATCH v4 28/40] target/arm: Add VHE timer register redirection and aliasing

2019-12-02 Thread Richard Henderson
Apart from the wholesale redirection that HCR_EL2.E2H performs
for EL2, there's a separate redirection specific to the timers
that happens for EL0 when running in the EL2&0 regime.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 191 +---
 1 file changed, 179 insertions(+), 12 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0baf188078..9df55a8d6b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2655,6 +2655,70 @@ static void gt_phys_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
 gt_ctl_write(env, ri, GTIMER_PHYS, value);
 }
 
+static int gt_phys_redir_timeridx(CPUARMState *env)
+{
+switch (arm_mmu_idx(env)) {
+case ARMMMUIdx_EL20_0:
+case ARMMMUIdx_EL20_2:
+return GTIMER_HYP;
+default:
+return GTIMER_PHYS;
+}
+}
+
+static int gt_virt_redir_timeridx(CPUARMState *env)
+{
+switch (arm_mmu_idx(env)) {
+case ARMMMUIdx_EL20_0:
+case ARMMMUIdx_EL20_2:
+return GTIMER_HYPVIRT;
+default:
+return GTIMER_VIRT;
+}
+}
+
+static uint64_t gt_phys_redir_cval_read(CPUARMState *env,
+const ARMCPRegInfo *ri)
+{
+int timeridx = gt_phys_redir_timeridx(env);
+return env->cp15.c14_timer[timeridx].cval;
+}
+
+static void gt_phys_redir_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+int timeridx = gt_phys_redir_timeridx(env);
+gt_cval_write(env, ri, timeridx, value);
+}
+
+static uint64_t gt_phys_redir_tval_read(CPUARMState *env,
+const ARMCPRegInfo *ri)
+{
+int timeridx = gt_phys_redir_timeridx(env);
+return gt_tval_read(env, ri, timeridx);
+}
+
+static void gt_phys_redir_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+int timeridx = gt_phys_redir_timeridx(env);
+gt_tval_write(env, ri, timeridx, value);
+}
+
+static uint64_t gt_phys_redir_ctl_read(CPUARMState *env,
+   const ARMCPRegInfo *ri)
+{
+int timeridx = gt_phys_redir_timeridx(env);
+return env->cp15.c14_timer[timeridx].ctl;
+}
+
+static void gt_phys_redir_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+int timeridx = gt_phys_redir_timeridx(env);
+gt_ctl_write(env, ri, timeridx, value);
+}
+
 static void gt_virt_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
 {
 gt_timer_reset(env, ri, GTIMER_VIRT);
@@ -2693,6 +2757,48 @@ static void gt_cntvoff_write(CPUARMState *env, const ARMCPRegInfo *ri,
 gt_recalc_timer(cpu, GTIMER_VIRT);
 }
 
+static uint64_t gt_virt_redir_cval_read(CPUARMState *env,
+const ARMCPRegInfo *ri)
+{
+int timeridx = gt_virt_redir_timeridx(env);
+return env->cp15.c14_timer[timeridx].cval;
+}
+
+static void gt_virt_redir_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+int timeridx = gt_virt_redir_timeridx(env);
+gt_cval_write(env, ri, timeridx, value);
+}
+
+static uint64_t gt_virt_redir_tval_read(CPUARMState *env,
+const ARMCPRegInfo *ri)
+{
+int timeridx = gt_virt_redir_timeridx(env);
+return gt_tval_read(env, ri, timeridx);
+}
+
+static void gt_virt_redir_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+int timeridx = gt_virt_redir_timeridx(env);
+gt_tval_write(env, ri, timeridx, value);
+}
+
+static uint64_t gt_virt_redir_ctl_read(CPUARMState *env,
+   const ARMCPRegInfo *ri)
+{
+int timeridx = gt_virt_redir_timeridx(env);
+return env->cp15.c14_timer[timeridx].ctl;
+}
+
+static void gt_virt_redir_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+int timeridx = gt_virt_redir_timeridx(env);
+gt_ctl_write(env, ri, timeridx, value);
+}
+
 static void gt_hyp_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
 {
 gt_timer_reset(env, ri, GTIMER_HYP);
@@ -2842,7 +2948,8 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
   .accessfn = gt_ptimer_access,
   .fieldoffset = offsetoflow32(CPUARMState,
cp15.c14_timer[GTIMER_PHYS].ctl),
-  .writefn = gt_phys_ctl_write, .raw_writefn = raw_write,
+  .readfn = gt_phys_redir_ctl_read, .raw_readfn = raw_read,
+  .writefn = gt_phys_redir_ctl_write, .raw_writefn = raw_write,
 },
 { .name = "CNTP_CTL_S",
   .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1,
@@ -2859,14 +2966,16 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
   .accessfn = gt_ptimer_access,
   .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].ctl),
   .resetvalue 

[PATCH v4 25/40] target/arm: Update timer access for VHE

2019-12-02 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 102 +++-
 1 file changed, 81 insertions(+), 21 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index a4a7f82661..023b8963cf 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2287,10 +2287,18 @@ static CPAccessResult gt_cntfrq_access(CPUARMState *env, const ARMCPRegInfo *ri,
  * Writable only at the highest implemented exception level.
  */
 int el = arm_current_el(env);
+uint64_t hcr;
+uint32_t cntkctl;
 
 switch (el) {
 case 0:
-if (!extract32(env->cp15.c14_cntkctl, 0, 2)) {
+hcr = arm_hcr_el2_eff(env);
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+cntkctl = env->cp15.cnthctl_el2;
+} else {
+cntkctl = env->cp15.c14_cntkctl;
+}
+if (!extract32(cntkctl, 0, 2)) {
 return CP_ACCESS_TRAP;
 }
 break;
@@ -2318,17 +2326,47 @@ static CPAccessResult gt_counter_access(CPUARMState *env, int timeridx,
 {
 unsigned int cur_el = arm_current_el(env);
 bool secure = arm_is_secure(env);
+uint64_t hcr = arm_hcr_el2_eff(env);
 
-/* CNT[PV]CT: not visible from PL0 if ELO[PV]CTEN is zero */
-if (cur_el == 0 &&
-!extract32(env->cp15.c14_cntkctl, timeridx, 1)) {
-return CP_ACCESS_TRAP;
-}
+switch (cur_el) {
+case 0:
+/* If HCR_EL2.<E2H,TGE> == '11': check CNTHCTL_EL2.EL0[PV]CTEN. */
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+return (extract32(env->cp15.cnthctl_el2, timeridx, 1)
+? CP_ACCESS_OK : CP_ACCESS_TRAP_EL2);
+}
 
-if (arm_feature(env, ARM_FEATURE_EL2) &&
-timeridx == GTIMER_PHYS && !secure && cur_el < 2 &&
-!extract32(env->cp15.cnthctl_el2, 0, 1)) {
-return CP_ACCESS_TRAP_EL2;
+/* CNT[PV]CT: not visible from PL0 if EL0[PV]CTEN is zero */
+if (!extract32(env->cp15.c14_cntkctl, timeridx, 1)) {
+return CP_ACCESS_TRAP;
+}
+
+/* If HCR_EL2.<E2H,TGE> == '10': check CNTHCTL_EL2.EL1PCTEN. */
+if (hcr & HCR_E2H) {
+if (timeridx == GTIMER_PHYS &&
+!extract32(env->cp15.cnthctl_el2, 10, 1)) {
+return CP_ACCESS_TRAP_EL2;
+}
+} else {
+/* If HCR_EL2.<E2H> == 0: check CNTHCTL_EL2.EL1PCEN. */
+if (arm_feature(env, ARM_FEATURE_EL2) &&
+timeridx == GTIMER_PHYS && !secure &&
+!extract32(env->cp15.cnthctl_el2, 1, 1)) {
+return CP_ACCESS_TRAP_EL2;
+}
+}
+break;
+
+case 1:
+/* Check CNTHCTL_EL2.EL1PCTEN, which changes location based on E2H. */
+if (arm_feature(env, ARM_FEATURE_EL2) &&
+timeridx == GTIMER_PHYS && !secure &&
+(hcr & HCR_E2H
+ ? !extract32(env->cp15.cnthctl_el2, 10, 1)
+ : !extract32(env->cp15.cnthctl_el2, 0, 1))) {
+return CP_ACCESS_TRAP_EL2;
+}
+break;
 }
 return CP_ACCESS_OK;
 }
@@ -2338,19 +2376,41 @@ static CPAccessResult gt_timer_access(CPUARMState *env, int timeridx,
 {
 unsigned int cur_el = arm_current_el(env);
 bool secure = arm_is_secure(env);
+uint64_t hcr = arm_hcr_el2_eff(env);
 
-/* CNT[PV]_CVAL, CNT[PV]_CTL, CNT[PV]_TVAL: not visible from PL0 if
- * EL0[PV]TEN is zero.
- */
-if (cur_el == 0 &&
-!extract32(env->cp15.c14_cntkctl, 9 - timeridx, 1)) {
-return CP_ACCESS_TRAP;
-}
+switch (cur_el) {
+case 0:
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+/* If HCR_EL2.<E2H,TGE> == '11': check CNTHCTL_EL2.EL0[PV]TEN. */
+return (extract32(env->cp15.cnthctl_el2, 9 - timeridx, 1)
+? CP_ACCESS_OK : CP_ACCESS_TRAP_EL2);
+}
 
-if (arm_feature(env, ARM_FEATURE_EL2) &&
-timeridx == GTIMER_PHYS && !secure && cur_el < 2 &&
-!extract32(env->cp15.cnthctl_el2, 1, 1)) {
-return CP_ACCESS_TRAP_EL2;
+/*
+ * CNT[PV]_CVAL, CNT[PV]_CTL, CNT[PV]_TVAL: not visible from
+ * EL0 if EL0[PV]TEN is zero.
+ */
+if (!extract32(env->cp15.c14_cntkctl, 9 - timeridx, 1)) {
+return CP_ACCESS_TRAP;
+}
+/* fall through */
+
+case 1:
+if (arm_feature(env, ARM_FEATURE_EL2) &&
+timeridx == GTIMER_PHYS && !secure) {
+if (hcr & HCR_E2H) {
+/* If HCR_EL2.<E2H,TGE> == '10': check CNTHCTL_EL2.EL1PTEN. */
+if (!extract32(env->cp15.cnthctl_el2, 11, 1)) {
+return CP_ACCESS_TRAP_EL2;
+}
+} else {
+/* If HCR_EL2.<E2H> == 0: check CNTHCTL_EL2.EL1PCEN. */
+if (!extract32(env->cp15.cnthctl_el2, 1, 1)) {
+return CP_ACCESS_TRAP_EL2;
+}
+}
+}
+break;

[PATCH v4 17/40] target/arm: Tidy ARMMMUIdx m-profile definitions

2019-12-02 Thread Richard Henderson
Replace the magic numbers with the relevant ARM_MMU_IDX_M_* constants.
Keep the definitions short by referencing previous symbols.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 6ba5126852..015301e93a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2871,14 +2871,14 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_SE0 = 4 | ARM_MMU_IDX_A,
 ARMMMUIdx_SE1 = 5 | ARM_MMU_IDX_A,
 ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
-ARMMMUIdx_MUser = 0 | ARM_MMU_IDX_M,
-ARMMMUIdx_MPriv = 1 | ARM_MMU_IDX_M,
-ARMMMUIdx_MUserNegPri = 2 | ARM_MMU_IDX_M,
-ARMMMUIdx_MPrivNegPri = 3 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSUser = 4 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSPriv = 5 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSUserNegPri = 6 | ARM_MMU_IDX_M,
-ARMMMUIdx_MSPrivNegPri = 7 | ARM_MMU_IDX_M,
+ARMMMUIdx_MUser = ARM_MMU_IDX_M,
+ARMMMUIdx_MPriv = ARM_MMU_IDX_M | ARM_MMU_IDX_M_PRIV,
+ARMMMUIdx_MUserNegPri = ARMMMUIdx_MUser | ARM_MMU_IDX_M_NEGPRI,
+ARMMMUIdx_MPrivNegPri = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_NEGPRI,
+ARMMMUIdx_MSUser = ARMMMUIdx_MUser | ARM_MMU_IDX_M_S,
+ARMMMUIdx_MSPriv = ARMMMUIdx_MPriv | ARM_MMU_IDX_M_S,
+ARMMMUIdx_MSUserNegPri = ARMMMUIdx_MUserNegPri | ARM_MMU_IDX_M_S,
+ARMMMUIdx_MSPrivNegPri = ARMMMUIdx_MPrivNegPri | ARM_MMU_IDX_M_S,
 /* Indexes below here don't have TLBs and are used only for AT system
  * instructions or for the first stage of an S12 page table walk.
  */
-- 
2.17.1




[PATCH v4 19/40] target/arm: Add regime_has_2_ranges

2019-12-02 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/internals.h | 16 
 target/arm/helper.c| 23 ++-
 target/arm/translate-a64.c |  3 +--
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index d73615064c..1ca9a7cc78 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -837,6 +837,22 @@ static inline void arm_call_el_change_hook(ARMCPU *cpu)
 }
 }
 
+/* Return true if this address translation regime has two ranges.  */
+static inline bool regime_has_2_ranges(ARMMMUIdx mmu_idx)
+{
+switch (mmu_idx) {
+case ARMMMUIdx_Stage1_E0:
+case ARMMMUIdx_Stage1_E1:
+case ARMMMUIdx_EL10_0:
+case ARMMMUIdx_EL10_1:
+case ARMMMUIdx_EL20_0:
+case ARMMMUIdx_EL20_2:
+return true;
+default:
+return false;
+}
+}
+
 /* Return true if this address translation regime is secure */
 static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index f86285ffbe..27adf24fa6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8885,15 +8885,8 @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx mmu_idx, bool is_aa64,
 }
 
 if (is_aa64) {
-switch (regime_el(env, mmu_idx)) {
-case 1:
-if (!is_user) {
-xn = pxn || (user_rw & PAGE_WRITE);
-}
-break;
-case 2:
-case 3:
-break;
+if (regime_has_2_ranges(mmu_idx) && !is_user) {
+xn = pxn || (user_rw & PAGE_WRITE);
 }
 } else if (arm_feature(env, ARM_FEATURE_V7)) {
 switch (regime_el(env, mmu_idx)) {
@@ -9427,7 +9420,6 @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, uint64_t va,
 ARMMMUIdx mmu_idx)
 {
 uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
-uint32_t el = regime_el(env, mmu_idx);
 bool tbi, tbid, epd, hpd, using16k, using64k;
 int select, tsz;
 
@@ -9437,7 +9429,7 @@ aa64_va_parameters_both(CPUARMState *env, uint64_t va,
  */
 select = extract64(va, 55, 1);
 
-if (el > 1) {
+if (!regime_has_2_ranges(mmu_idx)) {
 tsz = extract32(tcr, 0, 6);
 using64k = extract32(tcr, 14, 1);
 using16k = extract32(tcr, 15, 1);
@@ -9593,10 +9585,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
 param = aa64_va_parameters(env, address, mmu_idx,
access_type != MMU_INST_FETCH);
 level = 0;
-/* If we are in 64-bit EL2 or EL3 then there is no TTBR1, so mark it
- * invalid.
- */
-ttbr1_valid = (el < 2);
+ttbr1_valid = regime_has_2_ranges(mmu_idx);
 addrsize = 64 - 8 * param.tbi;
 inputsize = 64 - param.tsz;
 } else {
@@ -11306,8 +11295,8 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
 
 flags = FIELD_DP32(flags, TBFLAG_ANY, AARCH64_STATE, 1);
 
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
-if (regime_el(env, stage1) < 2) {
+/* Get control bits for tagged addresses.  */
+if (regime_has_2_ranges(mmu_idx)) {
 ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
 tbid = (p1.tbi << 1) | p0.tbi;
 tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 885c99f0c9..d0b65c49e2 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -175,8 +175,7 @@ static void gen_top_byte_ignore(DisasContext *s, TCGv_i64 dst,
 if (tbi == 0) {
 /* Load unmodified address */
 tcg_gen_mov_i64(dst, src);
-} else if (s->current_el >= 2) {
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
+} else if (!regime_has_2_ranges(s->mmu_idx)) {
 /* Force tag byte to all zero */
 tcg_gen_extract_i64(dst, src, 0, 56);
 } else {
-- 
2.17.1




[PATCH v4 24/40] target/arm: Add the hypervisor virtual counter

2019-12-02 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/cpu-qom.h |  1 +
 target/arm/cpu.h | 11 +
 target/arm/cpu.c |  2 ++
 target/arm/helper.c  | 57 
 4 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 7f5b244bde..3a9d31ea9d 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -76,6 +76,7 @@ void arm_gt_ptimer_cb(void *opaque);
 void arm_gt_vtimer_cb(void *opaque);
 void arm_gt_htimer_cb(void *opaque);
 void arm_gt_stimer_cb(void *opaque);
+void arm_gt_hvtimer_cb(void *opaque);
 
 #define ARM_AFF0_SHIFT 0
 #define ARM_AFF0_MASK  (0xFFULL << ARM_AFF0_SHIFT)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 8aa625734f..4bd1bf915c 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -144,11 +144,12 @@ typedef struct ARMGenericTimer {
 uint64_t ctl; /* Timer Control register */
 } ARMGenericTimer;
 
-#define GTIMER_PHYS 0
-#define GTIMER_VIRT 1
-#define GTIMER_HYP  2
-#define GTIMER_SEC  3
-#define NUM_GTIMERS 4
+#define GTIMER_PHYS 0
+#define GTIMER_VIRT 1
+#define GTIMER_HYP  2
+#define GTIMER_SEC  3
+#define GTIMER_HYPVIRT  4
+#define NUM_GTIMERS 5
 
 typedef struct {
 uint64_t raw_tcr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7a4ac9339b..81c33221f7 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1259,6 +1259,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
   arm_gt_htimer_cb, cpu);
 cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
   arm_gt_stimer_cb, cpu);
+cpu->gt_timer[GTIMER_HYPVIRT] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
+  arm_gt_hvtimer_cb, cpu);
 #endif
 
cpu_exec_realizefn(cs, &local_err);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 9ad5015d5c..a4a7f82661 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2516,6 +2516,7 @@ static uint64_t gt_tval_read(CPUARMState *env, const ARMCPRegInfo *ri,
 
 switch (timeridx) {
 case GTIMER_VIRT:
+case GTIMER_HYPVIRT:
 offset = gt_virt_cnt_offset(env);
 break;
 }
@@ -2532,6 +2533,7 @@ static void gt_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
 
 switch (timeridx) {
 case GTIMER_VIRT:
+case GTIMER_HYPVIRT:
 offset = gt_virt_cnt_offset(env);
 break;
 }
@@ -2687,6 +2689,34 @@ static void gt_sec_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
 gt_ctl_write(env, ri, GTIMER_SEC, value);
 }
 
+static void gt_hv_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+gt_timer_reset(env, ri, GTIMER_HYPVIRT);
+}
+
+static void gt_hv_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+gt_cval_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
+static uint64_t gt_hv_tval_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+return gt_tval_read(env, ri, GTIMER_HYPVIRT);
+}
+
+static void gt_hv_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+gt_tval_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
+static void gt_hv_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+gt_ctl_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
 void arm_gt_ptimer_cb(void *opaque)
 {
 ARMCPU *cpu = opaque;
@@ -2715,6 +2745,13 @@ void arm_gt_stimer_cb(void *opaque)
 gt_recalc_timer(cpu, GTIMER_SEC);
 }
 
+void arm_gt_hvtimer_cb(void *opaque)
+{
+ARMCPU *cpu = opaque;
+
+gt_recalc_timer(cpu, GTIMER_HYPVIRT);
+}
+
 static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 /* Note that CNTFRQ is purely reads-as-written for the benefit
  * of software; writing it doesn't actually change the timer frequency.
@@ -6989,6 +7026,26 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
   .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
   .fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
+#ifndef CONFIG_USER_ONLY
+{ .name = "CNTHV_CVAL_EL2", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 2,
+  .fieldoffset =
+offsetof(CPUARMState, cp15.c14_timer[GTIMER_HYPVIRT].cval),
+  .type = ARM_CP_IO, .access = PL2_RW,
+  .writefn = gt_hv_cval_write, .raw_writefn = raw_write },
+{ .name = "CNTHV_TVAL_EL2", .state = ARM_CP_STATE_BOTH,
+  .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 0,
+  .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL2_RW,
+  .resetfn = gt_hv_timer_reset,
+  .readfn = gt_hv_tval_read, .writefn = gt_hv_tval_write },
+{ .name = 

[PATCH v4 22/40] target/arm: Update aa64_zva_access for EL2

2019-12-02 Thread Richard Henderson
The comment that we don't support EL2 is somewhat out of date.
Update to include checks against HCR_EL2.TDZ.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 4f5e0b656c..ffa82b5509 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4109,11 +4109,27 @@ static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static CPAccessResult aa64_zva_access(CPUARMState *env, const ARMCPRegInfo *ri,
   bool isread)
 {
-/* We don't implement EL2, so the only control on DC ZVA is the
- * bit in the SCTLR which can prohibit access for EL0.
- */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_DZE)) {
-return CP_ACCESS_TRAP;
+int cur_el = arm_current_el(env);
+
+if (cur_el < 2) {
+uint64_t hcr = arm_hcr_el2_eff(env);
+
+if (cur_el == 0) {
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+if (!(env->cp15.sctlr_el[2] & SCTLR_DZE)) {
+return CP_ACCESS_TRAP_EL2;
+}
+} else {
+if (!(env->cp15.sctlr_el[1] & SCTLR_DZE)) {
+return CP_ACCESS_TRAP;
+}
+if (hcr & HCR_TDZ) {
+return CP_ACCESS_TRAP_EL2;
+}
+}
+} else if (hcr & HCR_TDZ) {
+return CP_ACCESS_TRAP_EL2;
+}
 }
 return CP_ACCESS_OK;
 }
-- 
2.17.1




[PATCH v4 30/40] target/arm: Flush tlbs for E2&0 translation regime

2019-12-02 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 2a4d4c2c0d..b059d9f81a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4123,8 +4123,12 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
 
 static int vae1_tlbmask(CPUARMState *env)
 {
+/* Since we exclude secure first, we may read HCR_EL2 directly. */
 if (arm_is_secure_below_el3(env)) {
 return ARMMMUIdxBit_SE1 | ARMMMUIdxBit_SE0;
+} else if ((env->cp15.hcr_el2 & (HCR_E2H | HCR_TGE))
+   == (HCR_E2H | HCR_TGE)) {
+return ARMMMUIdxBit_EL20_2 | ARMMMUIdxBit_EL20_0;
 } else {
 return ARMMMUIdxBit_EL10_1 | ARMMMUIdxBit_EL10_0;
 }
@@ -4158,9 +4162,14 @@ static int vmalle1_tlbmask(CPUARMState *env)
  * Note that the 'ALL' scope must invalidate both stage 1 and
  * stage 2 translations, whereas most other scopes only invalidate
  * stage 1 translations.
+ *
+ * Since we exclude secure first, we may read HCR_EL2 directly.
  */
 if (arm_is_secure_below_el3(env)) {
 return ARMMMUIdxBit_SE1 | ARMMMUIdxBit_SE0;
+} else if ((env->cp15.hcr_el2 & (HCR_E2H | HCR_TGE))
+   == (HCR_E2H | HCR_TGE)) {
+return ARMMMUIdxBit_EL20_2 | ARMMMUIdxBit_EL20_0;
 } else if (arm_feature(env, ARM_FEATURE_EL2)) {
 return ARMMMUIdxBit_EL10_1 | ARMMMUIdxBit_EL10_0 | ARMMMUIdxBit_Stage2;
 } else {
@@ -4177,13 +4186,22 @@ static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 tlb_flush_by_mmuidx(cs, mask);
 }
 
+static int vae2_tlbmask(CPUARMState *env)
+{
+if (arm_hcr_el2_eff(env) & HCR_E2H) {
+return ARMMMUIdxBit_EL20_0 | ARMMMUIdxBit_EL20_2;
+} else {
+return ARMMMUIdxBit_E2;
+}
+}
+
 static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vae2_tlbmask(env);
 
-tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_E2);
+tlb_flush_by_mmuidx(cs, mask);
 }
 
 static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4208,8 +4226,9 @@ static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
 CPUState *cs = env_cpu(env);
+int mask = vae2_tlbmask(env);
 
-tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_E2);
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
 }
 
 static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4227,11 +4246,11 @@ static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
  * Currently handles both VAE2 and VALE2, since we don't support
  * flush-last-level-only.
  */
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vae2_tlbmask(env);
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_E2);
+tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
 
 static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.17.1




[PATCH v4 23/40] target/arm: Update ctr_el0_access for EL2

2019-12-02 Thread Richard Henderson
Update to include checks against HCR_EL2.TID2.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index ffa82b5509..9ad5015d5c 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5212,11 +5212,27 @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
 static CPAccessResult ctr_el0_access(CPUARMState *env, const ARMCPRegInfo *ri,
  bool isread)
 {
-/* Only accessible in EL0 if SCTLR.UCT is set (and only in AArch64,
- * but the AArch32 CTR has its own reginfo struct)
- */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UCT)) {
-return CP_ACCESS_TRAP;
+int cur_el = arm_current_el(env);
+
+if (cur_el < 2) {
+uint64_t hcr = arm_hcr_el2_eff(env);
+
+if (cur_el == 0) {
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+if (!(env->cp15.sctlr_el[2] & SCTLR_UCT)) {
+return CP_ACCESS_TRAP_EL2;
+}
+} else {
+if (!(env->cp15.sctlr_el[1] & SCTLR_UCT)) {
+return CP_ACCESS_TRAP;
+}
+if (hcr & HCR_TID2) {
+return CP_ACCESS_TRAP_EL2;
+}
+}
+} else if (hcr & HCR_TID2) {
+return CP_ACCESS_TRAP_EL2;
+}
 }
 return CP_ACCESS_OK;
 }
-- 
2.17.1




[PATCH v4 21/40] target/arm: Update arm_sctlr for VHE

2019-12-02 Thread Richard Henderson
Use the correct sctlr for the EL2&0 regime.  Due to header ordering,
and where arm_mmu_idx_el is declared, we need to move the function
out of line.  Use the function in many more places in order to
select the correct control.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h  | 10 +-
 target/arm/helper-a64.c   |  2 +-
 target/arm/helper.c   | 20 +++-
 target/arm/pauth_helper.c |  9 +
 4 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bf8eb57e3a..8aa625734f 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3100,15 +3100,7 @@ static inline bool arm_sctlr_b(CPUARMState *env)
 (env->cp15.sctlr_el[1] & SCTLR_B) != 0;
 }
 
-static inline uint64_t arm_sctlr(CPUARMState *env, int el)
-{
-if (el == 0) {
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
-return env->cp15.sctlr_el[1];
-} else {
-return env->cp15.sctlr_el[el];
-}
-}
+uint64_t arm_sctlr(CPUARMState *env, int el);
 
 static inline bool arm_cpu_data_is_big_endian_a32(CPUARMState *env,
   bool sctlr_b)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index b4cd680fc4..abf15cdd3f 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -70,7 +70,7 @@ static void daif_check(CPUARMState *env, uint32_t op, uint32_t imm, uintptr_t ra)
 {
 /* DAIF update to PSTATE. This is OK from EL0 only if UMA is set.  */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UMA)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UMA)) {
 raise_exception_ra(env, EXCP_UDEF,
syn_aa64_sysregtrap(0, extract32(op, 0, 3),
extract32(op, 3, 3), 4,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index c6b4c0a25f..4f5e0b656c 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3863,7 +3863,7 @@ static void aa64_fpsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static CPAccessResult aa64_daif_access(CPUARMState *env, const ARMCPRegInfo *ri,
bool isread)
 {
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UMA)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UMA)) {
 return CP_ACCESS_TRAP;
 }
 return CP_ACCESS_OK;
@@ -3882,7 +3882,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState *env,
 /* Cache invalidate/clean: NOP, but EL0 must UNDEF unless
  * SCTLR_EL1.UCI is set.
  */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UCI)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UCI)) {
 return CP_ACCESS_TRAP;
 }
 return CP_ACCESS_OK;
@@ -8592,14 +8592,24 @@ static uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
 }
 }
 
-#ifndef CONFIG_USER_ONLY
+uint64_t arm_sctlr(CPUARMState *env, int el)
+{
+/* Only EL0 needs to be adjusted for EL1&0 or EL2&0. */
+if (el == 0) {
+ARMMMUIdx mmu_idx = arm_mmu_idx_el(env, 0);
+el = (mmu_idx == ARMMMUIdx_EL20_0 ? 2 : 1);
+}
+return env->cp15.sctlr_el[el];
+}
 
 /* Return the SCTLR value which controls this address translation regime */
-static inline uint32_t regime_sctlr(CPUARMState *env, ARMMMUIdx mmu_idx)
+static inline uint64_t regime_sctlr(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
 return env->cp15.sctlr_el[regime_el(env, mmu_idx)];
 }
 
+#ifndef CONFIG_USER_ONLY
+
 /* Return true if the specified stage of address translation is disabled */
 static inline bool regime_translation_disabled(CPUARMState *env,
ARMMMUIdx mmu_idx)
@@ -11332,7 +11342,7 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
 flags = FIELD_DP32(flags, TBFLAG_A64, ZCR_LEN, zcr_len);
 }
 
-sctlr = arm_sctlr(env, el);
+sctlr = regime_sctlr(env, stage1);
 
 if (arm_cpu_data_is_big_endian_a64(el, sctlr)) {
 flags = FIELD_DP32(flags, TBFLAG_ANY, BE_DATA, 1);
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index d3194f2043..42c9141bb7 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -386,14 +386,7 @@ static void pauth_check_trap(CPUARMState *env, int el, uintptr_t ra)
 
 static bool pauth_key_enabled(CPUARMState *env, int el, uint32_t bit)
 {
-uint32_t sctlr;
-if (el == 0) {
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
-sctlr = env->cp15.sctlr_el[1];
-} else {
-sctlr = env->cp15.sctlr_el[el];
-}
-return (sctlr & bit) != 0;
+return (arm_sctlr(env, el) & bit) != 0;
 }
 
 uint64_t HELPER(pacia)(CPUARMState *env, uint64_t x, uint64_t y)
-- 
2.17.1




[PATCH v4 26/40] target/arm: Update define_one_arm_cp_reg_with_opaque for VHE

2019-12-02 Thread Richard Henderson
For ARMv8.1, op1 == 5 is reserved for EL2 aliases of
EL1 and EL0 registers.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 023b8963cf..1812588fa1 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7437,13 +7437,10 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
 mask = PL0_RW;
 break;
 case 4:
+case 5:
 /* min_EL EL2 */
 mask = PL2_RW;
 break;
-case 5:
-/* unallocated encoding, so not possible */
-assert(false);
-break;
 case 6:
 /* min_EL EL3 */
 mask = PL3_RW;
-- 
2.17.1




[PATCH v4 15/40] target/arm: Expand TBFLAG_ANY.MMUIDX to 4 bits

2019-12-02 Thread Richard Henderson
We are about to expand the number of mmu_idx to 10, and so need 4 bits.
For the benefit of reading the number out of -d exec, align it to the
penultimate nibble.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index ae9fc1ded3..5f295c7e60 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3176,17 +3176,17 @@ typedef ARMCPU ArchCPU;
  * Unless otherwise noted, these bits are cached in env->hflags.
  */
 FIELD(TBFLAG_ANY, AARCH64_STATE, 31, 1)
-FIELD(TBFLAG_ANY, MMUIDX, 28, 3)
-FIELD(TBFLAG_ANY, SS_ACTIVE, 27, 1)
-FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1) /* Not cached. */
+FIELD(TBFLAG_ANY, SS_ACTIVE, 30, 1)
+FIELD(TBFLAG_ANY, PSTATE_SS, 29, 1) /* Not cached. */
+FIELD(TBFLAG_ANY, BE_DATA, 28, 1)
+FIELD(TBFLAG_ANY, MMUIDX, 24, 4)
 /* Target EL if we take a floating-point-disabled exception */
-FIELD(TBFLAG_ANY, FPEXC_EL, 24, 2)
-FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
+FIELD(TBFLAG_ANY, FPEXC_EL, 22, 2)
 /*
  * For A-profile only, target EL for debug exceptions.
  * Note that this overlaps with the M-profile-only HANDLER and STACKCHECK bits.
  */
-FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 21, 2)
+FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 20, 2)
 
 /*
  * Bit usage when in AArch32 state, both A- and M-profile.
-- 
2.17.1




[PATCH v4 20/40] target/arm: Update arm_mmu_idx for VHE

2019-12-02 Thread Richard Henderson
Return the indexes for the EL2&0 regime when the appropriate bits
are set within HCR_EL2.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 27adf24fa6..c6b4c0a25f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11172,12 +11172,16 @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
 return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
 }
 
+/* See ARM pseudo-function ELIsInHost.  */
 switch (el) {
 case 0:
-/* TODO: ARMv8.1-VHE */
 if (arm_is_secure_below_el3(env)) {
 return ARMMMUIdx_SE0;
 }
+if ((env->cp15.hcr_el2 & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)
+&& arm_el_is_aa64(env, 2)) {
+return ARMMMUIdx_EL20_0;
+}
 return ARMMMUIdx_EL10_0;
 case 1:
 if (arm_is_secure_below_el3(env)) {
@@ -11185,8 +11189,11 @@ ARMMMUIdx arm_mmu_idx_el(CPUARMState *env, int el)
 }
 return ARMMMUIdx_EL10_1;
 case 2:
-/* TODO: ARMv8.1-VHE */
 /* TODO: ARMv8.4-SecEL2 */
+/* Note that TGE does not apply at EL2.  */
+if ((env->cp15.hcr_el2 & HCR_E2H) && arm_el_is_aa64(env, 2)) {
+return ARMMMUIdx_EL20_2;
+}
 return ARMMMUIdx_E2;
 case 3:
 return ARMMMUIdx_SE3;
-- 
2.17.1




[PATCH v4 13/40] target/arm: Rename ARMMMUIdx_S1E2 to ARMMMUIdx_E2

2019-12-02 Thread Richard Henderson
This is part of a reorganization of the set of mmu_idx.
The non-secure EL2 regime only has a single stage translation;
there is no point in pointing out that the idx is for stage1.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  4 ++--
 target/arm/internals.h |  2 +-
 target/arm/helper.c| 22 +++---
 target/arm/translate.c |  2 +-
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f307de561a..28259be733 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2866,7 +2866,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
 typedef enum ARMMMUIdx {
 ARMMMUIdx_EL10_0 = 0 | ARM_MMU_IDX_A,
 ARMMMUIdx_EL10_1 = 1 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
+ARMMMUIdx_E2 = 2 | ARM_MMU_IDX_A,
 ARMMMUIdx_SE3 = 3 | ARM_MMU_IDX_A,
 ARMMMUIdx_SE0 = 4 | ARM_MMU_IDX_A,
 ARMMMUIdx_SE1 = 5 | ARM_MMU_IDX_A,
@@ -2892,7 +2892,7 @@ typedef enum ARMMMUIdx {
 typedef enum ARMMMUIdxBit {
 ARMMMUIdxBit_EL10_0 = 1 << 0,
 ARMMMUIdxBit_EL10_1 = 1 << 1,
-ARMMMUIdxBit_S1E2 = 1 << 2,
+ARMMMUIdxBit_E2 = 1 << 2,
 ARMMMUIdxBit_SE3 = 1 << 3,
 ARMMMUIdxBit_SE0 = 1 << 4,
 ARMMMUIdxBit_SE1 = 1 << 5,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 50d258b0e1..aee54dc105 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -812,7 +812,7 @@ static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
 case ARMMMUIdx_EL10_1:
 case ARMMMUIdx_Stage1_E0:
 case ARMMMUIdx_Stage1_E1:
-case ARMMMUIdx_S1E2:
+case ARMMMUIdx_E2:
 case ARMMMUIdx_Stage2:
 case ARMMMUIdx_MPrivNegPri:
 case ARMMMUIdx_MUserNegPri:
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 98d00b4549..5172843667 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -728,7 +728,7 @@ static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 CPUState *cs = env_cpu(env);
 
-tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_S1E2);
+tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_E2);
 }
 
 static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -736,7 +736,7 @@ static void tlbiall_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 CPUState *cs = env_cpu(env);
 
-tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_S1E2);
+tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_E2);
 }
 
 static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -745,7 +745,7 @@ static void tlbimva_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
 CPUState *cs = env_cpu(env);
 uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
 
-tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_S1E2);
+tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_E2);
 }
 
 static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -755,7 +755,7 @@ static void tlbimva_hyp_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t pageaddr = value & ~MAKE_64BIT_MASK(0, 12);
 
 tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
- ARMMMUIdxBit_S1E2);
+ ARMMMUIdxBit_E2);
 }
 
 static const ARMCPRegInfo cp_reginfo[] = {
@@ -3189,7 +3189,7 @@ static void ats1h_write(CPUARMState *env, const ARMCPRegInfo *ri,
 MMUAccessType access_type = ri->opc2 & 1 ? MMU_DATA_STORE : MMU_DATA_LOAD;
 uint64_t par64;
 
-par64 = do_ats_write(env, value, access_type, ARMMMUIdx_S1E2);
+par64 = do_ats_write(env, value, access_type, ARMMMUIdx_E2);
 
 A32_BANKED_CURRENT_REG_SET(env, par, par64);
 }
@@ -3217,7 +3217,7 @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
 mmu_idx = secure ? ARMMMUIdx_SE1 : ARMMMUIdx_Stage1_E1;
 break;
 case 4: /* AT S1E2R, AT S1E2W */
-mmu_idx = ARMMMUIdx_S1E2;
+mmu_idx = ARMMMUIdx_E2;
 break;
 case 6: /* AT S1E3R, AT S1E3W */
 mmu_idx = ARMMMUIdx_SE3;
@@ -3954,7 +3954,7 @@ static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 ARMCPU *cpu = env_archcpu(env);
 CPUState *cs = CPU(cpu);
 
-tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_S1E2);
+tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_E2);
 }
 
 static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3980,7 +3980,7 @@ static void tlbi_aa64_alle2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 CPUState *cs = env_cpu(env);
 
-tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_S1E2);
+tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_E2);
 }
 
 static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4002,7 +4002,7 @@ static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 CPUState *cs = CPU(cpu);
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-  

[PATCH v4 12/40] target/arm: Rename ARMMMUIdx*_S1E3 to ARMMMUIdx*_SE3

2019-12-02 Thread Richard Henderson
This is part of a reorganization of the set of mmu_idx.
The EL3 regime only has a single stage translation, and
is always secure.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  4 ++--
 target/arm/internals.h |  2 +-
 target/arm/helper.c| 14 +++---
 target/arm/translate.c |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e8ee316e05..f307de561a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2867,7 +2867,7 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_EL10_0 = 0 | ARM_MMU_IDX_A,
 ARMMMUIdx_EL10_1 = 1 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
+ARMMMUIdx_SE3 = 3 | ARM_MMU_IDX_A,
 ARMMMUIdx_SE0 = 4 | ARM_MMU_IDX_A,
 ARMMMUIdx_SE1 = 5 | ARM_MMU_IDX_A,
 ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
@@ -2893,7 +2893,7 @@ typedef enum ARMMMUIdxBit {
 ARMMMUIdxBit_EL10_0 = 1 << 0,
 ARMMMUIdxBit_EL10_1 = 1 << 1,
 ARMMMUIdxBit_S1E2 = 1 << 2,
-ARMMMUIdxBit_S1E3 = 1 << 3,
+ARMMMUIdxBit_SE3 = 1 << 3,
 ARMMMUIdxBit_SE0 = 1 << 4,
 ARMMMUIdxBit_SE1 = 1 << 5,
 ARMMMUIdxBit_Stage2 = 1 << 6,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 3600bf9122..50d258b0e1 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -819,7 +819,7 @@ static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
 case ARMMMUIdx_MPriv:
 case ARMMMUIdx_MUser:
 return false;
-case ARMMMUIdx_S1E3:
+case ARMMMUIdx_SE3:
 case ARMMMUIdx_SE0:
 case ARMMMUIdx_SE1:
 case ARMMMUIdx_MSPrivNegPri:
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 377825431a..98d00b4549 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3138,7 +3138,7 @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 /* stage 1 current state PL1: ATS1CPR, ATS1CPW */
 switch (el) {
 case 3:
-mmu_idx = ARMMMUIdx_S1E3;
+mmu_idx = ARMMMUIdx_SE3;
 break;
 case 2:
 mmu_idx = ARMMMUIdx_Stage1_E1;
@@ -3220,7 +3220,7 @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
 mmu_idx = ARMMMUIdx_S1E2;
 break;
 case 6: /* AT S1E3R, AT S1E3W */
-mmu_idx = ARMMMUIdx_S1E3;
+mmu_idx = ARMMMUIdx_SE3;
 break;
 default:
 g_assert_not_reached();
@@ -3963,7 +3963,7 @@ static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
 ARMCPU *cpu = env_archcpu(env);
 CPUState *cs = CPU(cpu);
 
-tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_S1E3);
+tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_SE3);
 }
 
 static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3988,7 +3988,7 @@ static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 CPUState *cs = env_cpu(env);
 
-tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_S1E3);
+tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_SE3);
 }
 
 static void tlbi_aa64_vae2_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4016,7 +4016,7 @@ static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
 CPUState *cs = CPU(cpu);
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_S1E3);
+tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_SE3);
 }
 
 static void tlbi_aa64_vae1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4065,7 +4065,7 @@ static void tlbi_aa64_vae3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
 tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
- ARMMMUIdxBit_S1E3);
+ ARMMMUIdxBit_SE3);
 }
 
 static void tlbi_aa64_ipas2e1_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -8567,7 +8567,7 @@ static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
 case ARMMMUIdx_Stage2:
 case ARMMMUIdx_S1E2:
 return 2;
-case ARMMMUIdx_S1E3:
+case ARMMMUIdx_SE3:
 return 3;
 case ARMMMUIdx_SE0:
 return arm_el_is_aa64(env, 3) ? 1 : 3;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 787e34f258..6cf2fe2806 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -156,7 +156,7 @@ static inline int get_a32_user_mem_index(DisasContext *s)
 case ARMMMUIdx_EL10_0:
 case ARMMMUIdx_EL10_1:
 return arm_to_core_mmu_idx(ARMMMUIdx_EL10_0);
-case ARMMMUIdx_S1E3:
+case ARMMMUIdx_SE3:
 case ARMMMUIdx_SE0:
 case ARMMMUIdx_SE1:
 return arm_to_core_mmu_idx(ARMMMUIdx_SE0);
-- 
2.17.1




[PATCH v4 08/40] target/arm: Rename ARMMMUIdx*_S12NSE* to ARMMMUIdx*_E10_*

2019-12-02 Thread Richard Henderson
This is part of a reorganization of the set of mmu_idx.
This emphasizes that they apply to the EL1&0 regime.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  8 
 target/arm/internals.h |  4 ++--
 target/arm/helper.c| 40 +++---
 target/arm/translate-a64.c |  4 ++--
 target/arm/translate.c |  6 +++---
 5 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9729e62d2c..802cddd2df 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2864,8 +2864,8 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
 #define ARM_MMU_IDX_COREIDX_MASK 0x7
 
 typedef enum ARMMMUIdx {
-ARMMMUIdx_S12NSE0 = 0 | ARM_MMU_IDX_A,
-ARMMMUIdx_S12NSE1 = 1 | ARM_MMU_IDX_A,
+ARMMMUIdx_EL10_0 = 0 | ARM_MMU_IDX_A,
+ARMMMUIdx_EL10_1 = 1 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1SE0 = 4 | ARM_MMU_IDX_A,
@@ -2890,8 +2890,8 @@ typedef enum ARMMMUIdx {
  * for use when calling tlb_flush_by_mmuidx() and friends.
  */
 typedef enum ARMMMUIdxBit {
-ARMMMUIdxBit_S12NSE0 = 1 << 0,
-ARMMMUIdxBit_S12NSE1 = 1 << 1,
+ARMMMUIdxBit_EL10_0 = 1 << 0,
+ARMMMUIdxBit_EL10_1 = 1 << 1,
 ARMMMUIdxBit_S1E2 = 1 << 2,
 ARMMMUIdxBit_S1E3 = 1 << 3,
 ARMMMUIdxBit_S1SE0 = 1 << 4,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index f5313dd3d4..54142dd789 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -808,8 +808,8 @@ static inline void arm_call_el_change_hook(ARMCPU *cpu)
 static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
 switch (mmu_idx) {
-case ARMMMUIdx_S12NSE0:
-case ARMMMUIdx_S12NSE1:
+case ARMMMUIdx_EL10_0:
+case ARMMMUIdx_EL10_1:
 case ARMMMUIdx_S1NSE0:
 case ARMMMUIdx_S1NSE1:
 case ARMMMUIdx_S1E2:
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 6c09cda4ea..d2b90763ca 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -670,8 +670,8 @@ static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
 CPUState *cs = env_cpu(env);
 
 tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
+ARMMMUIdxBit_EL10_1 |
+ARMMMUIdxBit_EL10_0 |
 ARMMMUIdxBit_S2NS);
 }
 
@@ -681,8 +681,8 @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 CPUState *cs = env_cpu(env);
 
 tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
+ARMMMUIdxBit_EL10_1 |
+ARMMMUIdxBit_EL10_0 |
 ARMMMUIdxBit_S2NS);
 }
 
@@ -3068,7 +3068,7 @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t value,
 format64 = arm_s1_regime_using_lpae_format(env, mmu_idx);
 
 if (arm_feature(env, ARM_FEATURE_EL2)) {
-if (mmu_idx == ARMMMUIdx_S12NSE0 || mmu_idx == ARMMMUIdx_S12NSE1) {
+if (mmu_idx == ARMMMUIdx_EL10_0 || mmu_idx == ARMMMUIdx_EL10_1) {
 format64 |= env->cp15.hcr_el2 & (HCR_VM | HCR_DC);
 } else {
 format64 |= arm_current_el(env) == 2;
@@ -3167,11 +3167,11 @@ static void ats_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 break;
 case 4:
 /* stage 1+2 NonSecure PL1: ATS12NSOPR, ATS12NSOPW */
-mmu_idx = ARMMMUIdx_S12NSE1;
+mmu_idx = ARMMMUIdx_EL10_1;
 break;
 case 6:
 /* stage 1+2 NonSecure PL0: ATS12NSOUR, ATS12NSOUW */
-mmu_idx = ARMMMUIdx_S12NSE0;
+mmu_idx = ARMMMUIdx_EL10_0;
 break;
 default:
 g_assert_not_reached();
@@ -3229,10 +3229,10 @@ static void ats_write64(CPUARMState *env, const ARMCPRegInfo *ri,
 mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
 break;
 case 4: /* AT S12E1R, AT S12E1W */
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S12NSE1;
+mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_EL10_1;
 break;
 case 6: /* AT S12E0R, AT S12E0W */
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S12NSE0;
+mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_EL10_0;
 break;
 default:
 g_assert_not_reached();
@@ -3531,8 +3531,8 @@ static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 /* Accesses to VTTBR may change the VMID so we must flush the TLB.  */
 if (raw_read(env, ri) != value) {
 tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
+ARMMMUIdxBit_EL10_1 |
+

[PATCH v4 16/40] target/arm: Rearrange ARMMMUIdxBit

2019-12-02 Thread Richard Henderson
Define via macro expansion, so that renumbering of the base ARMMMUIdx
symbols is automatically reflected in the bit definitions.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 39 +++
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 5f295c7e60..6ba5126852 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2886,27 +2886,34 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
 } ARMMMUIdx;
 
-/* Bit macros for the core-mmu-index values for each index,
+/*
+ * Bit macros for the core-mmu-index values for each index,
  * for use when calling tlb_flush_by_mmuidx() and friends.
  */
+#define TO_CORE_BIT(NAME) \
+ARMMMUIdxBit_##NAME = 1 << (ARMMMUIdx_##NAME & ARM_MMU_IDX_COREIDX_MASK)
+
 typedef enum ARMMMUIdxBit {
-ARMMMUIdxBit_EL10_0 = 1 << 0,
-ARMMMUIdxBit_EL10_1 = 1 << 1,
-ARMMMUIdxBit_E2 = 1 << 2,
-ARMMMUIdxBit_SE3 = 1 << 3,
-ARMMMUIdxBit_SE0 = 1 << 4,
-ARMMMUIdxBit_SE1 = 1 << 5,
-ARMMMUIdxBit_Stage2 = 1 << 6,
-ARMMMUIdxBit_MUser = 1 << 0,
-ARMMMUIdxBit_MPriv = 1 << 1,
-ARMMMUIdxBit_MUserNegPri = 1 << 2,
-ARMMMUIdxBit_MPrivNegPri = 1 << 3,
-ARMMMUIdxBit_MSUser = 1 << 4,
-ARMMMUIdxBit_MSPriv = 1 << 5,
-ARMMMUIdxBit_MSUserNegPri = 1 << 6,
-ARMMMUIdxBit_MSPrivNegPri = 1 << 7,
+TO_CORE_BIT(EL10_0),
+TO_CORE_BIT(EL10_1),
+TO_CORE_BIT(E2),
+TO_CORE_BIT(SE0),
+TO_CORE_BIT(SE1),
+TO_CORE_BIT(SE3),
+TO_CORE_BIT(Stage2),
+
+TO_CORE_BIT(MUser),
+TO_CORE_BIT(MPriv),
+TO_CORE_BIT(MUserNegPri),
+TO_CORE_BIT(MPrivNegPri),
+TO_CORE_BIT(MSUser),
+TO_CORE_BIT(MSPriv),
+TO_CORE_BIT(MSUserNegPri),
+TO_CORE_BIT(MSPrivNegPri),
 } ARMMMUIdxBit;
 
+#undef TO_CORE_BIT
+
 #define MMU_USER_IDX 0
 
 static inline int arm_to_core_mmu_idx(ARMMMUIdx mmu_idx)
-- 
2.17.1




[PATCH v4 10/40] target/arm: Rename ARMMMUIdx_S1NSE* to ARMMMUIdx_Stage1_E*

2019-12-02 Thread Richard Henderson
This is part of a reorganization of the set of mmu_idx.
The EL1&0 regime is the only one that uses 2-stage translation.
Spelling out Stage avoids confusion with Secure.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  4 ++--
 target/arm/internals.h |  6 +++---
 target/arm/helper.c| 27 ++-
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index fdb868f2e9..0714c52176 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2882,8 +2882,8 @@ typedef enum ARMMMUIdx {
 /* Indexes below here don't have TLBs and are used only for AT system
  * instructions or for the first stage of an S12 page table walk.
  */
-ARMMMUIdx_S1NSE0 = 0 | ARM_MMU_IDX_NOTLB,
-ARMMMUIdx_S1NSE1 = 1 | ARM_MMU_IDX_NOTLB,
+ARMMMUIdx_Stage1_E0 = 0 | ARM_MMU_IDX_NOTLB,
+ARMMMUIdx_Stage1_E1 = 1 | ARM_MMU_IDX_NOTLB,
 } ARMMMUIdx;
 
 /* Bit macros for the core-mmu-index values for each index,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index ca8be78bbf..3fd1518f3b 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -810,8 +810,8 @@ static inline bool regime_is_secure(CPUARMState *env, 
ARMMMUIdx mmu_idx)
 switch (mmu_idx) {
 case ARMMMUIdx_EL10_0:
 case ARMMMUIdx_EL10_1:
-case ARMMMUIdx_S1NSE0:
-case ARMMMUIdx_S1NSE1:
+case ARMMMUIdx_Stage1_E0:
+case ARMMMUIdx_Stage1_E1:
 case ARMMMUIdx_S1E2:
 case ARMMMUIdx_Stage2:
 case ARMMMUIdx_MPrivNegPri:
@@ -975,7 +975,7 @@ ARMMMUIdx arm_mmu_idx(CPUARMState *env);
 #ifdef CONFIG_USER_ONLY
 static inline ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env)
 {
-return ARMMMUIdx_S1NSE0;
+return ARMMMUIdx_Stage1_E0;
 }
 #else
 ARMMMUIdx arm_stage1_mmu_idx(CPUARMState *env);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 97677f8482..a34accec20 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2992,7 +2992,8 @@ static uint64_t do_ats_write(CPUARMState *env, uint64_t 
value,
 bool take_exc = false;
 
 if (fi.s1ptw && current_el == 1 && !arm_is_secure(env)
-&& (mmu_idx == ARMMMUIdx_S1NSE1 || mmu_idx == ARMMMUIdx_S1NSE0)) {
+&& (mmu_idx == ARMMMUIdx_Stage1_E1
+|| mmu_idx == ARMMMUIdx_Stage1_E0)) {
 /*
  * Synchronous stage 2 fault on an access made as part of the
  * translation table walk for AT S1E0* or AT S1E1* insn
@@ -3140,10 +3141,10 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 mmu_idx = ARMMMUIdx_S1E3;
 break;
 case 2:
-mmu_idx = ARMMMUIdx_S1NSE1;
+mmu_idx = ARMMMUIdx_Stage1_E1;
 break;
 case 1:
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
+mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
 break;
 default:
 g_assert_not_reached();
@@ -3156,10 +3157,10 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 mmu_idx = ARMMMUIdx_S1SE0;
 break;
 case 2:
-mmu_idx = ARMMMUIdx_S1NSE0;
+mmu_idx = ARMMMUIdx_Stage1_E0;
 break;
 case 1:
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
+mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
 break;
 default:
 g_assert_not_reached();
@@ -3213,7 +3214,7 @@ static void ats_write64(CPUARMState *env, const 
ARMCPRegInfo *ri,
 case 0:
 switch (ri->opc1) {
 case 0: /* AT S1E1R, AT S1E1W */
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
+mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
 break;
 case 4: /* AT S1E2R, AT S1E2W */
 mmu_idx = ARMMMUIdx_S1E2;
@@ -3226,7 +3227,7 @@ static void ats_write64(CPUARMState *env, const 
ARMCPRegInfo *ri,
 }
 break;
 case 2: /* AT S1E0R, AT S1E0W */
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
+mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
 break;
 case 4: /* AT S12E1R, AT S12E1W */
 mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_EL10_1;
@@ -8571,8 +8572,8 @@ static inline uint32_t regime_el(CPUARMState *env, 
ARMMMUIdx mmu_idx)
 case ARMMMUIdx_S1SE0:
 return arm_el_is_aa64(env, 3) ? 1 : 3;
 case ARMMMUIdx_S1SE1:
-case ARMMMUIdx_S1NSE0:
-case ARMMMUIdx_S1NSE1:
+case ARMMMUIdx_Stage1_E0:
+case ARMMMUIdx_Stage1_E1:
 case ARMMMUIdx_MPrivNegPri:
 case ARMMMUIdx_MUserNegPri:
 case ARMMMUIdx_MPriv:
@@ -8630,7 +8631,7 @@ static inline bool 
regime_translation_disabled(CPUARMState *env,
 }
 
 if ((env->cp15.hcr_el2 & HCR_DC) &&
-(mmu_idx == ARMMMUIdx_S1NSE0 || mmu_idx == ARMMMUIdx_S1NSE1)) {
+(mmu_idx == ARMMMUIdx_Stage1_E0 || mmu_idx == 

[PATCH v4 11/40] target/arm: Rename ARMMMUIdx_S1SE* to ARMMMUIdx_SE*

2019-12-02 Thread Richard Henderson
This is part of a reorganization of the set of mmu_idx.
The Secure regimes all use single-stage translation, so
there is no point in noting that the idx is for stage 1.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  8 
 target/arm/internals.h |  4 ++--
 target/arm/translate.h |  2 +-
 target/arm/helper.c| 26 +-
 target/arm/translate-a64.c |  4 ++--
 target/arm/translate.c |  6 +++---
 6 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 0714c52176..e8ee316e05 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2868,8 +2868,8 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_EL10_1 = 1 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1SE0 = 4 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1SE1 = 5 | ARM_MMU_IDX_A,
+ARMMMUIdx_SE0 = 4 | ARM_MMU_IDX_A,
+ARMMMUIdx_SE1 = 5 | ARM_MMU_IDX_A,
 ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
 ARMMMUIdx_MUser = 0 | ARM_MMU_IDX_M,
 ARMMMUIdx_MPriv = 1 | ARM_MMU_IDX_M,
@@ -2894,8 +2894,8 @@ typedef enum ARMMMUIdxBit {
 ARMMMUIdxBit_EL10_1 = 1 << 1,
 ARMMMUIdxBit_S1E2 = 1 << 2,
 ARMMMUIdxBit_S1E3 = 1 << 3,
-ARMMMUIdxBit_S1SE0 = 1 << 4,
-ARMMMUIdxBit_S1SE1 = 1 << 5,
+ARMMMUIdxBit_SE0 = 1 << 4,
+ARMMMUIdxBit_SE1 = 1 << 5,
 ARMMMUIdxBit_Stage2 = 1 << 6,
 ARMMMUIdxBit_MUser = 1 << 0,
 ARMMMUIdxBit_MPriv = 1 << 1,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 3fd1518f3b..3600bf9122 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -820,8 +820,8 @@ static inline bool regime_is_secure(CPUARMState *env, 
ARMMMUIdx mmu_idx)
 case ARMMMUIdx_MUser:
 return false;
 case ARMMMUIdx_S1E3:
-case ARMMMUIdx_S1SE0:
-case ARMMMUIdx_S1SE1:
+case ARMMMUIdx_SE0:
+case ARMMMUIdx_SE1:
 case ARMMMUIdx_MSPrivNegPri:
 case ARMMMUIdx_MSUserNegPri:
 case ARMMMUIdx_MSPriv:
diff --git a/target/arm/translate.h b/target/arm/translate.h
index dd24f91f26..3760159661 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -124,7 +124,7 @@ static inline int default_exception_el(DisasContext *s)
  * exceptions can only be routed to ELs above 1, so we target the higher of
  * 1 or the current EL.
  */
-return (s->mmu_idx == ARMMMUIdx_S1SE0 && s->secure_routed_to_el3)
+return (s->mmu_idx == ARMMMUIdx_SE0 && s->secure_routed_to_el3)
 ? 3 : MAX(1, s->current_el);
 }
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index a34accec20..377825431a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3144,7 +3144,7 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 mmu_idx = ARMMMUIdx_Stage1_E1;
 break;
 case 1:
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
+mmu_idx = secure ? ARMMMUIdx_SE1 : ARMMMUIdx_Stage1_E1;
 break;
 default:
 g_assert_not_reached();
@@ -3154,13 +3154,13 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 /* stage 1 current state PL0: ATS1CUR, ATS1CUW */
 switch (el) {
 case 3:
-mmu_idx = ARMMMUIdx_S1SE0;
+mmu_idx = ARMMMUIdx_SE0;
 break;
 case 2:
 mmu_idx = ARMMMUIdx_Stage1_E0;
 break;
 case 1:
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
+mmu_idx = secure ? ARMMMUIdx_SE0 : ARMMMUIdx_Stage1_E0;
 break;
 default:
 g_assert_not_reached();
@@ -3214,7 +3214,7 @@ static void ats_write64(CPUARMState *env, const 
ARMCPRegInfo *ri,
 case 0:
 switch (ri->opc1) {
 case 0: /* AT S1E1R, AT S1E1W */
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_Stage1_E1;
+mmu_idx = secure ? ARMMMUIdx_SE1 : ARMMMUIdx_Stage1_E1;
 break;
 case 4: /* AT S1E2R, AT S1E2W */
 mmu_idx = ARMMMUIdx_S1E2;
@@ -3227,13 +3227,13 @@ static void ats_write64(CPUARMState *env, const 
ARMCPRegInfo *ri,
 }
 break;
 case 2: /* AT S1E0R, AT S1E0W */
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_Stage1_E0;
+mmu_idx = secure ? ARMMMUIdx_SE0 : ARMMMUIdx_Stage1_E0;
 break;
 case 4: /* AT S12E1R, AT S12E1W */
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_EL10_1;
+mmu_idx = secure ? ARMMMUIdx_SE1 : ARMMMUIdx_EL10_1;
 break;
 case 6: /* AT S12E0R, AT S12E0W */
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_EL10_0;
+mmu_idx = secure ? ARMMMUIdx_SE0 : ARMMMUIdx_EL10_0;
 break;
 default:
 g_assert_not_reached();
@@ -3895,7 +3895,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState 
*env,
 static int vae1_tlbmask(CPUARMState *env)
 {
 if 

[PATCH v4 05/40] target/arm: Update CNTVCT_EL0 for VHE

2019-12-02 Thread Richard Henderson
The virtual offset may be 0 depending on EL, E2H and TGE.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 40 +---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 06ec4641f3..731507a82f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2475,9 +2475,31 @@ static uint64_t gt_cnt_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 return gt_get_countervalue(env);
 }
 
+static uint64_t gt_virt_cnt_offset(CPUARMState *env)
+{
+uint64_t hcr;
+
+switch (arm_current_el(env)) {
+case 2:
+hcr = arm_hcr_el2_eff(env);
+if (hcr & HCR_E2H) {
+return 0;
+}
+break;
+case 0:
+hcr = arm_hcr_el2_eff(env);
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+return 0;
+}
+break;
+}
+
+return env->cp15.cntvoff_el2;
+}
+
 static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
-return gt_get_countervalue(env) - env->cp15.cntvoff_el2;
+return gt_get_countervalue(env) - gt_virt_cnt_offset(env);
 }
 
 static void gt_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2492,7 +2514,13 @@ static void gt_cval_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 static uint64_t gt_tval_read(CPUARMState *env, const ARMCPRegInfo *ri,
  int timeridx)
 {
-uint64_t offset = timeridx == GTIMER_VIRT ? env->cp15.cntvoff_el2 : 0;
+uint64_t offset = 0;
+
+switch (timeridx) {
+case GTIMER_VIRT:
+offset = gt_virt_cnt_offset(env);
+break;
+}
 
 return (uint32_t)(env->cp15.c14_timer[timeridx].cval -
   (gt_get_countervalue(env) - offset));
@@ -2502,7 +2530,13 @@ static void gt_tval_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
   int timeridx,
   uint64_t value)
 {
-uint64_t offset = timeridx == GTIMER_VIRT ? env->cp15.cntvoff_el2 : 0;
+uint64_t offset = 0;
+
+switch (timeridx) {
+case GTIMER_VIRT:
+offset = gt_virt_cnt_offset(env);
+break;
+}
 
 trace_arm_gt_tval_write(timeridx, value);
 env->cp15.c14_timer[timeridx].cval = gt_get_countervalue(env) - offset +
-- 
2.17.1




[PATCH v4 14/40] target/arm: Recover 4 bits from TBFLAGs

2019-12-02 Thread Richard Henderson
We had completely run out of TBFLAG bits.
Split A- and M-profile bits into two overlapping buckets.
This results in 4 free bits.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   | 52 ---
 target/arm/helper.c| 17 ++---
 target/arm/translate.c | 56 +++---
 3 files changed, 70 insertions(+), 55 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 28259be733..ae9fc1ded3 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3188,38 +3188,50 @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
  */
 FIELD(TBFLAG_ANY, DEBUG_TARGET_EL, 21, 2)
 
-/* Bit usage when in AArch32 state: */
-FIELD(TBFLAG_A32, THUMB, 0, 1)  /* Not cached. */
-FIELD(TBFLAG_A32, VECLEN, 1, 3) /* Not cached. */
-FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)  /* Not cached. */
+/*
+ * Bit usage when in AArch32 state, both A- and M-profile.
+ */
+FIELD(TBFLAG_AM32, CONDEXEC, 0, 8)  /* Not cached. */
+FIELD(TBFLAG_AM32, THUMB, 8, 1) /* Not cached. */
+
+/*
+ * Bit usage when in AArch32 state, for A-profile only.
+ */
+FIELD(TBFLAG_A32, VECLEN, 9, 3) /* Not cached. */
+FIELD(TBFLAG_A32, VECSTRIDE, 12, 2) /* Not cached. */
 /*
  * We store the bottom two bits of the CPAR as TB flags and handle
  * checks on the other bits at runtime. This shares the same bits as
  * VECSTRIDE, which is OK as no XScale CPU has VFP.
  * Not cached, because VECLEN+VECSTRIDE are not cached.
  */
-FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
+FIELD(TBFLAG_A32, XSCALE_CPAR, 12, 2)
+FIELD(TBFLAG_A32, VFPEN, 14, 1) /* Partially cached, minus FPEXC. */
+FIELD(TBFLAG_A32, SCTLR_B, 15, 1)
 /*
  * Indicates whether cp register reads and writes by guest code should access
  * the secure or nonsecure bank of banked registers; note that this is not
  * the same thing as the current security state of the processor!
  */
-FIELD(TBFLAG_A32, NS, 6, 1)
-FIELD(TBFLAG_A32, VFPEN, 7, 1)  /* Partially cached, minus FPEXC. */
-FIELD(TBFLAG_A32, CONDEXEC, 8, 8)   /* Not cached. */
-FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-/* For M profile only, set if FPCCR.LSPACT is set */
-FIELD(TBFLAG_A32, LSPACT, 18, 1)/* Not cached. */
-/* For M profile only, set if we must create a new FP context */
-FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1) /* Not cached. */
-/* For M profile only, set if FPCCR.S does not match current security state */
-FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1) /* Not cached. */
-/* For M profile only, Handler (ie not Thread) mode */
-FIELD(TBFLAG_A32, HANDLER, 21, 1)
-/* For M profile only, whether we should generate stack-limit checks */
-FIELD(TBFLAG_A32, STACKCHECK, 22, 1)
+FIELD(TBFLAG_A32, NS, 16, 1)
 
-/* Bit usage when in AArch64 state */
+/*
+ * Bit usage when in AArch32 state, for M-profile only.
+ */
+/* Handler (ie not Thread) mode */
+FIELD(TBFLAG_M32, HANDLER, 9, 1)
+/* Whether we should generate stack-limit checks */
+FIELD(TBFLAG_M32, STACKCHECK, 10, 1)
+/* Set if FPCCR.LSPACT is set */
+FIELD(TBFLAG_M32, LSPACT, 11, 1) /* Not cached. */
+/* Set if we must create a new FP context */
+FIELD(TBFLAG_M32, NEW_FP_CTXT_NEEDED, 12, 1) /* Not cached. */
+/* Set if FPCCR.S does not match current security state */
+FIELD(TBFLAG_M32, FPCCR_S_WRONG, 13, 1)  /* Not cached. */
+
+/*
+ * Bit usage when in AArch64 state
+ */
 FIELD(TBFLAG_A64, TBII, 0, 2)
 FIELD(TBFLAG_A64, SVEEXC_EL, 2, 2)
 FIELD(TBFLAG_A64, ZCR_LEN, 4, 4)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 5172843667..ec5c7fa325 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11207,11 +11207,8 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, 
int fp_el,
 {
 uint32_t flags = 0;
 
-/* v8M always enables the fpu.  */
-flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
-
 if (arm_v7m_is_handler_mode(env)) {
-flags = FIELD_DP32(flags, TBFLAG_A32, HANDLER, 1);
+flags = FIELD_DP32(flags, TBFLAG_M32, HANDLER, 1);
 }
 
 /*
@@ -11222,7 +11219,7 @@ static uint32_t rebuild_hflags_m32(CPUARMState *env, 
int fp_el,
 if (arm_feature(env, ARM_FEATURE_V8) &&
 !((mmu_idx & ARM_MMU_IDX_M_NEGPRI) &&
   (env->v7m.ccr[env->v7m.secure] & R_V7M_CCR_STKOFHFNMIGN_MASK))) {
-flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
+flags = FIELD_DP32(flags, TBFLAG_M32, STACKCHECK, 1);
 }
 
 return rebuild_hflags_common_32(env, fp_el, mmu_idx, flags);
@@ -11385,7 +11382,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, 
target_ulong *pc,
 if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
 FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S)
 != env->v7m.secure) {
-flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
+flags = FIELD_DP32(flags, TBFLAG_M32, FPCCR_S_WRONG, 1);
 }
 
 if ((env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
@@ -11397,12 

[PATCH v4 07/40] target/arm: Simplify tlb_force_broadcast alternatives

2019-12-02 Thread Richard Henderson
Rather than calling a separate function and re-computing the
parameters for the flush, simply use the correct flush
function directly.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 52 +
 1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0b0130d814..6c09cda4ea 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -614,56 +614,54 @@ static void tlbiall_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
   uint64_t value)
 {
 /* Invalidate all (TLBIALL) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
 if (tlb_force_broadcast(env)) {
-tlbiall_is_write(env, NULL, value);
-return;
+tlb_flush_all_cpus_synced(cs);
+} else {
+tlb_flush(cs);
 }
-
-tlb_flush(CPU(cpu));
 }
 
 static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
 /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
+value &= TARGET_PAGE_MASK;
 if (tlb_force_broadcast(env)) {
-tlbimva_is_write(env, NULL, value);
-return;
+tlb_flush_page_all_cpus_synced(cs, value);
+} else {
+tlb_flush_page(cs, value);
 }
-
-tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 }
 
 static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
 {
 /* Invalidate by ASID (TLBIASID) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
 if (tlb_force_broadcast(env)) {
-tlbiasid_is_write(env, NULL, value);
-return;
+tlb_flush_all_cpus_synced(cs);
+} else {
+tlb_flush(cs);
 }
-
-tlb_flush(CPU(cpu));
 }
 
 static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
 {
 /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
+value &= TARGET_PAGE_MASK;
 if (tlb_force_broadcast(env)) {
-tlbimvaa_is_write(env, NULL, value);
-return;
+tlb_flush_page_all_cpus_synced(cs, value);
+} else {
+tlb_flush_page(cs, value);
 }
-
-tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 }
 
 static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3915,11 +3913,10 @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 int mask = vae1_tlbmask(env);
 
 if (tlb_force_broadcast(env)) {
-tlbi_aa64_vmalle1is_write(env, NULL, value);
-return;
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
+} else {
+tlb_flush_by_mmuidx(cs, mask);
 }
-
-tlb_flush_by_mmuidx(cs, mask);
 }
 
 static int vmalle1_tlbmask(CPUARMState *env)
@@ -4041,11 +4038,10 @@ static void tlbi_aa64_vae1_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
 if (tlb_force_broadcast(env)) {
-tlbi_aa64_vae1is_write(env, NULL, value);
-return;
+tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr, mask);
+} else {
+tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
-
-tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
 
 static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.17.1




[PATCH v4 09/40] target/arm: Rename ARMMMUIdx_S2NS to ARMMMUIdx_Stage2

2019-12-02 Thread Richard Henderson
The EL1&0 regime is the only one that uses 2-stage translation.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  4 +--
 target/arm/internals.h |  2 +-
 target/arm/helper.c| 57 --
 target/arm/translate-a64.c |  2 +-
 target/arm/translate.c |  2 +-
 5 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 802cddd2df..fdb868f2e9 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2870,7 +2870,7 @@ typedef enum ARMMMUIdx {
 ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1SE0 = 4 | ARM_MMU_IDX_A,
 ARMMMUIdx_S1SE1 = 5 | ARM_MMU_IDX_A,
-ARMMMUIdx_S2NS = 6 | ARM_MMU_IDX_A,
+ARMMMUIdx_Stage2 = 6 | ARM_MMU_IDX_A,
 ARMMMUIdx_MUser = 0 | ARM_MMU_IDX_M,
 ARMMMUIdx_MPriv = 1 | ARM_MMU_IDX_M,
 ARMMMUIdx_MUserNegPri = 2 | ARM_MMU_IDX_M,
@@ -2896,7 +2896,7 @@ typedef enum ARMMMUIdxBit {
 ARMMMUIdxBit_S1E3 = 1 << 3,
 ARMMMUIdxBit_S1SE0 = 1 << 4,
 ARMMMUIdxBit_S1SE1 = 1 << 5,
-ARMMMUIdxBit_S2NS = 1 << 6,
+ARMMMUIdxBit_Stage2 = 1 << 6,
 ARMMMUIdxBit_MUser = 1 << 0,
 ARMMMUIdxBit_MPriv = 1 << 1,
 ARMMMUIdxBit_MUserNegPri = 1 << 2,
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 54142dd789..ca8be78bbf 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -813,7 +813,7 @@ static inline bool regime_is_secure(CPUARMState *env, 
ARMMMUIdx mmu_idx)
 case ARMMMUIdx_S1NSE0:
 case ARMMMUIdx_S1NSE1:
 case ARMMMUIdx_S1E2:
-case ARMMMUIdx_S2NS:
+case ARMMMUIdx_Stage2:
 case ARMMMUIdx_MPrivNegPri:
 case ARMMMUIdx_MUserNegPri:
 case ARMMMUIdx_MPriv:
diff --git a/target/arm/helper.c b/target/arm/helper.c
index d2b90763ca..97677f8482 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -672,7 +672,7 @@ static void tlbiall_nsnh_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 tlb_flush_by_mmuidx(cs,
 ARMMMUIdxBit_EL10_1 |
 ARMMMUIdxBit_EL10_0 |
-ARMMMUIdxBit_S2NS);
+ARMMMUIdxBit_Stage2);
 }
 
 static void tlbiall_nsnh_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -683,7 +683,7 @@ static void tlbiall_nsnh_is_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 tlb_flush_by_mmuidx_all_cpus_synced(cs,
 ARMMMUIdxBit_EL10_1 |
 ARMMMUIdxBit_EL10_0 |
-ARMMMUIdxBit_S2NS);
+ARMMMUIdxBit_Stage2);
 }
 
 static void tlbiipas2_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -704,7 +704,7 @@ static void tlbiipas2_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 
 pageaddr = sextract64(value << 12, 0, 40);
 
-tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_S2NS);
+tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
 }
 
 static void tlbiipas2_is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -720,7 +720,7 @@ static void tlbiipas2_is_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 pageaddr = sextract64(value << 12, 0, 40);
 
 tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr,
- ARMMMUIdxBit_S2NS);
+ ARMMMUIdxBit_Stage2);
 }
 
 static void tlbiall_hyp_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3528,12 +3528,15 @@ static void vttbr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 ARMCPU *cpu = env_archcpu(env);
 CPUState *cs = CPU(cpu);
 
-/* Accesses to VTTBR may change the VMID so we must flush the TLB.  */
+/*
+ * A change in VMID to the stage2 page table (Stage2) invalidates
+ * the combined stage 1&2 tlbs (EL10_1 and EL10_0).
+ */
 if (raw_read(env, ri) != value) {
 tlb_flush_by_mmuidx(cs,
 ARMMMUIdxBit_EL10_1 |
 ARMMMUIdxBit_EL10_0 |
-ARMMMUIdxBit_S2NS);
+ARMMMUIdxBit_Stage2);
 raw_write(env, ri, value);
 }
 }
@@ -3929,7 +3932,7 @@ static int vmalle1_tlbmask(CPUARMState *env)
 if (arm_is_secure_below_el3(env)) {
 return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
 } else if (arm_feature(env, ARM_FEATURE_EL2)) {
-return ARMMMUIdxBit_EL10_1 | ARMMMUIdxBit_EL10_0 | ARMMMUIdxBit_S2NS;
+return ARMMMUIdxBit_EL10_1 | ARMMMUIdxBit_EL10_0 | ARMMMUIdxBit_Stage2;
 } else {
 return ARMMMUIdxBit_EL10_1 | ARMMMUIdxBit_EL10_0;
 }
@@ -4083,7 +4086,7 @@ static void tlbi_aa64_ipas2e1_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 
 pageaddr = sextract64(value << 12, 0, 48);
 
-tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_S2NS);
+tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_Stage2);
 }
 
 static void tlbi_aa64_ipas2e1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ 

[PATCH v4 04/40] target/arm: Add TTBR1_EL2

2019-12-02 Thread Richard Henderson
At the same time, add a writefn to TTBR0_EL2 and TCR_EL2.
A later patch will use it to update any ASID therein.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index b4d774632d..06ec4641f3 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3484,6 +3484,12 @@ static void vmsa_ttbr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 raw_write(env, ri, value);
 }
 
+static void vmsa_tcr_ttbr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+raw_write(env, ri, value);
+}
+
 static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
@@ -4893,10 +4899,8 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
   .resetvalue = 0 },
 { .name = "TCR_EL2", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 2,
-  .access = PL2_RW,
-  /* no .writefn needed as this can't cause an ASID change;
-   * no .raw_writefn or .resetfn needed as we never use mask/base_mask
-   */
+  .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
+  /* no .raw_writefn or .resetfn needed as we never use mask/base_mask */
   .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[2]) },
 { .name = "VTCR", .state = ARM_CP_STATE_AA32,
   .cp = 15, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
@@ -4930,7 +4934,7 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
   .fieldoffset = offsetof(CPUARMState, cp15.tpidr_el[2]) },
 { .name = "TTBR0_EL2", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 0,
-  .access = PL2_RW, .resetvalue = 0,
+  .access = PL2_RW, .resetvalue = 0, .writefn = vmsa_tcr_ttbr_el2_write,
   .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[2]) },
 { .name = "HTTBR", .cp = 15, .opc1 = 4, .crm = 2,
   .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
@@ -6959,6 +6963,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
   .access = PL2_RW,
   .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2]) },
+{ .name = "TTBR1_EL2", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
+  .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
+  .fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
 REGINFO_SENTINEL
 };
 define_arm_cp_regs(cpu, vhe_reginfo);
-- 
2.17.1




[PATCH v4 06/40] target/arm: Split out vae1_tlbmask, vmalle1_tlbmask

2019-12-02 Thread Richard Henderson
No functional change; this just unifies the code sequences.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 118 ++--
 1 file changed, 37 insertions(+), 81 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 731507a82f..0b0130d814 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3890,70 +3890,61 @@ static CPAccessResult aa64_cacheop_access(CPUARMState 
*env,
  * Page D4-1736 (DDI0487A.b)
  */
 
+static int vae1_tlbmask(CPUARMState *env)
+{
+if (arm_is_secure_below_el3(env)) {
+return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+} else {
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+}
+}
+
 static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
 CPUState *cs = env_cpu(env);
-bool sec = arm_is_secure_below_el3(env);
+int mask = vae1_tlbmask(env);
 
-if (sec) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
-}
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
 }
 
 static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
 CPUState *cs = env_cpu(env);
+int mask = vae1_tlbmask(env);
 
 if (tlb_force_broadcast(env)) {
 tlbi_aa64_vmalle1is_write(env, NULL, value);
 return;
 }
 
+tlb_flush_by_mmuidx(cs, mask);
+}
+
+static int vmalle1_tlbmask(CPUARMState *env)
+{
+/*
+ * Note that the 'ALL' scope must invalidate both stage 1 and
+ * stage 2 translations, whereas most other scopes only invalidate
+ * stage 1 translations.
+ */
 if (arm_is_secure_below_el3(env)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
+return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+} else if (arm_feature(env, ARM_FEATURE_EL2)) {
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0 | ARMMMUIdxBit_S2NS;
 } else {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
 }
 }
 
 static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
-/* Note that the 'ALL' scope must invalidate both stage 1 and
- * stage 2 translations, whereas most other scopes only invalidate
- * stage 1 translations.
- */
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vmalle1_tlbmask(env);
 
-if (arm_is_secure_below_el3(env)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else {
-if (arm_feature(env, ARM_FEATURE_EL2)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
-ARMMMUIdxBit_S2NS);
-} else {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
-}
-}
+tlb_flush_by_mmuidx(cs, mask);
 }
 
 static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3977,28 +3968,10 @@ static void tlbi_aa64_alle3_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
-/* Note that the 'ALL' scope must invalidate both stage 1 and
- * stage 2 translations, whereas most other scopes only invalidate
- * stage 1 translations.
- */
 CPUState *cs = env_cpu(env);
-bool sec = arm_is_secure_below_el3(env);
-bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
+int mask = vmalle1_tlbmask(env);
 
-if (sec) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else if (has_el2) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
-ARMMMUIdxBit_S2NS);
-} else {
-  tlb_flush_by_mmuidx_all_cpus_synced(cs,
-

[PATCH v4 03/40] target/arm: Add CONTEXTIDR_EL2

2019-12-02 Thread Richard Henderson
Not all of the breakpoint types are supported, but those that
only examine contextidr are extended to support the new register.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/debug_helper.c | 50 +--
 target/arm/helper.c   | 11 +
 2 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index dde80273ff..2e3e90c6a5 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -20,6 +20,7 @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 int ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
 int bt;
 uint32_t contextidr;
+uint64_t hcr_el2;
 
 /*
  * Links to unimplemented or non-context aware breakpoints are
@@ -40,24 +41,44 @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 }
 
 bt = extract64(bcr, 20, 4);
-
-/*
- * We match the whole register even if this is AArch32 using the
- * short descriptor format (in which case it holds both PROCID and ASID),
- * since we don't implement the optional v7 context ID masking.
- */
-contextidr = extract64(env->cp15.contextidr_el[1], 0, 32);
+hcr_el2 = arm_hcr_el2_eff(env);
 
 switch (bt) {
 case 3: /* linked context ID match */
-if (arm_current_el(env) > 1) {
-/* Context matches never fire in EL2 or (AArch64) EL3 */
+switch (arm_current_el(env)) {
+default:
+/* Context matches never fire in AArch64 EL3 */
 return false;
+case 2:
+if (!(hcr_el2 & HCR_E2H)) {
+/* Context matches never fire in EL2 without E2H enabled. */
+return false;
+}
+contextidr = env->cp15.contextidr_el[2];
+break;
+case 1:
+contextidr = env->cp15.contextidr_el[1];
+break;
+case 0:
+if ((hcr_el2 & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+contextidr = env->cp15.contextidr_el[2];
+} else {
+contextidr = env->cp15.contextidr_el[1];
+}
+break;
 }
-return (contextidr == extract64(env->cp15.dbgbvr[lbn], 0, 32));
-case 5: /* linked address mismatch (reserved in AArch64) */
+break;
+
+case 7:  /* linked contextidr_el1 match */
+contextidr = env->cp15.contextidr_el[1];
+break;
+case 13: /* linked contextidr_el2 match */
+contextidr = env->cp15.contextidr_el[2];
+break;
+
 case 9: /* linked VMID match (reserved if no EL2) */
 case 11: /* linked context ID and VMID match (reserved if no EL2) */
+case 15: /* linked full context ID match */
 default:
 /*
  * Links to Unlinked context breakpoints must generate no
@@ -66,7 +87,12 @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 return false;
 }
 
-return false;
+/*
+ * We match the whole register even if this is AArch32 using the
+ * short descriptor format (in which case it holds both PROCID and ASID),
+ * since we don't implement the optional v7 context ID masking.
+ */
+return contextidr == (uint32_t)env->cp15.dbgbvr[lbn];
 }
 
 static bool bp_wp_matches(ARMCPU *cpu, int n, bool is_wp)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index d81daadf45..b4d774632d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6953,6 +6953,17 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 define_arm_cp_regs(cpu, lor_reginfo);
 }
 
+if (arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu)) {
+static const ARMCPRegInfo vhe_reginfo[] = {
+{ .name = "CONTEXTIDR_EL2", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
+  .access = PL2_RW,
+  .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2]) },
+REGINFO_SENTINEL
+};
+define_arm_cp_regs(cpu, vhe_reginfo);
+}
+
 if (cpu_isar_feature(aa64_sve, cpu)) {
 define_one_arm_cp_reg(cpu, _el1_reginfo);
 if (arm_feature(env, ARM_FEATURE_EL2)) {
-- 
2.17.1
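As an aside, the comparison at the end of linked_bp_matches() above can be modelled in isolation. This is a hedged sketch (the helper name is ours, not QEMU's) showing that only the low 32 bits of DBGBVR participate in a linked context-ID match, as the new comment in the patch explains:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the match at the end of linked_bp_matches(): a linked
 * context breakpoint fires when the selected CONTEXTIDR value equals the
 * low 32 bits of the breakpoint's DBGBVR; any high bits are ignored. */
static int contextidr_bp_matches(uint32_t contextidr, uint64_t dbgbvr)
{
    return contextidr == (uint32_t)dbgbvr;
}
```

The cast mirrors `(uint32_t)env->cp15.dbgbvr[lbn]` in the patch, so a 64-bit DBGBVR with stale high bits still matches a 32-bit context ID.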




[PATCH v4 02/40] target/arm: Enable HCR_E2H for VHE

2019-12-02 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h| 7 ---
 target/arm/helper.c | 6 +-
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 994cad2014..9729e62d2c 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1387,13 +1387,6 @@ static inline void xpsr_write(CPUARMState *env, uint32_t 
val, uint32_t mask)
 #define HCR_ATA   (1ULL << 56)
 #define HCR_DCT   (1ULL << 57)
 
-/*
- * When we actually implement ARMv8.1-VHE we should add HCR_E2H to
- * HCR_MASK and then clear it again if the feature bit is not set in
- * hcr_write().
- */
-#define HCR_MASK  ((1ULL << 34) - 1)
-
 #define SCR_NS(1U << 0)
 #define SCR_IRQ   (1U << 1)
 #define SCR_FIQ   (1U << 2)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0bf8f53d4b..d81daadf45 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4672,7 +4672,8 @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
 static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
 ARMCPU *cpu = env_archcpu(env);
-uint64_t valid_mask = HCR_MASK;
+/* Begin with bits defined in base ARMv8.0.  */
+uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
 
 if (arm_feature(env, ARM_FEATURE_EL3)) {
 valid_mask &= ~HCR_HCD;
@@ -4686,6 +4687,9 @@ static void hcr_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
  */
 valid_mask &= ~HCR_TSC;
 }
+if (cpu_isar_feature(aa64_vh, cpu)) {
+valid_mask |= HCR_E2H;
+}
 if (cpu_isar_feature(aa64_lor, cpu)) {
 valid_mask |= HCR_TLOR;
 }
-- 
2.17.1
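The masking pattern this patch switches to (start from the architecturally defined ARMv8.0 bits, then opt in per-feature bits) can be sketched standalone. This is a hedged toy model, not QEMU's actual hcr_write(); the function name is ours, and it assumes HCR_E2H is bit 34 and HCR_TLOR bit 35, matching the definitions in target/arm/cpu.h:

```c
#include <assert.h>
#include <stdint.h>

#define MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

#define HCR_E2H  (1ULL << 34)
#define HCR_TLOR (1ULL << 35)

/* Toy model of hcr_write(): begin with the base ARMv8.0 bits [33:0],
 * add bits for implemented optional features, then drop anything the
 * guest wrote that this CPU model does not implement. */
static uint64_t hcr_effective(uint64_t value, int have_vh, int have_lor)
{
    uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
    if (have_vh) {
        valid_mask |= HCR_E2H;
    }
    if (have_lor) {
        valid_mask |= HCR_TLOR;
    }
    return value & valid_mask;
}
```

With this shape there is no need for the old fixup comment: a guest write of E2H on a CPU without VHE simply masks to zero.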




[PATCH v4 00/40] target/arm: Implement ARMv8.1-VHE

2019-12-02 Thread Richard Henderson
Version 3 was posted back in August.  Though quite a lot has changed
and perhaps there's no use in comparing.  I haven't done a list.

Against master, it is the first version that can actually boot a
nested kernel under kvm, so that's certainly a change for the better.

It's not even particularly slow.  With both outer and nested kernel
using a minimal busybox initrd, the outer kernel boots in 4.0 seconds
and the nested kernel boots in 6.7 seconds.


r~


Alex Bennée (1):
  target/arm: check TGE and E2H flags for EL0 pauth traps

Richard Henderson (39):
  target/arm: Define isar_feature_aa64_vh
  target/arm: Enable HCR_E2H for VHE
  target/arm: Add CONTEXTIDR_EL2
  target/arm: Add TTBR1_EL2
  target/arm: Update CNTVCT_EL0 for VHE
  target/arm: Split out vae1_tlbmask, vmalle1_tlbmask
  target/arm: Simplify tlb_force_broadcast alternatives
  target/arm: Rename ARMMMUIdx*_S12NSE* to ARMMMUIdx*_E10_*
  target/arm: Rename ARMMMUIdx_S2NS to ARMMMUIdx_Stage2
  target/arm: Rename ARMMMUIdx_S1NSE* to ARMMMUIdx_Stage1_E*
  target/arm: Rename ARMMMUIdx_S1SE* to ARMMMUIdx_SE*
  target/arm: Rename ARMMMUIdx*_S1E3 to ARMMMUIdx*_SE3
  target/arm: Rename ARMMMUIdx_S1E2 to ARMMMUIdx_E2
  target/arm: Recover 4 bits from TBFLAGs
  target/arm: Expand TBFLAG_ANY.MMUIDX to 4 bits
  target/arm: Rearrange ARMMMUIdxBit
  target/arm: Tidy ARMMMUIdx m-profile definitions
  target/arm: Reorganize ARMMMUIdx
  target/arm: Add regime_has_2_ranges
  target/arm: Update arm_mmu_idx for VHE
  target/arm: Update arm_sctlr for VHE
  target/arm: Update aa64_zva_access for EL2
  target/arm: Update ctr_el0_access for EL2
  target/arm: Add the hypervisor virtual counter
  target/arm: Update timer access for VHE
  target/arm: Update define_one_arm_cp_reg_with_opaque for VHE
  target/arm: Add VHE system register redirection and aliasing
  target/arm: Add VHE timer register redirection and aliasing
  target/arm: Flush tlb for ASID changes in EL2&0 translation regime
  target/arm: Flush tlbs for E2&0 translation regime
  target/arm: Update arm_phys_excp_target_el for TGE
  target/arm: Update {fp,sve}_exception_el for VHE
  target/arm: Update get_a64_user_mem_index for VHE
  target/arm: Update arm_cpu_do_interrupt_aarch64 for VHE
  target/arm: Enable ARMv8.1-VHE in -cpu max
  target/arm: Move arm_excp_unmasked to cpu.c
  target/arm: Pass more cpu state to arm_excp_unmasked
  target/arm: Use bool for unmasked in arm_excp_unmasked
  target/arm: Raise only one interrupt in arm_cpu_exec_interrupt

 target/arm/cpu-param.h |2 +-
 target/arm/cpu-qom.h   |1 +
 target/arm/cpu.h   |  434 +
 target/arm/internals.h |   71 ++-
 target/arm/translate.h |4 +-
 target/arm/cpu.c   |  161 -
 target/arm/cpu64.c |1 +
 target/arm/debug_helper.c  |   50 +-
 target/arm/helper-a64.c|2 +-
 target/arm/helper.c| 1219 +++-
 target/arm/pauth_helper.c  |   14 +-
 target/arm/translate-a64.c |   47 +-
 target/arm/translate.c |   73 ++-
 13 files changed, 1397 insertions(+), 682 deletions(-)

-- 
2.17.1




[PATCH v4 01/40] target/arm: Define isar_feature_aa64_vh

2019-12-02 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 83a809d4ba..994cad2014 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3632,6 +3632,11 @@ static inline bool isar_feature_aa64_sve(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
 }
 
+static inline bool isar_feature_aa64_vh(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, VH) != 0;
+}
+
 static inline bool isar_feature_aa64_lor(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, LO) != 0;
-- 
2.17.1
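For readers unfamiliar with QEMU's FIELD_EX64 machinery, here is a hedged sketch of what the accessor above boils down to. The helper names are ours; it assumes the VH field occupies bits [11:8] of ID_AA64MMFR1, per the Arm ARM (QEMU generates the real offsets from FIELD() macros in cpu.h):

```c
#include <assert.h>
#include <stdint.h>

/* Hand-rolled stand-in for extract64(): pull a bitfield of the given
 * start and length out of a 64-bit register value. */
static inline uint64_t extract64_field(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/* ID_AA64MMFR1.VH is the 4-bit field at bits [11:8]; nonzero means
 * ARMv8.1-VHE is implemented. */
static inline int isar_feature_aa64_vh(uint64_t id_aa64mmfr1)
{
    return extract64_field(id_aa64mmfr1, 8, 4) != 0;
}
```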




Re: [RESEND PATCH v21 0/6] Add ARMv8 RAS virtualization support in QEMU

2019-12-02 Thread gengdongjiu
On 2019/12/3 2:27, Peter Maydell wrote:
>> application within guest, host does not know which application encounters
>> errors.
>>
>> For the ARMv8 SEA/SEI, KVM or host kernel delivers SIGBUS to notify 
>> userspace.
>> After user space gets the notification, it will record the CPER into guest 
>> GHES
>> buffer and inject an exception or IRQ into guest.
>>
>> In the current implementation, if the type of SIGBUS is BUS_MCEERR_AR, we 
>> will
>> treat it as a synchronous exception, and notify guest with ARMv8 SEA
>> notification type after recording CPER into guest.
> Hi; I've given you reviewed-by tags on a couple of patches; other
> people have given review comments on some of the other patches,
> so I think you have enough to do a v22 addressing those.
Thanks very much for the reviewed-by tags; we will upload v22.


> > thanks
> -- PMM
> .
> 




Re: [PATCH v37 00/17] QEMU AVR 8 bit cores

2019-12-02 Thread Aleksandar Markovic
On Tuesday, December 3, 2019, Aleksandar Markovic <
aleksandar.m.m...@gmail.com> wrote:

>
>
> On Tuesday, December 3, 2019, Aleksandar Markovic <
> aleksandar.m.m...@gmail.com> wrote:
>
>>
>>
>> On Monday, December 2, 2019, Aleksandar Markovic <
>> aleksandar.m.m...@gmail.com> wrote:
>>
>>>
>>>
>>> On Monday, December 2, 2019, Michael Rolnik  wrote:
>>>
 how can I get this elf flags from within QEMU?

>
>
>>> In one of files from your "machine" patch, you have this snippet:
>>>
>>> +bytes_loaded = load_elf(
>>> +filename, NULL, NULL, NULL, NULL, NULL, NULL, 0, EM_NONE,
>>> 0, 0);
>>>
>>> With this line you kind of "blindly" load whatever you find in the
>>> file "filename". I think you need to modify load_elf() to fetch the
>>> information on what core the elf in question is compiled for. Additionally,
>>> you most likely have to check if the elf file is compiled for AVR at all.
>>>
>>> I don't know enough about AVR-specifics of the ELF format, but I know that
>>> we in MIPS successfully read some MIPS-specific things we need to know. Do
>>> some research for ELF format header content for AVR.
>>>
>>> This is really missing in your series, seriously.
>>>
>>> Please keep in mind that I don't have any dev system at hand right now,
>>> so all I said here is off the top of my head.
>>>
>>> You have to do some code digging.
>>>
>>>
>> First, you need to update
>>
>> https://github.com/qemu/qemu/blob/master/include/elf.h
>>
>> with bits and pieces for AVR.
>>
>> In binutils file:
>>
>> https://github.com/bminor/binutils-gdb/blob/master/include/elf/common.h
>>
>> you will spot the line:
>>
>> #define EM_AVR 83 /* Atmel AVR 8-bit microcontroller */
>>
>> that is the value of e_machine field for AVR, which you need to insert in
>> qemu's include/elf.h about at line 162.
>>
>> Then, in another binutils file:
>>
>> https://github.com/bminor/binutils-gdb/blob/master/include/elf/avr.h
>>
>> you find the lines:
>>
>> #define E_AVR_MACH_AVR1 1
>> #define E_AVR_MACH_AVR2 2
>> #define E_AVR_MACH_AVR25 25
>> #define E_AVR_MACH_AVR3 3
>> #define E_AVR_MACH_AVR31 31
>> #define E_AVR_MACH_AVR35 35
>> #define E_AVR_MACH_AVR4 4
>> #define E_AVR_MACH_AVR5 5
>> #define E_AVR_MACH_AVR51 51
>> #define E_AVR_MACH_AVR6 6
>> #define E_AVR_MACH_AVRTINY 100
>> #define E_AVR_MACH_XMEGA1 101
>> #define E_AVR_MACH_XMEGA2 102
>> #define E_AVR_MACH_XMEGA3 103
>> #define E_AVR_MACH_XMEGA4 104
>> #define E_AVR_MACH_XMEGA5 105
>> #define E_AVR_MACH_XMEGA6 106
>> #define E_AVR_MACH_XMEGA7 107
>>
>> These you also need to insert in qemu's include/elf.h, probably at the end
>> of the file or elsewhere.
>>
>> Perhaps something more you need to insert into that file, you'll see.
>>
>> Then, you need to modify the file where load_elf() resides with AVR
>> support, take a look at other architectures' support, and adjust to what
>> you need.
>>
>> I know it will be contrived at times, but, personally, similar ELF
>> support must be done for any upcoming platform. Only if there is some
>> insurmountable obstacle can that support be omitted.
>>
>> I am on vacation next 10 days.
>>
>>
> In the source of readelf utility:
>
>
> static void
> decode_AVR_machine_flags (unsigned e_flags, char buf[], size_t size)
> {
>   --size; /* Leave space for null terminator.  */
>
>   switch (e_flags & EF_AVR_MACH)
> {
> case E_AVR_MACH_AVR1:
>   strncat (buf, ", avr:1", size);
>   break;
> case E_AVR_MACH_AVR2:
>   strncat (buf, ", avr:2", size);
>   break;
> case E_AVR_MACH_AVR25:
>   strncat (buf, ", avr:25", size);
>   break;
> case E_AVR_MACH_AVR3:
>   strncat (buf, ", avr:3", size);
>   break;
> case E_AVR_MACH_AVR31:
>   strncat (buf, ", avr:31", size);
>   break;
> case E_AVR_MACH_AVR35:
>   strncat (buf, ", avr:35", size);
>   break;
> case E_AVR_MACH_AVR4:
>   strncat (buf, ", avr:4", size);
>   break;
> case E_AVR_MACH_AVR5:
>   strncat (buf, ", avr:5", size);
>   break;
> case E_AVR_MACH_AVR51:
>   strncat (buf, ", avr:51", size);
>   break;
> case E_AVR_MACH_AVR6:
>   strncat (buf, ", avr:6", size);
>   break;
> case E_AVR_MACH_AVRTINY:
>   strncat (buf, ", avr:100", size);
>   break;
> case E_AVR_MACH_XMEGA1:
>   strncat (buf, ", avr:101", size);
>   break;
> case E_AVR_MACH_XMEGA2:
>   strncat (buf, ", avr:102", size);
>   break;
> case E_AVR_MACH_XMEGA3:
>   strncat (buf, ", avr:103", size);
>   break;
> case E_AVR_MACH_XMEGA4:
>   strncat (buf, ", avr:104", size);
>   break;
> case E_AVR_MACH_XMEGA5:
>   strncat (buf, ", avr:105", size);
>   break;
> case E_AVR_MACH_XMEGA6:
>   strncat (buf, ", avr:106", size);
>   break;
> case E_AVR_MACH_XMEGA7:
>   strncat (buf, ", avr:107", size);
>   break;
> default:
>   strncat (buf, ", avr:", size);
>   break;
> }
>
>
> So, it looks, for 8-bit AVR, 

[PATCH] virtio-balloon: fix memory leak while attach virtio-balloon device

2019-12-02 Thread pannengyuan
From: PanNengyuan 

ivq/dvq/svq/free_page_vq are not cleaned up in
virtio_balloon_device_unrealize; the memory leak stack is as follows:

Direct leak of 14336 byte(s) in 2 object(s) allocated from:
#0 0x7f99fd9d8560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
#1 0x7f99fcb20015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
#2 0x557d90638437 in virtio_add_queue 
/mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:2327
#3 0x557d9064401d in virtio_balloon_device_realize 
/mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio-balloon.c:793
#4 0x557d906356f7 in virtio_device_realize 
/mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:3504
#5 0x557d9073f081 in device_set_realized 
/mnt/sdb/qemu-4.2.0-rc0/hw/core/qdev.c:876
#6 0x557d908b1f4d in property_set_bool 
/mnt/sdb/qemu-4.2.0-rc0/qom/object.c:2080
#7 0x557d908b655e in object_property_set_qobject 
/mnt/sdb/qemu-4.2.0-rc0/qom/qom-qobject.c:26

Reported-by: Euler Robot 
Signed-off-by: PanNengyuan 
---
 hw/virtio/virtio-balloon.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 40b04f5..5329c65 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -831,6 +831,13 @@ static void virtio_balloon_device_unrealize(DeviceState 
*dev, Error **errp)
 }
 balloon_stats_destroy_timer(s);
 qemu_remove_balloon_handler(s);
+
+virtio_del_queue(vdev, 0);
+virtio_del_queue(vdev, 1);
+virtio_del_queue(vdev, 2);
+if (s->free_page_vq) {
+virtio_del_queue(vdev, 3);
+}
 virtio_cleanup(vdev);
 }
 
-- 
2.7.2.windows.1





Re: [PATCH v37 00/17] QEMU AVR 8 bit cores

2019-12-02 Thread Aleksandar Markovic
On Tuesday, December 3, 2019, Aleksandar Markovic <
aleksandar.m.m...@gmail.com> wrote:

>
>
> On Monday, December 2, 2019, Aleksandar Markovic <
> aleksandar.m.m...@gmail.com> wrote:
>
>>
>>
>> On Monday, December 2, 2019, Michael Rolnik  wrote:
>>
>>> how can I get this elf flags from within QEMU?
>>>


>> In one of files from your "machine" patch, you have this snippet:
>>
>> +bytes_loaded = load_elf(
>> +filename, NULL, NULL, NULL, NULL, NULL, NULL, 0, EM_NONE, 0,
>> 0);
>>
>> With this line you kind of "blindly" load whatever you find in the file
>> "filename". I think you need to modify load_elf() to fetch the information
>> on what core the elf in question is compiled for. Additionally, you most
>> likely have to check if the elf file is compiled for AVR at all.
>>
>> I don't know enough about AVR-specifics of the ELF format, but I know that we
>> in MIPS successfully read some MIPS-specific things we need to know. Do some
>> research for ELF format header content for AVR.
>>
>> This is really missing in your series, seriously.
>>
>> Please keep in mind that I don't have any dev system at hand right now,
>> so all I said here is off the top of my head.
>>
>> You have to do some code digging.
>>
>>
> First, you need to update
>
> https://github.com/qemu/qemu/blob/master/include/elf.h
>
> with bits and pieces for AVR.
>
> In binutils file:
>
> https://github.com/bminor/binutils-gdb/blob/master/include/elf/common.h
>
> you will spot the line:
>
> #define EM_AVR 83 /* Atmel AVR 8-bit microcontroller */
>
> that is the value of e_machine field for AVR, which you need to insert in
> qemu's include/elf.h about at line 162.
>
> Then, in another binutils file:
>
> https://github.com/bminor/binutils-gdb/blob/master/include/elf/avr.h
>
> you find the lines:
>
> #define E_AVR_MACH_AVR1 1
> #define E_AVR_MACH_AVR2 2
> #define E_AVR_MACH_AVR25 25
> #define E_AVR_MACH_AVR3 3
> #define E_AVR_MACH_AVR31 31
> #define E_AVR_MACH_AVR35 35
> #define E_AVR_MACH_AVR4 4
> #define E_AVR_MACH_AVR5 5
> #define E_AVR_MACH_AVR51 51
> #define E_AVR_MACH_AVR6 6
> #define E_AVR_MACH_AVRTINY 100
> #define E_AVR_MACH_XMEGA1 101
> #define E_AVR_MACH_XMEGA2 102
> #define E_AVR_MACH_XMEGA3 103
> #define E_AVR_MACH_XMEGA4 104
> #define E_AVR_MACH_XMEGA5 105
> #define E_AVR_MACH_XMEGA6 106
> #define E_AVR_MACH_XMEGA7 107
>
> These you also need to insert in qemu's include/elf.h, probably at the end
> of the file or elsewhere.
>
> Perhaps something more you need to insert into that file, you'll see.
>
> Then, you need to modify the file where load_elf() resides with AVR
> support, take a look at other architectures' support, and adjust to what
> you need.
>
> I know it will be contrived at times, but, personally, similar ELF
> support must be done for any upcoming platform. Only if there is some
> insurmountable obstacle can that support be omitted.
>
> I am on vacation next 10 days.
>
>
In the source of readelf utility:


static void
decode_AVR_machine_flags (unsigned e_flags, char buf[], size_t size)
{
  --size; /* Leave space for null terminator.  */

  switch (e_flags & EF_AVR_MACH)
{
case E_AVR_MACH_AVR1:
  strncat (buf, ", avr:1", size);
  break;
case E_AVR_MACH_AVR2:
  strncat (buf, ", avr:2", size);
  break;
case E_AVR_MACH_AVR25:
  strncat (buf, ", avr:25", size);
  break;
case E_AVR_MACH_AVR3:
  strncat (buf, ", avr:3", size);
  break;
case E_AVR_MACH_AVR31:
  strncat (buf, ", avr:31", size);
  break;
case E_AVR_MACH_AVR35:
  strncat (buf, ", avr:35", size);
  break;
case E_AVR_MACH_AVR4:
  strncat (buf, ", avr:4", size);
  break;
case E_AVR_MACH_AVR5:
  strncat (buf, ", avr:5", size);
  break;
case E_AVR_MACH_AVR51:
  strncat (buf, ", avr:51", size);
  break;
case E_AVR_MACH_AVR6:
  strncat (buf, ", avr:6", size);
  break;
case E_AVR_MACH_AVRTINY:
  strncat (buf, ", avr:100", size);
  break;
case E_AVR_MACH_XMEGA1:
  strncat (buf, ", avr:101", size);
  break;
case E_AVR_MACH_XMEGA2:
  strncat (buf, ", avr:102", size);
  break;
case E_AVR_MACH_XMEGA3:
  strncat (buf, ", avr:103", size);
  break;
case E_AVR_MACH_XMEGA4:
  strncat (buf, ", avr:104", size);
  break;
case E_AVR_MACH_XMEGA5:
  strncat (buf, ", avr:105", size);
  break;
case E_AVR_MACH_XMEGA6:
  strncat (buf, ", avr:106", size);
  break;
case E_AVR_MACH_XMEGA7:
  strncat (buf, ", avr:107", size);
  break;
default:
  strncat (buf, ", avr:", size);
  break;
}


So, it looks like, for 8-bit AVR, e_machine must be 83 (EM_AVR), while e_flags
holds one of the E_AVR_MACH_XXX constants. You just need to store somewhere the
E_AVR_MACH_XXX value that you read from the given ELF file, and compare it with
the core specified by the "-cpu" command line option.
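To make the suggested check concrete, here is a hedged, self-contained sketch (the function name is ours, not QEMU's) of pulling e_machine and the AVR core family out of a raw ELF32 header. It assumes a little-endian ELF file, which is what avr-gcc produces, and uses the EM_AVR and EF_AVR_MACH values from binutils quoted above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EM_AVR      83    /* from binutils include/elf/common.h */
#define EF_AVR_MACH 0x7f  /* low e_flags bits; from binutils include/elf/avr.h */

/* Given the raw bytes of an ELF32 header, extract e_machine (offset 18)
 * and the AVR core family from e_flags (offset 36), reading both as
 * little-endian.  Returns -1 if the buffer is not an ELF image. */
static int read_avr_elf_ids(const uint8_t *hdr, uint16_t *machine,
                            uint32_t *avr_mach)
{
    if (memcmp(hdr, "\x7f" "ELF", 4) != 0) {
        return -1;
    }
    *machine = (uint16_t)hdr[18] | ((uint16_t)hdr[19] << 8);
    uint32_t flags = (uint32_t)hdr[36] | ((uint32_t)hdr[37] << 8)
                   | ((uint32_t)hdr[38] << 16) | ((uint32_t)hdr[39] << 24);
    *avr_mach = flags & EF_AVR_MACH;
    return 0;
}
```

A loader would reject files where `*machine != EM_AVR`, and compare `*avr_mach` against the core selected by "-cpu".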



> Yours,
> Aleksandar
>
> .
>
>> Best regards, Aleksandar
>>
>>
>>> On Mon, Dec 2, 2019 at 4:01 PM 

Re: [PATCH] virtio-serial-bus: fix memory leak while attach virtio-serial-bus

2019-12-02 Thread pannengyuan



On 2019/12/2 21:58, Laurent Vivier wrote:
> On 02/12/2019 12:15, pannengy...@huawei.com wrote:
>> From: PanNengyuan 
>>
>> ivqs/ovqs/c_ivq/c_ovq are not cleaned up in
>> virtio_serial_device_unrealize; the memory leak stack is as follows:
>>
>> Direct leak of 1290240 byte(s) in 180 object(s) allocated from:
>> #0 0x7fc9bfc27560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
>> #1 0x7fc9bed6f015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
>> #2 0x5650e02b83e7 in virtio_add_queue 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:2327
>> #3 0x5650e02847b5 in virtio_serial_device_realize 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/char/virtio-serial-bus.c:1089
>> #4 0x5650e02b56a7 in virtio_device_realize 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio.c:3504
>> #5 0x5650e03bf031 in device_set_realized 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/core/qdev.c:876
>> #6 0x5650e0531efd in property_set_bool 
>> /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:2080
>> #7 0x5650e053650e in object_property_set_qobject 
>> /mnt/sdb/qemu-4.2.0-rc0/qom/qom-qobject.c:26
>> #8 0x5650e0533e14 in object_property_set_bool 
>> /mnt/sdb/qemu-4.2.0-rc0/qom/object.c:1338
>> #9 0x5650e04c0e37 in virtio_pci_realize 
>> /mnt/sdb/qemu-4.2.0-rc0/hw/virtio/virtio-pci.c:1801
>>
>> Reported-by: Euler Robot 
>> Signed-off-by: PanNengyuan 
>> ---
>>  hw/char/virtio-serial-bus.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
>> index 3325904..da9019a 100644
>> --- a/hw/char/virtio-serial-bus.c
>> +++ b/hw/char/virtio-serial-bus.c
>> @@ -1126,9 +1126,15 @@ static void 
>> virtio_serial_device_unrealize(DeviceState *dev, Error **errp)
>>  {
>>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>  VirtIOSerial *vser = VIRTIO_SERIAL(dev);
>> +int i;
>>  
>>  QLIST_REMOVE(vser, next);
>>  
>> +for (i = 0; i <= vser->bus.max_nr_ports; i++) {
>> +virtio_del_queue(vdev, 2 * i);
>> +virtio_del_queue(vdev, 2 * i + 1);
>> +}
>> +
> 
> According to virtio_serial_device_realize() and the number of
> virtio_add_queue(), I think you have more queues to delete:
> 
>   4 + 2 * vser->bus.max_nr_ports
> 
> (for vser->ivqs[0], vser->ovqs[0], vser->c_ivq, vser->c_ovq,
> vser->ivqs[i], vser->ovqs[i]).
> 
> Thanks,
> Laurent
> 
> 
Thanks, but I think the queue count is correct; the queues added in
virtio_serial_device_realize() are as follows:

// here is 2
vser->ivqs[0] = virtio_add_queue(vdev, 128, handle_input);
vser->ovqs[0] = virtio_add_queue(vdev, 128, handle_output);

// here is 2
vser->c_ivq = virtio_add_queue(vdev, 32, control_in);
vser->c_ovq = virtio_add_queue(vdev, 32, control_out);

// here 2 * (max_nr_ports - 1)  - i is from 1 to max_nr_ports - 1
for (i = 1; i < vser->bus.max_nr_ports; i++) {
vser->ivqs[i] = virtio_add_queue(vdev, 128, handle_input);
vser->ovqs[i] = virtio_add_queue(vdev, 128, handle_output);
}

so the total queues number is:  2 * (vser->bus.max_nr_ports + 1)
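If it helps to see the arithmetic, here is a hedged, self-contained sketch (function names are ours; real QEMU code manipulates VirtQueue objects) counting the queues the realize path adds against the queues the proposed unrealize loop deletes:

```c
#include <assert.h>

/* Queues added by virtio_serial_device_realize(): ivqs[0]/ovqs[0],
 * c_ivq/c_ovq, then a pair per extra port (i = 1 .. max_nr_ports - 1). */
static int queues_added(int max_nr_ports)
{
    int n = 0;
    n += 2;                          /* ivqs[0], ovqs[0] */
    n += 2;                          /* c_ivq, c_ovq */
    for (int i = 1; i < max_nr_ports; i++) {
        n += 2;                      /* ivqs[i], ovqs[i] */
    }
    return n;
}

/* Queues deleted by the patch's loop: pairs 2*i and 2*i+1 for
 * i = 0 .. max_nr_ports inclusive. */
static int queues_deleted(int max_nr_ports)
{
    int n = 0;
    for (int i = 0; i <= max_nr_ports; i++) {
        n += 2;                      /* virtio_del_queue(vdev, 2*i), 2*i+1 */
    }
    return n;
}
```

Both sides come to 2 * (max_nr_ports + 1), supporting the count in the reply above.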




Re: [PATCH v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-02 Thread Tao Xu

Hi Michael,

Could this patch series be queued?
Thank you very much!

Tao

On 11/29/2019 3:56 PM, Xu, Tao3 wrote:

This series of patches will build Heterogeneous Memory Attribute Table (HMAT)
according to the command line. The ACPI HMAT describes the memory attributes,
such as memory side cache attributes and bandwidth and latency details,
related to the Memory Proximity Domain.
The software is expected to use HMAT information as hint for optimization.

In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
the platform's HMAT tables.

The V19 patches link:
https://patchwork.kernel.org/cover/11265525/

Changelog:
v20:
 - Use g_assert_true and g_assert_false to replace g_assert
   (Thomas and Markus)
 - Rename assoc as associativity, update the QAPI description (Markus)
 - Disable cache level 0 in hmat-cache option (Igor)
 - Keep base and bitmap unchanged when latency or bandwidth
   out of range
 - Fix the broken CI case when user input latency or bandwidth
   less than required.
v19:
 - Add description about the machine property 'hmat' in commit
   message (Markus)
 - Update the QAPI comments
 - Add a check for no memory side cache
 - Add some fail cases for hmat-cache when level=0
v18:
 - Defer patches 01/14~06/14 of V17, use qapi type uint64 and
   only nanosecond for latency (Markus)
 - Rewrite the lines over 80 characters(Igor)
v17:
 - Add check when user input latency or bandwidth 0, the
   lb_info_provided should also be 0. Because in ACPI 6.3 5.2.27.4,
   0 means the corresponding latency or bandwidth information is
   not provided.
 - Fix the infinite loop when node->latency is 0.
 - Use NumaHmatCacheOptions to replace HMAT_Cache_Info (Igor)
 - Add check for unordered cache level input (Igor)
 - Add some fail test cases (Igor)
v16:
 - Add and use qemu_strtold_finite to parse size, support full
   64bit precision, modify related test cases (Eduardo and Markus)
 - Simplify struct HMAT_LB_Info and related code, unify latency
   and bandwidth (Igor)
 - Add cross check with hmat_lb data (Igor)
 - Fields in Cache Attributes are promoted to uint32_t before
   shifting (Igor)
 - Add case for QMP build HMAT (Igor)
v15:
 - Add a new patch to refactor do_strtosz() (Eduardo)
 - Make tests without breaking CI (Michael)
v14:
 - Reuse the codes of do_strtosz to build qemu_strtotime_ns
   (Eduardo)
 - Squash patch v13 01/12 and 02/12 together (Daniel and Eduardo)
 - Drop time unit picosecond (Eric)
 - Use qemu ctz64 and clz64 instead of builtin function
v13:
 - Modify some text description
 - Drop "initiator_valid" field in struct NodeInfo
 - Reuse Garray to store the raw bandwidth and bandwidth data
 - Calculate common base unit using range bitmap
 - Add a patch to calculate hmat latency and bandwidth entry list
 - Drop the total_levels option and use readable cache size
 - Remove the unnecessary head file
 - Use decimal notation with appropriate suffix for cache size

Liu Jingqi (5):
   numa: Extend CLI to provide memory latency and bandwidth information
   numa: Extend CLI to provide memory side cache information
   hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
   hmat acpi: Build System Locality Latency and Bandwidth Information
 Structure(s)
   hmat acpi: Build Memory Side Cache Information Structure(s)

Tao Xu (3):
   numa: Extend CLI to provide initiator information for numa nodes
   tests/numa: Add case for QMP build HMAT
   tests/bios-tables-test: add test cases for ACPI HMAT

  hw/acpi/Kconfig   |   7 +-
  hw/acpi/Makefile.objs |   1 +
  hw/acpi/hmat.c| 268 +++
  hw/acpi/hmat.h|  42 
  hw/core/machine.c |  64 ++
  hw/core/numa.c| 297 ++
  hw/i386/acpi-build.c  |   5 +
  include/sysemu/numa.h |  63 ++
  qapi/machine.json | 180 +++-
  qemu-options.hx   |  95 +++-
  tests/bios-tables-test-allowed-diff.h |   8 +
  tests/bios-tables-test.c  |  44 
  tests/data/acpi/pc/APIC.acpihmat  |   0
  tests/data/acpi/pc/DSDT.acpihmat  |   0
  tests/data/acpi/pc/HMAT.acpihmat  |   0
  tests/data/acpi/pc/SRAT.acpihmat  |   0
  tests/data/acpi/q35/APIC.acpihmat |   0
  tests/data/acpi/q35/DSDT.acpihmat |   0
  tests/data/acpi/q35/HMAT.acpihmat |   0
  tests/data/acpi/q35/SRAT.acpihmat |   0
  tests/numa-test.c | 213 ++
  21 files changed, 1276 insertions(+), 11 deletions(-)
  create mode 100644 hw/acpi/hmat.c
  create mode 100644 hw/acpi/hmat.h
  create mode 100644 tests/data/acpi/pc/APIC.acpihmat
  create mode 100644 

[Bug 1854878] [NEW] Physical USB thumbdrive treated as read-only

2019-12-02 Thread Ben321
Public bug reported:

So I have installed FreeDOS on my USB thumbdrive, by using Rufus.
Everything goes as expected so far. That's good.

When I run QEMU with this command line:
qemu-system-x86_64.exe -drive file=\\.\PhysicalDrive1

it of course is read-only, just like the resulting console message says:
WARNING: Image format was not specified for '\\.\PhysicalDrive1' and probing 
guessed raw.
 Automatically detecting the format is dangerous for raw images, write 
operations on block 0 will be restricted.
 Specify the 'raw' format explicitly to remove the restrictions.


So what I then did, was I ran QEMU with this command line:
qemu-system-x86_64.exe -drive file=\\.\PhysicalDrive1,format=raw

As expected, the above mentioned console message no longer appears.
However, beyond that, QEMU doesn't behave as it should regarding read-only 
status. When I try any operation that involves writing to the drive, it becomes 
quite clear that the drive is still read-only. Any writing operations to the 
drive result in FreeDOS giving me the error message:
Error writing to drive C: DOS area: sector not found.

The above situation is clearly a bug. QEMU should not be treating it as
read-only once I specify format=raw.

Note that drive C is how the guest OS refers to the USB thumbdrive (it's
drive E in my host OS, and drive C in my host OS is the actual system
drive).

And yes, it is a QEMU bug. It's not a FreeDOS bug I tested it with this command 
line, so that all changes would be written to a temporary snapshot file:
qemu-system-x86_64.exe -drive file=\\.\PhysicalDrive1,format=raw,snapshot
That last drive option "snapshot" tells QEMU to create a temporary snapshot 
file, and to write all changes to that. When I do that, all write operations 
are successful. So it seems that there is a bug in QEMU where it keeps 
read-only mode in place for a physical drive, even when format=raw is 
specified. Please fix this bug. Thanks in advance.

Here's my current setup.
Host OS: Windows 10 (64bit)
Guest OS: FreeDOS
QEMU version: 4.1.0

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1854878

Title:
  Physical USB thumbdrive treated as read-only

Status in QEMU:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1854878/+subscriptions



Re: [PATCHv3] exynos4210_gic: Suppress gcc9 format-truncation warnings

2019-12-02 Thread David Gibson
On Mon, Dec 02, 2019 at 05:44:11PM +, Peter Maydell wrote:
> On Mon, 2 Dec 2019 at 16:08, Richard Henderson
>  wrote:
> >
> > On 12/1/19 6:08 AM, David Gibson wrote:
> > >
> > > -for (i = 0; i < s->num_cpu; i++) {
> > > +/*
> > > + * This clues in gcc that our on-stack buffers do, in fact have
> > > + * enough room for the cpu numbers.  gcc 9.2.1 on 32-bit x86
> > > + * doesn't figure this out otherwise, and gives spurious warnings.
> > > + */
> > > +assert(n <= EXYNOS4210_NCPUS);
> > > +for (i = 0; i < n; i++) {
> > > +
> > >  /* Map CPU interface per SMP Core */
> >
> > Watch out for the extra line added at the start of the block.  Otherwise,
> >
> > Reviewed-by: Richard Henderson 
> 
> I thought about putting this in rc4 but eventually decided
> against it. Queued for 5.0 (with the stray extra blank line
> removed).

Great!

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[PATCH v3 1/5] hvf: non-RAM, non-ROMD memory ranges are now correctly mapped in

2019-12-02 Thread Cameron Esfahani via
If an area is non-RAM and non-ROMD, then remove mappings so accesses
will trap and can be emulated.  Change hvf_find_overlap_slot() to take
a size instead of an end address: it wouldn't return a slot because
callers would pass the same address for start and end.  Don't always
map area as read/write/execute, respect area flags.

Signed-off-by: Cameron Esfahani 
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/hvf.c | 50 ++-
 1 file changed, 35 insertions(+), 15 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 231732aaf7..0b50cfcbc6 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -107,14 +107,14 @@ static void assert_hvf_ok(hv_return_t ret)
 }
 
 /* Memory slots */
-hvf_slot *hvf_find_overlap_slot(uint64_t start, uint64_t end)
+hvf_slot *hvf_find_overlap_slot(uint64_t start, uint64_t size)
 {
 hvf_slot *slot;
 int x;
 for (x = 0; x < hvf_state->num_slots; ++x) {
slot = &hvf_state->slots[x];
 if (slot->size && start < (slot->start + slot->size) &&
-end > slot->start) {
+(start + size) > slot->start) {
 return slot;
 }
 }
@@ -129,12 +129,10 @@ struct mac_slot {
 };
 
 struct mac_slot mac_slots[32];
-#define ALIGN(x, y)  (((x) + (y) - 1) & ~((y) - 1))
 
-static int do_hvf_set_memory(hvf_slot *slot)
+static int do_hvf_set_memory(hvf_slot *slot, hv_memory_flags_t flags)
 {
 struct mac_slot *macslot;
-hv_memory_flags_t flags;
 hv_return_t ret;
 
macslot = &mac_slots[slot->slot_id];
@@ -151,8 +149,6 @@ static int do_hvf_set_memory(hvf_slot *slot)
 return 0;
 }
 
-flags = HV_MEMORY_READ | HV_MEMORY_WRITE | HV_MEMORY_EXEC;
-
 macslot->present = 1;
 macslot->gpa_start = slot->start;
 macslot->size = slot->size;
@@ -165,14 +161,24 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool add)
 {
 hvf_slot *mem;
 MemoryRegion *area = section->mr;
+bool writeable = !area->readonly && !area->rom_device;
+hv_memory_flags_t flags;
 
 if (!memory_region_is_ram(area)) {
-return;
+if (writeable) {
+return;
+} else if (!memory_region_is_romd(area)) {
+/*
+ * If the memory device is not in romd_mode, then we actually want
+ * to remove the hvf memory slot so all accesses will trap.
+ */
+ add = false;
+}
 }
 
 mem = hvf_find_overlap_slot(
 section->offset_within_address_space,
-section->offset_within_address_space + int128_get64(section->size));
+int128_get64(section->size));
 
 if (mem && add) {
 if (mem->size == int128_get64(section->size) &&
@@ -186,7 +192,7 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool add)
 /* Region needs to be reset. set the size to 0 and remap it. */
 if (mem) {
 mem->size = 0;
-if (do_hvf_set_memory(mem)) {
+if (do_hvf_set_memory(mem, 0)) {
 error_report("Failed to reset overlapping slot");
 abort();
 }
@@ -196,6 +202,13 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool add)
 return;
 }
 
+if (area->readonly ||
+(!memory_region_is_ram(area) && memory_region_is_romd(area))) {
+flags = HV_MEMORY_READ | HV_MEMORY_EXEC;
+} else {
+flags = HV_MEMORY_READ | HV_MEMORY_WRITE | HV_MEMORY_EXEC;
+}
+
 /* Now make a new slot. */
 int x;
 
@@ -216,7 +229,7 @@ void hvf_set_phys_mem(MemoryRegionSection *section, bool add)
 mem->start = section->offset_within_address_space;
 mem->region = area;
 
-if (do_hvf_set_memory(mem)) {
+if (do_hvf_set_memory(mem, flags)) {
 error_report("Error registering new memory slot");
 abort();
 }
@@ -345,7 +358,14 @@ static bool ept_emulation_fault(hvf_slot *slot, uint64_t gpa, uint64_t ept_qual)
 return false;
 }
 
-return !slot;
+if (!slot) {
+return true;
+}
+if (!memory_region_is_ram(slot->region) &&
+!(read && memory_region_is_romd(slot->region))) {
+return true;
+}
+return false;
 }
 
 static void hvf_set_dirty_tracking(MemoryRegionSection *section, bool on)
@@ -354,7 +374,7 @@ static void hvf_set_dirty_tracking(MemoryRegionSection *section, bool on)
 
 slot = hvf_find_overlap_slot(
 section->offset_within_address_space,
-section->offset_within_address_space + int128_get64(section->size));
+int128_get64(section->size));
 
 /* protect region against writes; begin tracking it */
 if (on) {
@@ -720,7 +740,7 @@ int hvf_vcpu_exec(CPUState *cpu)
 ret = EXCP_INTERRUPT;
 break;
 }
-/* Need to check if MMIO or unmmaped fault */
+/* Need to check if MMIO or unmapped fault */
 case EXIT_REASON_EPT_FAULT:
 {
 hvf_slot *slot;
@@ -731,7 +751,7 @@ 

[PATCH v3 2/5] hvf: remove TSC synchronization code because it isn't fully complete

2019-12-02 Thread Cameron Esfahani via
The existing code in QEMU's HVF support to attempt to synchronize TSC
across multiple cores is not sufficient.  TSC value on other cores
can go backwards.  Until implementation is fixed, remove calls to
hv_vm_sync_tsc().  Pass through TSC to guest OS.

Signed-off-by: Cameron Esfahani 
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/hvf.c | 3 +--
 target/i386/hvf/x86_emu.c | 3 ---
 target/i386/hvf/x86hvf.c  | 4 
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 0b50cfcbc6..90fd50acfc 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -518,7 +518,6 @@ void hvf_reset_vcpu(CPUState *cpu) {
 wreg(cpu->hvf_fd, HV_X86_R8 + i, 0x0);
 }
 
-hv_vm_sync_tsc(0);
 hv_vcpu_invalidate_tlb(cpu->hvf_fd);
 hv_vcpu_flush(cpu->hvf_fd);
 }
@@ -612,7 +611,7 @@ int hvf_init_vcpu(CPUState *cpu)
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_GSBASE, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_KERNELGSBASE, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_TSC_AUX, 1);
-/*hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_TSC, 1);*/
+hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_TSC, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_SYSENTER_CS, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_SYSENTER_EIP, 1);
 hv_vcpu_enable_native_msr(cpu->hvf_fd, MSR_IA32_SYSENTER_ESP, 1);
diff --git a/target/i386/hvf/x86_emu.c b/target/i386/hvf/x86_emu.c
index 1b04bd7e94..3df767209d 100644
--- a/target/i386/hvf/x86_emu.c
+++ b/target/i386/hvf/x86_emu.c
@@ -772,9 +772,6 @@ void simulate_wrmsr(struct CPUState *cpu)
 
 switch (msr) {
 case MSR_IA32_TSC:
-/* if (!osx_is_sierra())
- wvmcs(cpu->hvf_fd, VMCS_TSC_OFFSET, data - rdtscp());
-hv_vm_sync_tsc(data);*/
 break;
 case MSR_IA32_APICBASE:
 cpu_set_apic_base(X86_CPU(cpu)->apic_state, data);
diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
index e0ea02d631..1485b95776 100644
--- a/target/i386/hvf/x86hvf.c
+++ b/target/i386/hvf/x86hvf.c
@@ -152,10 +152,6 @@ void hvf_put_msrs(CPUState *cpu_state)
 
 hv_vcpu_write_msr(cpu_state->hvf_fd, MSR_GSBASE, env->segs[R_GS].base);
 hv_vcpu_write_msr(cpu_state->hvf_fd, MSR_FSBASE, env->segs[R_FS].base);
-
-/* if (!osx_is_sierra())
- wvmcs(cpu_state->hvf_fd, VMCS_TSC_OFFSET, env->tsc - rdtscp());*/
-hv_vm_sync_tsc(env->tsc);
 }
 
 
-- 
2.24.0




[PATCH v3 4/5] hvf: more accurately match SDM when setting CR0 and PDPTE registers

2019-12-02 Thread Cameron Esfahani via
More accurately match SDM when setting CR0 and PDPTE registers.

Clear PDPTE registers when resetting vcpus.

Signed-off-by: Cameron Esfahani 
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/hvf.c |  8 
 target/i386/hvf/vmx.h | 18 ++
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 90fd50acfc..784e67d77e 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -441,12 +441,20 @@ static MemoryListener hvf_memory_listener = {
 };
 
 void hvf_reset_vcpu(CPUState *cpu) {
+uint64_t pdpte[4] = {0, 0, 0, 0};
+int i;
 
 /* TODO: this shouldn't be needed; there is already a call to
  * cpu_synchronize_all_post_reset in vl.c
  */
 wvmcs(cpu->hvf_fd, VMCS_ENTRY_CTLS, 0);
 wvmcs(cpu->hvf_fd, VMCS_GUEST_IA32_EFER, 0);
+
+/* Initialize PDPTE */
+for (i = 0; i < 4; i++) {
+wvmcs(cpu->hvf_fd, VMCS_GUEST_PDPTE0 + i * 2, pdpte[i]);
+}
+
 macvm_set_cr0(cpu->hvf_fd, 0x6010);
 
 wvmcs(cpu->hvf_fd, VMCS_CR4_MASK, CR4_VMXE_MASK);
diff --git a/target/i386/hvf/vmx.h b/target/i386/hvf/vmx.h
index 5dc52ecad6..eb8894cd58 100644
--- a/target/i386/hvf/vmx.h
+++ b/target/i386/hvf/vmx.h
@@ -121,6 +121,7 @@ static inline void macvm_set_cr0(hv_vcpuid_t vcpu, uint64_t cr0)
 uint64_t pdpte[4] = {0, 0, 0, 0};
 uint64_t efer = rvmcs(vcpu, VMCS_GUEST_IA32_EFER);
 uint64_t old_cr0 = rvmcs(vcpu, VMCS_GUEST_CR0);
+uint64_t mask = CR0_PG | CR0_CD | CR0_NW | CR0_NE | CR0_ET;
 
 if ((cr0 & CR0_PG) && (rvmcs(vcpu, VMCS_GUEST_CR4) & CR4_PAE) &&
 !(efer & MSR_EFER_LME)) {
@@ -128,18 +129,15 @@ static inline void macvm_set_cr0(hv_vcpuid_t vcpu, uint64_t cr0)
  rvmcs(vcpu, VMCS_GUEST_CR3) & ~0x1f,
  MEMTXATTRS_UNSPECIFIED,
  (uint8_t *)pdpte, 32, 0);
+/* Only set PDPTE when appropriate. */
+for (i = 0; i < 4; i++) {
+wvmcs(vcpu, VMCS_GUEST_PDPTE0 + i * 2, pdpte[i]);
+}
 }
 
-for (i = 0; i < 4; i++) {
-wvmcs(vcpu, VMCS_GUEST_PDPTE0 + i * 2, pdpte[i]);
-}
-
-wvmcs(vcpu, VMCS_CR0_MASK, CR0_CD | CR0_NE | CR0_PG);
+wvmcs(vcpu, VMCS_CR0_MASK, mask);
 wvmcs(vcpu, VMCS_CR0_SHADOW, cr0);
 
-cr0 &= ~CR0_CD;
-wvmcs(vcpu, VMCS_GUEST_CR0, cr0 | CR0_NE | CR0_ET);
-
 if (efer & MSR_EFER_LME) {
 if (!(old_cr0 & CR0_PG) && (cr0 & CR0_PG)) {
 enter_long_mode(vcpu, cr0, efer);
@@ -149,6 +147,10 @@ static inline void macvm_set_cr0(hv_vcpuid_t vcpu, uint64_t cr0)
 }
 }
 
+/* Filter new CR0 after we are finished examining it above. */
+cr0 = (cr0 & ~(mask & ~CR0_PG));
+wvmcs(vcpu, VMCS_GUEST_CR0, cr0 | CR0_NE | CR0_ET);
+
 hv_vcpu_invalidate_tlb(vcpu);
 hv_vcpu_flush(vcpu);
 }
-- 
2.24.0




[PATCH v3 0/5] hvf: stability fixes for HVF

2019-12-02 Thread Cameron Esfahani via
The following patches fix stability issues with running QEMU on Apple
Hypervisor Framework (HVF):
- non-RAM, non-ROMD areas need to trap so accesses can be correctly
  emulated.
- Current TSC synchronization implementation is insufficient: when
  running with more than 1 core, TSC values can go backwards.  Until
  a correct implementation can be written, remove calls to
  hv_vm_sync_tsc().  Pass through TSC to guest OS.
- Fix REX emulation in relation to legacy prefixes.
- More correctly match SDM when setting CR0 and PDPTE registers.
- Previous implementation in hvf_inject_interrupts() would always inject
  VMCS_INTR_T_SWINTR even when VMCS_INTR_T_HWINTR was required.  Now
  correctly determine when VMCS_INTR_T_HWINTR is appropriate versus
  VMCS_INTR_T_SWINTR.  Under heavy loads, interrupts got misrouted.

Changes in v3:
- Change previous code which saved interrupt/exception type in
  hvf_store_events() to inject later in hvf_inject_interrupts().
  Now, hvf_inject_interrupts() will correctly determine when it's appropriate
  to inject VMCS_INTR_T_HWINTR versus VMCS_INTR_T_SWINTR.  From feedback by
  Paolo Bonzini to make code more similar to KVM model.

Changes in v2:
- Fix code style errors.

Cameron Esfahani (5):
  hvf: non-RAM, non-ROMD memory ranges are now correctly mapped in
  hvf: remove TSC synchronization code because it isn't fully complete
  hvf: correctly handle REX prefix in relation to legacy prefixes
  hvf: more accurately match SDM when setting CR0 and PDPTE registers
  hvf: correctly inject VMCS_INTR_T_HWINTR versus VMCS_INTR_T_SWINTR.

 target/i386/hvf/hvf.c| 65 ++--
 target/i386/hvf/vmx.h| 18 +-
 target/i386/hvf/x86_decode.c | 64 +++
 target/i386/hvf/x86_decode.h | 20 +--
 target/i386/hvf/x86_emu.c|  3 --
 target/i386/hvf/x86hvf.c | 18 +-
 6 files changed, 112 insertions(+), 76 deletions(-)

-- 
2.24.0




[PATCH v3 5/5] hvf: correctly inject VMCS_INTR_T_HWINTR versus VMCS_INTR_T_SWINTR.

2019-12-02 Thread Cameron Esfahani via
Previous implementation in hvf_inject_interrupts() would always inject
VMCS_INTR_T_SWINTR even when VMCS_INTR_T_HWINTR was required.  Now
correctly determine when VMCS_INTR_T_HWINTR is appropriate versus
VMCS_INTR_T_SWINTR.

Make sure to clear ins_len and has_error_code when ins_len isn't
valid and error_code isn't set.

Signed-off-by: Cameron Esfahani 
---
 target/i386/hvf/hvf.c|  4 +++-
 target/i386/hvf/x86hvf.c | 14 +-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 784e67d77e..d72543dc31 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -637,6 +637,8 @@ static void hvf_store_events(CPUState *cpu, uint32_t ins_len, uint64_t idtvec_in
 env->exception_injected = 0;
 env->interrupt_injected = -1;
 env->nmi_injected = false;
+env->ins_len = 0;
+env->has_error_code = false;
 if (idtvec_info & VMCS_IDT_VEC_VALID) {
 switch (idtvec_info & VMCS_IDT_VEC_TYPE) {
 case VMCS_IDT_VEC_HWINTR:
@@ -659,7 +661,7 @@ static void hvf_store_events(CPUState *cpu, uint32_t ins_len, uint64_t idtvec_in
 (idtvec_info & VMCS_IDT_VEC_TYPE) == VMCS_IDT_VEC_SWINTR) {
 env->ins_len = ins_len;
 }
-if (idtvec_info & VMCS_INTR_DEL_ERRCODE) {
+if (idtvec_info & VMCS_IDT_VEC_ERRCODE_VALID) {
 env->has_error_code = true;
 env->error_code = rvmcs(cpu->hvf_fd, VMCS_IDT_VECTORING_ERROR);
 }
diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
index 1485b95776..edefe5319a 100644
--- a/target/i386/hvf/x86hvf.c
+++ b/target/i386/hvf/x86hvf.c
@@ -345,8 +345,6 @@ void vmx_clear_int_window_exiting(CPUState *cpu)
  ~VMCS_PRI_PROC_BASED_CTLS_INT_WINDOW_EXITING);
 }
 
-#define NMI_VEC 2
-
 bool hvf_inject_interrupts(CPUState *cpu_state)
 {
 X86CPU *x86cpu = X86_CPU(cpu_state);
@@ -357,7 +355,11 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
 bool have_event = true;
 if (env->interrupt_injected != -1) {
 vector = env->interrupt_injected;
-intr_type = VMCS_INTR_T_SWINTR;
+if (env->ins_len) {
+intr_type = VMCS_INTR_T_SWINTR;
+} else {
+intr_type = VMCS_INTR_T_HWINTR;
+}
 } else if (env->exception_nr != -1) {
 vector = env->exception_nr;
 if (vector == EXCP03_INT3 || vector == EXCP04_INTO) {
@@ -366,7 +368,7 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
 intr_type = VMCS_INTR_T_HWEXCEPTION;
 }
 } else if (env->nmi_injected) {
-vector = NMI_VEC;
+vector = EXCP02_NMI;
 intr_type = VMCS_INTR_T_NMI;
 } else {
 have_event = false;
@@ -390,6 +392,8 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
 if (env->has_error_code) {
 wvmcs(cpu_state->hvf_fd, VMCS_ENTRY_EXCEPTION_ERROR,
   env->error_code);
+/* Indicate that VMCS_ENTRY_EXCEPTION_ERROR is valid */
+info |= VMCS_INTR_DEL_ERRCODE;
 }
 /*printf("reinject  %lx err %d\n", info, err);*/
 wvmcs(cpu_state->hvf_fd, VMCS_ENTRY_INTR_INFO, info);
@@ -399,7 +403,7 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
 if (cpu_state->interrupt_request & CPU_INTERRUPT_NMI) {
 if (!(env->hflags2 & HF2_NMI_MASK) && !(info & VMCS_INTR_VALID)) {
 cpu_state->interrupt_request &= ~CPU_INTERRUPT_NMI;
-info = VMCS_INTR_VALID | VMCS_INTR_T_NMI | NMI_VEC;
+info = VMCS_INTR_VALID | VMCS_INTR_T_NMI | EXCP02_NMI;
 wvmcs(cpu_state->hvf_fd, VMCS_ENTRY_INTR_INFO, info);
 } else {
 vmx_set_nmi_window_exiting(cpu_state);
-- 
2.24.0




[PATCH v3 3/5] hvf: correctly handle REX prefix in relation to legacy prefixes

2019-12-02 Thread Cameron Esfahani via
In real x86 processors, the REX prefix must come after legacy prefixes.
REX before legacy is ignored.  Update the HVF emulation code to properly
handle this.  Fix some spelling errors in constants.  Fix some decoder
table initialization issues found by Coverity.

Signed-off-by: Cameron Esfahani 
Signed-off-by: Paolo Bonzini 
---
 target/i386/hvf/x86_decode.c | 64 
 target/i386/hvf/x86_decode.h | 20 +--
 2 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/target/i386/hvf/x86_decode.c b/target/i386/hvf/x86_decode.c
index 822fa1866e..77c346605f 100644
--- a/target/i386/hvf/x86_decode.c
+++ b/target/i386/hvf/x86_decode.c
@@ -122,7 +122,8 @@ static void decode_rax(CPUX86State *env, struct x86_decode *decode,
 {
 op->type = X86_VAR_REG;
 op->reg = R_EAX;
-op->ptr = get_reg_ref(env, op->reg, decode->rex.rex, 0,
+/* Since reg is always AX, REX prefix has no impact. */
+op->ptr = get_reg_ref(env, op->reg, false, 0,
   decode->operand_size);
 }
 
@@ -1687,40 +1688,37 @@ calc_addr:
 }
 }
 
-target_ulong get_reg_ref(CPUX86State *env, int reg, int rex, int is_extended,
- int size)
+target_ulong get_reg_ref(CPUX86State *env, int reg, int rex_present,
+ int is_extended, int size)
 {
 target_ulong ptr = 0;
-int which = 0;
 
 if (is_extended) {
 reg |= R_R8;
 }
 
-
 switch (size) {
 case 1:
-if (is_extended || reg < 4 || rex) {
-which = 1;
+if (is_extended || reg < 4 || rex_present) {
ptr = (target_ulong)&RL(env, reg);
 } else {
-which = 2;
ptr = (target_ulong)&RH(env, reg - 4);
 }
 break;
 default:
-which = 3;
ptr = (target_ulong)&RRX(env, reg);
 break;
 }
 return ptr;
 }
 
-target_ulong get_reg_val(CPUX86State *env, int reg, int rex, int is_extended,
- int size)
+target_ulong get_reg_val(CPUX86State *env, int reg, int rex_present,
+ int is_extended, int size)
 {
 target_ulong val = 0;
-memcpy(&val, (void *)get_reg_ref(env, reg, rex, is_extended, size), size);
+memcpy(&val,
+   (void *)get_reg_ref(env, reg, rex_present, is_extended, size),
+   size);
 return val;
 }
 
@@ -1853,28 +1851,38 @@ void calc_modrm_operand(CPUX86State *env, struct x86_decode *decode,
 static void decode_prefix(CPUX86State *env, struct x86_decode *decode)
 {
 while (1) {
+/*
+ * REX prefix must come after legacy prefixes.
+ * REX before legacy is ignored.
+ * Clear rex to simulate this.
+ */
 uint8_t byte = decode_byte(env, decode);
 switch (byte) {
 case PREFIX_LOCK:
 decode->lock = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_REPN:
 case PREFIX_REP:
 decode->rep = byte;
+decode->rex.rex = 0;
 break;
-case PREFIX_CS_SEG_OVEERIDE:
-case PREFIX_SS_SEG_OVEERIDE:
-case PREFIX_DS_SEG_OVEERIDE:
-case PREFIX_ES_SEG_OVEERIDE:
-case PREFIX_FS_SEG_OVEERIDE:
-case PREFIX_GS_SEG_OVEERIDE:
+case PREFIX_CS_SEG_OVERRIDE:
+case PREFIX_SS_SEG_OVERRIDE:
+case PREFIX_DS_SEG_OVERRIDE:
+case PREFIX_ES_SEG_OVERRIDE:
+case PREFIX_FS_SEG_OVERRIDE:
+case PREFIX_GS_SEG_OVERRIDE:
 decode->segment_override = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_OP_SIZE_OVERRIDE:
 decode->op_size_override = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_ADDR_SIZE_OVERRIDE:
 decode->addr_size_override = byte;
+decode->rex.rex = 0;
 break;
 case PREFIX_REX ... (PREFIX_REX + 0xf):
 if (x86_is_long_mode(env_cpu(env))) {
@@ -2111,14 +2119,14 @@ void init_decoder()
 {
 int i;
 
-for (i = 0; i < ARRAY_SIZE(_decode_tbl2); i++) {
-memcpy(_decode_tbl1, &invl_inst, sizeof(invl_inst));
+for (i = 0; i < ARRAY_SIZE(_decode_tbl1); i++) {
+memcpy(&_decode_tbl1[i], &invl_inst, sizeof(invl_inst));
 }
 for (i = 0; i < ARRAY_SIZE(_decode_tbl2); i++) {
-memcpy(_decode_tbl2, &invl_inst, sizeof(invl_inst));
+memcpy(&_decode_tbl2[i], &invl_inst, sizeof(invl_inst));
 }
 for (i = 0; i < ARRAY_SIZE(_decode_tbl3); i++) {
-memcpy(_decode_tbl3, &invl_inst, sizeof(invl_inst_x87));
+memcpy(&_decode_tbl3[i], &invl_inst_x87, sizeof(invl_inst_x87));
 
 }
 for (i = 0; i < ARRAY_SIZE(_1op_inst); i++) {
@@ -2167,22 +2175,22 @@ target_ulong decode_linear_addr(CPUX86State *env, struct x86_decode *decode,
target_ulong addr, X86Seg seg)
 {
 switch (decode->segment_override) {
-case PREFIX_CS_SEG_OVEERIDE:
+case PREFIX_CS_SEG_OVERRIDE:
 seg = 

Re: [PATCH 2/4] target/arm: Abstract the generic timer frequency

2019-12-02 Thread Andrew Jeffery



On Tue, 3 Dec 2019, at 04:42, Peter Maydell wrote:
> On Thu, 28 Nov 2019 at 05:44, Andrew Jeffery  wrote:
> >
> > Prepare for SoCs such as the ASPEED AST2600 whose firmware configures
> > CNTFRQ to values significantly larger than the static 62.5MHz value
> > currently derived from GTIMER_SCALE. As the OS potentially derives its
> > timer periods from the CNTFRQ value the lack of support for running
> > QEMUTimers at the appropriate rate leads to sticky behaviour in the
> > guest.
> >
> > Substitute the GTIMER_SCALE constant with use of a helper to derive the
> > period from gt_cntfrq stored in struct ARMCPU. Initially set gt_cntfrq
> > to the frequency associated with GTIMER_SCALE so current behaviour is
> > maintained.
> >
> > Signed-off-by: Andrew Jeffery 
> 
> > +static inline unsigned int gt_cntfrq_period_ns(ARMCPU *cpu)
> > +{
> > +/* XXX: Could include qemu/timer.h to get NANOSECONDS_PER_SECOND? */
> > +const unsigned int ns_per_s = 1000 * 1000 * 1000;
> > +return ns_per_s > cpu->gt_cntfrq ? ns_per_s / cpu->gt_cntfrq : 1;
> > +}
> 
> This function is named gt_cntfrq_period_ns()...
> 
> >  static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
> >  {
> > +ARMCPU *cpu = env_archcpu(env);
> > +
> >  /* Currently we have no support for QEMUTimer in linux-user so we
> >   * can't call gt_get_countervalue(env), instead we directly
> >   * call the lower level functions.
> >   */
> > -return cpu_get_clock() / GTIMER_SCALE;
> > +return cpu_get_clock() / gt_cntfrq_period(cpu);
> >  }
> 
> ...but here we call gt_cntfrq_period(), which doesn't exist,
> and indeed at least one of the patchew build systems reported
> it as a compile failure.
> 

Ah yep, I failed to test user mode after renaming the function and missed this.

I haven't seen an alert from patchew though, I wonder where that got to?

Andrew



[PATCH v37 00/17] QEMU AVR 8 bit cores

2019-12-02 Thread Aleksandar Markovic
On Monday, December 2, 2019, Aleksandar Markovic <
aleksandar.m.m...@gmail.com> wrote:

>
>
> On Monday, December 2, 2019, Michael Rolnik  wrote:
>
>> how can I get this elf flags from within QEMU?
>>
>>>
>>>
> In one of files from your "machine" patch, you have this snippet:
>
> +bytes_loaded = load_elf(
> +filename, NULL, NULL, NULL, NULL, NULL, NULL, 0, EM_NONE, 0,
> 0);
>
> With this line you kind of "blindly" load whatever you find in the file
> "filename". I think you need to modify load_elf() to fetch the information
> on what core the elf in question is compiled for. Additionally, you most
> likely have to check if the elf file is compiled for AVR at all.
>
> I don't know enough about the AVR-specifics of the ELF format, but I know
> that we in MIPS successfully read some MIPS-specific things we need to know.
> Do some research on the ELF format header content for AVR.
>
> This is really missing in your series, seriously.
>
> Please keep in mind that I don't have right now at hand any dev system, so
> all I said here is off of my head.
>
> You have to do some code digging.
>
>
First, you need to update

https://github.com/qemu/qemu/blob/master/include/elf.h

with bits and pieces for AVR.

In binutils file:

https://github.com/bminor/binutils-gdb/blob/master/include/elf/common.h

you will spot the line:

#define EM_AVR 83 /* Atmel AVR 8-bit microcontroller */

that is the value of the e_machine field for AVR, which you need to insert in
qemu's include/elf.h at about line 162.

Then, in another binutils file:

https://github.com/bminor/binutils-gdb/blob/master/include/elf/avr.h

you find the lines:

#define E_AVR_MACH_AVR1 1
#define E_AVR_MACH_AVR2 2
#define E_AVR_MACH_AVR25 25
#define E_AVR_MACH_AVR3 3
#define E_AVR_MACH_AVR31 31
#define E_AVR_MACH_AVR35 35
#define E_AVR_MACH_AVR4 4
#define E_AVR_MACH_AVR5 5
#define E_AVR_MACH_AVR51 51
#define E_AVR_MACH_AVR6 6
#define E_AVR_MACH_AVRTINY 100
#define E_AVR_MACH_XMEGA1 101
#define E_AVR_MACH_XMEGA2 102
#define E_AVR_MACH_XMEGA3 103
#define E_AVR_MACH_XMEGA4 104
#define E_AVR_MACH_XMEGA5 105
#define E_AVR_MACH_XMEGA6 106
#define E_AVR_MACH_XMEGA7 107

These you also need to insert in qemu's include/elf.h, probably at the end
of the file or elsewhere.

Perhaps you will need to insert something more into that file; you'll see.

Then, you need to modify the file where load_elf() resides to add AVR
support; take a look at other architectures' support, and adjust it to
what you need.

I know it will be contrived at times, but, personally, I think similar ELF
support must be done for any upcoming platform. Only if there is some
insurmountable obstacle can that support be omitted.

I am on vacation next 10 days.

Yours,
Aleksandar

.

> Best regards, Aleksandar
>
>
>> On Mon, Dec 2, 2019 at 4:01 PM Aleksandar Markovic <
>> aleksandar.m.m...@gmail.com> wrote:
>>
>>>
>>>
>>> On Monday, December 2, 2019, Michael Rolnik  wrote:
>>>
 No, I don't.
 but I also can load and execute a binary file which does not have this
 information.

>
>
>>> OK. Let's think about that for a while. I currently think you have here
>>> an opportunity to add a really clean interface from the outset of AVR
>>> support in QEMU (that even some established platforms don't have in full),
>>> which is, trust me, very important for the future. And it's not that
>>> difficult to implement at all. But let's both think for a while.
>>>
>>> Best regards,
>>> Aleksandar
>>>
>>>
>>>
 On Mon, Dec 2, 2019 at 11:59 AM Aleksandar Markovic <
 aleksandar.m.m...@gmail.com> wrote:

>
>
> On Monday, December 2, 2019, Aleksandar Markovic <
> aleksandar.m.m...@gmail.com> wrote:
>
>>
>>
>> On Saturday, November 30, 2019, Michael Rolnik 
>> wrote:
>>
>>> There is *-cpu *option where you can specify what CPU you want, if
>>> this option is not specified avr6 (avr6-avr-cpu) is chosen.
>>>
>>> *./avr-softmmu/qemu-system-avr -cpu help*
>>> avr1-avr-cpu
>>> avr2-avr-cpu
>>> avr25-avr-cpu
>>> avr3-avr-cpu
>>> avr31-avr-cpu
>>> avr35-avr-cpu
>>> avr4-avr-cpu
>>> avr5-avr-cpu
>>> avr51-avr-cpu
>>> avr6-avr-cpu
>>> xmega2-avr-cpu
>>> xmega4-avr-cpu
>>> xmega5-avr-cpu
>>> xmega6-avr-cpu
>>> xmega7-avr-cpu
>>>
>>>
>> What happens if you specify a core via -cpu, and supply elf file
>> compiled for another core?
>>
>>
> It looks there is some related info written in ELF header. This is
> from a binutils header:
>
> (so it looks like you could detect the core from the elf file - do you do that
> detection right now?)
>
> #define E_AVR_MACH_AVR1 1
> #define E_AVR_MACH_AVR2 2
> #define E_AVR_MACH_AVR25   25
> #define E_AVR_MACH_AVR3 3
> #define E_AVR_MACH_AVR31   31
> #define E_AVR_MACH_AVR35   35
> #define E_AVR_MACH_AVR4 4
> #define E_AVR_MACH_AVR5 5
> #define E_AVR_MACH_AVR51   51
> #define 

Re: [RFC] QEMU Gating CI

2019-12-02 Thread Cleber Rosa
On Mon, Dec 02, 2019 at 11:36:35AM -0700, Warner Losh wrote:
> 
> Just make sure that any pipeline and mandatory CI steps don't slow things
> down too much... While the examples have talked about 1 or 2 pull requests
> getting done in parallel, and that's great, the problem is when you try to
> land 10 or 20 all at once, one of which causes the failure and you aren't sure
> which one it actually is... Make sure whatever you design has sane
> exception case handling to not cause too much collateral damage... I worked
> one place that would back everything out if a once-a-week CI test ran and
> had failures... That CI test-run took 2 days to run, so it wasn't practical
> to run it often, or for every commit. In the end, though, the powers that
> be implemented an automated bisection tool that made it marginally less
> sucky...
> 
> Warner

What I would personally like to see is the availability of enough
resources to give a ~2 hour max result turnaround, that is, the
complete pipeline finishes within those 2 hours.  Of course the exact
max time should be decided by consensus.

If someone is contributing a new job supposed to run on existing
hardware, its acceptance should be carefully considered.  If more
hardware is being added and the job is capable of running in parallel
with others, then it shouldn't be an issue (I don't think we'll hit
GitLab's scheduling limits anytime soon).

With regards to the "1 or 2 pull requests done in parallel", of course
there could be a queue of pending jobs, but given that the idea is for
these jobs to be run based on maintainers actions (say a Merge
Request), the volume should be much lower than if individual
contributors were triggering the same jobs on their patch series, and
not at all on every commit (as you describe with the ~2 days jobs).

Anyway, thanks for the feedback and please do not refrain from further
participation in this effort.  Your experience seems quite valuable.

Thanks,
- Cleber.




Re: bitmaps -- copying allocation status into dirty bitmaps

2019-12-02 Thread John Snow



On 11/4/19 6:27 AM, Max Reitz wrote:
> On 04.11.19 12:21, Max Reitz wrote:
>> On 01.11.19 16:42, John Snow wrote:
>>> Hi, in one of my infamously unreadable and long status emails, I
>>> mentioned possibly wanting to copy allocation data into bitmaps as a way
>>> to enable users to create (external) snapshots from outside of the
>>> libvirt/qemu context.
>>>
>>> (That is: to repair checkpoints in libvirt after a user extended the
>>> backing chain themselves, you want to restore bitmap information for
>>> that node. Conveniently, this information IS the allocation map, so we
>>> can do this.)
>>>
>>> It came up at KVM Forum that we probably do want this, because oVirt
>>> likes the idea of being able to manipulate these chains from outside of
>>> libvirt/qemu.
>>>
>>> Denis suggested that instead of a new command, we can create a special
>>> name -- maybe "#ALLOCATED" or something similar that can never be
>>> allocated as a user-defined bitmap name -- as a special source for the
>>> merge command.
>>>
>>> You'd issue a merge from "#ALLOCATED" to "myBitmap0" to copy the current
>>> allocation data into "myBitmap0", for instance.
>>
>> Sounds fun, but is there actually any use for this if the only purpose
>> is to work as a source for merge?
>>
>> I mean, it would be interesting if it worked exactly like a perma-RO
>> pseudo-bitmap that whenever you try to get data from it performs a
>> block-status call.  But as you say, that would probably be too slow, and
>> it would take a lot of code modifications, so I wonder if there is
>> actually any purpose for this.
>>
>>> Some thoughts:
>>>
>>> - The only commands where this pseudo-bitmap makes sense is merge.
>>> enable/disable/remove/clear/add don't make sense here.
>>>
>>> - This pseudo bitmap might make sense for backup, but it's not needed;
>>> you can just merge into an empty/enabled bitmap and then use that.
>>>
>>> - Creating an allocation bitmap on-the-fly is probably not possible
>>> directly in the merge command, because the disk status calls might take
>>> too long...
>>>
>>> Hm, actually, I'm not sure how to solve that one. Merge would need to
>>> become a job (or an async QMP command?) or we'd need to keep an
>>> allocation bitmap object around and in-sync. I don't really want to do
>>> either, so maybe I'm missing an obvious/better solution.
>>
>> All of what you wrote in this mail makes me think it would make much
>> more sense to just add a “block-dirty-bitmap-create-from” job with an
>> enum of targets.  (One of which would be “allocated-blocks”.)

Sounds good. (What are the other targets? Questions-for-later?)

> 
> I forgot to add that of course the advantage of a pseudo-bitmap would be
> that it’s always up to date, but as you said, it would be slow to query
> (and it might even yield, which isn’t what callers expect) and at least
> for block allocation, it seems unnecessary to me (because writes will
> keep the new bitmap created from allocated-blocks up-to-date).
> 
> Max
> 

Who knows what's happened in the month since I've been gone, but I think
I agree completely with your assessment.

In our meeting with Denis it seemed like it was the optimal thing to
make a pseudo-bitmap for merge so we didn't have to add a new command,
but I think it's clear that the async properties are going to prohibit
that nice solution and we will indeed need a job.

--js




[PATCH 05/10] arm: allwinner-h3: add System Control module

2019-12-02 Thread Niek Linnenbank
The Allwinner H3 System on Chip has a System Control
module that provides system-wide generic controls and
device information. This commit adds support for the
Allwinner H3 System Control module.

Signed-off-by: Niek Linnenbank 
---
 hw/arm/allwinner-h3.c |  11 ++
 hw/misc/Makefile.objs |   1 +
 hw/misc/allwinner-h3-syscon.c | 139 ++
 include/hw/arm/allwinner-h3.h |   2 +
 include/hw/misc/allwinner-h3-syscon.h |  43 
 5 files changed, 196 insertions(+)
 create mode 100644 hw/misc/allwinner-h3-syscon.c
 create mode 100644 include/hw/misc/allwinner-h3-syscon.h

diff --git a/hw/arm/allwinner-h3.c b/hw/arm/allwinner-h3.c
index afeb49c0ac..ebd8fde412 100644
--- a/hw/arm/allwinner-h3.c
+++ b/hw/arm/allwinner-h3.c
@@ -41,6 +41,9 @@ static void aw_h3_init(Object *obj)
 
sysbus_init_child_obj(obj, "ccu", &s->ccu, sizeof(s->ccu),
   TYPE_AW_H3_CLK);
+
+sysbus_init_child_obj(obj, "syscon", &s->syscon, sizeof(s->syscon),
+  TYPE_AW_H3_SYSCON);
 }
 
 static void aw_h3_realize(DeviceState *dev, Error **errp)
@@ -184,6 +187,14 @@ static void aw_h3_realize(DeviceState *dev, Error **errp)
 }
sysbus_mmio_map(SYS_BUS_DEVICE(&s->ccu), 0, AW_H3_CCU_BASE);
 
+/* System Control */
+object_property_set_bool(OBJECT(&s->syscon), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+sysbus_mmio_map(SYS_BUS_DEVICE(>syscon), 0, AW_H3_SYSCON_BASE);
+
 /* Universal Serial Bus */
 sysbus_create_simple(TYPE_AW_H3_EHCI, AW_H3_EHCI0_BASE,
  s->irq[AW_H3_GIC_SPI_EHCI0]);
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 200ed44ce1..b234aefba5 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -29,6 +29,7 @@ common-obj-$(CONFIG_MACIO) += macio/
 common-obj-$(CONFIG_IVSHMEM_DEVICE) += ivshmem.o
 
 common-obj-$(CONFIG_ALLWINNER_H3) += allwinner-h3-clk.o
+common-obj-$(CONFIG_ALLWINNER_H3) += allwinner-h3-syscon.o
 common-obj-$(CONFIG_REALVIEW) += arm_sysctl.o
 common-obj-$(CONFIG_NSERIES) += cbus.o
 common-obj-$(CONFIG_ECCMEMCTL) += eccmemctl.o
diff --git a/hw/misc/allwinner-h3-syscon.c b/hw/misc/allwinner-h3-syscon.c
new file mode 100644
index 00..66bd518a05
--- /dev/null
+++ b/hw/misc/allwinner-h3-syscon.c
@@ -0,0 +1,139 @@
+/*
+ * Allwinner H3 System Control emulation
+ *
+ * Copyright (C) 2019 Niek Linnenbank 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "migration/vmstate.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "hw/misc/allwinner-h3-syscon.h"
+
+/* SYSCON register offsets */
+#define REG_VER (0x24)  /* Version */
+#define REG_EMAC_PHY_CLK(0x30)  /* EMAC PHY Clock */
+#define REG_INDEX(offset)   (offset / sizeof(uint32_t))
+
+/* SYSCON register reset values */
+#define REG_VER_RST (0x0)
+#define REG_EMAC_PHY_CLK_RST(0x58000)
+
+static uint64_t allwinner_h3_syscon_read(void *opaque, hwaddr offset,
+ unsigned size)
+{
+const AwH3SysconState *s = (AwH3SysconState *)opaque;
+const uint32_t idx = REG_INDEX(offset);
+
+if (idx >= AW_H3_SYSCON_REGS_NUM) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: bad read offset 0x%04x\n",
+  __func__, (uint32_t)offset);
+return 0;
+}
+
+return s->regs[idx];
+}
+
+static void allwinner_h3_syscon_write(void *opaque, hwaddr offset,
+  uint64_t val, unsigned size)
+{
+AwH3SysconState *s = (AwH3SysconState *)opaque;
+const uint32_t idx = REG_INDEX(offset);
+
+if (idx >= AW_H3_SYSCON_REGS_NUM) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: bad write offset 0x%04x\n",
+  __func__, (uint32_t)offset);
+return;
+}
+
+switch (offset) {
+case REG_VER:   /* Version */
+break;
+default:
+s->regs[idx] = (uint32_t) val;
+break;
+}
+}
+
+static const MemoryRegionOps allwinner_h3_syscon_ops = {
+.read = allwinner_h3_syscon_read,
+.write = allwinner_h3_syscon_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false
+}
+};
