Re: [RFC PATCH v2 00/16] Initial support for machine creation via QMP

2021-10-12 Thread Mark Burton
Fixed
Cheers
Mark.


> On 13 Oct 2021, at 00:16, Alistair Francis  wrote:
> 
> On Thu, Sep 23, 2021 at 2:22 AM Damien Hedde  
> wrote:
>> 
>> Hi,
>> 
>> The goal of this work is to bring dynamic machine creation to QEMU:
>> we want to set up a machine without compiling a specific machine C
>> code. It would ease supporting highly configurable platforms (for
>> example resulting from an automated design flow). The requirements
>> for such configuration include being able to specify the number of
>> cores, available peripherals, memory mapping, IRQ mapping, etc.
>> 
>> This series focuses on the first step: populating a machine with
>> devices during its creation. We propose patches to support this
>> using QMP commands. This is a working set of patches and improves
>> over the earlier rfc (posted in May):
>> https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg03706.html
>> 
>> Although it is working and could be merged, it is tagged as an RFC:
>> we probably need to discuss the conditions for allowing a device to
>> be created at an early stage. Patches 6, 10, 13, 15 and 16 depend
>> on such conditions and are subject to change. The other patches are
>> unrelated to this point.
>> 
>> We address several issues in this series. They are detailed below.
>> 
>> ## 1. Stopping QEMU to populate the machine with devices
>> 
>> QEMU goes through several steps (called _machine phases_) when
>> creating the machine: 'no-machine', 'machine-created',
>> 'accel-created', 'initialized', and finally 'ready'. At 'ready'
>> phase, QEMU is ready to start (see Paolo's page
>> https://wiki.qemu.org/User:Paolo_Bonzini/Machine_init_sequence for
>> more details).
>> 
>> Using the -preconfig CLI option, QEMU can today be stopped during
>> the 'accel-created' phase. Then the 'x-exit-preconfig' QMP command
>> triggers QEMU to move forward to the completion of machine
>> creation (the 'ready' phase).
>> 
>> The devices are created during the 'initialized' phase.
>> In this phase the machine init() method has been executed and thus
>> machine properties have been handled. Although the sysbus exists and
>> the machine may have been populated by the init(),
>> _machine_init_done_ notifiers have not been called yet. At this point
>> we can add more devices to a machine.
>> 
>> We propose to add 2 QMP commands:
>> + The 'query-machine-phase' command would return the current machine
>>  phase.
>> + The 'x-machine-init' command would advance the machine phase to
>>  'initialized'. 'x-exit-preconfig' could then still be used to
>>  advance to the last phase.
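A hypothetical QMP session using the two proposed commands might look as follows (the command names match the proposal above; the exact return format is an assumption):

```json
{ "execute": "query-machine-phase" }
{ "return": { "phase": "accel-created" } }

{ "execute": "x-machine-init" }
{ "return": {} }

{ "execute": "query-machine-phase" }
{ "return": { "phase": "initialized" } }
```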
>> 
>> ## 2. Adding devices
>> 
>> Right now, the user can create devices in 2 ways: using the '-device'
>> CLI option or the 'device_add' QMP command. Both are executed after the
>> machine is ready: such devices are hot-plugged. We propose to allow
>> the 'device_add' QMP command to be used during the 'initialized' phase.
>> 
>> In this series, we keep the constraint that the device must be
>> 'user-creatable' (this is a device class flag). We do not see any
>> reason why a device the user can hot-plug could not be created at an
>> earlier stage.
>> 
>> This part is still RFC because, as Peter mentioned (in this thread
>> https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg01933.html),
>> we may want additional or distinct conditions for:
>> + devices we can hot-plug
>> + devices we can cold-plug in '-preconfig'
>> We are open to suggestions. We could for example add a
>> 'preconfig-creatable' or 'init-creatable' flag to device class, which
>> can identify a set of devices we can create this way.
>> 
>> The main addition is how we handle the case of sysbus devices. Sysbus
>> devices are particular because, unlike for example PCI devices, you
>> have to manually handle the memory mapping and interrupt wiring. So
>> right now, a sysbus device is dynamically creatable (using the -device
>> CLI option or the device_add QMP command) only if:
>> + it is 'user_creatable' (like any other device),
>> + and it is in the current machine sysbus device allow list.
>> 
>> In this series, we propose to relax the second constraint during the
>> earlier phases of machine creation so that when using -preconfig we
>> can create any 'user-creatable' sysbus device. When the machine
>> progresses to the 'ready' phase, sysbus device creation reverts to
>> the legacy behavior: it is allowed only on a per-machine
>> authorization basis.
>> 
>> For sysbus devices, wiring interrupts is not a problem as we can use
>> the 'qom-set' QMP command, but memory mapping is.
>> 
>> ## 3. Memory mapping
>> 
>> There is no point in allowing the creation of sysbus devices if we
>> cannot map them onto the memory bus (the 'sysbus').
>> 
>> As far as we know, there is currently no way to add a memory mapping
>> for a sysbus device using QMP commands. We propose an
>> 'x-sysbus-mmio-map' command to do this. This command would only be
>> allowed during the 'initialized' phase when using -preconfig.
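Putting the pieces together, a full preconfig flow might look like the transcript below; the argument names of 'x-sysbus-mmio-map' (device id, MMIO region index, base address — here 0xfee00000 in decimal, since QMP JSON has no hex literals) are assumptions for illustration, not the final interface:

```json
{ "execute": "x-machine-init" }
{ "return": {} }

{ "execute": "device_add",
  "arguments": { "driver": "some-sysbus-device", "id": "dev0" } }
{ "return": {} }

{ "execute": "x-sysbus-mmio-map",
  "arguments": { "device": "dev0", "mmio": 0, "addr": 4276092928 } }
{ "return": {} }

{ "execute": "x-exit-preconfig" }
{ "return": {} }
```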
>> 
>> ## 4. Working 

Re: [RFC PATCH v4 20/20] vdpa: Add custom IOTLB translations to SVQ

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:06, Eugenio Pérez 写道:

Use translations added in VhostIOVATree in SVQ.

Now every element needs to store the previous address also, so VirtQueue
can consume the elements properly. This adds a little overhead per VQ
element, having to allocate more memory to stash them. As a possible
optimization, this allocation could be avoided if the descriptor is not
a chain but a single one, but this is left undone.

TODO: iova range should be queried before, and add logic to fail when
GPA is outside of its range and memory listener or svq add it.

Signed-off-by: Eugenio Pérez
---
  hw/virtio/vhost-shadow-virtqueue.h |   4 +-
  hw/virtio/vhost-shadow-virtqueue.c | 130 -
  hw/virtio/vhost-vdpa.c |  40 -
  hw/virtio/trace-events |   1 +
  4 files changed, 152 insertions(+), 23 deletions(-)



Think hard about the whole logic. This is safe since the qemu memory
map will fail if the guest submits an invalid IOVA.


Then I wonder if we could do something much simpler:

1) Use qemu VA as IOVA, but only map the VA that belongs to the guest
2) Then we don't need any IOVA tree here; what we need is to just map
the vring and use qemu VA without any translation


Thanks




Re: [PATCH v2 00/37] Add D-Bus display backend

2021-10-12 Thread Gerd Hoffmann
On Sun, Oct 10, 2021 at 01:08:01AM +0400, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Hi,
> 
> Both Spice and VNC are relatively complex and inefficient for local-only
> display/console export.
> 
> The goal of this display backend is to export over D-Bus an interface close to
> the QEMU internal APIs. Any -display or -audio backend should be possible to
> implement externally that way. It will allow third parties to maintain their
> own backends (UI toolkits, servers etc), and eventually reduce the
> responsibility on QEMU.
> 
> D-Bus is the protocol of choice for the desktop; it has many convenient
> bindings for various languages and tools. Data blob transfer is more
> efficient than QMP too. Backends can come and go as needed: you can have
> several displays open (say Boxes & virt-manager), while exporting the
> display over VNC for example from a different process. It works best on
> Unix, but there is some Windows support too (even Windows has some AF_UNIX
> nowadays, and the WSL2 situation may change the future of QEMU on Windows
> anyway).
> 
> Using it only requires "-display dbus" on any reasonable Linux desktop with a
> D-Bus session bus. Then you can use busctl, d-feet or gdbus, ex:
> $ gdbus introspect --session -r -d org.qemu -o /
> 
> See the different patches and documentation for further options. The p2p=on
> mode should also allow users to run bus-less (on macOS for example). We can
> also add a TCP socket if needed (although more work would be needed in this
> case to replace the FD-passing with some extra TCP listening socket).

Wow.  That series got a lot of fine tuning.  The patches look all good
to me.

Acked-by: Gerd Hoffmann 

> A WIP Rust/Gtk4 client and VNC server is: 
> https://gitlab.com/marcandre.lureau/qemu-display/
> (check README.md for details, then `cargo run` should connect to QEMU)

Hmm, that wants rather cutting edge versions, stock Fedora 34 isn't new
enough to build it.  And I don't feel like updating to Fedora 35 beta
for that.  So unfortunately I couldn't easily test it, but I'd love to
see that live in action.

Is it possible to keep the client running while starting and stopping
qemu (comparable to "virt-viewer --wait --reconnect" behaviour)?

take care,
  Gerd




Re: [PATCH v2] hw/usb/vt82c686-uhci-pci: Use ISA instead of PCI interrupts

2021-10-12 Thread Gerd Hoffmann
On Mon, Oct 11, 2021 at 12:31:17PM +0200, BALATON Zoltan wrote:
> On Mon, 11 Oct 2021, Gerd Hoffmann wrote:
> > On Tue, Oct 05, 2021 at 03:12:05PM +0200, BALATON Zoltan wrote:
> > > This device is part of a superio/ISA bridge chip and IRQs from it are
> > > routed to an ISA interrupt set by the Interrupt Line PCI config
> > > register. Change uhci_update_irq() to allow this and use it from
> > > vt82c686-uhci-pci.
> > 
> > Hmm, shouldn't this logic be part of the superio bridge emulation?
> 
> But how? The ISA bridge does not know about PCI and PCI does not know about
> ISA. UHCI is a PCIDevice and would raise PCI interrupts. Where and how could
> I convert that to ISA interrupts? (Oh and ISA in QEMU is not Qdev'ified and
> I don't want to do that as it's too much work and too much to break that I
> can't even test, so if an alternative solution involves that then get
> somebody to do that first.) This patch puts the irq mapping in the vt82xx
> specific vt82c686-uhci-pci.c which in the real chip also contains the ISA
> bridge so in a way it's part of the superio bridge emulation in that this
> uhci variant is part of that chip model.

I'd suggest first switching uhci over to pci_allocate_irq() +
qemu_set_irq() (see ehci for example).

With that in place it should be possible to have vt82c686-uhci-pci.c
create a different IRQ setup without changes elsewhere in uhci and
without adding extra callbacks.

HTH,
  Gerd




[PATCH v3 1/2] numa: Require distance map when empty node exists

2021-10-12 Thread Gavin Shan
The following option is used to specify the distance map. It's
possible the option isn't provided by the user. In that case, the
distance map isn't populated and exposed to the platform. On the
other hand, empty NUMA nodes, where no memory resides, are
allowed on platforms like ARM64 virt. For these empty NUMA
nodes, their corresponding device-tree nodes aren't populated,
but their NUMA IDs should be included in the "/distance-map"
device-tree node, so that the kernel can probe them properly if
device-tree is used.

  -numa dist,src=<source>,dst=<destination>,val=<distance>

This adds an extra check after the machine is initialized, to
ask the user for the distance map when empty nodes exist in
the device-tree.
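With this check in place, a configuration containing an empty node would need explicit distances on the command line, along these lines (node layout and distance values are illustrative only):

```
-numa node,nodeid=0,cpus=0-1,memdev=mem0 \
-numa node,nodeid=1 \
-numa dist,src=0,dst=1,val=20 \
-numa dist,src=1,dst=0,val=20
```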

Signed-off-by: Gavin Shan 
---
 hw/core/machine.c |  4 
 hw/core/numa.c| 24 
 include/sysemu/numa.h |  1 +
 3 files changed, 29 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index b8d95eec32..c0765ad973 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1355,6 +1355,10 @@ void machine_run_board_init(MachineState *machine)
 accel_init_interfaces(ACCEL_GET_CLASS(machine->accelerator));
 machine_class->init(machine);
 phase_advance(PHASE_MACHINE_INITIALIZED);
+
+if (machine->numa_state) {
+numa_complete_validation(machine);
+}
 }
 
 static NotifierList machine_init_done_notifiers =
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 510d096a88..7404b7dd38 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -727,6 +727,30 @@ void numa_complete_configuration(MachineState *ms)
 }
 }
 
+/*
+ * When device-tree is used by the machine, the empty node IDs should
+ * be included in the distance map. So we need to provide pairs of distances
+ * in this case.
+ */
+void numa_complete_validation(MachineState *ms)
+{
+NodeInfo *numa_info = ms->numa_state->nodes;
+int nb_numa_nodes = ms->numa_state->num_nodes;
+int i;
+
+if (!ms->fdt || ms->numa_state->have_numa_distance) {
+return;
+}
+
+for (i = 0; i < nb_numa_nodes; i++) {
+if (numa_info[i].present && !numa_info[i].node_mem) {
+error_report("Empty node %d found, please provide "
+ "distance map.", i);
+exit(EXIT_FAILURE);
+}
+}
+}
+
 void parse_numa_opts(MachineState *ms)
 {
    qemu_opts_foreach(qemu_find_opts("numa"), parse_numa, ms, &error_fatal);
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 4173ef2afa..80f25ab830 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -104,6 +104,7 @@ void parse_numa_hmat_lb(NumaState *numa_state, 
NumaHmatLBOptions *node,
 void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
Error **errp);
 void numa_complete_configuration(MachineState *ms);
+void numa_complete_validation(MachineState *ms);
 void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms);
 extern QemuOptsList qemu_numa_opts;
 void numa_cpu_pre_plug(const struct CPUArchId *slot, DeviceState *dev,
-- 
2.23.0




[PATCH v3 2/2] hw/arm/virt: Don't create device-tree node for empty NUMA node

2021-10-12 Thread Gavin Shan
Empty NUMA nodes, where no memory resides, are allowed. For
example, the following command line specifies two empty NUMA nodes.
With this, QEMU fails to boot because of the conflicting device-tree
node names, as the following error message indicates.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host   \
  -cpu host -smp 4,sockets=2,cores=2,threads=1\
  -m 1024M,slots=16,maxmem=64G\
  -object memory-backend-ram,id=mem0,size=512M\
  -object memory-backend-ram,id=mem1,size=512M\
  -numa node,nodeid=0,cpus=0-1,memdev=mem0\
  -numa node,nodeid=1,cpus=2-3,memdev=mem1\
  -numa node,nodeid=2 \
  -numa node,nodeid=3
:
  qemu-system-aarch64: FDT: Failed to create subnode /memory@8000: 
FDT_ERR_EXISTS

As specified by the Linux device-tree binding document, the
device-tree nodes for these empty NUMA nodes shouldn't be generated.
However, the corresponding NUMA node IDs should be included in the
distance map device-tree node. This patch skips populating the
device-tree nodes for these empty NUMA nodes to avoid the error, so
that QEMU can be started successfully.

Signed-off-by: Gavin Shan 
Reviewed-by: Andrew Jones 
---
 hw/arm/boot.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 57efb61ee4..4e5898fcdc 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -603,6 +603,10 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
 mem_base = binfo->loader_start;
 for (i = 0; i < ms->numa_state->num_nodes; i++) {
 mem_len = ms->numa_state->nodes[i].node_mem;
+if (!mem_len) {
+continue;
+}
+
 rc = fdt_add_memory_node(fdt, acells, mem_base,
  scells, mem_len, i);
 if (rc < 0) {
-- 
2.23.0




[PATCH v3 0/2] hw/arm/virt: Fix qemu booting failure on device-tree

2021-10-12 Thread Gavin Shan
Empty NUMA nodes, where no memory resides, are allowed on the ARM64
virt platform. However, QEMU fails to boot because the device-tree
can't be populated due to the conflicting device-tree node names of
these empty NUMA nodes. For example, QEMU fails to boot with the
following error message when the command line below is used.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host   \
  -cpu host -smp 4,sockets=2,cores=2,threads=1\
  -m 1024M,slots=16,maxmem=64G\
  -object memory-backend-ram,id=mem0,size=512M\
  -object memory-backend-ram,id=mem1,size=512M\
  -numa node,nodeid=0,cpus=0-1,memdev=mem0\
  -numa node,nodeid=1,cpus=2-3,memdev=mem1\
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \
:
  qemu-system-aarch64: FDT: Failed to create subnode /memory@8000: 
FDT_ERR_EXISTS

The latest device-tree specification doesn't indicate how the
device-tree nodes should be populated for these empty NUMA nodes. The
proposed way to handle this is documented in the Linux kernel. The
Linux kernel patches have been acknowledged and should be merged
upstream soon.

  https://lkml.org/lkml/2021/9/27/31

This series follows the suggestion, which is included in the Linux
kernel patches, to resolve the QEMU boot failure: the corresponding
device-tree nodes aren't created for the empty NUMA nodes, but their
distance map matrix should be provided by the user so that the empty
NUMA node IDs can be parsed properly.

Changelog
=
v3:
   * Require users to provide the distance map matrix when an empty
     NUMA node is included. The default distance map won't
     be generated any more (Igor/Drew)
v2:
   * Amend PATCH[01/02]'s changelog to explain why we needn't
     switch to disabling the default distance map (Drew)

Gavin Shan (2):
  numa: Require distance map when empty node exists
  hw/arm/virt: Don't create device-tree node for empty NUMA node

 hw/arm/boot.c |  4 
 hw/core/machine.c |  4 
 hw/core/numa.c| 24 
 include/sysemu/numa.h |  1 +
 4 files changed, 33 insertions(+)

-- 
2.23.0




Re: [RFC PATCH v4 17/20] vhost: Use VRING_AVAIL_F_NO_INTERRUPT at device call on shadow virtqueue

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:06, Eugenio Pérez 写道:

Signed-off-by: Eugenio Pérez 



Commit log please.

Thanks



---
  hw/virtio/vhost-shadow-virtqueue.c | 24 +++-
  1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 775f8d36a0..2fd0bab75d 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -60,6 +60,9 @@ typedef struct VhostShadowVirtqueue {
  
  /* Next head to consume from device */

  uint16_t used_idx;
+
+/* Cache for the exposed notification flag */
+bool notification;
  } VhostShadowVirtqueue;
  
  /* If the device is using some of these, SVQ cannot communicate */

@@ -105,6 +108,24 @@ bool vhost_svq_valid_device_features(uint64_t 
*dev_features)
  return r;
  }
  
+static void vhost_svq_set_notification(VhostShadowVirtqueue *svq, bool enable)

+{
+uint16_t notification_flag;
+
+if (svq->notification == enable) {
+return;
+}
+
+notification_flag = cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
+
+svq->notification = enable;
+if (enable) {
+svq->vring.avail->flags &= ~notification_flag;
+} else {
+svq->vring.avail->flags |= notification_flag;
+}
+}
+
  static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
  const struct iovec *iovec,
  size_t num, bool more_descs, bool write)
@@ -273,7 +294,7 @@ static void vhost_svq_handle_call_no_test(EventNotifier *n)
  do {
  unsigned i = 0;
  
-/* TODO: Use VRING_AVAIL_F_NO_INTERRUPT */

+vhost_svq_set_notification(svq, false);
  while (true) {
  g_autofree VirtQueueElement *elem = vhost_svq_get_buf(svq);
  if (!elem) {
@@ -286,6 +307,7 @@ static void vhost_svq_handle_call_no_test(EventNotifier *n)
  
  virtqueue_flush(vq, i);

  event_notifier_set(&svq->guest_call_notifier);
+vhost_svq_set_notification(svq, true);
  } while (vhost_svq_more_used(svq));
  }
  





Re: [RFC PATCH v4 16/20] vhost: Check for device VRING_USED_F_NO_NOTIFY at shadow virtqueue kick

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

Signed-off-by: Eugenio Pérez 
---
  hw/virtio/vhost-shadow-virtqueue.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index df7e6fa3ec..775f8d36a0 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -173,6 +173,15 @@ static void vhost_svq_add(VhostShadowVirtqueue *svq, 
VirtQueueElement *elem)
  svq->ring_id_maps[qemu_head] = elem;
  }
  
+static void vhost_svq_kick(VhostShadowVirtqueue *svq)

+{
+/* Make sure we are reading updated device flag */



I guess this would be better:

    /* We need to expose available array entries before checking used
 * flags. */

(Borrowed from kernel codes).

Thanks



+smp_mb();
+if (!(svq->vring.used->flags & VRING_USED_F_NO_NOTIFY)) {
+event_notifier_set(&svq->kick_notifier);
+}
+}
+
  /* Handle guest->device notifications */
  static void vhost_handle_guest_kick(EventNotifier *n)
  {
@@ -197,7 +206,7 @@ static void vhost_handle_guest_kick(EventNotifier *n)
  }
  
  vhost_svq_add(svq, elem);

-event_notifier_set(&svq->kick_notifier);
+vhost_svq_kick(svq);
  }
  
  virtio_queue_set_notification(svq->vq, true);





Re: [RFC PATCH v4 15/20] vhost: Shadow virtqueue buffers forwarding

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

Initial version of the shadow virtqueue that actually forwards buffers.
There is no IOMMU support at the moment, and that will be addressed in
future patches of this series. Since all vhost-vdpa devices use forced
IOMMU, this means that SVQ is not usable at this point of the series on
any device.

For simplicity it only supports modern devices, which expect the vring
in little endian, with split ring and no event idx or indirect
descriptors. Support for them will not be added in this series.

It reuses the VirtQueue code for the device part. The driver part is
based on Linux's virtio_ring driver, but with stripped functionality
and optimizations so it's easier to review. Later commits add simpler
ones.

SVQ uses VIRTIO_CONFIG_S_DEVICE_STOPPED to pause the device and
retrieve its status (next available idx the device was going to
consume) race-free. It can later reset the device to replace vring
addresses etc. When SVQ starts qemu can resume consuming the guest's
driver ring from that state, without notice from the latter.

This status bit VIRTIO_CONFIG_S_DEVICE_STOPPED is currently discussed
in VirtIO, and is implemented in qemu VirtIO-net devices in previous
commits.

Removal of the _S_DEVICE_STOPPED bit (in other words, resuming the
device) can be done in the future if a use case arises. At this moment
we can just rely on resetting the full device.

Signed-off-by: Eugenio Pérez 
---
  qapi/net.json  |   2 +-
  hw/virtio/vhost-shadow-virtqueue.c | 237 -
  hw/virtio/vhost-vdpa.c | 109 -
  3 files changed, 337 insertions(+), 11 deletions(-)

diff --git a/qapi/net.json b/qapi/net.json
index fe546b0e7c..1f4a55f2c5 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -86,7 +86,7 @@
  #
  # @name: the device name of the VirtIO device
  #
-# @enable: true to use the alternate shadow VQ notifications
+# @enable: true to use the alternate shadow VQ buffers forwarding path
  #
  # Returns: Error if failure, or 'no error' for success.
  #
diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 34e159d4fd..df7e6fa3ec 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -10,6 +10,7 @@
  #include "qemu/osdep.h"
  #include "hw/virtio/vhost-shadow-virtqueue.h"
  #include "hw/virtio/vhost.h"
+#include "hw/virtio/virtio-access.h"
  
  #include "standard-headers/linux/vhost_types.h"
  
@@ -44,15 +45,135 @@ typedef struct VhostShadowVirtqueue {
  
  /* Virtio device */

  VirtIODevice *vdev;
+
+/* Map for returning guest's descriptors */
+VirtQueueElement **ring_id_maps;
+
+/* Next head to expose to device */
+uint16_t avail_idx_shadow;
+
+/* Next free descriptor */
+uint16_t free_head;
+
+/* Last seen used idx */
+uint16_t shadow_used_idx;
+
+/* Next head to consume from device */
+uint16_t used_idx;



Let's use "last_used_idx" as the kernel driver does.



  } VhostShadowVirtqueue;
  
  /* If the device is using some of these, SVQ cannot communicate */

  bool vhost_svq_valid_device_features(uint64_t *dev_features)
  {
-return true;
+uint64_t b;
+bool r = true;
+
+for (b = VIRTIO_TRANSPORT_F_START; b <= VIRTIO_TRANSPORT_F_END; ++b) {
+switch (b) {
+case VIRTIO_F_NOTIFY_ON_EMPTY:
+case VIRTIO_F_ANY_LAYOUT:
+/* SVQ is fine with this feature */
+continue;
+
+case VIRTIO_F_ACCESS_PLATFORM:
+/* SVQ needs this feature disabled. Can't continue */



The code can explain itself; we need a comment to explain why.



+if (*dev_features & BIT_ULL(b)) {
+clear_bit(b, dev_features);
+r = false;
+}
+break;
+
+case VIRTIO_F_VERSION_1:
+/* SVQ needs this feature, so can't continue */



A comment to explain why SVQ needs this feature.



+if (!(*dev_features & BIT_ULL(b))) {
+set_bit(b, dev_features);
+r = false;
+}
+continue;
+
+default:
+/*
+ * SVQ must disable this feature, let's hope the device is fine
+ * without it.
+ */
+if (*dev_features & BIT_ULL(b)) {
+clear_bit(b, dev_features);
+}
+}
+}
+
+return r;
+}



Let's move this to patch 14.



+
+static void vhost_vring_write_descs(VhostShadowVirtqueue *svq,
+const struct iovec *iovec,
+size_t num, bool more_descs, bool write)
+{
+uint16_t i = svq->free_head, last = svq->free_head;
+unsigned n;
+uint16_t flags = write ? cpu_to_le16(VRING_DESC_F_WRITE) : 0;
+vring_desc_t *descs = svq->vring.desc;
+
+if (num == 0) {
+return;
+}
+
+for (n = 0; n < num; n++) {
+if (more_descs || (n + 1 < num)) {
+

Re: [RFC PATCH v4 13/20] vdpa: Save host and guest features

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

Those are needed for SVQ: Host ones are needed to check if SVQ knows
how to talk with the device and for feature negotiation, and guest ones
to know if SVQ can talk with it.

Signed-off-by: Eugenio Pérez 
---
  include/hw/virtio/vhost-vdpa.h |  2 ++
  hw/virtio/vhost-vdpa.c | 31 ---
  2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index fddac248b3..9044ae694b 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -26,6 +26,8 @@ typedef struct vhost_vdpa {
  int device_fd;
  uint32_t msg_type;
  MemoryListener listener;
+uint64_t host_features;
+uint64_t guest_features;



Any reason that we can't use the features stored in VirtioDevice?

Thanks



  bool shadow_vqs_enabled;
  GPtrArray *shadow_vqs;
  struct vhost_dev *dev;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 6c5f4c98b8..a057e8277d 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -439,10 +439,19 @@ static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
  return 0;
  }
  
-static int vhost_vdpa_set_features(struct vhost_dev *dev,

-   uint64_t features)
+/**
+ * Internal set_features() that follows vhost/VirtIO protocol for that
+ */
+static int vhost_vdpa_backend_set_features(struct vhost_dev *dev,
+   uint64_t features)
  {
+struct vhost_vdpa *v = dev->opaque;
+
  int ret;
+if (v->host_features & BIT_ULL(VIRTIO_F_QUEUE_STATE)) {
+features |= BIT_ULL(VIRTIO_F_QUEUE_STATE);
+}
+
  trace_vhost_vdpa_set_features(dev, features);
  ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
  uint8_t status = 0;
@@ -455,6 +464,17 @@ static int vhost_vdpa_set_features(struct vhost_dev *dev,
  return !(status & VIRTIO_CONFIG_S_FEATURES_OK);
  }
  
+/**

+ * Exposed vhost set features
+ */
+static int vhost_vdpa_set_features(struct vhost_dev *dev,
+   uint64_t features)
+{
+struct vhost_vdpa *v = dev->opaque;
+v->guest_features = features;
+return vhost_vdpa_backend_set_features(dev, features);
+}
+
  static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
  {
  uint64_t features;
@@ -673,12 +693,17 @@ static int vhost_vdpa_set_vring_call(struct vhost_dev 
*dev,
  }
  
  static int vhost_vdpa_get_features(struct vhost_dev *dev,

- uint64_t *features)
+   uint64_t *features)
  {
  int ret;
  
  ret = vhost_vdpa_call(dev, VHOST_GET_FEATURES, features);

  trace_vhost_vdpa_get_features(dev, *features);
+
+if (ret == 0) {
+struct vhost_vdpa *v = dev->opaque;
+v->host_features = *features;
+}
  return ret;
  }
  





Re: [RFC PATCH v4 12/20] virtio: Add vhost_shadow_vq_get_vring_addr

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

It reports the shadow virtqueue address from qemu virtual address space



I think both the title and commit log need more tweaks. Looking at
the code, what it does is actually introduce the vring into svq.





Signed-off-by: Eugenio Pérez 
---
  hw/virtio/vhost-shadow-virtqueue.h |  4 +++
  hw/virtio/vhost-shadow-virtqueue.c | 50 ++
  2 files changed, 54 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 237cfceb9c..2df3d117f5 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -16,6 +16,10 @@ typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
  
  EventNotifier *vhost_svq_get_svq_call_notifier(VhostShadowVirtqueue *svq);

  void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int 
call_fd);
+void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
+  struct vhost_vring_addr *addr);
+size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq);
+size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq);
  
  bool vhost_svq_start(struct vhost_dev *dev, unsigned idx,

   VhostShadowVirtqueue *svq);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 3fe129cf63..5c1899f6af 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -18,6 +18,9 @@
  
  /* Shadow virtqueue to relay notifications */

  typedef struct VhostShadowVirtqueue {
+/* Shadow vring */
+struct vring vring;
+
  /* Shadow kick notifier, sent to vhost */
  EventNotifier kick_notifier;
  /* Shadow call notifier, sent to vhost */
@@ -38,6 +41,9 @@ typedef struct VhostShadowVirtqueue {
  
  /* Virtio queue shadowing */

  VirtQueue *vq;
+
+/* Virtio device */
+VirtIODevice *vdev;
  } VhostShadowVirtqueue;
  
  /* Forward guest notifications */

@@ -93,6 +99,35 @@ void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue 
*svq, int call_fd)
  event_notifier_init_fd(&svq->guest_call_notifier, call_fd);
  }
  
+/*

+ * Get the shadow vq vring address.
+ * @svq Shadow virtqueue
+ * @addr Destination to store address
+ */
+void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
+  struct vhost_vring_addr *addr)
+{
+addr->desc_user_addr = (uint64_t)svq->vring.desc;
+addr->avail_user_addr = (uint64_t)svq->vring.avail;
+addr->used_user_addr = (uint64_t)svq->vring.used;
+}
+
+size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq)
+{
+uint16_t vq_idx = virtio_get_queue_index(svq->vq);
+size_t desc_size = virtio_queue_get_desc_size(svq->vdev, vq_idx);
+size_t avail_size = virtio_queue_get_avail_size(svq->vdev, vq_idx);
+
+return ROUND_UP(desc_size + avail_size, qemu_real_host_page_size);



Is this round up required by the spec?



+}
+
+size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq)
+{
+uint16_t vq_idx = virtio_get_queue_index(svq->vq);
+size_t used_size = virtio_queue_get_used_size(svq->vdev, vq_idx);
+return ROUND_UP(used_size, qemu_real_host_page_size);
+}
+
  /*
   * Restore the vhost guest to host notifier, i.e., disables svq effect.
   */
@@ -178,6 +213,10 @@ void vhost_svq_stop(struct vhost_dev *dev, unsigned idx,
  VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx)
  {
  int vq_idx = dev->vq_index + idx;
+unsigned num = virtio_queue_get_num(dev->vdev, vq_idx);
+size_t desc_size = virtio_queue_get_desc_size(dev->vdev, vq_idx);
+size_t driver_size;
+size_t device_size;
  g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
  int r;
  
@@ -196,6 +235,15 @@ VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx)

  }
  
  svq->vq = virtio_get_queue(dev->vdev, vq_idx);

+svq->vdev = dev->vdev;
+driver_size = vhost_svq_driver_area_size(svq);
+device_size = vhost_svq_device_area_size(svq);
+svq->vring.num = num;
+svq->vring.desc = qemu_memalign(qemu_real_host_page_size, driver_size);
+svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
+memset(svq->vring.desc, 0, driver_size);



Any reason for using the contiguous area for both desc and avail?

Thanks



+svq->vring.used = qemu_memalign(qemu_real_host_page_size, device_size);
+memset(svq->vring.used, 0, device_size);
event_notifier_set_handler(&svq->call_notifier,
 vhost_svq_handle_call);
  return g_steal_pointer(&svq);
@@ -215,5 +263,7 @@ void vhost_svq_free(VhostShadowVirtqueue *vq)
  event_notifier_cleanup(&vq->kick_notifier);
  event_notifier_set_handler(&vq->call_notifier, NULL);
  event_notifier_cleanup(&vq->call_notifier);
+qemu_vfree(vq->vring.desc);
+qemu_vfree(vq->vring.used);
  g_free(vq);
  }





Re: [RFC PATCH v4 11/20] vhost: Route host->guest notification through shadow virtqueue

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

This will make qemu aware of the device used buffers, allowing it to
write the guest memory with its contents if needed.

Since the use of vhost_virtqueue_start can unmask and discard call
events, vhost_virtqueue_start should be modified in one of these ways:
* Split in two: One of them uses all logic to start a queue with no
   side effects for the guest, and another one that actually assumes that
   the guest has just started the device. Vdpa should use just the
   former.
* Actually store and check if the guest notifier is masked, and do it
   conditionally.
* Left as it is, and duplicate all the logic in vhost-vdpa.



Btw, the log is not clear. I guess this patch goes for method 3. If 
yes, we need to explain it and why.


Thanks




Signed-off-by: Eugenio Pérez





Re: [RFC PATCH v4 11/20] vhost: Route host->guest notification through shadow virtqueue

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

This will make qemu aware of the device used buffers, allowing it to
write the guest memory with its contents if needed.

Since the use of vhost_virtqueue_start can unmask and discard call
events, vhost_virtqueue_start should be modified in one of these ways:
* Split in two: One of them uses all logic to start a queue with no
   side effects for the guest, and another one that actually assumes that
   the guest has just started the device. Vdpa should use just the
   former.
* Actually store and check if the guest notifier is masked, and do it
   conditionally.
* Left as it is, and duplicate all the logic in vhost-vdpa.

Signed-off-by: Eugenio Pérez 
---
  hw/virtio/vhost-shadow-virtqueue.c | 19 +++
  hw/virtio/vhost-vdpa.c | 38 +-
  2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 21dc99ab5d..3fe129cf63 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -53,6 +53,22 @@ static void vhost_handle_guest_kick(EventNotifier *n)
  event_notifier_set(&svq->kick_notifier);
  }
  
+/* Forward vhost notifications */

+static void vhost_svq_handle_call_no_test(EventNotifier *n)
+{
+VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
+ call_notifier);
+
+event_notifier_set(&svq->guest_call_notifier);
+}
+
+static void vhost_svq_handle_call(EventNotifier *n)
+{
+if (likely(event_notifier_test_and_clear(n))) {
+vhost_svq_handle_call_no_test(n);
+}
+}
+
  /*
   * Obtain the SVQ call notifier, where vhost device notifies SVQ that there
   * exists pending used buffers.
@@ -180,6 +196,8 @@ VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, 
int idx)
  }
  
  svq->vq = virtio_get_queue(dev->vdev, vq_idx);

+event_notifier_set_handler(&svq->call_notifier,
+   vhost_svq_handle_call);
  return g_steal_pointer(&svq);
  
  err_init_call_notifier:

@@ -195,6 +213,7 @@ err_init_kick_notifier:
  void vhost_svq_free(VhostShadowVirtqueue *vq)
  {
  event_notifier_cleanup(&vq->kick_notifier);
+event_notifier_set_handler(&vq->call_notifier, NULL);
  event_notifier_cleanup(&vq->call_notifier);
  g_free(vq);
  }
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index bc34de2439..6c5f4c98b8 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -712,13 +712,40 @@ static bool vhost_vdpa_svq_start_vq(struct vhost_dev 
*dev, unsigned idx)
  {
  struct vhost_vdpa *v = dev->opaque;
  VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, idx);
-return vhost_svq_start(dev, idx, svq);
+EventNotifier *vhost_call_notifier = vhost_svq_get_svq_call_notifier(svq);
+struct vhost_vring_file vhost_call_file = {
+.index = idx + dev->vq_index,
+.fd = event_notifier_get_fd(vhost_call_notifier),
+};
+int r;
+bool b;
+
+/* Set shadow vq -> guest notifier */
+assert(v->call_fd[idx]);



We need to avoid the assert() here. In which case can we hit this?



+vhost_svq_set_guest_call_notifier(svq, v->call_fd[idx]);
+
+b = vhost_svq_start(dev, idx, svq);
+if (unlikely(!b)) {
+return false;
+}
+
+/* Set device -> SVQ notifier */
+r = vhost_vdpa_set_vring_dev_call(dev, &vhost_call_file);
+if (unlikely(r)) {
+error_report("vhost_vdpa_set_vring_call for shadow vq failed");
+return false;
+}



Similar to kick, do we need to set_vring_call() before vhost_svq_start()?



+
+/* Check for pending calls */
+event_notifier_set(vhost_call_notifier);



Interesting, can this result in a spurious interrupt?



+return true;
  }
  
  static unsigned vhost_vdpa_enable_svq(struct vhost_vdpa *v, bool enable)

  {
  struct vhost_dev *hdev = v->dev;
  unsigned n;
+int r;
  
  if (enable == v->shadow_vqs_enabled) {

  return hdev->nvqs;
@@ -752,9 +779,18 @@ static unsigned vhost_vdpa_enable_svq(struct vhost_vdpa 
*v, bool enable)
  if (!enable) {
  /* Disable all queues or clean up failed start */
  for (n = 0; n < v->shadow_vqs->len; ++n) {
+struct vhost_vring_file file = {
+.index = vhost_vdpa_get_vq_index(hdev, n),
+.fd = v->call_fd[n],
+};
+
+r = vhost_vdpa_set_vring_call(hdev, &file);
+assert(r == 0);
+
  unsigned vq_idx = vhost_vdpa_get_vq_index(hdev, n);
  VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, n);
  vhost_svq_stop(hdev, n, svq);
+/* TODO: This can unmask or override call fd! */



I don't get this comment. Does this mean the current code can't work 
with mask_notifiers? If yes, this is something we need to fix.


Thanks



  vhost_virtqueue_start(hdev, hdev->vdev, &hdev->vqs[n], vq_idx);
  }
  

Re: [RFC PATCH v4 10/20] vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

Signed-off-by: Eugenio Pérez 
---
  hw/virtio/vhost-vdpa.c | 17 ++---
  1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 57a857444a..bc34de2439 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -649,16 +649,27 @@ static int vhost_vdpa_set_vring_kick(struct vhost_dev 
*dev,
  return vhost_vdpa_call(dev, VHOST_SET_VRING_KICK, file);
  }
  
+static int vhost_vdpa_set_vring_dev_call(struct vhost_dev *dev,

+ struct vhost_vring_file *file)
+{
+trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
+return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
+}
+
  static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
 struct vhost_vring_file *file)
  {
  struct vhost_vdpa *v = dev->opaque;
  int vdpa_idx = vhost_vdpa_get_vq_index(dev, file->index);
  
-trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);

-
  v->call_fd[vdpa_idx] = file->fd;
-return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
+if (v->shadow_vqs_enabled) {
+VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, vdpa_idx);
+vhost_svq_set_guest_call_notifier(svq, file->fd);
+return 0;
+} else {
+return vhost_vdpa_set_vring_dev_call(dev, file);
+}



I feel like we should do the same for kick fd.

Thanks



  }
  
  static int vhost_vdpa_get_features(struct vhost_dev *dev,





Re: [RFC PATCH v4 09/20] vdpa: Save call_fd in vhost-vdpa

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

We need to know it to switch to Shadow VirtQueue.

Signed-off-by: Eugenio Pérez 
---
  include/hw/virtio/vhost-vdpa.h | 2 ++
  hw/virtio/vhost-vdpa.c | 5 +
  2 files changed, 7 insertions(+)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 48aae59d8e..fddac248b3 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -30,6 +30,8 @@ typedef struct vhost_vdpa {
  GPtrArray *shadow_vqs;
  struct vhost_dev *dev;
  QLIST_ENTRY(vhost_vdpa) entry;
+/* File descriptor the device uses to call VM/SVQ */
+int call_fd[VIRTIO_QUEUE_MAX];



Any reason we don't do this for kick_fd or why 
virtio_queue_get_guest_notifier() can't work here? Need a comment or 
commit log.


I think we need to have a consistent way to handle both kick and call fd.

Thanks



  VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
  } VhostVDPA;
  
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c

index 36c954a779..57a857444a 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -652,7 +652,12 @@ static int vhost_vdpa_set_vring_kick(struct vhost_dev *dev,
  static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
 struct vhost_vring_file *file)
  {
+struct vhost_vdpa *v = dev->opaque;
+int vdpa_idx = vhost_vdpa_get_vq_index(dev, file->index);
+
  trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
+
+v->call_fd[vdpa_idx] = file->fd;
  return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
  }
  





Re: [RFC PATCH v4 08/20] vhost: Route guest->host notification through shadow virtqueue

2021-10-12 Thread Jason Wang



在 2021/10/1 下午3:05, Eugenio Pérez 写道:

Shadow virtqueue notifications forwarding is disabled when vhost_dev
stops, so code flow follows usual cleanup.

Also, host notifiers must be disabled at SVQ start,



Any reason for this?



and they will not
start if SVQ has been enabled when device is stopped. This is trivial
to address, but it is left out for simplicity at this moment.



It looks to me like this patch also contains the following logic

1) codes to enable svq

2) codes to let svq to be enabled from QMP.

I think they need to be split out; we may end up with the following 
series of patches:


1) svq skeleton with enable/disable
2) route host notifier to svq
3) route guest notifier to svq
4) codes to enable svq
5) enable svq via QMP




Signed-off-by: Eugenio Pérez 
---
  qapi/net.json  |   2 +-
  hw/virtio/vhost-shadow-virtqueue.h |   8 ++
  include/hw/virtio/vhost-vdpa.h |   4 +
  hw/virtio/vhost-shadow-virtqueue.c | 138 -
  hw/virtio/vhost-vdpa.c | 116 +++-
  5 files changed, 264 insertions(+), 4 deletions(-)

diff --git a/qapi/net.json b/qapi/net.json
index a2c30fd455..fe546b0e7c 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -88,7 +88,7 @@
  #
  # @enable: true to use the alternate shadow VQ notifications
  #
-# Returns: Always error, since SVQ is not implemented at the moment.
+# Returns: Error if failure, or 'no error' for success.
  #
  # Since: 6.2
  #
diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 27ac6388fa..237cfceb9c 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -14,6 +14,14 @@
  
  typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
  
+EventNotifier *vhost_svq_get_svq_call_notifier(VhostShadowVirtqueue *svq);



Let's move this function to another patch since it's unrelated to the 
guest->host routing.




+void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int call_fd);
+
+bool vhost_svq_start(struct vhost_dev *dev, unsigned idx,
+ VhostShadowVirtqueue *svq);
+void vhost_svq_stop(struct vhost_dev *dev, unsigned idx,
+VhostShadowVirtqueue *svq);
+
  VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx);
  
  void vhost_svq_free(VhostShadowVirtqueue *vq);

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 0d565bb5bd..48aae59d8e 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -12,6 +12,8 @@
  #ifndef HW_VIRTIO_VHOST_VDPA_H
  #define HW_VIRTIO_VHOST_VDPA_H
  
+#include 

+
  #include "qemu/queue.h"
  #include "hw/virtio/virtio.h"
  
@@ -24,6 +26,8 @@ typedef struct vhost_vdpa {

  int device_fd;
  uint32_t msg_type;
  MemoryListener listener;
+bool shadow_vqs_enabled;
+GPtrArray *shadow_vqs;
  struct vhost_dev *dev;
  QLIST_ENTRY(vhost_vdpa) entry;
  VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index c4826a1b56..21dc99ab5d 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -9,9 +9,12 @@
  
  #include "qemu/osdep.h"

  #include "hw/virtio/vhost-shadow-virtqueue.h"
+#include "hw/virtio/vhost.h"
+
+#include "standard-headers/linux/vhost_types.h"
  
  #include "qemu/error-report.h"

-#include "qemu/event_notifier.h"
+#include "qemu/main-loop.h"
  
  /* Shadow virtqueue to relay notifications */

  typedef struct VhostShadowVirtqueue {
@@ -19,14 +22,146 @@ typedef struct VhostShadowVirtqueue {
  EventNotifier kick_notifier;
  /* Shadow call notifier, sent to vhost */
  EventNotifier call_notifier;
+
+/*
+ * Borrowed virtqueue's guest to host notifier.
+ * To borrow it in this event notifier allows to register on the event
+ * loop and access the associated shadow virtqueue easily. If we use the
+ * VirtQueue, we don't have an easy way to retrieve it.
+ *
+ * So shadow virtqueue must not clean it, or we would lose VirtQueue one.
+ */
+EventNotifier host_notifier;
+
+/* Guest's call notifier, where SVQ calls guest. */
+EventNotifier guest_call_notifier;



To be consistent, let's simply use "guest_notifier" here.



+
+/* Virtio queue shadowing */
+VirtQueue *vq;
  } VhostShadowVirtqueue;
  
+/* Forward guest notifications */

+static void vhost_handle_guest_kick(EventNotifier *n)
+{
+VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
+ host_notifier);
+
+if (unlikely(!event_notifier_test_and_clear(n))) {
+return;
+}



Is there a chance that we may stop the processing of available buffers 
during the svq enabling? There could be no kick from the guest in this case.




+
+event_notifier_set(&svq->kick_notifier);
+}
+
+/*
+ * Obtain the SVQ call notifier, where vhost device 

[PATCH v4 48/48] tests/tcg/multiarch: Add sigbus.c

2021-10-12 Thread Richard Henderson
A mostly generic test for unaligned access raising SIGBUS.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tests/tcg/multiarch/sigbus.c | 68 
 1 file changed, 68 insertions(+)
 create mode 100644 tests/tcg/multiarch/sigbus.c

diff --git a/tests/tcg/multiarch/sigbus.c b/tests/tcg/multiarch/sigbus.c
new file mode 100644
index 00..8134c5fd56
--- /dev/null
+++ b/tests/tcg/multiarch/sigbus.c
@@ -0,0 +1,68 @@
+#define _GNU_SOURCE 1
+
+#include <assert.h>
+#include <stdlib.h>
+#include <signal.h>
+#include <endian.h>
+
+
+unsigned long long x = 0x8877665544332211ull;
+void * volatile p = (void *)&x + 1;
+
+void sigbus(int sig, siginfo_t *info, void *uc)
+{
+assert(sig == SIGBUS);
+assert(info->si_signo == SIGBUS);
+#ifdef BUS_ADRALN
+assert(info->si_code == BUS_ADRALN);
+#endif
+assert(info->si_addr == p);
+exit(EXIT_SUCCESS);
+}
+
+int main()
+{
+struct sigaction sa = {
+.sa_sigaction = sigbus,
+.sa_flags = SA_SIGINFO
+};
+int allow_fail = 0;
+int tmp;
+
+tmp = sigaction(SIGBUS, &sa, NULL);
+assert(tmp == 0);
+
+/*
+ * Select an operation that's likely to enforce alignment.
+ * On many guests that support unaligned accesses by default,
+ * this is often an atomic operation.
+ */
+#if defined(__aarch64__)
+asm volatile("ldxr %w0,[%1]" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__alpha__)
+asm volatile("ldl_l %0,0(%1)" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__arm__)
+asm volatile("ldrex %0,[%1]" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__powerpc__)
+asm volatile("lwarx %0,0,%1" : "=r"(tmp) : "r"(p) : "memory");
+#elif defined(__riscv_atomic)
+asm volatile("lr.w %0,(%1)" : "=r"(tmp) : "r"(p) : "memory");
+#else
+/* No insn known to fault unaligned -- try for a straight load. */
+allow_fail = 1;
+tmp = *(volatile int *)p;
+#endif
+
+assert(allow_fail);
+
+/*
+ * We didn't see a signal.
+ * We might as well validate the unaligned load worked.
+ */
+if (BYTE_ORDER == LITTLE_ENDIAN) {
+assert(tmp == 0x55443322);
+} else {
+assert(tmp == 0x77665544);
+}
+return EXIT_SUCCESS;
+}
-- 
2.25.1




[PATCH v4 45/48] tcg/s390: Support raising sigbus for user-only

2021-10-12 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target.h |  2 --
 tcg/s390x/tcg-target.c.inc | 59 --
 2 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 527ada0f63..69217d995b 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -178,9 +178,7 @@ static inline void tb_target_set_jmp_target(uintptr_t 
tc_ptr, uintptr_t jmp_rx,
 /* no need to flush icache explicitly */
 }
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #endif
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 8938c446c8..bc6a13d797 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -29,6 +29,7 @@
 #error "unsupported code generation mode"
 #endif
 
+#include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 #include "elf.h"
 
@@ -136,6 +137,7 @@ typedef enum S390Opcode {
 RI_OIHL = 0xa509,
 RI_OILH = 0xa50a,
 RI_OILL = 0xa50b,
+RI_TMLL = 0xa701,
 
 RIE_CGIJ= 0xec7c,
 RIE_CGRJ= 0xec64,
@@ -1804,8 +1806,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp 
opc, TCGReg data,
 }
 
 #if defined(CONFIG_SOFTMMU)
-#include "../tcg-ldst.c.inc"
-
 /* We're expecting to use a 20-bit negative offset on the tlb memory ops.  */
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
@@ -1942,6 +1942,53 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *lb)
 return true;
 }
 #else
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld,
+   TCGReg addrlo, unsigned a_bits)
+{
+unsigned a_mask = (1 << a_bits) - 1;
+TCGLabelQemuLdst *l = new_ldst_label(s);
+
+l->is_ld = is_ld;
+l->addrlo_reg = addrlo;
+
+/* We are expecting a_bits to max out at 7, much lower than TMLL. */
+tcg_debug_assert(a_bits < 16);
+tcg_out_insn(s, RI, TMLL, addrlo, a_mask);
+
+tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
+l->label_ptr[0] = s->code_ptr;
+s->code_ptr += 1;
+
+l->raddr = tcg_splitwx_to_rx(s->code_ptr);
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+if (!patch_reloc(l->label_ptr[0], R_390_PC16DBL,
+ (intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
+return false;
+}
+
+tcg_out_mov(s, TCG_TYPE_TL, TCG_REG_R3, l->addrlo_reg);
+tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
+
+/* "Tail call" to the helper, with the return address back inline. */
+tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R14, (uintptr_t)l->raddr);
+tgen_gotoi(s, S390_CC_ALWAYS, (const void *)(l->is_ld ? helper_unaligned_ld
+ : helper_unaligned_st));
+return true;
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
 static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
   TCGReg *index_reg, tcg_target_long *disp)
 {
@@ -1980,7 +2027,11 @@ static void tcg_out_qemu_ld(TCGContext* s, TCGReg 
data_reg, TCGReg addr_reg,
 #else
 TCGReg index_reg;
 tcg_target_long disp;
+unsigned a_bits = get_alignment_bits(opc);
 
+if (a_bits) {
+tcg_out_test_alignment(s, true, addr_reg, a_bits);
+}
  tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
 tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, index_reg, disp);
 #endif
@@ -2007,7 +2058,11 @@ static void tcg_out_qemu_st(TCGContext* s, TCGReg 
data_reg, TCGReg addr_reg,
 #else
 TCGReg index_reg;
 tcg_target_long disp;
+unsigned a_bits = get_alignment_bits(opc);
 
+if (a_bits) {
+tcg_out_test_alignment(s, false, addr_reg, a_bits);
+}
  tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
 tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, index_reg, disp);
 #endif
-- 
2.25.1




Re: [PATCH v4 09/48] target/ppc: Restrict ppc_cpu_do_unaligned_access to sysemu

2021-10-12 Thread Warner Losh
On Tue, Oct 12, 2021 at 8:52 PM Richard Henderson <
richard.hender...@linaro.org> wrote:

> This is not used by, nor required by, user-only.
>
> Signed-off-by: Richard Henderson 
> ---
>  target/ppc/internal.h| 8 +++-
>  target/ppc/excp_helper.c | 8 +++-
>  2 files changed, 6 insertions(+), 10 deletions(-)
>

Reviewed-by: Warner Losh 


> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> index 339974b7d8..6aa9484f34 100644
> --- a/target/ppc/internal.h
> +++ b/target/ppc/internal.h
> @@ -211,11 +211,6 @@ void helper_compute_fprf_float16(CPUPPCState *env,
> float16 arg);
>  void helper_compute_fprf_float32(CPUPPCState *env, float32 arg);
>  void helper_compute_fprf_float128(CPUPPCState *env, float128 arg);
>
> -/* Raise a data fault alignment exception for the specified virtual
> address */
> -void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
> - MMUAccessType access_type, int mmu_idx,
> - uintptr_t retaddr) QEMU_NORETURN;
> -
>  /* translate.c */
>
>  int ppc_fixup_cpu(PowerPCCPU *cpu);
> @@ -291,6 +286,9 @@ void ppc_cpu_record_sigsegv(CPUState *cs, vaddr addr,
>  bool ppc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>MMUAccessType access_type, int mmu_idx,
>bool probe, uintptr_t retaddr);
> +void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
> + MMUAccessType access_type, int mmu_idx,
> + uintptr_t retaddr) QEMU_NORETURN;
>  #endif
>
>  #endif /* PPC_INTERNAL_H */
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index e568a54536..17607adbe4 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -1454,11 +1454,8 @@ void helper_book3s_msgsndp(CPUPPCState *env,
> target_ulong rb)
>
>  book3s_msgsnd_common(pir, PPC_INTERRUPT_DOORBELL);
>  }
> -#endif
> -#endif /* CONFIG_TCG */
> -#endif
> +#endif /* TARGET_PPC64 */
>
> -#ifdef CONFIG_TCG
>  void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
>   MMUAccessType access_type,
>   int mmu_idx, uintptr_t retaddr)
> @@ -1483,4 +1480,5 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr
> vaddr,
>  env->error_code = 0;
>  cpu_loop_exit_restore(cs, retaddr);
>  }
> -#endif
> +#endif /* CONFIG_TCG */
> +#endif /* !CONFIG_USER_ONLY */
> --
> 2.25.1
>
>
>


[PATCH v4 42/48] tcg/i386: Support raising sigbus for user-only

2021-10-12 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.h |   2 -
 tcg/i386/tcg-target.c.inc | 103 --
 2 files changed, 98 insertions(+), 7 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index b00a6da293..3b2c9437a0 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -232,9 +232,7 @@ static inline void tb_target_set_jmp_target(uintptr_t 
tc_ptr, uintptr_t jmp_rx,
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP  have_movbe
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #endif
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 84b109bb84..e073868d8f 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 
+#include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 
 #ifdef CONFIG_DEBUG_TCG
@@ -421,8 +422,9 @@ static bool tcg_target_const_match(int64_t val, TCGType 
type, int ct)
 #define OPC_VZEROUPPER  (0x77 | P_EXT)
 #define OPC_XCHG_ax_r32(0x90)
 
-#define OPC_GRP3_Ev(0xf7)
-#define OPC_GRP5   (0xff)
+#define OPC_GRP3_Eb (0xf6)
+#define OPC_GRP3_Ev (0xf7)
+#define OPC_GRP5(0xff)
 #define OPC_GRP14   (0x73 | P_EXT | P_DATA16)
 
 /* Group 1 opcode extensions for 0x80-0x83.
@@ -444,6 +446,7 @@ static bool tcg_target_const_match(int64_t val, TCGType 
type, int ct)
 #define SHIFT_SAR 7
 
 /* Group 3 opcode extensions for 0xf6, 0xf7.  To be used with OPC_GRP3.  */
+#define EXT3_TESTi 0
 #define EXT3_NOT   2
 #define EXT3_NEG   3
 #define EXT3_MUL   4
@@ -1606,8 +1609,6 @@ static void tcg_out_nopn(TCGContext *s, int n)
 }
 
 #if defined(CONFIG_SOFTMMU)
-#include "../tcg-ldst.c.inc"
-
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  * int mmu_idx, uintptr_t ra)
  */
@@ -1916,7 +1917,84 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *l)
 tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
 return true;
 }
-#elif TCG_TARGET_REG_BITS == 32
+#else
+
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
+   TCGReg addrhi, unsigned a_bits)
+{
+unsigned a_mask = (1 << a_bits) - 1;
+TCGLabelQemuLdst *label;
+
+/*
+ * We are expecting a_bits to max out at 7, so we can usually use testb.
+ * For i686, we have to use testl for %esi/%edi.
+ */
+if (a_mask <= 0xff && (TCG_TARGET_REG_BITS == 64 || addrlo < 4)) {
+tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, addrlo);
+tcg_out8(s, a_mask);
+} else {
+tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, addrlo);
+tcg_out32(s, a_mask);
+}
+
+/* jne slow_path */
+tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+
+label = new_ldst_label(s);
+label->is_ld = is_ld;
+label->addrlo_reg = addrlo;
+label->addrhi_reg = addrhi;
+label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4);
+label->label_ptr[0] = s->code_ptr;
+
+s->code_ptr += 4;
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+/* resolve label address */
+tcg_patch32(l->label_ptr[0], s->code_ptr - l->label_ptr[0] - 4);
+
+if (TCG_TARGET_REG_BITS == 32) {
+int ofs = 0;
+
+tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
+ofs += 4;
+
+tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
+ofs += 4;
+if (TARGET_LONG_BITS == 64) {
+tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
+ofs += 4;
+}
+
+tcg_out_pushi(s, (uintptr_t)l->raddr);
+} else {
+tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
+l->addrlo_reg);
+tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+
+tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RAX, (uintptr_t)l->raddr);
+tcg_out_push(s, TCG_REG_RAX);
+}
+
+/* "Tail call" to the helper, with the return address back inline. */
+tcg_out_jmp(s, (const void *)(l->is_ld ? helper_unaligned_ld
+  : helper_unaligned_st));
+return true;
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+#if TCG_TARGET_REG_BITS == 32
 # define x86_guest_base_seg 0
 # define x86_guest_base_index   -1
 # define x86_guest_base_offset  guest_base
@@ -1950,6 +2028,7 @@ static inline int setup_guest_base_seg(void)
 return 0;
 }
 # endif
+#endif
 #endif /* SOFTMMU */
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
@@ -2059,6 +2138,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg 

[PATCH v4 47/48] tcg/riscv: Support raising sigbus for user-only

2021-10-12 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/riscv/tcg-target.h |  2 --
 tcg/riscv/tcg-target.c.inc | 63 --
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index ef78b99e98..11c9b3e4f4 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -165,9 +165,7 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 0
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 9b13a46fb4..49e84cbe13 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -27,6 +27,7 @@
  * THE SOFTWARE.
  */
 
+#include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 
 #ifdef CONFIG_DEBUG_TCG
@@ -847,8 +848,6 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
  */
 
 #if defined(CONFIG_SOFTMMU)
-#include "../tcg-ldst.c.inc"
-
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  * MemOpIdx oi, uintptr_t ra)
  */
@@ -1053,6 +1052,54 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *l)
 tcg_out_goto(s, l->raddr);
 return true;
 }
+#else
+
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
+   unsigned a_bits)
+{
+unsigned a_mask = (1 << a_bits) - 1;
+TCGLabelQemuLdst *l = new_ldst_label(s);
+
+l->is_ld = is_ld;
+l->addrlo_reg = addr_reg;
+
+/* We are expecting a_bits to max out at 7, so we can always use andi. */
+tcg_debug_assert(a_bits < 12);
+tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
+
+l->label_ptr[0] = s->code_ptr;
+tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
+
+l->raddr = tcg_splitwx_to_rx(s->code_ptr);
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+/* resolve label address */
+if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
+return false;
+}
+
+tcg_out_mov(s, TCG_TYPE_TL, TCG_REG_A1, l->addrlo_reg);
+tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
+
+/* tail call, with the return address back inline. */
+tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RA, (uintptr_t)l->raddr);
+tcg_out_call_int(s, (const void *)(l->is_ld ? helper_unaligned_ld
+   : helper_unaligned_st), true);
+return true;
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
 #endif /* CONFIG_SOFTMMU */
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
@@ -1108,6 +1155,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg 
*args, bool is_64)
 MemOp opc;
 #if defined(CONFIG_SOFTMMU)
 tcg_insn_unit *label_ptr[1];
+#else
+unsigned a_bits;
 #endif
 TCGReg base = TCG_REG_TMP0;
 
@@ -1130,6 +1179,10 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg 
*args, bool is_64)
 tcg_out_ext32u(s, base, addr_regl);
 addr_regl = base;
 }
+a_bits = get_alignment_bits(opc);
+if (a_bits) {
+tcg_out_test_alignment(s, true, addr_regl, a_bits);
+}
 if (guest_base != 0) {
 tcg_out_opc_reg(s, OPC_ADD, base, TCG_GUEST_BASE_REG, addr_regl);
 }
@@ -1174,6 +1227,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg 
*args, bool is_64)
 MemOp opc;
 #if defined(CONFIG_SOFTMMU)
 tcg_insn_unit *label_ptr[1];
+#else
+unsigned a_bits;
 #endif
 TCGReg base = TCG_REG_TMP0;
 
@@ -1196,6 +1251,10 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg 
*args, bool is_64)
 tcg_out_ext32u(s, base, addr_regl);
 addr_regl = base;
 }
+a_bits = get_alignment_bits(opc);
+if (a_bits) {
+tcg_out_test_alignment(s, false, addr_regl, a_bits);
+}
 if (guest_base != 0) {
 tcg_out_opc_reg(s, OPC_ADD, base, TCG_GUEST_BASE_REG, addr_regl);
 }
-- 
2.25.1




[PATCH v4 44/48] tcg/ppc: Support raising sigbus for user-only

2021-10-12 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.h |  2 -
 tcg/ppc/tcg-target.c.inc | 98 
 2 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 0943192cde..c775c97b61 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -182,9 +182,7 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #endif
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 3e4ca2be88..8a117e0665 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -24,6 +24,7 @@
 
 #include "elf.h"
 #include "../tcg-pool.c.inc"
+#include "../tcg-ldst.c.inc"
 
 /*
  * Standardize on the _CALL_FOO symbols used by GCC:
@@ -1881,7 +1882,8 @@ void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t 
jmp_rx,
 }
 }
 
-static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
+static void tcg_out_call_int(TCGContext *s, int lk,
+ const tcg_insn_unit *target)
 {
 #ifdef _CALL_AIX
 /* Look through the descriptor.  If the branch is in range, and we
@@ -1892,7 +1894,7 @@ static void tcg_out_call(TCGContext *s, const 
tcg_insn_unit *target)
 
 if (in_range_b(diff) && toc == (uint32_t)toc) {
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP1, toc);
-tcg_out_b(s, LK, tgt);
+tcg_out_b(s, lk, tgt);
 } else {
 /* Fold the low bits of the constant into the addresses below.  */
 intptr_t arg = (intptr_t)target;
@@ -1907,7 +1909,7 @@ static void tcg_out_call(TCGContext *s, const 
tcg_insn_unit *target)
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R0, TCG_REG_TMP1, ofs);
 tcg_out32(s, MTSPR | RA(TCG_REG_R0) | CTR);
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_REG_TMP1, ofs + SZP);
-tcg_out32(s, BCCTR | BO_ALWAYS | LK);
+tcg_out32(s, BCCTR | BO_ALWAYS | lk);
 }
 #elif defined(_CALL_ELF) && _CALL_ELF == 2
 intptr_t diff;
@@ -1921,16 +1923,21 @@ static void tcg_out_call(TCGContext *s, const 
tcg_insn_unit *target)
 
 diff = tcg_pcrel_diff(s, target);
 if (in_range_b(diff)) {
-tcg_out_b(s, LK, target);
+tcg_out_b(s, lk, target);
 } else {
 tcg_out32(s, MTSPR | RS(TCG_REG_R12) | CTR);
-tcg_out32(s, BCCTR | BO_ALWAYS | LK);
+tcg_out32(s, BCCTR | BO_ALWAYS | lk);
 }
 #else
-tcg_out_b(s, LK, target);
+tcg_out_b(s, lk, target);
 #endif
 }
 
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
+{
+tcg_out_call_int(s, LK, target);
+}
+
 static const uint32_t qemu_ldx_opc[(MO_SSIZE + MO_BSWAP) + 1] = {
 [MO_UB] = LBZX,
 [MO_UW] = LHZX,
@@ -1960,8 +1967,6 @@ static const uint32_t qemu_exts_opc[4] = {
 };
 
 #if defined (CONFIG_SOFTMMU)
-#include "../tcg-ldst.c.inc"
-
 /* helper signature: helper_ld_mmu(CPUState *env, target_ulong addr,
  * int mmu_idx, uintptr_t ra)
  */
@@ -2227,6 +2232,71 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *lb)
 tcg_out_b(s, 0, lb->raddr);
 return true;
 }
+#else
+
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
+   TCGReg addrhi, unsigned a_bits)
+{
+unsigned a_mask = (1 << a_bits) - 1;
+TCGLabelQemuLdst *label = new_ldst_label(s);
+
+label->is_ld = is_ld;
+label->addrlo_reg = addrlo;
+label->addrhi_reg = addrhi;
+
+/* We are expecting a_bits to max out at 7, much lower than ANDI. */
+tcg_debug_assert(a_bits < 16);
+tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, a_mask));
+
+label->label_ptr[0] = s->code_ptr;
+tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
+
+label->raddr = tcg_splitwx_to_rx(s->code_ptr);
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+if (!reloc_pc14(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
+return false;
+}
+
+if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
+TCGReg arg = TCG_REG_R4;
+#ifdef TCG_TARGET_CALL_ALIGN_ARGS
+arg |= 1;
+#endif
+if (l->addrlo_reg != arg) {
+tcg_out_mov(s, TCG_TYPE_I32, arg, l->addrhi_reg);
+tcg_out_mov(s, TCG_TYPE_I32, arg + 1, l->addrlo_reg);
+} else if (l->addrhi_reg != arg + 1) {
+tcg_out_mov(s, TCG_TYPE_I32, arg + 1, l->addrlo_reg);
+tcg_out_mov(s, TCG_TYPE_I32, arg, l->addrhi_reg);
+} else {
+tcg_out_mov(s, TCG_TYPE_I32, TCG_REG_R0, arg);
+tcg_out_mov(s, TCG_TYPE_I32, arg, arg + 1);
+tcg_out_mov(s, TCG_TYPE_I32, arg + 1, TCG_REG_R0);
+}
+} else {
+tcg_out_mov(s, TCG_TYPE_TL, TCG_REG_R4, l->addrlo_reg);
+}
+

[PATCH v4 41/48] tcg: Canonicalize alignment flags in MemOp

2021-10-12 Thread Richard Henderson
Having observed e.g. al8+leq in dumps, canonicalize to al+leq.

Signed-off-by: Richard Henderson 
---
 tcg/tcg-op.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index b1cfd36f29..61b492d89f 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2765,7 +2765,12 @@ void tcg_gen_lookup_and_goto_ptr(void)
 static inline MemOp tcg_canonicalize_memop(MemOp op, bool is64, bool st)
 {
 /* Trigger the asserts within as early as possible.  */
-(void)get_alignment_bits(op);
+unsigned a_bits = get_alignment_bits(op);
+
+/* Prefer MO_ALIGN+MO_XX over MO_ALIGN_XX+MO_XX */
+if (a_bits == (op & MO_SIZE)) {
+op = (op & ~MO_AMASK) | MO_ALIGN;
+}
 
 switch (op & MO_SIZE) {
 case MO_8:
-- 
2.25.1
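[Editor's aside, not part of the patch: the rewrite rule above can be sketched in isolation. The MemOp constants below are simplified stand-ins that mirror QEMU's encoding (access size in the low bits, alignment in a three-bit field, all-ones meaning "natural" alignment); they are assumptions for illustration, not the real qemu/memop.h definitions.]

```c
#include <assert.h>

/* Simplified stand-ins for QEMU's MemOp encoding (assumed for
 * illustration; see qemu/memop.h for the real definitions). */
enum {
    MO_8 = 0, MO_16 = 1, MO_32 = 2, MO_64 = 3,
    MO_SIZE    = 3,
    MO_ASHIFT  = 4,
    MO_AMASK   = 7 << MO_ASHIFT,
    MO_ALIGN_8 = 3 << MO_ASHIFT,   /* explicit 8-byte alignment */
    MO_ALIGN   = MO_AMASK,         /* natural alignment */
};

/* Alignment as log2 bytes: all-ones means "natural", i.e. the size. */
static unsigned get_alignment_bits(unsigned op)
{
    unsigned a = op & MO_AMASK;
    return a == MO_ALIGN ? (op & MO_SIZE) : a >> MO_ASHIFT;
}

/* The patch's rule: prefer MO_ALIGN+MO_XX over MO_ALIGN_XX+MO_XX,
 * so e.g. al8+leq in dumps becomes al+leq. */
static unsigned canonicalize(unsigned op)
{
    if (get_alignment_bits(op) == (op & MO_SIZE)) {
        op = (op & ~MO_AMASK) | MO_ALIGN;
    }
    return op;
}
```

With these stand-ins, `canonicalize(MO_64 | MO_ALIGN_8)` yields `MO_64 | MO_ALIGN` (8-byte alignment on a 64-bit access is natural), while `MO_32 | MO_ALIGN_8` is stricter than natural and is left untouched.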




[PATCH v4 39/48] target/sh4: Implement prctl_unalign_sigbus

2021-10-12 Thread Richard Henderson
Leave TARGET_ALIGNED_ONLY set, but use the new CPUState
flag to set MO_UNALN for the instructions that the kernel
handles in the unaligned trap.

The Linux kernel does not handle all memory operations: no
floating-point and no MAC.

Signed-off-by: Richard Henderson 
---
 linux-user/sh4/target_prctl.h |  2 +-
 target/sh4/cpu.h  |  4 +++
 target/sh4/translate.c| 50 ---
 3 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/linux-user/sh4/target_prctl.h b/linux-user/sh4/target_prctl.h
index eb53b31ad5..5629ddbf39 100644
--- a/linux-user/sh4/target_prctl.h
+++ b/linux-user/sh4/target_prctl.h
@@ -1 +1 @@
-/* No special prctl support required. */
+#include "../generic/target_prctl_unalign.h"
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index 4cfb109f56..fb9dd9db2f 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -83,6 +83,7 @@
 #define DELAY_SLOT_RTE (1 << 2)
 
 #define TB_FLAG_PENDING_MOVCA  (1 << 3)
+#define TB_FLAG_UNALIGN(1 << 4)
 
 #define GUSA_SHIFT 4
 #ifdef CONFIG_USER_ONLY
@@ -373,6 +374,9 @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, 
target_ulong *pc,
 | (env->sr & ((1u << SR_MD) | (1u << SR_RB)))  /* Bits 29-30 */
 | (env->sr & (1u << SR_FD))/* Bit 15 */
 | (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 3 */
+#ifdef CONFIG_USER_ONLY
+*flags |= TB_FLAG_UNALIGN * !env_cpu(env)->prctl_unalign_sigbus;
+#endif
 }
 
 #endif /* SH4_CPU_H */
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index d363050272..7965db586f 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -50,8 +50,10 @@ typedef struct DisasContext {
 
 #if defined(CONFIG_USER_ONLY)
 #define IS_USER(ctx) 1
+#define UNALIGN(C)   (ctx->tbflags & TB_FLAG_UNALIGN ? MO_UNALN : 0)
 #else
 #define IS_USER(ctx) (!(ctx->tbflags & (1u << SR_MD)))
+#define UNALIGN(C)   0
 #endif
 
 /* Target-specific values for ctx->base.is_jmp.  */
@@ -499,7 +501,8 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv addr = tcg_temp_new();
tcg_gen_addi_i32(addr, REG(B11_8), B3_0 * 4);
-tcg_gen_qemu_st_i32(REG(B7_4), addr, ctx->memidx, MO_TEUL);
+tcg_gen_qemu_st_i32(REG(B7_4), addr, ctx->memidx,
+MO_TEUL | UNALIGN(ctx));
tcg_temp_free(addr);
}
return;
@@ -507,7 +510,8 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv addr = tcg_temp_new();
tcg_gen_addi_i32(addr, REG(B7_4), B3_0 * 4);
-tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESL);
+tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
+MO_TESL | UNALIGN(ctx));
tcg_temp_free(addr);
}
return;
@@ -562,19 +566,23 @@ static void _decode_opc(DisasContext * ctx)
 tcg_gen_qemu_st_i32(REG(B7_4), REG(B11_8), ctx->memidx, MO_UB);
return;
 case 0x2001:   /* mov.w Rm,@Rn */
-tcg_gen_qemu_st_i32(REG(B7_4), REG(B11_8), ctx->memidx, MO_TEUW);
+tcg_gen_qemu_st_i32(REG(B7_4), REG(B11_8), ctx->memidx,
+MO_TEUW | UNALIGN(ctx));
return;
 case 0x2002:   /* mov.l Rm,@Rn */
-tcg_gen_qemu_st_i32(REG(B7_4), REG(B11_8), ctx->memidx, MO_TEUL);
+tcg_gen_qemu_st_i32(REG(B7_4), REG(B11_8), ctx->memidx,
+MO_TEUL | UNALIGN(ctx));
return;
 case 0x6000:   /* mov.b @Rm,Rn */
 tcg_gen_qemu_ld_i32(REG(B11_8), REG(B7_4), ctx->memidx, MO_SB);
return;
 case 0x6001:   /* mov.w @Rm,Rn */
-tcg_gen_qemu_ld_i32(REG(B11_8), REG(B7_4), ctx->memidx, MO_TESW);
+tcg_gen_qemu_ld_i32(REG(B11_8), REG(B7_4), ctx->memidx,
+MO_TESW | UNALIGN(ctx));
return;
 case 0x6002:   /* mov.l @Rm,Rn */
-tcg_gen_qemu_ld_i32(REG(B11_8), REG(B7_4), ctx->memidx, MO_TESL);
+tcg_gen_qemu_ld_i32(REG(B11_8), REG(B7_4), ctx->memidx,
+MO_TESL | UNALIGN(ctx));
return;
 case 0x2004:   /* mov.b Rm,@-Rn */
{
@@ -590,7 +598,8 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv addr = tcg_temp_new();
tcg_gen_subi_i32(addr, REG(B11_8), 2);
-tcg_gen_qemu_st_i32(REG(B7_4), addr, ctx->memidx, MO_TEUW);
+tcg_gen_qemu_st_i32(REG(B7_4), addr, ctx->memidx,
+MO_TEUW | UNALIGN(ctx));
tcg_gen_mov_i32(REG(B11_8), addr);
tcg_temp_free(addr);
}
@@ -599,7 +608,8 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv addr = tcg_temp_new();
tcg_gen_subi_i32(addr, REG(B11_8), 4);
-tcg_gen_qemu_st_i32(REG(B7_4), addr, ctx->memidx, MO_TEUL);
+

[PATCH v4 46/48] tcg/tci: Support raising sigbus for user-only

2021-10-12 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/tci.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/tcg/tci.c b/tcg/tci.c
index e76087ccac..92a7c81674 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -292,11 +292,11 @@ static bool tci_compare64(uint64_t u0, uint64_t u1, 
TCGCond condition)
 static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong taddr,
 MemOpIdx oi, const void *tb_ptr)
 {
-MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
+MemOp mop = get_memop(oi);
 uintptr_t ra = (uintptr_t)tb_ptr;
 
 #ifdef CONFIG_SOFTMMU
-switch (mop) {
+switch (mop & (MO_BSWAP | MO_SSIZE)) {
 case MO_UB:
 return helper_ret_ldub_mmu(env, taddr, oi, ra);
 case MO_SB:
@@ -326,10 +326,14 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
target_ulong taddr,
 }
 #else
 void *haddr = g2h(env_cpu(env), taddr);
+unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
 uint64_t ret;
 
 set_helper_retaddr(ra);
-switch (mop) {
+if (taddr & a_mask) {
+helper_unaligned_ld(env, taddr);
+}
+switch (mop & (MO_BSWAP | MO_SSIZE)) {
 case MO_UB:
 ret = ldub_p(haddr);
 break;
@@ -377,11 +381,11 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
target_ulong taddr,
 static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
 MemOpIdx oi, const void *tb_ptr)
 {
-MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
+MemOp mop = get_memop(oi);
 uintptr_t ra = (uintptr_t)tb_ptr;
 
 #ifdef CONFIG_SOFTMMU
-switch (mop) {
+switch (mop & (MO_BSWAP | MO_SIZE)) {
 case MO_UB:
 helper_ret_stb_mmu(env, taddr, val, oi, ra);
 break;
@@ -408,9 +412,13 @@ static void tci_qemu_st(CPUArchState *env, target_ulong 
taddr, uint64_t val,
 }
 #else
 void *haddr = g2h(env_cpu(env), taddr);
+unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
 
 set_helper_retaddr(ra);
-switch (mop) {
+if (taddr & a_mask) {
+helper_unaligned_st(env, taddr);
+}
+switch (mop & (MO_BSWAP | MO_SIZE)) {
 case MO_UB:
 stb_p(haddr, val);
 break;
-- 
2.25.1
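[Editor's aside: the user-only fast path added here reduces to a mask test. The sketch below — function name invented, not QEMU code — shows the arithmetic that decides whether helper_unaligned_ld/st would fire for a given address.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* a_bits is the required alignment as a power of two, as returned by
 * get_alignment_bits(); the access is misaligned exactly when any of
 * the low a_bits of the address are set.  Illustrative only. */
static bool would_raise_sigbus(uint64_t taddr, unsigned a_bits)
{
    uint64_t a_mask = (1u << a_bits) - 1;
    return (taddr & a_mask) != 0;
}
```

Note that with a_bits == 0 (MO_UNALN) the mask is zero and no address can trap, which is why the test can be applied unconditionally on the fast path.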




[PATCH v4 40/48] linux-user/signal: Handle BUS_ADRALN in host_signal_handler

2021-10-12 Thread Richard Henderson
Handle BUS_ADRALN via cpu_loop_exit_sigbus, but allow other SIGBUS
si_codes to continue into the host-to-guest signal conversion code.

Signed-off-by: Richard Henderson 
---
 linux-user/signal.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index df2c8678d0..81c45bfce9 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -860,6 +860,9 @@ static void host_signal_handler(int host_sig, siginfo_t 
*info, void *puc)
 cpu_loop_exit_sigsegv(cpu, guest_addr, access_type, maperr, pc);
 } else {
 sigprocmask(SIG_SETMASK, &uc->uc_sigmask, NULL);
+if (info->si_code == BUS_ADRALN) {
+cpu_loop_exit_sigbus(cpu, guest_addr, access_type, pc);
+}
 }
 
 sync_sig = true;
-- 
2.25.1




[PATCH v4 43/48] tcg/aarch64: Support raising sigbus for user-only

2021-10-12 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  2 -
 tcg/aarch64/tcg-target.c.inc | 91 +---
 2 files changed, 74 insertions(+), 19 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 7a93ac8023..876af589ce 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -151,9 +151,7 @@ typedef enum {
 
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
-#ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
-#endif
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #endif /* AARCH64_TCG_TARGET_H */
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 5edca8d44d..1f205f90b2 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -10,6 +10,7 @@
  * See the COPYING file in the top-level directory for details.
  */
 
+#include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 #include "qemu/bitops.h"
 
@@ -443,6 +444,7 @@ typedef enum {
 I3404_ANDI  = 0x1200,
 I3404_ORRI  = 0x3200,
 I3404_EORI  = 0x5200,
+I3404_ANDSI = 0x7200,
 
 /* Move wide immediate instructions.  */
 I3405_MOVN  = 0x1280,
@@ -1328,8 +1330,9 @@ static void tcg_out_goto_long(TCGContext *s, const 
tcg_insn_unit *target)
 if (offset == sextract64(offset, 0, 26)) {
 tcg_out_insn(s, 3206, B, offset);
 } else {
-tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, (intptr_t)target);
-tcg_out_insn(s, 3207, BR, TCG_REG_TMP);
+/* Choose X9 as a call-clobbered non-LR temporary. */
+tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_X9, (intptr_t)target);
+tcg_out_insn(s, 3207, BR, TCG_REG_X9);
 }
 }
 
@@ -1541,9 +1544,14 @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, 
TCGReg d,
 }
 }
 
-#ifdef CONFIG_SOFTMMU
-#include "../tcg-ldst.c.inc"
+static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
+{
+ptrdiff_t offset = tcg_pcrel_diff(s, target);
+tcg_debug_assert(offset == sextract64(offset, 0, 21));
+tcg_out_insn(s, 3406, ADR, rd, offset);
+}
 
+#ifdef CONFIG_SOFTMMU
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  * MemOpIdx oi, uintptr_t ra)
  */
@@ -1577,13 +1585,6 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
 #endif
 };
 
-static inline void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
-{
-ptrdiff_t offset = tcg_pcrel_diff(s, target);
-tcg_debug_assert(offset == sextract64(offset, 0, 21));
-tcg_out_insn(s, 3406, ADR, rd, offset);
-}
-
 static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
 {
 MemOpIdx oi = lb->oi;
@@ -1714,15 +1715,58 @@ static void tcg_out_tlb_read(TCGContext *s, TCGReg 
addr_reg, MemOp opc,
 tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
 }
 
+#else
+static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
+   unsigned a_bits)
+{
+unsigned a_mask = (1 << a_bits) - 1;
+TCGLabelQemuLdst *label = new_ldst_label(s);
+
+label->is_ld = is_ld;
+label->addrlo_reg = addr_reg;
+
+/* tst addr, #mask */
+tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
+
+label->label_ptr[0] = s->code_ptr;
+
+/* b.ne slow_path */
+tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
+
+label->raddr = tcg_splitwx_to_rx(s->code_ptr);
+}
+
+static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
+{
+if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
+return false;
+}
+
+tcg_out_mov(s, TCG_TYPE_TL, TCG_REG_X1, l->addrlo_reg);
+tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
+
+/* "Tail call" to the helper, with the return address back inline. */
+tcg_out_adr(s, TCG_REG_LR, l->raddr);
+tcg_out_goto_long(s, (const void *)(l->is_ld ? helper_unaligned_ld
+: helper_unaligned_st));
+return true;
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+return tcg_out_fail_alignment(s, l);
+}
 #endif /* CONFIG_SOFTMMU */
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
TCGReg data_r, TCGReg addr_r,
TCGType otype, TCGReg off_r)
 {
-/* Byte swapping is left to middle-end expansion. */
-tcg_debug_assert((memop & MO_BSWAP) == 0);
-
 switch (memop & MO_SSIZE) {
 case MO_UB:
 tcg_out_ldst_r(s, I3312_LDRB, data_r, addr_r, otype, off_r);
@@ -1756,9 +1800,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp 
memop,
TCGReg data_r, TCGReg addr_r,
TCGType otype, TCGReg off_r)
 {
-

[PATCH v4 35/48] target/alpha: Reorg fp memory operations

2021-10-12 Thread Richard Henderson
Pass in the context to each mini-helper, instead of an
incorrectly named "flags".  Separate gen_load_fp and
gen_store_fp, away from the integer helpers.

Signed-off-by: Richard Henderson 
---
 target/alpha/translate.c | 83 +++-
 1 file changed, 57 insertions(+), 26 deletions(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index b034206688..bfdd485508 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -267,30 +267,47 @@ static inline DisasJumpType gen_invalid(DisasContext *ctx)
 return gen_excp(ctx, EXCP_OPCDEC, 0);
 }
 
-static inline void gen_qemu_ldf(TCGv t0, TCGv t1, int flags)
+static void gen_ldf(DisasContext *ctx, TCGv dest, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
-tcg_gen_qemu_ld_i32(tmp32, t1, flags, MO_LEUL);
-gen_helper_memory_to_f(t0, tmp32);
+tcg_gen_qemu_ld_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
+gen_helper_memory_to_f(dest, tmp32);
 tcg_temp_free_i32(tmp32);
 }
 
-static inline void gen_qemu_ldg(TCGv t0, TCGv t1, int flags)
+static void gen_ldg(DisasContext *ctx, TCGv dest, TCGv addr)
 {
 TCGv tmp = tcg_temp_new();
-tcg_gen_qemu_ld_i64(tmp, t1, flags, MO_LEQ);
-gen_helper_memory_to_g(t0, tmp);
+tcg_gen_qemu_ld_i64(tmp, addr, ctx->mem_idx, MO_LEQ);
+gen_helper_memory_to_g(dest, tmp);
 tcg_temp_free(tmp);
 }
 
-static inline void gen_qemu_lds(TCGv t0, TCGv t1, int flags)
+static void gen_lds(DisasContext *ctx, TCGv dest, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
-tcg_gen_qemu_ld_i32(tmp32, t1, flags, MO_LEUL);
-gen_helper_memory_to_s(t0, tmp32);
+tcg_gen_qemu_ld_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
+gen_helper_memory_to_s(dest, tmp32);
 tcg_temp_free_i32(tmp32);
 }
 
+static void gen_ldt(DisasContext *ctx, TCGv dest, TCGv addr)
+{
+tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_LEQ);
+}
+
+static void gen_load_fp(DisasContext *ctx, int ra, int rb, int32_t disp16,
+void (*func)(DisasContext *, TCGv, TCGv))
+{
+/* Loads to $f31 are prefetches, which we can treat as nops. */
+if (likely(ra != 31)) {
+TCGv addr = tcg_temp_new();
+tcg_gen_addi_i64(addr, load_gpr(ctx, rb), disp16);
+func(ctx, cpu_fir[ra], addr);
+tcg_temp_free(addr);
+}
+}
+
 static inline void gen_qemu_ldl_l(TCGv t0, TCGv t1, int flags)
 {
 tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LESL);
@@ -338,30 +355,44 @@ static inline void gen_load_mem(DisasContext *ctx,
 tcg_temp_free(tmp);
 }
 
-static inline void gen_qemu_stf(TCGv t0, TCGv t1, int flags)
+static void gen_stf(DisasContext *ctx, TCGv src, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
-gen_helper_f_to_memory(tmp32, t0);
-tcg_gen_qemu_st_i32(tmp32, t1, flags, MO_LEUL);
+gen_helper_f_to_memory(tmp32, src);
+tcg_gen_qemu_st_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
 tcg_temp_free_i32(tmp32);
 }
 
-static inline void gen_qemu_stg(TCGv t0, TCGv t1, int flags)
+static void gen_stg(DisasContext *ctx, TCGv src, TCGv addr)
 {
 TCGv tmp = tcg_temp_new();
-gen_helper_g_to_memory(tmp, t0);
-tcg_gen_qemu_st_i64(tmp, t1, flags, MO_LEQ);
+gen_helper_g_to_memory(tmp, src);
+tcg_gen_qemu_st_i64(tmp, addr, ctx->mem_idx, MO_LEQ);
 tcg_temp_free(tmp);
 }
 
-static inline void gen_qemu_sts(TCGv t0, TCGv t1, int flags)
+static void gen_sts(DisasContext *ctx, TCGv src, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
-gen_helper_s_to_memory(tmp32, t0);
-tcg_gen_qemu_st_i32(tmp32, t1, flags, MO_LEUL);
+gen_helper_s_to_memory(tmp32, src);
+tcg_gen_qemu_st_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
 tcg_temp_free_i32(tmp32);
 }
 
+static void gen_stt(DisasContext *ctx, TCGv src, TCGv addr)
+{
+tcg_gen_qemu_st_i64(src, addr, ctx->mem_idx, MO_LEQ);
+}
+
+static void gen_store_fp(DisasContext *ctx, int ra, int rb, int32_t disp16,
+ void (*func)(DisasContext *, TCGv, TCGv))
+{
+TCGv addr = tcg_temp_new();
+tcg_gen_addi_i64(addr, load_gpr(ctx, rb), disp16);
+func(ctx, load_fpr(ctx, ra), addr);
+tcg_temp_free(addr);
+}
+
 static inline void gen_store_mem(DisasContext *ctx,
  void (*tcg_gen_qemu_store)(TCGv t0, TCGv t1,
 int flags),
@@ -2776,42 +2807,42 @@ static DisasJumpType translate_one(DisasContext *ctx, 
uint32_t insn)
 case 0x20:
 /* LDF */
 REQUIRE_FEN;
-gen_load_mem(ctx, &gen_qemu_ldf, ra, rb, disp16, 1, 0);
+gen_load_fp(ctx, ra, rb, disp16, gen_ldf);
 break;
 case 0x21:
 /* LDG */
 REQUIRE_FEN;
-gen_load_mem(ctx, &gen_qemu_ldg, ra, rb, disp16, 1, 0);
+gen_load_fp(ctx, ra, rb, disp16, gen_ldg);
 break;
 case 0x22:
 /* LDS */
 REQUIRE_FEN;
-gen_load_mem(ctx, &gen_qemu_lds, ra, rb, disp16, 1, 0);
+gen_load_fp(ctx, ra, rb, 

[PATCH v4 36/48] target/alpha: Reorg integer memory operations

2021-10-12 Thread Richard Henderson
Pass in the MemOp instead of a callback.
Drop the fp argument; add a locked argument.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/alpha/translate.c | 104 +++
 1 file changed, 40 insertions(+), 64 deletions(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index bfdd485508..0eee3a1bcc 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -308,27 +308,10 @@ static void gen_load_fp(DisasContext *ctx, int ra, int 
rb, int32_t disp16,
 }
 }
 
-static inline void gen_qemu_ldl_l(TCGv t0, TCGv t1, int flags)
+static void gen_load_int(DisasContext *ctx, int ra, int rb, int32_t disp16,
+ MemOp op, bool clear, bool locked)
 {
-tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LESL);
-tcg_gen_mov_i64(cpu_lock_addr, t1);
-tcg_gen_mov_i64(cpu_lock_value, t0);
-}
-
-static inline void gen_qemu_ldq_l(TCGv t0, TCGv t1, int flags)
-{
-tcg_gen_qemu_ld_i64(t0, t1, flags, MO_LEQ);
-tcg_gen_mov_i64(cpu_lock_addr, t1);
-tcg_gen_mov_i64(cpu_lock_value, t0);
-}
-
-static inline void gen_load_mem(DisasContext *ctx,
-void (*tcg_gen_qemu_load)(TCGv t0, TCGv t1,
-  int flags),
-int ra, int rb, int32_t disp16, bool fp,
-bool clear)
-{
-TCGv tmp, addr, va;
+TCGv addr, dest;
 
 /* LDQ_U with ra $31 is UNOP.  Other various loads are forms of
prefetches, which we can treat as nops.  No worries about
@@ -337,22 +320,20 @@ static inline void gen_load_mem(DisasContext *ctx,
 return;
 }
 
-tmp = tcg_temp_new();
-addr = load_gpr(ctx, rb);
-
-if (disp16) {
-tcg_gen_addi_i64(tmp, addr, disp16);
-addr = tmp;
-}
+addr = tcg_temp_new();
+tcg_gen_addi_i64(addr, load_gpr(ctx, rb), disp16);
 if (clear) {
-tcg_gen_andi_i64(tmp, addr, ~0x7);
-addr = tmp;
+tcg_gen_andi_i64(addr, addr, ~0x7);
 }
 
-va = (fp ? cpu_fir[ra] : ctx->ir[ra]);
-tcg_gen_qemu_load(va, addr, ctx->mem_idx);
+dest = ctx->ir[ra];
+tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, op);
 
-tcg_temp_free(tmp);
+if (locked) {
+tcg_gen_mov_i64(cpu_lock_addr, addr);
+tcg_gen_mov_i64(cpu_lock_value, dest);
+}
+tcg_temp_free(addr);
 }
 
 static void gen_stf(DisasContext *ctx, TCGv src, TCGv addr)
@@ -393,30 +374,21 @@ static void gen_store_fp(DisasContext *ctx, int ra, int 
rb, int32_t disp16,
 tcg_temp_free(addr);
 }
 
-static inline void gen_store_mem(DisasContext *ctx,
- void (*tcg_gen_qemu_store)(TCGv t0, TCGv t1,
-int flags),
- int ra, int rb, int32_t disp16, bool fp,
- bool clear)
+static void gen_store_int(DisasContext *ctx, int ra, int rb, int32_t disp16,
+  MemOp op, bool clear)
 {
-TCGv tmp, addr, va;
+TCGv addr, src;
 
-tmp = tcg_temp_new();
-addr = load_gpr(ctx, rb);
-
-if (disp16) {
-tcg_gen_addi_i64(tmp, addr, disp16);
-addr = tmp;
-}
+addr = tcg_temp_new();
+tcg_gen_addi_i64(addr, load_gpr(ctx, rb), disp16);
 if (clear) {
-tcg_gen_andi_i64(tmp, addr, ~0x7);
-addr = tmp;
+tcg_gen_andi_i64(addr, addr, ~0x7);
 }
 
-va = (fp ? load_fpr(ctx, ra) : load_gpr(ctx, ra));
-tcg_gen_qemu_store(va, addr, ctx->mem_idx);
+src = load_gpr(ctx, ra);
+tcg_gen_qemu_st_i64(src, addr, ctx->mem_idx, op);
 
-tcg_temp_free(tmp);
+tcg_temp_free(addr);
 }
 
 static DisasJumpType gen_store_conditional(DisasContext *ctx, int ra, int rb,
@@ -1511,30 +1483,30 @@ static DisasJumpType translate_one(DisasContext *ctx, 
uint32_t insn)
 case 0x0A:
 /* LDBU */
 REQUIRE_AMASK(BWX);
-gen_load_mem(ctx, &tcg_gen_qemu_ld8u, ra, rb, disp16, 0, 0);
+gen_load_int(ctx, ra, rb, disp16, MO_UB, 0, 0);
 break;
 case 0x0B:
 /* LDQ_U */
-gen_load_mem(ctx, &tcg_gen_qemu_ld64, ra, rb, disp16, 0, 1);
+gen_load_int(ctx, ra, rb, disp16, MO_LEQ, 1, 0);
 break;
 case 0x0C:
 /* LDWU */
 REQUIRE_AMASK(BWX);
-gen_load_mem(ctx, &tcg_gen_qemu_ld16u, ra, rb, disp16, 0, 0);
+gen_load_int(ctx, ra, rb, disp16, MO_LEUW, 0, 0);
 break;
 case 0x0D:
 /* STW */
 REQUIRE_AMASK(BWX);
-gen_store_mem(ctx, &tcg_gen_qemu_st16, ra, rb, disp16, 0, 0);
+gen_store_int(ctx, ra, rb, disp16, MO_LEUW, 0);
 break;
 case 0x0E:
 /* STB */
 REQUIRE_AMASK(BWX);
-gen_store_mem(ctx, &tcg_gen_qemu_st8, ra, rb, disp16, 0, 0);
+gen_store_int(ctx, ra, rb, disp16, MO_UB, 0);
 break;
 case 0x0F:
 /* STQ_U */
-gen_store_mem(ctx, &tcg_gen_qemu_st64, ra, 

[PATCH v4 38/48] target/hppa: Implement prctl_unalign_sigbus

2021-10-12 Thread Richard Henderson
Leave TARGET_ALIGNED_ONLY set, but use the new CPUState
flag to set MO_UNALN for the instructions that the kernel
handles in the unaligned trap.

Signed-off-by: Richard Henderson 
---
 linux-user/hppa/target_prctl.h |  2 +-
 target/hppa/cpu.h  |  5 -
 target/hppa/translate.c| 19 +++
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/linux-user/hppa/target_prctl.h b/linux-user/hppa/target_prctl.h
index eb53b31ad5..5629ddbf39 100644
--- a/linux-user/hppa/target_prctl.h
+++ b/linux-user/hppa/target_prctl.h
@@ -1 +1 @@
-/* No special prctl support required. */
+#include "../generic/target_prctl_unalign.h"
diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index 294fd7297f..45fd338b02 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -259,12 +259,14 @@ static inline target_ulong hppa_form_gva(CPUHPPAState 
*env, uint64_t spc,
 return hppa_form_gva_psw(env->psw, spc, off);
 }
 
-/* Since PSW_{I,CB} will never need to be in tb->flags, reuse them.
+/*
+ * Since PSW_{I,CB} will never need to be in tb->flags, reuse them.
  * TB_FLAG_SR_SAME indicates that SR4 through SR7 all contain the
  * same value.
  */
 #define TB_FLAG_SR_SAME PSW_I
 #define TB_FLAG_PRIV_SHIFT  8
+#define TB_FLAG_UNALIGN 0x400
 
 static inline void cpu_get_tb_cpu_state(CPUHPPAState *env, target_ulong *pc,
 target_ulong *cs_base,
@@ -279,6 +281,7 @@ static inline void cpu_get_tb_cpu_state(CPUHPPAState *env, 
target_ulong *pc,
 #ifdef CONFIG_USER_ONLY
 *pc = env->iaoq_f & -4;
 *cs_base = env->iaoq_b & -4;
+flags |= TB_FLAG_UNALIGN * !env_cpu(env)->prctl_unalign_sigbus;
 #else
 /* ??? E, T, H, L, B, P bits need to be here, when implemented.  */
 flags |= env->psw & (PSW_W | PSW_C | PSW_D);
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index c3698cf067..fdaa2b12b8 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -272,8 +272,18 @@ typedef struct DisasContext {
 int mmu_idx;
 int privilege;
 bool psw_n_nonzero;
+
+#ifdef CONFIG_USER_ONLY
+MemOp unalign;
+#endif
 } DisasContext;
 
+#ifdef CONFIG_USER_ONLY
+#define UNALIGN(C)  (C)->unalign
+#else
+#define UNALIGN(C)  0
+#endif
+
 /* Note that ssm/rsm instructions number PSW_W and PSW_E differently.  */
 static int expand_sm_imm(DisasContext *ctx, int val)
 {
@@ -1477,7 +1487,7 @@ static void do_load_32(DisasContext *ctx, TCGv_i32 dest, 
unsigned rb,
 
 form_gva(ctx, &addr, &ofs, rb, rx, scale, disp, sp, modify,
  ctx->mmu_idx == MMU_PHYS_IDX);
-tcg_gen_qemu_ld_reg(dest, addr, ctx->mmu_idx, mop);
+tcg_gen_qemu_ld_reg(dest, addr, ctx->mmu_idx, mop | UNALIGN(ctx));
 if (modify) {
 save_gpr(ctx, rb, ofs);
 }
@@ -1495,7 +1505,7 @@ static void do_load_64(DisasContext *ctx, TCGv_i64 dest, 
unsigned rb,
 
 form_gva(ctx, &addr, &ofs, rb, rx, scale, disp, sp, modify,
  ctx->mmu_idx == MMU_PHYS_IDX);
-tcg_gen_qemu_ld_i64(dest, addr, ctx->mmu_idx, mop);
+tcg_gen_qemu_ld_i64(dest, addr, ctx->mmu_idx, mop | UNALIGN(ctx));
 if (modify) {
 save_gpr(ctx, rb, ofs);
 }
@@ -1513,7 +1523,7 @@ static void do_store_32(DisasContext *ctx, TCGv_i32 src, 
unsigned rb,
 
 form_gva(ctx, &addr, &ofs, rb, rx, scale, disp, sp, modify,
  ctx->mmu_idx == MMU_PHYS_IDX);
-tcg_gen_qemu_st_i32(src, addr, ctx->mmu_idx, mop);
+tcg_gen_qemu_st_i32(src, addr, ctx->mmu_idx, mop | UNALIGN(ctx));
 if (modify) {
 save_gpr(ctx, rb, ofs);
 }
@@ -1531,7 +1541,7 @@ static void do_store_64(DisasContext *ctx, TCGv_i64 src, 
unsigned rb,
 
 form_gva(ctx, &addr, &ofs, rb, rx, scale, disp, sp, modify,
  ctx->mmu_idx == MMU_PHYS_IDX);
-tcg_gen_qemu_st_i64(src, addr, ctx->mmu_idx, mop);
+tcg_gen_qemu_st_i64(src, addr, ctx->mmu_idx, mop | UNALIGN(ctx));
 if (modify) {
 save_gpr(ctx, rb, ofs);
 }
@@ -4110,6 +4120,7 @@ static void hppa_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
 ctx->mmu_idx = MMU_USER_IDX;
 ctx->iaoq_f = ctx->base.pc_first | MMU_USER_IDX;
 ctx->iaoq_b = ctx->base.tb->cs_base | MMU_USER_IDX;
+ctx->unalign = (ctx->tb_flags & TB_FLAG_UNALIGN ? MO_UNALN : MO_ALIGN);
 #else
 ctx->privilege = (ctx->tb_flags >> TB_FLAG_PRIV_SHIFT) & 3;
 ctx->mmu_idx = (ctx->tb_flags & PSW_D ? ctx->privilege : MMU_PHYS_IDX);
-- 
2.25.1
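[Editor's aside: one idiom worth noting in the cpu_get_tb_cpu_state hunk. The flag is set with a multiply by a negated bool rather than a branch, so TB_FLAG_UNALIGN is present exactly when the prctl SIGBUS mode is off. A reduction for illustration, not QEMU code; only the TB_FLAG_UNALIGN value is taken from the patch above.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TB_FLAG_UNALIGN 0x400   /* value from the hppa patch above */

/* flag * !cond is flag when cond is false and 0 when it is true,
 * with no conditional branch needed in the generated code. */
static uint32_t tb_unalign_flag(bool prctl_unalign_sigbus)
{
    return TB_FLAG_UNALIGN * !prctl_unalign_sigbus;
}
```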




[PATCH v4 31/48] linux-user: Split out do_prctl and subroutines

2021-10-12 Thread Richard Henderson
Since the prctl constants are supposed to be generic, supply
any that are not provided by the host.

Split out subroutines for PR_GET_FP_MODE, PR_SET_FP_MODE,
PR_GET_VL, PR_SET_VL, PR_RESET_KEYS, PR_SET_TAGGED_ADDR_CTRL,
PR_GET_TAGGED_ADDR_CTRL.  Return EINVAL for guests that do
not support these options rather than pass them on to the host.

Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/target_prctl.h| 160 ++
 linux-user/aarch64/target_syscall.h  |  23 --
 linux-user/alpha/target_prctl.h  |   1 +
 linux-user/arm/target_prctl.h|   1 +
 linux-user/cris/target_prctl.h   |   1 +
 linux-user/hexagon/target_prctl.h|   1 +
 linux-user/hppa/target_prctl.h   |   1 +
 linux-user/i386/target_prctl.h   |   1 +
 linux-user/m68k/target_prctl.h   |   1 +
 linux-user/microblaze/target_prctl.h |   1 +
 linux-user/mips/target_prctl.h   |  88 ++
 linux-user/mips/target_syscall.h |   6 -
 linux-user/mips64/target_prctl.h |   1 +
 linux-user/mips64/target_syscall.h   |   6 -
 linux-user/nios2/target_prctl.h  |   1 +
 linux-user/openrisc/target_prctl.h   |   1 +
 linux-user/ppc/target_prctl.h|   1 +
 linux-user/riscv/target_prctl.h  |   1 +
 linux-user/s390x/target_prctl.h  |   1 +
 linux-user/sh4/target_prctl.h|   1 +
 linux-user/sparc/target_prctl.h  |   1 +
 linux-user/x86_64/target_prctl.h |   1 +
 linux-user/xtensa/target_prctl.h |   1 +
 linux-user/syscall.c | 433 +--
 24 files changed, 414 insertions(+), 320 deletions(-)
 create mode 100644 linux-user/aarch64/target_prctl.h
 create mode 100644 linux-user/alpha/target_prctl.h
 create mode 100644 linux-user/arm/target_prctl.h
 create mode 100644 linux-user/cris/target_prctl.h
 create mode 100644 linux-user/hexagon/target_prctl.h
 create mode 100644 linux-user/hppa/target_prctl.h
 create mode 100644 linux-user/i386/target_prctl.h
 create mode 100644 linux-user/m68k/target_prctl.h
 create mode 100644 linux-user/microblaze/target_prctl.h
 create mode 100644 linux-user/mips/target_prctl.h
 create mode 100644 linux-user/mips64/target_prctl.h
 create mode 100644 linux-user/nios2/target_prctl.h
 create mode 100644 linux-user/openrisc/target_prctl.h
 create mode 100644 linux-user/ppc/target_prctl.h
 create mode 100644 linux-user/riscv/target_prctl.h
 create mode 100644 linux-user/s390x/target_prctl.h
 create mode 100644 linux-user/sh4/target_prctl.h
 create mode 100644 linux-user/sparc/target_prctl.h
 create mode 100644 linux-user/x86_64/target_prctl.h
 create mode 100644 linux-user/xtensa/target_prctl.h

diff --git a/linux-user/aarch64/target_prctl.h 
b/linux-user/aarch64/target_prctl.h
new file mode 100644
index 00..3f5a5d3933
--- /dev/null
+++ b/linux-user/aarch64/target_prctl.h
@@ -0,0 +1,160 @@
+/*
+ * AArch64 specific prctl functions for linux-user
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#ifndef AARCH64_TARGET_PRCTL_H
+#define AARCH64_TARGET_PRCTL_H
+
+static abi_long do_prctl_get_vl(CPUArchState *env)
+{
+ARMCPU *cpu = env_archcpu(env);
+if (cpu_isar_feature(aa64_sve, cpu)) {
+return ((cpu->env.vfp.zcr_el[1] & 0xf) + 1) * 16;
+}
+return -TARGET_EINVAL;
+}
+#define do_prctl_get_vl do_prctl_get_vl
+
+static abi_long do_prctl_set_vl(CPUArchState *env, abi_long arg2)
+{
+/*
+ * We cannot support either PR_SVE_SET_VL_ONEXEC or PR_SVE_VL_INHERIT.
+ * Note the kernel definition of sve_vl_valid allows for VQ=512,
+ * i.e. VL=8192, even though the current architectural maximum is VQ=16.
+ */
+if (cpu_isar_feature(aa64_sve, env_archcpu(env))
+&& arg2 >= 0 && arg2 <= 512 * 16 && !(arg2 & 15)) {
+ARMCPU *cpu = env_archcpu(env);
+uint32_t vq, old_vq;
+
+old_vq = (env->vfp.zcr_el[1] & 0xf) + 1;
+vq = MAX(arg2 / 16, 1);
+vq = MIN(vq, cpu->sve_max_vq);
+
+if (vq < old_vq) {
+aarch64_sve_narrow_vq(env, vq);
+}
+env->vfp.zcr_el[1] = vq - 1;
+arm_rebuild_hflags(env);
+return vq * 16;
+}
+return -TARGET_EINVAL;
+}
+#define do_prctl_set_vl do_prctl_set_vl
+
+static abi_long do_prctl_reset_keys(CPUArchState *env, abi_long arg2)
+{
+ARMCPU *cpu = env_archcpu(env);
+
+if (cpu_isar_feature(aa64_pauth, cpu)) {
+int all = (PR_PAC_APIAKEY | PR_PAC_APIBKEY |
+   PR_PAC_APDAKEY | PR_PAC_APDBKEY | PR_PAC_APGAKEY);
+int ret = 0;
+Error *err = NULL;
+
+if (arg2 == 0) {
+arg2 = all;
+} else if (arg2 & ~all) {
+return -TARGET_EINVAL;
+}
+if (arg2 & PR_PAC_APIAKEY) {
+ret |= qemu_guest_getrandom(&env->keys.apia,
+sizeof(ARMPACKey), &err);
+}
+if (arg2 & PR_PAC_APIBKEY) {
+ret |= qemu_guest_getrandom(&env->keys.apib,
+sizeof(ARMPACKey), &err);
+}
+if (arg2 & 

[PATCH v4 37/48] target/alpha: Implement prctl_unalign_sigbus

2021-10-12 Thread Richard Henderson
Leave TARGET_ALIGNED_ONLY set, but use the new CPUState
flag to set MO_UNALN for the instructions that the kernel
handles in the unaligned trap.

Signed-off-by: Richard Henderson 
---
 linux-user/alpha/target_prctl.h |  2 +-
 target/alpha/cpu.h  |  5 +
 target/alpha/translate.c| 31 ++-
 3 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/linux-user/alpha/target_prctl.h b/linux-user/alpha/target_prctl.h
index eb53b31ad5..5629ddbf39 100644
--- a/linux-user/alpha/target_prctl.h
+++ b/linux-user/alpha/target_prctl.h
@@ -1 +1 @@
-/* No special prctl support required. */
+#include "../generic/target_prctl_unalign.h"
diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
index d49cc36d07..da5ccf7b63 100644
--- a/target/alpha/cpu.h
+++ b/target/alpha/cpu.h
@@ -386,6 +386,8 @@ enum {
 #define ENV_FLAG_TB_MASK \
 (ENV_FLAG_PAL_MODE | ENV_FLAG_PS_USER | ENV_FLAG_FEN)
 
+#define TB_FLAG_UNALIGN   (1u << 1)
+
 static inline int cpu_mmu_index(CPUAlphaState *env, bool ifetch)
 {
 int ret = env->flags & ENV_FLAG_PS_USER ? MMU_USER_IDX : MMU_KERNEL_IDX;
@@ -468,6 +470,9 @@ static inline void cpu_get_tb_cpu_state(CPUAlphaState *env, target_ulong *pc,
 *pc = env->pc;
 *cs_base = 0;
 *pflags = env->flags & ENV_FLAG_TB_MASK;
+#ifdef CONFIG_USER_ONLY
+*pflags |= TB_FLAG_UNALIGN * !env_cpu(env)->prctl_unalign_sigbus;
+#endif
 }
 
 #ifdef CONFIG_USER_ONLY
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 0eee3a1bcc..2656037b8b 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -45,7 +45,9 @@ typedef struct DisasContext DisasContext;
 struct DisasContext {
 DisasContextBase base;
 
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_USER_ONLY
+MemOp unalign;
+#else
 uint64_t palbr;
 #endif
 uint32_t tbflags;
@@ -68,6 +70,12 @@ struct DisasContext {
 TCGv sink;
 };
 
+#ifdef CONFIG_USER_ONLY
+#define UNALIGN(C)  (C)->unalign
+#else
+#define UNALIGN(C)  0
+#endif
+
 /* Target-specific return values from translate_one, indicating the
state of the TB.  Note that DISAS_NEXT indicates that we are not
exiting the TB.  */
@@ -270,7 +278,7 @@ static inline DisasJumpType gen_invalid(DisasContext *ctx)
 static void gen_ldf(DisasContext *ctx, TCGv dest, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
-tcg_gen_qemu_ld_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
+tcg_gen_qemu_ld_i32(tmp32, addr, ctx->mem_idx, MO_LEUL | UNALIGN(ctx));
 gen_helper_memory_to_f(dest, tmp32);
 tcg_temp_free_i32(tmp32);
 }
@@ -278,7 +286,7 @@ static void gen_ldf(DisasContext *ctx, TCGv dest, TCGv addr)
 static void gen_ldg(DisasContext *ctx, TCGv dest, TCGv addr)
 {
 TCGv tmp = tcg_temp_new();
-tcg_gen_qemu_ld_i64(tmp, addr, ctx->mem_idx, MO_LEQ);
+tcg_gen_qemu_ld_i64(tmp, addr, ctx->mem_idx, MO_LEQ | UNALIGN(ctx));
 gen_helper_memory_to_g(dest, tmp);
 tcg_temp_free(tmp);
 }
@@ -286,14 +294,14 @@ static void gen_lds(DisasContext *ctx, TCGv dest, TCGv addr)
 static void gen_lds(DisasContext *ctx, TCGv dest, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
-tcg_gen_qemu_ld_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
+tcg_gen_qemu_ld_i32(tmp32, addr, ctx->mem_idx, MO_LEUL | UNALIGN(ctx));
 gen_helper_memory_to_s(dest, tmp32);
 tcg_temp_free_i32(tmp32);
 }
 
 static void gen_ldt(DisasContext *ctx, TCGv dest, TCGv addr)
 {
-tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_LEQ);
+tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_LEQ | UNALIGN(ctx));
 }
 
 static void gen_load_fp(DisasContext *ctx, int ra, int rb, int32_t disp16,
@@ -324,6 +332,8 @@ static void gen_load_int(DisasContext *ctx, int ra, int rb, int32_t disp16,
 tcg_gen_addi_i64(addr, load_gpr(ctx, rb), disp16);
 if (clear) {
 tcg_gen_andi_i64(addr, addr, ~0x7);
+} else if (!locked) {
+op |= UNALIGN(ctx);
 }
 
 dest = ctx->ir[ra];
@@ -340,7 +350,7 @@ static void gen_stf(DisasContext *ctx, TCGv src, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
 gen_helper_f_to_memory(tmp32, addr);
-tcg_gen_qemu_st_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
+tcg_gen_qemu_st_i32(tmp32, addr, ctx->mem_idx, MO_LEUL | UNALIGN(ctx));
 tcg_temp_free_i32(tmp32);
 }
 
@@ -348,7 +358,7 @@ static void gen_stg(DisasContext *ctx, TCGv src, TCGv addr)
 {
 TCGv tmp = tcg_temp_new();
 gen_helper_g_to_memory(tmp, src);
-tcg_gen_qemu_st_i64(tmp, addr, ctx->mem_idx, MO_LEQ);
+tcg_gen_qemu_st_i64(tmp, addr, ctx->mem_idx, MO_LEQ | UNALIGN(ctx));
 tcg_temp_free(tmp);
 }
 
@@ -356,13 +366,13 @@ static void gen_sts(DisasContext *ctx, TCGv src, TCGv addr)
 {
 TCGv_i32 tmp32 = tcg_temp_new_i32();
 gen_helper_s_to_memory(tmp32, src);
-tcg_gen_qemu_st_i32(tmp32, addr, ctx->mem_idx, MO_LEUL);
+tcg_gen_qemu_st_i32(tmp32, addr, ctx->mem_idx, MO_LEUL | UNALIGN(ctx));
 tcg_temp_free_i32(tmp32);
 }
 
 static void gen_stt(DisasContext *ctx, TCGv 

[PATCH v4 33/48] Revert "cpu: Move cpu_common_props to hw/core/cpu.c"

2021-10-12 Thread Richard Henderson
This reverts commit 1b36e4f5a5de585210ea95f2257839c2312be28f.

Despite a comment saying why cpu_common_props cannot be placed in
a file that is compiled once, it was moved anyway.  Revert that.

Since then, Property is not defined in hw/core/cpu.h, so it is now
easier to declare a function to install the properties rather than
the Property array itself.

Cc: Eduardo Habkost 
Suggested-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h |  1 +
 cpu.c | 21 +
 hw/core/cpu-common.c  | 17 +
 3 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index b7d5bc1200..1a10497af3 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -1008,6 +1008,7 @@ void QEMU_NORETURN cpu_abort(CPUState *cpu, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 
 /* $(top_srcdir)/cpu.c */
+void cpu_class_init_props(DeviceClass *dc);
 void cpu_exec_initfn(CPUState *cpu);
 void cpu_exec_realizefn(CPUState *cpu, Error **errp);
 void cpu_exec_unrealizefn(CPUState *cpu);
diff --git a/cpu.c b/cpu.c
index e1799a15bc..9bce67ef55 100644
--- a/cpu.c
+++ b/cpu.c
@@ -179,6 +179,27 @@ void cpu_exec_unrealizefn(CPUState *cpu)
 cpu_list_remove(cpu);
 }
 
+static Property cpu_common_props[] = {
+#ifndef CONFIG_USER_ONLY
+/*
+ * Create a memory property for softmmu CPU object,
+ * so users can wire up its memory. (This can't go in hw/core/cpu.c
+ * because that file is compiled only once for both user-mode
+ * and system builds.) The default if no link is set up is to use
+ * the system address space.
+ */
+DEFINE_PROP_LINK("memory", CPUState, memory, TYPE_MEMORY_REGION,
+ MemoryRegion *),
+#endif
+DEFINE_PROP_BOOL("start-powered-off", CPUState, start_powered_off, false),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+void cpu_class_init_props(DeviceClass *dc)
+{
+device_class_set_props(dc, cpu_common_props);
+}
+
 void cpu_exec_initfn(CPUState *cpu)
 {
 cpu->as = NULL;
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index e2f5a64604..9e3241b430 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -257,21 +257,6 @@ static int64_t cpu_common_get_arch_id(CPUState *cpu)
 return cpu->cpu_index;
 }
 
-static Property cpu_common_props[] = {
-#ifndef CONFIG_USER_ONLY
-/* Create a memory property for softmmu CPU object,
- * so users can wire up its memory. (This can't go in hw/core/cpu.c
- * because that file is compiled only once for both user-mode
- * and system builds.) The default if no link is set up is to use
- * the system address space.
- */
-DEFINE_PROP_LINK("memory", CPUState, memory, TYPE_MEMORY_REGION,
- MemoryRegion *),
-#endif
-DEFINE_PROP_BOOL("start-powered-off", CPUState, start_powered_off, false),
-DEFINE_PROP_END_OF_LIST(),
-};
-
 static void cpu_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -286,7 +271,7 @@ static void cpu_class_init(ObjectClass *klass, void *data)
 dc->realize = cpu_common_realizefn;
 dc->unrealize = cpu_common_unrealizefn;
 dc->reset = cpu_common_reset;
-device_class_set_props(dc, cpu_common_props);
+cpu_class_init_props(dc);
 /*
  * Reason: CPUs still need special care by board code: wiring up
  * IRQs, adding reset handlers, halting non-first CPUs, ...
-- 
2.25.1




[PATCH v4 29/48] tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h

2021-10-12 Thread Richard Henderson
These functions have been replaced by cpu_*_mmu as the
most proper interface to use from target code.

Hide these declarations from code that should not use them.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-ldst.h | 74 ++
 include/tcg/tcg.h  | 71 
 accel/tcg/cputlb.c |  1 +
 tcg/tcg.c  |  1 +
 tcg/tci.c  |  1 +
 5 files changed, 77 insertions(+), 71 deletions(-)
 create mode 100644 include/tcg/tcg-ldst.h

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
new file mode 100644
index 00..8c86365611
--- /dev/null
+++ b/include/tcg/tcg-ldst.h
@@ -0,0 +1,74 @@
+/*
+ * Memory helpers that will be used by TCG generated code.
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef TCG_LDST_H
+#define TCG_LDST_H 1
+
+#ifdef CONFIG_SOFTMMU
+
+/* Value zero-extended to tcg register size.  */
+tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
+   MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
+   MemOpIdx oi, uintptr_t retaddr);
+
+/* Value sign-extended to tcg register size.  */
+tcg_target_ulong helper_ret_ldsb_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_le_ldsl_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr,
+MemOpIdx oi, uintptr_t retaddr);
+
+void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
+MemOpIdx oi, uintptr_t retaddr);
+void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+   MemOpIdx oi, uintptr_t retaddr);
+
+#endif /* CONFIG_SOFTMMU */
+#endif /* TCG_LDST_H */
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 83e38487cf..7069a401f1 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1240,77 +1240,6 @@ uint64_t 

[PATCH v4 25/48] target/mips: Use 8-byte memory ops for msa load/store

2021-10-12 Thread Richard Henderson
Rather than use 4-16 separate operations, use 2 operations
plus some byte reordering as necessary.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/mips/tcg/msa_helper.c | 201 +--
 1 file changed, 71 insertions(+), 130 deletions(-)

diff --git a/target/mips/tcg/msa_helper.c b/target/mips/tcg/msa_helper.c
index a8880ce81c..e40c1b7057 100644
--- a/target/mips/tcg/msa_helper.c
+++ b/target/mips/tcg/msa_helper.c
@@ -8218,47 +8218,31 @@ void helper_msa_ffint_u_df(CPUMIPSState *env, uint32_t df, uint32_t wd,
 #define MEMOP_IDX(DF)
 #endif
 
+#ifdef TARGET_WORDS_BIGENDIAN
+static inline uint64_t bswap16x4(uint64_t x)
+{
+uint64_t m = 0x00ff00ff00ff00ffull;
+return ((x & m) << 8) | ((x >> 8) & m);
+}
+
+static inline uint64_t bswap32x2(uint64_t x)
+{
+return ror64(bswap64(x), 32);
+}
+#endif
+
 void helper_msa_ld_b(CPUMIPSState *env, uint32_t wd,
  target_ulong addr)
 {
 wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
 uintptr_t ra = GETPC();
+uint64_t d0, d1;
 
-#if !defined(HOST_WORDS_BIGENDIAN)
-pwd->b[0]  = cpu_ldub_data_ra(env, addr + (0  << DF_BYTE), ra);
-pwd->b[1]  = cpu_ldub_data_ra(env, addr + (1  << DF_BYTE), ra);
-pwd->b[2]  = cpu_ldub_data_ra(env, addr + (2  << DF_BYTE), ra);
-pwd->b[3]  = cpu_ldub_data_ra(env, addr + (3  << DF_BYTE), ra);
-pwd->b[4]  = cpu_ldub_data_ra(env, addr + (4  << DF_BYTE), ra);
-pwd->b[5]  = cpu_ldub_data_ra(env, addr + (5  << DF_BYTE), ra);
-pwd->b[6]  = cpu_ldub_data_ra(env, addr + (6  << DF_BYTE), ra);
-pwd->b[7]  = cpu_ldub_data_ra(env, addr + (7  << DF_BYTE), ra);
-pwd->b[8]  = cpu_ldub_data_ra(env, addr + (8  << DF_BYTE), ra);
-pwd->b[9]  = cpu_ldub_data_ra(env, addr + (9  << DF_BYTE), ra);
-pwd->b[10] = cpu_ldub_data_ra(env, addr + (10 << DF_BYTE), ra);
-pwd->b[11] = cpu_ldub_data_ra(env, addr + (11 << DF_BYTE), ra);
-pwd->b[12] = cpu_ldub_data_ra(env, addr + (12 << DF_BYTE), ra);
-pwd->b[13] = cpu_ldub_data_ra(env, addr + (13 << DF_BYTE), ra);
-pwd->b[14] = cpu_ldub_data_ra(env, addr + (14 << DF_BYTE), ra);
-pwd->b[15] = cpu_ldub_data_ra(env, addr + (15 << DF_BYTE), ra);
-#else
-pwd->b[0]  = cpu_ldub_data_ra(env, addr + (7  << DF_BYTE), ra);
-pwd->b[1]  = cpu_ldub_data_ra(env, addr + (6  << DF_BYTE), ra);
-pwd->b[2]  = cpu_ldub_data_ra(env, addr + (5  << DF_BYTE), ra);
-pwd->b[3]  = cpu_ldub_data_ra(env, addr + (4  << DF_BYTE), ra);
-pwd->b[4]  = cpu_ldub_data_ra(env, addr + (3  << DF_BYTE), ra);
-pwd->b[5]  = cpu_ldub_data_ra(env, addr + (2  << DF_BYTE), ra);
-pwd->b[6]  = cpu_ldub_data_ra(env, addr + (1  << DF_BYTE), ra);
-pwd->b[7]  = cpu_ldub_data_ra(env, addr + (0  << DF_BYTE), ra);
-pwd->b[8]  = cpu_ldub_data_ra(env, addr + (15 << DF_BYTE), ra);
-pwd->b[9]  = cpu_ldub_data_ra(env, addr + (14 << DF_BYTE), ra);
-pwd->b[10] = cpu_ldub_data_ra(env, addr + (13 << DF_BYTE), ra);
-pwd->b[11] = cpu_ldub_data_ra(env, addr + (12 << DF_BYTE), ra);
-pwd->b[12] = cpu_ldub_data_ra(env, addr + (11 << DF_BYTE), ra);
-pwd->b[13] = cpu_ldub_data_ra(env, addr + (10 << DF_BYTE), ra);
-pwd->b[14] = cpu_ldub_data_ra(env, addr + (9 << DF_BYTE), ra);
-pwd->b[15] = cpu_ldub_data_ra(env, addr + (8 << DF_BYTE), ra);
-#endif
+/* Load 8 bytes at a time.  Vector element ordering makes this LE.  */
+d0 = cpu_ldq_le_data_ra(env, addr + 0, ra);
+d1 = cpu_ldq_le_data_ra(env, addr + 8, ra);
+pwd->d[0] = d0;
+pwd->d[1] = d1;
 }
 
 void helper_msa_ld_h(CPUMIPSState *env, uint32_t wd,
@@ -8266,26 +8250,20 @@ void helper_msa_ld_h(CPUMIPSState *env, uint32_t wd,
 {
 wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
 uintptr_t ra = GETPC();
+uint64_t d0, d1;
 
-#if !defined(HOST_WORDS_BIGENDIAN)
-pwd->h[0] = cpu_lduw_data_ra(env, addr + (0 << DF_HALF), ra);
-pwd->h[1] = cpu_lduw_data_ra(env, addr + (1 << DF_HALF), ra);
-pwd->h[2] = cpu_lduw_data_ra(env, addr + (2 << DF_HALF), ra);
-pwd->h[3] = cpu_lduw_data_ra(env, addr + (3 << DF_HALF), ra);
-pwd->h[4] = cpu_lduw_data_ra(env, addr + (4 << DF_HALF), ra);
-pwd->h[5] = cpu_lduw_data_ra(env, addr + (5 << DF_HALF), ra);
-pwd->h[6] = cpu_lduw_data_ra(env, addr + (6 << DF_HALF), ra);
-pwd->h[7] = cpu_lduw_data_ra(env, addr + (7 << DF_HALF), ra);
-#else
-pwd->h[0] = cpu_lduw_data_ra(env, addr + (3 << DF_HALF), ra);
-pwd->h[1] = cpu_lduw_data_ra(env, addr + (2 << DF_HALF), ra);
-pwd->h[2] = cpu_lduw_data_ra(env, addr + (1 << DF_HALF), ra);
-pwd->h[3] = cpu_lduw_data_ra(env, addr + (0 << DF_HALF), ra);
-pwd->h[4] = cpu_lduw_data_ra(env, addr + (7 << DF_HALF), ra);
-pwd->h[5] = cpu_lduw_data_ra(env, addr + (6 << DF_HALF), ra);
-pwd->h[6] = cpu_lduw_data_ra(env, addr + (5 << DF_HALF), ra);
-pwd->h[7] = cpu_lduw_data_ra(env, addr + (4 << DF_HALF), ra);
+/*
+ * Load 8 bytes at a time.  Use little-endian load, then for
+ * big-endian 

[PATCH v4 27/48] target/sparc: Use cpu_*_mmu instead of helper_*_mmu

2021-10-12 Thread Richard Henderson
The helper_*_mmu functions were the only thing available
when this code was written.  This could have been adjusted
when we added cpu_*_mmuidx_ra, but now we can most easily
use the newest set of interfaces.

Reviewed-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/sparc/ldst_helper.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 299fc386ea..a3e1cf9b6e 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1328,27 +1328,27 @@ uint64_t helper_ld_asi(CPUSPARCState *env, target_ulong addr,
 oi = make_memop_idx(memop, idx);
 switch (size) {
 case 1:
-ret = helper_ret_ldub_mmu(env, addr, oi, GETPC());
+ret = cpu_ldb_mmu(env, addr, oi, GETPC());
 break;
 case 2:
 if (asi & 8) {
-ret = helper_le_lduw_mmu(env, addr, oi, GETPC());
+ret = cpu_ldw_le_mmu(env, addr, oi, GETPC());
 } else {
-ret = helper_be_lduw_mmu(env, addr, oi, GETPC());
+ret = cpu_ldw_be_mmu(env, addr, oi, GETPC());
 }
 break;
 case 4:
 if (asi & 8) {
-ret = helper_le_ldul_mmu(env, addr, oi, GETPC());
+ret = cpu_ldl_le_mmu(env, addr, oi, GETPC());
 } else {
-ret = helper_be_ldul_mmu(env, addr, oi, GETPC());
+ret = cpu_ldl_be_mmu(env, addr, oi, GETPC());
 }
 break;
 case 8:
 if (asi & 8) {
-ret = helper_le_ldq_mmu(env, addr, oi, GETPC());
+ret = cpu_ldq_le_mmu(env, addr, oi, GETPC());
 } else {
-ret = helper_be_ldq_mmu(env, addr, oi, GETPC());
+ret = cpu_ldq_be_mmu(env, addr, oi, GETPC());
 }
 break;
 default:
-- 
2.25.1




[PATCH v4 34/48] linux-user: Add code for PR_GET/SET_UNALIGN

2021-10-12 Thread Richard Henderson
This requires extra work for each target, but adds the
common syscall code, and the necessary flag in CPUState.

Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h |  3 +++
 linux-user/generic/target_prctl_unalign.h | 27 +++
 cpu.c | 20 -
 linux-user/syscall.c  | 13 +--
 4 files changed, 56 insertions(+), 7 deletions(-)
 create mode 100644 linux-user/generic/target_prctl_unalign.h

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 1a10497af3..6202bbf9c3 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -412,6 +412,9 @@ struct CPUState {
 
 bool ignore_memory_transaction_failures;
 
+/* Used for user-only emulation of prctl(PR_SET_UNALIGN). */
+bool prctl_unalign_sigbus;
+
 struct hax_vcpu_state *hax_vcpu;
 
 struct hvf_vcpu_state *hvf;
diff --git a/linux-user/generic/target_prctl_unalign.h b/linux-user/generic/target_prctl_unalign.h
new file mode 100644
index 00..bc3b83af2a
--- /dev/null
+++ b/linux-user/generic/target_prctl_unalign.h
@@ -0,0 +1,27 @@
+/*
+ * Generic prctl unalign functions for linux-user
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#ifndef GENERIC_TARGET_PRCTL_UNALIGN_H
+#define GENERIC_TARGET_PRCTL_UNALIGN_H
+
+static abi_long do_prctl_get_unalign(CPUArchState *env, target_long arg2)
+{
+CPUState *cs = env_cpu(env);
+uint32_t res = PR_UNALIGN_NOPRINT;
+if (cs->prctl_unalign_sigbus) {
+res |= PR_UNALIGN_SIGBUS;
+}
+return put_user_u32(res, arg2);
+}
+#define do_prctl_get_unalign do_prctl_get_unalign
+
+static abi_long do_prctl_set_unalign(CPUArchState *env, target_long arg2)
+{
+env_cpu(env)->prctl_unalign_sigbus = arg2 & PR_UNALIGN_SIGBUS;
+return 0;
+}
+#define do_prctl_set_unalign do_prctl_set_unalign
+
+#endif /* GENERIC_TARGET_PRCTL_UNALIGN_H */
diff --git a/cpu.c b/cpu.c
index 9bce67ef55..9e388d9cd3 100644
--- a/cpu.c
+++ b/cpu.c
@@ -179,13 +179,23 @@ void cpu_exec_unrealizefn(CPUState *cpu)
 cpu_list_remove(cpu);
 }
 
+/*
+ * This can't go in hw/core/cpu.c because that file is compiled only
+ * once for both user-mode and system builds.
+ */
 static Property cpu_common_props[] = {
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_USER_ONLY
 /*
- * Create a memory property for softmmu CPU object,
- * so users can wire up its memory. (This can't go in hw/core/cpu.c
- * because that file is compiled only once for both user-mode
- * and system builds.) The default if no link is set up is to use
+ * Create a property for the user-only object, so users can
+ * adjust prctl(PR_SET_UNALIGN) from the command-line.
+ * Has no effect if the target does not support the feature.
+ */
+DEFINE_PROP_BOOL("prctl-unalign-sigbus", CPUState,
+ prctl_unalign_sigbus, false),
+#else
+/*
+ * Create a memory property for softmmu CPU object, so users can
+ * wire up its memory.  The default if no link is set up is to use
  * the system address space.
  */
 DEFINE_PROP_LINK("memory", CPUState, memory, TYPE_MEMORY_REGION,
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7635c2397a..ac3bc8a330 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6375,6 +6375,12 @@ static abi_long do_prctl_inval1(CPUArchState *env, abi_long arg2)
 #ifndef do_prctl_get_tagged_addr_ctrl
 #define do_prctl_get_tagged_addr_ctrl do_prctl_inval0
 #endif
+#ifndef do_prctl_get_unalign
+#define do_prctl_get_unalign do_prctl_inval1
+#endif
+#ifndef do_prctl_set_unalign
+#define do_prctl_set_unalign do_prctl_inval1
+#endif
 
 static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
  abi_long arg3, abi_long arg4, abi_long arg5)
@@ -6438,6 +6444,11 @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
 }
 return do_prctl_get_tagged_addr_ctrl(env);
 
+case PR_GET_UNALIGN:
+return do_prctl_get_unalign(env, arg2);
+case PR_SET_UNALIGN:
+return do_prctl_set_unalign(env, arg2);
+
 case PR_GET_DUMPABLE:
 case PR_SET_DUMPABLE:
 case PR_GET_KEEPCAPS:
@@ -6480,8 +6491,6 @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
 case PR_SET_THP_DISABLE:
 case PR_GET_TSC:
 case PR_SET_TSC:
-case PR_GET_UNALIGN:
-case PR_SET_UNALIGN:
 default:
 /* Disable to prevent the target disabling stuff we need. */
 return -TARGET_EINVAL;
-- 
2.25.1




[PATCH v4 24/48] target/mips: Use cpu_*_data_ra for msa load/store

2021-10-12 Thread Richard Henderson
We should not have been using the helper_ret_* set of
functions, as they are supposed to be private to tcg.
Nor should we have been using the plain cpu_*_data set
of functions, as they do not handle unwinding properly.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/mips/tcg/msa_helper.c | 420 +++
 1 file changed, 135 insertions(+), 285 deletions(-)

diff --git a/target/mips/tcg/msa_helper.c b/target/mips/tcg/msa_helper.c
index 167d9a591c..a8880ce81c 100644
--- a/target/mips/tcg/msa_helper.c
+++ b/target/mips/tcg/msa_helper.c
@@ -8222,79 +8222,42 @@ void helper_msa_ld_b(CPUMIPSState *env, uint32_t wd,
  target_ulong addr)
 {
 wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
-MEMOP_IDX(DF_BYTE)
-#if !defined(CONFIG_USER_ONLY)
+uintptr_t ra = GETPC();
+
 #if !defined(HOST_WORDS_BIGENDIAN)
-pwd->b[0]  = helper_ret_ldub_mmu(env, addr + (0  << DF_BYTE), oi, GETPC());
-pwd->b[1]  = helper_ret_ldub_mmu(env, addr + (1  << DF_BYTE), oi, GETPC());
-pwd->b[2]  = helper_ret_ldub_mmu(env, addr + (2  << DF_BYTE), oi, GETPC());
-pwd->b[3]  = helper_ret_ldub_mmu(env, addr + (3  << DF_BYTE), oi, GETPC());
-pwd->b[4]  = helper_ret_ldub_mmu(env, addr + (4  << DF_BYTE), oi, GETPC());
-pwd->b[5]  = helper_ret_ldub_mmu(env, addr + (5  << DF_BYTE), oi, GETPC());
-pwd->b[6]  = helper_ret_ldub_mmu(env, addr + (6  << DF_BYTE), oi, GETPC());
-pwd->b[7]  = helper_ret_ldub_mmu(env, addr + (7  << DF_BYTE), oi, GETPC());
-pwd->b[8]  = helper_ret_ldub_mmu(env, addr + (8  << DF_BYTE), oi, GETPC());
-pwd->b[9]  = helper_ret_ldub_mmu(env, addr + (9  << DF_BYTE), oi, GETPC());
-pwd->b[10] = helper_ret_ldub_mmu(env, addr + (10 << DF_BYTE), oi, GETPC());
-pwd->b[11] = helper_ret_ldub_mmu(env, addr + (11 << DF_BYTE), oi, GETPC());
-pwd->b[12] = helper_ret_ldub_mmu(env, addr + (12 << DF_BYTE), oi, GETPC());
-pwd->b[13] = helper_ret_ldub_mmu(env, addr + (13 << DF_BYTE), oi, GETPC());
-pwd->b[14] = helper_ret_ldub_mmu(env, addr + (14 << DF_BYTE), oi, GETPC());
-pwd->b[15] = helper_ret_ldub_mmu(env, addr + (15 << DF_BYTE), oi, GETPC());
+pwd->b[0]  = cpu_ldub_data_ra(env, addr + (0  << DF_BYTE), ra);
+pwd->b[1]  = cpu_ldub_data_ra(env, addr + (1  << DF_BYTE), ra);
+pwd->b[2]  = cpu_ldub_data_ra(env, addr + (2  << DF_BYTE), ra);
+pwd->b[3]  = cpu_ldub_data_ra(env, addr + (3  << DF_BYTE), ra);
+pwd->b[4]  = cpu_ldub_data_ra(env, addr + (4  << DF_BYTE), ra);
+pwd->b[5]  = cpu_ldub_data_ra(env, addr + (5  << DF_BYTE), ra);
+pwd->b[6]  = cpu_ldub_data_ra(env, addr + (6  << DF_BYTE), ra);
+pwd->b[7]  = cpu_ldub_data_ra(env, addr + (7  << DF_BYTE), ra);
+pwd->b[8]  = cpu_ldub_data_ra(env, addr + (8  << DF_BYTE), ra);
+pwd->b[9]  = cpu_ldub_data_ra(env, addr + (9  << DF_BYTE), ra);
+pwd->b[10] = cpu_ldub_data_ra(env, addr + (10 << DF_BYTE), ra);
+pwd->b[11] = cpu_ldub_data_ra(env, addr + (11 << DF_BYTE), ra);
+pwd->b[12] = cpu_ldub_data_ra(env, addr + (12 << DF_BYTE), ra);
+pwd->b[13] = cpu_ldub_data_ra(env, addr + (13 << DF_BYTE), ra);
+pwd->b[14] = cpu_ldub_data_ra(env, addr + (14 << DF_BYTE), ra);
+pwd->b[15] = cpu_ldub_data_ra(env, addr + (15 << DF_BYTE), ra);
 #else
-pwd->b[0]  = helper_ret_ldub_mmu(env, addr + (7  << DF_BYTE), oi, GETPC());
-pwd->b[1]  = helper_ret_ldub_mmu(env, addr + (6  << DF_BYTE), oi, GETPC());
-pwd->b[2]  = helper_ret_ldub_mmu(env, addr + (5  << DF_BYTE), oi, GETPC());
-pwd->b[3]  = helper_ret_ldub_mmu(env, addr + (4  << DF_BYTE), oi, GETPC());
-pwd->b[4]  = helper_ret_ldub_mmu(env, addr + (3  << DF_BYTE), oi, GETPC());
-pwd->b[5]  = helper_ret_ldub_mmu(env, addr + (2  << DF_BYTE), oi, GETPC());
-pwd->b[6]  = helper_ret_ldub_mmu(env, addr + (1  << DF_BYTE), oi, GETPC());
-pwd->b[7]  = helper_ret_ldub_mmu(env, addr + (0  << DF_BYTE), oi, GETPC());
-pwd->b[8]  = helper_ret_ldub_mmu(env, addr + (15 << DF_BYTE), oi, GETPC());
-pwd->b[9]  = helper_ret_ldub_mmu(env, addr + (14 << DF_BYTE), oi, GETPC());
-pwd->b[10] = helper_ret_ldub_mmu(env, addr + (13 << DF_BYTE), oi, GETPC());
-pwd->b[11] = helper_ret_ldub_mmu(env, addr + (12 << DF_BYTE), oi, GETPC());
-pwd->b[12] = helper_ret_ldub_mmu(env, addr + (11 << DF_BYTE), oi, GETPC());
-pwd->b[13] = helper_ret_ldub_mmu(env, addr + (10 << DF_BYTE), oi, GETPC());
-pwd->b[14] = helper_ret_ldub_mmu(env, addr + (9  << DF_BYTE), oi, GETPC());
-pwd->b[15] = helper_ret_ldub_mmu(env, addr + (8  << DF_BYTE), oi, GETPC());
-#endif
-#else
-#if !defined(HOST_WORDS_BIGENDIAN)
-pwd->b[0]  = cpu_ldub_data(env, addr + (0  << DF_BYTE));
-pwd->b[1]  = cpu_ldub_data(env, addr + (1  << DF_BYTE));
-pwd->b[2]  = cpu_ldub_data(env, addr + (2  << DF_BYTE));
-pwd->b[3]  = cpu_ldub_data(env, addr + (3  << DF_BYTE));
-pwd->b[4]  = cpu_ldub_data(env, addr + (4  << DF_BYTE));
-pwd->b[5]  = cpu_ldub_data(env, addr 

[PATCH v4 32/48] linux-user: Disable more prctl subcodes

2021-10-12 Thread Richard Henderson
Create a list of subcodes that we want to pass on, a list of
subcodes that should not be passed on because they would affect
the running qemu itself, and a list that probably could be
implemented but require extra work. Do not pass on unknown subcodes.

Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 56 
 1 file changed, 52 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a417396981..7635c2397a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
# define PR_MTE_TAG_SHIFT   3
 # define PR_MTE_TAG_SHIFT   3
# define PR_MTE_TAG_MASK (0xffffUL << PR_MTE_TAG_SHIFT)
 #endif
+#ifndef PR_SET_IO_FLUSHER
+# define PR_SET_IO_FLUSHER 57
+# define PR_GET_IO_FLUSHER 58
+#endif
+#ifndef PR_SET_SYSCALL_USER_DISPATCH
+# define PR_SET_SYSCALL_USER_DISPATCH 59
+#endif
 
 #include "target_prctl.h"
 
@@ -6430,13 +6437,54 @@ static abi_long do_prctl(CPUArchState *env, abi_long option, abi_long arg2,
 return -TARGET_EINVAL;
 }
 return do_prctl_get_tagged_addr_ctrl(env);
+
+case PR_GET_DUMPABLE:
+case PR_SET_DUMPABLE:
+case PR_GET_KEEPCAPS:
+case PR_SET_KEEPCAPS:
+case PR_GET_TIMING:
+case PR_SET_TIMING:
+case PR_GET_TIMERSLACK:
+case PR_SET_TIMERSLACK:
+case PR_MCE_KILL:
+case PR_MCE_KILL_GET:
+case PR_GET_NO_NEW_PRIVS:
+case PR_SET_NO_NEW_PRIVS:
+case PR_GET_IO_FLUSHER:
+case PR_SET_IO_FLUSHER:
+/* Some prctl options have no pointer arguments and we can pass on. */
+return get_errno(prctl(option, arg2, arg3, arg4, arg5));
+
+case PR_GET_CHILD_SUBREAPER:
+case PR_SET_CHILD_SUBREAPER:
+case PR_GET_SPECULATION_CTRL:
+case PR_SET_SPECULATION_CTRL:
+case PR_GET_TID_ADDRESS:
+/* TODO */
+return -TARGET_EINVAL;
+
+case PR_GET_FPEXC:
+case PR_SET_FPEXC:
+/* Was used for SPE on PowerPC. */
+return -TARGET_EINVAL;
+
+case PR_GET_ENDIAN:
+case PR_SET_ENDIAN:
+case PR_GET_FPEMU:
+case PR_SET_FPEMU:
+case PR_SET_MM:
 case PR_GET_SECCOMP:
 case PR_SET_SECCOMP:
-/* Disable seccomp to prevent the target disabling syscalls we need. */
-return -TARGET_EINVAL;
+case PR_SET_SYSCALL_USER_DISPATCH:
+case PR_GET_THP_DISABLE:
+case PR_SET_THP_DISABLE:
+case PR_GET_TSC:
+case PR_SET_TSC:
+case PR_GET_UNALIGN:
+case PR_SET_UNALIGN:
 default:
-/* Most prctl options have no pointer arguments */
-return get_errno(prctl(option, arg2, arg3, arg4, arg5));
+/* Disable to prevent the target disabling stuff we need. */
+return -TARGET_EINVAL;
 }
 }
 
-- 
2.25.1




[PATCH v4 26/48] target/s390x: Use cpu_*_mmu instead of helper_*_mmu

2021-10-12 Thread Richard Henderson
The helper_*_mmu functions were the only thing available
when this code was written.  This could have been adjusted
when we added cpu_*_mmuidx_ra, but now we can most easily
use the newest set of interfaces.

Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/mem_helper.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index f50c3f88a2..362a30d99e 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -241,13 +241,13 @@ static void do_access_memset(CPUS390XState *env, vaddr vaddr, char *haddr,
  * page. This is especially relevant to speed up TLB_NOTDIRTY.
  */
 g_assert(size > 0);
-helper_ret_stb_mmu(env, vaddr, byte, oi, ra);
+cpu_stb_mmu(env, vaddr, byte, oi, ra);
 haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_STORE, mmu_idx);
 if (likely(haddr)) {
 memset(haddr + 1, byte, size - 1);
 } else {
 for (i = 1; i < size; i++) {
-helper_ret_stb_mmu(env, vaddr + i, byte, oi, ra);
+cpu_stb_mmu(env, vaddr + i, byte, oi, ra);
 }
 }
 }
@@ -283,7 +283,7 @@ static uint8_t do_access_get_byte(CPUS390XState *env, vaddr vaddr, char **haddr,
  * Do a single access and test if we can then get access to the
  * page. This is especially relevant to speed up TLB_NOTDIRTY.
  */
-byte = helper_ret_ldub_mmu(env, vaddr + offset, oi, ra);
+byte = cpu_ldb_mmu(env, vaddr + offset, oi, ra);
 *haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_LOAD, mmu_idx);
 return byte;
 #endif
@@ -317,7 +317,7 @@ static void do_access_set_byte(CPUS390XState *env, vaddr vaddr, char **haddr,
  * Do a single access and test if we can then get access to the
  * page. This is especially relevant to speed up TLB_NOTDIRTY.
  */
-helper_ret_stb_mmu(env, vaddr + offset, byte, oi, ra);
+cpu_stb_mmu(env, vaddr + offset, byte, oi, ra);
 *haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_STORE, mmu_idx);
 #endif
 }
-- 
2.25.1




[PATCH v4 28/48] target/arm: Use cpu_*_mmu instead of helper_*_mmu

2021-10-12 Thread Richard Henderson
The helper_*_mmu functions were the only thing available
when this code was written.  This could have been adjusted
when we added cpu_*_mmuidx_ra, but now we can most easily
use the newest set of interfaces.

Cc: qemu-...@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-a64.c | 52 +++--
 target/arm/m_helper.c   |  6 ++---
 2 files changed, 11 insertions(+), 47 deletions(-)

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index b110c57956..5ae2ecb0f3 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -512,37 +512,19 @@ uint64_t HELPER(paired_cmpxchg64_le)(CPUARMState *env, uint64_t addr,
 uintptr_t ra = GETPC();
 uint64_t o0, o1;
 bool success;
-
-#ifdef CONFIG_USER_ONLY
-/* ??? Enforce alignment.  */
-uint64_t *haddr = g2h(env_cpu(env), addr);
-
-set_helper_retaddr(ra);
-o0 = ldq_le_p(haddr + 0);
-o1 = ldq_le_p(haddr + 1);
-oldv = int128_make128(o0, o1);
-
-success = int128_eq(oldv, cmpv);
-if (success) {
-stq_le_p(haddr + 0, int128_getlo(newv));
-stq_le_p(haddr + 1, int128_gethi(newv));
-}
-clear_helper_retaddr();
-#else
 int mem_idx = cpu_mmu_index(env, false);
 MemOpIdx oi0 = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
 MemOpIdx oi1 = make_memop_idx(MO_LEQ, mem_idx);
 
-o0 = helper_le_ldq_mmu(env, addr + 0, oi0, ra);
-o1 = helper_le_ldq_mmu(env, addr + 8, oi1, ra);
+o0 = cpu_ldq_le_mmu(env, addr + 0, oi0, ra);
+o1 = cpu_ldq_le_mmu(env, addr + 8, oi1, ra);
 oldv = int128_make128(o0, o1);
 
 success = int128_eq(oldv, cmpv);
 if (success) {
-helper_le_stq_mmu(env, addr + 0, int128_getlo(newv), oi1, ra);
-helper_le_stq_mmu(env, addr + 8, int128_gethi(newv), oi1, ra);
+cpu_stq_le_mmu(env, addr + 0, int128_getlo(newv), oi1, ra);
+cpu_stq_le_mmu(env, addr + 8, int128_gethi(newv), oi1, ra);
 }
-#endif
 
 return !success;
 }
@@ -582,37 +564,19 @@ uint64_t HELPER(paired_cmpxchg64_be)(CPUARMState *env, uint64_t addr,
 uintptr_t ra = GETPC();
 uint64_t o0, o1;
 bool success;
-
-#ifdef CONFIG_USER_ONLY
-/* ??? Enforce alignment.  */
-uint64_t *haddr = g2h(env_cpu(env), addr);
-
-set_helper_retaddr(ra);
-o1 = ldq_be_p(haddr + 0);
-o0 = ldq_be_p(haddr + 1);
-oldv = int128_make128(o0, o1);
-
-success = int128_eq(oldv, cmpv);
-if (success) {
-stq_be_p(haddr + 0, int128_gethi(newv));
-stq_be_p(haddr + 1, int128_getlo(newv));
-}
-clear_helper_retaddr();
-#else
 int mem_idx = cpu_mmu_index(env, false);
 MemOpIdx oi0 = make_memop_idx(MO_BEQ | MO_ALIGN_16, mem_idx);
 MemOpIdx oi1 = make_memop_idx(MO_BEQ, mem_idx);
 
-o1 = helper_be_ldq_mmu(env, addr + 0, oi0, ra);
-o0 = helper_be_ldq_mmu(env, addr + 8, oi1, ra);
+o1 = cpu_ldq_be_mmu(env, addr + 0, oi0, ra);
+o0 = cpu_ldq_be_mmu(env, addr + 8, oi1, ra);
 oldv = int128_make128(o0, o1);
 
 success = int128_eq(oldv, cmpv);
 if (success) {
-helper_be_stq_mmu(env, addr + 0, int128_gethi(newv), oi1, ra);
-helper_be_stq_mmu(env, addr + 8, int128_getlo(newv), oi1, ra);
+cpu_stq_be_mmu(env, addr + 0, int128_gethi(newv), oi1, ra);
+cpu_stq_be_mmu(env, addr + 8, int128_getlo(newv), oi1, ra);
 }
-#endif
 
 return !success;
 }
diff --git a/target/arm/m_helper.c b/target/arm/m_helper.c
index 62aa12c9d8..2c9922dc29 100644
--- a/target/arm/m_helper.c
+++ b/target/arm/m_helper.c
@@ -1947,9 +1947,9 @@ static bool do_v7m_function_return(ARMCPU *cpu)
  * do them as secure, so work out what MMU index that is.
  */
 mmu_idx = arm_v7m_mmu_idx_for_secstate(env, true);
-oi = make_memop_idx(MO_LE, arm_to_core_mmu_idx(mmu_idx));
-newpc = helper_le_ldul_mmu(env, frameptr, oi, 0);
-newpsr = helper_le_ldul_mmu(env, frameptr + 4, oi, 0);
+oi = make_memop_idx(MO_LEUL, arm_to_core_mmu_idx(mmu_idx));
+newpc = cpu_ldl_le_mmu(env, frameptr, oi, 0);
+newpsr = cpu_ldl_le_mmu(env, frameptr + 4, oi, 0);
 
 /* Consistency checks on new IPSR */
 newpsr_exc = newpsr & XPSR_EXCP;
-- 
2.25.1




[PATCH v4 22/48] accel/tcg: Add cpu_{ld,st}*_mmu interfaces

2021-10-12 Thread Richard Henderson
These functions are much closer to the softmmu helper
functions, in that they take the complete MemOpIdx,
and from that they may enforce required alignment.

The previous cpu_ldst.h functions did not have alignment info,
and so did not enforce it.  Retain this by adding MO_UNALN to
the MemOp that we create in calling the new functions.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 docs/devel/loads-stores.rst |  52 -
 include/exec/cpu_ldst.h | 245 --
 accel/tcg/cputlb.c  | 392 
 accel/tcg/user-exec.c   | 390 +++
 accel/tcg/ldst_common.c.inc | 307 
 5 files changed, 722 insertions(+), 664 deletions(-)
 create mode 100644 accel/tcg/ldst_common.c.inc

diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
index 568274baec..8f0035c821 100644
--- a/docs/devel/loads-stores.rst
+++ b/docs/devel/loads-stores.rst
@@ -68,15 +68,19 @@ Regexes for git grep
 - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_data_ra\>``
 - ``\<cpu_st[bwlq](_[bl]e)\?_data_ra\>``
 
-``cpu_{ld,st}*_mmuidx_ra``
-~~~~~~~~~~~~~~~~~~~~~~~~~~
+``cpu_{ld,st}*_mmu``
+~~~~~~~~~~~~~~~~~~~~
 
-These functions operate on a guest virtual address plus a context,
-known as a "mmu index" or ``mmuidx``, which controls how that virtual
-address is translated.  The meaning of the indexes are target specific,
-but specifying a particular index might be necessary if, for instance,
-the helper requires an "always as non-privileged" access rather that
-the default access for the current state of the guest CPU.
+These functions operate on a guest virtual address, plus a context
+known as a "mmu index" which controls how that virtual address is
+translated, plus a ``MemOp`` which contains alignment requirements
+among other things.  The ``MemOp`` and mmu index are combined into
+a single argument of type ``MemOpIdx``.
+
+The meaning of the indexes are target specific, but specifying a
+particular index might be necessary if, for instance, the helper
+requires a "always as non-privileged" access rather than the
+default access for the current state of the guest CPU.
 
 These functions may cause a guest CPU exception to be taken
 (e.g. for an alignment fault or MMU fault) which will result in
@@ -99,6 +103,35 @@ function, which is a return address into the generated code [#gpc]_.
 
 Function names follow the pattern:
 
+load: ``cpu_ld{size}{end}_mmu(env, ptr, oi, retaddr)``
+
+store: ``cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)``
+
+``size``
+ - ``b`` : 8 bits
+ - ``w`` : 16 bits
+ - ``l`` : 32 bits
+ - ``q`` : 64 bits
+
+``end``
+ - (empty) : for target endian, or 8 bit sizes
+ - ``_be`` : big endian
+ - ``_le`` : little endian
+
+Regexes for git grep:
+ - ``\<cpu_ld[bwlq](_[bl]e)\?_mmu\>``
+ - ``\<cpu_st[bwlq](_[bl]e)\?_mmu\>``
+
+
+``cpu_{ld,st}*_mmuidx_ra``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+These functions work like the ``cpu_{ld,st}_mmu`` functions except
+that the ``mmuidx`` parameter is not combined with a ``MemOp``,
+and therefore there is no required alignment supplied or enforced.
+
+Function names follow the pattern:
+
 load: ``cpu_ld{sign}{size}{end}_mmuidx_ra(env, ptr, mmuidx, retaddr)``
 
 store: ``cpu_st{size}{end}_mmuidx_ra(env, ptr, val, mmuidx, retaddr)``
@@ -132,7 +165,8 @@ of the guest CPU, as determined by ``cpu_mmu_index(env, false)``.
 
 These are generally the preferred way to do accesses by guest
 virtual address from helper functions, unless the access should
-be performed with a context other than the default.
+be performed with a context other than the default, or alignment
+should be enforced for the access.
 
 Function names follow the pattern:
 
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index ce6ce82618..a4dad0772f 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -28,10 +28,12 @@
  * load:  cpu_ld{sign}{size}{end}_{mmusuffix}(env, ptr)
  *cpu_ld{sign}{size}{end}_{mmusuffix}_ra(env, ptr, retaddr)
  *cpu_ld{sign}{size}{end}_mmuidx_ra(env, ptr, mmu_idx, retaddr)
+ *cpu_ld{sign}{size}{end}_mmu(env, ptr, oi, retaddr)
  *
  * store: cpu_st{size}{end}_{mmusuffix}(env, ptr, val)
  *cpu_st{size}{end}_{mmusuffix}_ra(env, ptr, val, retaddr)
  *cpu_st{size}{end}_mmuidx_ra(env, ptr, val, mmu_idx, retaddr)
+ *cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)
  *
  * sign is:
  * (empty): for 32 and 64 bit sizes
@@ -53,10 +55,15 @@
  * The "mmuidx" suffix carries an extra mmu_idx argument that specifies
  * the index to use; the "data" and "code" suffixes take the index from
  * cpu_mmu_index().
+ *
+ * The "mmu" suffix carries the full MemOpIdx, with both mmu_idx and the
+ * MemOp including alignment requirements.  The alignment will be enforced.
  */
 #ifndef CPU_LDST_H
 #define CPU_LDST_H
 
+#include "exec/memopidx.h"
+
 #if defined(CONFIG_USER_ONLY)
 /* sparc32plus has 64bit long but 32bit space address
  * this can make bad result with g2h() and h2g()
@@ -118,12 +125,10 @@ typedef target_ulong abi_ptr;
 
 uint32_t 

[PATCH v4 23/48] accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h

2021-10-12 Thread Richard Henderson
The previous placement in tcg/tcg.h was not logical.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu_ldst.h   | 87 +++
 include/tcg/tcg.h | 87 ---
 target/arm/helper-a64.c   |  1 -
 target/m68k/op_helper.c   |  1 -
 target/ppc/mem_helper.c   |  1 -
 target/s390x/tcg/mem_helper.c |  1 -
 6 files changed, 87 insertions(+), 91 deletions(-)

diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index a4dad0772f..a878fd0105 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -63,6 +63,7 @@
 #define CPU_LDST_H
 
 #include "exec/memopidx.h"
+#include "qemu/int128.h"
 
 #if defined(CONFIG_USER_ONLY)
 /* sparc32plus has 64bit long but 32bit space address
@@ -233,6 +234,92 @@ void cpu_stl_le_mmu(CPUArchState *env, abi_ptr ptr, uint32_t val,
 void cpu_stq_le_mmu(CPUArchState *env, abi_ptr ptr, uint64_t val,
 MemOpIdx oi, uintptr_t ra);
 
+uint32_t cpu_atomic_cmpxchgb_mmu(CPUArchState *env, target_ulong addr,
+ uint32_t cmpv, uint32_t newv,
+ MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgw_le_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgl_le_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t cpu_atomic_cmpxchgq_le_mmu(CPUArchState *env, target_ulong addr,
+uint64_t cmpv, uint64_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgw_be_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint32_t cpu_atomic_cmpxchgl_be_mmu(CPUArchState *env, target_ulong addr,
+uint32_t cmpv, uint32_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+uint64_t cpu_atomic_cmpxchgq_be_mmu(CPUArchState *env, target_ulong addr,
+uint64_t cmpv, uint64_t newv,
+MemOpIdx oi, uintptr_t retaddr);
+
+#define GEN_ATOMIC_HELPER(NAME, TYPE, SUFFIX) \
+TYPE cpu_atomic_ ## NAME ## SUFFIX ## _mmu\
+(CPUArchState *env, target_ulong addr, TYPE val,  \
+ MemOpIdx oi, uintptr_t retaddr);
+
+#ifdef CONFIG_ATOMIC64
+#define GEN_ATOMIC_HELPER_ALL(NAME)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, b) \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_be)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_be)  \
+GEN_ATOMIC_HELPER(NAME, uint64_t, q_le)  \
+GEN_ATOMIC_HELPER(NAME, uint64_t, q_be)
+#else
+#define GEN_ATOMIC_HELPER_ALL(NAME)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, b) \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, w_be)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_le)  \
+GEN_ATOMIC_HELPER(NAME, uint32_t, l_be)
+#endif
+
+GEN_ATOMIC_HELPER_ALL(fetch_add)
+GEN_ATOMIC_HELPER_ALL(fetch_sub)
+GEN_ATOMIC_HELPER_ALL(fetch_and)
+GEN_ATOMIC_HELPER_ALL(fetch_or)
+GEN_ATOMIC_HELPER_ALL(fetch_xor)
+GEN_ATOMIC_HELPER_ALL(fetch_smin)
+GEN_ATOMIC_HELPER_ALL(fetch_umin)
+GEN_ATOMIC_HELPER_ALL(fetch_smax)
+GEN_ATOMIC_HELPER_ALL(fetch_umax)
+
+GEN_ATOMIC_HELPER_ALL(add_fetch)
+GEN_ATOMIC_HELPER_ALL(sub_fetch)
+GEN_ATOMIC_HELPER_ALL(and_fetch)
+GEN_ATOMIC_HELPER_ALL(or_fetch)
+GEN_ATOMIC_HELPER_ALL(xor_fetch)
+GEN_ATOMIC_HELPER_ALL(smin_fetch)
+GEN_ATOMIC_HELPER_ALL(umin_fetch)
+GEN_ATOMIC_HELPER_ALL(smax_fetch)
+GEN_ATOMIC_HELPER_ALL(umax_fetch)
+
+GEN_ATOMIC_HELPER_ALL(xchg)
+
+#undef GEN_ATOMIC_HELPER_ALL
+#undef GEN_ATOMIC_HELPER
+
+Int128 cpu_atomic_cmpxchgo_le_mmu(CPUArchState *env, target_ulong addr,
+  Int128 cmpv, Int128 newv,
+  MemOpIdx oi, uintptr_t retaddr);
+Int128 cpu_atomic_cmpxchgo_be_mmu(CPUArchState *env, target_ulong addr,
+  Int128 cmpv, Int128 newv,
+  MemOpIdx oi, uintptr_t retaddr);
+
+Int128 cpu_atomic_ldo_le_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+Int128 cpu_atomic_ldo_be_mmu(CPUArchState *env, target_ulong addr,
+ MemOpIdx oi, uintptr_t retaddr);
+void cpu_atomic_sto_le_mmu(CPUArchState *env, target_ulong addr, Int128 val,
+   MemOpIdx oi, uintptr_t retaddr);
+void cpu_atomic_sto_be_mmu(CPUArchState *env, target_ulong 

[PATCH v4 19/48] target/ppc: Use MO_128 for 16 byte atomics

2021-10-12 Thread Richard Henderson
Cc: qemu-...@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/ppc/translate.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b985e9e55b..9ca78ee156 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3462,10 +3462,12 @@ static void gen_std(DisasContext *ctx)
 if (HAVE_ATOMIC128) {
 TCGv_i32 oi = tcg_temp_new_i32();
 if (ctx->le_mode) {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_LEQ, ctx->mem_idx));
+tcg_gen_movi_i32(oi, make_memop_idx(MO_LE | MO_128,
+ctx->mem_idx));
 gen_helper_stq_le_parallel(cpu_env, EA, lo, hi, oi);
 } else {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_BEQ, ctx->mem_idx));
+tcg_gen_movi_i32(oi, make_memop_idx(MO_BE | MO_128,
+ctx->mem_idx));
 gen_helper_stq_be_parallel(cpu_env, EA, lo, hi, oi);
 }
 tcg_temp_free_i32(oi);
@@ -4067,11 +4069,11 @@ static void gen_lqarx(DisasContext *ctx)
 if (HAVE_ATOMIC128) {
 TCGv_i32 oi = tcg_temp_new_i32();
 if (ctx->le_mode) {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_LEQ | MO_ALIGN_16,
+tcg_gen_movi_i32(oi, make_memop_idx(MO_LE | MO_128 | MO_ALIGN,
 ctx->mem_idx));
 gen_helper_lq_le_parallel(lo, cpu_env, EA, oi);
 } else {
-tcg_gen_movi_i32(oi, make_memop_idx(MO_BEQ | MO_ALIGN_16,
+tcg_gen_movi_i32(oi, make_memop_idx(MO_BE | MO_128 | MO_ALIGN,
 ctx->mem_idx));
 gen_helper_lq_be_parallel(lo, cpu_env, EA, oi);
 }
@@ -4122,7 +4124,7 @@ static void gen_stqcx_(DisasContext *ctx)
 
 if (tb_cflags(ctx->base.tb) & CF_PARALLEL) {
 if (HAVE_CMPXCHG128) {
-TCGv_i32 oi = tcg_const_i32(DEF_MEMOP(MO_Q) | MO_ALIGN_16);
+TCGv_i32 oi = tcg_const_i32(DEF_MEMOP(MO_128) | MO_ALIGN);
 if (ctx->le_mode) {
 gen_helper_stqcx_le_parallel(cpu_crf[0], cpu_env,
  EA, lo, hi, oi);
-- 
2.25.1




[PATCH v4 11/48] linux-user/ppc: Remove POWERPC_EXCP_ALIGN handling

2021-10-12 Thread Richard Henderson
We will raise SIGBUS directly from cpu_loop_exit_sigbus.

Signed-off-by: Richard Henderson 
---
 linux-user/ppc/cpu_loop.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/linux-user/ppc/cpu_loop.c b/linux-user/ppc/cpu_loop.c
index 840b23736b..483e669300 100644
--- a/linux-user/ppc/cpu_loop.c
+++ b/linux-user/ppc/cpu_loop.c
@@ -162,14 +162,6 @@ void cpu_loop(CPUPPCState *env)
 cpu_abort(cs, "External interrupt while in user mode. "
   "Aborting\n");
 break;
-case POWERPC_EXCP_ALIGN:/* Alignment exception   */
-/* XXX: check this */
-info.si_signo = TARGET_SIGBUS;
-info.si_errno = 0;
-info.si_code = TARGET_BUS_ADRALN;
-info._sifields._sigfault._addr = env->nip;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
 case POWERPC_EXCP_PROGRAM:  /* Program exception */
 case POWERPC_EXCP_HV_EMU:   /* HV emulation  */
 /* XXX: check this */
-- 
2.25.1




[PATCH v4 30/48] tcg: Add helper_unaligned_{ld, st} for user-only sigbus

2021-10-12 Thread Richard Henderson
To be called from tcg generated code on hosts that support
unaligned accesses natively, in response to an access that
is supposed to be aligned.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-ldst.h |  5 +
 accel/tcg/user-exec.c  | 11 +++
 2 files changed, 16 insertions(+)

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 8c86365611..bf40942de4 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -70,5 +70,10 @@ void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
 void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
MemOpIdx oi, uintptr_t retaddr);
 
+#else
+
+void QEMU_NORETURN helper_unaligned_ld(CPUArchState *env, target_ulong addr);
+void QEMU_NORETURN helper_unaligned_st(CPUArchState *env, target_ulong addr);
+
 #endif /* CONFIG_SOFTMMU */
 #endif /* TCG_LDST_H */
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 7d50dd54f6..0473ead5ab 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -27,6 +27,7 @@
 #include "exec/helper-proto.h"
 #include "qemu/atomic128.h"
 #include "trace/trace-root.h"
+#include "tcg/tcg-ldst.h"
 #include "internal.h"
 
 __thread uintptr_t helper_retaddr;
@@ -217,6 +218,16 @@ static void validate_memop(MemOpIdx oi, MemOp expected)
 #endif
 }
 
+void helper_unaligned_ld(CPUArchState *env, target_ulong addr)
+{
+cpu_loop_exit_sigbus(env_cpu(env), addr, MMU_DATA_LOAD, GETPC());
+}
+
+void helper_unaligned_st(CPUArchState *env, target_ulong addr)
+{
+cpu_loop_exit_sigbus(env_cpu(env), addr, MMU_DATA_STORE, GETPC());
+}
+
 static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr,
 MemOpIdx oi, uintptr_t ra, MMUAccessType type)
 {
-- 
2.25.1




[PATCH v4 18/48] target/i386: Use MO_128 for 16 byte atomics

2021-10-12 Thread Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/i386/tcg/mem_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c
index 0fd696f9c1..a207e624cb 100644
--- a/target/i386/tcg/mem_helper.c
+++ b/target/i386/tcg/mem_helper.c
@@ -136,7 +136,7 @@ void helper_cmpxchg16b(CPUX86State *env, target_ulong a0)
 Int128 newv = int128_make128(env->regs[R_EBX], env->regs[R_ECX]);
 
 int mem_idx = cpu_mmu_index(env, false);
-MemOpIdx oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+MemOpIdx oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx);
 Int128 oldv = cpu_atomic_cmpxchgo_le_mmu(env, a0, cmpv, newv, oi, ra);
 
 if (int128_eq(oldv, cmpv)) {
-- 
2.25.1




[PATCH v4 21/48] target/hexagon: Implement cpu_mmu_index

2021-10-12 Thread Richard Henderson
The function is trivial for user-only, but still must be present.

Reviewed-by: Taylor Simpson 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/hexagon/cpu.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index f7d043865b..f90c187888 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -141,6 +141,15 @@ static inline void cpu_get_tb_cpu_state(CPUHexagonState *env, target_ulong *pc,
 #endif
 }
 
+static inline int cpu_mmu_index(CPUHexagonState *env, bool ifetch)
+{
+#ifdef CONFIG_USER_ONLY
+return MMU_USER_IDX;
+#else
+#error System mode not supported on Hexagon yet
+#endif
+}
+
 typedef struct CPUHexagonState CPUArchState;
 typedef HexagonCPU ArchCPU;
 
-- 
2.25.1




[PATCH v4 20/48] target/s390x: Use MO_128 for 16 byte atomics

2021-10-12 Thread Richard Henderson
Reviewed-by: David Hildenbrand 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/mem_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index 4accffe68f..8624385fe1 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -1803,7 +1803,7 @@ void HELPER(cdsg_parallel)(CPUS390XState *env, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx);
 oldv = cpu_atomic_cmpxchgo_be_mmu(env, addr, cmpv, newv, oi, ra);
 fail = !int128_eq(oldv, cmpv);
 
@@ -1932,7 +1932,7 @@ static uint32_t do_csst(CPUS390XState *env, uint32_t r3, uint64_t a1,
 cpu_stq_data_ra(env, a1 + 0, int128_gethi(nv), ra);
 cpu_stq_data_ra(env, a1 + 8, int128_getlo(nv), ra);
 } else if (HAVE_CMPXCHG128) {
-MemOpIdx oi = make_memop_idx(MO_TEQ | MO_ALIGN_16, mem_idx);
+MemOpIdx oi = make_memop_idx(MO_TE | MO_128 | MO_ALIGN, mem_idx);
 ov = cpu_atomic_cmpxchgo_be_mmu(env, a1, cv, nv, oi, ra);
 cc = !int128_eq(ov, cv);
 } else {
-- 
2.25.1




[PATCH v4 10/48] target/s390x: Implement s390x_cpu_record_sigbus

2021-10-12 Thread Richard Henderson
For s390x, the only unaligned accesses that are signaled are atomic,
and we don't actually want to raise SIGBUS for those, but instead
raise a SPECIFICATION error, which the kernel will report as SIGILL.

Split out a do_unaligned_access function to share between the user-only
s390x_cpu_record_sigbus and the sysemu s390x_do_unaligned_access.

Signed-off-by: Richard Henderson 
---
 target/s390x/s390x-internal.h  |  8 +---
 target/s390x/cpu.c |  1 +
 target/s390x/tcg/excp_helper.c | 27 ---
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/target/s390x/s390x-internal.h b/target/s390x/s390x-internal.h
index 163aa4f94a..1a178aed41 100644
--- a/target/s390x/s390x-internal.h
+++ b/target/s390x/s390x-internal.h
@@ -270,18 +270,20 @@ ObjectClass *s390_cpu_class_by_name(const char *name);
 void s390x_cpu_debug_excp_handler(CPUState *cs);
 void s390_cpu_do_interrupt(CPUState *cpu);
 bool s390_cpu_exec_interrupt(CPUState *cpu, int int_req);
-void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
-   MMUAccessType access_type, int mmu_idx,
-   uintptr_t retaddr) QEMU_NORETURN;
 
 #ifdef CONFIG_USER_ONLY
 void s390_cpu_record_sigsegv(CPUState *cs, vaddr address,
  MMUAccessType access_type,
  bool maperr, uintptr_t retaddr);
+void s390_cpu_record_sigbus(CPUState *cs, vaddr address,
+MMUAccessType access_type, uintptr_t retaddr);
 #else
 bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
MMUAccessType access_type, int mmu_idx,
bool probe, uintptr_t retaddr);
+void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
+   MMUAccessType access_type, int mmu_idx,
+   uintptr_t retaddr) QEMU_NORETURN;
 #endif
 
 
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 593dda75c4..ccdbaf84d5 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -269,6 +269,7 @@ static const struct TCGCPUOps s390_tcg_ops = {
 
 #ifdef CONFIG_USER_ONLY
 .record_sigsegv = s390_cpu_record_sigsegv,
+.record_sigbus = s390_cpu_record_sigbus,
 #else
 .tlb_fill = s390_cpu_tlb_fill,
 .cpu_exec_interrupt = s390_cpu_exec_interrupt,
diff --git a/target/s390x/tcg/excp_helper.c b/target/s390x/tcg/excp_helper.c
index b923d080fc..4e7648f301 100644
--- a/target/s390x/tcg/excp_helper.c
+++ b/target/s390x/tcg/excp_helper.c
@@ -82,6 +82,19 @@ void HELPER(data_exception)(CPUS390XState *env, uint32_t dxc)
 tcg_s390_data_exception(env, dxc, GETPC());
 }
 
+/*
+ * Unaligned accesses are only diagnosed with MO_ALIGN.  At the moment,
+ * this is only for the atomic operations, for which we want to raise a
+ * specification exception.
+ */
+static void QEMU_NORETURN do_unaligned_access(CPUState *cs, uintptr_t retaddr)
+{
+S390CPU *cpu = S390_CPU(cs);
+CPUS390XState *env = &cpu->env;
+
+tcg_s390_program_interrupt(env, PGM_SPECIFICATION, retaddr);
+}
+
 #if defined(CONFIG_USER_ONLY)
 
 void s390_cpu_do_interrupt(CPUState *cs)
@@ -106,6 +119,12 @@ void s390_cpu_record_sigsegv(CPUState *cs, vaddr address,
 cpu_loop_exit_restore(cs, retaddr);
 }
 
+void s390_cpu_record_sigbus(CPUState *cs, vaddr address,
+MMUAccessType access_type, uintptr_t retaddr)
+{
+do_unaligned_access(cs, retaddr);
+}
+
 #else /* !CONFIG_USER_ONLY */
 
 static inline uint64_t cpu_mmu_idx_to_asc(int mmu_idx)
@@ -593,17 +612,11 @@ void s390x_cpu_debug_excp_handler(CPUState *cs)
 }
 }
 
-/* Unaligned accesses are only diagnosed with MO_ALIGN.  At the moment,
-   this is only for the atomic operations, for which we want to raise a
-   specification exception.  */
 void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
MMUAccessType access_type,
int mmu_idx, uintptr_t retaddr)
 {
-S390CPU *cpu = S390_CPU(cs);
-CPUS390XState *env = &cpu->env;
-
-tcg_s390_program_interrupt(env, PGM_SPECIFICATION, retaddr);
+do_unaligned_access(cs, retaddr);
 }
 
 static void QEMU_NORETURN monitor_event(CPUS390XState *env,
-- 
2.25.1




[PATCH v4 17/48] target/arm: Use MO_128 for 16 byte atomics

2021-10-12 Thread Richard Henderson
Cc: qemu-...@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-a64.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index c5af779006..4cafd3c11a 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -560,7 +560,7 @@ uint64_t HELPER(paired_cmpxchg64_le_parallel)(CPUARMState *env, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx);
 
 cmpv = int128_make128(env->exclusive_val, env->exclusive_high);
 newv = int128_make128(new_lo, new_hi);
@@ -630,7 +630,7 @@ uint64_t HELPER(paired_cmpxchg64_be_parallel)(CPUARMState *env, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_BEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_BE | MO_128 | MO_ALIGN, mem_idx);
 
 /*
  * High and low need to be switched here because this is not actually a
@@ -656,7 +656,7 @@ void HELPER(casp_le_parallel)(CPUARMState *env, uint32_t rs, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx);
 
 cmpv = int128_make128(env->xregs[rs], env->xregs[rs + 1]);
 newv = int128_make128(new_lo, new_hi);
@@ -677,7 +677,7 @@ void HELPER(casp_be_parallel)(CPUARMState *env, uint32_t rs, uint64_t addr,
 assert(HAVE_CMPXCHG128);
 
 mem_idx = cpu_mmu_index(env, false);
-oi = make_memop_idx(MO_LEQ | MO_ALIGN_16, mem_idx);
+oi = make_memop_idx(MO_LE | MO_128 | MO_ALIGN, mem_idx);
 
 cmpv = int128_make128(env->xregs[rs + 1], env->xregs[rs]);
 newv = int128_make128(new_lo, new_hi);
-- 
2.25.1




[PATCH v4 14/48] target/sparc: Split out build_sfsr

2021-10-12 Thread Richard Henderson
Reviewed-by: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/mmu_helper.c | 72 +--
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index 2ad47391d0..014601e701 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -502,16 +502,60 @@ static inline int ultrasparc_tag_match(SparcTLBEntry *tlb,
 return 0;
 }
 
+static uint64_t build_sfsr(CPUSPARCState *env, int mmu_idx, int rw)
+{
+uint64_t sfsr = SFSR_VALID_BIT;
+
+switch (mmu_idx) {
+case MMU_PHYS_IDX:
+sfsr |= SFSR_CT_NOTRANS;
+break;
+case MMU_USER_IDX:
+case MMU_KERNEL_IDX:
+sfsr |= SFSR_CT_PRIMARY;
+break;
+case MMU_USER_SECONDARY_IDX:
+case MMU_KERNEL_SECONDARY_IDX:
+sfsr |= SFSR_CT_SECONDARY;
+break;
+case MMU_NUCLEUS_IDX:
+sfsr |= SFSR_CT_NUCLEUS;
+break;
+default:
+g_assert_not_reached();
+}
+
+if (rw == 1) {
+sfsr |= SFSR_WRITE_BIT;
+} else if (rw == 4) {
+sfsr |= SFSR_NF_BIT;
+}
+
+if (env->pstate & PS_PRIV) {
+sfsr |= SFSR_PR_BIT;
+}
+
+if (env->dmmu.sfsr & SFSR_VALID_BIT) { /* Fault status register */
+sfsr |= SFSR_OW_BIT; /* overflow (not read before another fault) */
+}
+
+/* FIXME: ASI field in SFSR must be set */
+
+return sfsr;
+}
+
 static int get_physical_address_data(CPUSPARCState *env, hwaddr *physical,
  int *prot, MemTxAttrs *attrs,
  target_ulong address, int rw, int mmu_idx)
 {
 CPUState *cs = env_cpu(env);
 unsigned int i;
+uint64_t sfsr;
 uint64_t context;
-uint64_t sfsr = 0;
 bool is_user = false;
 
+sfsr = build_sfsr(env, mmu_idx, rw);
+
 switch (mmu_idx) {
 case MMU_PHYS_IDX:
 g_assert_not_reached();
@@ -520,29 +564,18 @@ static int get_physical_address_data(CPUSPARCState *env, hwaddr *physical,
 /* fallthru */
 case MMU_KERNEL_IDX:
 context = env->dmmu.mmu_primary_context & 0x1fff;
-sfsr |= SFSR_CT_PRIMARY;
 break;
 case MMU_USER_SECONDARY_IDX:
 is_user = true;
 /* fallthru */
 case MMU_KERNEL_SECONDARY_IDX:
 context = env->dmmu.mmu_secondary_context & 0x1fff;
-sfsr |= SFSR_CT_SECONDARY;
 break;
-case MMU_NUCLEUS_IDX:
-sfsr |= SFSR_CT_NUCLEUS;
-/* FALLTHRU */
 default:
 context = 0;
 break;
 }
 
-if (rw == 1) {
-sfsr |= SFSR_WRITE_BIT;
-} else if (rw == 4) {
-sfsr |= SFSR_NF_BIT;
-}
-
 for (i = 0; i < 64; i++) {
 /* ctx match, vaddr match, valid? */
if (ultrasparc_tag_match(&env->dtlb[i], address, context, physical)) {
@@ -592,22 +625,9 @@ static int get_physical_address_data(CPUSPARCState *env, hwaddr *physical,
 return 0;
 }
 
-if (env->dmmu.sfsr & SFSR_VALID_BIT) { /* Fault status register */
-sfsr |= SFSR_OW_BIT; /* overflow (not read before
-another fault) */
-}
-
-if (env->pstate & PS_PRIV) {
-sfsr |= SFSR_PR_BIT;
-}
-
-/* FIXME: ASI field in SFSR must be set */
-env->dmmu.sfsr = sfsr | SFSR_VALID_BIT;
-
+env->dmmu.sfsr = sfsr;
 env->dmmu.sfar = address; /* Fault address register */
-
 env->dmmu.tag_access = (address & ~0x1fffULL) | context;
-
 return 1;
 }
 }
-- 
2.25.1




[PATCH v4 07/48] target/ppc: Move SPR_DSISR setting to powerpc_excp

2021-10-12 Thread Richard Henderson
By doing this while sending the exception, we will have already
done the unwinding, which makes the ppc_cpu_do_unaligned_access
code a bit cleaner.

Update the comment about the expected instruction format.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/ppc/excp_helper.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index b7d1767920..88a8de4b80 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -454,13 +454,15 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
 break;
 }
 case POWERPC_EXCP_ALIGN: /* Alignment exception  */
-/* Get rS/rD and rA from faulting opcode */
 /*
- * Note: the opcode fields will not be set properly for a
- * direct store load/store, but nobody cares as nobody
- * actually uses direct store segments.
+ * Get rS/rD and rA from faulting opcode.
+ * Note: We will only invoke ALIGN for atomic operations,
+ * so all instructions are X-form.
  */
-env->spr[SPR_DSISR] |= (env->error_code & 0x03FF0000) >> 16;
+{
+uint32_t insn = cpu_ldl_code(env, env->nip);
+env->spr[SPR_DSISR] |= (insn & 0x03FF0000) >> 16;
+}
 break;
 case POWERPC_EXCP_PROGRAM:   /* Program exception*/
 switch (env->error_code & ~0xF) {
@@ -1462,14 +1464,9 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
  int mmu_idx, uintptr_t retaddr)
 {
 CPUPPCState *env = cs->env_ptr;
-uint32_t insn;
-
-/* Restore state and reload the insn we executed, for filling in DSISR.  */
-cpu_restore_state(cs, retaddr, true);
-insn = cpu_ldl_code(env, env->nip);
 
 cs->exception_index = POWERPC_EXCP_ALIGN;
-env->error_code = insn & 0x03FF0000;
-cpu_loop_exit(cs);
+env->error_code = 0;
+cpu_loop_exit_restore(cs, retaddr);
 }
 #endif
-- 
2.25.1
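The mask-and-shift in the hunk above packs the two register fields of an X-form instruction into DSISR. A standalone sketch of what the bits mean (the helper names here are mine, not QEMU's): with the least significant bit numbered 0, bits 21..25 of the instruction word hold rS/rD and bits 16..20 hold rA, so masking with 0x03FF0000 and shifting right by 16 leaves both five-bit register numbers in the low ten bits.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the rS/rD and rA fields of an X-form instruction into ten bits,
 * as stored into DSISR for an alignment interrupt. */
static unsigned dsisr_reg_fields(uint32_t insn)
{
    return (insn & 0x03FF0000u) >> 16;
}

/* The individual fields, for comparison. */
static unsigned insn_rd(uint32_t insn) { return (insn >> 21) & 0x1f; }
static unsigned insn_ra(uint32_t insn) { return (insn >> 16) & 0x1f; }
```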




[PATCH v4 16/48] accel/tcg: Report unaligned atomics for user-only

2021-10-12 Thread Richard Henderson
Use the new cpu_loop_exit_sigbus for atomic_mmu_lookup, which
has access to complete alignment info from the MemOpIdx arg.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/user-exec.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 83ed76cef9..5dcd58c6d5 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -547,11 +547,22 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
MemOpIdx oi, int size, int prot,
uintptr_t retaddr)
 {
+MemOp mop = get_memop(oi);
+int a_bits = get_alignment_bits(mop);
+void *ret;
+
+/* Enforce guest required alignment.  */
+if (unlikely(addr & ((1 << a_bits) - 1))) {
+MMUAccessType t = prot == PAGE_READ ? MMU_DATA_LOAD : MMU_DATA_STORE;
+cpu_loop_exit_sigbus(env_cpu(env), addr, t, retaddr);
+}
+
 /* Enforce qemu required alignment.  */
 if (unlikely(addr & (size - 1))) {
 cpu_loop_exit_atomic(env_cpu(env), retaddr);
 }
-void *ret = g2h(env_cpu(env), addr);
+
+ret = g2h(env_cpu(env), addr);
 set_helper_retaddr(retaddr);
 return ret;
 }
-- 
2.25.1
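The guest-alignment test added above is a plain low-bits mask check. As a standalone sketch (the function name is mine): an access of 1 << a_bits bytes is misaligned exactly when any of the address's low a_bits are set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is not aligned to a (1 << a_bits)-byte boundary. */
static bool is_misaligned(uint64_t addr, int a_bits)
{
    return (addr & ((UINT64_C(1) << a_bits) - 1)) != 0;
}
```

In the patch, a_bits comes from get_alignment_bits(get_memop(oi)), and a failed check raises SIGBUS via cpu_loop_exit_sigbus before the host-side size check can trip cpu_loop_exit_atomic.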




[PATCH v4 09/48] target/ppc: Restrict ppc_cpu_do_unaligned_access to sysemu

2021-10-12 Thread Richard Henderson
This is not used by, nor required by, user-only.

Signed-off-by: Richard Henderson 
---
 target/ppc/internal.h| 8 +++-
 target/ppc/excp_helper.c | 8 +++-
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index 339974b7d8..6aa9484f34 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -211,11 +211,6 @@ void helper_compute_fprf_float16(CPUPPCState *env, float16 arg);
 void helper_compute_fprf_float32(CPUPPCState *env, float32 arg);
 void helper_compute_fprf_float128(CPUPPCState *env, float128 arg);
 
-/* Raise a data fault alignment exception for the specified virtual address */
-void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
- MMUAccessType access_type, int mmu_idx,
- uintptr_t retaddr) QEMU_NORETURN;
-
 /* translate.c */
 
 int ppc_fixup_cpu(PowerPCCPU *cpu);
@@ -291,6 +286,9 @@ void ppc_cpu_record_sigsegv(CPUState *cs, vaddr addr,
 bool ppc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
   MMUAccessType access_type, int mmu_idx,
   bool probe, uintptr_t retaddr);
+void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
+ MMUAccessType access_type, int mmu_idx,
+ uintptr_t retaddr) QEMU_NORETURN;
 #endif
 
 #endif /* PPC_INTERNAL_H */
diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index e568a54536..17607adbe4 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -1454,11 +1454,8 @@ void helper_book3s_msgsndp(CPUPPCState *env, target_ulong rb)
 
 book3s_msgsnd_common(pir, PPC_INTERRUPT_DOORBELL);
 }
-#endif
-#endif /* CONFIG_TCG */
-#endif
+#endif /* TARGET_PPC64 */
 
-#ifdef CONFIG_TCG
 void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
  MMUAccessType access_type,
  int mmu_idx, uintptr_t retaddr)
@@ -1483,4 +1480,5 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
 env->error_code = 0;
 cpu_loop_exit_restore(cs, retaddr);
 }
-#endif
+#endif /* CONFIG_TCG */
+#endif /* !CONFIG_USER_ONLY */
-- 
2.25.1




[PATCH v4 15/48] target/sparc: Set fault address in sparc_cpu_do_unaligned_access

2021-10-12 Thread Richard Henderson
We ought to have been recording the virtual address for reporting
to the guest trap handler.  Move the function to mmu_helper.c, so
that we can re-use code shared with get_physical_address_data.

Reviewed-by: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/ldst_helper.c | 13 -
 target/sparc/mmu_helper.c  | 20 
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 2d0d180ea6..299fc386ea 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1953,16 +1953,3 @@ void sparc_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
   is_asi, size, retaddr);
 }
 #endif
-
-#if !defined(CONFIG_USER_ONLY)
-void QEMU_NORETURN sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
- MMUAccessType access_type,
- int mmu_idx,
- uintptr_t retaddr)
-{
-SPARCCPU *cpu = SPARC_CPU(cs);
-CPUSPARCState *env = >env;
-
-cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
-}
-#endif
diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index 014601e701..f2668389b0 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -922,3 +922,23 @@ hwaddr sparc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
 }
 return phys_addr;
 }
+
+#ifndef CONFIG_USER_ONLY
+void QEMU_NORETURN sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
+ MMUAccessType access_type,
+ int mmu_idx,
+ uintptr_t retaddr)
+{
+SPARCCPU *cpu = SPARC_CPU(cs);
+CPUSPARCState *env = &cpu->env;
+
+#ifdef TARGET_SPARC64
+env->dmmu.sfsr = build_sfsr(env, mmu_idx, access_type);
+env->dmmu.sfar = addr;
+#else
+env->mmuregs[4] = addr;
+#endif
+
+cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
+}
+#endif /* !CONFIG_USER_ONLY */
-- 
2.25.1




[PATCH v4 02/48] linux-user: Add cpu_loop_exit_sigbus

2021-10-12 Thread Richard Henderson
This is a new interface to be provided by the os emulator for
raising SIGBUS on fault.  Use the new record_sigbus target hook.

Signed-off-by: Richard Henderson 
---
 include/exec/exec-all.h | 14 ++
 linux-user/signal.c | 14 ++
 2 files changed, 28 insertions(+)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index f74578500c..6bb2a0f7ec 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -700,6 +700,20 @@ void QEMU_NORETURN cpu_loop_exit_sigsegv(CPUState *cpu, target_ulong addr,
  MMUAccessType access_type,
  bool maperr, uintptr_t ra);
 
+/**
+ * cpu_loop_exit_sigbus:
+ * @cpu: the cpu context
+ * @addr: the guest address of the alignment fault
+ * @access_type: access was read/write/execute
+ * @ra: host pc for unwinding
+ *
+ * Use the TCGCPUOps hook to record cpu state, do guest operating system
+ * specific things to raise SIGBUS, and jump to the main cpu loop.
+ */
+void QEMU_NORETURN cpu_loop_exit_sigbus(CPUState *cpu, target_ulong addr,
+MMUAccessType access_type,
+uintptr_t ra);
+
 #else
 static inline void mmap_lock(void) {}
 static inline void mmap_unlock(void) {}
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 9d60abc038..df2c8678d0 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -706,6 +706,20 @@ void cpu_loop_exit_sigsegv(CPUState *cpu, target_ulong addr,
 cpu_loop_exit_restore(cpu, ra);
 }
 
+void cpu_loop_exit_sigbus(CPUState *cpu, target_ulong addr,
+  MMUAccessType access_type, uintptr_t ra)
+{
+const struct TCGCPUOps *tcg_ops = CPU_GET_CLASS(cpu)->tcg_ops;
+
+if (tcg_ops->record_sigbus) {
+tcg_ops->record_sigbus(cpu, addr, access_type, ra);
+}
+
+force_sig_fault(TARGET_SIGBUS, TARGET_BUS_ADRALN, addr);
+cpu->exception_index = EXCP_INTERRUPT;
+cpu_loop_exit_restore(cpu, ra);
+}
+
 /* abort execution with signal */
 static void QEMU_NORETURN dump_core_and_abort(int target_sig)
 {
-- 
2.25.1
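The control flow of cpu_loop_exit_sigbus() above — invoke the per-target record hook only when the CPU class provides one, then raise the signal — can be modeled with a toy dispatcher. All names and types here are illustrative stand-ins, not QEMU's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    /* Optional: may be NULL if the guest ABI needs nothing extra. */
    void (*record_sigbus)(uint64_t addr, int access_type);
} ToyTCGOps;

static uint64_t last_recorded;

static void toy_record(uint64_t addr, int access_type)
{
    (void)access_type;
    last_recorded = addr;
}

/* Call the hook if present, then "raise" the signal (returns 1 here
 * as a stand-in for force_sig_fault + cpu_loop_exit_restore). */
static int toy_exit_sigbus(const ToyTCGOps *ops, uint64_t addr, int at)
{
    if (ops->record_sigbus) {
        ops->record_sigbus(addr, at);
    }
    return 1;
}
```

The point of the optional hook, as patch 01 documents, is that most targets need nothing beyond siginfo_t and can simply omit it.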




[PATCH v4 12/48] target/sh4: Set fault address in superh_cpu_do_unaligned_access

2021-10-12 Thread Richard Henderson
We ought to have been recording the virtual address for reporting
to the guest trap handler.

Cc: Yoshinori Sato 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/sh4/op_helper.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/sh4/op_helper.c b/target/sh4/op_helper.c
index c0cbb95382..d6d70c339f 100644
--- a/target/sh4/op_helper.c
+++ b/target/sh4/op_helper.c
@@ -29,6 +29,9 @@ void superh_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 MMUAccessType access_type,
 int mmu_idx, uintptr_t retaddr)
 {
+CPUSH4State *env = cs->env_ptr;
+
+env->tea = addr;
 switch (access_type) {
 case MMU_INST_FETCH:
 case MMU_DATA_LOAD:
@@ -37,6 +40,8 @@ void superh_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 case MMU_DATA_STORE:
 cs->exception_index = 0x100;
 break;
+default:
+g_assert_not_reached();
 }
 cpu_loop_exit_restore(cs, retaddr);
 }
-- 
2.25.1




[PATCH v4 13/48] target/sparc: Remove DEBUG_UNALIGNED

2021-10-12 Thread Richard Henderson
The printf should have been qemu_log_mask, the parameters
themselves no longer compile, and because this is placed
before unwinding, the PC is actively wrong.

We get better (and correct) logging on the other side of
raising the exception, in sparc_cpu_do_interrupt.

Reviewed-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/sparc/ldst_helper.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index abe2889d27..2d0d180ea6 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -27,7 +27,6 @@
 
 //#define DEBUG_MMU
 //#define DEBUG_MXCC
-//#define DEBUG_UNALIGNED
 //#define DEBUG_UNASSIGNED
 //#define DEBUG_ASI
 //#define DEBUG_CACHE_CONTROL
@@ -364,10 +363,6 @@ static void do_check_align(CPUSPARCState *env, 
target_ulong addr,
uint32_t align, uintptr_t ra)
 {
 if (addr & align) {
-#ifdef DEBUG_UNALIGNED
-printf("Unaligned access to 0x" TARGET_FMT_lx " from 0x" TARGET_FMT_lx
-   "\n", addr, env->pc);
-#endif
 cpu_raise_exception_ra(env, TT_UNALIGNED, ra);
 }
 }
@@ -1968,10 +1963,6 @@ void QEMU_NORETURN sparc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
 SPARCCPU *cpu = SPARC_CPU(cs);
 CPUSPARCState *env = &cpu->env;
 
-#ifdef DEBUG_UNALIGNED
-printf("Unaligned access to 0x" TARGET_FMT_lx " from 0x" TARGET_FMT_lx
-   "\n", addr, env->pc);
-#endif
 cpu_raise_exception_ra(env, TT_UNALIGNED, retaddr);
 }
 #endif
-- 
2.25.1




[PATCH v4 05/48] linux-user/hppa: Remove EXCP_UNALIGN handling

2021-10-12 Thread Richard Henderson
We will raise SIGBUS directly from cpu_loop_exit_sigbus.

Signed-off-by: Richard Henderson 
---
 linux-user/hppa/cpu_loop.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/linux-user/hppa/cpu_loop.c b/linux-user/hppa/cpu_loop.c
index e0a62deeb9..375576c8f0 100644
--- a/linux-user/hppa/cpu_loop.c
+++ b/linux-user/hppa/cpu_loop.c
@@ -144,13 +144,6 @@ void cpu_loop(CPUHPPAState *env)
 env->iaoq_f = env->gr[31];
 env->iaoq_b = env->gr[31] + 4;
 break;
-case EXCP_UNALIGN:
-info.si_signo = TARGET_SIGBUS;
-info.si_errno = 0;
-info.si_code = 0;
-info._sifields._sigfault._addr = env->cr[CR_IOR];
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
 case EXCP_ILL:
 case EXCP_PRIV_OPR:
 case EXCP_PRIV_REG:
-- 
2.25.1




[PATCH v4 08/48] target/ppc: Set fault address in ppc_cpu_do_unaligned_access

2021-10-12 Thread Richard Henderson
We ought to have been recording the virtual address for reporting
to the guest trap handler.

Cc: qemu-...@nongnu.org
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/ppc/excp_helper.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 88a8de4b80..e568a54536 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -1465,6 +1465,20 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
 {
 CPUPPCState *env = cs->env_ptr;
 
+switch (env->mmu_model) {
+case POWERPC_MMU_SOFT_4xx:
+case POWERPC_MMU_SOFT_4xx_Z:
+env->spr[SPR_40x_DEAR] = vaddr;
+break;
+case POWERPC_MMU_BOOKE:
+case POWERPC_MMU_BOOKE206:
+env->spr[SPR_BOOKE_DEAR] = vaddr;
+break;
+default:
+env->spr[SPR_DAR] = vaddr;
+break;
+}
+
 cs->exception_index = POWERPC_EXCP_ALIGN;
 env->error_code = 0;
 cpu_loop_exit_restore(cs, retaddr);
-- 
2.25.1
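The switch in the hunk above selects the fault-address SPR by MMU model: 40x-family cores record the address in their DEAR, BookE cores in the BookE DEAR, and everything else falls back to the classic DAR. A toy mirror of that mapping (enum values are illustrative placeholders, not QEMU's constants):

```c
#include <assert.h>

enum toy_mmu { TOY_MMU_40x, TOY_MMU_BOOKE, TOY_MMU_OTHER };
enum toy_spr { TOY_SPR_40x_DEAR, TOY_SPR_BOOKE_DEAR, TOY_SPR_DAR };

/* Which fault-address register a given MMU model uses. */
static enum toy_spr toy_fault_spr(enum toy_mmu m)
{
    switch (m) {
    case TOY_MMU_40x:
        return TOY_SPR_40x_DEAR;
    case TOY_MMU_BOOKE:
        return TOY_SPR_BOOKE_DEAR;
    default:
        return TOY_SPR_DAR;
    }
}
```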




[PATCH v4 03/48] linux-user/alpha: Remove EXCP_UNALIGN handling

2021-10-12 Thread Richard Henderson
We will raise SIGBUS directly from cpu_loop_exit_sigbus.

Signed-off-by: Richard Henderson 
---
 linux-user/alpha/cpu_loop.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/linux-user/alpha/cpu_loop.c b/linux-user/alpha/cpu_loop.c
index 1b00a81385..4029849d5c 100644
--- a/linux-user/alpha/cpu_loop.c
+++ b/linux-user/alpha/cpu_loop.c
@@ -54,21 +54,6 @@ void cpu_loop(CPUAlphaState *env)
 fprintf(stderr, "External interrupt. Exit\n");
 exit(EXIT_FAILURE);
 break;
-case EXCP_MMFAULT:
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = (page_get_flags(env->trap_arg0) & PAGE_VALID
-? TARGET_SEGV_ACCERR : TARGET_SEGV_MAPERR);
-info._sifields._sigfault._addr = env->trap_arg0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
-case EXCP_UNALIGN:
-info.si_signo = TARGET_SIGBUS;
-info.si_errno = 0;
-info.si_code = TARGET_BUS_ADRALN;
-info._sifields._sigfault._addr = env->trap_arg0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
-break;
 case EXCP_OPCDEC:
 do_sigill:
 info.si_signo = TARGET_SIGILL;
-- 
2.25.1




[PATCH v4 00/48]

2021-10-12 Thread Richard Henderson
Based-on: 20211006172307.780893-1-richard.hender...@linaro.org
("[PATCH v4 00/41] linux-user: Streamline handling of SIGSEGV")

This began with Peter wanting a cpu_ldst.h interface that can handle
alignment info for Arm M-profile system mode, which will also compile
for user-only without ifdefs.

Once I had that interface, I thought I might as well enforce the
requested alignment in user-only.  There are plenty of cases where
we ought to have been doing that for quite a while.  This took rather
more work than I imagined to start.

Changes for v4:
  * Rebase, with some patches now upstream.
  * Rename the core function to cpu_loop_exit_sigbus.

Changes for v3:
  * Updated tcg/{aarch64,ppc,s390,riscv,tci}.

Changes for v2:
  * Cleanup prctl(2), add support for prctl(PR_GET/SET_UNALIGN).
  * Adjustments for ppc and sparc reporting address during alignment fault.


r~


Richard Henderson (48):
  hw/core: Add TCGCPUOps.record_sigbus
  linux-user: Add cpu_loop_exit_sigbus
  linux-user/alpha: Remove EXCP_UNALIGN handling
  target/arm: Implement arm_cpu_record_sigbus
  linux-user/hppa: Remove EXCP_UNALIGN handling
  target/microblaze: Do not set MO_ALIGN for user-only
  target/ppc: Move SPR_DSISR setting to powerpc_excp
  target/ppc: Set fault address in ppc_cpu_do_unaligned_access
  target/ppc: Restrict ppc_cpu_do_unaligned_access to sysemu
  target/s390x: Implement s390x_cpu_record_sigbus
  linux-user/hppa: Remove POWERPC_EXCP_ALIGN handling
  target/sh4: Set fault address in superh_cpu_do_unaligned_access
  target/sparc: Remove DEBUG_UNALIGNED
  target/sparc: Split out build_sfsr
  target/sparc: Set fault address in sparc_cpu_do_unaligned_access
  accel/tcg: Report unaligned atomics for user-only
  target/arm: Use MO_128 for 16 byte atomics
  target/i386: Use MO_128 for 16 byte atomics
  target/ppc: Use MO_128 for 16 byte atomics
  target/s390x: Use MO_128 for 16 byte atomics
  target/hexagon: Implement cpu_mmu_index
  accel/tcg: Add cpu_{ld,st}*_mmu interfaces
  accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h
  target/mips: Use cpu_*_data_ra for msa load/store
  target/mips: Use 8-byte memory ops for msa load/store
  target/s390x: Use cpu_*_mmu instead of helper_*_mmu
  target/sparc: Use cpu_*_mmu instead of helper_*_mmu
  target/arm: Use cpu_*_mmu instead of helper_*_mmu
  tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h
  tcg: Add helper_unaligned_{ld,st} for user-only sigbus
  linux-user: Split out do_prctl and subroutines
  linux-user: Disable more prctl subcodes
  Revert "cpu: Move cpu_common_props to hw/core/cpu.c"
  linux-user: Add code for PR_GET/SET_UNALIGN
  target/alpha: Reorg fp memory operations
  target/alpha: Reorg integer memory operations
  target/alpha: Implement prctl_unalign_sigbus
  target/hppa: Implement prctl_unalign_sigbus
  target/sh4: Implement prctl_unalign_sigbus
  linux-user/signal: Handle BUS_ADRALN in host_signal_handler
  tcg: Canonicalize alignment flags in MemOp
  tcg/i386: Support raising sigbus for user-only
  tcg/aarch64: Support raising sigbus for user-only
  tcg/ppc: Support raising sigbus for user-only
  tcg/s390: Support raising sigbus for user-only
  tcg/tci: Support raising sigbus for user-only
  tcg/riscv: Support raising sigbus for user-only
  tests/tcg/multiarch: Add sigbus.c

 docs/devel/loads-stores.rst   |  52 ++-
 include/exec/cpu_ldst.h   | 332 ---
 include/exec/exec-all.h   |  14 +
 include/hw/core/cpu.h |   4 +
 include/hw/core/tcg-cpu-ops.h |  23 +
 include/tcg/tcg-ldst.h|  79 
 include/tcg/tcg.h | 158 ---
 linux-user/aarch64/target_prctl.h | 160 +++
 linux-user/aarch64/target_syscall.h   |  23 -
 linux-user/alpha/target_prctl.h   |   1 +
 linux-user/arm/target_prctl.h |   1 +
 linux-user/cris/target_prctl.h|   1 +
 linux-user/generic/target_prctl_unalign.h |  27 ++
 linux-user/hexagon/target_prctl.h |   1 +
 linux-user/hppa/target_prctl.h|   1 +
 linux-user/i386/target_prctl.h|   1 +
 linux-user/m68k/target_prctl.h|   1 +
 linux-user/microblaze/target_prctl.h  |   1 +
 linux-user/mips/target_prctl.h|  88 
 linux-user/mips/target_syscall.h  |   6 -
 linux-user/mips64/target_prctl.h  |   1 +
 linux-user/mips64/target_syscall.h|   6 -
 linux-user/nios2/target_prctl.h   |   1 +
 linux-user/openrisc/target_prctl.h|   1 +
 linux-user/ppc/target_prctl.h |   1 +
 linux-user/riscv/target_prctl.h   |   1 +
 linux-user/s390x/target_prctl.h   |   1 +
 linux-user/sh4/target_prctl.h |   1 +
 linux-user/sparc/target_prctl.h   |   1 +
 linux-user/x86_64/target_prctl.h  |   1 +
 linux-user/xtensa/target_prctl.h  |   1 +
 target/alpha/cpu.h|   5 +
 target/arm/internals.h|   2 +
 

[PATCH v4 06/48] target/microblaze: Do not set MO_ALIGN for user-only

2021-10-12 Thread Richard Henderson
The kernel will fix up unaligned accesses, so emulate that
by allowing unaligned accesses to succeed.

Reviewed-by: Edgar E. Iglesias 
Signed-off-by: Richard Henderson 
---
 target/microblaze/translate.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index a14ffed784..ef44bca2fd 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -727,6 +727,7 @@ static TCGv compute_ldst_addr_ea(DisasContext *dc, int ra, int rb)
 }
 #endif
 
+#ifndef CONFIG_USER_ONLY
 static void record_unaligned_ess(DisasContext *dc, int rd,
  MemOp size, bool store)
 {
@@ -739,6 +740,7 @@ static void record_unaligned_ess(DisasContext *dc, int rd,
 
 tcg_set_insn_start_param(dc->insn_start, 1, iflags);
 }
+#endif
 
 static bool do_load(DisasContext *dc, int rd, TCGv addr, MemOp mop,
 int mem_index, bool rev)
@@ -760,12 +762,19 @@ static bool do_load(DisasContext *dc, int rd, TCGv addr, MemOp mop,
 }
 }
 
+/*
+ * For system mode, enforce alignment if the cpu configuration
+ * requires it.  For user-mode, the Linux kernel will have fixed up
+ * any unaligned access, so emulate that by *not* setting MO_ALIGN.
+ */
+#ifndef CONFIG_USER_ONLY
 if (size > MO_8 &&
 (dc->tb_flags & MSR_EE) &&
 dc->cfg->unaligned_exceptions) {
 record_unaligned_ess(dc, rd, size, false);
 mop |= MO_ALIGN;
 }
+#endif
 
 tcg_gen_qemu_ld_i32(reg_for_write(dc, rd), addr, mem_index, mop);
 
@@ -906,12 +915,19 @@ static bool do_store(DisasContext *dc, int rd, TCGv addr, MemOp mop,
 }
 }
 
+/*
+ * For system mode, enforce alignment if the cpu configuration
+ * requires it.  For user-mode, the Linux kernel will have fixed up
+ * any unaligned access, so emulate that by *not* setting MO_ALIGN.
+ */
+#ifndef CONFIG_USER_ONLY
 if (size > MO_8 &&
 (dc->tb_flags & MSR_EE) &&
 dc->cfg->unaligned_exceptions) {
 record_unaligned_ess(dc, rd, size, true);
 mop |= MO_ALIGN;
 }
+#endif
 
 tcg_gen_qemu_st_i32(reg_for_read(dc, rd), addr, mem_index, mop);
 
-- 
2.25.1




[PATCH v4 01/48] hw/core: Add TCGCPUOps.record_sigbus

2021-10-12 Thread Richard Henderson
Add a new user-only interface for updating cpu state before
raising a signal.  This will take the place of do_unaligned_access
for user-only and should result in less boilerplate for each guest.

Signed-off-by: Richard Henderson 
---
 include/hw/core/tcg-cpu-ops.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index 8eadd404c8..e13898553a 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -135,6 +135,29 @@ struct TCGCPUOps {
 void (*record_sigsegv)(CPUState *cpu, vaddr addr,
MMUAccessType access_type,
bool maperr, uintptr_t ra);
+/**
+ * record_sigbus:
+ * @cpu: cpu context
+ * @addr: misaligned guest address
+ * @access_type: access was read/write/execute
+ * @ra: host pc for unwinding
+ *
+ * We are about to raise SIGBUS with si_code BUS_ADRALN,
+ * and si_addr set for @addr.  Record anything further needed
+ * for the signal ucontext_t.
+ *
+ * If the emulated kernel does not provide the signal handler with
+ * anything besides the user context registers, and the siginfo_t,
+ * then this hook need do nothing and may be omitted.
+ * Otherwise, record the data and return; the caller will raise
+ * the signal, unwind the cpu state, and return to the main loop.
+ *
+ * If it is simpler to re-use the sysemu do_unaligned_access code,
+ * @ra is provided so that a "normal" cpu exception can be raised.
+ * In this case, the signal must be raised by the architecture cpu_loop.
+ */
+void (*record_sigbus)(CPUState *cpu, vaddr addr,
+  MMUAccessType access_type, uintptr_t ra);
 #endif /* CONFIG_SOFTMMU */
 #endif /* NEED_CPU_H */
 
-- 
2.25.1




[PATCH v4 04/48] target/arm: Implement arm_cpu_record_sigbus

2021-10-12 Thread Richard Henderson
Because of the complexity of setting ESR, re-use the existing
arm_cpu_do_unaligned_access function.  This means we have to
handle the exception ourselves in cpu_loop, transforming it
to the appropriate signal.

Signed-off-by: Richard Henderson 
---
 target/arm/internals.h|  2 ++
 linux-user/aarch64/cpu_loop.c | 12 +---
 linux-user/arm/cpu_loop.c | 30 ++
 target/arm/cpu.c  |  1 +
 target/arm/cpu_tcg.c  |  1 +
 target/arm/tlb_helper.c   |  6 ++
 6 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 5a7aaf0f51..89f7610ebc 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -548,6 +548,8 @@ static inline bool arm_extabort_type(MemTxResult result)
 void arm_cpu_record_sigsegv(CPUState *cpu, vaddr addr,
 MMUAccessType access_type,
 bool maperr, uintptr_t ra);
+void arm_cpu_record_sigbus(CPUState *cpu, vaddr addr,
+   MMUAccessType access_type, uintptr_t ra);
 #else
 bool arm_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
   MMUAccessType access_type, int mmu_idx,
diff --git a/linux-user/aarch64/cpu_loop.c b/linux-user/aarch64/cpu_loop.c
index 034b737435..97e0728b67 100644
--- a/linux-user/aarch64/cpu_loop.c
+++ b/linux-user/aarch64/cpu_loop.c
@@ -79,7 +79,7 @@
 void cpu_loop(CPUARMState *env)
 {
 CPUState *cs = env_cpu(env);
-int trapnr, ec, fsc, si_code;
+int trapnr, ec, fsc, si_code, si_signo;
 abi_long ret;
 
 for (;;) {
@@ -121,20 +121,26 @@ void cpu_loop(CPUARMState *env)
 fsc = extract32(env->exception.syndrome, 0, 6);
 switch (fsc) {
 case 0x04 ... 0x07: /* Translation fault, level {0-3} */
+si_signo = TARGET_SIGSEGV;
 si_code = TARGET_SEGV_MAPERR;
 break;
 case 0x09 ... 0x0b: /* Access flag fault, level {1-3} */
 case 0x0d ... 0x0f: /* Permission fault, level {1-3} */
+si_signo = TARGET_SIGSEGV;
 si_code = TARGET_SEGV_ACCERR;
 break;
 case 0x11: /* Synchronous Tag Check Fault */
+si_signo = TARGET_SIGSEGV;
 si_code = TARGET_SEGV_MTESERR;
 break;
+case 0x21: /* Alignment fault */
+si_signo = TARGET_SIGBUS;
+si_code = TARGET_BUS_ADRALN;
+break;
 default:
 g_assert_not_reached();
 }
-
-force_sig_fault(TARGET_SIGSEGV, si_code, env->exception.vaddress);
+force_sig_fault(si_signo, si_code, env->exception.vaddress);
 break;
 case EXCP_DEBUG:
 case EXCP_BKPT:
diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
index ae09adcb95..01cb6eb534 100644
--- a/linux-user/arm/cpu_loop.c
+++ b/linux-user/arm/cpu_loop.c
@@ -25,6 +25,7 @@
 #include "cpu_loop-common.h"
 #include "signal-common.h"
 #include "semihosting/common-semi.h"
+#include "target/arm/syndrome.h"
 
 #define get_user_code_u32(x, gaddr, env)\
 ({ abi_long __r = get_user_u32((x), (gaddr));   \
@@ -280,7 +281,7 @@ static bool emulate_arm_fpa11(CPUARMState *env, uint32_t opcode)
 void cpu_loop(CPUARMState *env)
 {
 CPUState *cs = env_cpu(env);
-int trapnr;
+int trapnr, si_signo, si_code;
 unsigned int n, insn;
 abi_ulong ret;
 
@@ -423,9 +424,30 @@ void cpu_loop(CPUARMState *env)
 break;
 case EXCP_PREFETCH_ABORT:
 case EXCP_DATA_ABORT:
-/* XXX: check env->error_code */
-force_sig_fault(TARGET_SIGSEGV, TARGET_SEGV_MAPERR,
-env->exception.vaddress);
+/* For user-only we don't set TTBCR_EAE, so look at the FSR. */
+switch (env->exception.fsr & 0x1f) {
+case 0x1: /* Alignment */
+si_signo = TARGET_SIGBUS;
+si_code = TARGET_BUS_ADRALN;
+break;
+case 0x3: /* Access flag fault, level 1 */
+case 0x6: /* Access flag fault, level 2 */
+case 0x9: /* Domain fault, level 1 */
+case 0xb: /* Domain fault, level 2 */
+case 0xd: /* Permission fault, level 1 */
+case 0xf: /* Permission fault, level 2 */
+si_signo = TARGET_SIGSEGV;
+si_code = TARGET_SEGV_ACCERR;
+break;
+case 0x5: /* Translation fault, level 1 */
+case 0x7: /* Translation fault, level 2 */
+si_signo = TARGET_SIGSEGV;
+si_code = TARGET_SEGV_MAPERR;
+break;
+default:
+g_assert_not_reached();
+}
+force_sig_fault(si_signo, si_code, env->exception.vaddress);
 break;
 case EXCP_DEBUG:
 case 

Re: [PATCH 1/2] hw/core/machine: Split out smp_parse as an inline API

2021-10-12 Thread wangyanan (Y)



On 2021/10/12 22:36, Markus Armbruster wrote:

"wangyanan (Y)"  writes:


Hi Markus,

On 2021/10/11 13:26, Markus Armbruster wrote:

Yanan Wang  writes:


Functionally, smp_parse() is only called in one place,
i.e. machine_set_smp(); the only other likely place it would be
called from is a unit test, if any.

Actually we are going to introduce an unit test for the parser.
For necessary isolation of the tested code, split smp_parse out
into a separate header as an inline API.

Why inline?

The motivation for the split is to isolate the tested smp_parse
from the other, unrelated code in machine.c, so that we can solve
the build-dependency problem for the unit test.

I once tried to split smp_parse out into a separate source file in [1]
for the test, but it looks more concise and convenient to make it an
inline function in a header compared to [1]. Given that we only call
it in one place, it may not be harmful to keep it inline.

Anyway, I'm not sure the method in this patch is the most appropriate
and compliant. If it's just wrong I can change back to [1]. :)

[1]
https://lore.kernel.org/qemu-devel/20210910073025.16480-16-wangyana...@huawei.com/#t

I'd prefer to keep it in .c, but I'm not the maintainer.


Ok, I will move it into a separate .c file in v2, which seems to fit
the coding standards better.

Thanks,
Yanan




Re: [PATCH 00/16] fdt: Make OF_BOARD a boolean option

2021-10-12 Thread Tom Rini
On Wed, Oct 13, 2021 at 09:29:14AM +0800, Bin Meng wrote:
> Hi Simon,
> 
> On Wed, Oct 13, 2021 at 9:01 AM Simon Glass  wrote:
> >
> > With Ilias' efforts we have dropped OF_PRIOR_STAGE and OF_HOSTFILE so
> > there are only three ways to obtain a devicetree:
> >
> >- OF_SEPARATE - the normal way, where the devicetree is built and
> >   appended to U-Boot
> >- OF_EMBED - for development purposes, the devicetree is embedded in
> >   the ELF file (also used for EFI)
> >- OF_BOARD - the board figures it out on its own
> >
> > The last one is currently set up so that no devicetree is needed at all
> > in the U-Boot tree. Most boards do provide one, but some don't. Some
> > don't even provide instructions on how to boot on the board.
> >
> > The problems with this approach are documented at [1].
> >
> > In practice, OF_BOARD is not really distinct from OF_SEPARATE. Any board
> > can obtain its devicetree at runtime, even it is has a devicetree built
> > in U-Boot. This is because U-Boot may be a second-stage bootloader and its
> > caller may have a better idea about the hardware available in the machine.
> > This is the case with a few QEMU boards, for example.
> >
> > So it makes no sense to have OF_BOARD as a 'choice'. It should be an
> > option, available with either OF_SEPARATE or OF_EMBED.
> >
> > This series makes this change, adding various missing devicetree files
> > (and placeholders) to make the build work.
> 
> Adding device trees that are never used sounds like a hack to me.
> 
> For QEMU, device tree is dynamically generated on the fly based on
> command line parameters, and the device tree you put in this series
> has various hardcoded  values which normally do not show up
> in hand-written dts files.
> 
> I am not sure I understand the whole point of this.

I am also confused and do not like the idea of adding device trees for
platforms that are capable of and can / do have a device tree to give us
at run time.

-- 
Tom




Re: [PATCH 00/16] fdt: Make OF_BOARD a boolean option

2021-10-12 Thread Bin Meng
Hi Simon,

On Wed, Oct 13, 2021 at 9:01 AM Simon Glass  wrote:
>
> With Ilias' efforts we have dropped OF_PRIOR_STAGE and OF_HOSTFILE so
> there are only three ways to obtain a devicetree:
>
>- OF_SEPARATE - the normal way, where the devicetree is built and
>   appended to U-Boot
>- OF_EMBED - for development purposes, the devicetree is embedded in
>   the ELF file (also used for EFI)
>- OF_BOARD - the board figures it out on its own
>
> The last one is currently set up so that no devicetree is needed at all
> in the U-Boot tree. Most boards do provide one, but some don't. Some
> don't even provide instructions on how to boot on the board.
>
> The problems with this approach are documented at [1].
>
> In practice, OF_BOARD is not really distinct from OF_SEPARATE. Any board
> can obtain its devicetree at runtime, even if it has a devicetree built
> in U-Boot. This is because U-Boot may be a second-stage bootloader and its
> caller may have a better idea about the hardware available in the machine.
> This is the case with a few QEMU boards, for example.
>
> So it makes no sense to have OF_BOARD as a 'choice'. It should be an
> option, available with either OF_SEPARATE or OF_EMBED.
>
> This series makes this change, adding various missing devicetree files
> (and placeholders) to make the build work.

Adding device trees that are never used sounds like a hack to me.

For QEMU, the device tree is dynamically generated on the fly based on
command line parameters, and the device tree you put in this series
has various hardcoded values which normally do not show up
in hand-written dts files.

I am not sure I understand the whole point of this.

>
> It also provides a few qemu clean-ups discovered along the way.
>
> This series is based on Ilias' two series for OF_HOSTFILE and
> OF_PRIOR_STAGE removal.
>
> It is available at u-boot-dm/ofb-working
>
> [1] 
> https://patchwork.ozlabs.org/project/uboot/patch/20210919215111.3830278-3-...@chromium.org/
>

Regards,
Bin



Re: [PATCH 2/2] tests/unit: Add a unit test for smp parsing

2021-10-12 Thread wangyanan (Y)



On 2021/10/12 21:51, Andrew Jones wrote:

On Sun, Oct 10, 2021 at 06:39:54PM +0800, Yanan Wang wrote:

Now that we have a generic parser smp_parse(), let's add a unit
test for the code. All possible valid/invalid SMP configurations
that the user can specify are covered.

Signed-off-by: Yanan Wang 
---
  MAINTAINERS |   1 +
  tests/unit/meson.build  |   1 +
  tests/unit/test-smp-parse.c | 613 
  3 files changed, 615 insertions(+)
  create mode 100644 tests/unit/test-smp-parse.c

diff --git a/MAINTAINERS b/MAINTAINERS
index dc9091c1d7..b5a5b1469b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1632,6 +1632,7 @@ F: include/hw/core/cpu.h
  F: include/hw/core/smp.h
  F: include/hw/cpu/cluster.h
  F: include/sysemu/numa.h
+F: tests/unit/test-smp-parse.c
  T: git https://gitlab.com/ehabkost/qemu.git machine-next
  
  Xtensa Machines

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 7c297d7e5c..0382669fcf 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -45,6 +45,7 @@ tests = {
'test-uuid': [],
'ptimer-test': ['ptimer-test-stubs.c', meson.project_source_root() / 
'hw/core/ptimer.c'],
'test-qapi-util': [],
+  'test-smp-parse': [qom],
  }
  
  if have_system or have_tools

diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c
new file mode 100644
index 00..7be258171e
--- /dev/null
+++ b/tests/unit/test-smp-parse.c
@@ -0,0 +1,613 @@
+/*
+ * SMP parsing unit-tests
+ *
+ * Copyright (c) 2021 Huawei Technologies Co., Ltd
+ *
+ * Authors:
+ *  Yanan Wang 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object.h"
+#include "qemu/module.h"
+#include "qapi/error.h"
+
+#include "hw/boards.h"
+#include "hw/core/smp.h"
+
+#define T true
+#define F false
+
+#define MIN_CPUS 1   /* set the min CPUs supported by the machine as 1 */
+#define MAX_CPUS 512 /* set the max CPUs supported by the machine as 512 */
+
+/*
+ * Used to define the generic 3-level CPU topology hierarchy
+ *  -sockets/cores/threads
+ */
+#define SMP_CONFIG_GENERIC(ha, a, hb, b, hc, c, hd, d, he, e) \
+{ \
+.has_cpus= ha, .cpus= a,  \
+.has_sockets = hb, .sockets = b,  \
+.has_cores   = hc, .cores   = c,  \
+.has_threads = hd, .threads = d,  \
+.has_maxcpus = he, .maxcpus = e,  \
+}
+
+#define CPU_TOPOLOGY_GENERIC(a, b, c, d, e)   \
+{ \
+.cpus = a,\
+.sockets  = b,\
+.cores= c,\
+.threads  = d,\
+.max_cpus = e,\
+}
+
+/*
+ * Currently a 4-level topology hierarchy is supported on PC machines
+ *  -sockets/dies/cores/threads
+ */
+#define SMP_CONFIG_WITH_DIES(ha, a, hb, b, hc, c, hd, d, he, e, hf, f) \
+{ \
+.has_cpus= ha, .cpus= a,  \
+.has_sockets = hb, .sockets = b,  \
+.has_dies= hc, .dies= c,  \
+.has_cores   = hd, .cores   = d,  \
+.has_threads = he, .threads = e,  \
+.has_maxcpus = hf, .maxcpus = f,  \
+}
+
+#define CPU_TOPOLOGY_WITH_DIES(a, b, c, d, e, f)  \
+{ \
+.cpus = a,\
+.sockets  = b,\
+.dies = c,\
+.cores= d,\
+.threads  = e,\
+.max_cpus = f,\
+}
+
+/**
+ * @config - the given SMP configuration
+ * @expect_prefer_sockets - the expected parsing result for the
+ * valid configuration, when sockets are preferred over cores
+ * @expect_prefer_cores - the expected parsing result for the
+ * valid configuration, when cores are preferred over sockets
+ * @expect_error - the expected error report when the given
+ * configuration is invalid
+ */
+typedef struct SMPTestData {
+SMPConfiguration config;
+CpuTopology expect_prefer_sockets;
+CpuTopology expect_prefer_cores;
+const char *expect_error;
+} SMPTestData;
+
+/* Type info of the tested machine */
+static const TypeInfo smp_machine_info = {
+.name = 

Re: [PATCH 02/16] arm: qemu: Explain how to extract the generated devicetree

2021-10-12 Thread François Ozog
On Wed, 13 Oct 2021 at 03:02, Simon Glass  wrote:

> QEMU currently generates a devicetree for use with U-Boot. Explain how to
> obtain it.
>
> Signed-off-by: Simon Glass 
> ---
>
>  doc/board/emulation/qemu-arm.rst | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/doc/board/emulation/qemu-arm.rst
> b/doc/board/emulation/qemu-arm.rst
> index 97b6ec64905..b458a398c69 100644
> --- a/doc/board/emulation/qemu-arm.rst
> +++ b/doc/board/emulation/qemu-arm.rst
> @@ -91,3 +91,15 @@ The debug UART on the ARM virt board uses these
> settings::
>  CONFIG_DEBUG_UART_PL010=y
>  CONFIG_DEBUG_UART_BASE=0x900
>  CONFIG_DEBUG_UART_CLOCK=0
> +
> +Obtaining the QEMU devicetree
> +-
> +
> +QEMU generates its own devicetree to pass to U-Boot and does this by
> default.
> +You can use `-dtb u-boot.dtb` to force QEMU to use U-Boot's in-tree
> version.

this is for either QEMU experts or U-Boot-on-QEMU maintainers, not for the
kernel developer, as it is a recipe for problems: could you add this warning?

>
> +
> +To obtain the devicetree that qemu generates, add `-machine
> dumpdtb=dtb.dtb`,
> +e.g.::
> +
> +qemu-system-aarch64 -machine virt -nographic -cpu cortex-a57 \
> +   -bios u-boot.bin -machine dumpdtb=dtb.dtb
> --
> 2.33.0.882.g93a45727a2-goog
>
> --
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.o...@linaro.org | Skype: ffozog


Re: [PATCH 05/16] arm: qemu: Add a devicetree file for qemu_arm64

2021-10-12 Thread François Ozog
Hi Simon

The only place I could agree with this file's presence is in the
documentation directory, not in dts. Having it in dts creates an entirely
wrong mental picture for the reader of how QEMU and DT relate.

And even in a documentation directory I would place a big warning: don't
use this with any kernel; QEMU generates a DT dynamically based on the cpu,
memory and devices specified on the command line.

I would also document how to get the DT that QEMU generates (and lkvm,
btw), independent of anything provided by firmware or the OS.
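For reference, a sequence along these lines works for the aarch64 virt
machine (a sketch only, assuming qemu-system-aarch64 and dtc are installed;
file names are illustrative):

```shell
# Have QEMU write out the devicetree it generates for this machine
# configuration; with dumpdtb it exits without running any guest code.
qemu-system-aarch64 -machine virt,dumpdtb=qemu.dtb -cpu cortex-a57 -nographic

# Decompile the blob into readable source for inspection.
dtc -I dtb -O dts qemu.dtb -o qemu.dts
```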

Cheers

FF

On Wed, 13 Oct 2021 at 03:03, Simon Glass  wrote:

> Add this file, generated from qemu, so there is a reference devicetree
> in the U-Boot tree.
>
> Signed-off-by: Simon Glass 
> ---
>
>  arch/arm/dts/Makefile|   2 +-
>  arch/arm/dts/qemu-arm64.dts  | 381 +++
>  configs/qemu_arm64_defconfig |   1 +
>  3 files changed, 383 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm/dts/qemu-arm64.dts
>
> diff --git a/arch/arm/dts/Makefile b/arch/arm/dts/Makefile
> index e2fc0cb65fc..52c586f3974 100644
> --- a/arch/arm/dts/Makefile
> +++ b/arch/arm/dts/Makefile
> @@ -1145,7 +1145,7 @@ dtb-$(CONFIG_TARGET_IMX8MM_CL_IOT_GATE) +=
> imx8mm-cl-iot-gate.dtb
>
>  dtb-$(CONFIG_TARGET_EA_LPC3250DEVKITV2) += lpc3250-ea3250.dtb
>
> -dtb-$(CONFIG_ARCH_QEMU) += qemu-arm.dtb
> +dtb-$(CONFIG_ARCH_QEMU) += qemu-arm.dtb qemu-arm64.dtb
>
>  targets += $(dtb-y)
>
> diff --git a/arch/arm/dts/qemu-arm64.dts b/arch/arm/dts/qemu-arm64.dts
> new file mode 100644
> index 000..7590e49cc84
> --- /dev/null
> +++ b/arch/arm/dts/qemu-arm64.dts
> @@ -0,0 +1,381 @@
> +// SPDX-License-Identifier: GPL-2.0+ OR MIT
> +/*
> + * Sample device tree for qemu_arm64
> +
> + * Copyright 2021 Google LLC
> + */
> +
> +/dts-v1/;
> +
> +/ {
> +   interrupt-parent = <0x8001>;
> +   #size-cells = <0x02>;
> +   #address-cells = <0x02>;
> +   compatible = "linux,dummy-virt";
> +
> +   psci {
> +   migrate = <0xc405>;
> +   cpu_on = <0xc403>;
> +   cpu_off = <0x8402>;
> +   cpu_suspend = <0xc401>;
> +   method = "hvc";
> +   compatible = "arm,psci-0.2\0arm,psci";
> +   };
> +
> +   memory@4000 {
> +   reg = <0x00 0x4000 0x00 0x800>;
> +   device_type = "memory";
> +   };
> +
> +   platform@c00 {
> +   interrupt-parent = <0x8001>;
> +   ranges = <0x00 0x00 0xc00 0x200>;
> +   #address-cells = <0x01>;
> +   #size-cells = <0x01>;
> +   compatible = "qemu,platform\0simple-bus";
> +   };
> +
> +   fw-cfg@902 {
> +   dma-coherent;
> +   reg = <0x00 0x902 0x00 0x18>;
> +   compatible = "qemu,fw-cfg-mmio";
> +   };
> +
> +   virtio_mmio@a00 {
> +   dma-coherent;
> +   interrupts = <0x00 0x10 0x01>;
> +   reg = <0x00 0xa00 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000200 {
> +   dma-coherent;
> +   interrupts = <0x00 0x11 0x01>;
> +   reg = <0x00 0xa000200 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000400 {
> +   dma-coherent;
> +   interrupts = <0x00 0x12 0x01>;
> +   reg = <0x00 0xa000400 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000600 {
> +   dma-coherent;
> +   interrupts = <0x00 0x13 0x01>;
> +   reg = <0x00 0xa000600 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000800 {
> +   dma-coherent;
> +   interrupts = <0x00 0x14 0x01>;
> +   reg = <0x00 0xa000800 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000a00 {
> +   dma-coherent;
> +   interrupts = <0x00 0x15 0x01>;
> +   reg = <0x00 0xa000a00 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000c00 {
> +   dma-coherent;
> +   interrupts = <0x00 0x16 0x01>;
> +   reg = <0x00 0xa000c00 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a000e00 {
> +   dma-coherent;
> +   interrupts = <0x00 0x17 0x01>;
> +   reg = <0x00 0xa000e00 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a001000 {
> +   dma-coherent;
> +   interrupts = <0x00 0x18 0x01>;
> +   reg = <0x00 0xa001000 0x00 0x200>;
> +   compatible = "virtio,mmio";
> +   };
> +
> +   virtio_mmio@a001200 {
> +   dma-coherent;
> +   

Re: [PULL 00/10] Python patches

2021-10-12 Thread Richard Henderson

On 10/12/21 2:41 PM, John Snow wrote:

The following changes since commit bfd9a76f9c143d450ab5545dedfa74364b39fc56:

   Merge remote-tracking branch 'remotes/stsquad/tags/pull-for-6.2-121021-2' 
into staging (2021-10-12 06:16:25 -0700)

are available in the Git repository at:

   https://gitlab.com/jsnow/qemu.git tags/python-pull-request

for you to fetch changes up to c163c723ef92d0f629d015902396f2c67328b2e5:

   python, iotests: remove socket_scm_helper (2021-10-12 12:22:11 -0400)


Pull request



John Snow (10):
   python/aqmp: add greeting property to QMPClient
   python/aqmp: add .empty() method to EventListener
   python/aqmp: Return cleared events from EventListener.clear()
   python/aqmp: add send_fd_scm
   python/aqmp: Add dict conversion method to Greeting object
   python/aqmp: Reduce severity of EOFError-caused loop terminations
   python/aqmp: Disable logging messages by default
   python/qmp: clear events on get_events() call
   python/qmp: add send_fd_scm directly to QEMUMonitorProtocol
   python, iotests: remove socket_scm_helper

  tests/qemu-iotests/socket_scm_helper.c | 136 -
  python/qemu/aqmp/__init__.py   |   4 +
  python/qemu/aqmp/events.py |  15 ++-
  python/qemu/aqmp/models.py |  13 +++
  python/qemu/aqmp/protocol.py   |   7 +-
  python/qemu/aqmp/qmp_client.py |  27 +
  python/qemu/machine/machine.py |  48 ++---
  python/qemu/machine/qtest.py   |   2 -
  python/qemu/qmp/__init__.py|  27 +++--
  python/qemu/qmp/qmp_shell.py   |   1 -
  tests/Makefile.include |   1 -
  tests/meson.build  |   4 -
  tests/qemu-iotests/iotests.py  |   3 -
  tests/qemu-iotests/meson.build |   5 -
  tests/qemu-iotests/testenv.py  |   8 +-
  15 files changed, 85 insertions(+), 216 deletions(-)
  delete mode 100644 tests/qemu-iotests/socket_scm_helper.c
  delete mode 100644 tests/qemu-iotests/meson.build


Applied, thanks.

r~




[PATCH 02/16] arm: qemu: Explain how to extract the generated devicetree

2021-10-12 Thread Simon Glass
QEMU currently generates a devicetree for use with U-Boot. Explain how to
obtain it.

Signed-off-by: Simon Glass 
---

 doc/board/emulation/qemu-arm.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/doc/board/emulation/qemu-arm.rst b/doc/board/emulation/qemu-arm.rst
index 97b6ec64905..b458a398c69 100644
--- a/doc/board/emulation/qemu-arm.rst
+++ b/doc/board/emulation/qemu-arm.rst
@@ -91,3 +91,15 @@ The debug UART on the ARM virt board uses these settings::
 CONFIG_DEBUG_UART_PL010=y
 CONFIG_DEBUG_UART_BASE=0x900
 CONFIG_DEBUG_UART_CLOCK=0
+
+Obtaining the QEMU devicetree
+-
+
+QEMU generates its own devicetree to pass to U-Boot and does this by default.
+You can use `-dtb u-boot.dtb` to force QEMU to use U-Boot's in-tree version.
+
+To obtain the devicetree that qemu generates, add `-machine dumpdtb=dtb.dtb`,
+e.g.::
+
+qemu-system-aarch64 -machine virt -nographic -cpu cortex-a57 \
+   -bios u-boot.bin -machine dumpdtb=dtb.dtb
-- 
2.33.0.882.g93a45727a2-goog




[PATCH 04/16] arm: qemu: Add a devicetree file for qemu_arm

2021-10-12 Thread Simon Glass
Add this file, generated from qemu, so there is a reference devicetree
in the U-Boot tree.

Signed-off-by: Simon Glass 
---

 arch/arm/dts/Makefile  |   2 +
 arch/arm/dts/qemu-arm.dts  | 402 +
 configs/qemu_arm_defconfig |   1 +
 3 files changed, 405 insertions(+)
 create mode 100644 arch/arm/dts/qemu-arm.dts

diff --git a/arch/arm/dts/Makefile b/arch/arm/dts/Makefile
index b8a382d1539..e2fc0cb65fc 100644
--- a/arch/arm/dts/Makefile
+++ b/arch/arm/dts/Makefile
@@ -1145,6 +1145,8 @@ dtb-$(CONFIG_TARGET_IMX8MM_CL_IOT_GATE) += 
imx8mm-cl-iot-gate.dtb
 
 dtb-$(CONFIG_TARGET_EA_LPC3250DEVKITV2) += lpc3250-ea3250.dtb
 
+dtb-$(CONFIG_ARCH_QEMU) += qemu-arm.dtb
+
 targets += $(dtb-y)
 
 # Add any required device tree compiler flags here
diff --git a/arch/arm/dts/qemu-arm.dts b/arch/arm/dts/qemu-arm.dts
new file mode 100644
index 000..790571a9d9e
--- /dev/null
+++ b/arch/arm/dts/qemu-arm.dts
@@ -0,0 +1,402 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Sample device tree for qemu_arm
+
+ * Copyright 2021 Google LLC
+ */
+
+/dts-v1/;
+
+/ {
+   interrupt-parent = <0x8001>;
+   #size-cells = <0x02>;
+   #address-cells = <0x02>;
+   compatible = "linux,dummy-virt";
+
+   psci {
+   migrate = <0x8405>;
+   cpu_on = <0x8403>;
+   cpu_off = <0x8402>;
+   cpu_suspend = <0x8401>;
+   method = "hvc";
+   compatible = "arm,psci-0.2\0arm,psci";
+   };
+
+   memory@4000 {
+   reg = <0x00 0x4000 0x00 0x800>;
+   device_type = "memory";
+   };
+
+   platform@c00 {
+   interrupt-parent = <0x8001>;
+   ranges = <0x00 0x00 0xc00 0x200>;
+   #address-cells = <0x01>;
+   #size-cells = <0x01>;
+   compatible = "qemu,platform\0simple-bus";
+   };
+
+   fw-cfg@902 {
+   dma-coherent;
+   reg = <0x00 0x902 0x00 0x18>;
+   compatible = "qemu,fw-cfg-mmio";
+   };
+
+   virtio_mmio@a00 {
+   dma-coherent;
+   interrupts = <0x00 0x10 0x01>;
+   reg = <0x00 0xa00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000200 {
+   dma-coherent;
+   interrupts = <0x00 0x11 0x01>;
+   reg = <0x00 0xa000200 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000400 {
+   dma-coherent;
+   interrupts = <0x00 0x12 0x01>;
+   reg = <0x00 0xa000400 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000600 {
+   dma-coherent;
+   interrupts = <0x00 0x13 0x01>;
+   reg = <0x00 0xa000600 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000800 {
+   dma-coherent;
+   interrupts = <0x00 0x14 0x01>;
+   reg = <0x00 0xa000800 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000a00 {
+   dma-coherent;
+   interrupts = <0x00 0x15 0x01>;
+   reg = <0x00 0xa000a00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000c00 {
+   dma-coherent;
+   interrupts = <0x00 0x16 0x01>;
+   reg = <0x00 0xa000c00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000e00 {
+   dma-coherent;
+   interrupts = <0x00 0x17 0x01>;
+   reg = <0x00 0xa000e00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001000 {
+   dma-coherent;
+   interrupts = <0x00 0x18 0x01>;
+   reg = <0x00 0xa001000 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001200 {
+   dma-coherent;
+   interrupts = <0x00 0x19 0x01>;
+   reg = <0x00 0xa001200 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001400 {
+   dma-coherent;
+   interrupts = <0x00 0x1a 0x01>;
+   reg = <0x00 0xa001400 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001600 {
+   dma-coherent;
+   interrupts = <0x00 0x1b 0x01>;
+   reg = <0x00 0xa001600 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001800 {
+   dma-coherent;
+   interrupts = <0x00 0x1c 0x01>;
+   reg = <0x00 0xa001800 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001a00 {
+   dma-coherent;
+   interrupts = <0x00 

[PATCH 03/16] riscv: qemu: Explain how to extract the generated devicetree

2021-10-12 Thread Simon Glass
QEMU currently generates a devicetree for use with U-Boot. Explain how to
obtain it.

Signed-off-by: Simon Glass 
---

 doc/board/emulation/qemu-riscv.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/doc/board/emulation/qemu-riscv.rst 
b/doc/board/emulation/qemu-riscv.rst
index 4b8e104a215..b3cf7085847 100644
--- a/doc/board/emulation/qemu-riscv.rst
+++ b/doc/board/emulation/qemu-riscv.rst
@@ -113,3 +113,15 @@ An attached disk can be emulated by adding::
 -device ide-hd,drive=mydisk,bus=ahci.0
 
 You will have to run 'scsi scan' to use it.
+
+Obtaining the QEMU devicetree
+-
+
+QEMU generates its own devicetree to pass to U-Boot and does this by default.
+You can use `-dtb u-boot.dtb` to force QEMU to use U-Boot's in-tree version.
+
+To obtain the devicetree that qemu generates, add `-machine dumpdtb=dtb.dtb`,
+e.g.::
+
+qemu-system-riscv64 -nographic -machine virt -bios u-boot \
+   -machine dumpdtb=dtb.dtb
-- 
2.33.0.882.g93a45727a2-goog




[PATCH 01/16] arm: qemu: Mention -nographic in the docs

2021-10-12 Thread Simon Glass
Without this option QEMU appears to hang. Add it to avoid confusion.

Signed-off-by: Simon Glass 
---

 doc/board/emulation/qemu-arm.rst | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/doc/board/emulation/qemu-arm.rst b/doc/board/emulation/qemu-arm.rst
index 8d7fda10f15..97b6ec64905 100644
--- a/doc/board/emulation/qemu-arm.rst
+++ b/doc/board/emulation/qemu-arm.rst
@@ -41,14 +41,15 @@ The minimal QEMU command line to get U-Boot up and running 
is:
 
 - For ARM::
 
-qemu-system-arm -machine virt -bios u-boot.bin
+qemu-system-arm -machine virt -nographic -bios u-boot.bin
 
 - For AArch64::
 
-qemu-system-aarch64 -machine virt -cpu cortex-a57 -bios u-boot.bin
+qemu-system-aarch64 -machine virt -nographic -cpu cortex-a57 -bios 
u-boot.bin
 
 Note that for some odd reason qemu-system-aarch64 needs to be explicitly
-told to use a 64-bit CPU or it will boot in 32-bit mode.
+told to use a 64-bit CPU or it will boot in 32-bit mode. The -nographic 
argument
+ensures that output appears on the terminal. Use Ctrl-A X to quit.
 
 Additional persistent U-boot environment support can be added as follows:
 
-- 
2.33.0.882.g93a45727a2-goog




[PATCH 06/16] riscv: qemu: Add devicetree files for qemu_riscv32/64

2021-10-12 Thread Simon Glass
Add these files, generated from qemu, so there is a reference devicetree
in the U-Boot tree.

Split the existing qemu-virt into two, since we need a different
devicetree for 32- and 64-bit machines.

Signed-off-by: Simon Glass 
---

 arch/riscv/dts/Makefile  |   2 +-
 arch/riscv/dts/qemu-virt.dts |   8 -
 arch/riscv/dts/qemu-virt32.dts   | 217 +++
 arch/riscv/dts/qemu-virt64.dts   | 217 +++
 configs/qemu-riscv32_defconfig   |   1 +
 configs/qemu-riscv32_smode_defconfig |   1 +
 configs/qemu-riscv32_spl_defconfig   |   2 +-
 configs/qemu-riscv64_defconfig   |   1 +
 configs/qemu-riscv64_smode_defconfig |   1 +
 configs/qemu-riscv64_spl_defconfig   |   2 +-
 10 files changed, 441 insertions(+), 11 deletions(-)
 delete mode 100644 arch/riscv/dts/qemu-virt.dts
 create mode 100644 arch/riscv/dts/qemu-virt32.dts
 create mode 100644 arch/riscv/dts/qemu-virt64.dts

diff --git a/arch/riscv/dts/Makefile b/arch/riscv/dts/Makefile
index b6e9166767b..90d3f35e6e3 100644
--- a/arch/riscv/dts/Makefile
+++ b/arch/riscv/dts/Makefile
@@ -2,7 +2,7 @@
 
 dtb-$(CONFIG_TARGET_AX25_AE350) += ae350_32.dtb ae350_64.dtb
 dtb-$(CONFIG_TARGET_MICROCHIP_ICICLE) += microchip-mpfs-icicle-kit.dtb
-dtb-$(CONFIG_TARGET_QEMU_VIRT) += qemu-virt.dtb
+dtb-$(CONFIG_TARGET_QEMU_VIRT) += qemu-virt32.dtb qemu-virt64.dtb
 dtb-$(CONFIG_TARGET_OPENPITON_RISCV64) += openpiton-riscv64.dtb
 dtb-$(CONFIG_TARGET_SIFIVE_UNLEASHED) += hifive-unleashed-a00.dtb
 dtb-$(CONFIG_TARGET_SIFIVE_UNMATCHED) += hifive-unmatched-a00.dtb
diff --git a/arch/riscv/dts/qemu-virt.dts b/arch/riscv/dts/qemu-virt.dts
deleted file mode 100644
index fecff542b91..000
--- a/arch/riscv/dts/qemu-virt.dts
+++ /dev/null
@@ -1,8 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0+
-/*
- * Copyright (C) 2021, Bin Meng 
- */
-
-/dts-v1/;
-
-#include "binman.dtsi"
diff --git a/arch/riscv/dts/qemu-virt32.dts b/arch/riscv/dts/qemu-virt32.dts
new file mode 100644
index 000..3c449413523
--- /dev/null
+++ b/arch/riscv/dts/qemu-virt32.dts
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2021, Bin Meng 
+ */
+
+/dts-v1/;
+
+#include "binman.dtsi"
+
+/ {
+   #address-cells = <0x02>;
+   #size-cells = <0x02>;
+   compatible = "riscv-virtio";
+   model = "riscv-virtio,qemu";
+
+   fw-cfg@1010 {
+   dma-coherent;
+   reg = <0x00 0x1010 0x00 0x18>;
+   compatible = "qemu,fw-cfg-mmio";
+   };
+
+   flash@2000 {
+   bank-width = <0x04>;
+   reg = <0x00 0x2000 0x00 0x200
+   0x00 0x2200 0x00 0x200>;
+   compatible = "cfi-flash";
+   };
+
+   chosen {
+   bootargs = [00];
+   stdout-path = "/soc/uart@1000";
+   };
+
+   memory@8000 {
+   device_type = "memory";
+   reg = <0x00 0x8000 0x00 0x800>;
+   };
+
+   cpus {
+   #address-cells = <0x01>;
+   #size-cells = <0x00>;
+   timebase-frequency = <0x989680>;
+
+   cpu@0 {
+   phandle = <0x01>;
+   device_type = "cpu";
+   reg = <0x00>;
+   status = "okay";
+   compatible = "riscv";
+   riscv,isa = "rv32imafdcsu";
+   mmu-type = "riscv,sv32";
+
+   interrupt-controller {
+   #interrupt-cells = <0x01>;
+   interrupt-controller;
+   compatible = "riscv,cpu-intc";
+   phandle = <0x02>;
+   };
+   };
+
+   cpu-map {
+
+   cluster0 {
+
+   core0 {
+   cpu = <0x01>;
+   };
+   };
+   };
+   };
+
+   soc {
+   #address-cells = <0x02>;
+   #size-cells = <0x02>;
+   compatible = "simple-bus";
+   ranges;
+
+   rtc@101000 {
+   interrupts = <0x0b>;
+   interrupt-parent = <0x03>;
+   reg = <0x00 0x101000 0x00 0x1000>;
+   compatible = "google,goldfish-rtc";
+   };
+
+   uart@1000 {
+   interrupts = <0x0a>;
+   interrupt-parent = <0x03>;
+   clock-frequency = <0x384000>;
+   reg = <0x00 0x1000 0x00 0x100>;
+   compatible = "ns16550a";
+   };
+
+   poweroff {
+   value = <0x>;
+   offset = <0x00>;
+   regmap = <0x04>;
+   compatible = 

[PATCH 05/16] arm: qemu: Add a devicetree file for qemu_arm64

2021-10-12 Thread Simon Glass
Add this file, generated from qemu, so there is a reference devicetree
in the U-Boot tree.

Signed-off-by: Simon Glass 
---

 arch/arm/dts/Makefile|   2 +-
 arch/arm/dts/qemu-arm64.dts  | 381 +++
 configs/qemu_arm64_defconfig |   1 +
 3 files changed, 383 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/dts/qemu-arm64.dts

diff --git a/arch/arm/dts/Makefile b/arch/arm/dts/Makefile
index e2fc0cb65fc..52c586f3974 100644
--- a/arch/arm/dts/Makefile
+++ b/arch/arm/dts/Makefile
@@ -1145,7 +1145,7 @@ dtb-$(CONFIG_TARGET_IMX8MM_CL_IOT_GATE) += 
imx8mm-cl-iot-gate.dtb
 
 dtb-$(CONFIG_TARGET_EA_LPC3250DEVKITV2) += lpc3250-ea3250.dtb
 
-dtb-$(CONFIG_ARCH_QEMU) += qemu-arm.dtb
+dtb-$(CONFIG_ARCH_QEMU) += qemu-arm.dtb qemu-arm64.dtb
 
 targets += $(dtb-y)
 
diff --git a/arch/arm/dts/qemu-arm64.dts b/arch/arm/dts/qemu-arm64.dts
new file mode 100644
index 000..7590e49cc84
--- /dev/null
+++ b/arch/arm/dts/qemu-arm64.dts
@@ -0,0 +1,381 @@
+// SPDX-License-Identifier: GPL-2.0+ OR MIT
+/*
+ * Sample device tree for qemu_arm64
+
+ * Copyright 2021 Google LLC
+ */
+
+/dts-v1/;
+
+/ {
+   interrupt-parent = <0x8001>;
+   #size-cells = <0x02>;
+   #address-cells = <0x02>;
+   compatible = "linux,dummy-virt";
+
+   psci {
+   migrate = <0xc405>;
+   cpu_on = <0xc403>;
+   cpu_off = <0x8402>;
+   cpu_suspend = <0xc401>;
+   method = "hvc";
+   compatible = "arm,psci-0.2\0arm,psci";
+   };
+
+   memory@4000 {
+   reg = <0x00 0x4000 0x00 0x800>;
+   device_type = "memory";
+   };
+
+   platform@c00 {
+   interrupt-parent = <0x8001>;
+   ranges = <0x00 0x00 0xc00 0x200>;
+   #address-cells = <0x01>;
+   #size-cells = <0x01>;
+   compatible = "qemu,platform\0simple-bus";
+   };
+
+   fw-cfg@902 {
+   dma-coherent;
+   reg = <0x00 0x902 0x00 0x18>;
+   compatible = "qemu,fw-cfg-mmio";
+   };
+
+   virtio_mmio@a00 {
+   dma-coherent;
+   interrupts = <0x00 0x10 0x01>;
+   reg = <0x00 0xa00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000200 {
+   dma-coherent;
+   interrupts = <0x00 0x11 0x01>;
+   reg = <0x00 0xa000200 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000400 {
+   dma-coherent;
+   interrupts = <0x00 0x12 0x01>;
+   reg = <0x00 0xa000400 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000600 {
+   dma-coherent;
+   interrupts = <0x00 0x13 0x01>;
+   reg = <0x00 0xa000600 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000800 {
+   dma-coherent;
+   interrupts = <0x00 0x14 0x01>;
+   reg = <0x00 0xa000800 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000a00 {
+   dma-coherent;
+   interrupts = <0x00 0x15 0x01>;
+   reg = <0x00 0xa000a00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000c00 {
+   dma-coherent;
+   interrupts = <0x00 0x16 0x01>;
+   reg = <0x00 0xa000c00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a000e00 {
+   dma-coherent;
+   interrupts = <0x00 0x17 0x01>;
+   reg = <0x00 0xa000e00 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001000 {
+   dma-coherent;
+   interrupts = <0x00 0x18 0x01>;
+   reg = <0x00 0xa001000 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001200 {
+   dma-coherent;
+   interrupts = <0x00 0x19 0x01>;
+   reg = <0x00 0xa001200 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001400 {
+   dma-coherent;
+   interrupts = <0x00 0x1a 0x01>;
+   reg = <0x00 0xa001400 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001600 {
+   dma-coherent;
+   interrupts = <0x00 0x1b 0x01>;
+   reg = <0x00 0xa001600 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001800 {
+   dma-coherent;
+   interrupts = <0x00 0x1c 0x01>;
+   reg = <0x00 0xa001800 0x00 0x200>;
+   compatible = "virtio,mmio";
+   };
+
+   virtio_mmio@a001a00 {
+   dma-coherent;
+  

[PATCH 00/16] fdt: Make OF_BOARD a boolean option

2021-10-12 Thread Simon Glass
With Ilias' efforts we have dropped OF_PRIOR_STAGE and OF_HOSTFILE so
there are only three ways to obtain a devicetree:

   - OF_SEPARATE - the normal way, where the devicetree is built and
  appended to U-Boot
   - OF_EMBED - for development purposes, the devicetree is embedded in
  the ELF file (also used for EFI)
   - OF_BOARD - the board figures it out on its own

The last one is currently set up so that no devicetree is needed at all
in the U-Boot tree. Most boards do provide one, but some don't. Some
don't even provide instructions on how to boot on the board.

The problems with this approach are documented at [1].

In practice, OF_BOARD is not really distinct from OF_SEPARATE. Any board
can obtain its devicetree at runtime, even if it has a devicetree built
into U-Boot. This is because U-Boot may be a second-stage bootloader and its
caller may have a better idea about the hardware available in the machine.
This is the case with a few QEMU boards, for example.

So it makes no sense to have OF_BOARD as a 'choice'. It should be an
option, available with either OF_SEPARATE or OF_EMBED.

This series makes this change, adding various missing devicetree files
(and placeholders) to make the build work.
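The Kconfig shape of the change would be roughly the following (a sketch
only; the exact prompt, help text and dependencies in dts/Kconfig may
differ):

```kconfig
config OF_BOARD
	bool "Allow the board to provide or fix up the devicetree at runtime"
	help
	  With this option, the board can replace or adjust the control
	  devicetree that was linked in via OF_SEPARATE or OF_EMBED, for
	  example when a prior-stage bootloader passes a better one in.
```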

It also provides a few qemu clean-ups discovered along the way.

This series is based on Ilias' two series for OF_HOSTFILE and
OF_PRIOR_STAGE removal.

It is available at u-boot-dm/ofb-working

[1] 
https://patchwork.ozlabs.org/project/uboot/patch/20210919215111.3830278-3-...@chromium.org/


Simon Glass (16):
  arm: qemu: Mention -nographic in the docs
  arm: qemu: Explain how to extract the generated devicetree
  riscv: qemu: Explain how to extract the generated devicetree
  arm: qemu: Add a devicetree file for qemu_arm
  arm: qemu: Add a devicetree file for qemu_arm64
  riscv: qemu: Add devicetree files for qemu_riscv32/64
  arm: rpi: Add a devicetree file for rpi_4
  arm: vexpress: Add a devicetree file for juno
  arm: xenguest_arm64: Add a fake devicetree file
  arm: octeontx: Add a fake devicetree file
  arm: xilinx_versal_virt: Add a devicetree file
  arm: bcm7xxx: Add a devicetree file
  arm: qemu-ppce500: Add a devicetree file
  arm: highbank: Add a fake devicetree file
  fdt: Make OF_BOARD a bool option
  Drop CONFIG_BINMAN_STANDALONE_FDT

 Makefile   |3 +-
 arch/arm/dts/Makefile  |   20 +-
 arch/arm/dts/bcm2711-rpi-4-b.dts   | 1958 
 arch/arm/dts/bcm7xxx.dts   |   15 +
 arch/arm/dts/highbank.dts  |   14 +
 arch/arm/dts/juno-r2.dts   | 1038 +
 arch/arm/dts/octeontx.dts  |   14 +
 arch/arm/dts/qemu-arm.dts  |  402 +
 arch/arm/dts/qemu-arm64.dts|  381 +
 arch/arm/dts/xenguest-arm64.dts|   15 +
 arch/arm/dts/xilinx-versal-virt.dts|  307 
 arch/powerpc/dts/Makefile  |1 +
 arch/powerpc/dts/qemu-ppce500.dts  |  264 
 arch/riscv/dts/Makefile|2 +-
 arch/riscv/dts/qemu-virt.dts   |8 -
 arch/riscv/dts/qemu-virt32.dts |  217 +++
 arch/riscv/dts/qemu-virt64.dts |  217 +++
 configs/bcm7260_defconfig  |1 +
 configs/bcm7445_defconfig  |1 +
 configs/highbank_defconfig |2 +-
 configs/octeontx2_95xx_defconfig   |1 +
 configs/octeontx2_96xx_defconfig   |1 +
 configs/octeontx_81xx_defconfig|1 +
 configs/octeontx_83xx_defconfig|1 +
 configs/qemu-ppce500_defconfig |2 +
 configs/qemu-riscv32_defconfig |1 +
 configs/qemu-riscv32_smode_defconfig   |1 +
 configs/qemu-riscv32_spl_defconfig |4 +-
 configs/qemu-riscv64_defconfig |1 +
 configs/qemu-riscv64_smode_defconfig   |1 +
 configs/qemu-riscv64_spl_defconfig |3 +-
 configs/qemu_arm64_defconfig   |1 +
 configs/qemu_arm_defconfig |1 +
 configs/rpi_4_32b_defconfig|1 +
 configs/rpi_4_defconfig|1 +
 configs/rpi_arm64_defconfig|1 +
 configs/vexpress_aemv8a_juno_defconfig |1 +
 configs/xenguest_arm64_defconfig   |1 +
 configs/xilinx_versal_virt_defconfig   |1 +
 doc/board/emulation/qemu-arm.rst   |   19 +-
 doc/board/emulation/qemu-riscv.rst |   12 +
 dts/Kconfig|   27 +-
 tools/binman/binman.rst|   20 -
 43 files changed, 4922 insertions(+), 61 deletions(-)
 create mode 100644 arch/arm/dts/bcm2711-rpi-4-b.dts
 create mode 100644 arch/arm/dts/bcm7xxx.dts
 create mode 100644 arch/arm/dts/highbank.dts
 create mode 100644 arch/arm/dts/juno-r2.dts
 create mode 100644 arch/arm/dts/octeontx.dts
 create mode 100644 arch/arm/dts/qemu-arm.dts
 create mode 100644 arch/arm/dts/qemu-arm64.dts
 create mode 100644 arch/arm/dts/xenguest-arm64.dts
 create mode 100644 arch/arm/dts/xilinx-versal-virt.dts
 create mode 100644 

Re: [PATCH v4 00/11] virtio-iommu: Add ACPI support

2021-10-12 Thread Haiwei Li
On Tue, Oct 12, 2021 at 1:34 AM Jean-Philippe Brucker
 wrote:
>
> Hi Haiwei,
>
> On Mon, Oct 11, 2021 at 06:10:07PM +0800, Haiwei Li wrote:
> [...]
> > Gave up waiting for root file system device.  Common problems:
> >  - Boot args (cat /proc/cmdline)
> >- Check rootdelay= (did the system wait long enough?)
> >  - Missing modules (cat /proc/modules; ls /dev)
> > ALERT!  UUID=3caf26b5-4d08-43e0-8634-7573269c4f70 does not exist.
> > Dropping to a shell!
> >
> > Any suggestions? Thanks.
>
> It's possible that the rootfs is on a disk behind the IOMMU, and the IOMMU
> driver doesn't get loaded. That could happen, for example, if the
> virtio-iommu module is not present in the initramfs. Since IOMMU drivers
> are typically built into the kernel rather than modules, distro tools that
> build the initramfs might not pick up IOMMU modules. I'm guessing this
> could be the issue here because of the hints and "Dropping to a shell"
> line.
>
> The clean solution will be to patch the initramfs tools to learn about
> IOMMU drivers (I'm somewhat working on that). In the meantime, if this is
> indeed the problem, you could try explicitly adding the virtio-iommu
> module to the initramfs, or building the kernel with CONFIG_VIRTIO_IOMMU=y
> rather than =m, though that requires VIRTIO and VIRTIO_PCI to be built-in
> as well.

Thanks, Jean. It works.

--
Haiwei



Re: [PATCH 1/3] vfio-ccw: step down as maintainer

2021-10-12 Thread Matthew Rosato

On 10/12/21 10:40 AM, Cornelia Huck wrote:

I currently don't have time to act as vfio-ccw maintainer anymore,
so remove myself there.

Signed-off-by: Cornelia Huck 


Once again, thanks for all of your work on vfio-ccw.

Acked-by: Matthew Rosato 


---
  MAINTAINERS | 2 --
  1 file changed, 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 50435b8d2f50..14d131294156 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1862,7 +1862,6 @@ F: docs/igd-assign.txt
  F: docs/devel/vfio-migration.rst
  
  vfio-ccw

-M: Cornelia Huck 
  M: Eric Farman 
  M: Matthew Rosato 
  S: Supported
@@ -1870,7 +1869,6 @@ F: hw/vfio/ccw.c
  F: hw/s390x/s390-ccw.c
  F: include/hw/s390x/s390-ccw.h
  F: include/hw/s390x/vfio-ccw.h
-T: git https://gitlab.com/cohuck/qemu.git s390-next
  L: qemu-s3...@nongnu.org
  
  vfio-ap







Re: [PATCH 1/2] numa: Set default distance map if needed

2021-10-12 Thread Gavin Shan
Added Rob to CC.



As explained above.




or use ACPI tables, which can
 describe memory-less NUMA nodes, if fixing how DT is
 parsed is unfeasible.


We use ACPI already for our guests, but we also generate a DT (which
edk2 consumes). We can't generate a valid DT when empty numa nodes

does edk2 actually use numa info from QEMU?


are put on the command line unless we follow a DT spec saying how
to do that. The current spec says we should have a distance-map
that contains those nodes.


can you point out to the spec and place within it, pls?



https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20211012&id=58ae0b51506802713aa0e9956d1853ba4c722c98
("Documentation, dt, numa: Add note to empty NUMA node")

Thanks,
Gavin




Re: [RFC PATCH v2 09/16] hw/core/machine: Remove the dynamic sysbus devices type check

2021-10-12 Thread Alistair Francis
On Thu, Sep 23, 2021 at 2:23 AM Damien Hedde  wrote:
>
> Now that we check sysbus device types during device creation, we
> can remove the check done in the machine init done notifier.
> This was the only thing done by this notifier, so we remove the
> whole sysbus_notifier structure of the MachineState.
>
> Note: This notifier was checking all /peripheral and /peripheral-anon
> sysbus devices. Now we only check those added by -device cli option or
> device_add qmp command when handling the command/option. So if there
> are some devices added in one of these containers manually (eg in
> machine C code), these will not be checked anymore.
> This use case does not seem to appear apart from
> hw/xen/xen-legacy-backend.c (it uses qdev_set_id() and in this case,
> not for a sysbus device, so it's ok).
>
> Signed-off-by: Damien Hedde 

Acked-by: Alistair Francis 

Alistair

> ---
>  include/hw/boards.h |  1 -
>  hw/core/machine.c   | 27 ---
>  2 files changed, 28 deletions(-)
>
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 934443c1cd..ccbc40355a 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -311,7 +311,6 @@ typedef struct CpuTopology {
>  struct MachineState {
>  /*< private >*/
>  Object parent_obj;
> -Notifier sysbus_notifier;
>
>  /*< public >*/
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 1a18912dc8..521438e90a 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -571,18 +571,6 @@ bool 
> machine_class_is_dynamic_sysbus_dev_allowed(MachineClass *mc,
>  return allowed;
>  }
>
> -static void validate_sysbus_device(SysBusDevice *sbdev, void *opaque)
> -{
> -MachineState *machine = opaque;
> -MachineClass *mc = MACHINE_GET_CLASS(machine);
> -
> -if (!device_is_dynamic_sysbus(mc, DEVICE(sbdev))) {
> -error_report("Option '-device %s' cannot be handled by this machine",
> - object_class_get_name(object_get_class(OBJECT(sbdev))));
> -exit(1);
> -}
> -}
> -
>  static char *machine_get_memdev(Object *obj, Error **errp)
>  {
>  MachineState *ms = MACHINE(obj);
> @@ -598,17 +586,6 @@ static void machine_set_memdev(Object *obj, const char 
> *value, Error **errp)
>  ms->ram_memdev_id = g_strdup(value);
>  }
>
> -static void machine_init_notify(Notifier *notifier, void *data)
> -{
> -MachineState *machine = MACHINE(qdev_get_machine());
> -
> -/*
> - * Loop through all dynamically created sysbus devices and check if they 
> are
> - * all allowed.  If a device is not allowed, error out.
> - */
> -foreach_dynamic_sysbus_device(validate_sysbus_device, machine);
> -}
> -
>  HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine)
>  {
>  int i;
> @@ -1030,10 +1007,6 @@ static void machine_initfn(Object *obj)
>  "Table (HMAT)");
>  }
>
> -/* Register notifier when init is done for sysbus sanity checks */
> -ms->sysbus_notifier.notify = machine_init_notify;
> -qemu_add_machine_init_done_notifier(&ms->sysbus_notifier);
> -
>  /* default to mc->default_cpus */
>  ms->smp.cpus = mc->default_cpus;
>  ms->smp.max_cpus = mc->default_cpus;
> --
> 2.33.0
>
>



Re: [PATCH 1/2] numa: Set default distance map if needed

2021-10-12 Thread Gavin Shan

Hi Drew and Igor,

On 10/13/21 12:05 AM, Andrew Jones wrote:

On Tue, Oct 12, 2021 at 02:34:30PM +0200, Igor Mammedov wrote:

On Tue, 12 Oct 2021 13:48:02 +0200

On Tue, Oct 12, 2021 at 09:31:55PM +1100, Gavin Shan wrote:

On 10/12/21 8:40 PM, Igor Mammedov wrote:

On Wed,  6 Oct 2021 18:22:08 +0800
Gavin Shan  wrote:
   

The following option is used to specify the distance map. It's
possible the option isn't provided by user. In this case, the
distance map isn't populated and exposed to platform. On the
other hand, the empty NUMA node, where no memory resides, is
allowed on ARM64 virt platform. For these empty NUMA nodes,
their corresponding device-tree nodes aren't populated, but
their NUMA IDs should be included in the "/distance-map"
device-tree node, so that kernel can probe them properly if
device-tree is used.

-numa dist,src=<source>,dst=<destination>,val=<distance>

So when user doesn't specify distance map, we need to generate
the default distance map, where the local and remote distances
are 10 and 20 separately. This adds an extra parameter to the
existing complete_init_numa_distance() to generate the default
distance map for this case.

Signed-off-by: Gavin Shan 
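
For illustration only (not part of the patch): the default symmetric map
described above — local distance 10, remote distance 20 — can be sketched
in a few lines. The function name and dict layout here are hypothetical,
not taken from the QEMU source:

```python
def default_distance_map(node_ids, local=10, remote=20):
    """Build a full NxN distance map with ACPI SLIT-style defaults.

    Hypothetical helper: each node is distance 10 from itself and
    distance 20 from every other node, mirroring the defaults
    discussed in the patch description above.
    """
    return {(src, dst): local if src == dst else remote
            for src in node_ids for dst in node_ids}
```

This is the map the kernel expects to find in "/distance-map" when empty
NUMA nodes are present and the user has not supplied distances.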



how about error-ing out if distance map is required but
not provided by user explicitly and asking user to fix
command line?

Reasoning behind this is that defaults are hard to maintain
and will require compat hacks, becoming road blocks down
the road.
Approach I was taking with generic NUMA code, is deprecating
defaults and replacing them with sanity checks, which bail
out on incorrect configuration and ask user to correct command line.
Hence I dislike approach taken in this patch.

If you really wish to provide default, push it out of
generic code into ARM specific one
(then I won't oppose it that much (I think PPC does
some magic like this))
Also behavior seems to be ARM specific so generic
NUMA code isn't a place for it anyways
   


Thanks for your comments.

Yep, let's move the logic into hw/arm/virt in v3 because I think simply
error-ing out will block the existing configuration where the distance
map isn't provided by user. After moving the logic to hw/arm/virt,
this patch is consistent with PATCH[02/02] and the specific platform
is affected only.


Please don't move anything NUMA DT generic to hw/arm/virt. If the spec
isn't arch-specific, then the modeling shouldn't be either.




If you want to error-out for all configs missing the distance map, then
you'll need compat code.



If you only want to error-out for configs that
have empty NUMA nodes and are missing a distance map, then you don't
need compat code, because those configs never worked before anyway.


I think memory-less configs without distance map worked for x86 just fine.


Ah, yes, we should make the condition for erroring-out be

  have-memoryless-nodes && !have-distance-map && generate-DT

ACPI only architectures, x86, don't need to care about this.



Sure, I will change the code accordingly in v3. Thanks for discussing
it through with Igor :)



After looking at this thread all over again it seems to me that using
distance map as a source of numa ids is a mistake.


You'll have to discuss that with Rob Herring, as that was his proposal.
He'll expect a counterproposal though, which we don't have...



However, getting the NUMA node IDs from PCI host bridge and CPUs isn't
working out. I will explain in another thread.

Thanks,
Gavin




[PATCH v3 6/7] python/aqmp: Create sync QMP wrapper for iotests

2021-10-12 Thread John Snow
This is a wrapper around the async QMPClient that mimics the old,
synchronous QEMUMonitorProtocol class. It is designed to be
interchangeable with the old implementation.

It does not, however, attempt to mimic Exception compatibility.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/legacy.py | 138 +
 1 file changed, 138 insertions(+)
 create mode 100644 python/qemu/aqmp/legacy.py

diff --git a/python/qemu/aqmp/legacy.py b/python/qemu/aqmp/legacy.py
new file mode 100644
index 000..9e7b9fb80b9
--- /dev/null
+++ b/python/qemu/aqmp/legacy.py
@@ -0,0 +1,138 @@
+"""
+Sync QMP Wrapper
+
+This class pretends to be qemu.qmp.QEMUMonitorProtocol.
+"""
+
+import asyncio
+from typing import (
+Awaitable,
+List,
+Optional,
+TypeVar,
+Union,
+)
+
+import qemu.qmp
+from qemu.qmp import QMPMessage, QMPReturnValue, SocketAddrT
+
+from .qmp_client import QMPClient
+
+
+# pylint: disable=missing-docstring
+
+
+class QEMUMonitorProtocol(qemu.qmp.QEMUMonitorProtocol):
+def __init__(self, address: SocketAddrT,
+ server: bool = False,
+ nickname: Optional[str] = None):
+
+# pylint: disable=super-init-not-called
+self._aqmp = QMPClient(nickname)
+self._aloop = asyncio.get_event_loop()
+self._address = address
+self._timeout: Optional[float] = None
+
+_T = TypeVar('_T')
+
+def _sync(
+self, future: Awaitable[_T], timeout: Optional[float] = None
+) -> _T:
+return self._aloop.run_until_complete(
+asyncio.wait_for(future, timeout=timeout)
+)
+
+def _get_greeting(self) -> Optional[QMPMessage]:
+if self._aqmp.greeting is not None:
+# pylint: disable=protected-access
+return self._aqmp.greeting._asdict()
+return None
+
+# __enter__ and __exit__ need no changes
+# parse_address needs no changes
+
+def connect(self, negotiate: bool = True) -> Optional[QMPMessage]:
+self._aqmp.await_greeting = negotiate
+self._aqmp.negotiate = negotiate
+
+self._sync(
+self._aqmp.connect(self._address)
+)
+return self._get_greeting()
+
+def accept(self, timeout: Optional[float] = 15.0) -> QMPMessage:
+self._aqmp.await_greeting = True
+self._aqmp.negotiate = True
+
+self._sync(
+self._aqmp.accept(self._address),
+timeout
+)
+
+ret = self._get_greeting()
+assert ret is not None
+return ret
+
+def cmd_obj(self, qmp_cmd: QMPMessage) -> QMPMessage:
+return dict(
+self._sync(
+# pylint: disable=protected-access
+
+# _raw() isn't a public API, because turning off
+# automatic ID assignment is discouraged. For
+# compatibility with iotests *only*, do it anyway.
+self._aqmp._raw(qmp_cmd, assign_id=False),
+self._timeout
+)
+)
+
+# Default impl of cmd() delegates to cmd_obj
+
+def command(self, cmd: str, **kwds: object) -> QMPReturnValue:
+return self._sync(
+self._aqmp.execute(cmd, kwds),
+self._timeout
+)
+
+def pull_event(self,
+   wait: Union[bool, float] = False) -> Optional[QMPMessage]:
+if not wait:
+# wait is False/0: "do not wait, do not except."
+if self._aqmp.events.empty():
+return None
+
+# If wait is 'True', wait forever. If wait is False/0, the events
+# queue must not be empty; but it still needs some real amount
+# of time to complete.
+timeout = None
+if wait and isinstance(wait, float):
+timeout = wait
+
+return dict(
+self._sync(
+self._aqmp.events.get(),
+timeout
+)
+)
+
+def get_events(self, wait: Union[bool, float] = False) -> List[QMPMessage]:
+events = [dict(x) for x in self._aqmp.events.clear()]
+if events:
+return events
+
+event = self.pull_event(wait)
+return [event] if event is not None else []
+
+def clear_events(self) -> None:
+self._aqmp.events.clear()
+
+def close(self) -> None:
+self._sync(
+self._aqmp.disconnect()
+)
+
+def settimeout(self, timeout: Optional[float]) -> None:
+self._timeout = timeout
+
+def send_fd_scm(self, fd: int) -> None:
+self._aqmp.send_fd_scm(fd)
-- 
2.31.1
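
The `_sync()` helper above is the heart of the wrapper: it drives a
coroutine to completion on the stored event loop, bounded by an optional
timeout. A standalone sketch of the same sync-over-async pattern (the
coroutine and variable names are illustrative, not part of the patch):

```python
import asyncio


async def fetch_answer():
    # Stand-in for an async QMP operation such as execute().
    await asyncio.sleep(0)
    return 42

loop = asyncio.new_event_loop()
try:
    # Block the calling thread until the coroutine finishes,
    # raising asyncio.TimeoutError if it takes longer than 1s.
    answer = loop.run_until_complete(
        asyncio.wait_for(fetch_answer(), timeout=1.0))
finally:
    loop.close()
```

This is what lets synchronous callers such as iotests keep their
blocking call style while the underlying client is fully asynchronous.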




Re: [PATCH 2/3] s390x/kvm: step down as maintainer

2021-10-12 Thread Halil Pasic
On Tue, 12 Oct 2021 16:40:39 +0200
Cornelia Huck  wrote:

> I'm no longer involved with KVM/s390 on the kernel side, and I don't
> have enough resources to work on the s390 KVM cpus support, so I'll
> step down.
> 
> Signed-off-by: Cornelia Huck 

Acked-by: Halil Pasic 

Thank you for your invaluable work!



Re: [RFC PATCH v2 08/16] qdev-monitor: Check sysbus device type before creating it

2021-10-12 Thread Alistair Francis
On Thu, Sep 23, 2021 at 2:53 AM Damien Hedde  wrote:
>
> Add an early check to test if the requested sysbus device type
> is allowed by the current machine before creating the device. This
> impacts both -device cli option and device_add qmp command.
>
> Before this patch, the check was done well after the device has
> been created (in a machine init done notifier). We can now report
> the error right away.
>
> Signed-off-by: Damien Hedde 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  softmmu/qdev-monitor.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
> index 47ccd90be8..f1c9242855 100644
> --- a/softmmu/qdev-monitor.c
> +++ b/softmmu/qdev-monitor.c
> @@ -40,6 +40,7 @@
>  #include "qemu/cutils.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/clock.h"
> +#include "hw/boards.h"
>
>  /*
>   * Aliases were a bad idea from the start.  Let's keep them
> @@ -268,6 +269,16 @@ static DeviceClass *qdev_get_device_class(const char 
> **driver, Error **errp)
>  return NULL;
>  }
>
> +if (object_class_dynamic_cast(oc, TYPE_SYS_BUS_DEVICE)) {
> +/* sysbus devices need to be allowed by the machine */
> +MachineClass *mc = 
> MACHINE_CLASS(object_get_class(qdev_get_machine()));
> +if (!machine_class_is_dynamic_sysbus_dev_allowed(mc, *driver)) {
> +error_setg(errp, "'%s' is not an allowed pluggable sysbus device 
> "
> + " type for the machine", *driver);
> +return NULL;
> +}
> +}
> +
>  return dc;
>  }
>
> --
> 2.33.0
>
>



[PATCH v3 3/7] python/aqmp: Remove scary message

2021-10-12 Thread John Snow
The scary message interferes with the iotests output. Coincidentally, if
iotests works by removing this, then it's good evidence that we don't
really need to scare people away from using it.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/__init__.py | 12 
 1 file changed, 12 deletions(-)

diff --git a/python/qemu/aqmp/__init__.py b/python/qemu/aqmp/__init__.py
index d1b0e4dc3d3..880d5b6fa7f 100644
--- a/python/qemu/aqmp/__init__.py
+++ b/python/qemu/aqmp/__init__.py
@@ -22,7 +22,6 @@
 # the COPYING file in the top-level directory.
 
 import logging
-import warnings
 
 from .error import AQMPError
 from .events import EventListener
@@ -31,17 +30,6 @@
 from .qmp_client import ExecInterruptedError, ExecuteError, QMPClient
 
 
-_WMSG = """
-
-The Asynchronous QMP library is currently in development and its API
-should be considered highly fluid and subject to change. It should
-not be used by any other scripts checked into the QEMU tree.
-
-Proceed with caution!
-"""
-
-warnings.warn(_WMSG, FutureWarning)
-
 # Suppress logging unless an application engages it.
 logging.getLogger('qemu.aqmp').addHandler(logging.NullHandler())
 
-- 
2.31.1




Re: [PATCH 3/3] s390x virtio-ccw machine: step down as maintainer

2021-10-12 Thread Halil Pasic
On Tue, 12 Oct 2021 16:40:40 +0200
Cornelia Huck  wrote:

> I currently don't have time to work on the s390x virtio-ccw machine
> anymore, so let's step down. (I will, however, continue as a
> maintainer for the virtio-ccw *transport*.)
> 
> Signed-off-by: Cornelia Huck 

Acked-by: Halil Pasic 

Thank you for your invaluable work!



[PATCH v3 7/7] python, iotests: replace qmp with aqmp

2021-10-12 Thread John Snow
Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the sync
wrapper from qemu.aqmp instead.

Add an escape hatch in the form of the environment variable
QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into using
the old implementation, proving that both implementations work
concurrently.

Signed-off-by: John Snow 
Reviewed-by: Hanna Reitz 
Tested-by: Hanna Reitz 
---
 python/qemu/machine/machine.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index a0cf69786b4..a487c397459 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -41,7 +41,6 @@
 )
 
 from qemu.qmp import (  # pylint: disable=import-error
-QEMUMonitorProtocol,
 QMPMessage,
 QMPReturnValue,
 SocketAddrT,
@@ -50,6 +49,12 @@
 from . import console_socket
 
 
+if os.environ.get('QEMU_PYTHON_LEGACY_QMP'):
+from qemu.qmp import QEMUMonitorProtocol
+else:
+from qemu.aqmp.legacy import QEMUMonitorProtocol
+
+
 LOG = logging.getLogger(__name__)
 
 
-- 
2.31.1




[PATCH v3 2/7] python/machine: Handle QMP errors on close more meticulously

2021-10-12 Thread John Snow
To use the AQMP backend, Machine just needs to be a little more diligent
about what happens when closing a QMP connection. The operation is no
longer a freebie in the async world; it may return errors encountered in
the async bottom half on incoming message receipt, etc.

(AQMP's disconnect, ultimately, serves as the quiescence point where all
async contexts are gathered together, and any final errors reported at
that point.)

Because async QMP continues to check for messages asynchronously, it's
almost certainly likely that the loop will have exited due to EOF after
issuing the last 'quit' command. That error will ultimately be bubbled
up when attempting to close the QMP connection. The manager class here
then is free to discard it -- if it was expected.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 48 +-
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 0bd40bc2f76..a0cf69786b4 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -342,9 +342,15 @@ def _post_shutdown(self) -> None:
 # Comprehensive reset for the failed launch case:
 self._early_cleanup()
 
-if self._qmp_connection:
-self._qmp.close()
-self._qmp_connection = None
+try:
+self._close_qmp_connection()
+except Exception as err:  # pylint: disable=broad-except
+LOG.warning(
+"Exception closing QMP connection: %s",
+str(err) if str(err) else type(err).__name__
+)
+finally:
+assert self._qmp_connection is None
 
 self._close_qemu_log_file()
 
@@ -420,6 +426,31 @@ def _launch(self) -> None:
close_fds=False)
 self._post_launch()
 
+def _close_qmp_connection(self) -> None:
+"""
+Close the underlying QMP connection, if any.
+
+Dutifully report errors that occurred while closing, but assume
+that any error encountered indicates an abnormal termination
+process and not a failure to close.
+"""
+if self._qmp_connection is None:
+return
+
+try:
+self._qmp.close()
+except EOFError:
+# EOF can occur as an Exception here when using the Async
+# QMP backend. It indicates that the server closed the
+# stream. If we successfully issued 'quit' at any point,
+# then this was expected. If the remote went away without
+# our permission, it's worth reporting that as an abnormal
+# shutdown case.
+if not (self._user_killed or self._quit_issued):
+raise
+finally:
+self._qmp_connection = None
+
 def _early_cleanup(self) -> None:
 """
 Perform any cleanup that needs to happen before the VM exits.
@@ -460,9 +491,14 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
 self._early_cleanup()
 
 if self._qmp_connection:
-if not self._quit_issued:
-# Might raise ConnectionReset
-self.qmp('quit')
+try:
+if not self._quit_issued:
+# May raise ExecInterruptedError or StateError if the
+# connection dies or has *already* died.
+self.qmp('quit')
+finally:
+# Regardless, we want to quiesce the connection.
+self._close_qmp_connection()
 
 # May raise subprocess.TimeoutExpired
 self._subp.wait(timeout=timeout)
-- 
2.31.1
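
The control flow of `_close_qmp_connection()` — swallow the EOFError only
when the hangup was expected, and always clear the reference — can be
isolated into a small sketch. The class and attribute names here are
illustrative stand-ins, not QEMUMachine internals:

```python
class Connection:
    """Minimal stand-in for a QMP connection whose close() may
    surface an EOFError raised by the async bottom half."""

    def __init__(self, quit_issued: bool):
        self._quit_issued = quit_issued
        self._conn = object()   # pretend-open connection

    def _raw_close(self):
        raise EOFError          # server closed the stream

    def close(self):
        if self._conn is None:
            return
        try:
            self._raw_close()
        except EOFError:
            # Expected if we issued 'quit'; abnormal otherwise.
            if not self._quit_issued:
                raise
        finally:
            self._conn = None   # always quiesce the reference
```

The `finally` clause is the key point: whether or not the error is
re-raised, the connection reference is cleared, matching the
`assert self._qmp_connection is None` in `_post_shutdown()` above.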




[PATCH v3 5/7] iotests: Conditionally silence certain AQMP errors

2021-10-12 Thread John Snow
AQMP likes to be very chatty about errors it encounters. In general,
this is good because it allows us to get good diagnostic information for
otherwise complex async failures.

For example, during a failed QMP connection attempt, we might see:

+ERROR:qemu.aqmp.qmp_client.qemub-2536319:Negotiation failed: EOFError
+ERROR:qemu.aqmp.qmp_client.qemub-2536319:Failed to establish session: EOFError

This might be nice in iotests output, because failure scenarios
involving the new QMP library will be spelled out plainly in the output
diffs.

For tests that are intentionally causing this scenario though, filtering
that log output could be a hassle. For now, add a context manager that
simply lets us toggle this output off during a critical region.

(Additionally, a forthcoming patch allows the use of either legacy or
async QMP to be toggled with an environment variable. In this
circumstance, we can't amend the iotest output to just always expect the
error message, either. Just suppress it for now. More rigorous log
filtering can be investigated later if/when it is deemed safe to
permanently replace the legacy QMP library.)

Signed-off-by: John Snow 
---
 tests/qemu-iotests/iotests.py | 20 +++-
 tests/qemu-iotests/tests/mirror-top-perms | 12 
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index e5fff6ddcfc..e2f9d873ada 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -30,7 +30,7 @@
 import subprocess
 import sys
 import time
-from typing import (Any, Callable, Dict, Iterable,
+from typing import (Any, Callable, Dict, Iterable, Iterator,
 List, Optional, Sequence, TextIO, Tuple, Type, TypeVar)
 import unittest
 
@@ -114,6 +114,24 @@
 sample_img_dir = os.environ['SAMPLE_IMG_DIR']
 
 
+@contextmanager
+def change_log_level(
+logger_name: str, level: int = logging.CRITICAL) -> Iterator[None]:
+"""
+Utility function for temporarily changing the log level of a logger.
+
+This can be used to silence errors that are expected or uninteresting.
+"""
+_logger = logging.getLogger(logger_name)
+current_level = _logger.level
+_logger.setLevel(level)
+
+try:
+yield
+finally:
+_logger.setLevel(current_level)
+
+
 def unarchive_sample_image(sample, fname):
 sample_fname = os.path.join(sample_img_dir, sample + '.bz2')
 with bz2.open(sample_fname) as f_in, open(fname, 'wb') as f_out:
diff --git a/tests/qemu-iotests/tests/mirror-top-perms 
b/tests/qemu-iotests/tests/mirror-top-perms
index a2d5c269d7a..0a51a613f39 100755
--- a/tests/qemu-iotests/tests/mirror-top-perms
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -26,7 +26,7 @@ from qemu.machine import machine
 from qemu.qmp import QMPConnectError
 
 import iotests
-from iotests import qemu_img
+from iotests import change_log_level, qemu_img
 
 
 image_size = 1 * 1024 * 1024
@@ -100,9 +100,13 @@ class TestMirrorTopPerms(iotests.QMPTestCase):
 self.vm_b.add_blockdev(f'file,node-name=drive0,filename={source}')
 self.vm_b.add_device('virtio-blk,drive=drive0,share-rw=on')
 try:
-self.vm_b.launch()
-print('ERROR: VM B launched successfully, this should not have '
-  'happened')
+# Silence AQMP errors temporarily.
+# TODO: Remove this and just allow the errors to be logged when
+# AQMP fully replaces QMP.
+with change_log_level('qemu.aqmp'):
+self.vm_b.launch()
+print('ERROR: VM B launched successfully, '
+  'this should not have happened')
 except (QMPConnectError, ConnectError):
 assert 'Is another process using the image' in self.vm_b.get_log()
 
-- 
2.31.1
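
Outside the patch context, the `change_log_level()` context manager
behaves like this (standalone sketch with the same shape as the helper
added to iotests.py; the logger name is arbitrary):

```python
import logging
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def change_log_level(logger_name: str,
                     level: int = logging.CRITICAL) -> Iterator[None]:
    # Temporarily raise the logger's threshold, restoring it on exit
    # even if the body raises.
    _logger = logging.getLogger(logger_name)
    current_level = _logger.level
    _logger.setLevel(level)
    try:
        yield
    finally:
        _logger.setLevel(current_level)

demo = logging.getLogger('demo.aqmp')
demo.setLevel(logging.DEBUG)
with change_log_level('demo.aqmp'):
    inside = demo.level         # CRITICAL while suppressed
after = demo.level              # restored to DEBUG
```

Because only the level is changed (not handlers), any ERROR records the
AQMP client emits inside the `with` block are simply filtered out, which
is exactly what mirror-top-perms needs for its intentional failure.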




[PATCH v3 4/7] iotests: Accommodate async QMP Exception classes

2021-10-12 Thread John Snow
(But continue to support the old ones for now, too.)

There are very few cases of any user of QEMUMachine or a subclass
thereof relying on a QMP Exception type. If you'd like to check for
yourself, you want to grep for all of the derivatives of QMPError,
excluding 'AQMPError' and its derivatives. That'd be these:

- QMPError
- QMPConnectError
- QMPCapabilitiesError
- QMPTimeoutError
- QMPProtocolError
- QMPResponseError
- QMPBadPortError


Signed-off-by: John Snow 
---
 scripts/simplebench/bench_block_job.py| 3 ++-
 tests/qemu-iotests/tests/mirror-top-perms | 5 +++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/scripts/simplebench/bench_block_job.py 
b/scripts/simplebench/bench_block_job.py
index 4f03c121697..a403c35b08f 100755
--- a/scripts/simplebench/bench_block_job.py
+++ b/scripts/simplebench/bench_block_job.py
@@ -28,6 +28,7 @@
 sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
 from qemu.machine import QEMUMachine
 from qemu.qmp import QMPConnectError
+from qemu.aqmp import ConnectError
 
 
 def bench_block_job(cmd, cmd_args, qemu_args):
@@ -49,7 +50,7 @@ def bench_block_job(cmd, cmd_args, qemu_args):
 vm.launch()
 except OSError as e:
 return {'error': 'popen failed: ' + str(e)}
-except (QMPConnectError, socket.timeout):
+except (QMPConnectError, ConnectError, socket.timeout):
 return {'error': 'qemu failed: ' + str(vm.get_log())}
 
 try:
diff --git a/tests/qemu-iotests/tests/mirror-top-perms 
b/tests/qemu-iotests/tests/mirror-top-perms
index 3d475aa3a54..a2d5c269d7a 100755
--- a/tests/qemu-iotests/tests/mirror-top-perms
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -21,8 +21,9 @@
 
 import os
 
-from qemu import qmp
+from qemu.aqmp import ConnectError
 from qemu.machine import machine
+from qemu.qmp import QMPConnectError
 
 import iotests
 from iotests import qemu_img
@@ -102,7 +103,7 @@ class TestMirrorTopPerms(iotests.QMPTestCase):
 self.vm_b.launch()
 print('ERROR: VM B launched successfully, this should not have '
   'happened')
-except qmp.QMPConnectError:
+except (QMPConnectError, ConnectError):
 assert 'Is another process using the image' in self.vm_b.get_log()
 
 result = self.vm.qmp('block-job-cancel',
-- 
2.31.1




[PATCH v3 1/7] python/machine: remove has_quit argument

2021-10-12 Thread John Snow
If we spy on the QMP commands instead, we don't need callers to remember
to pass it. Seems like a fair trade-off.

The one slightly weird bit is overloading this instance variable for
wait(), where we use it to mean "don't issue the qmp 'quit'
command". This means that wait() will "fail" if the QEMU process does
not terminate of its own accord.

In most cases, we probably did already actually issue quit -- some
iotests do this -- but in some others, we may be waiting for QEMU to
terminate for some other reason, such as a test wherein we tell the
guest (directly) to shut down.

Signed-off-by: John Snow 
Reviewed-by: Hanna Reitz 
---
 python/qemu/machine/machine.py | 34 +++---
 tests/qemu-iotests/040 |  7 +--
 tests/qemu-iotests/218 |  2 +-
 tests/qemu-iotests/255 |  2 +-
 4 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 056d340e355..0bd40bc2f76 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -170,6 +170,7 @@ def __init__(self,
 self._console_socket: Optional[socket.socket] = None
 self._remove_files: List[str] = []
 self._user_killed = False
+self._quit_issued = False
 
 def __enter__(self: _T) -> _T:
 return self
@@ -368,6 +369,7 @@ def _post_shutdown(self) -> None:
 command = ''
 LOG.warning(msg, -int(exitcode), command)
 
+self._quit_issued = False
 self._user_killed = False
 self._launched = False
 
@@ -443,15 +445,13 @@ def _hard_shutdown(self) -> None:
 self._subp.kill()
 self._subp.wait(timeout=60)
 
-def _soft_shutdown(self, timeout: Optional[int],
-   has_quit: bool = False) -> None:
+def _soft_shutdown(self, timeout: Optional[int]) -> None:
 """
 Perform early cleanup, attempt to gracefully shut down the VM, and wait
 for it to terminate.
 
 :param timeout: Timeout in seconds for graceful shutdown.
 A value of None is an infinite wait.
-:param has_quit: When True, don't attempt to issue 'quit' QMP command
 
 :raise ConnectionReset: On QMP communication errors
 :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for
@@ -460,21 +460,19 @@ def _soft_shutdown(self, timeout: Optional[int],
 self._early_cleanup()
 
 if self._qmp_connection:
-if not has_quit:
+if not self._quit_issued:
 # Might raise ConnectionReset
-self._qmp.cmd('quit')
+self.qmp('quit')
 
 # May raise subprocess.TimeoutExpired
 self._subp.wait(timeout=timeout)
 
-def _do_shutdown(self, timeout: Optional[int],
- has_quit: bool = False) -> None:
+def _do_shutdown(self, timeout: Optional[int]) -> None:
 """
 Attempt to shutdown the VM gracefully; fallback to a hard shutdown.
 
 :param timeout: Timeout in seconds for graceful shutdown.
 A value of None is an infinite wait.
-:param has_quit: When True, don't attempt to issue 'quit' QMP command
 
 :raise AbnormalShutdown: When the VM could not be shut down gracefully.
 The inner exception will likely be ConnectionReset or
@@ -482,13 +480,13 @@ def _do_shutdown(self, timeout: Optional[int],
 may result in its own exceptions, likely subprocess.TimeoutExpired.
 """
 try:
-self._soft_shutdown(timeout, has_quit)
+self._soft_shutdown(timeout)
 except Exception as exc:
 self._hard_shutdown()
 raise AbnormalShutdown("Could not perform graceful shutdown") \
 from exc
 
-def shutdown(self, has_quit: bool = False,
+def shutdown(self,
  hard: bool = False,
  timeout: Optional[int] = 30) -> None:
 """
@@ -498,7 +496,6 @@ def shutdown(self, has_quit: bool = False,
 If the VM has not yet been launched, or shutdown(), wait(), or kill()
 have already been called, this method does nothing.
 
-:param has_quit: When true, do not attempt to issue 'quit' QMP command.
 :param hard: When true, do not attempt graceful shutdown, and
  suppress the SIGKILL warning log message.
 :param timeout: Optional timeout in seconds for graceful shutdown.
@@ -512,7 +509,7 @@ def shutdown(self, has_quit: bool = False,
 self._user_killed = True
 self._hard_shutdown()
 else:
-self._do_shutdown(timeout, has_quit)
+self._do_shutdown(timeout)
 finally:
 self._post_shutdown()
 
@@ -529,7 +526,8 @@ def wait(self, timeout: Optional[int] = 30) -> None:
 :param timeout: Optional timeout in seconds. Default 30 seconds.
  

[PATCH v3 0/7] Switch iotests to using Async QMP

2021-10-12 Thread John Snow
Based-on: <20211012214152.802483-1-js...@redhat.com>
  [PULL 00/10] Python patches
GitLab: https://gitlab.com/jsnow/qemu/-/commits/python-aqmp-iotest-wrapper
CI: https://gitlab.com/jsnow/qemu/-/pipelines/387210591

Hiya,

This series continues where the last two AQMP series left off and adds a
synchronous 'legacy' wrapper around the new AQMP interface, then drops
it straight into iotests to prove that AQMP is functional and totally
cool and fine. The disruption and churn to iotests is pretty minimal.

In the event that a regression happens and I am not physically proximate
enough to inflict damage upon it, you may set the QEMU_PYTHON_LEGACY_QMP
environment variable to any non-empty string to re-engage the QMP machinery
you are used to.
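
A minimal sketch of how such a switch is typically read (the variable name comes from the paragraph above; the helper itself is illustrative, not the series' actual code):

```python
import os

def use_legacy_qmp() -> bool:
    # Any non-empty string selects the legacy synchronous QMP path;
    # unset or empty selects the new AQMP machinery.
    return bool(os.environ.get('QEMU_PYTHON_LEGACY_QMP'))
```

e.g. `QEMU_PYTHON_LEGACY_QMP=1 ./check -qcow2 040` would run an iotest with the old machinery.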

I'd like to try and get this committed early in the 6.2 development
cycle to give ample time to smooth over any possible regressions. I've
tested it locally and via gitlab CI, across Python versions 3.6 through
3.10, and "worksforme". If something bad happens, we can revert the
actual switch-flip very trivially.

V3:

001/7:[] [--] 'python/machine: remove has_quit argument'
002/7:[0002] [FC] 'python/machine: Handle QMP errors on close more meticulously'
003/7:[] [--] 'python/aqmp: Remove scary message'
004/7:[0006] [FC] 'iotests: Accommodate async QMP Exception classes'
005/7:[0003] [FC] 'iotests: Conditionally silence certain AQMP errors'
006/7:[0009] [FC] 'python/aqmp: Create sync QMP wrapper for iotests'
007/7:[] [--] 'python, iotests: replace qmp with aqmp'

002: Account for force-kill cases, too.
003: Shuffled earlier into the series to prevent a mid-series regression.
004: Rewrite the imports to be less "heterogeneous" ;)
005: Add in a TODO for me to trip over in the future.
006: Fix a bug surfaced by a new iotest where waiting with pull_event for a
 timeout of 0.0 will cause a timeout exception to be raised even if there
 was an event ready to be read.
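
The 006 fix can be sketched generically: when the caller asks for a timeout of 0.0, check the event queue non-blockingly before letting the timeout machinery raise (illustrative code, not the actual aqmp implementation):

```python
import asyncio

async def pull_event(queue: asyncio.Queue, timeout: float):
    # With timeout == 0.0, asyncio.wait_for() can time out even when an
    # event is already queued; return any ready event first instead.
    if timeout == 0.0:
        try:
            return queue.get_nowait()
        except asyncio.QueueEmpty:
            raise asyncio.TimeoutError("no event ready")
    return await asyncio.wait_for(queue.get(), timeout)
```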

V2:

001/17:[] [--] 'python/aqmp: add greeting property to QMPClient'
002/17:[] [--] 'python/aqmp: add .empty() method to EventListener'
003/17:[] [--] 'python/aqmp: Return cleared events from EventListener.clear()'
004/17:[0007] [FC] 'python/aqmp: add send_fd_scm'
005/17:[down] 'python/aqmp: Add dict conversion method to Greeting object'
006/17:[down] 'python/aqmp: Reduce severity of EOFError-caused loop terminations'
007/17:[down] 'python/aqmp: Disable logging messages by default'

008/17:[0002] [FC] 'python/qmp: clear events on get_events() call'
009/17:[] [--] 'python/qmp: add send_fd_scm directly to QEMUMonitorProtocol'
010/17:[] [--] 'python, iotests: remove socket_scm_helper'
011/17:[0013] [FC] 'python/machine: remove has_quit argument'
012/17:[down] 'python/machine: Handle QMP errors on close more meticulously'

013/17:[0009] [FC] 'iotests: Accommodate async QMP Exception classes'
014/17:[down] 'iotests: Conditionally silence certain AQMP errors'

015/17:[0016] [FC] 'python/aqmp: Create sync QMP wrapper for iotests'
016/17:[0002] [FC] 'python/aqmp: Remove scary message'
017/17:[] [--] 'python, iotests: replace qmp with aqmp'

- Rebased on jsnow/python, which was recently rebased on origin/master.
- Make aqmp's send_fd_scm method bark if the socket isn't AF_UNIX (Hanna)
- Uh... modify send_fd_scm so it doesn't break when Python 3.11 comes out ...
  See the commit message for more detail.
- Drop the "python/aqmp: Create MessageModel and StandaloneModel class"
  patch and replace with a far simpler method that just adds an
  _asdict() method.
- Add patches 06 and 07 to change how the AQMP library handles logging.
- Adjust docstring in patch 08 (Hanna)
- Rename "_has_quit" attribute to "_quit_issued" (Hanna)
- Renamed patch 12, simplified the logic in _soft_shutdown a tiny bit.
- Fixed bad exception handling logic in 13 (Hanna)
- Introduce a helper in patch 14 to silence log output when it's unwanted.
- Small addition of _get_greeting() helper in patch 15, coinciding with the
  new patch 05 here.
- Contextual changes in 16.

John Snow (7):
  python/machine: remove has_quit argument
  python/machine: Handle QMP errors on close more meticulously
  python/aqmp: Remove scary message
  iotests: Accommodate async QMP Exception classes
  iotests: Conditionally silence certain AQMP errors
  python/aqmp: Create sync QMP wrapper for iotests
  python, iotests: replace qmp with aqmp

 python/qemu/aqmp/__init__.py  |  12 --
 python/qemu/aqmp/legacy.py| 138 ++
 python/qemu/machine/machine.py|  85 +
 scripts/simplebench/bench_block_job.py|   3 +-
 tests/qemu-iotests/040|   7 +-
 tests/qemu-iotests/218|   2 +-
 tests/qemu-iotests/255|   2 +-
 tests/qemu-iotests/iotests.py |  20 +++-
 tests/qemu-iotests/tests/mirror-top-perms |  17 ++-
 9 files changed, 238 insertions(+), 48 deletions(-)
 create mode 100644 python/qemu/aqmp/legacy.py

-- 

Re: [PATCH v2 15/23] target/openrisc: Drop checks for singlestep_enabled

2021-10-12 Thread Philippe Mathieu-Daudé
On 10/12/21 18:21, Richard Henderson wrote:
> GDB single-stepping is now handled generically.
> 
> Signed-off-by: Richard Henderson 
> ---
>  target/openrisc/translate.c | 18 +++---
>  1 file changed, 3 insertions(+), 15 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v2 02/23] target/alpha: Drop checks for singlestep_enabled

2021-10-12 Thread Philippe Mathieu-Daudé
On 10/12/21 18:21, Richard Henderson wrote:
> GDB single-stepping is now handled generically.
> 
> Signed-off-by: Richard Henderson 
> ---
>  target/alpha/translate.c | 13 +++--
>  1 file changed, 3 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v2 05/23] target/hexagon: Drop checks for singlestep_enabled

2021-10-12 Thread Philippe Mathieu-Daudé
On 10/12/21 18:21, Richard Henderson wrote:
> GDB single-stepping is now handled generically.
> 
> Signed-off-by: Richard Henderson 
> ---
>  target/hexagon/translate.c | 12 ++--
>  1 file changed, 2 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 


