date:20220307

On Tue, Mar 08, 2022 at 03:14:35PM +0800, Jason Wang wrote:
> On Tue, Mar 8, 2022 at 3:11 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Mar 08, 2022 at 02:03:32PM +0800, Jason Wang wrote:
> > >
> > > On 2022/3/7 at 11:33 PM, Eugenio Pérez wrote:
> > > > This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This
> > > > is intended as a new method of tracking the memory the devices touch
> > > > during a migration process: instead of relying on the vhost device's
> > > > dirty logging capability, SVQ intercepts the VQ dataplane, forwarding
> > > > the descriptors between VM and device. This way qemu is the effective
> > > > writer of the guest's memory, like in qemu's virtio device operation.
> > > >
> > > > When SVQ is enabled qemu offers a new virtual address space to the
> > > > device to read and write into, and it maps new vrings and the guest
> > > > memory in it. SVQ also intercepts kicks and calls between the device
> > > > and the guest. Relaying used buffers causes the dirty memory to be
> > > > tracked.
> > > >
> > > > This effectively means that vDPA device passthrough is intercepted by
> > > > qemu. While SVQ should only be enabled at migration time, the switching
> > > > from regular mode to SVQ mode is left for a future series.
> > > >
> > > > It is based on the ideas of DPDK SW assisted LM, in the series of
> > > > DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this series
> > > > does not map the shadow vq in the guest's VA, but in qemu's.
> > > >
> > > > For qemu to use shadow virtqueues the guest virtio driver must not use
> > > > features like event_idx.
> > > >
> > > > SVQ needs to be enabled with cmdline:
> > > >
> > > > -netdev type=vhost-vdpa,vhostdev=vhost-vdpa-0,id=vhost-vdpa0,svq=on
> >
> > A stable API for an incomplete feature is a problem imho.
> 
> It should be "x-svq".


Well look at patch 15.

> >
> >
> > > >
> > > > The first three patches enable notifications forwarding with
> > > > assistance of qemu. It's easy to enable only this if the relevant
> > > > cmdline part of the last patch is applied on top of these.
> > > >
> > > > The next four patches implement the actual buffer forwarding. However,
> > > > addresses are not translated from HVA, so they will need a host device
> > > > with an iommu allowing them to access all of the HVA range.
> > > >
> > > > The last part of the series properly uses the host iommu, so qemu
> > > > creates a new iova address space in the device's range and translates
> > > > the buffers in it. Finally, it adds the cmdline parameter.
> > > >
> > > > Some simple performance tests with netperf were done. They used a nested
> > > > guest with vp_vdpa, vhost-kernel at L0 host. Starting with no svq and a
> > > > baseline average of ~9009.96Mbps:
> > > > Recv   Send    Send
> > > > Socket Socket  Message  Elapsed
> > > > Size   Size    Size     Time     Throughput
> > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > > 131072  16384  16384    30.01    9061.03
> > > > 131072  16384  16384    30.01    8962.94
> > > > 131072  16384  16384    30.01    9005.92
> > > >
> > > > Enabling SVQ buffer forwarding reduces throughput to about:
> > > > Recv   Send    Send
> > > > Socket Socket  Message  Elapsed
> > > > Size   Size    Size     Time     Throughput
> > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > > 131072  16384  16384    30.01    7689.72
> > > > 131072  16384  16384    30.00    7752.07
> > > > 131072  16384  16384    30.01    7750.30
> > > >
> > > > However, many performance improvements were left out of this series for
> > > > simplicity, so the difference should shrink in the future.
> > > >
> > > > Comments are welcome.
> > >
> > >
> > > Hi Michael:
> > >
> > > What do you think of this series? It looks good to me as a start. The
> > > feature could only be enabled as a dedicated parameter. If you're ok, I'd
> > > try to make it for 7.0.
> > >
> > > Thanks
> >
> > Well that's cutting it awfully close, and it's not really useful
> > at the current stage, is it?
> 
> This allows vDPA to be migrated when using "x-svq=on". But anyhow it's
> experimental.

it's less experimental than incomplete. It seems pretty clearly not
the way it will work down the road, we don't want svq involved
at all times.

> >
> > The IOVA trick does not feel complete either.
> 
> I don't get it. We don't use any IOVA trick as DPDK did (it reserves IOVA
> for the shadow vq). So we won't suffer from the issues DPDK has.
> 
> Thanks

Maybe I misunderstand how this all works.
I refer to all the iova_tree_alloc_map things.

> >
> > >
> > > >
> > > > TODO on future series:
> > > > * Event, indirect, packed, and other features of virtio.
> > > > * To support different sets of features between the device<->SVQ and the
> > > >   SVQ<->guest communication.
> > > > * Support of device host notifier memory regions.
> > > > * To separate buffer forwarding into its own AIO context, so we can
> > > >   throw more threads at that task and we don't need to stop the main
> > > >   event

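The cover letter's central claim is that, with SVQ, qemu itself relays used buffers and can therefore record which guest pages the device wrote. A minimal sketch of that bookkeeping follows. It is not QEMU code: the `Toy*` types, the 4 KiB page size, and the 64-page guest are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u   /* assumption: 4 KiB guest pages */
#define TOY_NPAGES     64u   /* assumption: a toy guest with 64 pages */

/* Hypothetical dirty log: one bit per guest page. */
typedef struct {
    uint64_t bits[TOY_NPAGES / 64u];
} ToyDirtyLog;

/* Hypothetical "used" element: where the device wrote and how much. */
typedef struct {
    uint64_t gpa;      /* guest physical address of the buffer */
    uint32_t written;  /* bytes the device reported as written */
} ToyUsedElem;

/* Mark every page overlapped by [gpa, gpa + len) as dirty. */
static void toy_dirty_mark(ToyDirtyLog *log, uint64_t gpa, uint32_t len)
{
    if (len == 0) {
        return;
    }
    uint64_t first = gpa >> TOY_PAGE_SHIFT;
    uint64_t last = (gpa + len - 1) >> TOY_PAGE_SHIFT;

    for (uint64_t page = first; page <= last && page < TOY_NPAGES; page++) {
        log->bits[page / 64u] |= UINT64_C(1) << (page % 64u);
    }
}

/* Return 1 if the page is marked dirty, 0 otherwise. */
static int toy_dirty_test(const ToyDirtyLog *log, uint64_t page)
{
    return (int)((log->bits[page / 64u] >> (page % 64u)) & 1u);
}

/* Relaying a used element is the point where the write becomes visible. */
static void toy_svq_relay_used(ToyDirtyLog *log, const ToyUsedElem *used)
{
    toy_dirty_mark(log, used->gpa, used->written);
    /* A real relay would now publish the element on the guest's used ring. */
}
```

A buffer that straddles a page boundary marks both pages, which is why the loop runs from the first to the last page touched rather than marking only the starting page.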
Re: [PATCH v5 15/15] vdpa: Add x-svq to NetdevVhostVDPAOptions

2022-03-07 Thread Eugenio Perez Martin

On Tue, Mar 8, 2022 at 8:11 AM Michael S. Tsirkin  wrote:
>
> On Mon, Mar 07, 2022 at 04:33:34PM +0100, Eugenio Pérez wrote:
> > Finally offering the possibility to enable SVQ from the command line.
> >
> > Signed-off-by: Eugenio Pérez 
> > ---
> >  qapi/net.json|  8 +++-
> >  net/vhost-vdpa.c | 48 
> >  2 files changed, 47 insertions(+), 9 deletions(-)
> >
> > diff --git a/qapi/net.json b/qapi/net.json
> > index 7fab2e7cd8..d626fa441c 100644
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -445,12 +445,18 @@
> >  # @queues: number of queues to be created for multiqueue vhost-vdpa
> >  #  (default: 1)
> >  #
> > +# @svq: Start device with (experimental) shadow virtqueue. (Since 7.0)
> > +#
> > +# Features:
> > +# @unstable: Member @svq is experimental.
> > +#
> >  # Since: 5.1
> >  ##
> >  { 'struct': 'NetdevVhostVDPAOptions',
> >    'data': {
> >      '*vhostdev': 'str',
> > -    '*queues':   'int' } }
> > +    '*queues':   'int',
> > +    '*svq':      {'type': 'bool', 'features' : [ 'unstable'] } } }
> >
> >  ##
> >  # @NetClientDriver:
>
> I think this should be x-svq same as other unstable features.
>

I'm fine with both, but I was pointed to the other direction at [1] and [2].

Thanks!

[1] https://patchwork.kernel.org/project/qemu-devel/patch/20220302203012.3476835-15-epere...@redhat.com/
[2] https://lore.kernel.org/qemu-devel/20220303185147.3605350-15-epere...@redhat.com/

> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > index 1e9fe47c03..c827921654 100644
> > --- a/net/vhost-vdpa.c
> > +++ b/net/vhost-vdpa.c
> > @@ -127,7 +127,11 @@ err_init:
> >  static void vhost_vdpa_cleanup(NetClientState *nc)
> >  {
> >      VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    struct vhost_dev *dev = s->vhost_vdpa.dev;
> >
> > +    if (dev && dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > +        g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > +    }
> >      if (s->vhost_net) {
> >          vhost_net_cleanup(s->vhost_net);
> >          g_free(s->vhost_net);
> > @@ -187,13 +191,23 @@ static NetClientInfo net_vhost_vdpa_info = {
> >  .check_peer_type = vhost_vdpa_check_peer_type,
> >  };
> >
> > +static int vhost_vdpa_get_iova_range(int fd,
> > +                                     struct vhost_vdpa_iova_range *iova_range)
> > +{
> > +    int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > +
> > +    return ret < 0 ? -errno : 0;
> > +}
> > +
> >  static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > -   const char *device,
> > -   const char *name,
> > -   int vdpa_device_fd,
> > -   int queue_pair_index,
> > -   int nvqs,
> > -   bool is_datapath)
> > +   const char *device,
> > +   const char *name,
> > +   int vdpa_device_fd,
> > +   int queue_pair_index,
> > +   int nvqs,
> > +   bool is_datapath,
> > +   bool svq,
> > +   VhostIOVATree *iova_tree)
> >  {
> >  NetClientState *nc = NULL;
> >  VhostVDPAState *s;
> > @@ -211,6 +225,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> >
> >      s->vhost_vdpa.device_fd = vdpa_device_fd;
> >      s->vhost_vdpa.index = queue_pair_index;
> > +    s->vhost_vdpa.shadow_vqs_enabled = svq;
> > +    s->vhost_vdpa.iova_tree = iova_tree;
> >      ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
> >      if (ret) {
> >          qemu_del_net_client(nc);
> > @@ -266,6 +282,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >      g_autofree NetClientState **ncs = NULL;
> >      NetClientState *nc;
> >      int queue_pairs, i, has_cvq = 0;
> > +    g_autoptr(VhostIOVATree) iova_tree = NULL;
> >
> >      assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> >      opts = &netdev->u.vhost_vdpa;
> > @@ -285,29 +302,44 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> >          qemu_close(vdpa_device_fd);
> >          return queue_pairs;
> >      }
> > +    if (opts->svq) {
> > +        struct vhost_vdpa_iova_range iova_range;
> > +
> > +        if (has_cvq) {
> > +            error_setg(errp, "vdpa svq does not work with cvq");
> > +            goto err_svq;
> > +        }
> > +        vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
> > +        iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> > +    }
> >
> >      ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
> >

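In the quoted hunk, the wrapper converts an ioctl failure into `-errno`, but the caller does not inspect that result before reading `iova_range.first` and `iova_range.last`. The sketch below shows one way a caller could consume such a result. It is only an illustration: the query is passed as a function pointer so the example runs without a vDPA device, and `toy_iova_range`, `toy_query_fn`, and `toy_init_iova_window` are hypothetical names, not QEMU or kernel API.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: same two-field shape as the range the ioctl fills in. */
struct toy_iova_range {
    uint64_t first;
    uint64_t last;
};

/* Stand-in for the ioctl: returns 0 on success, -1 with errno set on error. */
typedef int (*toy_query_fn)(int fd, struct toy_iova_range *out);

/* Same convention as the patch's wrapper: 0 on success, -errno on failure. */
static int toy_get_iova_range(toy_query_fn query, int fd,
                              struct toy_iova_range *out)
{
    int ret = query(fd, out);

    return ret < 0 ? -errno : 0;
}

/* Only trust *range once the query succeeded and the bounds are sane. */
static int toy_init_iova_window(toy_query_fn query, int fd,
                                struct toy_iova_range *range)
{
    int ret = toy_get_iova_range(query, fd, range);

    if (ret < 0) {
        return ret;         /* range contents are undefined here */
    }
    if (range->first > range->last) {
        return -EINVAL;     /* an inverted range cannot hold any mapping */
    }
    return 0;
}

/* Two fake backends used to exercise both paths. */
static int toy_query_ok(int fd, struct toy_iova_range *out)
{
    (void)fd;
    out->first = 0x1000;
    out->last = 0xffffffff;
    return 0;
}

static int toy_query_fail(int fd, struct toy_iova_range *out)
{
    (void)fd;
    (void)out;
    errno = ENOTTY;
    return -1;
}
```

Propagating the negative errno keeps the failure reason available to whoever reports the error, instead of building an IOVA tree from an uninitialized range.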
[PATCH v4 28/33] target/nios2: Clean up nios2_cpu_do_interrupt

Sink the bulk of the interrupt processing to the end
of the file.  All of the internal interrupt and
non-interrupt exception code shares EH processing.

Signed-off-by: Richard Henderson 
---
 target/nios2/helper.c | 100 +++---
 1 file changed, 25 insertions(+), 75 deletions(-)

diff --git a/target/nios2/helper.c b/target/nios2/helper.c
index a338d02f6b..ccf2634c9b 100644
--- a/target/nios2/helper.c
+++ b/target/nios2/helper.c
@@ -53,48 +53,25 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 {
     Nios2CPU *cpu = NIOS2_CPU(cs);
     CPUNios2State *env = &cpu->env;
+    uint32_t exception_addr = cpu->exception_addr;
+    unsigned r_ea = R_EA;
+    unsigned cr_estatus = CR_ESTATUS;
 
     switch (cs->exception_index) {
     case EXCP_IRQ:
-        assert(env->status & CR_STATUS_PIE);
-
         qemu_log_mask(CPU_LOG_INT, "interrupt at pc=%x\n", env->pc);
-
-        env->estatus = env->status;
-        env->status |= CR_STATUS_IH;
-        env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
-        nios2_crs(env)[R_EA] = env->pc + 4;
-        env->pc = cpu->exception_addr;
         break;
 
     case EXCP_TLBD:
-        if ((env->status & CR_STATUS_EH) == 0) {
+        if (env->status & CR_STATUS_EH) {
+            qemu_log_mask(CPU_LOG_INT, "TLB MISS (double) at pc=%x\n", env->pc);
+            /* Double TLB miss */
+            env->tlbmisc |= CR_TLBMISC_DBL;
+        } else {
             qemu_log_mask(CPU_LOG_INT, "TLB MISS (fast) at pc=%x\n", env->pc);
-
-            /* Fast TLB miss */
-            /* Variation from the spec. Table 3-35 of the cpu reference shows
-             * estatus not being changed for TLB miss but this appears to
-             * be incorrect. */
-            env->estatus = env->status;
-            env->status |= CR_STATUS_EH;
-            env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
             env->tlbmisc &= ~CR_TLBMISC_DBL;
             env->tlbmisc |= CR_TLBMISC_WR;
-
-            nios2_crs(env)[R_EA] = env->pc + 4;
-            env->pc = cpu->fast_tlb_miss_addr;
-        } else {
-            qemu_log_mask(CPU_LOG_INT, "TLB MISS (double) at pc=%x\n", env->pc);
-
-            /* Double TLB miss */
-            env->status |= CR_STATUS_EH;
-            env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
-            env->tlbmisc |= CR_TLBMISC_DBL;
-
-            env->pc = cpu->exception_addr;
+            exception_addr = cpu->fast_tlb_miss_addr;
         }
         break;
 
 
@@ -102,48 +79,18 @@ void nios2_cpu_do_interrupt(CPUState *cs)
     case EXCP_TLBW:
     case EXCP_TLBX:
         qemu_log_mask(CPU_LOG_INT, "TLB PERM at pc=%x\n", env->pc);
-
-        env->estatus = env->status;
-        env->status |= CR_STATUS_EH;
-        env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
-        if ((env->status & CR_STATUS_EH) == 0) {
-            env->tlbmisc |= CR_TLBMISC_WR;
-        }
-
-        nios2_crs(env)[R_EA] = env->pc + 4;
-        env->pc = cpu->exception_addr;
+        env->tlbmisc |= CR_TLBMISC_WR;
         break;
 
     case EXCP_SUPERA:
     case EXCP_SUPERI:
     case EXCP_SUPERD:
         qemu_log_mask(CPU_LOG_INT, "SUPERVISOR exception at pc=%x\n", env->pc);
-
-        if ((env->status & CR_STATUS_EH) == 0) {
-            env->estatus = env->status;
-            nios2_crs(env)[R_EA] = env->pc + 4;
-        }
-
-        env->status |= CR_STATUS_EH;
-        env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
-        env->pc = cpu->exception_addr;
         break;
 
     case EXCP_ILLEGAL:
     case EXCP_TRAP:
         qemu_log_mask(CPU_LOG_INT, "TRAP exception at pc=%x\n", env->pc);
-
-        if ((env->status & CR_STATUS_EH) == 0) {
-            env->estatus = env->status;
-            nios2_crs(env)[R_EA] = env->pc + 4;
-        }
-
-        env->status |= CR_STATUS_EH;
-        env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
-        env->pc = cpu->exception_addr;
         break;
 
     case EXCP_SEMIHOST:
@@ -154,23 +101,26 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 
     case EXCP_BREAK:
         qemu_log_mask(CPU_LOG_INT, "BREAK exception at pc=%x\n", env->pc);
-        if ((env->status & CR_STATUS_EH) == 0) {
-            env->bstatus = env->status;
-            nios2_crs(env)[R_BA] = env->pc + 4;
-        }
-
-        env->status |= CR_STATUS_EH;
-        env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
-
-        env->pc = cpu->exception_addr;
+        r_ea = R_BA;
+        cr_estatus = CR_BSTATUS;
         break;
 
     default:
-        cpu_abort(cs, "unhandled exception type=%d\n",
-                  cs->exception_index);
-        break;
+        cpu_abort(cs, "unhandled exception type=%d\n", cs->exception_index);
     }
 
+    /*
+     * Finish Internal Interrupt or Noninterrupt Exception.
+     */
+
+    if (!(env->status & CR_STATUS_EH)) {
+        env->ctrl[cr_estatus] = env->status;
+        env->crs[r_ea] = env->pc + 4;
+        env->status |= CR_STATUS_EH;
+    }
+    env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
+

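The effect of sinking the shared processing to the end of the function can be modelled in a few lines. The sketch below is a standalone toy, not the QEMU implementation: the bit positions, the `ToyCpu` type, and the final assignment of the vector to `pc` are assumptions (the diff above is cut off before showing how `exception_addr` is consumed).

```c
#include <stdint.h>

/* Illustrative bit positions; assumptions, not copied from cpu.h. */
#define TOY_ST_PIE (1u << 0)
#define TOY_ST_U   (1u << 1)
#define TOY_ST_EH  (1u << 2)

typedef struct {
    uint32_t status;   /* current status register */
    uint32_t estatus;  /* status saved on exception entry */
    uint32_t ea;       /* exception return address */
    uint32_t pc;
} ToyCpu;

/*
 * Common exception tail: save state only on the first (non-nested) entry,
 * always drop PIE and U, then jump to the chosen vector.
 */
static void toy_enter_exception(ToyCpu *cpu, uint32_t vector)
{
    if (!(cpu->status & TOY_ST_EH)) {
        cpu->estatus = cpu->status;
        cpu->ea = cpu->pc + 4;
        cpu->status |= TOY_ST_EH;
    }
    cpu->status &= ~(TOY_ST_PIE | TOY_ST_U);
    cpu->pc = vector;  /* assumption: mirrors the truncated end of the patch */
}
```

With this shape, each `case` in the switch only has to pick the vector and the save registers, and a nested entry (EH already set) leaves the previously saved state untouched.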
Re: [PATCH V7 10/29] machine: memfd-alloc option

2022-03-07 Thread Igor Mammedov

On Tue, 8 Mar 2022 01:50:11 -0500
"Michael S. Tsirkin"  wrote:

> On Mon, Mar 07, 2022 at 09:41:44AM -0500, Steven Sistare wrote:
> > On 3/4/2022 5:41 AM, Igor Mammedov wrote:  
> > > On Thu, 3 Mar 2022 12:21:15 -0500
> > > "Michael S. Tsirkin"  wrote:
> > >   
> > >> On Wed, Dec 22, 2021 at 11:05:15AM -0800, Steve Sistare wrote:  
> > >>> Allocate anonymous memory using memfd_create if the memfd-alloc machine
> > >>> option is set.
> > >>>
> > >>> Signed-off-by: Steve Sistare 
> > >>> ---
> > >>>  hw/core/machine.c   | 19 +++
> > >>>  include/hw/boards.h |  1 +
> > >>>  qemu-options.hx |  6 ++
> > >>>  softmmu/physmem.c   | 47 
> > >>> ++-
> > >>>  softmmu/vl.c|  1 +
> > >>>  trace-events|  1 +
> > >>>  util/qemu-config.c  |  4 
> > >>>  7 files changed, 70 insertions(+), 9 deletions(-)
> > >>>
> > >>> diff --git a/hw/core/machine.c b/hw/core/machine.c
> > >>> index 53a99ab..7739d88 100644
> > >>> --- a/hw/core/machine.c
> > >>> +++ b/hw/core/machine.c
> > >>> @@ -392,6 +392,20 @@ static void machine_set_mem_merge(Object *obj, 
> > >>> bool value, Error **errp)
> > >>>  ms->mem_merge = value;
> > >>>  }
> > >>>  
> > >>> +static bool machine_get_memfd_alloc(Object *obj, Error **errp)
> > >>> +{
> > >>> +MachineState *ms = MACHINE(obj);
> > >>> +
> > >>> +return ms->memfd_alloc;
> > >>> +}
> > >>> +
> > >>> +static void machine_set_memfd_alloc(Object *obj, bool value, Error 
> > >>> **errp)
> > >>> +{
> > >>> +MachineState *ms = MACHINE(obj);
> > >>> +
> > >>> +ms->memfd_alloc = value;
> > >>> +}
> > >>> +
> > >>>  static bool machine_get_usb(Object *obj, Error **errp)
> > >>>  {
> > >>>  MachineState *ms = MACHINE(obj);
> > >>> @@ -829,6 +843,11 @@ static void machine_class_init(ObjectClass *oc, 
> > >>> void *data)
> > >>>  object_class_property_set_description(oc, "mem-merge",
> > >>>  "Enable/disable memory merge support");
> > >>>  
> > >>> +object_class_property_add_bool(oc, "memfd-alloc",
> > >>> +machine_get_memfd_alloc, machine_set_memfd_alloc);
> > >>> +object_class_property_set_description(oc, "memfd-alloc",
> > >>> +"Enable/disable allocating anonymous memory using 
> > >>> memfd_create");
> > >>> +
> > >>>  object_class_property_add_bool(oc, "usb",
> > >>>  machine_get_usb, machine_set_usb);
> > >>>  object_class_property_set_description(oc, "usb",
> > >>> diff --git a/include/hw/boards.h b/include/hw/boards.h
> > >>> index 9c1c190..a57d7a0 100644
> > >>> --- a/include/hw/boards.h
> > >>> +++ b/include/hw/boards.h
> > >>> @@ -327,6 +327,7 @@ struct MachineState {
> > >>>  char *dt_compatible;
> > >>>  bool dump_guest_core;
> > >>>  bool mem_merge;
> > >>> +bool memfd_alloc;
> > >>>  bool usb;
> > >>>  bool usb_disabled;
> > >>>  char *firmware;
> > >>> diff --git a/qemu-options.hx b/qemu-options.hx
> > >>> index 7d47510..33c8173 100644
> > >>> --- a/qemu-options.hx
> > >>> +++ b/qemu-options.hx
> > >>> @@ -30,6 +30,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
> > >>>  "vmport=on|off|auto controls emulation of vmport 
> > >>> (default: auto)\n"
> > >>>  "dump-guest-core=on|off include guest memory in a 
> > >>> core dump (default=on)\n"
> > >>>  "mem-merge=on|off controls memory merge support 
> > >>> (default: on)\n"
> > >>> +"memfd-alloc=on|off controls allocating anonymous 
> > >>> guest RAM using memfd_create (default: off)\n"
> > >>
> > >> Question: are there any disadvantages associated with using
> > >> memfd_create? I guess we are using up an fd, but that seems minor.  Any
> > >> reason not to set to on by default? maybe with a fallback option to
> > >> disable that?  
> > 
> > > Old Linux host kernels, circa 4.1, do not support huge pages for shared
> > > memory. Also, the tunable to enable huge pages for shared memory is
> > > different than for anon memory, so there could be performance loss if it
> > > is not set correctly.
> > /sys/kernel/mm/transparent_hugepage/enabled
> > vs
> > /sys/kernel/mm/transparent_hugepage/shmem_enabled  
> 
> I guess we can test this when launching the VM, and select
> a good default.
> 
> > It might make sense to use memfd_create by default for the secondary 
> > segments.  
> 
> Well there's also KSM now you mention it.

Then another question: is there a downside to always using memfd_create
without any knobs being involved?

> 
> > >> I am concerned that it's actually a kind of memory backend, this flag
> > >> seems to instead be closer to the deprecated mem-prealloc. E.g.
> > >> it does not work with a mem path, does it?  
> > 
> > > One can still define a memory backend with mempath to create the main ram
> > > segment, though it must be some form of shared to work with live update.
> > > Indeed, I would expect most users to specify an explicit memory backend

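For readers unfamiliar with the mechanism under discussion, here is a minimal Linux-only sketch of allocating "anonymous" memory through `memfd_create`, so that the memory is backed by a file descriptor and mapped `MAP_SHARED`. It is not the code from the patch: `toy_alloc_shared_anon` is a hypothetical helper, and it assumes a glibc recent enough to declare `memfd_create` in `<sys/mman.h>`.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an fd-backed region of `size` bytes.
 * Returns the mapping, or NULL on failure; the fd is stored in *fd_out.
 */
static void *toy_alloc_shared_anon(const char *name, size_t size, int *fd_out)
{
    int fd = memfd_create(name, MFD_CLOEXEC);
    if (fd < 0) {
        return NULL;
    }
    /* A fresh memfd has length 0; size it before mapping. */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return ptr;
}
```

Because the pages belong to the fd rather than to a private anonymous mapping, a second mapping of the same fd observes the same contents. That is also why the shared-memory huge page tunable mentioned above (`shmem_enabled`) applies instead of the anonymous one.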
Re: [PATCH v5 00/15] vDPA shadow virtqueue

On Tue, Mar 8, 2022 at 3:28 PM Michael S. Tsirkin  wrote:
>
> On Tue, Mar 08, 2022 at 03:14:35PM +0800, Jason Wang wrote:
> > On Tue, Mar 8, 2022 at 3:11 PM Michael S. Tsirkin  wrote:
> > >
> > > On Tue, Mar 08, 2022 at 02:03:32PM +0800, Jason Wang wrote:
> > > >
> > > > On 2022/3/7 at 11:33 PM, Eugenio Pérez wrote:
> > > > > This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This
> > > > > is intended as a new method of tracking the memory the devices touch
> > > > > during a migration process: instead of relying on the vhost device's
> > > > > dirty logging capability, SVQ intercepts the VQ dataplane, forwarding
> > > > > the descriptors between VM and device. This way qemu is the effective
> > > > > writer of the guest's memory, like in qemu's virtio device operation.
> > > > >
> > > > > When SVQ is enabled qemu offers a new virtual address space to the
> > > > > device to read and write into, and it maps new vrings and the guest
> > > > > memory in it. SVQ also intercepts kicks and calls between the device
> > > > > and the guest. Relaying used buffers causes the dirty memory to be
> > > > > tracked.
> > > > >
> > > > > This effectively means that vDPA device passthrough is intercepted by
> > > > > qemu. While SVQ should only be enabled at migration time, the switching
> > > > > from regular mode to SVQ mode is left for a future series.
> > > > >
> > > > > It is based on the ideas of DPDK SW assisted LM, in the series of
> > > > > DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this series
> > > > > does not map the shadow vq in the guest's VA, but in qemu's.
> > > > >
> > > > > For qemu to use shadow virtqueues the guest virtio driver must not use
> > > > > features like event_idx.
> > > > >
> > > > > SVQ needs to be enabled with cmdline:
> > > > >
> > > > > -netdev type=vhost-vdpa,vhostdev=vhost-vdpa-0,id=vhost-vdpa0,svq=on
> > >
> > > A stable API for an incomplete feature is a problem imho.
> >
> > It should be "x-svq".
>
>
> Well look at patch 15.

It's a bug that needs to be fixed.

>
> > >
> > >
> > > > >
> > > > > The first three patches enable notifications forwarding with
> > > > > assistance of qemu. It's easy to enable only this if the relevant
> > > > > cmdline part of the last patch is applied on top of these.
> > > > >
> > > > > The next four patches implement the actual buffer forwarding. However,
> > > > > addresses are not translated from HVA, so they will need a host device
> > > > > with an iommu allowing them to access all of the HVA range.
> > > > >
> > > > > The last part of the series properly uses the host iommu, so qemu
> > > > > creates a new iova address space in the device's range and translates
> > > > > the buffers in it. Finally, it adds the cmdline parameter.
> > > > >
> > > > > Some simple performance tests with netperf were done. They used a nested
> > > > > guest with vp_vdpa, vhost-kernel at L0 host. Starting with no svq and a
> > > > > baseline average of ~9009.96Mbps:
> > > > > Recv   Send    Send
> > > > > Socket Socket  Message  Elapsed
> > > > > Size   Size    Size     Time     Throughput
> > > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > > > 131072  16384  16384    30.01    9061.03
> > > > > 131072  16384  16384    30.01    8962.94
> > > > > 131072  16384  16384    30.01    9005.92
> > > > >
> > > > > Enabling SVQ buffer forwarding reduces throughput to about:
> > > > > Recv   Send    Send
> > > > > Socket Socket  Message  Elapsed
> > > > > Size   Size    Size     Time     Throughput
> > > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > > > 131072  16384  16384    30.01    7689.72
> > > > > 131072  16384  16384    30.00    7752.07
> > > > > 131072  16384  16384    30.01    7750.30
> > > > >
> > > > > However, many performance improvements were left out of this series for
> > > > > simplicity, so the difference should shrink in the future.
> > > > >
> > > > > Comments are welcome.
> > > >
> > > >
> > > > Hi Michael:
> > > >
> > > > What do you think of this series? It looks good to me as a start. The
> > > > feature could only be enabled as a dedicated parameter. If you're ok,
> > > > I'd try to make it for 7.0.
> > > >
> > > > Thanks
> > >
> > > Well that's cutting it awfully close, and it's not really useful
> > > at the current stage, is it?
> >
> > This allows vDPA to be migrated when using "x-svq=on". But anyhow it's
> > experimental.
>
> it's less experimental than incomplete. It seems pretty clearly not
> the way it will work down the road, we don't want svq involved
> at all times.

Right, but SVQ could be used in other places, e.g. providing migration
compatibility when the destination lacks some features.

>
> > >
> > > The IOVA trick does not feel complete either.
> >
> > I don't get it. We don't use any IOVA trick as DPDK did (it reserves IOVA
> > for the shadow vq). So we won't suffer from the issues DPDK has.
> >
> > Thanks
>
> Maybe I misunderstand how this all works.
> I refer to

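Since the disagreement above is about how IOVA is allocated, a small model may help. The cover letter says qemu creates a new iova address space inside the device's usable range and maps buffers into it. The sketch below shows a first-fit search for a free hole in such a range. It is an illustration only: `ToyMap` and `toy_iova_alloc` are hypothetical names, the existing mappings are assumed sorted and non-overlapping, and this is not claimed to be how `iova_tree_alloc_map` is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping record covering [iova, iova + size). */
typedef struct {
    uint64_t iova;
    uint64_t size;
} ToyMap;

/*
 * First-fit allocation inside the device range [first, last].
 * `maps` must be sorted by iova, non-overlapping, and inside the range.
 * Returns 0 and stores the start of the hole in *out, or -1 if none fits.
 */
static int toy_iova_alloc(const ToyMap *maps, size_t n,
                          uint64_t first, uint64_t last,
                          uint64_t size, uint64_t *out)
{
    uint64_t cursor = first;

    if (size == 0) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        /* Hole between the cursor and the next existing mapping. */
        if (maps[i].iova - cursor >= size) {
            *out = cursor;
            return 0;
        }
        cursor = maps[i].iova + maps[i].size;
    }
    /* Tail hole between the last mapping and the end of the range. */
    if (cursor <= last && last - cursor >= size - 1) {
        *out = cursor;
        return 0;
    }
    return -1;
}
```

The point of contrast with the DPDK approach mentioned in the thread is that nothing is reserved up front here: every allocation is looked up against whatever is already mapped in the device's range.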
[PATCH v4 25/33] target/nios2: Implement rdprs, wrprs

Implement these out of line, so that tcg global temps
(aka the architectural registers) are synced back to
storage as required.  This makes sure that we get the
proper results when status.PRS == status.CRS.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h   |  2 ++
 target/nios2/helper.h|  2 ++
 target/nios2/op_helper.c | 12 ++
 target/nios2/translate.c | 47 ++--
 4 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index f05536e04d..efaac274aa 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -237,6 +237,8 @@ struct Nios2CPU {
 CPUNios2State env;
 
 bool mmu_present;
+bool eic_present;
+
 uint32_t pid_num_bits;
 uint32_t tlb_num_ways;
 uint32_t tlb_num_entries;
diff --git a/target/nios2/helper.h b/target/nios2/helper.h
index 02797c384d..a8edca5194 100644
--- a/target/nios2/helper.h
+++ b/target/nios2/helper.h
@@ -22,6 +22,8 @@ DEF_HELPER_FLAGS_2(raise_exception, TCG_CALL_NO_WG, noreturn, env, i32)
 
 #if !defined(CONFIG_USER_ONLY)
 DEF_HELPER_2(eret, noreturn, env, i32)
+DEF_HELPER_FLAGS_2(rdprs, TCG_CALL_NO_WG, i32, env, i32)
+DEF_HELPER_3(wrprs, void, env, i32, i32)
 DEF_HELPER_2(mmu_write_tlbacc, void, env, i32)
 DEF_HELPER_2(mmu_write_tlbmisc, void, env, i32)
 DEF_HELPER_2(mmu_write_pteaddr, void, env, i32)
diff --git a/target/nios2/op_helper.c b/target/nios2/op_helper.c
index a1554ce349..e656986e3c 100644
--- a/target/nios2/op_helper.c
+++ b/target/nios2/op_helper.c
@@ -38,4 +38,16 @@ void helper_eret(CPUNios2State *env, uint32_t new_pc)
     env->pc = new_pc;
     cpu_loop_exit(env_cpu(env));
 }
+
+uint32_t helper_rdprs(CPUNios2State *env, uint32_t regno)
+{
+    unsigned prs = FIELD_EX32(env->status, CR_STATUS, PRS);
+    return env->shadow_regs[prs][regno];
+}
+
+void helper_wrprs(CPUNios2State *env, uint32_t regno, uint32_t val)
+{
+    unsigned prs = FIELD_EX32(env->status, CR_STATUS, PRS);
+    env->shadow_regs[prs][regno] = val;
+}
 #endif /* !CONFIG_USER_ONLY */
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 57913da3c9..7730735639 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -103,6 +103,7 @@ typedef struct DisasContext {
 bool  crs0;
 TCGv  sink;
 const ControlRegState *cr_state;
+bool  eic_present;
 } DisasContext;
 
 static TCGv cpu_R[NUM_GP_REGS];
@@ -305,6 +306,27 @@ gen_i_math_logic(andhi, andi, 0, instr.imm16.u << 16)
 gen_i_math_logic(orhi , ori,  1, instr.imm16.u << 16)
 gen_i_math_logic(xorhi, xori, 1, instr.imm16.u << 16)
 
+/* rB <- prs.rA + sigma(IMM16) */
+static void rdprs(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+    if (!dc->eic_present) {
+        t_gen_helper_raise_exception(dc, EXCP_ILLEGAL);
+        return;
+    }
+    if (!gen_check_supervisor(dc)) {
+        return;
+    }
+
+#ifdef CONFIG_USER_ONLY
+    g_assert_not_reached();
+#else
+    I_TYPE(instr, code);
+    TCGv dest = dest_gpr(dc, instr.b);
+    gen_helper_rdprs(dest, cpu_env, tcg_constant_i32(instr.a));
+    tcg_gen_addi_tl(dest, dest, instr.imm16.s);
+#endif
+}
+
 /* Prototype only, defined below */
 static void handle_r_type_instr(DisasContext *dc, uint32_t code,
 uint32_t flags);
@@ -366,7 +388,7 @@ static const Nios2Instruction i_type_instructions[] = {
 INSTRUCTION_FLG(gen_stx, MO_SL),  /* stwio */
 INSTRUCTION_FLG(gen_bxx, TCG_COND_LTU),   /* bltu */
 INSTRUCTION_FLG(gen_ldx, MO_UL),  /* ldwio */
-INSTRUCTION_UNIMPLEMENTED(),  /* rdprs */
+INSTRUCTION(rdprs),   /* rdprs */
 INSTRUCTION_ILLEGAL(),
 INSTRUCTION_FLG(handle_r_type_instr, 0),  /* R-Type */
 INSTRUCTION_NOP(),/* flushd */
@@ -552,6 +574,26 @@ static void wrctl(DisasContext *dc, uint32_t code, uint32_t flags)
 #endif
 }
 
+/* prs.rC <- rA */
+static void wrprs(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+    if (!dc->eic_present) {
+        t_gen_helper_raise_exception(dc, EXCP_ILLEGAL);
+        return;
+    }
+    if (!gen_check_supervisor(dc)) {
+        return;
+    }
+
+#ifdef CONFIG_USER_ONLY
+    g_assert_not_reached();
+#else
+    R_TYPE(instr, code);
+    gen_helper_wrprs(cpu_env, tcg_constant_i32(instr.c),
+                     load_gpr(dc, instr.a));
+#endif
+}
+
 /* Comparison instructions */
 static void gen_cmpxx(DisasContext *dc, uint32_t code, uint32_t flags)
 {
@@ -690,7 +732,7 @@ static const Nios2Instruction r_type_instructions[] = {
 INSTRUCTION_ILLEGAL(),
 INSTRUCTION(slli),/* slli */
 INSTRUCTION(sll), /* sll */
-INSTRUCTION_UNIMPLEMENTED(),  /* wrprs */
+INSTRUCTION(wrprs),   /* wrprs */
 INSTRUCTION_ILLEGAL(),
 INSTRUCTION(or),

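The two helpers in this patch reduce to indexing a two-dimensional register file by the PRS field of `status`. The standalone model below shows that data path, including the `rB <- prs.rA + sigma(IMM16)` semantics from the comment on `rdprs`. The field position (bits 21:16), the number of register sets, and all `toy_*` names are assumptions for illustration; the real definitions live in the target's `cpu.h`.

```c
#include <stdint.h>

#define TOY_NUM_REG_SETS 64  /* assumption: a 6-bit PRS field */
#define TOY_NUM_GP_REGS  32
#define TOY_PRS_SHIFT    16  /* assumption: PRS occupies status[21:16] */
#define TOY_PRS_LEN      6

typedef struct {
    uint32_t status;
    uint32_t shadow_regs[TOY_NUM_REG_SETS][TOY_NUM_GP_REGS];
} ToyEnv;

/* Extract `len` bits of `value` starting at bit `shift`. */
static unsigned toy_field_ex32(uint32_t value, unsigned shift, unsigned len)
{
    return (value >> shift) & ((1u << len) - 1u);
}

/* rB <- prs.rA + sign-extended IMM16 */
static uint32_t toy_rdprs(const ToyEnv *env, unsigned regno, int32_t imm16)
{
    unsigned prs = toy_field_ex32(env->status, TOY_PRS_SHIFT, TOY_PRS_LEN);

    return env->shadow_regs[prs][regno] + (uint32_t)imm16;
}

/* prs.rC <- rA */
static void toy_wrprs(ToyEnv *env, unsigned regno, uint32_t val)
{
    unsigned prs = toy_field_ex32(env->status, TOY_PRS_SHIFT, TOY_PRS_LEN);

    env->shadow_regs[prs][regno] = val;
}
```

In this toy, the current register set is simply another row of the array, so there is nothing to synchronize. In the real translator the current set is held in TCG globals, which is why the commit message calls out the `status.PRS == status.CRS` case and routes the access through an out-of-line helper.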
[PATCH v4 27/33] target/nios2: Create EXCP_SEMIHOST for semi-hosting

Decode 'break 1' during translation, rather than doing
it again during exception processing.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h   |  1 +
 target/nios2/helper.c| 15 ++-
 target/nios2/translate.c | 17 -
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index c48daa5640..13e1d49f38 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -162,6 +162,7 @@ FIELD(CR_TLBMISC, EE, 24, 1)
 
 /* Exceptions */
 #define EXCP_BREAK    0x1000
+#define EXCP_SEMIHOST 0x1001
 #define EXCP_RESET    0
 #define EXCP_PRESET   1
 #define EXCP_IRQ      2
diff --git a/target/nios2/helper.c b/target/nios2/helper.c
index 007496b957..a338d02f6b 100644
--- a/target/nios2/helper.c
+++ b/target/nios2/helper.c
@@ -146,17 +146,14 @@ void nios2_cpu_do_interrupt(CPUState *cs)
         env->pc = cpu->exception_addr;
         break;
 
+    case EXCP_SEMIHOST:
+        qemu_log_mask(CPU_LOG_INT, "BREAK semihosting at pc=%x\n", env->pc);
+        env->pc += 4;
+        do_nios2_semihosting(env);
+        return;
+
     case EXCP_BREAK:
         qemu_log_mask(CPU_LOG_INT, "BREAK exception at pc=%x\n", env->pc);
-        /* The semihosting instruction is "break 1".  */
-        if (semihosting_enabled() &&
-            cpu_ldl_code(env, env->pc) == 0x003da07a)  {
-            qemu_log_mask(CPU_LOG_INT, "Entering semihosting\n");
-            env->pc += 4;
-            do_nios2_semihosting(env);
-            break;
-        }
-
         if ((env->status & CR_STATUS_EH) == 0) {
             env->bstatus = env->status;
             nios2_crs(env)[R_BA] = env->pc + 4;
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 7730735639..f9b84e31d7 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -33,6 +33,7 @@
 #include "exec/translator.h"
 #include "qemu/qemu-print.h"
 #include "exec/gen-icount.h"
+#include "semihosting/semihost.h"
 
 /* is_jmp field values */
 #define DISAS_JUMP    DISAS_TARGET_0 /* only pc was modified dynamically */
@@ -711,6 +712,20 @@ static void trap(DisasContext *dc, uint32_t code, uint32_t flags)
     t_gen_helper_raise_exception(dc, EXCP_TRAP);
 }
 
+static void gen_break(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+#ifndef CONFIG_USER_ONLY
+    /* The semihosting instruction is "break 1".  */
+    R_TYPE(instr, code);
+    if (semihosting_enabled() && instr.imm5 == 1) {
+        t_gen_helper_raise_exception(dc, EXCP_SEMIHOST);
+        return;
+    }
+#endif
+
+    t_gen_helper_raise_exception(dc, EXCP_BREAK);
+}
+
 static const Nios2Instruction r_type_instructions[] = {
 INSTRUCTION_ILLEGAL(),
 INSTRUCTION(eret),/* eret */
@@ -764,7 +779,7 @@ static const Nios2Instruction r_type_instructions[] = {
 INSTRUCTION(add), /* add */
 INSTRUCTION_ILLEGAL(),
 INSTRUCTION_ILLEGAL(),
-INSTRUCTION_FLG(gen_excp, EXCP_BREAK),/* break */
+INSTRUCTION(gen_break),   /* break */
 INSTRUCTION_ILLEGAL(),
 INSTRUCTION(nop), /* nop */
 INSTRUCTION_ILLEGAL(),
-- 
2.25.1

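The patch replaces a comparison of the whole instruction word against `0x003da07a` with a check of the decoded `imm5` field. The sketch below shows that the two are consistent for that encoding. It assumes the R-type layout places `imm5` in bits 10:6 of the instruction word; the `toy_*` names and the enum are hypothetical and do not correspond to QEMU's `R_TYPE` macro.

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_excp {
    TOY_EXCP_BREAK,
    TOY_EXCP_SEMIHOST,
};

/* Assumption: imm5 occupies bits 10:6 of an R-type instruction. */
static unsigned toy_r_type_imm5(uint32_t insn)
{
    return (insn >> 6) & 0x1fu;
}

/* Decide at translation time which exception a break instruction raises. */
static enum toy_excp toy_classify_break(uint32_t insn, bool semihosting_on)
{
    if (semihosting_on && toy_r_type_imm5(insn) == 1) {
        return TOY_EXCP_SEMIHOST;
    }
    return TOY_EXCP_BREAK;
}
```

Deciding at translation time means the exception handler no longer needs to re-read guest code memory to find out which kind of break it is handling.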
[PATCH v4 32/33] hw/nios2: Move memory regions into Nios2Machine

Convert to contiguous allocation, as much as possible so far.
The two timer objects are not exposed for subobject allocation.

Signed-off-by: Richard Henderson 
---
 hw/nios2/10m50_devboard.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/hw/nios2/10m50_devboard.c b/hw/nios2/10m50_devboard.c
index f245e0baa8..f4931b8a67 100644
--- a/hw/nios2/10m50_devboard.c
+++ b/hw/nios2/10m50_devboard.c
@@ -38,6 +38,11 @@
 
 struct Nios2MachineState {
     MachineState parent_obj;
+
+    MemoryRegion phys_tcm;
+    MemoryRegion phys_tcm_alias;
+    MemoryRegion phys_ram;
+    MemoryRegion phys_ram_alias;
 };
 
 #define TYPE_NIOS2_MACHINE  MACHINE_TYPE_NAME("10m50-ghrd")
@@ -51,10 +56,6 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
     Nios2CPU *cpu;
     DeviceState *dev;
     MemoryRegion *address_space_mem = get_system_memory();
-    MemoryRegion *phys_tcm = g_new(MemoryRegion, 1);
-    MemoryRegion *phys_tcm_alias = g_new(MemoryRegion, 1);
-    MemoryRegion *phys_ram = g_new(MemoryRegion, 1);
-    MemoryRegion *phys_ram_alias = g_new(MemoryRegion, 1);
     ram_addr_t tcm_base = 0x0;
     ram_addr_t tcm_size = 0x1000;    /* 1 kiB, but QEMU limit is 4 kiB */
     ram_addr_t ram_base = 0x0800;
@@ -63,22 +64,22 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
 int i;
 
 /* Physical TCM (tb_ram_1k) with alias at 0xc000 */
-memory_region_init_ram(phys_tcm, NULL, "nios2.tcm", tcm_size,
+memory_region_init_ram(&nms->phys_tcm, NULL, "nios2.tcm", tcm_size,
 &error_abort);
-memory_region_init_alias(phys_tcm_alias, NULL, "nios2.tcm.alias",
- phys_tcm, 0, tcm_size);
-memory_region_add_subregion(address_space_mem, tcm_base, phys_tcm);
+memory_region_init_alias(&nms->phys_tcm_alias, NULL, "nios2.tcm.alias",
+ &nms->phys_tcm, 0, tcm_size);
+memory_region_add_subregion(address_space_mem, tcm_base, &nms->phys_tcm);
 memory_region_add_subregion(address_space_mem, 0xc000 + tcm_base,
-phys_tcm_alias);
+&nms->phys_tcm_alias);
 
 /* Physical DRAM with alias at 0xc000 */
-memory_region_init_ram(phys_ram, NULL, "nios2.ram", ram_size,
+memory_region_init_ram(&nms->phys_ram, NULL, "nios2.ram", ram_size,
 &error_abort);
-memory_region_init_alias(phys_ram_alias, NULL, "nios2.ram.alias",
- phys_ram, 0, ram_size);
-memory_region_add_subregion(address_space_mem, ram_base, phys_ram);
+memory_region_init_alias(&nms->phys_ram_alias, NULL, "nios2.ram.alias",
+ &nms->phys_ram, 0, ram_size);
+memory_region_add_subregion(address_space_mem, ram_base, &nms->phys_ram);
 memory_region_add_subregion(address_space_mem, 0xc000 + ram_base,
-phys_ram_alias);
+&nms->phys_ram_alias);
 
 /* Create CPU -- FIXME */
 cpu = NIOS2_CPU(cpu_create(TYPE_NIOS2_CPU));
-- 
2.25.1

[PATCH v4 33/33] hw/nios2: Machine with a Vectored Interrupt Controller

From: Amir Gonnen 

Demonstrate how to use nios2 VIC on a machine.
Introduce a new machine property to attach a VIC.

When VIC is present, let the CPU know that it should use the
External Interrupt Interface instead of the Internal Interrupt Interface.
The devices on the machine are attached to the VIC and not directly to the CPU.
To allow the VIC to update the EIC fields, we set the "cpu" property of the VIC
with a reference to the nios2 cpu.

Signed-off-by: Amir Gonnen 
Message-Id: <20220303153906.2024748-6-amir.gon...@neuroblade.ai>
[rth: Put a property on the 10m50-ghrd machine, rather than
  create a new machine class.]
Signed-off-by: Richard Henderson 
---
 hw/nios2/10m50_devboard.c | 61 +--
 hw/nios2/Kconfig  |  1 +
 2 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/hw/nios2/10m50_devboard.c b/hw/nios2/10m50_devboard.c
index f4931b8a67..bdbc6539c9 100644
--- a/hw/nios2/10m50_devboard.c
+++ b/hw/nios2/10m50_devboard.c
@@ -43,6 +43,8 @@ struct Nios2MachineState {
 MemoryRegion phys_tcm_alias;
 MemoryRegion phys_ram;
 MemoryRegion phys_ram_alias;
+
+bool vic;
 };
 
 #define TYPE_NIOS2_MACHINE  MACHINE_TYPE_NAME("10m50-ghrd")
@@ -81,10 +83,40 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
 memory_region_add_subregion(address_space_mem, 0xc000 + ram_base,
 &nms->phys_ram_alias);
 
-/* Create CPU -- FIXME */
-cpu = NIOS2_CPU(cpu_create(TYPE_NIOS2_CPU));
-for (i = 0; i < 32; i++) {
-irq[i] = qdev_get_gpio_in_named(DEVICE(cpu), "IRQ", i);
+/* Create CPU.  We need to set eic_present between init and realize. */
+cpu = NIOS2_CPU(object_new(TYPE_NIOS2_CPU));
+
+/* Enable the External Interrupt Controller within the CPU. */
+cpu->eic_present = nms->vic;
+
+/* Configure new exception vectors. */
+cpu->reset_addr = 0xd400;
+cpu->exception_addr = 0xc8000120;
+cpu->fast_tlb_miss_addr = 0xc100;
+
+qdev_realize(DEVICE(cpu), NULL, &error_fatal);
+object_unref(CPU(cpu));
+
+if (nms->vic) {
+DeviceState *dev = qdev_new("nios2-vic");
+MemoryRegion *dev_mr;
+qemu_irq cpu_irq;
+
+object_property_set_link(OBJECT(dev), "cpu", OBJECT(cpu), &error_fatal);
+sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+cpu_irq = qdev_get_gpio_in_named(DEVICE(cpu), "EIC", 0);
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, cpu_irq);
+for (int i = 0; i < 32; i++) {
+irq[i] = qdev_get_gpio_in(dev, i);
+}
+
+dev_mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
+memory_region_add_subregion(address_space_mem, 0x18002000, dev_mr);
+} else {
+for (i = 0; i < 32; i++) {
+irq[i] = qdev_get_gpio_in_named(DEVICE(cpu), "IRQ", i);
+}
 }
 
 /* Register: Altera 16550 UART */
@@ -105,15 +137,22 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xe880);
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, irq[5]);
 
-/* Configure new exception vectors and reset CPU for it to take effect. */
-cpu->reset_addr = 0xd400;
-cpu->exception_addr = 0xc8000120;
-cpu->fast_tlb_miss_addr = 0xc100;
-
 nios2_load_kernel(cpu, ram_base, ram_size, nms->parent_obj.initrd_filename,
   BINARY_DEVICE_TREE_FILE, NULL);
 }
 
+static bool get_vic(Object *obj, Error **errp)
+{
+Nios2MachineState *nms = NIOS2_MACHINE(obj);
+return nms->vic;
+}
+
+static void set_vic(Object *obj, bool value, Error **errp)
+{
+Nios2MachineState *nms = NIOS2_MACHINE(obj);
+nms->vic = value;
+}
+
 static void nios2_10m50_ghrd_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
@@ -121,6 +160,10 @@ static void nios2_10m50_ghrd_class_init(ObjectClass *oc, void *data)
 mc->desc = "Altera 10M50 GHRD Nios II design";
 mc->init = nios2_10m50_ghrd_init;
 mc->is_default = true;
+
+object_class_property_add_bool(oc, "vic", get_vic, set_vic);
+object_class_property_set_description(oc, "vic",
+"Set on/off to enable/disable the Vectored Interrupt Controller");
 }
 
 static const TypeInfo nios2_10m50_ghrd_type_info = {
diff --git a/hw/nios2/Kconfig b/hw/nios2/Kconfig
index b10ea640da..4748ae27b6 100644
--- a/hw/nios2/Kconfig
+++ b/hw/nios2/Kconfig
@@ -3,6 +3,7 @@ config NIOS2_10M50
 select NIOS2
 select SERIAL
 select ALTERA_TIMER
+select NIOS2_VIC
 
 config NIOS2_GENERIC_NOMMU
 bool
-- 
2.25.1

[PATCH v4 22/33] target/nios2: Introduce dest_gpr

Constrain all references to cpu_R[] to load_gpr and dest_gpr.
This will be required for supporting shadow register sets.
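
A minimal sketch of the read/write split this patch introduces, using a
plain array in place of TCG values. The Regs type, its sink slot and the
addi helper are hypothetical; only the names R_ZERO, NUM_GP_REGS, load_gpr
and dest_gpr are borrowed from the patch, and the bodies here are not the
TCG implementations.

```c
#include <assert.h>
#include <stdint.h>

enum { R_ZERO = 0, NUM_GP_REGS = 32 };

/* Hypothetical register file plus a scratch "sink" slot; not TCG. */
typedef struct Regs {
    uint32_t r[NUM_GP_REGS];
    uint32_t sink;
} Regs;

/* Read side: r0 always reads as zero. */
uint32_t load_gpr(const Regs *s, unsigned reg)
{
    assert(reg < NUM_GP_REGS);
    return reg == R_ZERO ? 0 : s->r[reg];
}

/* Write side: a write to r0 is redirected to the sink and thus discarded. */
uint32_t *dest_gpr(Regs *s, unsigned reg)
{
    assert(reg < NUM_GP_REGS);
    return reg == R_ZERO ? &s->sink : &s->r[reg];
}

/* addi rB, rA, imm with no special case for r0 at the call site. */
void addi(Regs *s, unsigned b, unsigned a, int32_t imm)
{
    *dest_gpr(s, b) = load_gpr(s, a) + (uint32_t)imm;
}
```

Because both accessors handle r0 themselves, instruction handlers such as
the addi above no longer need their own "store to R_ZERO is ignored"
branches, which is where most of the deleted lines in the diff come from.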

Signed-off-by: Richard Henderson 
---
 target/nios2/translate.c | 144 +++
 1 file changed, 55 insertions(+), 89 deletions(-)

diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 6ff9c18502..7c2ad02685 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -100,6 +100,7 @@ typedef struct DisasContext {
 DisasContextBase  base;
 target_ulong  pc;
 int   mem_idx;
+TCGv  sink;
 const ControlRegState *cr_state;
 } DisasContext;
 
@@ -132,6 +133,18 @@ static TCGv load_gpr(DisasContext *dc, unsigned reg)
 return cpu_R[reg];
 }
 
+static TCGv dest_gpr(DisasContext *dc, unsigned reg)
+{
+assert(reg < NUM_GP_REGS);
+if (unlikely(reg == R_ZERO)) {
+if (dc->sink == NULL) {
+dc->sink = tcg_temp_new();
+}
+return dc->sink;
+}
+return cpu_R[reg];
+}
+
 static void t_gen_helper_raise_exception(DisasContext *dc,
  uint32_t index)
 {
@@ -190,7 +203,7 @@ static void jmpi(DisasContext *dc, uint32_t code, uint32_t flags)
 
 static void call(DisasContext *dc, uint32_t code, uint32_t flags)
 {
-tcg_gen_movi_tl(cpu_R[R_RA], dc->base.pc_next);
+tcg_gen_movi_tl(dest_gpr(dc, R_RA), dc->base.pc_next);
 jmpi(dc, code, flags);
 }
 
@@ -203,27 +216,10 @@ static void gen_ldx(DisasContext *dc, uint32_t code, uint32_t flags)
 I_TYPE(instr, code);
 
 TCGv addr = tcg_temp_new();
-TCGv data;
-
-/*
- * WARNING: Loads into R_ZERO are ignored, but we must generate the
- *  memory access itself to emulate the CPU precisely. Load
- *  from a protected page to R_ZERO will cause SIGSEGV on
- *  the Nios2 CPU.
- */
-if (likely(instr.b != R_ZERO)) {
-data = cpu_R[instr.b];
-} else {
-data = tcg_temp_new();
-}
+TCGv data = dest_gpr(dc, instr.b);
 
 tcg_gen_addi_tl(addr, load_gpr(dc, instr.a), instr.imm16.s);
 tcg_gen_qemu_ld_tl(data, addr, dc->mem_idx, flags);
-
-if (unlikely(instr.b == R_ZERO)) {
-tcg_temp_free(data);
-}
-
 tcg_temp_free(addr);
 }
 
@@ -253,7 +249,7 @@ static void gen_bxx(DisasContext *dc, uint32_t code, uint32_t flags)
 I_TYPE(instr, code);
 
 TCGLabel *l1 = gen_new_label();
-tcg_gen_brcond_tl(flags, cpu_R[instr.a], cpu_R[instr.b], l1);
+tcg_gen_brcond_tl(flags, load_gpr(dc, instr.a), load_gpr(dc, instr.b), l1);
 gen_goto_tb(dc, 0, dc->base.pc_next);
 gen_set_label(l1);
 gen_goto_tb(dc, 1, dc->base.pc_next + (instr.imm16.s & -4));
@@ -261,11 +257,12 @@ static void gen_bxx(DisasContext *dc, uint32_t code, uint32_t flags)
 }
 
 /* Comparison instructions */
-#define gen_i_cmpxx(fname, op3)  \
-static void (fname)(DisasContext *dc, uint32_t code, uint32_t flags) \
-{\
-I_TYPE(instr, (code));   \
-tcg_gen_setcondi_tl(flags, cpu_R[instr.b], cpu_R[instr.a], (op3));   \
+#define gen_i_cmpxx(fname, op3) \
+static void (fname)(DisasContext *dc, uint32_t code, uint32_t flags)\
+{   \
+I_TYPE(instr, (code));  \
+tcg_gen_setcondi_tl(flags, dest_gpr(dc, instr.b),   \
+load_gpr(dc, instr.a), (op3));  \
 }
 
 gen_i_cmpxx(gen_cmpxxsi, instr.imm16.s)
@@ -276,13 +273,7 @@ gen_i_cmpxx(gen_cmpxxui, instr.imm16.u)
 static void (fname)(DisasContext *dc, uint32_t code, uint32_t flags)\
 {   \
 I_TYPE(instr, (code));  \
-if (unlikely(instr.b == R_ZERO)) { /* Store to R_ZERO is ignored */ \
-return; \
-} else if (instr.a == R_ZERO) { /* MOVxI optimizations */   \
-tcg_gen_movi_tl(cpu_R[instr.b], (resimm) ? (op3) : 0);  \
-} else {\
-tcg_gen_##insn##_tl(cpu_R[instr.b], cpu_R[instr.a], (op3)); \
-}   \
+tcg_gen_##insn##_tl(dest_gpr(dc, instr.b), load_gpr(dc, instr.a), (op3)); \
 }
 
 gen_i_math_logic(addi,  addi, 1, instr.imm16.s)
@@ -383,7 +374,7 @@ static void eret(DisasContext *dc, uint32_t code, uint32_t flags)
 #ifdef CONFIG_USER_ONLY
 g_assert_not_reached();
 #else
-gen_helper_eret(cpu_env, cpu_R[R_EA]);
+gen_helper_eret(cpu_env, load_gpr(dc, R_EA));

[PATCH v4 31/33] hw/nios2: Introduce Nios2MachineState

We want to move data from the heap into Nios2MachineState,
which is not possible with DEFINE_MACHINE.

Signed-off-by: Richard Henderson 
---
 hw/nios2/10m50_devboard.c | 28 +---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/hw/nios2/10m50_devboard.c b/hw/nios2/10m50_devboard.c
index 3d1205b8bd..f245e0baa8 100644
--- a/hw/nios2/10m50_devboard.c
+++ b/hw/nios2/10m50_devboard.c
@@ -36,10 +36,18 @@
 
 #include "boot.h"
 
+struct Nios2MachineState {
+MachineState parent_obj;
+};
+
+#define TYPE_NIOS2_MACHINE  MACHINE_TYPE_NAME("10m50-ghrd")
+OBJECT_DECLARE_TYPE(Nios2MachineState, MachineClass, NIOS2_MACHINE)
+
 #define BINARY_DEVICE_TREE_FILE"10m50-devboard.dtb"
 
 static void nios2_10m50_ghrd_init(MachineState *machine)
 {
+Nios2MachineState *nms = NIOS2_MACHINE(machine);
 Nios2CPU *cpu;
 DeviceState *dev;
 MemoryRegion *address_space_mem = get_system_memory();
@@ -101,15 +109,29 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
 cpu->exception_addr = 0xc8000120;
 cpu->fast_tlb_miss_addr = 0xc100;
 
-nios2_load_kernel(cpu, ram_base, ram_size, machine->initrd_filename,
+nios2_load_kernel(cpu, ram_base, ram_size, nms->parent_obj.initrd_filename,
   BINARY_DEVICE_TREE_FILE, NULL);
 }
 
-static void nios2_10m50_ghrd_machine_init(struct MachineClass *mc)
+static void nios2_10m50_ghrd_class_init(ObjectClass *oc, void *data)
 {
+MachineClass *mc = MACHINE_CLASS(oc);
+
 mc->desc = "Altera 10M50 GHRD Nios II design";
 mc->init = nios2_10m50_ghrd_init;
 mc->is_default = true;
 }
 
-DEFINE_MACHINE("10m50-ghrd", nios2_10m50_ghrd_machine_init);
+static const TypeInfo nios2_10m50_ghrd_type_info = {
+.name  = TYPE_NIOS2_MACHINE,
+.parent= TYPE_MACHINE,
+.instance_size = sizeof(Nios2MachineState),
+.class_size= sizeof(MachineClass),
+.class_init= nios2_10m50_ghrd_class_init,
+};
+
+static void nios2_10m50_ghrd_type_init(void)
+{
+type_register_static(&nios2_10m50_ghrd_type_info);
+}
+type_init(nios2_10m50_ghrd_type_init);
-- 
2.25.1

[PATCH v4 19/33] target/nios2: Implement CR_STATUS.RSIE

Without EIC, this bit is RES1.  So set the bit at reset,
and add it to the readonly fields of CR_STATUS.
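
A minimal sketch of what "RES1 and read-only" means for guest writes. The
CtrlReg type and the ctrl_reset/ctrl_write helpers are hypothetical and do
not show how QEMU applies its read-only masks; only the bit position comes
from the FIELD(CR_STATUS, RSIE, 23, 1) definition in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 23, matching FIELD(CR_STATUS, RSIE, 23, 1). */
#define STATUS_RSIE (1u << 23)

/* Hypothetical model of a control register with a read-only mask. */
typedef struct CtrlReg {
    uint32_t value;
    uint32_t readonly;   /* bits the guest cannot change */
} CtrlReg;

/* At reset the reserved-as-one bit is set and marked read-only. */
void ctrl_reset(CtrlReg *cr)
{
    cr->readonly = STATUS_RSIE;
    cr->value = STATUS_RSIE;
}

/* A guest write only replaces the writable bits. */
void ctrl_write(CtrlReg *cr, uint32_t v)
{
    cr->value = (v & ~cr->readonly) | (cr->value & cr->readonly);
}
```

After ctrl_reset, writing zero leaves RSIE set, so software that reads the
register back always sees the bit as one.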

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 1 +
 target/nios2/cpu.c | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 7faec97d77..b418deec4c 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -108,6 +108,7 @@ FIELD(CR_STATUS, IL, 4, 6)
 FIELD(CR_STATUS, CRS, 10, 6)
 FIELD(CR_STATUS, PRS, 16, 6)
 FIELD(CR_STATUS, NMI, 22, 1)
+FIELD(CR_STATUS, RSIE, 23, 1)
 
 #define CR_STATUS_PIE  (1u << R_CR_STATUS_PIE_SHIFT)
 #define CR_STATUS_U(1u << R_CR_STATUS_U_SHIFT)
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index fbcb4da737..ed7b9f9459 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -59,9 +59,9 @@ static void nios2_cpu_reset(DeviceState *dev)
 
 #if defined(CONFIG_USER_ONLY)
 /* Start in user mode with interrupts enabled. */
-env->status = CR_STATUS_U | CR_STATUS_PIE;
+env->status = CR_STATUS_RSIE | CR_STATUS_U | CR_STATUS_PIE;
 #else
-env->status = 0;
+env->status = CR_STATUS_RSIE;
 #endif
 }
 
@@ -109,6 +109,7 @@ static void nios2_cpu_initfn(Object *obj)
 WR_REG(CR_BADADDR);
 
 /* TODO: These control registers are not present with the EIC. */
+RO_FIELD(CR_STATUS, RSIE);
 WR_REG(CR_IENABLE);
 RO_REG(CR_IPENDING);
 
-- 
2.25.1

[PATCH v4 21/33] target/nios2: Use tcg_constant_tl

Replace current uses of tcg_const_tl, and remove the frees.

Signed-off-by: Richard Henderson 
---
 target/nios2/translate.c | 36 
 1 file changed, 8 insertions(+), 28 deletions(-)

diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 38e16df459..6ff9c18502 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -98,7 +98,6 @@
 
 typedef struct DisasContext {
 DisasContextBase  base;
-TCGv_i32  zero;
 target_ulong  pc;
 int   mem_idx;
 const ControlRegState *cr_state;
@@ -124,31 +123,20 @@ static uint8_t get_opxcode(uint32_t code)
 return instr.opx;
 }
 
-static TCGv load_zero(DisasContext *dc)
+static TCGv load_gpr(DisasContext *dc, unsigned reg)
 {
-if (!dc->zero) {
-dc->zero = tcg_const_i32(0);
-}
-return dc->zero;
-}
-
-static TCGv load_gpr(DisasContext *dc, uint8_t reg)
-{
-if (likely(reg != R_ZERO)) {
-return cpu_R[reg];
-} else {
-return load_zero(dc);
+assert(reg < NUM_GP_REGS);
+if (unlikely(reg == R_ZERO)) {
+return tcg_constant_tl(0);
 }
+return cpu_R[reg];
 }
 
 static void t_gen_helper_raise_exception(DisasContext *dc,
  uint32_t index)
 {
-TCGv_i32 tmp = tcg_const_i32(index);
-
 tcg_gen_movi_tl(cpu_pc, dc->pc);
-gen_helper_raise_exception(cpu_env, tmp);
-tcg_temp_free_i32(tmp);
+gen_helper_raise_exception(cpu_env, tcg_constant_i32(index));
 dc->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -675,8 +663,8 @@ static void divu(DisasContext *dc, uint32_t code, uint32_t flags)
 
 TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
-TCGv t2 = tcg_const_tl(0);
-TCGv t3 = tcg_const_tl(1);
+TCGv t2 = tcg_constant_tl(0);
+TCGv t3 = tcg_constant_tl(1);
 
 tcg_gen_ext32u_tl(t0, load_gpr(dc, instr.a));
 tcg_gen_ext32u_tl(t1, load_gpr(dc, instr.b));
@@ -684,8 +672,6 @@ static void divu(DisasContext *dc, uint32_t code, uint32_t flags)
 tcg_gen_divu_tl(cpu_R[instr.c], t0, t1);
 tcg_gen_ext32s_tl(cpu_R[instr.c], cpu_R[instr.c]);
 
-tcg_temp_free(t3);
-tcg_temp_free(t2);
 tcg_temp_free(t1);
 tcg_temp_free(t0);
 }
@@ -863,14 +849,8 @@ static void nios2_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
 return;
 }
 
-dc->zero = NULL;
-
 instr = &i_type_instructions[op];
 instr->handler(dc, code, instr->flags);
-
-if (dc->zero) {
-tcg_temp_free(dc->zero);
-}
 }
 
 static void nios2_tr_tb_stop(DisasContextBase *dcbase, CPUState *cs)
-- 
2.25.1

[PATCH v4 30/33] hw/intc: Vectored Interrupt Controller (VIC)

From: Amir Gonnen 

Implement nios2 Vectored Interrupt Controller (VIC).
VIC is connected to EIC. It needs to update rha, ril, rrs and rnmi
fields on Nios2CPU before raising an IRQ.
For that purpose, the VIC has a "cpu" property, which should refer to the
nios2 cpu and be set by the board that connects the VIC.
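
A minimal sketch of the register arithmetic the VIC model relies on, with
the device stripped down to plain fields. The Vic type here is a reduced,
hypothetical copy of the one in the diff, and bits() is an illustrative
replacement for extract32(); the field positions follow the accessors in
the patch (RIL in bits 0..5, RNMI in bit 6, RRS in bits 7..12).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the VIC state; not the QEMU device struct. */
typedef struct Vic {
    uint32_t int_config[32];
    uint32_t vic_config;
    uint32_t int_raw_status;
    uint32_t int_enable;
    uint32_t sw_int;
} Vic;

/* Extract len bits starting at bit start (len must be below 32). */
uint32_t bits(uint32_t v, unsigned start, unsigned len)
{
    return (v >> start) & ((1u << len) - 1);
}

/* Hardware and software requests are merged, then gated by the enables. */
uint32_t vic_pending(const Vic *v)
{
    return (v->int_raw_status | v->sw_int) & v->int_enable;
}

/* Vector size in bytes: 4 << VIC_CONFIG[2:0]. */
uint32_t vic_vec_size(const Vic *v)
{
    return 1u << (2 + bits(v->vic_config, 0, 3));
}

uint32_t vic_ril(const Vic *v, int irq)  { return bits(v->int_config[irq], 0, 6); }
uint32_t vic_rnmi(const Vic *v, int irq) { return bits(v->int_config[irq], 6, 1); }
uint32_t vic_rrs(const Vic *v, int irq)  { return bits(v->int_config[irq], 7, 6); }
```

The selection of the winning interrupt among the pending ones is left out
on purpose, since that part of the diff is not fully shown here.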

Signed-off-by: Amir Gonnen 
Message-Id: <20220303153906.2024748-5-amir.gon...@neuroblade.ai>
Signed-off-by: Richard Henderson 
---
 hw/intc/nios2_vic.c | 341 
 hw/intc/Kconfig |   3 +
 hw/intc/meson.build |   1 +
 3 files changed, 345 insertions(+)
 create mode 100644 hw/intc/nios2_vic.c

diff --git a/hw/intc/nios2_vic.c b/hw/intc/nios2_vic.c
new file mode 100644
index 00..b59d3f6f4c
--- /dev/null
+++ b/hw/intc/nios2_vic.c
@@ -0,0 +1,341 @@
+/*
+ * Vectored Interrupt Controller for nios2 processor
+ *
+ * Copyright (c) 2022 Neuroblade
+ *
+ * Interface:
+ * QOM property "cpu": link to the Nios2 CPU (must be set)
+ * Unnamed GPIO inputs 0..NIOS2_VIC_MAX_IRQ-1: input IRQ lines
+ * IRQ should be connected to nios2 IRQ0.
+ *
+ * Reference: "Embedded Peripherals IP User Guide
+ * for Intel® Quartus® Prime Design Suite: 21.4"
+ * Chapter 38 "Vectored Interrupt Controller Core"
+ * See: https://www.intel.com/content/www/us/en/docs/programmable/683130/21-4/vectored-interrupt-controller-core.html
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/irq.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "qom/object.h"
+#include "cpu.h"
+
+#define TYPE_NIOS2_VIC "nios2-vic"
+
+OBJECT_DECLARE_SIMPLE_TYPE(Nios2Vic, NIOS2_VIC)
+
+#define NIOS2_VIC_MAX_IRQ 32
+
+enum {
+INT_CONFIG0 = 0,
+INT_CONFIG31 = 31,
+INT_ENABLE = 32,
+INT_ENABLE_SET = 33,
+INT_ENABLE_CLR = 34,
+INT_PENDING = 35,
+INT_RAW_STATUS = 36,
+SW_INTERRUPT = 37,
+SW_INTERRUPT_SET = 38,
+SW_INTERRUPT_CLR = 39,
+VIC_CONFIG = 40,
+VIC_STATUS = 41,
+VEC_TBL_BASE = 42,
+VEC_TBL_ADDR = 43,
+CSR_COUNT /* Last! */
+};
+
+struct Nios2Vic {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+qemu_irq output_int;
+
+/* properties */
+CPUState *cpu;
+MemoryRegion csr;
+
+uint32_t int_config[32];
+uint32_t vic_config;
+uint32_t int_raw_status;
+uint32_t int_enable;
+uint32_t sw_int;
+uint32_t vic_status;
+uint32_t vec_tbl_base;
+uint32_t vec_tbl_addr;
+};
+
+/* Requested interrupt level (INT_CONFIG[0:5]) */
+static inline uint32_t vic_int_config_ril(const Nios2Vic *vic, int irq_num)
+{
+return extract32(vic->int_config[irq_num], 0, 6);
+}
+
+/* Requested NMI (INT_CONFIG[6]) */
+static inline uint32_t vic_int_config_rnmi(const Nios2Vic *vic, int irq_num)
+{
+return extract32(vic->int_config[irq_num], 6, 1);
+}
+
+/* Requested register set (INT_CONFIG[7:12]) */
+static inline uint32_t vic_int_config_rrs(const Nios2Vic *vic, int irq_num)
+{
+return extract32(vic->int_config[irq_num], 7, 6);
+}
+
+static inline uint32_t vic_config_vec_size(const Nios2Vic *vic)
+{
+return 1 << (2 + extract32(vic->vic_config, 0, 3));
+}
+
+static inline uint32_t vic_int_pending(const Nios2Vic *vic)
+{
+return (vic->int_raw_status | vic->sw_int) & vic->int_enable;
+}
+
+static void vic_update_irq(Nios2Vic *vic)
+{
+Nios2CPU *cpu = NIOS2_CPU(vic->cpu);
+uint32_t pending = vic_int_pending(vic);
+int irq = -1;
+int max_ril = 0;
+/* Note that if RIL is 0 for an interrupt it is effectively disabled */
+
+vic->vec_tbl_addr = 0;
+vic->vic_status = 0;
+
+if (pending == 0) {
+qemu_irq_lower(vic->output_int);
+return;
+}
+
+for (int i = 0; i < NIOS2_VIC_MAX_IRQ; i++) {
+if (pending & BIT(i)) {
+int ril =

[PATCH v4 13/33] target/nios2: Use hw/registerfields.h for CR_TLBADDR fields

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h   |  8 
 target/nios2/helper.c|  4 ++--
 target/nios2/mmu.c   | 16 
 target/nios2/translate.c |  2 +-
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 35b4d88859..84138000fa 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -110,10 +110,10 @@ FIELD(CR_EXCEPTION, CAUSE, 2, 5)
 FIELD(CR_EXCEPTION, ECCFTL, 31, 1)
 
 #define CR_PTEADDR   8
-#define   CR_PTEADDR_PTBASE_SHIFT 22
-#define   CR_PTEADDR_PTBASE_MASK  (0x3FF << CR_PTEADDR_PTBASE_SHIFT)
-#define   CR_PTEADDR_VPN_SHIFT2
-#define   CR_PTEADDR_VPN_MASK (0xF << CR_PTEADDR_VPN_SHIFT)
+
+FIELD(CR_PTEADDR, VPN, 2, 20)
+FIELD(CR_PTEADDR, PTBASE, 22, 10)
+
 #define CR_TLBACC9
 #define   CR_TLBACC_IGN_SHIFT 25
 #define   CR_TLBACC_IGN_MASK  (0x7F << CR_TLBACC_IGN_SHIFT)
diff --git a/target/nios2/helper.c b/target/nios2/helper.c
index eb354f78e2..37fb53dadb 100644
--- a/target/nios2/helper.c
+++ b/target/nios2/helper.c
@@ -281,8 +281,8 @@ bool nios2_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 } else {
 env->tlbmisc |= CR_TLBMISC_D;
 }
-env->pteaddr &= CR_PTEADDR_PTBASE_MASK;
-env->pteaddr |= (address >> 10) & CR_PTEADDR_VPN_MASK;
+env->pteaddr = FIELD_DP32(env->pteaddr, CR_PTEADDR, VPN,
+  address >> TARGET_PAGE_BITS);
 env->mmu.pteaddr_wr = env->pteaddr;
 
 cs->exception_index = excp;
diff --git a/target/nios2/mmu.c b/target/nios2/mmu.c
index 382b190ae7..8017f2af93 100644
--- a/target/nios2/mmu.c
+++ b/target/nios2/mmu.c
@@ -97,7 +97,7 @@ void helper_mmu_write_tlbacc(CPUNios2State *env, uint32_t v)
 /* if tlbmisc.WE == 1 then trigger a TLB write on writes to TLBACC */
 if (env->tlbmisc & CR_TLBMISC_WR) {
 int way = (env->tlbmisc >> CR_TLBMISC_WAY_SHIFT);
-int vpn = (env->mmu.pteaddr_wr & CR_PTEADDR_VPN_MASK) >> 2;
+int vpn = FIELD_EX32(env->mmu.pteaddr_wr, CR_PTEADDR, VPN);
 int pid = (env->mmu.tlbmisc_wr & CR_TLBMISC_PID_MASK) >> 4;
 int g = (v & CR_TLBACC_G) ? 1 : 0;
 int valid = ((vpn & CR_TLBACC_PFN_MASK) < 0xC) ? 1 : 0;
@@ -148,7 +148,7 @@ void helper_mmu_write_tlbmisc(CPUNios2State *env, uint32_t v)
 /* if tlbmisc.RD == 1 then trigger a TLB read on writes to TLBMISC */
 if (v & CR_TLBMISC_RD) {
 int way = (v >> CR_TLBMISC_WAY_SHIFT);
-int vpn = (env->mmu.pteaddr_wr & CR_PTEADDR_VPN_MASK) >> 2;
+int vpn = FIELD_EX32(env->mmu.pteaddr_wr, CR_PTEADDR, VPN);
 Nios2TLBEntry *entry =
 &env->mmu.tlb[(way * cpu->tlb_num_ways) +
   (vpn & env->mmu.tlb_entry_mask)];
@@ -160,8 +160,8 @@ void helper_mmu_write_tlbmisc(CPUNios2State *env, uint32_t v)
 (v & ~CR_TLBMISC_PID_MASK) |
 ((entry->tag & ((1 << cpu->pid_num_bits) - 1)) <<
  CR_TLBMISC_PID_SHIFT);
-env->pteaddr &= ~CR_PTEADDR_VPN_MASK;
-env->pteaddr |= (entry->tag >> 12) << CR_PTEADDR_VPN_SHIFT;
+env->pteaddr = FIELD_DP32(env->pteaddr, CR_PTEADDR, VPN,
+  entry->tag >> TARGET_PAGE_BITS);
 } else {
 env->tlbmisc = v;
 }
@@ -171,12 +171,12 @@ void helper_mmu_write_tlbmisc(CPUNios2State *env, uint32_t v)
 
 void helper_mmu_write_pteaddr(CPUNios2State *env, uint32_t v)
 {
-trace_nios2_mmu_write_pteaddr(v >> CR_PTEADDR_PTBASE_SHIFT,
-  (v & CR_PTEADDR_VPN_MASK) >> CR_PTEADDR_VPN_SHIFT);
+trace_nios2_mmu_write_pteaddr(FIELD_EX32(v, CR_PTEADDR, PTBASE),
+  FIELD_EX32(v, CR_PTEADDR, VPN));
 
 /* Writes to PTEADDR don't change the read-back VPN value */
-env->pteaddr = (v & ~CR_PTEADDR_VPN_MASK) |
-(env->pteaddr & CR_PTEADDR_VPN_MASK);
+env->pteaddr = (v & ~R_CR_PTEADDR_VPN_MASK) |
+   (env->pteaddr & R_CR_PTEADDR_VPN_MASK);
 env->mmu.pteaddr_wr = v;
 }
 
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 7a32e6626d..3cdef16519 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -909,7 +909,7 @@ void nios2_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 }
 }
 qemu_fprintf(f, " mmu write: VPN=%05X PID %02X TLBACC %08X\n",
- env->mmu.pteaddr_wr & CR_PTEADDR_VPN_MASK,
+ env->mmu.pteaddr_wr & R_CR_PTEADDR_VPN_MASK,
  (env->mmu.tlbmisc_wr & CR_TLBMISC_PID_MASK) >> 4,
  env->mmu.tlbacc_wr);
 #endif
-- 
2.25.1
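
As a companion to patch 13/33 above, here is a minimal sketch of the
extract/deposit arithmetic that field macros of this kind perform. The
field_mask, field_ex32 and field_dp32 helpers are hypothetical, written in
the spirit of FIELD_EX32/FIELD_DP32 rather than copied from QEMU; the field
layout (VPN at shift 2, length 20; PTBASE at shift 22, length 10) is taken
from the FIELD() lines in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from FIELD(CR_PTEADDR, VPN, 2, 20) and FIELD(CR_PTEADDR, PTBASE, 22, 10). */
enum { VPN_SHIFT = 2, VPN_LEN = 20, PTBASE_SHIFT = 22, PTBASE_LEN = 10 };

/* Mask of len bits starting at shift (1 <= len <= 32). */
uint32_t field_mask(unsigned shift, unsigned len)
{
    return (UINT32_MAX >> (32 - len)) << shift;
}

/* Extract a field as a right-aligned value. */
uint32_t field_ex32(uint32_t reg, unsigned shift, unsigned len)
{
    return (reg & field_mask(shift, len)) >> shift;
}

/* Deposit val into a field, leaving all other bits of reg untouched. */
uint32_t field_dp32(uint32_t reg, unsigned shift, unsigned len, uint32_t val)
{
    uint32_t mask = field_mask(shift, len);
    return (reg & ~mask) | ((val << shift) & mask);
}
```

Describing each field once by shift and length keeps the mask and the shift
from drifting apart, which is the risk with hand-written _SHIFT/_MASK pairs.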

[PATCH v4 29/33] target/nios2: Implement EIC interrupt processing

This is the cpu side of the operation.  Register one irq line,
called EIC.  Split out the rather different processing to a
separate function.

Delay initialization of gpio irqs until realize.  We need to
provide a window after init in which the board can set eic_present.
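
A minimal sketch of the decision the EIC path has to make, written against
a flattened, hypothetical EicState struct instead of Nios2CPU and
CPUNios2State. The order of the checks follows eic_take_interrupt in the
diff below; the struct and the function name eic_should_take are
illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of the state the decision depends on. */
typedef struct EicState {
    bool pie;       /* status.PIE: interrupts enabled */
    bool nmi;       /* status.NMI: already handling an NMI */
    bool rsie;      /* status.RSIE */
    uint32_t il;    /* status.IL: current interrupt level */
    uint32_t crs;   /* status.CRS: current register set */
    uint32_t ril;   /* requested interrupt level */
    uint32_t rrs;   /* requested register set */
    bool rnmi;      /* requested nonmaskable interrupt */
} EicState;

bool eic_should_take(const EicState *s)
{
    if (s->rnmi) {
        return !s->nmi;          /* NMIs ignore PIE and IL */
    }
    if (!s->pie) {
        return false;
    }
    if (s->ril <= s->il) {
        return false;            /* must be strictly higher priority */
    }
    if (s->rrs != s->crs) {
        return true;             /* a different register set is always fine */
    }
    return s->rsie;              /* same set: only if RSIE allows it */
}
```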

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h|  8 +
 target/nios2/cpu.c| 75 +--
 target/nios2/helper.c | 37 +
 3 files changed, 103 insertions(+), 17 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 13e1d49f38..89c575c26d 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -115,6 +115,7 @@ FIELD(CR_STATUS, CRS, 10, 6)
 FIELD(CR_STATUS, PRS, 16, 6)
 FIELD(CR_STATUS, NMI, 22, 1)
 FIELD(CR_STATUS, RSIE, 23, 1)
+FIELD(CR_STATUS, SRS, 31, 1)
 
 #define CR_STATUS_PIE  (1u << R_CR_STATUS_PIE_SHIFT)
 #define CR_STATUS_U(1u << R_CR_STATUS_U_SHIFT)
@@ -122,6 +123,7 @@ FIELD(CR_STATUS, RSIE, 23, 1)
 #define CR_STATUS_IH   (1u << R_CR_STATUS_IH_SHIFT)
 #define CR_STATUS_NMI  (1u << R_CR_STATUS_NMI_SHIFT)
 #define CR_STATUS_RSIE (1u << R_CR_STATUS_RSIE_SHIFT)
+#define CR_STATUS_SRS  (1u << R_CR_STATUS_SRS_SHIFT)
 
 FIELD(CR_EXCEPTION, CAUSE, 2, 5)
 FIELD(CR_EXCEPTION, ECCFTL, 31, 1)
@@ -252,6 +254,12 @@ struct Nios2CPU {
 
 /* Bits within each control register which are reserved or readonly. */
 ControlRegState cr_state[NUM_CR_REGS];
+
+/* External Interrupt Controller Interface */
+uint32_t rha; /* Requested handler address */
+uint32_t ril; /* Requested interrupt level */
+uint32_t rrs; /* Requested register set */
+bool rnmi;/* Requested nonmaskable interrupt */
 };
 
 
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index 6ece92a2b8..65a900a7fb 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -67,7 +67,19 @@ static void nios2_cpu_reset(DeviceState *dev)
 }
 
 #ifndef CONFIG_USER_ONLY
-static void nios2_cpu_set_irq(void *opaque, int irq, int level)
+static void eic_set_irq(void *opaque, int irq, int level)
+{
+Nios2CPU *cpu = opaque;
+CPUState *cs = CPU(cpu);
+
+if (level) {
+cpu_interrupt(cs, CPU_INTERRUPT_HARD);
+} else {
+cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
+}
+}
+
+static void iic_set_irq(void *opaque, int irq, int level)
 {
 Nios2CPU *cpu = opaque;
 CPUNios2State *env = &cpu->env;
@@ -149,15 +161,6 @@ static void nios2_cpu_initfn(Object *obj)
 
 #if !defined(CONFIG_USER_ONLY)
 mmu_init(&cpu->env);
-
-/*
- * These interrupt lines model the IIC (internal interrupt
- * controller). QEMU does not currently support the EIC
- * (external interrupt controller) -- if we did it would be
- * a separate device in hw/intc with a custom interface to
- * the CPU, and boards using it would not wire up these IRQ lines.
- */
-qdev_init_gpio_in_named(DEVICE(cpu), nios2_cpu_set_irq, "IRQ", 32);
 #endif
 }
 
@@ -173,6 +176,14 @@ static void nios2_cpu_realizefn(DeviceState *dev, Error **errp)
 Nios2CPUClass *ncc = NIOS2_CPU_GET_CLASS(dev);
 Error *local_err = NULL;
 
+#ifndef CONFIG_USER_ONLY
+if (cpu->eic_present) {
+qdev_init_gpio_in_named(DEVICE(cpu), eic_set_irq, "EIC", 1);
+} else {
+qdev_init_gpio_in_named(DEVICE(cpu), iic_set_irq, "IRQ", 32);
+}
+#endif
+
 cpu_exec_realizefn(cs, &local_err);
 if (local_err != NULL) {
 error_propagate(errp, local_err);
@@ -189,17 +200,47 @@ static void nios2_cpu_realizefn(DeviceState *dev, Error **errp)
 }
 
 #ifndef CONFIG_USER_ONLY
+static bool eic_take_interrupt(Nios2CPU *cpu)
+{
+CPUNios2State *env = &cpu->env;
+
+if (cpu->rnmi) {
+return !(env->status & CR_STATUS_NMI);
+}
+if (!(env->status & CR_STATUS_PIE)) {
+return false;
+}
+if (cpu->ril <= FIELD_EX32(env->status, CR_STATUS, IL)) {
+return false;
+}
+if (cpu->rrs != FIELD_EX32(env->status, CR_STATUS, CRS)) {
+return true;
+}
+return env->status & CR_STATUS_RSIE;
+}
+
+static bool iic_take_interrupt(Nios2CPU *cpu)
+{
+CPUNios2State *env = &cpu->env;
+
+if (!(env->status & CR_STATUS_PIE)) {
+return false;
+}
+return env->ipending & env->ienable;
+}
+
 static bool nios2_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 {
 Nios2CPU *cpu = NIOS2_CPU(cs);
-CPUNios2State *env = &cpu->env;
 
-if ((interrupt_request & CPU_INTERRUPT_HARD) &&
-(env->status & CR_STATUS_PIE) &&
-(env->ipending & env->ienable)) {
-cs->exception_index = EXCP_IRQ;
-nios2_cpu_do_interrupt(cs);
-return true;
+if (interrupt_request & CPU_INTERRUPT_HARD) {
+if (cpu->eic_present
+? eic_take_interrupt(cpu)
+: iic_take_interrupt(cpu)) {
+cs->exception_index = EXCP_IRQ;
+nios2_cpu_do_interrupt(cs);
+return true;
+}
 }
 return false;
 }
diff --git a/target/nios2/helper.c

[PATCH v4 24/33] target/nios2: Introduce shadow register sets

Do not actually enable them so far, but add all of the
plumbing to address them.  Do not enable them for user-only.

Add an env->crs pointer that handles the indirection to
the current register set.  Add a nios2_crs() function to
wrap this for normal uses, which hides the difference
between user-only and system modes.

From the notes on wrprs, which states that r0 must be initialized
before use in shadow register sets, infer that R_ZERO is *not*
hardwired to zero in shadow register sets.  Adjust load_gpr and
dest_gpr to reflect this.  At the same time we might as well
special-case crs == 0 to avoid the indirection through env->crs
during translation, given that this is intended to be
the most common case for non-interrupt handlers.

Drop the zeroing of env->regs at reset, as those are undefined.
Do init env->crs at reset.
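
A minimal sketch of the indirection described above, for the system-mode
layout only. The Env type and the update_crs/switch_set helpers are
hypothetical; in particular status_crs stands in for the CRS field that the
real code extracts from env->status.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_GP_REGS = 32, NUM_REG_SETS = 64 };

/* Hypothetical reduced CPU state; not CPUNios2State. */
typedef struct Env {
    uint32_t status_crs;   /* stand-in for the CRS field of the status register */
    uint32_t shadow_regs[NUM_REG_SETS][NUM_GP_REGS];
    uint32_t *crs;         /* points at the currently selected set */
} Env;

/* Re-point crs after the selected set changes. */
void update_crs(Env *env)
{
    env->crs = env->shadow_regs[env->status_crs];
}

/* Select a register set; all later accesses go through env->crs. */
void switch_set(Env *env, uint32_t set)
{
    assert(set < NUM_REG_SETS);
    env->status_crs = set;
    update_crs(env);
}
```

Code that used to index env->regs directly can index env->crs instead and
does not need to know which set is active; each set keeps its own values
across switches.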

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h| 32 
 hw/nios2/boot.c   |  8 ++---
 target/nios2/cpu.c|  7 +++--
 target/nios2/helper.c | 12 
 target/nios2/nios2-semi.c | 13 
 target/nios2/translate.c  | 62 ++-
 6 files changed, 95 insertions(+), 39 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 2a5e070960..f05536e04d 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -61,6 +61,11 @@ struct Nios2CPUClass {
 #define NUM_GP_REGS 32
 #define NUM_CR_REGS 32
 
+#ifndef CONFIG_USER_ONLY
+/* 63 shadow register sets; index 0 is the primary register set. */
+#define NUM_REG_SETS 64
+#endif
+
 /* General purpose register aliases */
 enum {
 R_ZERO   = 0,
@@ -176,7 +181,13 @@ FIELD(CR_TLBMISC, EE, 24, 1)
 #define EXCP_MPUD 17
 
 struct CPUNios2State {
+#ifdef CONFIG_USER_ONLY
 uint32_t regs[NUM_GP_REGS];
+#else
+uint32_t shadow_regs[NUM_REG_SETS][NUM_GP_REGS];
+uint32_t *crs;
+#endif
+
 union {
 uint32_t ctrl[NUM_CR_REGS];
 struct {
@@ -245,6 +256,23 @@ static inline bool nios2_cr_reserved(const ControlRegState *s)
 return (s->writable | s->readonly) == 0;
 }
 
+static inline void nios2_update_crs(CPUNios2State *env)
+{
+#ifndef CONFIG_USER_ONLY
+unsigned crs = FIELD_EX32(env->status, CR_STATUS, CRS);
+env->crs = env->shadow_regs[crs];
+#endif
+}
+
+static inline uint32_t *nios2_crs(CPUNios2State *env)
+{
+#ifdef CONFIG_USER_ONLY
+return env->regs;
+#else
+return env->crs;
+#endif
+}
+
 void nios2_tcg_init(void);
 void nios2_cpu_do_interrupt(CPUState *cs);
 void dump_mmu(CPUNios2State *env);
@@ -286,12 +314,16 @@ typedef Nios2CPU ArchCPU;
 
 #include "exec/cpu-all.h"
 
+FIELD(TBFLAGS, CRS0, 0, 1)
+FIELD(TBFLAGS, U, 1, 1) /* Overlaps CR_STATUS_U */
+
 static inline void cpu_get_tb_cpu_state(CPUNios2State *env, target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
 {
 *pc = env->pc;
 *cs_base = 0;
 *flags = env->status & CR_STATUS_U;
+*flags |= env->status & R_CR_STATUS_CRS_MASK ? 0 : R_TBFLAGS_CRS0_MASK;
 }
 
 #endif /* NIOS2_CPU_H */
diff --git a/hw/nios2/boot.c b/hw/nios2/boot.c
index 5b3e4efed5..96896f2ec5 100644
--- a/hw/nios2/boot.c
+++ b/hw/nios2/boot.c
@@ -62,10 +62,10 @@ static void main_cpu_reset(void *opaque)
 
 cpu_reset(CPU(cpu));
 
-env->regs[R_ARG0] = NIOS2_MAGIC;
-env->regs[R_ARG1] = boot_info.initrd_start;
-env->regs[R_ARG2] = boot_info.fdt;
-env->regs[R_ARG3] = boot_info.cmdline;
+nios2_crs(env)[R_ARG0] = NIOS2_MAGIC;
+nios2_crs(env)[R_ARG1] = boot_info.initrd_start;
+nios2_crs(env)[R_ARG2] = boot_info.fdt;
+nios2_crs(env)[R_ARG3] = boot_info.cmdline;
 
 cpu_set_pc(cs, boot_info.bootstrap_pc);
 if (boot_info.machine_cpu_reset) {
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index 2779650128..05f4a7a93a 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -53,7 +53,6 @@ static void nios2_cpu_reset(DeviceState *dev)
 
 ncc->parent_reset(dev);
 
-memset(env->regs, 0, sizeof(env->regs));
 memset(env->ctrl, 0, sizeof(env->ctrl));
 env->pc = cpu->reset_addr;
 
@@ -63,6 +62,8 @@ static void nios2_cpu_reset(DeviceState *dev)
 #else
 env->status = CR_STATUS_RSIE;
 #endif
+
+nios2_update_crs(env);
 }
 
 #ifndef CONFIG_USER_ONLY
@@ -210,7 +211,7 @@ static int nios2_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
 uint32_t val;
 
 if (n < 32) {  /* GP regs */
-val = env->regs[n];
+val = nios2_crs(env)[n];
 } else if (n == 32) {/* PC */
 val = env->pc;
 } else if (n < 49) { /* Status regs */
@@ -241,7 +242,7 @@ static int nios2_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
 val = ldl_p(mem_buf);
 
 if (n < 32) {/* GP regs */
-env->regs[n] = val;
+nios2_crs(env)[n] = val;
 } else if (n == 32) {/* PC */
 env->pc = val;
 } else if (n < 49) { /* Status regs */
diff --git a/target/nios2/helper.c

[PATCH v4 16/33] target/nios2: Move R_FOO and CR_BAR into enumerations

These symbols become available to the debugger.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 72 ++
 1 file changed, 35 insertions(+), 37 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 3857848f7c..927c4aaa80 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -62,25 +62,43 @@ struct Nios2CPUClass {
 #define NUM_CR_REGS 32
 
 /* General purpose register aliases */
-#define R_ZERO   0
-#define R_AT 1
-#define R_RET0   2
-#define R_RET1   3
-#define R_ARG0   4
-#define R_ARG1   5
-#define R_ARG2   6
-#define R_ARG3   7
-#define R_ET 24
-#define R_BT 25
-#define R_GP 26
-#define R_SP 27
-#define R_FP 28
-#define R_EA 29
-#define R_BA 30
-#define R_RA 31
+enum {
+R_ZERO   = 0,
+R_AT = 1,
+R_RET0   = 2,
+R_RET1   = 3,
+R_ARG0   = 4,
+R_ARG1   = 5,
+R_ARG2   = 6,
+R_ARG3   = 7,
+R_ET = 24,
+R_BT = 25,
+R_GP = 26,
+R_SP = 27,
+R_FP = 28,
+R_EA = 29,
+R_BA = 30,
+R_RA = 31,
+};
 
 /* Control register aliases */
-#define CR_STATUS    0
+enum {
+CR_STATUS    = 0,
+CR_ESTATUS   = 1,
+CR_BSTATUS   = 2,
+CR_IENABLE   = 3,
+CR_IPENDING  = 4,
+CR_CPUID     = 5,
+CR_EXCEPTION = 7,
+CR_PTEADDR   = 8,
+CR_TLBACC    = 9,
+CR_TLBMISC   = 10,
+CR_ENCINJ    = 11,
+CR_BADADDR   = 12,
+CR_CONFIG    = 13,
+CR_MPUBASE   = 14,
+CR_MPUACC    = 15,
+};
 
 FIELD(CR_STATUS, PIE, 0, 1)
 FIELD(CR_STATUS, U, 1, 1)
@@ -98,24 +116,12 @@ FIELD(CR_STATUS, NMI, 22, 1)
 #define CR_STATUS_NMI  (1u << R_CR_STATUS_NMI_SHIFT)
 #define CR_STATUS_RSIE (1u << R_CR_STATUS_RSIE_SHIFT)
 
-#define CR_ESTATUS   1
-#define CR_BSTATUS   2
-#define CR_IENABLE   3
-#define CR_IPENDING  4
-#define CR_CPUID 5
-#define CR_CTL6  6
-#define CR_EXCEPTION 7
-
 FIELD(CR_EXCEPTION, CAUSE, 2, 5)
 FIELD(CR_EXCEPTION, ECCFTL, 31, 1)
 
-#define CR_PTEADDR   8
-
 FIELD(CR_PTEADDR, VPN, 2, 20)
 FIELD(CR_PTEADDR, PTBASE, 22, 10)
 
-#define CR_TLBACC    9
-
 FIELD(CR_TLBACC, PFN, 0, 20)
 FIELD(CR_TLBACC, G, 20, 1)
 FIELD(CR_TLBACC, X, 21, 1)
@@ -130,8 +136,6 @@ FIELD(CR_TLBACC, IG, 25, 7)
 #define CR_TLBACC_X  (1u << R_CR_TLBACC_X_SHIFT)
 #define CR_TLBACC_G  (1u << R_CR_TLBACC_G_SHIFT)
 
-#define CR_TLBMISC   10
-
 FIELD(CR_TLBMISC, D, 0, 1)
 FIELD(CR_TLBMISC, PERM, 1, 1)
 FIELD(CR_TLBMISC, BAD, 2, 1)
@@ -149,12 +153,6 @@ FIELD(CR_TLBMISC, EE, 24, 1)
 #define CR_TLBMISC_PERM  (1u << R_CR_TLBMISC_PERM_SHIFT)
 #define CR_TLBMISC_D (1u << R_CR_TLBMISC_D_SHIFT)
 
-#define CR_ENCINJ    11
-#define CR_BADADDR   12
-#define CR_CONFIG    13
-#define CR_MPUBASE   14
-#define CR_MPUACC    15
-
 /* Exceptions */
 #define EXCP_BREAK    0x1000
 #define EXCP_RESET    0
-- 
2.25.1

[PATCH v4 12/33] target/nios2: Use hw/registerfields.h for CR_EXCEPTION fields

Sink the set of env->exception to the end of nios2_cpu_do_interrupt.

Signed-off-by: Richard Henderson 
---
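Reader's note, not part of the commit: the standalone sketch below
illustrates why one field deposit at the end of the function can replace
the repeated clear-then-or pairs. It is an approximation under stated
assumptions, not QEMU's implementation. deposit_field() and
extract_field() are hypothetical stand-ins for FIELD_DP32 and FIELD_EX32;
the CAUSE position (shift 2, length 5) is taken from the FIELD()
definition added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from FIELD(CR_EXCEPTION, CAUSE, 2, 5) in this patch. */
enum { CAUSE_SHIFT = 2, CAUSE_LENGTH = 5 };

/* Hypothetical stand-in for FIELD_EX32: read a bit field. */
static uint32_t extract_field(uint32_t reg, unsigned shift, unsigned length)
{
    assert(length > 0 && length < 32 && shift + length <= 32);
    return (reg >> shift) & ((1u << length) - 1);
}

/* Hypothetical stand-in for FIELD_DP32: replace a bit field. */
static uint32_t deposit_field(uint32_t reg, unsigned shift, unsigned length,
                              uint32_t val)
{
    assert(length > 0 && length < 32 && shift + length <= 32);
    uint32_t mask = ((1u << length) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* The open-coded sequence that the patch removes from each case. */
static uint32_t set_cause_old(uint32_t exception, uint32_t index)
{
    exception &= ~(0x1Fu << 2);
    exception |= (index & 0x1F) << 2;
    return exception;
}

/* The replacement: a single field deposit. */
static uint32_t set_cause_new(uint32_t exception, uint32_t index)
{
    return deposit_field(exception, CAUSE_SHIFT, CAUSE_LENGTH, index);
}
```

Both variants truncate the index to five bits and leave the other bits
of the register untouched, which is what lets the deposit be sunk to a
single place.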
 target/nios2/cpu.h|  4 
 target/nios2/helper.c | 24 +++-
 2 files changed, 7 insertions(+), 21 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 26618baa70..35b4d88859 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -105,6 +105,10 @@ FIELD(CR_STATUS, NMI, 22, 1)
 #define CR_CPUID 5
 #define CR_CTL6  6
 #define CR_EXCEPTION 7
+
+FIELD(CR_EXCEPTION, CAUSE, 2, 5)
+FIELD(CR_EXCEPTION, ECCFTL, 31, 1)
+
 #define CR_PTEADDR   8
 #define   CR_PTEADDR_PTBASE_SHIFT 22
 #define   CR_PTEADDR_PTBASE_MASK  (0x3FF << CR_PTEADDR_PTBASE_SHIFT)
diff --git a/target/nios2/helper.c b/target/nios2/helper.c
index 3c49b0cfbf..eb354f78e2 100644
--- a/target/nios2/helper.c
+++ b/target/nios2/helper.c
@@ -64,9 +64,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_IH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 env->regs[R_EA] = env->pc + 4;
 env->pc = cpu->exception_addr;
 break;
@@ -83,9 +80,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_EH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 env->tlbmisc &= ~CR_TLBMISC_DBL;
 env->tlbmisc |= CR_TLBMISC_WR;
 
@@ -98,9 +92,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_EH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 env->tlbmisc |= CR_TLBMISC_DBL;
 
 env->pc = cpu->exception_addr;
@@ -116,9 +107,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_EH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 if ((env->status & CR_STATUS_EH) == 0) {
 env->tlbmisc |= CR_TLBMISC_WR;
 }
@@ -140,9 +128,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_EH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 env->pc = cpu->exception_addr;
 break;
 
@@ -158,9 +143,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_EH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 env->pc = cpu->exception_addr;
 break;
 
@@ -183,9 +165,6 @@ void nios2_cpu_do_interrupt(CPUState *cs)
 env->status |= CR_STATUS_EH;
 env->status &= ~(CR_STATUS_PIE | CR_STATUS_U);
 
-env->exception &= ~(0x1F << 2);
-env->exception |= (cs->exception_index & 0x1F) << 2;
-
 env->pc = cpu->exception_addr;
 break;
 
@@ -194,6 +173,9 @@ void nios2_cpu_do_interrupt(CPUState *cs)
   cs->exception_index);
 break;
 }
+
+env->exception = FIELD_DP32(env->exception, CR_EXCEPTION, CAUSE,
+cs->exception_index);
 }
 
 hwaddr nios2_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
-- 
2.25.1

[PATCH v4 26/33] target/nios2: Update helper_eret for shadow registers

When CRS = 0, we restore from estatus; otherwise from sstatus.
Do not allow reserved status bits to be set via this restore.
Add the fields defined for EIC to status.

Signed-off-by: Richard Henderson 
---
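Reader's note, not part of the commit: a minimal model of the restore
logic described above, assuming a trimmed-down state struct. ModelState,
model_eret() and current_crs() are hypothetical names; NUM_REG_SETS = 64
is an assumption based on CRS being a 6-bit field. The valid_mask
argument stands for writable | readonly of CR_STATUS.

```c
#include <assert.h>
#include <stdint.h>

enum {
    NUM_GP_REGS  = 32,
    NUM_REG_SETS = 64,   /* assumption: one set per 6-bit CRS value */
    R_SSTATUS    = 30,   /* from this patch */
    CRS_SHIFT    = 10,   /* FIELD(CR_STATUS, CRS, 10, 6) */
    CRS_LENGTH   = 6,
};

/* Hypothetical, trimmed-down model of the CPU state. */
typedef struct {
    uint32_t status;
    uint32_t estatus;
    uint32_t shadow_regs[NUM_REG_SETS][NUM_GP_REGS];
    uint32_t *crs;       /* points at the current register set */
} ModelState;

/* Extract status.CRS. */
static unsigned current_crs(const ModelState *env)
{
    return (env->status >> CRS_SHIFT) & ((1u << CRS_LENGTH) - 1);
}

/* Model of the status restore performed by eret. */
static void model_eret(ModelState *env, uint32_t valid_mask)
{
    unsigned crs = current_crs(env);
    uint32_t val;

    assert(crs < (unsigned)NUM_REG_SETS);

    /* CRS == 0 restores from estatus, otherwise from the set's sstatus. */
    if (crs == 0) {
        val = env->estatus;
    } else {
        val = env->shadow_regs[crs][R_SSTATUS];
    }

    /* Reserved status bits must not become set through the restore. */
    env->status = val & valid_mask;

    /* The restored status may select a different register set. */
    env->crs = env->shadow_regs[current_crs(env)];
}
```

Note that the register set pointer is refreshed after the status write,
since the restored value can carry a different CRS.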
 target/nios2/cpu.h   |  1 +
 target/nios2/cpu.c   | 16 
 target/nios2/op_helper.c | 20 +++-
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index efaac274aa..c48daa5640 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -83,6 +83,7 @@ enum {
 R_FP = 28,
 R_EA = 29,
 R_BA = 30,
+R_SSTATUS = 30,
 R_RA = 31,
 };
 
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index 05f4a7a93a..6ece92a2b8 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -109,10 +109,18 @@ static void nios2_cpu_initfn(Object *obj)
 WR_FIELD(CR_EXCEPTION, CAUSE);
 WR_REG(CR_BADADDR);
 
-/* TODO: These control registers are not present with the EIC. */
-RO_FIELD(CR_STATUS, RSIE);
-WR_REG(CR_IENABLE);
-RO_REG(CR_IPENDING);
+if (cpu->eic_present) {
+WR_FIELD(CR_STATUS, RSIE);
+RO_FIELD(CR_STATUS, NMI);
+WR_FIELD(CR_STATUS, PRS);
+RO_FIELD(CR_STATUS, CRS);
+WR_FIELD(CR_STATUS, IL);
+WR_FIELD(CR_STATUS, IH);
+} else {
+RO_FIELD(CR_STATUS, RSIE);
+WR_REG(CR_IENABLE);
+RO_REG(CR_IPENDING);
+}
 
 if (cpu->mmu_present) {
 WR_FIELD(CR_STATUS, U);
diff --git a/target/nios2/op_helper.c b/target/nios2/op_helper.c
index e656986e3c..42342f007f 100644
--- a/target/nios2/op_helper.c
+++ b/target/nios2/op_helper.c
@@ -34,7 +34,25 @@ void helper_raise_exception(CPUNios2State *env, uint32_t index)
 #ifndef CONFIG_USER_ONLY
 void helper_eret(CPUNios2State *env, uint32_t new_pc)
 {
-env->status = env->estatus;
+Nios2CPU *cpu = env_archcpu(env);
+unsigned crs = FIELD_EX32(env->status, CR_STATUS, CRS);
+uint32_t val;
+
+if (crs == 0) {
+val = env->estatus;
+} else {
+val = env->shadow_regs[crs][R_SSTATUS];
+}
+
+/*
+ * Both estatus and sstatus have no constraints on write;
+ * do not allow reserved fields in status to be set.
+ */
+val &= (cpu->cr_state[CR_STATUS].writable |
+cpu->cr_state[CR_STATUS].readonly);
+env->status = val;
+nios2_update_crs(env);
+
 env->pc = new_pc;
 cpu_loop_exit(env_cpu(env));
 }
-- 
2.25.1

[PATCH v4 14/33] target/nios2: Use hw/registerfields.h for CR_TLBACC fields

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 23 +++
 target/nios2/mmu.c | 16 
 2 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 84138000fa..024ef3ccc0 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -115,14 +115,21 @@ FIELD(CR_PTEADDR, VPN, 2, 20)
 FIELD(CR_PTEADDR, PTBASE, 22, 10)
 
 #define CR_TLBACC    9
-#define   CR_TLBACC_IGN_SHIFT 25
-#define   CR_TLBACC_IGN_MASK  (0x7F << CR_TLBACC_IGN_SHIFT)
-#define   CR_TLBACC_C (1 << 24)
-#define   CR_TLBACC_R (1 << 23)
-#define   CR_TLBACC_W (1 << 22)
-#define   CR_TLBACC_X (1 << 21)
-#define   CR_TLBACC_G (1 << 20)
-#define   CR_TLBACC_PFN_MASK  0x000FFFFF
+
+FIELD(CR_TLBACC, PFN, 0, 20)
+FIELD(CR_TLBACC, G, 20, 1)
+FIELD(CR_TLBACC, X, 21, 1)
+FIELD(CR_TLBACC, W, 22, 1)
+FIELD(CR_TLBACC, R, 23, 1)
+FIELD(CR_TLBACC, C, 24, 1)
+FIELD(CR_TLBACC, IG, 25, 7)
+
+#define CR_TLBACC_C  (1u << R_CR_TLBACC_C_SHIFT)
+#define CR_TLBACC_R  (1u << R_CR_TLBACC_R_SHIFT)
+#define CR_TLBACC_W  (1u << R_CR_TLBACC_W_SHIFT)
+#define CR_TLBACC_X  (1u << R_CR_TLBACC_X_SHIFT)
+#define CR_TLBACC_G  (1u << R_CR_TLBACC_G_SHIFT)
+
 #define CR_TLBMISC   10
 #define   CR_TLBMISC_WAY_SHIFT 20
 #define   CR_TLBMISC_WAY_MASK  (0xF << CR_TLBMISC_WAY_SHIFT)
diff --git a/target/nios2/mmu.c b/target/nios2/mmu.c
index 8017f2af93..d6221936f7 100644
--- a/target/nios2/mmu.c
+++ b/target/nios2/mmu.c
@@ -49,7 +49,7 @@ unsigned int mmu_translate(CPUNios2State *env,
 }
 
 lu->vaddr = vaddr & TARGET_PAGE_MASK;
-lu->paddr = (entry->data & CR_TLBACC_PFN_MASK) << TARGET_PAGE_BITS;
+lu->paddr = FIELD_EX32(entry->data, CR_TLBACC, PFN) << TARGET_PAGE_BITS;
 lu->prot = ((entry->data & CR_TLBACC_R) ? PAGE_READ : 0) |
((entry->data & CR_TLBACC_W) ? PAGE_WRITE : 0) |
((entry->data & CR_TLBACC_X) ? PAGE_EXEC : 0);
@@ -86,27 +86,27 @@ void helper_mmu_write_tlbacc(CPUNios2State *env, uint32_t v)
 CPUState *cs = env_cpu(env);
 Nios2CPU *cpu = env_archcpu(env);
 
-trace_nios2_mmu_write_tlbacc(v >> CR_TLBACC_IGN_SHIFT,
+trace_nios2_mmu_write_tlbacc(FIELD_EX32(v, CR_TLBACC, IG),
  (v & CR_TLBACC_C) ? 'C' : '.',
  (v & CR_TLBACC_R) ? 'R' : '.',
  (v & CR_TLBACC_W) ? 'W' : '.',
  (v & CR_TLBACC_X) ? 'X' : '.',
  (v & CR_TLBACC_G) ? 'G' : '.',
- v & CR_TLBACC_PFN_MASK);
+ FIELD_EX32(v, CR_TLBACC, PFN));
 
 /* if tlbmisc.WE == 1 then trigger a TLB write on writes to TLBACC */
 if (env->tlbmisc & CR_TLBMISC_WR) {
 int way = (env->tlbmisc >> CR_TLBMISC_WAY_SHIFT);
 int vpn = FIELD_EX32(env->mmu.pteaddr_wr, CR_PTEADDR, VPN);
 int pid = (env->mmu.tlbmisc_wr & CR_TLBMISC_PID_MASK) >> 4;
-int g = (v & CR_TLBACC_G) ? 1 : 0;
-int valid = ((vpn & CR_TLBACC_PFN_MASK) < 0xC) ? 1 : 0;
+int g = FIELD_EX32(v, CR_TLBACC, G);
+int valid = FIELD_EX32(vpn, CR_TLBACC, PFN) < 0xC;
 Nios2TLBEntry *entry =
 &env->mmu.tlb[(way * cpu->tlb_num_ways) +
   (vpn & env->mmu.tlb_entry_mask)];
 uint32_t newTag = (vpn << 12) | (g << 11) | (valid << 10) | pid;
 uint32_t newData = v & (CR_TLBACC_C | CR_TLBACC_R | CR_TLBACC_W |
-CR_TLBACC_X | CR_TLBACC_PFN_MASK);
+CR_TLBACC_X | R_CR_TLBACC_PFN_MASK);
 
 if ((entry->tag != newTag) || (entry->data != newData)) {
 if (entry->tag & (1 << 10)) {
@@ -153,7 +153,7 @@ void helper_mmu_write_tlbmisc(CPUNios2State *env, uint32_t v)
 &env->mmu.tlb[(way * cpu->tlb_num_ways) +
   (vpn & env->mmu.tlb_entry_mask)];
 
-env->tlbacc &= CR_TLBACC_IGN_MASK;
+env->tlbacc &= R_CR_TLBACC_IG_MASK;
 env->tlbacc |= entry->data;
 env->tlbacc |= (entry->tag & (1 << 11)) ? CR_TLBACC_G : 0;
 env->tlbmisc =
@@ -207,7 +207,7 @@ void dump_mmu(CPUNios2State *env)
 entry->tag >> 12,
 entry->tag & ((1 << cpu->pid_num_bits) - 1),
 (entry->tag & (1 << 11)) ? 'G' : '-',
-entry->data & CR_TLBACC_PFN_MASK,
+FIELD_EX32(entry->data, CR_TLBACC, PFN),
 (entry->data & CR_TLBACC_C) ? 'C' : '-',
 (entry->data & CR_TLBACC_R) ? 'R' : '-',
 (entry->data & CR_TLBACC_W) ? 'W' : '-',
-- 
2.25.1

[PATCH v4 17/33] target/nios2: Prevent writes to read-only or reserved control fields

Create an array of masks which detail the writable and readonly
bits for each control register.  Apply them when writing to
control registers.

Signed-off-by: Richard Henderson 
---
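Reader's note, not part of the commit: a small sketch of the masking
rule, following the comment in nios2_cpu_initfn ("apply writable as a
mask to bits, and merge in existing readonly bits, before storing").
cr_masked_write() is a hypothetical helper written for illustration; the
series applies the masks inside the translator and the gdb stub instead.
The disjointness assertion is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the ControlRegState struct added by this patch. */
typedef struct {
    uint32_t writable;
    uint32_t readonly;
} ControlRegState;

/* Same test as nios2_cr_reserved(): no bit is implemented at all. */
static bool cr_reserved(const ControlRegState *s)
{
    return (s->writable | s->readonly) == 0;
}

/*
 * Hypothetical helper: compute the value to store for a guest write.
 * Writable bits come from the new value, readonly bits keep their old
 * contents, and reserved bits read back as zero.
 */
static uint32_t cr_masked_write(const ControlRegState *s,
                                uint32_t old, uint32_t val)
{
    /* Sketch assumption: a bit is never both writable and readonly. */
    assert((s->writable & s->readonly) == 0);
    return (val & s->writable) | (old & s->readonly);
}
```

With writable = 0x0F and readonly = 0xF0, writing 0xFFFFFFFF over an old
value of 0xA0 stores 0xAF: the low nibble is taken from the write, the
next nibble is preserved, and everything else is dropped.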
 target/nios2/cpu.h   | 13 ++
 target/nios2/cpu.c   | 90 +---
 target/nios2/translate.c | 80 ---
 3 files changed, 152 insertions(+), 31 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 927c4aaa80..7faec97d77 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -207,6 +207,11 @@ struct CPUNios2State {
 int error_code;
 };
 
+typedef struct {
+uint32_t writable;
+uint32_t readonly;
+} ControlRegState;
+
 /**
  * Nios2CPU:
  * @env: #CPUNios2State
@@ -230,9 +235,17 @@ struct Nios2CPU {
 uint32_t reset_addr;
 uint32_t exception_addr;
 uint32_t fast_tlb_miss_addr;
+
+/* Bits within each control register which are reserved or readonly. */
+ControlRegState cr_state[NUM_CR_REGS];
 };
 
 
+static inline bool nios2_cr_reserved(const ControlRegState *s)
+{
+return (s->writable | s->readonly) == 0;
+}
+
 void nios2_tcg_init(void);
 void nios2_cpu_do_interrupt(CPUState *cs);
 void dump_mmu(CPUNios2State *env);
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index f2813d3b47..189adf111c 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -88,6 +88,55 @@ static void nios2_cpu_initfn(Object *obj)
 
 cpu_set_cpustate_pointers(cpu);
 
+/* Begin with all fields of all registers reserved. */
+memset(cpu->cr_state, 0, sizeof(cpu->cr_state));
+
+/*
+ * The combination of writable and readonly is the set of all
+ * non-reserved fields.  We apply writable as a mask to bits,
+ * and merge in existing readonly bits, before storing.
+ */
+#define WR_REG(C)   cpu->cr_state[C].writable = -1
+#define RO_REG(C)   cpu->cr_state[C].readonly = -1
+#define WR_FIELD(C, F)  cpu->cr_state[C].writable |= R_##C##_##F##_MASK
+#define RO_FIELD(C, F)  cpu->cr_state[C].readonly |= R_##C##_##F##_MASK
+
+WR_FIELD(CR_STATUS, PIE);
+WR_REG(CR_ESTATUS);
+WR_REG(CR_BSTATUS);
+RO_REG(CR_CPUID);
+WR_FIELD(CR_EXCEPTION, CAUSE);
+WR_REG(CR_BADADDR);
+
+/* TODO: These control registers are not present with the EIC. */
+WR_REG(CR_IENABLE);
+RO_REG(CR_IPENDING);
+
+if (cpu->mmu_present) {
+WR_FIELD(CR_STATUS, U);
+WR_FIELD(CR_STATUS, EH);
+
+WR_FIELD(CR_PTEADDR, VPN);
+WR_FIELD(CR_PTEADDR, PTBASE);
+
+RO_FIELD(CR_TLBMISC, D);
+RO_FIELD(CR_TLBMISC, PERM);
+RO_FIELD(CR_TLBMISC, BAD);
+RO_FIELD(CR_TLBMISC, DBL);
+WR_FIELD(CR_TLBMISC, WR);
+WR_FIELD(CR_TLBMISC, RD);
+WR_FIELD(CR_TLBMISC, WAY);
+
+WR_REG(CR_TLBACC);
+}
+
+/* TODO: ECC and MPU not implemented. */
+
+#undef WR_REG
+#undef RO_REG
+#undef WR_FIELD
+#undef RO_FIELD
+
 #if !defined(CONFIG_USER_ONLY)
 mmu_init(&cpu->env);
 
@@ -152,23 +201,26 @@ static void nios2_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
 static int nios2_cpu_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
 {
 Nios2CPU *cpu = NIOS2_CPU(cs);
-CPUClass *cc = CPU_GET_CLASS(cs);
 CPUNios2State *env = &cpu->env;
+uint32_t val;
 
-if (n > cc->gdb_num_core_regs) {
+if (n < 32) {  /* GP regs */
+val = env->regs[n];
+} else if (n == 32) {/* PC */
+val = env->pc;
+} else if (n < 49) { /* Status regs */
+unsigned cr = n - 33;
+if (nios2_cr_reserved(&cpu->cr_state[cr])) {
+val = 0;
+} else {
+val = env->ctrl[n - 33];
+}
+} else {
+/* Invalid regs */
 return 0;
 }
 
-if (n < 32) {  /* GP regs */
-return gdb_get_reg32(mem_buf, env->regs[n]);
-} else if (n == 32) {/* PC */
-return gdb_get_reg32(mem_buf, env->pc);
-} else if (n < 49) { /* Status regs */
-return gdb_get_reg32(mem_buf, env->ctrl[n - 33]);
-}
-
-/* Invalid regs */
-return 0;
+return gdb_get_reg32(mem_buf, val);
 }
 
 static int nios2_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
@@ -176,17 +228,25 @@ static int nios2_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
 Nios2CPU *cpu = NIOS2_CPU(cs);
 CPUClass *cc = CPU_GET_CLASS(cs);
 CPUNios2State *env = &cpu->env;
+uint32_t val;
 
 if (n > cc->gdb_num_core_regs) {
 return 0;
 }
+val = ldl_p(mem_buf);
 
 if (n < 32) {/* GP regs */
-env->regs[n] = ldl_p(mem_buf);
+env->regs[n] = val;
 } else if (n == 32) {/* PC */
-env->pc = ldl_p(mem_buf);
+env->pc = val;
 } else if (n < 49) { /* Status regs */
-env->ctrl[n - 33] = ldl_p(mem_buf);
+unsigned cr = n - 33;
+/* ??? Maybe allow the debugger to write to readonly fields. */
+val &=

[PATCH v4 11/33] target/nios2: Use hw/registerfields.h for CR_STATUS fields

Add all fields; retain the helper macros for single bit fields.
So far there are no uses of the multi-bit status fields.

Signed-off-by: Richard Henderson 
---
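Reader's note, not part of the commit: the multi-bit fields are defined
here by shift and length rather than by a literal mask. The sketch below
checks that those positions reproduce the masks being removed (63 << 4,
63 << 10, 63 << 16). field_mask() and field_get() are hypothetical
helpers for illustration, not macros from hw/registerfields.h.

```c
#include <assert.h>
#include <stdint.h>

/* Build the mask for a field of `length` bits starting at bit `shift`. */
static uint32_t field_mask(unsigned shift, unsigned length)
{
    assert(length > 0 && length < 32 && shift + length <= 32);
    return ((1u << length) - 1) << shift;
}

/* Read a multi-bit field such as status.CRS (shift 10, length 6). */
static uint32_t field_get(uint32_t reg, unsigned shift, unsigned length)
{
    return (reg & field_mask(shift, length)) >> shift;
}
```

For example, field_mask(10, 6) equals the old CR_STATUS_CRS value, and
field_get(status, 10, 6) yields the register set index directly, without
a separate shift at the call site.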
 target/nios2/cpu.h | 27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 5bc0e353b4..26618baa70 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -23,6 +23,7 @@
 
 #include "exec/cpu-defs.h"
 #include "hw/core/cpu.h"
+#include "hw/registerfields.h"
 #include "qom/object.h"
 
 typedef struct CPUNios2State CPUNios2State;
@@ -80,15 +81,23 @@ struct Nios2CPUClass {
 
 /* Control register aliases */
 #define CR_STATUS    0
-#define   CR_STATUS_PIE  (1 << 0)
-#define   CR_STATUS_U    (1 << 1)
-#define   CR_STATUS_EH   (1 << 2)
-#define   CR_STATUS_IH   (1 << 3)
-#define   CR_STATUS_IL   (63 << 4)
-#define   CR_STATUS_CRS  (63 << 10)
-#define   CR_STATUS_PRS  (63 << 16)
-#define   CR_STATUS_NMI  (1 << 22)
-#define   CR_STATUS_RSIE (1 << 23)
+
+FIELD(CR_STATUS, PIE, 0, 1)
+FIELD(CR_STATUS, U, 1, 1)
+FIELD(CR_STATUS, EH, 2, 1)
+FIELD(CR_STATUS, IH, 3, 1)
+FIELD(CR_STATUS, IL, 4, 6)
+FIELD(CR_STATUS, CRS, 10, 6)
+FIELD(CR_STATUS, PRS, 16, 6)
+FIELD(CR_STATUS, NMI, 22, 1)
+
+#define CR_STATUS_PIE  (1u << R_CR_STATUS_PIE_SHIFT)
+#define CR_STATUS_U    (1u << R_CR_STATUS_U_SHIFT)
+#define CR_STATUS_EH   (1u << R_CR_STATUS_EH_SHIFT)
+#define CR_STATUS_IH   (1u << R_CR_STATUS_IH_SHIFT)
+#define CR_STATUS_NMI  (1u << R_CR_STATUS_NMI_SHIFT)
+#define CR_STATUS_RSIE (1u << R_CR_STATUS_RSIE_SHIFT)
+
 #define CR_ESTATUS   1
 #define CR_BSTATUS   2
 #define CR_IENABLE   3
-- 
2.25.1

[PATCH v4 07/33] linux-user/nios2: Trim target_pt_regs to sp and pc

The only thing this struct is used for is passing startup values
from elfload.c to the cpu.  We do not need all registers to be
represented, we do not need the kernel internal stack slots.

The userland argc, argv, and envp values are passed on
the stack, so only SP and PC need updating.

Signed-off-by: Richard Henderson 
---
 linux-user/nios2/target_syscall.h | 25 ++---
 linux-user/elfload.c  |  3 +--
 linux-user/nios2/cpu_loop.c   | 24 +---
 3 files changed, 4 insertions(+), 48 deletions(-)

diff --git a/linux-user/nios2/target_syscall.h b/linux-user/nios2/target_syscall.h
index 561b28d281..0999ce25fd 100644
--- a/linux-user/nios2/target_syscall.h
+++ b/linux-user/nios2/target_syscall.h
@@ -5,29 +5,8 @@
 #define UNAME_MINIMUM_RELEASE "3.19.0"
 
 struct target_pt_regs {
-unsigned long  r8;  /* r8-r15 Caller-saved GP registers */
-unsigned long  r9;
-unsigned long  r10;
-unsigned long  r11;
-unsigned long  r12;
-unsigned long  r13;
-unsigned long  r14;
-unsigned long  r15;
-unsigned long  r1;  /* Assembler temporary */
-unsigned long  r2;  /* Retval LS 32bits */
-unsigned long  r3;  /* Retval MS 32bits */
-unsigned long  r4;  /* r4-r7 Register arguments */
-unsigned long  r5;
-unsigned long  r6;
-unsigned long  r7;
-unsigned long  orig_r2;  /* Copy of r2 ?? */
-unsigned long  ra;  /* Return address */
-unsigned long  fp;  /* Frame pointer */
-unsigned long  sp;  /* Stack pointer */
-unsigned long  gp;  /* Global pointer */
-unsigned long  estatus;
-unsigned long  ea;  /* Exception return address (pc) */
-unsigned long  orig_r7;
+target_ulong sp;
+target_ulong pc;
 };
 
 #define TARGET_MCL_CURRENT 1
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 23ff9659a5..cb14c5f786 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1094,9 +1094,8 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUMBState *env
 
 static void init_thread(struct target_pt_regs *regs, struct image_info *infop)
 {
-regs->ea = infop->entry;
+regs->pc = infop->entry;
 regs->sp = infop->start_stack;
-regs->estatus = 0x3;
 }
 
 #define LO_COMMPAGE  TARGET_PAGE_SIZE
diff --git a/linux-user/nios2/cpu_loop.c b/linux-user/nios2/cpu_loop.c
index 7b20c024db..37e1dfecfd 100644
--- a/linux-user/nios2/cpu_loop.c
+++ b/linux-user/nios2/cpu_loop.c
@@ -132,28 +132,6 @@ void cpu_loop(CPUNios2State *env)
 
 void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs)
 {
-env->regs[0] = 0;
-env->regs[1] = regs->r1;
-env->regs[2] = regs->r2;
-env->regs[3] = regs->r3;
-env->regs[4] = regs->r4;
-env->regs[5] = regs->r5;
-env->regs[6] = regs->r6;
-env->regs[7] = regs->r7;
-env->regs[8] = regs->r8;
-env->regs[9] = regs->r9;
-env->regs[10] = regs->r10;
-env->regs[11] = regs->r11;
-env->regs[12] = regs->r12;
-env->regs[13] = regs->r13;
-env->regs[14] = regs->r14;
-env->regs[15] = regs->r15;
-/* TODO: unsigned long  orig_r2; */
-env->regs[R_RA] = regs->ra;
-env->regs[R_FP] = regs->fp;
 env->regs[R_SP] = regs->sp;
-env->regs[R_GP] = regs->gp;
-env->regs[CR_ESTATUS] = regs->estatus;
-env->pc = regs->ea;
-/* TODO: unsigned long  orig_r7; */
+env->pc = regs->pc;
 }
-- 
2.25.1

[PATCH v4 18/33] target/nios2: Implement cpuid

Copy the existing cpu_index into the space reserved for CR_CPUID.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index 189adf111c..fbcb4da737 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -159,6 +159,7 @@ static ObjectClass *nios2_cpu_class_by_name(const char *cpu_model)
 static void nios2_cpu_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cs = CPU(dev);
+Nios2CPU *cpu = NIOS2_CPU(cs);
 Nios2CPUClass *ncc = NIOS2_CPU_GET_CLASS(dev);
 Error *local_err = NULL;
 
@@ -171,6 +172,9 @@ static void nios2_cpu_realizefn(DeviceState *dev, Error **errp)
 qemu_init_vcpu(cs);
 cpu_reset(cs);
 
+/* We have reserved storage for ctrl[CR_CPUID]; might as well use it. */
+cpu->env.cpuid = cs->cpu_index;
+
 ncc->parent_realize(dev, errp);
 }
 
-- 
2.25.1

[PATCH v4 15/33] target/nios2: Use hw/registerfields.h for CR_TLBMISC fields

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h   | 28 ++--
 target/nios2/helper.c|  7 ++-
 target/nios2/mmu.c   | 33 +++--
 target/nios2/translate.c |  2 +-
 4 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 024ef3ccc0..3857848f7c 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -131,16 +131,24 @@ FIELD(CR_TLBACC, IG, 25, 7)
 #define CR_TLBACC_G  (1u << R_CR_TLBACC_G_SHIFT)
 
 #define CR_TLBMISC   10
-#define   CR_TLBMISC_WAY_SHIFT 20
-#define   CR_TLBMISC_WAY_MASK  (0xF << CR_TLBMISC_WAY_SHIFT)
-#define   CR_TLBMISC_RD    (1 << 19)
-#define   CR_TLBMISC_WR    (1 << 18)
-#define   CR_TLBMISC_PID_SHIFT 4
-#define   CR_TLBMISC_PID_MASK  (0x3FFF << CR_TLBMISC_PID_SHIFT)
-#define   CR_TLBMISC_DBL   (1 << 3)
-#define   CR_TLBMISC_BAD   (1 << 2)
-#define   CR_TLBMISC_PERM  (1 << 1)
-#define   CR_TLBMISC_D (1 << 0)
+
+FIELD(CR_TLBMISC, D, 0, 1)
+FIELD(CR_TLBMISC, PERM, 1, 1)
+FIELD(CR_TLBMISC, BAD, 2, 1)
+FIELD(CR_TLBMISC, DBL, 3, 1)
+FIELD(CR_TLBMISC, PID, 4, 14)
+FIELD(CR_TLBMISC, WR, 18, 1)
+FIELD(CR_TLBMISC, RD, 19, 1)
+FIELD(CR_TLBMISC, WAY, 20, 4)
+FIELD(CR_TLBMISC, EE, 24, 1)
+
+#define CR_TLBMISC_RD    (1u << R_CR_TLBMISC_RD_SHIFT)
+#define CR_TLBMISC_WR    (1u << R_CR_TLBMISC_WR_SHIFT)
+#define CR_TLBMISC_DBL   (1u << R_CR_TLBMISC_DBL_SHIFT)
+#define CR_TLBMISC_BAD   (1u << R_CR_TLBMISC_BAD_SHIFT)
+#define CR_TLBMISC_PERM  (1u << R_CR_TLBMISC_PERM_SHIFT)
+#define CR_TLBMISC_D (1u << R_CR_TLBMISC_D_SHIFT)
+
 #define CR_ENCINJ    11
 #define CR_BADADDR   12
 #define CR_CONFIG    13
diff --git a/target/nios2/helper.c b/target/nios2/helper.c
index 37fb53dadb..93338e86f0 100644
--- a/target/nios2/helper.c
+++ b/target/nios2/helper.c
@@ -276,11 +276,8 @@ bool nios2_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 return false;
 }
 
-if (access_type == MMU_INST_FETCH) {
-env->tlbmisc &= ~CR_TLBMISC_D;
-} else {
-env->tlbmisc |= CR_TLBMISC_D;
-}
+env->tlbmisc = FIELD_DP32(env->tlbmisc, CR_TLBMISC, D,
+  access_type != MMU_INST_FETCH);
 env->pteaddr = FIELD_DP32(env->pteaddr, CR_PTEADDR, VPN,
   address >> TARGET_PAGE_BITS);
 env->mmu.pteaddr_wr = env->pteaddr;
diff --git a/target/nios2/mmu.c b/target/nios2/mmu.c
index d6221936f7..c8b74b5479 100644
--- a/target/nios2/mmu.c
+++ b/target/nios2/mmu.c
@@ -33,7 +33,7 @@ unsigned int mmu_translate(CPUNios2State *env,
target_ulong vaddr, int rw, int mmu_idx)
 {
 Nios2CPU *cpu = env_archcpu(env);
-int pid = (env->mmu.tlbmisc_wr & CR_TLBMISC_PID_MASK) >> 4;
+int pid = FIELD_EX32(env->mmu.tlbmisc_wr, CR_TLBMISC, PID);
 int vpn = vaddr >> 12;
 int way, n_ways = cpu->tlb_num_ways;
 
@@ -96,9 +96,9 @@ void helper_mmu_write_tlbacc(CPUNios2State *env, uint32_t v)
 
 /* if tlbmisc.WE == 1 then trigger a TLB write on writes to TLBACC */
 if (env->tlbmisc & CR_TLBMISC_WR) {
-int way = (env->tlbmisc >> CR_TLBMISC_WAY_SHIFT);
+int way = FIELD_EX32(env->tlbmisc, CR_TLBMISC, WAY);
 int vpn = FIELD_EX32(env->mmu.pteaddr_wr, CR_PTEADDR, VPN);
-int pid = (env->mmu.tlbmisc_wr & CR_TLBMISC_PID_MASK) >> 4;
+int pid = FIELD_EX32(env->mmu.tlbmisc_wr, CR_TLBMISC, PID);
 int g = FIELD_EX32(v, CR_TLBACC, G);
 int valid = FIELD_EX32(vpn, CR_TLBACC, PFN) < 0xC;
 Nios2TLBEntry *entry =
@@ -117,10 +117,8 @@ void helper_mmu_write_tlbacc(CPUNios2State *env, uint32_t v)
 entry->data = newData;
 }
 /* Auto-increment tlbmisc.WAY */
-env->tlbmisc =
-(env->tlbmisc & ~CR_TLBMISC_WAY_MASK) |
-(((way + 1) & (cpu->tlb_num_ways - 1)) <<
- CR_TLBMISC_WAY_SHIFT);
+env->tlbmisc = FIELD_DP32(env->tlbmisc, CR_TLBMISC, WAY,
+  (way + 1) & (cpu->tlb_num_ways - 1));
 }
 
 /* Writes to TLBACC don't change the read-back value */
@@ -130,24 +128,25 @@ void helper_mmu_write_tlbacc(CPUNios2State *env, uint32_t v)
 void helper_mmu_write_tlbmisc(CPUNios2State *env, uint32_t v)
 {
 Nios2CPU *cpu = env_archcpu(env);
+uint32_t new_pid = FIELD_EX32(v, CR_TLBMISC, PID);
+uint32_t old_pid = FIELD_EX32(env->mmu.tlbmisc_wr, CR_TLBMISC, PID);
+uint32_t way = FIELD_EX32(v, CR_TLBMISC, WAY);
 
-trace_nios2_mmu_write_tlbmisc(v >> CR_TLBMISC_WAY_SHIFT,
+trace_nios2_mmu_write_tlbmisc(way,
   (v & CR_TLBMISC_RD) ? 'R' : '.',
   (v & CR_TLBMISC_WR) ? 'W' : '.',
   (v & CR_TLBMISC_DBL) ? '2' : '.',
   (v & CR_TLBMISC_BAD) ? 'B' : '.',
   (v & CR_TLBMISC_PERM) ? 'P' : '.',
   (v &

[PATCH v4 06/33] target/nios2: Do not create TCGv for control registers

We don't need to reference them often, and when we do it
is just as easy to load/store from cpu_env directly.

Signed-off-by: Richard Henderson 
---
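Reader's note, not part of the commit: the change replaces per-register
TCG globals with loads and stores at a byte offset from the CPU state.
The sketch below models only that addressing scheme in plain C.
ModelState, cr_offset(), load_at() and store_at() are hypothetical
names; the layout (control registers following the 32 general purpose
registers in one regs[] array) matches CR_BASE at this point in the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NUM_GP_REGS = 32, NUM_CR_REGS = 32, CR_BASE = NUM_GP_REGS };

/* Hypothetical model: GP regs, control regs and PC share one array. */
typedef struct {
    uint32_t regs[NUM_GP_REGS + NUM_CR_REGS + 1];
} ModelState;

/* Byte offset of control register `cr` within the state struct. */
static size_t cr_offset(unsigned cr)
{
    assert(cr < (unsigned)NUM_CR_REGS);
    return offsetof(ModelState, regs) + (CR_BASE + cr) * sizeof(uint32_t);
}

/* Models a load through base pointer + offset (cf. tcg_gen_ld_tl). */
static uint32_t load_at(const ModelState *env, size_t offset)
{
    uint32_t val;
    memcpy(&val, (const char *)env + offset, sizeof(val));
    return val;
}

/* Models a store through base pointer + offset (cf. tcg_gen_st_tl). */
static void store_at(ModelState *env, size_t offset, uint32_t val)
{
    memcpy((char *)env + offset, &val, sizeof(val));
}
```

The rdctl case for ipending then amounts to two such loads followed by
an AND, which is what the two temporaries in the diff below implement.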
 target/nios2/translate.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index fe21bf45af..cefdcea81e 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -103,7 +103,7 @@ typedef struct DisasContext {
 int   mem_idx;
 } DisasContext;
 
-static TCGv cpu_R[NUM_CORE_REGS];
+static TCGv cpu_R[NUM_GP_REGS];
 static TCGv cpu_pc;
 
 typedef struct Nios2Instruction {
@@ -453,6 +453,7 @@ static void callr(DisasContext *dc, uint32_t code, uint32_t flags)
 static void rdctl(DisasContext *dc, uint32_t code, uint32_t flags)
 {
 R_TYPE(instr, code);
+TCGv t1, t2;
 
 if (!gen_check_supervisor(dc)) {
 return;
@@ -472,10 +473,19 @@ static void rdctl(DisasContext *dc, uint32_t code, uint32_t flags)
  * must perform the AND here, and anywhere else we need the
  * guest value of ipending.
  */
-tcg_gen_and_tl(cpu_R[instr.c], cpu_R[CR_IPENDING], cpu_R[CR_IENABLE]);
+t1 = tcg_temp_new();
+t2 = tcg_temp_new();
+tcg_gen_ld_tl(t1, cpu_env,
+  offsetof(CPUNios2State, regs[CR_IPENDING]));
+tcg_gen_ld_tl(t2, cpu_env,
+  offsetof(CPUNios2State, regs[CR_IENABLE]));
+tcg_gen_and_tl(cpu_R[instr.c], t1, t2);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
 break;
 default:
-tcg_gen_mov_tl(cpu_R[instr.c], cpu_R[instr.imm5 + CR_BASE]);
+tcg_gen_ld_tl(cpu_R[instr.c], cpu_env,
+  offsetof(CPUNios2State, regs[instr.imm5 + CR_BASE]));
 break;
 }
 }
@@ -512,7 +522,8 @@ static void wrctl(DisasContext *dc, uint32_t code, uint32_t flags)
 dc->base.is_jmp = DISAS_UPDATE;
 /* fall through */
 default:
-tcg_gen_mov_tl(cpu_R[instr.imm5 + CR_BASE], v);
+tcg_gen_st_tl(v, cpu_env,
+  offsetof(CPUNios2State, regs[instr.imm5 + CR_BASE]));
 break;
 }
 #endif
@@ -900,7 +911,7 @@ void nios2_tcg_init(void)
 {
 int i;
 
-for (i = 0; i < NUM_CORE_REGS; i++) {
+for (i = 0; i < NUM_GP_REGS; i++) {
 cpu_R[i] = tcg_global_mem_new(cpu_env,
   offsetof(CPUNios2State, regs[i]),
   regnames[i]);
-- 
2.25.1

[PATCH v4 20/33] target/nios2: Remove CPU_INTERRUPT_NMI

This interrupt bit is never set, so testing it in
nios2_cpu_has_work is pointless.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 2 --
 target/nios2/cpu.c | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index b418deec4c..f582e52aa4 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -175,8 +175,6 @@ FIELD(CR_TLBMISC, EE, 24, 1)
 #define EXCP_MPUI 16
 #define EXCP_MPUD 17
 
-#define CPU_INTERRUPT_NMI   CPU_INTERRUPT_TGT_EXT_3
-
 struct CPUNios2State {
 uint32_t regs[NUM_GP_REGS];
 union {
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index ed7b9f9459..2779650128 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -36,7 +36,7 @@ static void nios2_cpu_set_pc(CPUState *cs, vaddr value)
 
 static bool nios2_cpu_has_work(CPUState *cs)
 {
-return cs->interrupt_request & (CPU_INTERRUPT_HARD | CPU_INTERRUPT_NMI);
+return cs->interrupt_request & CPU_INTERRUPT_HARD;
 }
 
 static void nios2_cpu_reset(DeviceState *dev)
-- 
2.25.1

[PATCH v4 10/33] target/nios2: Clean up nios2_cpu_dump_state

Do not print control registers for user-only mode.
Rename reserved control registers to "resN", where
N is the control register index.

Signed-off-by: Richard Henderson 
---
 target/nios2/translate.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 2942921724..7a32e6626d 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -754,7 +754,7 @@ illegal_op:
 t_gen_helper_raise_exception(dc, EXCP_ILLEGAL);
 }
 
-static const char * const gr_regnames[] = {
+static const char * const gr_regnames[NUM_GP_REGS] = {
 "zero",   "at", "r2", "r3",
 "r4", "r5", "r6", "r7",
 "r8", "r9", "r10","r11",
@@ -765,17 +765,18 @@ static const char * const gr_regnames[] = {
 "fp", "ea", "ba", "ra",
 };
 
-static const char * const cr_regnames[] = {
+#ifndef CONFIG_USER_ONLY
+static const char * const cr_regnames[NUM_CR_REGS] = {
 "status", "estatus","bstatus","ienable",
-"ipending",   "cpuid",  "reserved0",  "exception",
+"ipending",   "cpuid",  "res6",   "exception",
 "pteaddr","tlbacc", "tlbmisc","reserved1",
 "badaddr","config", "mpubase","mpuacc",
-"reserved2",  "reserved3",  "reserved4",  "reserved5",
-"reserved6",  "reserved7",  "reserved8",  "reserved9",
-"reserved10", "reserved11", "reserved12", "reserved13",
-"reserved14", "reserved15", "reserved16", "reserved17",
-"rpc"
+"res16",  "res17",  "res18",  "res19",
+"res20",  "res21",  "res22",  "res23",
+"res24",  "res25",  "res26",  "res27",
+"res28",  "res29",  "res30",  "res31",
 };
+#endif
 
 #include "exec/gen-icount.h"
 
@@ -899,13 +900,14 @@ void nios2_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 qemu_fprintf(f, "\n");
 }
 }
+
+#if !defined(CONFIG_USER_ONLY)
 for (i = 0; i < NUM_CR_REGS; i++) {
 qemu_fprintf(f, "%9s=%8.8x ", cr_regnames[i], env->ctrl[i]);
 if ((i + 1) % 4 == 0) {
 qemu_fprintf(f, "\n");
 }
 }
-#if !defined(CONFIG_USER_ONLY)
 qemu_fprintf(f, " mmu write: VPN=%05X PID %02X TLBACC %08X\n",
  env->mmu.pteaddr_wr & CR_PTEADDR_VPN_MASK,
  (env->mmu.tlbmisc_wr & CR_TLBMISC_PID_MASK) >> 4,
-- 
2.25.1

[PATCH v4 03/33] target/nios2: Add NUM_GP_REGS and NUM_CR_REGS

From: Amir Gonnen 

Split NUM_CORE_REGS into components that can be used elsewhere.

Signed-off-by: Amir Gonnen 
Message-Id: <20220303153906.2024748-3-amir.gon...@neuroblade.ai>
[rth: Split out of a larger patch for shadow register sets.]
Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index a00e4229ce..655a440033 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -57,9 +57,11 @@ struct Nios2CPUClass {
 #define EXCEPTION_ADDRESS 0x0004
 #define FAST_TLB_MISS_ADDRESS 0x0008
 
+#define NUM_GP_REGS 32
+#define NUM_CR_REGS 32
 
 /* GP regs + CR regs + PC */
-#define NUM_CORE_REGS (32 + 32 + 1)
+#define NUM_CORE_REGS (NUM_GP_REGS + NUM_CR_REGS + 1)
 
 /* General purpose register aliases */
 #define R_ZERO   0
@@ -80,7 +82,7 @@ struct Nios2CPUClass {
 #define R_RA 31
 
 /* Control register aliases */
-#define CR_BASE  32
+#define CR_BASE  NUM_GP_REGS
 #define CR_STATUS(CR_BASE + 0)
 #define   CR_STATUS_PIE  (1 << 0)
 #define   CR_STATUS_U(1 << 1)
-- 
2.25.1

[PATCH v4 05/33] target/nios2: Split out helper for eret instruction

From: Amir Gonnen 

The implementation of eret will become much more complex
with the introduction of shadow registers.

Signed-off-by: Amir Gonnen 
Message-Id: <20220303153906.2024748-3-amir.gon...@neuroblade.ai>
[rth: Split out of a larger patch for shadow register sets.
  Directly exit to the cpu loop from the helper.]
Signed-off-by: Richard Henderson 
---
 target/nios2/helper.h|  1 +
 target/nios2/op_helper.c |  9 +
 target/nios2/translate.c | 10 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/target/nios2/helper.h b/target/nios2/helper.h
index a44ecfdf7a..02797c384d 100644
--- a/target/nios2/helper.h
+++ b/target/nios2/helper.h
@@ -21,6 +21,7 @@
 DEF_HELPER_FLAGS_2(raise_exception, TCG_CALL_NO_WG, noreturn, env, i32)
 
 #if !defined(CONFIG_USER_ONLY)
+DEF_HELPER_2(eret, noreturn, env, i32)
 DEF_HELPER_2(mmu_write_tlbacc, void, env, i32)
 DEF_HELPER_2(mmu_write_tlbmisc, void, env, i32)
 DEF_HELPER_2(mmu_write_pteaddr, void, env, i32)
diff --git a/target/nios2/op_helper.c b/target/nios2/op_helper.c
index caa885f7b4..df48e82cc2 100644
--- a/target/nios2/op_helper.c
+++ b/target/nios2/op_helper.c
@@ -30,3 +30,12 @@ void helper_raise_exception(CPUNios2State *env, uint32_t index)
 cs->exception_index = index;
 cpu_loop_exit(cs);
 }
+
+#ifndef CONFIG_USER_ONLY
+void helper_eret(CPUNios2State *env, uint32_t new_pc)
+{
+env->regs[CR_STATUS] = env->regs[CR_ESTATUS];
+env->pc = new_pc;
+cpu_loop_exit(env_cpu(env));
+}
+#endif /* !CONFIG_USER_ONLY */
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 7a33181c4b..fe21bf45af 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -391,10 +391,12 @@ static void eret(DisasContext *dc, uint32_t code, uint32_t flags)
 return;
 }
 
-tcg_gen_mov_tl(cpu_R[CR_STATUS], cpu_R[CR_ESTATUS]);
-tcg_gen_mov_tl(cpu_pc, cpu_R[R_EA]);
-
-dc->base.is_jmp = DISAS_JUMP;
+#ifdef CONFIG_USER_ONLY
+g_assert_not_reached();
+#else
+gen_helper_eret(cpu_env, cpu_R[R_EA]);
+dc->base.is_jmp = DISAS_NORETURN;
+#endif
 }
 
 /* PC <- ra */
-- 
2.25.1

[PATCH v4 04/33] target/nios2: Split PC out of env->regs[]

It is cleaner to have a separate name for this variable.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h  | 10 +++-
 linux-user/elfload.c|  2 +-
 linux-user/nios2/cpu_loop.c | 17 ++---
 linux-user/nios2/signal.c   |  6 ++---
 target/nios2/cpu.c  |  8 +++---
 target/nios2/helper.c   | 51 +
 target/nios2/translate.c| 26 ++-
 7 files changed, 57 insertions(+), 63 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 655a440033..727d31c427 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -60,8 +60,8 @@ struct Nios2CPUClass {
 #define NUM_GP_REGS 32
 #define NUM_CR_REGS 32
 
-/* GP regs + CR regs + PC */
-#define NUM_CORE_REGS (NUM_GP_REGS + NUM_CR_REGS + 1)
+/* GP regs + CR regs */
+#define NUM_CORE_REGS (NUM_GP_REGS + NUM_CR_REGS)
 
 /* General purpose register aliases */
 #define R_ZERO   0
@@ -131,9 +131,6 @@ struct Nios2CPUClass {
 #define CR_MPUBASE   (CR_BASE + 14)
 #define CR_MPUACC(CR_BASE + 15)
 
-/* Other registers */
-#define R_PC 64
-
 /* Exceptions */
 #define EXCP_BREAK0x1000
 #define EXCP_RESET0
@@ -159,6 +156,7 @@ struct Nios2CPUClass {
 
 struct CPUNios2State {
 uint32_t regs[NUM_CORE_REGS];
+uint32_t pc;
 
 #if !defined(CONFIG_USER_ONLY)
 Nios2MMU mmu;
@@ -242,7 +240,7 @@ typedef Nios2CPU ArchCPU;
 static inline void cpu_get_tb_cpu_state(CPUNios2State *env, target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
 {
-*pc = env->regs[R_PC];
+*pc = env->pc;
 *cs_base = 0;
 *flags = (env->regs[CR_STATUS] & (CR_STATUS_EH | CR_STATUS_U));
 }
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 9628a38361..23ff9659a5 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1170,7 +1170,7 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs,
 (*regs)[30] = -1;/* R_SSTATUS */
 (*regs)[31] = tswapreg(env->regs[R_RA]);
 
-(*regs)[32] = tswapreg(env->regs[R_PC]);
+(*regs)[32] = tswapreg(env->pc);
 
 (*regs)[33] = -1; /* R_STATUS */
 (*regs)[34] = tswapreg(env->regs[CR_ESTATUS]);
diff --git a/linux-user/nios2/cpu_loop.c b/linux-user/nios2/cpu_loop.c
index 1e93ef34e6..7b20c024db 100644
--- a/linux-user/nios2/cpu_loop.c
+++ b/linux-user/nios2/cpu_loop.c
@@ -56,25 +56,24 @@ void cpu_loop(CPUNios2State *env)
 env->regs[2] = abs(ret);
 /* Return value is 0..4096 */
 env->regs[7] = ret > 0xf000u;
-env->regs[R_PC] += 4;
+env->pc += 4;
 break;
 
 case 1:
 qemu_log_mask(CPU_LOG_INT, "\nTrap 1\n");
-force_sig_fault(TARGET_SIGUSR1, 0, env->regs[R_PC]);
+force_sig_fault(TARGET_SIGUSR1, 0, env->pc);
 break;
 case 2:
 qemu_log_mask(CPU_LOG_INT, "\nTrap 2\n");
-force_sig_fault(TARGET_SIGUSR2, 0, env->regs[R_PC]);
+force_sig_fault(TARGET_SIGUSR2, 0, env->pc);
 break;
 case 31:
 qemu_log_mask(CPU_LOG_INT, "\nTrap 31\n");
-force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->regs[R_PC]);
+force_sig_fault(TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->pc);
 break;
 default:
 qemu_log_mask(CPU_LOG_INT, "\nTrap %d\n", env->error_code);
-force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLTRP,
-env->regs[R_PC]);
+force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLTRP, env->pc);
 break;
 
 case 16: /* QEMU specific, for __kuser_cmpxchg */
@@ -99,7 +98,7 @@ void cpu_loop(CPUNios2State *env)
 o = env->regs[5];
 n = env->regs[6];
 env->regs[2] = qatomic_cmpxchg(h, o, n) - o;
-env->regs[R_PC] += 4;
+env->pc += 4;
 }
 break;
 }
@@ -117,7 +116,7 @@ void cpu_loop(CPUNios2State *env)
 info.si_errno = 0;
 /* TODO: check env->error_code */
 info.si_code = TARGET_SEGV_MAPERR;
-info._sifields._sigfault._addr = env->regs[R_PC];
+info._sifields._sigfault._addr = env->pc;
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 }
 break;
@@ -155,6 +154,6 @@ void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs)
 env->regs[R_SP] = regs->sp;
 env->regs[R_GP] = regs->gp;
 env->regs[CR_ESTATUS] = regs->estatus;
-env->regs[R_PC] = regs->ea;
+env->pc = regs->ea;
 /* TODO: unsigned long  orig_r7; */
 }
diff --git a/linux-user/nios2/signal.c b/linux-user/nios2/signal.c
index 517cd39270..ccfaa75d3b 100644
--- a/linux-user/nios2/signal.c
+++

[PATCH v4 23/33] target/nios2: Drop CR_STATUS_EH from tb->flags

There's nothing about EH that affects translation,
so there's no need to include it in tb->flags.
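
As a rough illustration of the reasoning above (a standalone sketch, not
QEMU code): once EH is masked out, two CPU states that differ only in EH
yield the same flags value. The DEMO_* macros and demo_tb_flags() are
hypothetical names; the bit positions mirror CR_STATUS_U and CR_STATUS_EH
from cpu.h.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions mirror CR_STATUS_U and CR_STATUS_EH; names are made up. */
#define DEMO_STATUS_U   (1u << 1)
#define DEMO_STATUS_EH  (1u << 2)

/*
 * Hypothetical stand-in for the flags computation in
 * cpu_get_tb_cpu_state(): keep only the bits that change how
 * instructions are translated, which here is just the user-mode bit.
 */
static uint32_t demo_tb_flags(uint32_t status)
{
    return status & DEMO_STATUS_U;
}
```

With this mask, a status value with only EH set maps to the same flags as
a status of zero, so both can share the same translated code.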

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index f582e52aa4..2a5e070960 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -291,7 +291,7 @@ static inline void cpu_get_tb_cpu_state(CPUNios2State *env, target_ulong *pc,
 {
 *pc = env->pc;
 *cs_base = 0;
-*flags = env->status & (CR_STATUS_EH | CR_STATUS_U);
+*flags = env->status & CR_STATUS_U;
 }
 
 #endif /* NIOS2_CPU_H */
-- 
2.25.1

[PATCH v4 02/33] target/nios2: Stop generating code if gen_check_supervisor fails

Whether the cpu is in user-mode or not is something that we
know at translation-time.  We do not need to generate code
after having raised an exception.
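
The pattern can be sketched outside of QEMU as follows. This is only an
illustration under assumptions: DemoContext, demo_check_supervisor() and
demo_translate_privileged() are hypothetical stand-ins for DisasContext,
gen_check_supervisor() and a privileged-instruction translator, and the
counters merely record what would have been emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_STATUS_U (1u << 1)   /* mirrors CR_STATUS_U */

/* Hypothetical, trimmed-down stand-in for DisasContext. */
typedef struct DemoContext {
    uint32_t tb_flags;   /* stands in for dc->base.tb->flags */
    int ops_emitted;     /* counts "real" ops generated for the insn */
    bool raised;         /* set when an exception would be generated */
} DemoContext;

/* Returns false when the exception has been raised and the caller must stop. */
static bool demo_check_supervisor(DemoContext *dc)
{
    if (dc->tb_flags & DEMO_STATUS_U) {
        dc->raised = true;   /* real code raises EXCP_SUPERI here */
        return false;
    }
    return true;
}

/* A privileged instruction: bail out early instead of emitting dead code. */
static void demo_translate_privileged(DemoContext *dc)
{
    if (!demo_check_supervisor(dc)) {
        return;
    }
    dc->ops_emitted++;
}
```

In user mode nothing is emitted after the exception; in supervisor mode
the instruction body is generated as before.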

Suggested-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/nios2/translate.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 341f3a8273..1e0ab686dc 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -169,12 +169,14 @@ static void gen_excp(DisasContext *dc, uint32_t code, uint32_t flags)
 t_gen_helper_raise_exception(dc, flags);
 }
 
-static void gen_check_supervisor(DisasContext *dc)
+static bool gen_check_supervisor(DisasContext *dc)
 {
 if (dc->base.tb->flags & CR_STATUS_U) {
 /* CPU in user mode, privileged instruction called, stop. */
 t_gen_helper_raise_exception(dc, EXCP_SUPERI);
+return false;
 }
+return true;
 }
 
 /*
@@ -384,7 +386,9 @@ static const Nios2Instruction i_type_instructions[] = {
  */
 static void eret(DisasContext *dc, uint32_t code, uint32_t flags)
 {
-gen_check_supervisor(dc);
+if (!gen_check_supervisor(dc)) {
+return;
+}
 
 tcg_gen_mov_tl(cpu_R[CR_STATUS], cpu_R[CR_ESTATUS]);
 tcg_gen_mov_tl(cpu_R[R_PC], cpu_R[R_EA]);
@@ -447,7 +451,9 @@ static void rdctl(DisasContext *dc, uint32_t code, uint32_t flags)
 {
 R_TYPE(instr, code);
 
-gen_check_supervisor(dc);
+if (!gen_check_supervisor(dc)) {
+return;
+}
 
 if (unlikely(instr.c == R_ZERO)) {
 return;
@@ -474,9 +480,13 @@ static void rdctl(DisasContext *dc, uint32_t code, uint32_t flags)
 /* ctlN <- rA */
 static void wrctl(DisasContext *dc, uint32_t code, uint32_t flags)
 {
-gen_check_supervisor(dc);
+if (!gen_check_supervisor(dc)) {
+return;
+}
 
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_USER_ONLY
+g_assert_not_reached();
+#else
 R_TYPE(instr, code);
 TCGv v = load_gpr(dc, instr.a);
 
-- 
2.25.1

[PATCH v4 09/33] target/nios2: Split control registers away from general registers

Place the control registers into their own array, env->ctrl[].

Use an anonymous union and struct to give the entries in the
array distinct names, so that one may write env->foo instead
of env->ctrl[CR_FOO].
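
A minimal standalone sketch of the aliasing, assuming a C11 compiler.
DemoNios2State, demo_read_ctrl() and demo_eret() are hypothetical names
for illustration; only the first three control registers are spelled out,
and demo_eret() omits the cpu_loop_exit() that the real helper performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CR_REGS 32

/* Indices into ctrl[]; only the first few are listed in this sketch. */
enum { CR_STATUS = 0, CR_ESTATUS = 1, CR_BSTATUS = 2 };

/* Hypothetical, trimmed-down stand-in for CPUNios2State. */
typedef struct DemoNios2State {
    union {
        uint32_t ctrl[NUM_CR_REGS];   /* indexed access, e.g. for rdctl */
        struct {                      /* named access to the same storage */
            uint32_t status;
            uint32_t estatus;
            uint32_t bstatus;
        };
    };
    uint32_t pc;
} DemoNios2State;

/* Compile-time check that a name lands on the intended array slot. */
static_assert(offsetof(DemoNios2State, estatus) ==
              offsetof(DemoNios2State, ctrl) + CR_ESTATUS * sizeof(uint32_t),
              "estatus must alias ctrl[CR_ESTATUS]");

/* Indexed read, as a dump routine or rdctl emulation might do. */
static uint32_t demo_read_ctrl(const DemoNios2State *env, unsigned idx)
{
    return env->ctrl[idx];
}

/* Named access: restore status from estatus and set the new pc. */
static void demo_eret(DemoNios2State *env, uint32_t new_pc)
{
    env->status = env->estatus;
    env->pc = new_pc;
}
```

The static assertion is optional, but it catches a struct member being
added or reordered without the matching CR_* index being updated.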

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h   |  64 ++-
 target/nios2/cpu.c   |  19 +++
 target/nios2/helper.c| 106 +++
 target/nios2/mmu.c   |  26 +-
 target/nios2/op_helper.c |   2 +-
 target/nios2/translate.c |  31 +++-
 6 files changed, 136 insertions(+), 112 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 14ed46959e..5bc0e353b4 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -60,9 +60,6 @@ struct Nios2CPUClass {
 #define NUM_GP_REGS 32
 #define NUM_CR_REGS 32
 
-/* GP regs + CR regs */
-#define NUM_CORE_REGS (NUM_GP_REGS + NUM_CR_REGS)
-
 /* General purpose register aliases */
 #define R_ZERO   0
 #define R_AT 1
@@ -82,8 +79,7 @@ struct Nios2CPUClass {
 #define R_RA 31
 
 /* Control register aliases */
-#define CR_BASE  NUM_GP_REGS
-#define CR_STATUS(CR_BASE + 0)
+#define CR_STATUS0
 #define   CR_STATUS_PIE  (1 << 0)
 #define   CR_STATUS_U(1 << 1)
 #define   CR_STATUS_EH   (1 << 2)
@@ -93,19 +89,19 @@ struct Nios2CPUClass {
 #define   CR_STATUS_PRS  (63 << 16)
 #define   CR_STATUS_NMI  (1 << 22)
 #define   CR_STATUS_RSIE (1 << 23)
-#define CR_ESTATUS   (CR_BASE + 1)
-#define CR_BSTATUS   (CR_BASE + 2)
-#define CR_IENABLE   (CR_BASE + 3)
-#define CR_IPENDING  (CR_BASE + 4)
-#define CR_CPUID (CR_BASE + 5)
-#define CR_CTL6  (CR_BASE + 6)
-#define CR_EXCEPTION (CR_BASE + 7)
-#define CR_PTEADDR   (CR_BASE + 8)
+#define CR_ESTATUS   1
+#define CR_BSTATUS   2
+#define CR_IENABLE   3
+#define CR_IPENDING  4
+#define CR_CPUID 5
+#define CR_CTL6  6
+#define CR_EXCEPTION 7
+#define CR_PTEADDR   8
 #define   CR_PTEADDR_PTBASE_SHIFT 22
 #define   CR_PTEADDR_PTBASE_MASK  (0x3FF << CR_PTEADDR_PTBASE_SHIFT)
 #define   CR_PTEADDR_VPN_SHIFT2
 #define   CR_PTEADDR_VPN_MASK (0xF << CR_PTEADDR_VPN_SHIFT)
-#define CR_TLBACC(CR_BASE + 9)
+#define CR_TLBACC9
 #define   CR_TLBACC_IGN_SHIFT 25
 #define   CR_TLBACC_IGN_MASK  (0x7F << CR_TLBACC_IGN_SHIFT)
 #define   CR_TLBACC_C (1 << 24)
@@ -114,7 +110,7 @@ struct Nios2CPUClass {
 #define   CR_TLBACC_X (1 << 21)
 #define   CR_TLBACC_G (1 << 20)
 #define   CR_TLBACC_PFN_MASK  0x000F
-#define CR_TLBMISC   (CR_BASE + 10)
+#define CR_TLBMISC   10
 #define   CR_TLBMISC_WAY_SHIFT 20
 #define   CR_TLBMISC_WAY_MASK  (0xF << CR_TLBMISC_WAY_SHIFT)
 #define   CR_TLBMISC_RD(1 << 19)
@@ -125,11 +121,11 @@ struct Nios2CPUClass {
 #define   CR_TLBMISC_BAD   (1 << 2)
 #define   CR_TLBMISC_PERM  (1 << 1)
 #define   CR_TLBMISC_D (1 << 0)
-#define CR_ENCINJ(CR_BASE + 11)
-#define CR_BADADDR   (CR_BASE + 12)
-#define CR_CONFIG(CR_BASE + 13)
-#define CR_MPUBASE   (CR_BASE + 14)
-#define CR_MPUACC(CR_BASE + 15)
+#define CR_ENCINJ11
+#define CR_BADADDR   12
+#define CR_CONFIG13
+#define CR_MPUBASE   14
+#define CR_MPUACC15
 
 /* Exceptions */
 #define EXCP_BREAK0x1000
@@ -155,7 +151,28 @@ struct Nios2CPUClass {
 #define CPU_INTERRUPT_NMI   CPU_INTERRUPT_TGT_EXT_3
 
 struct CPUNios2State {
-uint32_t regs[NUM_CORE_REGS];
+uint32_t regs[NUM_GP_REGS];
+union {
+uint32_t ctrl[NUM_CR_REGS];
+struct {
+uint32_t status;
+uint32_t estatus;
+uint32_t bstatus;
+uint32_t ienable;
+uint32_t ipending;
+uint32_t cpuid;
+uint32_t reserved6;
+uint32_t exception;
+uint32_t pteaddr;
+uint32_t tlbacc;
+uint32_t tlbmisc;
+uint32_t eccinj;
+uint32_t badaddr;
+uint32_t config;
+uint32_t mpubase;
+uint32_t mpuacc;
+};
+};
 uint32_t pc;
 
 #if !defined(CONFIG_USER_ONLY)
@@ -213,8 +230,7 @@ void do_nios2_semihosting(CPUNios2State *env);
 
 static inline int cpu_mmu_index(CPUNios2State *env, bool ifetch)
 {
-return (env->regs[CR_STATUS] & CR_STATUS_U) ? MMU_USER_IDX :
-  MMU_SUPERVISOR_IDX;
+return (env->status & CR_STATUS_U) ? MMU_USER_IDX : MMU_SUPERVISOR_IDX;
 }
 
 #ifdef CONFIG_USER_ONLY
@@ -237,7 +253,7 @@ static inline void cpu_get_tb_cpu_state(CPUNios2State *env, target_ulong *pc,
 {
 *pc = env->pc;
 *cs_base = 0;
-*flags = (env->regs[CR_STATUS] & (CR_STATUS_EH | CR_STATUS_U));
+*flags = env->status & (CR_STATUS_EH | CR_STATUS_U);
 }
 
 #endif /* NIOS2_CPU_H */
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index 40031c9f20..f2813d3b47 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -53,14 +53,15 @@ static void nios2_cpu_reset(DeviceState *dev)

[PATCH v4 08/33] target/nios2: Remove cpu_interrupts_enabled

This function is unused.  The real computation of this value
is located in nios2_cpu_exec_interrupt.

Signed-off-by: Richard Henderson 
---
 target/nios2/cpu.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 727d31c427..14ed46959e 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -227,11 +227,6 @@ bool nios2_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
 bool probe, uintptr_t retaddr);
 #endif
 
-static inline int cpu_interrupts_enabled(CPUNios2State *env)
-{
-return env->regs[CR_STATUS] & CR_STATUS_PIE;
-}
-
 typedef CPUNios2State CPUArchState;
 typedef Nios2CPU ArchCPU;
 
-- 
2.25.1

[PATCH v4 01/33] target/nios2: Check supervisor on eret

From: Amir Gonnen 

eret instruction is only allowed in supervisor mode.

Reviewed-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Signed-off-by: Amir Gonnen 
Message-Id: <20220303153906.2024748-2-amir.gon...@neuroblade.ai>
Signed-off-by: Richard Henderson 
---
 target/nios2/translate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index f89271dbed..341f3a8273 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -384,6 +384,8 @@ static const Nios2Instruction i_type_instructions[] = {
  */
 static void eret(DisasContext *dc, uint32_t code, uint32_t flags)
 {
+gen_check_supervisor(dc);
+
 tcg_gen_mov_tl(cpu_R[CR_STATUS], cpu_R[CR_ESTATUS]);
 tcg_gen_mov_tl(cpu_R[R_PC], cpu_R[R_EA]);
 
-- 
2.25.1

[PATCH v4 00/33] target/nios2: Shadow register set, EIC and VIC

Hi Amir,

I've done a bunch of cleanup which Peter and I had recommended
during review.  The major bits are:

* Note reserved bits of control registers, which may include
  the entire control register.
* Complete conversion to registerfields.h.
* Use pointer to shadow register set, akin to Sparc windows.
* Use -M 10m50-ghrd,vic=on to enable VIC.

I can test this to a point, but not the final result.
It works to a point with our existing nios2 kernels,
but of course no interrupts are received. So it verifies
that something changes, and that the init doesn't crash,
but that's about it.


r~


Amir Gonnen (5):
  target/nios2: Check supervisor on eret
  target/nios2: Add NUM_GP_REGS and NUM_CR_REGS
  target/nios2: Split out helper for eret instruction
  hw/intc: Vectored Interrupt Controller (VIC)
  hw/nios2: Machine with a Vectored Interrupt Controller

Richard Henderson (28):
  target/nios2: Stop generating code if gen_check_supervisor fails
  target/nios2: Split PC out of env->regs[]
  target/nios2: Do not create TCGv for control registers
  linux-user/nios2: Trim target_pc_regs to sp and pc
  target/nios2: Remove cpu_interrupts_enabled
  target/nios2: Split control registers away from general registers
  target/nios2: Clean up nios2_cpu_dump_state
  target/nios2: Use hw/registerfields.h for CR_STATUS fields
  target/nios2: Use hw/registerfields.h for CR_EXCEPTION fields
  target/nios2: Use hw/registerfields.h for CR_TLBADDR fields
  target/nios2: Use hw/registerfields.h for CR_TLBACC fields
  target/nios2: Use hw/registerfields.h for CR_TLBMISC fields
  target/nios2: Move R_FOO and CR_BAR into enumerations
  target/nios2: Prevent writes to read-only or reserved control fields
  target/nios2: Implement cpuid
  target/nios2: Implement CR_STATUS.RSIE
  target/nios2: Remove CPU_INTERRUPT_NMI
  target/nios2: Use tcg_constant_tl
  target/nios2: Introduce dest_gpr
  target/nios2: Drop CR_STATUS_EH from tb->flags
  target/nios2: Introduce shadow register sets
  target/nios2: Implement rdprs, wrprs
  target/nios2: Update helper_eret for shadow registers
  target/nios2: Create EXCP_SEMIHOST for semi-hosting
  target/nios2: Clean up nios2_cpu_do_interrupt
  target/nios2: Implement EIC interrupt processing
  hw/nios2: Introduce Nios2MachineState
  hw/nios2: Move memory regions into Nios2Machine

 linux-user/nios2/target_syscall.h |  25 +-
 target/nios2/cpu.h| 253 --
 target/nios2/helper.h |   3 +
 hw/intc/nios2_vic.c   | 341 
 hw/nios2/10m50_devboard.c | 118 +++--
 hw/nios2/boot.c   |   8 +-
 linux-user/elfload.c  |   5 +-
 linux-user/nios2/cpu_loop.c   |  39 +--
 linux-user/nios2/signal.c |   6 +-
 target/nios2/cpu.c| 194 +++---
 target/nios2/helper.c | 218 +++
 target/nios2/mmu.c|  73 +++---
 target/nios2/nios2-semi.c |  13 +-
 target/nios2/op_helper.c  |  39 +++
 target/nios2/translate.c  | 422 +++---
 hw/intc/Kconfig   |   3 +
 hw/intc/meson.build   |   1 +
 hw/nios2/Kconfig  |   1 +
 18 files changed, 1223 insertions(+), 539 deletions(-)
 create mode 100644 hw/intc/nios2_vic.c

-- 
2.25.1

Re: [PATCH v5 00/15] vDPA shadow virtqueue

On Tue, Mar 8, 2022 at 3:11 PM Michael S. Tsirkin  wrote:
>
> On Tue, Mar 08, 2022 at 02:03:32PM +0800, Jason Wang wrote:
> >
> > > On 2022/3/7 11:33 PM, Eugenio Pérez wrote:
> > > > This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This
> > > > is intended as a new method of tracking the memory the devices touch
> > > > during a migration process: instead of relying on the vhost device's
> > > > dirty logging capability, SVQ intercepts the VQ dataplane, forwarding
> > > > the descriptors between VM and device. This way qemu is the effective
> > > > writer of the guest's memory, as in qemu's virtio device operation.
> > >
> > > When SVQ is enabled qemu offers a new virtual address space to the
> > > device to read and write into, and it maps new vrings and the guest
> > > memory in it. SVQ also intercepts kicks and calls between the device
> > > > and the guest. Relaying the used buffers causes the dirty memory to be
> > > > tracked.
> > >
> > > This effectively means that vDPA device passthrough is intercepted by
> > > qemu. While SVQ should only be enabled at migration time, the switching
> > > from regular mode to SVQ mode is left for a future series.
> > >
> > > It is based on the ideas of DPDK SW assisted LM, in the series of
> > > > DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this series
> > > > does not map the shadow vq in the guest's VA, but in qemu's.
> > >
> > > For qemu to use shadow virtqueues the guest virtio driver must not use
> > > features like event_idx.
> > >
> > > SVQ needs to be enabled with cmdline:
> > >
> > > -netdev type=vhost-vdpa,vhostdev=vhost-vdpa-0,id=vhost-vdpa0,svq=on
>
> A stable API for an incomplete feature is a problem imho.

It should be "x-svq".

>
>
> > >
> > > > The first three patches enable notifications forwarding with
> > > > assistance of qemu. It's easy to enable only this if the relevant
> > > > cmdline part of the last patch is applied on top of these.
> > > >
> > > > The next four patches implement the actual buffer forwarding. However,
> > > > addresses are not translated from HVA, so they will need a host device
> > > > with an iommu allowing them to access all of the HVA range.
> > > >
> > > > The last part of the series properly uses the host iommu, so qemu
> > > > creates a new iova address space in the device's range and translates
> > > > the buffers in it. Finally, it adds the cmdline parameter.
> > >
> > > Some simple performance tests with netperf were done. They used a nested
> > > guest with vp_vdpa, vhost-kernel at L0 host. Starting with no svq and a
> > > baseline average of ~9009.96Mbps:
> > > > Recv   Send    Send
> > > > Socket Socket  Message  Elapsed
> > > > Size   Size    Size     Time     Throughput
> > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > > 131072  16384  16384    30.01    9061.03
> > > > 131072  16384  16384    30.01    8962.94
> > > > 131072  16384  16384    30.01    9005.92
> > >
> > > > Enabling SVQ buffer forwarding reduces throughput to about:
> > > > Recv   Send    Send
> > > > Socket Socket  Message  Elapsed
> > > > Size   Size    Size     Time     Throughput
> > > > bytes  bytes   bytes    secs.    10^6bits/sec
> > > > 131072  16384  16384    30.01    7689.72
> > > > 131072  16384  16384    30.00    7752.07
> > > > 131072  16384  16384    30.01    7750.30
> > >
> > > However, many performance improvements were left out of this series for
> > > simplicity, so difference should shrink in the future.
> > >
> > > Comments are welcome.
> >
> >
> > Hi Michael:
> >
> > What do you think of this series? It looks good to me as a start. The
> > feature could only be enabled as a dedicated parameter. If you're ok, I'd
> > try to make it for 7.0.
> >
> > Thanks
>
> Well that's cutting it awfully close, and it's not really useful
> at the current stage, is it?

This allows vDPA to be migrated when using "x-svq=on". But anyhow it's
experimental.

>
> The IOVA trick does not feel complete either.

I don't follow. We don't use any IOVA trick as DPDK did (it reserves IOVA
for the shadow vq), so we won't suffer from DPDK's issues.

Thanks

>
> >
> > >
> > > TODO on future series:
> > > > * Event, indirect, packed, and other features of virtio.
> > > > * To support different sets of features between the device<->SVQ and the
> > > >   SVQ<->guest communication.
> > > > * Support of device host notifier memory regions.
> > > > * To separate buffer forwarding into its own AIO context, so we can
> > > >   throw more threads at that task and we don't need to stop the main
> > > >   event loop.
> > > * Support multiqueue virtio-net vdpa.
> > > * Proper documentation.
> > >
> > > Changes from v4:
> > > > * Iterate the iova->hva tree instead of maintaining our own tree, so we
> > > >   support HVA overlaps.
> > > > * Fix: Errno completion at failure.
> > > > * Rename x-svq to svq, so a change to stable does not affect the cmdline
> > > >   parameter.
> > >
> > > Changes from v3:
> > > * Add @unstable feature to NetdevVhostVDPAOptions.x-svq.
> > > > * Fix incomplete mapping (by 1 byte) of memory regions if svq is enabled.
> > > v3 link:
> > >

Re: [PATCH v5 15/15] vdpa: Add x-svq to NetdevVhostVDPAOptions

On Mon, Mar 07, 2022 at 04:33:34PM +0100, Eugenio Pérez wrote:
> Finally offering the possibility to enable SVQ from the command line.
> 
> Signed-off-by: Eugenio Pérez 
> ---
>  qapi/net.json|  8 +++-
>  net/vhost-vdpa.c | 48 
>  2 files changed, 47 insertions(+), 9 deletions(-)
> 
> diff --git a/qapi/net.json b/qapi/net.json
> index 7fab2e7cd8..d626fa441c 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -445,12 +445,18 @@
>  # @queues: number of queues to be created for multiqueue vhost-vdpa
>  #  (default: 1)
>  #
> +# @svq: Start device with (experimental) shadow virtqueue. (Since 7.0)
> +#
> +# Features:
> +# @unstable: Member @svq is experimental.
> +#
>  # Since: 5.1
>  ##
>  { 'struct': 'NetdevVhostVDPAOptions',
>'data': {
>  '*vhostdev': 'str',
> -'*queues':   'int' } }
> +'*queues':   'int',
> +'*svq':  {'type': 'bool', 'features' : [ 'unstable'] } } }
>  
>  ##
>  # @NetClientDriver:

I think this should be x-svq same as other unstable features.

> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 1e9fe47c03..c827921654 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -127,7 +127,11 @@ err_init:
>  static void vhost_vdpa_cleanup(NetClientState *nc)
>  {
>  VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +struct vhost_dev *dev = s->vhost_vdpa.dev;
>  
> +if (dev && dev->vq_index + dev->nvqs == dev->vq_index_end) {
> +g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> +}
>  if (s->vhost_net) {
>  vhost_net_cleanup(s->vhost_net);
>  g_free(s->vhost_net);
> @@ -187,13 +191,23 @@ static NetClientInfo net_vhost_vdpa_info = {
>  .check_peer_type = vhost_vdpa_check_peer_type,
>  };
>  
> +static int vhost_vdpa_get_iova_range(int fd,
> + struct vhost_vdpa_iova_range *iova_range)
> +{
> +int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> +
> +return ret < 0 ? -errno : 0;
> +}
> +
>  static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> -   const char *device,
> -   const char *name,
> -   int vdpa_device_fd,
> -   int queue_pair_index,
> -   int nvqs,
> -   bool is_datapath)
> +   const char *device,
> +   const char *name,
> +   int vdpa_device_fd,
> +   int queue_pair_index,
> +   int nvqs,
> +   bool is_datapath,
> +   bool svq,
> +   VhostIOVATree *iova_tree)
>  {
>  NetClientState *nc = NULL;
>  VhostVDPAState *s;
> @@ -211,6 +225,8 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>  
>  s->vhost_vdpa.device_fd = vdpa_device_fd;
>  s->vhost_vdpa.index = queue_pair_index;
> +s->vhost_vdpa.shadow_vqs_enabled = svq;
> +s->vhost_vdpa.iova_tree = iova_tree;
>  ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
>  if (ret) {
>  qemu_del_net_client(nc);
> @@ -266,6 +282,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>  g_autofree NetClientState **ncs = NULL;
>  NetClientState *nc;
>  int queue_pairs, i, has_cvq = 0;
> +g_autoptr(VhostIOVATree) iova_tree = NULL;
>  
>  assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>  opts = &netdev->u.vhost_vdpa;
> @@ -285,29 +302,44 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>  qemu_close(vdpa_device_fd);
>  return queue_pairs;
>  }
> +if (opts->svq) {
> +struct vhost_vdpa_iova_range iova_range;
> +
> +if (has_cvq) {
> +error_setg(errp, "vdpa svq does not work with cvq");
> +goto err_svq;
> +}
> +vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range);
> +iova_tree = vhost_iova_tree_new(iova_range.first, iova_range.last);
> +}
>  
>  ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
>  
>  for (i = 0; i < queue_pairs; i++) {
>  ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> - vdpa_device_fd, i, 2, true);
> + vdpa_device_fd, i, 2, true, opts->svq,
> + iova_tree);
>  if (!ncs[i])
>  goto err;
>  }
>  
>  if (has_cvq) {
>  nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> - vdpa_device_fd, i, 1, false);
> +

Re: [PATCH v5 00/15] vDPA shadow virtqueue

On Tue, Mar 08, 2022 at 02:03:32PM +0800, Jason Wang wrote:
> 
> On 2022/3/7 11:33 PM, Eugenio Pérez wrote:
> > This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This
> > is intended as a new method of tracking the memory the devices touch
> > during a migration process: instead of relying on the vhost device's
> > dirty logging capability, SVQ intercepts the VQ dataplane, forwarding
> > the descriptors between VM and device. This way qemu is the effective
> > writer of the guest's memory, as in qemu's virtio device operation.
> > 
> > When SVQ is enabled qemu offers a new virtual address space to the
> > device to read and write into, and it maps new vrings and the guest
> > memory in it. SVQ also intercepts kicks and calls between the device
> > and the guest. Relaying the used buffers causes the dirty memory to be
> > tracked.
> > 
> > This effectively means that vDPA device passthrough is intercepted by
> > qemu. While SVQ should only be enabled at migration time, the switching
> > from regular mode to SVQ mode is left for a future series.
> > 
> > It is based on the ideas of DPDK SW assisted LM, in the series of
> > DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this series
> > does not map the shadow vq in the guest's VA, but in qemu's.
> > 
> > For qemu to use shadow virtqueues the guest virtio driver must not use
> > features like event_idx.
> > 
> > SVQ needs to be enabled with cmdline:
> > 
> > -netdev type=vhost-vdpa,vhostdev=vhost-vdpa-0,id=vhost-vdpa0,svq=on

A stable API for an incomplete feature is a problem imho.


> > 
> > The first three patches enable notifications forwarding with
> > assistance of qemu. It's easy to enable only this if the relevant
> > cmdline part of the last patch is applied on top of these.
> >
> > The next four patches implement the actual buffer forwarding. However,
> > addresses are not translated from HVA, so they will need a host device
> > with an iommu allowing them to access all of the HVA range.
> >
> > The last part of the series properly uses the host iommu, so qemu
> > creates a new iova address space in the device's range and translates
> > the buffers in it. Finally, it adds the cmdline parameter.
> > 
> > Some simple performance tests with netperf were done. They used a nested
> > guest with vp_vdpa, vhost-kernel at L0 host. Starting with no svq and a
> > baseline average of ~9009.96Mbps:
> > Recv   Send    Send
> > Socket Socket  Message  Elapsed
> > Size   Size    Size     Time     Throughput
> > bytes  bytes   bytes    secs.    10^6bits/sec
> > 131072  16384  16384    30.01    9061.03
> > 131072  16384  16384    30.01    8962.94
> > 131072  16384  16384    30.01    9005.92
> > 
> > Enabling SVQ buffer forwarding reduces throughput to about:
> > Recv   Send    Send
> > Socket Socket  Message  Elapsed
> > Size   Size    Size     Time     Throughput
> > bytes  bytes   bytes    secs.    10^6bits/sec
> > 131072  16384  16384    30.01    7689.72
> > 131072  16384  16384    30.00    7752.07
> > 131072  16384  16384    30.01    7750.30
> > 
> > However, many performance improvements were left out of this series for
> > simplicity, so difference should shrink in the future.
> > 
> > Comments are welcome.
> 
> 
> Hi Michael:
> 
> What do you think of this series? It looks good to me as a start. The
> feature could only be enabled as a dedicated parameter. If you're ok, I'd
> try to make it for 7.0.
> 
> Thanks

Well that's cutting it awfully close, and it's not really useful
at the current stage, is it?

The IOVA trick does not feel complete either.

> 
> > 
> > TODO on future series:
> > * Event, indirect, packed, and other features of virtio.
> > * Support a different set of features between the device<->SVQ and the
> >   SVQ<->guest communication.
> > * Support of device host notifier memory regions.
> > * Separate buffer forwarding into its own AIO context, so we can
> >   throw more threads at that task and we don't need to stop the main
> >   event loop.
> > * Support multiqueue virtio-net vdpa.
> > * Proper documentation.
> > 
> > Changes from v4:
> > * Iterate the iova->hva tree instead of maintaining our own tree, so we
> >   support HVA overlaps.
> > * Fix: Errno completion at failure.
> > * Rename x-svq to svq, so a change to stable does not affect the cmdline
> >   parameter.
> > 
> > Changes from v3:
> > * Add @unstable feature to NetdevVhostVDPAOptions.x-svq.
> > * Fix incomplete mapping (by 1 byte) of memory regions if svq is enabled.
> > v3 link:
> > https://lore.kernel.org/qemu-devel/20220302203012.3476835-1-epere...@redhat.com/
> > 
> > Changes from v2:
> > * Less assertions and more error handling in iova tree code.
> > * Better documentation, both fixing errors and making @param: format
> > * Homogenize SVQ avail_idx_shadow and shadow_used_idx to make shadow a
> >   prefix in both.
> > * Fix: Do not use VirtQueueElement->len field, track separately.
> > * Split vhost_svq_{enable,disable}_notification, so the code looks more
> >like the kernel

Re: [PATCH] virtio-net: fix map leaking on error during receive

On Tue, Mar 8, 2022 at 2:56 PM Michael S. Tsirkin  wrote:
>
> On Tue, Mar 08, 2022 at 01:56:42PM +0800, Jason Wang wrote:
> > Commit bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
> > tries to fix the use after free of the sg by caching the virtqueue
> > elements in an array and unmap them at once after receiving the
> > packets, But it forgot to unmap the cached elements on error which
> > will lead to leaking of mapping and other unexpected results.
> >
> > Fixing this by detaching the cached elements on error. This addresses
> > CVE-2022-26353.
>
>
> Pls use a tag:
>
> Fixes: CVE-2022-26353

Will do.

>
>
> Besides that
>
> Reviewed-by: Michael S. Tsirkin 
>
> Feel free to merge.

Ok.

Thanks

>
> > Reported-by: Victor Tom 
> > Cc: qemu-sta...@nongnu.org
> > Fixes: bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
> > Signed-off-by: Jason Wang 
>
> > ---
> >  hw/net/virtio-net.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index cf8ab0f8af..65b61c836c 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -1867,6 +1867,7 @@ static ssize_t virtio_net_receive_rcu(NetClientState 
> > *nc, const uint8_t *buf,
> >
> >  err:
> >  for (j = 0; j < i; j++) {
> > +virtqueue_detach_element(q->rx_vq, elems[j], lens[j]);
> >  g_free(elems[j]);
> >  }
> >
> > --
> > 2.25.1
>

Re: [PATCH] virtio-net: fix map leaking on error during receive

On Tue, Mar 08, 2022 at 01:56:42PM +0800, Jason Wang wrote:
> Commit bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
> tries to fix the use after free of the sg by caching the virtqueue
> elements in an array and unmap them at once after receiving the
> packets, But it forgot to unmap the cached elements on error which
> will lead to leaking of mapping and other unexpected results.
> 
> Fixing this by detaching the cached elements on error. This addresses
> CVE-2022-26353.


Pls use a tag:

Fixes: CVE-2022-26353


Besides that

Reviewed-by: Michael S. Tsirkin 

Feel free to merge.

> Reported-by: Victor Tom 
> Cc: qemu-sta...@nongnu.org
> Fixes: bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
> Signed-off-by: Jason Wang 

> ---
>  hw/net/virtio-net.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index cf8ab0f8af..65b61c836c 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1867,6 +1867,7 @@ static ssize_t virtio_net_receive_rcu(NetClientState 
> *nc, const uint8_t *buf,
>  
>  err:
>  for (j = 0; j < i; j++) {
> +virtqueue_detach_element(q->rx_vq, elems[j], lens[j]);
>  g_free(elems[j]);
>  }
>  
> -- 
> 2.25.1

Re: [PATCH V7 10/29] machine: memfd-alloc option

On Mon, Mar 07, 2022 at 09:41:44AM -0500, Steven Sistare wrote:
> On 3/4/2022 5:41 AM, Igor Mammedov wrote:
> > On Thu, 3 Mar 2022 12:21:15 -0500
> > "Michael S. Tsirkin"  wrote:
> > 
> >> On Wed, Dec 22, 2021 at 11:05:15AM -0800, Steve Sistare wrote:
> >>> Allocate anonymous memory using memfd_create if the memfd-alloc machine
> >>> option is set.
> >>>
> >>> Signed-off-by: Steve Sistare 
> >>> ---
> >>>  hw/core/machine.c   | 19 +++
> >>>  include/hw/boards.h |  1 +
> >>>  qemu-options.hx |  6 ++
> >>>  softmmu/physmem.c   | 47 ++-
> >>>  softmmu/vl.c|  1 +
> >>>  trace-events|  1 +
> >>>  util/qemu-config.c  |  4 
> >>>  7 files changed, 70 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/hw/core/machine.c b/hw/core/machine.c
> >>> index 53a99ab..7739d88 100644
> >>> --- a/hw/core/machine.c
> >>> +++ b/hw/core/machine.c
> >>> @@ -392,6 +392,20 @@ static void machine_set_mem_merge(Object *obj, bool 
> >>> value, Error **errp)
> >>>  ms->mem_merge = value;
> >>>  }
> >>>  
> >>> +static bool machine_get_memfd_alloc(Object *obj, Error **errp)
> >>> +{
> >>> +MachineState *ms = MACHINE(obj);
> >>> +
> >>> +return ms->memfd_alloc;
> >>> +}
> >>> +
> >>> +static void machine_set_memfd_alloc(Object *obj, bool value, Error 
> >>> **errp)
> >>> +{
> >>> +MachineState *ms = MACHINE(obj);
> >>> +
> >>> +ms->memfd_alloc = value;
> >>> +}
> >>> +
> >>>  static bool machine_get_usb(Object *obj, Error **errp)
> >>>  {
> >>>  MachineState *ms = MACHINE(obj);
> >>> @@ -829,6 +843,11 @@ static void machine_class_init(ObjectClass *oc, void 
> >>> *data)
> >>>  object_class_property_set_description(oc, "mem-merge",
> >>>  "Enable/disable memory merge support");
> >>>  
> >>> +object_class_property_add_bool(oc, "memfd-alloc",
> >>> +machine_get_memfd_alloc, machine_set_memfd_alloc);
> >>> +object_class_property_set_description(oc, "memfd-alloc",
> >>> +"Enable/disable allocating anonymous memory using memfd_create");
> >>> +
> >>>  object_class_property_add_bool(oc, "usb",
> >>>  machine_get_usb, machine_set_usb);
> >>>  object_class_property_set_description(oc, "usb",
> >>> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >>> index 9c1c190..a57d7a0 100644
> >>> --- a/include/hw/boards.h
> >>> +++ b/include/hw/boards.h
> >>> @@ -327,6 +327,7 @@ struct MachineState {
> >>>  char *dt_compatible;
> >>>  bool dump_guest_core;
> >>>  bool mem_merge;
> >>> +bool memfd_alloc;
> >>>  bool usb;
> >>>  bool usb_disabled;
> >>>  char *firmware;
> >>> diff --git a/qemu-options.hx b/qemu-options.hx
> >>> index 7d47510..33c8173 100644
> >>> --- a/qemu-options.hx
> >>> +++ b/qemu-options.hx
> >>> @@ -30,6 +30,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
> >>>  "vmport=on|off|auto controls emulation of vmport 
> >>> (default: auto)\n"
> >>>  "dump-guest-core=on|off include guest memory in a 
> >>> core dump (default=on)\n"
> >>>  "mem-merge=on|off controls memory merge support 
> >>> (default: on)\n"
> >>> +"memfd-alloc=on|off controls allocating anonymous 
> >>> guest RAM using memfd_create (default: off)\n"  
> >>
> >> Question: are there any disadvantages associated with using
> >> memfd_create? I guess we are using up an fd, but that seems minor.  Any
> >> reason not to set to on by default? maybe with a fallback option to
> >> disable that?
> 
> Old Linux host kernels, circa 4.1, do not support huge pages for shared 
> memory.
> Also, the tunable to enable huge pages for share memory is different than for
> anon memory, so there could be performance loss if it is not set correctly.
> /sys/kernel/mm/transparent_hugepage/enabled
> vs
> /sys/kernel/mm/transparent_hugepage/shmem_enabled

I guess we can test this when launching the VM, and select
a good default.
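
A minimal sketch of such a launch-time check, assuming the usual sysfs
format in which the active setting is the bracketed word (for example
"always [madvise] never"). The helper names are made up for illustration
and are not part of QEMU; a real implementation would live in C.

```python
import re
from typing import Optional

# The two tunables named above.
THP_ANON = '/sys/kernel/mm/transparent_hugepage/enabled'
THP_SHMEM = '/sys/kernel/mm/transparent_hugepage/shmem_enabled'


def selected_thp_mode(sysfs_text: str) -> Optional[str]:
    """Return the bracketed (active) word of a THP sysfs file, if any."""
    match = re.search(r'\[(\w+)\]', sysfs_text)
    return match.group(1) if match else None


def read_thp_mode(path: str) -> Optional[str]:
    """Read a THP tunable; None if the file is missing or unreadable."""
    try:
        with open(path, encoding='ascii') as f:
            return selected_thp_mode(f.read())
    except OSError:
        return None


if __name__ == '__main__':
    # Hypothetical policy: only default to memfd when shmem huge pages
    # are not disabled outright.
    anon = read_thp_mode(THP_ANON)
    shmem = read_thp_mode(THP_SHMEM)
    print(f'anon THP: {anon}, shmem THP: {shmem}')
    print('memfd default looks safe:', shmem not in (None, 'never', 'deny'))
```

The policy in the last lines is only one possible choice; which shmem
modes count as "good enough" is exactly the open question here.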

> It might make sense to use memfd_create by default for the secondary segments.

Well there's also KSM now you mention it.

> >> I am concerned that it's actually a kind of memory backend, this flag
> >> seems to instead be closer to the deprecated mem-prealloc. E.g.
> >> it does not work with a mem path, does it?
> 
> One can still define a memory backend with mempath to create the main ram 
> segment,
> though it must be some form of shared to work with live update.  Indeed, I 
> would 
> expect most users to specify an explicit memory backend for it.  The secondary
> segments would still use memfd_create.
> 
> > (mem path and mem-prealloc are transparently aliased to used memory backend
> > if I recall it right.)
> > 
> > Steve,
> > 
> > For allocating guest RAM, we switched exclusively to using memory-backends
> > including initial guest RAM (-m size option) and we have hostmem-memfd
> > that uses memfd_create() and I'd rather avoid adding random knobs to machine
> > for

Re: [PATCH 5/6] aspeed/smc: Let the SSI core layer define the bus name

2022-03-07 Thread Alistair Francis

On Mon, Mar 7, 2022 at 5:34 PM Cédric Le Goater  wrote:
>
> If no id is provided, qdev automatically assigns a unique name with
> the following pattern ".".
>
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/ssi/aspeed_smc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
> index f194182beacf..113f31899a6b 100644
> --- a/hw/ssi/aspeed_smc.c
> +++ b/hw/ssi/aspeed_smc.c
> @@ -1130,7 +1130,7 @@ static void aspeed_smc_realize(DeviceState *dev, Error 
> **errp)
>  /* DMA irq. Keep it first for the initialization in the SoC */
>  sysbus_init_irq(sbd, &s->irq);
>
> -s->spi = ssi_create_bus(dev, "spi");
> +s->spi = ssi_create_bus(dev, NULL);
>
>  /* Setup cs_lines for peripherals */
>  s->cs_lines = g_new0(qemu_irq, asc->max_cs);
> --
> 2.34.1
>
>

Re: [PATCH v2] tests: add (riscv virt) machine mapping to testenv

2022-03-07 Thread Alistair Francis

On Tue, Mar 8, 2022 at 2:34 PM laokz  wrote:
>
> Some qemu-iotests(040 etc) use PCI disk to do test. Without the
> mapping, RISC-V flavor use spike as default machine which has no
> PCI bus, causing test failure.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/894
>
> Signed-off-by: Kai Zhang 

Reviewed-by: Alistair Francis 

Alistair

> ---
> Thanks for the detailed info. Corrected S-o-b tag.
>
>  tests/qemu-iotests/testenv.py | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
> index 0f32897fe8..975f26a785 100644
> --- a/tests/qemu-iotests/testenv.py
> +++ b/tests/qemu-iotests/testenv.py
> @@ -238,6 +238,8 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: 
> str,
>  ('aarch64', 'virt'),
>  ('avr', 'mega2560'),
>  ('m68k', 'virt'),
> +('riscv32', 'virt'),
> +('riscv64', 'virt'),
>  ('rx', 'gdbsim-r5f562n8'),
>  ('tricore', 'tricore_testboard')
>  )
> --
> 2.34.1
>
>

Re: [PATCH v5 00/15] vDPA shadow virtqueue

On 2022/3/7 at 11:33 PM, Eugenio Pérez wrote:

This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. This
is intended as a new method of tracking the memory the devices touch
during a migration process: instead of relying on the vhost device's dirty
logging capability, SVQ intercepts the VQ dataplane, forwarding the
descriptors between VM and device. This way qemu is the effective
writer of the guest's memory, like in qemu's virtio device operation.

When SVQ is enabled qemu offers a new virtual address space to the
device to read and write into, and it maps new vrings and the guest
memory in it. SVQ also intercepts kicks and calls between the device
and the guest. Relaying used buffers causes the dirtied memory to be
tracked.

This effectively means that vDPA device passthrough is intercepted by
qemu. While SVQ should only be enabled at migration time, the switching
from regular mode to SVQ mode is left for a future series.

It is based on the ideas of DPDK SW assisted LM, in the series of
DPDK's https://patchwork.dpdk.org/cover/48370/ . However, this does
not map the shadow vq in the guest's VA, but in qemu's.

For qemu to use shadow virtqueues the guest virtio driver must not use
features like event_idx.

SVQ needs to be enabled with cmdline:

-netdev type=vhost-vdpa,vhostdev=vhost-vdpa-0,id=vhost-vdpa0,svq=on

The first three patches enable notification forwarding with
assistance of qemu. It's easy to enable only this if the relevant
cmdline part of the last patch is applied on top of these.

The next four patches implement the actual buffer forwarding. However,
addresses are not translated from HVA, so they need a host device with
an iommu that allows them to access all of the HVA range.

The last part of the series properly uses the host iommu, so qemu
creates a new iova address space in the device's range and translates
the buffers into it. Finally, it adds the cmdline parameter.
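
The HVA to IOVA translation described above can be pictured with a small
model. This is only a sketch under stated assumptions: the Mapping class
and hva_to_iova() are hypothetical names, a plain list stands in for the
series' iova tree, and the addresses are invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mapping:
    """One contiguous HVA range and the IOVA assigned to it (hypothetical)."""
    hva: int
    iova: int
    size: int


def hva_to_iova(maps: List[Mapping], hva: int, length: int) -> Optional[int]:
    """Translate [hva, hva + length) to an IOVA.

    Returns None when the buffer is not fully covered by one mapping,
    in which case the descriptor must not be forwarded to the device.
    """
    for m in maps:
        if m.hva <= hva and hva + length <= m.hva + m.size:
            return m.iova + (hva - m.hva)
    return None


# Made-up layout: two HVA regions packed into a small device-visible range.
maps = [
    Mapping(hva=0x7F0000000000, iova=0x1000, size=0x2000),
    Mapping(hva=0x7F0000100000, iova=0x3000, size=0x1000),
]

if __name__ == '__main__':
    print(hex(hva_to_iova(maps, 0x7F0000000010, 0x100)))
    print(hva_to_iova(maps, 0x7F0000001F00, 0x200))  # crosses the region end
```

A linear scan keeps the sketch short; an ordered tree gives the same
answer with faster lookups when there are many regions.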

Some simple performance tests with netperf were done. They used a nested
guest with vp_vdpa, vhost-kernel at L0 host. Starting with no svq and a
baseline average of ~9009.96Mbps:
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
131072  16384  16384    30.01    9061.03
131072  16384  16384    30.01    8962.94
131072  16384  16384    30.01    9005.92

Enabling SVQ buffer forwarding reduces throughput to about:
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
131072  16384  16384    30.01    7689.72
131072  16384  16384    30.00    7752.07
131072  16384  16384    30.01    7750.30

However, many performance improvements were left out of this series for
simplicity, so difference should shrink in the future.
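
As a quick check on the figures above, the averages and the relative
cost of SVQ can be recomputed from the six netperf samples. This is
plain arithmetic on the numbers quoted in this cover letter; the
variable names are illustrative.

```python
from statistics import mean

# Throughput samples in 10^6 bits/sec, copied from the netperf runs above.
baseline = [9061.03, 8962.94, 9005.92]   # no SVQ
with_svq = [7689.72, 7752.07, 7750.30]   # SVQ buffer forwarding enabled

baseline_avg = mean(baseline)
svq_avg = mean(with_svq)
drop_pct = 100.0 * (baseline_avg - svq_avg) / baseline_avg

if __name__ == '__main__':
    print(f'baseline: {baseline_avg:.2f} Mbps')
    print(f'with SVQ: {svq_avg:.2f} Mbps')
    print(f'drop:     {drop_pct:.1f}%')
```

The baseline average matches the ~9009.96Mbps quoted above, and the
three SVQ runs come out roughly 14% lower.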

Comments are welcome.

Hi Michael:

What do you think of this series? It looks good to me as a start. The
feature could only be enabled as a dedicated parameter. If you're ok,
I'd try to make it for 7.0.

Thanks

TODO on future series:
* Event, indirect, packed, and other features of virtio.
* Support a different set of features between the device<->SVQ and the
  SVQ<->guest communication.
* Support of device host notifier memory regions.
* Separate buffer forwarding into its own AIO context, so we can
  throw more threads at that task and we don't need to stop the main
  event loop.
* Support multiqueue virtio-net vdpa.
* Proper documentation.

Changes from v4:
* Iterate the iova->hva tree instead of maintaining our own tree, so we
  support HVA overlaps.
* Fix: Errno completion at failure.
* Rename x-svq to svq, so a change to stable does not affect the cmdline
  parameter.

Changes from v3:
* Add @unstable feature to NetdevVhostVDPAOptions.x-svq.
* Fix incomplete mapping (by 1 byte) of memory regions if svq is enabled.
v3 link:
https://lore.kernel.org/qemu-devel/20220302203012.3476835-1-epere...@redhat.com/

Changes from v2:
* Less assertions and more error handling in iova tree code.
* Better documentation, both fixing errors and making @param: format
* Homogenize SVQ avail_idx_shadow and shadow_used_idx to make shadow a
  prefix in both.
* Fix: Do not use VirtQueueElement->len field, track separately.
* Split vhost_svq_{enable,disable}_notification, so the code looks more
like the kernel driver code.
* Small improvements.
v2 link:
https://lore.kernel.org/all/cajaqywfxhe0c54r_-oiwjzjc0gppke3ex0l8beezxgm1ery...@mail.gmail.com/

Changes from v1:
* Feature set at device->SVQ is now the same as SVQ->guest.
* Size of SVQ is not max available device size anymore, but guest's
negotiated.
* Add VHOST_FILE_UNBIND kick and call fd treatment.
* Make SVQ a public struct
* Come back to previous approach to iova-tree
* Some assertions are now fail paths. Some errors are now log_guest.
* Only mask _F_LOG feature at vdpa_set_features svq enable path.
* Refactor some errors and messages. Add missing error unwindings.
* Add memory barrier at _F_NO_NOTIFY

[PATCH] virtio-net: fix map leaking on error during receive

Commit bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
tries to fix the use after free of the sg by caching the virtqueue
elements in an array and unmap them at once after receiving the
packets, But it forgot to unmap the cached elements on error which
will lead to leaking of mapping and other unexpected results.

Fixing this by detaching the cached elements on error. This addresses
CVE-2022-26353.

Reported-by: Victor Tom 
Cc: qemu-sta...@nongnu.org
Fixes: bedd7e93d0196 ("virtio-net: fix use after unmap/free for sg")
Signed-off-by: Jason Wang 
---
 hw/net/virtio-net.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index cf8ab0f8af..65b61c836c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1867,6 +1867,7 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
 
 err:
     for (j = 0; j < i; j++) {
+        virtqueue_detach_element(q->rx_vq, elems[j], lens[j]);
         g_free(elems[j]);
     }
 
-- 
2.25.1
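
To illustrate why the error path needs the extra call, here is a toy
model of the leak. It is only a sketch: FakeVirtQueue and receive() are
hypothetical stand-ins, not QEMU code, and "mapped" simply records which
cached elements still hold a mapping.

```python
class FakeVirtQueue:
    """Toy queue that tracks which popped elements are still mapped."""

    def __init__(self):
        self.mapped = set()

    def pop(self, elem_id):
        # Popping an element maps its buffers.
        self.mapped.add(elem_id)
        return elem_id

    def detach(self, elem_id):
        # Detaching an element unmaps its buffers.
        self.mapped.discard(elem_id)


def receive(vq, n_elems, fail_at=None, detach_on_error=True):
    """Cache n_elems elements, optionally failing part-way through."""
    elems = []
    try:
        for i in range(n_elems):
            if i == fail_at:
                raise RuntimeError('simulated receive error')
            elems.append(vq.pop(i))
    except RuntimeError:
        # Error path: without the detach, the cached elements are freed
        # while their mappings stay behind.
        for e in elems:
            if detach_on_error:
                vq.detach(e)
        return -1
    # Normal path: everything is unmapped at once after the receive.
    for e in elems:
        vq.detach(e)
    return n_elems


if __name__ == '__main__':
    leaky, fixed = FakeVirtQueue(), FakeVirtQueue()
    receive(leaky, 3, fail_at=2, detach_on_error=False)
    receive(fixed, 3, fail_at=2, detach_on_error=True)
    print('leaked mappings before the fix:', sorted(leaky.mapped))
    print('leaked mappings after the fix: ', sorted(fixed.mapped))
```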

Re: Question about atomics

On 3/7/22 18:18, Warner Losh wrote:
> I have a question related to the user-mode emulation and atomics. I asked on IRC, but
> thinking about it, I think it may be too complex to discuss in that medium...
>
> In FreeBSD we have a system call that uses host atomic operations to interact with memory
> that userland also interacts with using atomic operations.
>
> In bsd-user we call the kernel with a special flag for dealing with 32-bit processes
> running on a 64-bit kernel. In this case, we use 32-bit-sized atomics to set variables in
> the address space of the bsd-user guest. This is used when running armv7 binaries on amd64
> hosts.
>
> First question: Is this expected to work? I know I'm a bit vague, so as a followup
> question: If there are restrictions on this, what might they be? Do some classes of atomic
> operations work, while others may fail or need additional cooperation? Are there any
> conformance tests I could compile for FreeBSD/armv7 to test the hypothesis that atomic
> operations are misbehaving?

Yes, qatomic_foo is expected to work. It's what we use across threads, and it is expected
to work "in kernel mode", i.e. within cpu_loop().

There are compile-time restrictions on the set of atomic operations, mostly based on what
the host supports. But anything that actually compiles is expected to work (there are a
set of ifdefs if you need something more than the default).

Beyond that, there is start_exclusive() / end_exclusive() which will stop-the-world and
make sure that the current thread is the only one running.

> Thanks for any help you might be able to give.

Show the code in question?

[PATCH v2] tests: add (riscv virt) machine mapping to testenv

2022-03-07 Thread laokz

Some qemu-iotests(040 etc) use PCI disk to do test. Without the
mapping, RISC-V flavor use spike as default machine which has no
PCI bus, causing test failure.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/894

Signed-off-by: Kai Zhang 
---
Thanks for the detailed info. Corrected S-o-b tag.

 tests/qemu-iotests/testenv.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 0f32897fe8..975f26a785 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -238,6 +238,8 @@ def __init__(self, imgfmt: str, imgproto: str, aiomode: str,
 ('aarch64', 'virt'),
 ('avr', 'mega2560'),
 ('m68k', 'virt'),
+('riscv32', 'virt'),
+('riscv64', 'virt'),
 ('rx', 'gdbsim-r5f562n8'),
 ('tricore', 'tricore_testboard')
 )
-- 
2.34.1
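
For readers unfamiliar with the table being patched, this is roughly how
such a mapping gets used: the suffix of the QEMU binary name selects a
default machine. The sketch below copies the entries visible in the hunk
above; default_machine() is a hypothetical helper, not the actual
testenv.py code.

```python
from typing import Optional, Sequence, Tuple

# Entries visible in the hunk above, including the two new RISC-V ones.
MACHINE_MAP: Sequence[Tuple[str, str]] = (
    ('aarch64', 'virt'),
    ('avr', 'mega2560'),
    ('m68k', 'virt'),
    ('riscv32', 'virt'),
    ('riscv64', 'virt'),
    ('rx', 'gdbsim-r5f562n8'),
    ('tricore', 'tricore_testboard'),
)


def default_machine(qemu_prog: str) -> Optional[str]:
    """Pick a machine type from the binary name; None means 'no override'."""
    for suffix, machine in MACHINE_MAP:
        if qemu_prog.endswith(f'qemu-system-{suffix}'):
            return machine
    return None


if __name__ == '__main__':
    for prog in ('build/qemu-system-riscv64', 'build/qemu-system-x86_64'):
        print(prog, '->', default_machine(prog))
```

Without the two riscv entries the lookup returns None for RISC-V
binaries, so the target's own default (spike, which has no PCI bus) is
used.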

Question about atomics

2022-03-07 Thread Warner Losh

I have a question related to the user-mode emulation and atomics. I asked
on IRC, but thinking about it, I think it may be too complex to discuss in
that medium...

In FreeBSD we have a system call that uses host atomic operations to
interact with memory that userland also interacts with using atomic operations.

In bsd-user we call the kernel with a special flag for dealing with 32-bit
processes running on a 64-bit kernel. In this case, we use 32-bit-sized
atomics to set variables in the address space of the bsd-user guest. This
is used when running armv7 binaries on amd64 hosts.

First question: Is this expected to work? I know I'm a bit vague, so as a
followup question: If there's restrictions on this, what might they be? Do
some classes of atomic operations work, while others may fail or need
additional cooperation? Are there any conformance tests I could compile for
FreeBSD/armv7 to test the hypothesis that atomic operations are misbehaving?

I'm asking because I'm seeing a rare, but not rare enough, race that's
corrupting state in ways that only appear to be possible when pthread
mutexes aren't working (which only break when atomic operations are
broken). So far my efforts to narrow this down have been unsuccessful and
I'm looking to both understand qemu/tcg better as well as to reduce the
problem space to search...

Thanks for any help you might be able to give.

Warner

Re: [PATCH v3 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2

2022-03-07 Thread David Miller

I've reviewed all changes,  looks good.
Ran all of my own tests including vstrs, all passed.

Thank you for all reviews and changes here.

- David Miller

On Mon, Mar 7, 2022 at 8:54 PM Richard Henderson <
richard.hender...@linaro.org> wrote:

> Hi David,
>
> I've split up the patches a bit, made some improvements to
> the shifts and reversals, and fixed a few bugs.
>
> Please especially review vector string search, as that is
> has had major changes.
>
>
> r~
>
>
> David Miller (9):
>   target/s390x: vxeh2: vector convert short/32b
>   target/s390x: vxeh2: vector string search
>   target/s390x: vxeh2: Update for changes to vector shifts
>   target/s390x: vxeh2: vector shift double by bit
>   target/s390x: vxeh2: vector {load, store} elements reversed
>   target/s390x: vxeh2: vector {load, store} byte reversed elements
>   target/s390x: vxeh2: vector {load, store} byte reversed element
>   target/s390x: add S390_FEAT_VECTOR_ENH2 to cpu max
>   tests/tcg/s390x: Tests for Vector Enhancements Facility 2
>
> Richard Henderson (2):
>   tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
>   target/s390x: Fix writeback to v1 in helper_vstl
>
>  include/tcg/tcg-op.h |   6 +
>  target/s390x/helper.h|  13 +
>  target/s390x/gen-features.c  |   2 +
>  target/s390x/tcg/translate.c |   3 +-
>  target/s390x/tcg/vec_fpu_helper.c|  31 ++
>  target/s390x/tcg/vec_helper.c|   2 -
>  target/s390x/tcg/vec_int_helper.c|  58 
>  target/s390x/tcg/vec_string_helper.c | 101 ++
>  tcg/tcg-op.c |  30 ++
>  tests/tcg/s390x/vxeh2_vcvt.c |  97 ++
>  tests/tcg/s390x/vxeh2_vlstr.c| 146 +
>  tests/tcg/s390x/vxeh2_vs.c   |  91 ++
>  target/s390x/tcg/translate_vx.c.inc  | 442 ---
>  target/s390x/tcg/insn-data.def   |  40 ++-
>  tests/tcg/s390x/Makefile.target  |   8 +
>  15 files changed, 1018 insertions(+), 52 deletions(-)
>  create mode 100644 tests/tcg/s390x/vxeh2_vcvt.c
>  create mode 100644 tests/tcg/s390x/vxeh2_vlstr.c
>  create mode 100644 tests/tcg/s390x/vxeh2_vs.c
>
> --
> 2.25.1
>
>

RE: [PATCH v2 05/10] vdpa-dev: implement the realize interface

2022-03-07 Thread longpeng2--- via




> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 7, 2022 8:14 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> 
> On Mon, Mar 07, 2022 at 11:13:02AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> >> -Original Message-
> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> Sent: Monday, March 7, 2022 4:24 PM
> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> 
> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> ; Huangzhichao ;
> >> qemu-devel@nongnu.org
> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >>
> >> On Sat, Mar 05, 2022 at 07:07:54AM +, Longpeng (Mike, Cloud 
> >> Infrastructure
> >> Service Product Dept.) wrote:
> >> >
> >> >
> >> >> -Original Message-
> >> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> >> Sent: Wednesday, January 19, 2022 7:31 PM
> >> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> >> 
> >> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> >> ; Huangzhichao ;
> >> >> qemu-devel@nongnu.org
> >> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >> >>
> >> >> On Mon, Jan 17, 2022 at 08:43:26PM +0800, Longpeng(Mike) via wrote:
> >> >> >From: Longpeng 
> >> >> >
> >> >> >Implements the .realize interface.
> >> >> >
> >> >> >Signed-off-by: Longpeng 
> >> >> >---
> >> >> > hw/virtio/vdpa-dev.c | 101 +++
> >> >> > include/hw/virtio/vdpa-dev.h |   8 +++
> >> >> > 2 files changed, 109 insertions(+)
> >> >> >
> >> >> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >> >> >index b103768f33..bd28cf7a15 100644
> >> >> >--- a/hw/virtio/vdpa-dev.c
> >> >> >+++ b/hw/virtio/vdpa-dev.c
> >> >> >@@ -27,9 +27,109 @@ uint32_t vhost_vdpa_device_get_u32(int fd, unsigned
> long
> >> >> int cmd, Error **errp)
> >> >> > return val;
> >> >> > }
> >> >> >
> >> >> >+static void
> >> >> >+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue
> *vq)
> >> >> >+{
> >> >> >+/* Nothing to do */
> >> >> >+}
> >> >> >+
> >> >> > static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> >> >> > {
> >> >> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >> >> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >> >> >+uint32_t vdev_id, max_queue_size;
> >> >> >+struct vhost_virtqueue *vqs;
> >> >> >+int i, ret;
> >> >> >+
> >> >> >+if (s->vdpa_dev_fd == -1) {
> >> >> >+s->vdpa_dev_fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> >> >>
> >> >> So, here we are re-opening the `vdpa_dev` again (without checking if it
> >> >> is NULL).
> >> >>
> >> >> And we re-do the same ioctls already done in
> >> >> vhost_vdpa_device_pci_realize(), so I think we should do them in a
> >> >> single place, and that place should be here.
> >> >>
> >> >> So, what about doing all the ioctls here, setting appropriate fields in
> >> >> VhostVdpaDevice, then using that fields in
> >> >> vhost_vdpa_device_pci_realize() after qdev_realize() to set
> >> >> `class_code`, `trans_devid`, and `nvectors`?
> >> >>
> >> >
> >> >vhost_vdpa_device_pci_realize()
> >> >  qdev_realize()
> >> >virtio_device_realize()
> >> >  vhost_vdpa_device_realize()
> >> >  virtio_bus_device_plugged()
> >> >virtio_pci_device_plugged()
> >> >
> >> >These three fields would be used in virtio_pci_device_plugged(), so it's
> too
> >> >late to set them after qdev_realize().  And they belong to VirtIOPCIProxy,
> so
> >> >we cannot set them in vhost_vdpa_device_realize() which is transport layer
> >> >independent.
> >>
> >> Maybe I expressed myself wrong, I was saying to open the file and make
> >> ioctls in vhost_vdpa_device_realize(). Save the values we use on both
> >> sides in VhostVdpaDevice (e.g. num_queues, queue_size) and use these
> >> saved values in virtio_pci_device_plugged() without re-opening the file
> >> again.
> >>
> >
> >This means we need to access VhostVdpaDevice in virtio_pci_device_plugged()?
> 
> Yep, or implement some functions to get those values.
> 

I prefer not to modify the VIRTIO or the VIRTIO_PCI core too much.
How about the following proposal?

struct VhostVdpaDevice {
...
void (*post_init)(VhostVdpaDevice *vdpa_dev);
...
}

vhost_vdpa_device_pci_post_init(VhostVdpaDevice *vdpa_dev)
{
...
vpci_dev->class_code = virtio_pci_get_class_id(vdpa_dev->vdev_id);
vpci_dev->trans_devid = virtio_pci_get_trans_devid(vdpa_dev->vdev_id);
vpci_dev->nvectors =

[PATCH v3] target/arm: Fix sve2 ldnt1 and stnt1

For both ldnt1 and stnt1, the meaning of the Rn and Rm are different
from ld1 and st1: the vector and integer registers are reversed, and
the integer register 31 refers to XZR instead of SP.

Secondly, the 64-bit version of ldnt1 was being interpreted as
32-bit unpacked unscaled offset instead of 64-bit unscaled offset,
which discarded the upper 32 bits of the address coming from
the vector argument.

Thirdly, validate that the memory element size is in range for the
vector element size for ldnt1.  For ld1, we do this via independent
decode patterns, but for ldnt1 we need to do it manually.
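
The manual range check mentioned above can be written out as a small
predicate. This sketch mirrors the two conditions in the diff below
(esz < msz + !u for loads, esz < msz for stores); the function names are
illustrative, and the MO_* values are just log2 of the size in bytes.

```python
# Size encodings as log2(bytes): 8-bit .. 64-bit.
MO_8, MO_16, MO_32, MO_64 = range(4)


def ldnt1_size_ok(esz: int, msz: int, unsigned: bool) -> bool:
    """Load is valid unless esz < msz + !u.

    A sign-extending load (u == 0) needs the memory element to be strictly
    smaller than the vector element; an unsigned load may be the same size.
    """
    return not esz < msz + (0 if unsigned else 1)


def stnt1_size_ok(esz: int, msz: int) -> bool:
    """Store is valid unless the memory element exceeds the vector element."""
    return not esz < msz


if __name__ == '__main__':
    # Signed halfword load into 64-bit elements: fine.
    print(ldnt1_size_ok(MO_64, MO_16, unsigned=False))
    # "Signed" 32-bit load into 32-bit elements: nothing to extend, rejected.
    print(ldnt1_size_ok(MO_32, MO_32, unsigned=False))
```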

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/826
Signed-off-by: Richard Henderson 
---
 target/arm/sve.decode |  5 ++-
 target/arm/translate-sve.c| 51 +--
 tests/tcg/aarch64/test-826.c  | 50 ++
 tests/tcg/aarch64/Makefile.target |  4 +++
 tests/tcg/configure.sh|  4 +++
 5 files changed, 109 insertions(+), 5 deletions(-)
 create mode 100644 tests/tcg/aarch64/test-826.c

diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index c60b9f0fec..0388cce3bd 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1575,10 +1575,9 @@ USDOT_  01000100 .. 0 . 011 110 . .  
@rda_rn_rm
 
 ### SVE2 Memory Gather Load Group
 
-# SVE2 64-bit gather non-temporal load
-#   (scalar plus unpacked 32-bit unscaled offsets)
+# SVE2 64-bit gather non-temporal load (scalar plus 64-bit unscaled offsets)
 LDNT1_zprz  1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \
-_gather_load xs=0 esz=3 scale=0 ff=0
+_gather_load xs=2 esz=3 scale=0 ff=0
 
 # SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets)
 LDNT1_zprz  110 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 33ca1bcfac..2c23459e76 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6487,10 +6487,33 @@ static bool trans_LD1_zpiz(DisasContext *s, 
arg_LD1_zpiz *a)
 
 static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a)
 {
+gen_helper_gvec_mem_scatter *fn = NULL;
+bool be = s->be_data == MO_BE;
+bool mte = s->mte_active[0];
+
+if (a->esz < a->msz + !a->u) {
+return false;
+}
 if (!dc_isar_feature(aa64_sve2, s)) {
 return false;
 }
-return trans_LD1_zprz(s, a);
+if (!sve_access_check(s)) {
+return true;
+}
+
+switch (a->esz) {
+case MO_32:
+fn = gather_load_fn32[mte][be][0][0][a->u][a->msz];
+break;
+case MO_64:
+fn = gather_load_fn64[mte][be][0][2][a->u][a->msz];
+break;
+}
+assert(fn != NULL);
+
+do_mem_zpz(s, a->rd, a->pg, a->rn, 0,
+   cpu_reg(s, a->rm), a->msz, false, fn);
+return true;
 }
 
 /* Indexed by [mte][be][xs][msz].  */
@@ -6647,10 +6670,34 @@ static bool trans_ST1_zpiz(DisasContext *s, 
arg_ST1_zpiz *a)
 
 static bool trans_STNT1_zprz(DisasContext *s, arg_ST1_zprz *a)
 {
+gen_helper_gvec_mem_scatter *fn;
+bool be = s->be_data == MO_BE;
+bool mte = s->mte_active[0];
+
+if (a->esz < a->msz) {
+return false;
+}
 if (!dc_isar_feature(aa64_sve2, s)) {
 return false;
 }
-return trans_ST1_zprz(s, a);
+if (!sve_access_check(s)) {
+return true;
+}
+
+switch (a->esz) {
+case MO_32:
+fn = scatter_store_fn32[mte][be][0][a->msz];
+break;
+case MO_64:
+fn = scatter_store_fn64[mte][be][2][a->msz];
+break;
+default:
+g_assert_not_reached();
+}
+
+do_mem_zpz(s, a->rd, a->pg, a->rn, 0,
+   cpu_reg(s, a->rm), a->msz, true, fn);
+return true;
 }
 
 /*
diff --git a/tests/tcg/aarch64/test-826.c b/tests/tcg/aarch64/test-826.c
new file mode 100644
index 00..f59740a8c5
--- /dev/null
+++ b/tests/tcg/aarch64/test-826.c
@@ -0,0 +1,50 @@
+#include <sys/mman.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <assert.h>
+
+static void *expected;
+
+void sigsegv(int sig, siginfo_t *info, void *vuc)
+{
+ucontext_t *uc = vuc;
+
+assert(info->si_addr == expected);
+uc->uc_mcontext.pc += 4;
+}
+
+int main()
+{
+struct sigaction sa = {
+.sa_sigaction = sigsegv,
+.sa_flags = SA_SIGINFO
+};
+
+void *page;
+long ofs;
+
+if (sigaction(SIGSEGV, &sa, NULL) < 0) {
+perror("sigaction");
+return EXIT_FAILURE;
+}
+
+page = mmap(0, getpagesize(), PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
+if (page == MAP_FAILED) {
+perror("mmap");
+return EXIT_FAILURE;
+}
+
+ofs = 0x124;
+expected = page + ofs;
+
+asm("ptrue p0.d, vl1\n\t"
+"dup z0.d, %0\n\t"
+"ldnt1h {z1.d}, p0/z, [z0.d, %1]\n\t"
+"dup z1.d, %1\n\t"
+"ldnt1h {z0.d}, p0/z, [z1.d, %0]"
+: : "r"(page), "r"(ofs) : "v0", "v1");
+
+return EXIT_SUCCESS;
+}
diff --git

[PATCH v3 1/5] python/utils: add add_visual_margin() text decoration utility

>>> print(add_visual_margin(msg, width=72, name="Commit Message"))
┏━ Commit Message ━━
┃ add_visual_margin() takes a chunk of text and wraps it in a visual
┃ container that force-wraps to a specified width. An optional title
┃ label may be given, and any of the individual glyphs used to draw the
┃ box may be replaced or specified as well.
┗━━━

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
---
 python/qemu/utils/__init__.py | 78 +++
 1 file changed, 78 insertions(+)
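
The width handling described in the docstring can be tried in isolation.
The following is a minimal sketch, not part of the patch: resolve_width()
and its `avail` parameter are hypothetical names introduced here so the
terminal width can be injected instead of queried.

```python
import shutil
from typing import Optional


def resolve_width(width: Optional[int] = None,
                  avail: Optional[int] = None) -> int:
    """Mirror the width rule from the add_visual_margin() docstring.

    `avail` is a hypothetical parameter added for illustration; the
    patch itself always asks the terminal.
    """
    if avail is None:
        # Honors COLUMNS if set; falls back to 72 columns otherwise.
        avail = shutil.get_terminal_size(fallback=(72, 24))[0]
    if width is None:
        return avail
    if width < 0:
        # A negative width is subtracted from the available width.
        return avail + width
    return width


print(resolve_width(None, avail=80))  # 80
print(resolve_width(-8, avail=80))    # 72
print(resolve_width(40, avail=80))    # 40
```

Passing the width in explicitly keeps the example deterministic; a real
caller would normally leave `avail` unset.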

diff --git a/python/qemu/utils/__init__.py b/python/qemu/utils/__init__.py
index 7f1a5138c4..5babf40df2 100644
--- a/python/qemu/utils/__init__.py
+++ b/python/qemu/utils/__init__.py
@@ -15,7 +15,10 @@
 # the COPYING file in the top-level directory.
 #
 
+import os
 import re
+import shutil
+import textwrap
 from typing import Optional
 
 # pylint: disable=import-error
@@ -23,6 +26,7 @@
 
 
 __all__ = (
+    'add_visual_margin',
     'get_info_usernet_hostfwd_port',
     'kvm_available',
     'list_accel',
@@ -43,3 +47,77 @@ def get_info_usernet_hostfwd_port(info_usernet_output: str) -> Optional[int]:
     if match is not None:
         return int(match[1])
     return None
+
+
+# pylint: disable=too-many-arguments
+def add_visual_margin(
+        content: str = '',
+        width: Optional[int] = None,
+        name: Optional[str] = None,
+        padding: int = 1,
+        upper_left: str = '┏',
+        lower_left: str = '┗',
+        horizontal: str = '━',
+        vertical: str = '┃',
+) -> str:
+    """
+    Decorate and wrap some text with a visual decoration around it.
+
+    This function assumes that the text decoration characters are single
+    characters that display using a single monospace column.
+
+    ┏━ Example ━
+    ┃ This is what this function looks like with text content that's
+    ┃ wrapped to 72 characters. The right-hand margin is left open to
+    ┃ accommodate the occasional unicode character that might make
+    ┃ predicting the total "visual" width of a line difficult. This
+    ┃ provides a visual distinction that's good-enough, though.
+    ┗━━━
+
+    :param content: The text to wrap and decorate.
+    :param width:
+        The number of columns to use, including for the decoration
+        itself. The default (None) uses the available width of the
+        current terminal, or a fallback of 72 columns. A negative number
+        subtracts a fixed-width from the default size. The default obeys
+        the COLUMNS environment variable, if set.
+    :param name: A label to apply to the upper-left of the box.
+    :param padding: How many columns of padding to apply inside.
+    :param upper_left: Upper-left single-width text decoration character.
+    :param lower_left: Lower-left single-width text decoration character.
+    :param horizontal: Horizontal single-width text decoration character.
+    :param vertical: Vertical single-width text decoration character.
+    """
+    if width is None or width < 0:
+        avail = shutil.get_terminal_size(fallback=(72, 24))[0]
+        if width is None:
+            _width = avail
+        else:
+            _width = avail + width
+    else:
+        _width = width
+
+    prefix = vertical + (' ' * padding)
+
+    def _bar(name: Optional[str], top: bool = True) -> str:
+        ret = upper_left if top else lower_left
+        if name is not None:
+            ret += f"{horizontal} {name} "
+
+        filler_len = _width - len(ret)
+        ret += f"{horizontal * filler_len}"
+        return ret
+
+    def _wrap(line: str) -> str:
+        return os.linesep.join(
+            textwrap.wrap(
+                line, width=_width - padding, initial_indent=prefix,
+                subsequent_indent=prefix, replace_whitespace=False,
+                drop_whitespace=True, break_on_hyphens=False)
+        )
+
+    return os.linesep.join((
+        _bar(name, top=True),
+        os.linesep.join(_wrap(line) for line in content.splitlines()),
+        _bar(None, top=False),
+    ))
-- 
2.34.1

[PATCH v3 10/11] tests/tcg/s390x: Tests for Vector Enhancements Facility 2

From: David Miller 

* tests/tcg/s390x/vxeh2_vcvt.c  : vector convert
* tests/tcg/s390x/vxeh2_vs.c    : vector shift
* tests/tcg/s390x/vxeh2_vlstr.c : vector load/store reversed

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-8-dmiller...@gmail.com>
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
---
 tests/tcg/s390x/vxeh2_vcvt.c|  97 +
 tests/tcg/s390x/vxeh2_vlstr.c   | 146 
 tests/tcg/s390x/vxeh2_vs.c  |  91 
 tests/tcg/s390x/Makefile.target |   8 ++
 4 files changed, 342 insertions(+)
 create mode 100644 tests/tcg/s390x/vxeh2_vcvt.c
 create mode 100644 tests/tcg/s390x/vxeh2_vlstr.c
 create mode 100644 tests/tcg/s390x/vxeh2_vs.c

diff --git a/tests/tcg/s390x/vxeh2_vcvt.c b/tests/tcg/s390x/vxeh2_vcvt.c
new file mode 100644
index 0000000000..71ecbd77b0
--- /dev/null
+++ b/tests/tcg/s390x/vxeh2_vcvt.c
@@ -0,0 +1,97 @@
+/*
+ * vxeh2_vcvt: vector-enhancements facility 2 vector convert *
+ */
+#include <stdint.h>
+
+typedef union S390Vector {
+uint64_t d[2];  /* doubleword */
+uint32_t w[4];  /* word */
+uint16_t h[8];  /* halfword */
+uint8_t  b[16]; /* byte */
+floatf[4];
+double   fd[2];
+__uint128_t v;
+} S390Vector;
+
+#define M_S 8
+#define M4_XxC 4
+#define M4_def M4_XxC
+
+static inline void vcfps(S390Vector *v1, S390Vector *v2,
+const uint8_t m3,  const uint8_t m4,  const uint8_t m5)
+{
+asm volatile("vcfps %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+: [v1] "=v" (v1->v)
+: [v2]  "v" (v2->v)
+, [m3]  "i" (m3)
+, [m4]  "i" (m4)
+, [m5]  "i" (m5));
+}
+
+static inline void vcfpl(S390Vector *v1, S390Vector *v2,
+const uint8_t m3,  const uint8_t m4,  const uint8_t m5)
+{
+asm volatile("vcfpl %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+: [v1] "=v" (v1->v)
+: [v2]  "v" (v2->v)
+, [m3]  "i" (m3)
+, [m4]  "i" (m4)
+, [m5]  "i" (m5));
+}
+
+static inline void vcsfp(S390Vector *v1, S390Vector *v2,
+const uint8_t m3,  const uint8_t m4,  const uint8_t m5)
+{
+asm volatile("vcsfp %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+: [v1] "=v" (v1->v)
+: [v2]  "v" (v2->v)
+, [m3]  "i" (m3)
+, [m4]  "i" (m4)
+, [m5]  "i" (m5));
+}
+
+static inline void vclfp(S390Vector *v1, S390Vector *v2,
+const uint8_t m3,  const uint8_t m4,  const uint8_t m5)
+{
+asm volatile("vclfp %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+: [v1] "=v" (v1->v)
+: [v2]  "v" (v2->v)
+, [m3]  "i" (m3)
+, [m4]  "i" (m4)
+, [m5]  "i" (m5));
+}
+
+int main(int argc, char *argv[])
+{
+S390Vector vd;
+S390Vector vs_i32 = { .w[0] = 1, .w[1] = 64, .w[2] = 1024, .w[3] = -10 };
+S390Vector vs_u32 = { .w[0] = 2, .w[1] = 32, .w[2] = 4096, .w[3] =  };
+S390Vector vs_f32 = { .f[0] = 3.987, .f[1] = 5.123,
+  .f[2] = 4.499, .f[3] = 0.512 };
+
+vd.d[0] = vd.d[1] = 0;
+vcfps(&vd, &vs_i32, 2, M4_def, 0);
+if (1 != vd.f[0] || 1024 != vd.f[2] || 64 != vd.f[1] || -10 != vd.f[3]) {
+return 1;
+}
+
+vd.d[0] = vd.d[1] = 0;
+vcfpl(&vd, &vs_u32, 2, M4_def, 0);
+if (2 != vd.f[0] || 4096 != vd.f[2] || 32 != vd.f[1] ||  != vd.f[3]) {
+return 1;
+}
+
+vd.d[0] = vd.d[1] = 0;
+vcsfp(&vd, &vs_f32, 2, M4_def, 0);
+if (4 != vd.w[0] || 4 != vd.w[2] || 5 != vd.w[1] || 1 != vd.w[3]) {
+return 1;
+}
+
+vd.d[0] = vd.d[1] = 0;
+vclfp(&vd, &vs_f32, 2, M4_def, 0);
+if (4 != vd.w[0] || 4 != vd.w[2] || 5 != vd.w[1] || 1 != vd.w[3]) {
+return 1;
+}
+
+return 0;
+}
diff --git a/tests/tcg/s390x/vxeh2_vlstr.c b/tests/tcg/s390x/vxeh2_vlstr.c
new file mode 100644
index 0000000000..bf2954e86d
--- /dev/null
+++ b/tests/tcg/s390x/vxeh2_vlstr.c
@@ -0,0 +1,146 @@
+/*
+ * vxeh2_vlstr: vector-enhancements facility 2 vector load/store reversed *
+ */
+#include <stdint.h>
+
+typedef union S390Vector {
+uint64_t d[2];  /* doubleword */
+uint32_t w[4];  /* word */
+uint16_t h[8];  /* halfword */
+uint8_t  b[16]; /* byte */
+__uint128_t v;
+} S390Vector;
+
+#define ES8  0
+#define ES16 1
+#define ES32 2
+#define ES64 3
+
+#define vtst(v1, v2) \
+if (v1.d[0] != v2.d[0] || v1.d[1] != v2.d[1]) { \
+return 1; \
+}
+
+static inline void vler(S390Vector *v1, const void *va, uint8_t m3)
+{
+asm volatile("vler %[v1], 0(%[va]), %[m3]\n"
+: [v1] "+v" (v1->v)
+: [va]  "d" (va)
+, [m3]  "i" (m3)
+: "memory");
+}
+
+static inline void vster(S390Vector *v1, const void *va, uint8_t m3)
+{
+asm volatile("vster %[v1], 0(%[va]), %[m3]\n"
+: [va] "+d" (va)
+: [v1]  "v" (v1->v)
+, [m3]  "i" (m3)
+:

[PATCH v3 08/11] target/s390x: vxeh2: vector {load, store} byte reversed element

From: David Miller 

This includes VLEBR* and VSTEBR* (single element);
VLBRREP (load single element and replicate); and
VLLEBRZ (load single element and zero).

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-6-dmiller...@gmail.com>
[rth: Split out elements (plural) from element (scalar),
  Use tcg little-endian memory operations.]
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate_vx.c.inc | 85 +
 target/s390x/tcg/insn-data.def  | 12 
 2 files changed, 97 insertions(+)
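
As an illustration of the VLLEBRZ case handled by op_vllebrz() in this
patch, here is a small Python model. It is a sketch under assumptions,
not the TCG code: vllebrz() is a hypothetical helper, and the
element-size codes (1, 2, 3) follow the ES16/ES32/ES64 defines used by
the s390x tests in this series.

```python
ES_16, ES_32, ES_64 = 1, 2, 3   # element-size codes, as in the s390x tests


def vllebrz(mem: bytes, m3: int) -> tuple:
    """Return (doubleword 0, doubleword 1) of the result vector."""
    if m3 in (ES_16, ES_32, ES_64):
        es, lshift = m3, 0
    elif m3 == 6:
        # 32-bit element placed in the leftmost word of doubleword 0.
        es, lshift = ES_32, 32
    else:
        raise ValueError("specification exception")
    nbytes = 1 << es
    # A little-endian load is a byte-reversed load on big-endian s390x.
    value = int.from_bytes(mem[:nbytes], "little")
    return (value << lshift) & 0xFFFFFFFFFFFFFFFF, 0


d0, d1 = vllebrz(bytes([0x11, 0x22, 0x33, 0x44]), 6)
print(f"{d0:016x} {d1:016x}")   # 4433221100000000 0000000000000000
```

The second doubleword is always zero, matching the
write_vec_element_i64(tcg_constant_i64(0), ...) call in the patch.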

diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 9a82401d71..ce77578325 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -457,6 +457,73 @@ static DisasJumpType op_vlrep(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_vlebr(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = s->insn->data;
+const uint8_t enr = get_field(s, m3);
+TCGv_i64 tmp;
+
+if (!valid_vec_element(enr, es)) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+tmp = tcg_temp_new_i64();
+tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+write_vec_element_i64(tmp, get_field(s, v1), enr, es);
+tcg_temp_free_i64(tmp);
+return DISAS_NEXT;
+}
+
+static DisasJumpType op_vlbrrep(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = get_field(s, m3);
+TCGv_i64 tmp;
+
+if (es < ES_16 || es > ES_64) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+tmp = tcg_temp_new_i64();
+tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+gen_gvec_dup_i64(es, get_field(s, v1), tmp);
+tcg_temp_free_i64(tmp);
+return DISAS_NEXT;
+}
+
+static DisasJumpType op_vllebrz(DisasContext *s, DisasOps *o)
+{
+const uint8_t m3 = get_field(s, m3);
+TCGv_i64 tmp;
+int es, lshift;
+
+switch (m3) {
+case ES_16:
+case ES_32:
+case ES_64:
+es = m3;
+lshift = 0;
+break;
+case 6:
+es = ES_32;
+lshift = 32;
+break;
+default:
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+tmp = tcg_temp_new_i64();
+tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+tcg_gen_shli_i64(tmp, tmp, lshift);
+
+write_vec_element_i64(tmp, get_field(s, v1), 0, ES_64);
+write_vec_element_i64(tcg_constant_i64(0), get_field(s, v1), 1, ES_64);
+tcg_temp_free_i64(tmp);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_vlbr(DisasContext *s, DisasOps *o)
 {
 const uint8_t es = get_field(s, m3);
@@ -1048,6 +1115,24 @@ static DisasJumpType op_vst(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_vstebr(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = s->insn->data;
+const uint8_t enr = get_field(s, m3);
+TCGv_i64 tmp;
+
+if (!valid_vec_element(enr, es)) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+tmp = tcg_temp_new_i64();
+read_vec_element_i64(tmp, get_field(s, v1), enr, es);
+tcg_gen_qemu_st_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+tcg_temp_free_i64(tmp);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_vstbr(DisasContext *s, DisasOps *o)
 {
 const uint8_t es = get_field(s, m3);
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index ee6e1dc9e5..b80f989002 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1027,6 +1027,14 @@
 F(0xe756, VLR, VRR_a, V,   0, 0, 0, 0, vlr, 0, IF_VEC)
 /* VECTOR LOAD AND REPLICATE */
 F(0xe705, VLREP,   VRX,   V,   la2, 0, 0, 0, vlrep, 0, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENT */
+E(0xe601, VLEBRH,  VRX,   VE2, la2, 0, 0, 0, vlebr, 0, ES_16, IF_VEC)
+E(0xe603, VLEBRF,  VRX,   VE2, la2, 0, 0, 0, vlebr, 0, ES_32, IF_VEC)
+E(0xe602, VLEBRG,  VRX,   VE2, la2, 0, 0, 0, vlebr, 0, ES_64, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENT AND REPLICATE */
+F(0xe605, VLBRREP, VRX,   VE2, la2, 0, 0, 0, vlbrrep, 0, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENT AND ZERO */
+F(0xe604, VLLEBRZ, VRX,   VE2, la2, 0, 0, 0, vllebrz, 0, IF_VEC)
 /* VECTOR LOAD BYTE REVERSED ELEMENTS */
 F(0xe606, VLBR,VRX,   VE2, la2, 0, 0, 0, vlbr, 0, IF_VEC)
 /* VECTOR LOAD ELEMENT */
@@ -1081,6 +1089,10 @@
 F(0xe75f, VSEG,VRR_a, V,   0, 0, 0, 0, vseg, 0, IF_VEC)
 /* VECTOR STORE */
 F(0xe70e, VST, VRX,   V,   la2, 0, 0, 0, vst, 0, IF_VEC)
+/* VECTOR STORE BYTE REVERSED ELEMENT */
+E(0xe609, VSTEBRH,  VRX,   VE2, la2, 0, 0, 0, vstebr, 0, ES_16, IF_VEC)
+E(0xe60b, VSTEBRF,  VRX,   VE2, la2, 0, 0, 0, vstebr, 0, ES_32, IF_VEC)
+E(0xe60a, VSTEBRG,  VRX,   VE2, la2, 0, 0, 0, vstebr, 0, ES_64, IF_VEC)
 /* VECTOR STORE BYTE REVERSED

[PATCH v3 07/11] target/s390x: vxeh2: vector {load, store} byte reversed elements

From: David Miller 

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-6-dmiller...@gmail.com>
[rth: Split out elements (plural) from element (scalar)
  Use tcg little-endian memory ops, plus hswap and wswap.]
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate_vx.c.inc | 101 
 target/s390x/tcg/insn-data.def  |   4 ++
 2 files changed, 105 insertions(+)
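
The "byte-reversed doubleword load, then larger-order swap" approach
used by op_vlbr() can be checked with a small Python model. This is a
sketch, not the TCG implementation: hswap64(), wswap64(), vlbr() and
reference() are hypothetical names, and ES_128 = 4 is assumed to follow
the same numbering as the other element-size codes.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
ES_16, ES_32, ES_64, ES_128 = 1, 2, 3, 4


def hswap64(x: int) -> int:
    """Reverse the order of the four 16-bit halfwords in a 64-bit value."""
    out = 0
    for _ in range(4):
        out = (out << 16) | (x & 0xFFFF)
        x >>= 16
    return out


def wswap64(x: int) -> int:
    """Swap the two 32-bit words of a 64-bit value."""
    return ((x << 32) | (x >> 32)) & MASK64


def vlbr(mem: bytes, es: int) -> bytes:
    """Model: load 16 bytes, reversing the bytes inside each element."""
    # Begin with byte-reversed doublewords (little-endian loads)...
    t0 = int.from_bytes(mem[0:8], "little")
    t1 = int.from_bytes(mem[8:16], "little")
    # ...then put the smaller elements back into their original order.
    if es == ES_16:
        t0, t1 = hswap64(t0), hswap64(t1)
    elif es == ES_32:
        t0, t1 = wswap64(t0), wswap64(t1)
    elif es == ES_128:
        t0, t1 = t1, t0
    elif es != ES_64:
        raise ValueError("specification exception")
    return t0.to_bytes(8, "big") + t1.to_bytes(8, "big")


def reference(mem: bytes, es: int) -> bytes:
    """Straightforward definition: byte-reverse each element in place."""
    n = 1 << es
    return b"".join(mem[i:i + n][::-1] for i in range(0, 16, n))


mem = bytes(range(16))
for es in (ES_16, ES_32, ES_64, ES_128):
    assert vlbr(mem, es) == reference(mem, es)
print(vlbr(mem, ES_16).hex())   # 0100030205040706090b... element order kept
```

The register image is written as big-endian bytes, so element 0 is the
leftmost element of doubleword 0.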

diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index ac807122a3..9a82401d71 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -457,6 +457,56 @@ static DisasJumpType op_vlrep(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_vlbr(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = get_field(s, m3);
+TCGv_i64 t0, t1, tt;
+
+if (es < ES_16 || es > ES_128) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+
+/* Begin with byte reversed doublewords... */
+tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
+gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
+
+/*
+ * For 16 and 32-bit elements, the doubleword bswap also reversed
+ * the order of the elements.  Perform a larger order swap to put
+ * them back into place.  For the 128-bit "element", finish the
+ * bswap by swapping the doublewords.
+ */
+switch (es) {
+case ES_16:
+tcg_gen_hswap_i64(t0, t0);
+tcg_gen_hswap_i64(t1, t1);
+break;
+case ES_32:
+tcg_gen_wswap_i64(t0, t0);
+tcg_gen_wswap_i64(t1, t1);
+break;
+case ES_64:
+break;
+case ES_128:
+tt = t0, t0 = t1, t1 = tt;
+break;
+default:
+g_assert_not_reached();
+}
+
+write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_vle(DisasContext *s, DisasOps *o)
 {
 const uint8_t es = s->insn->data;
@@ -998,6 +1048,57 @@ static DisasJumpType op_vst(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_vstbr(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = get_field(s, m3);
+TCGv_i64 t0, t1, tt;
+
+if (es < ES_16 || es > ES_128) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+/* Probe write access before actually modifying memory */
+gen_helper_probe_write_access(cpu_env, o->addr1, tcg_constant_i64(16));
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+read_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+read_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+
+/*
+ * For 16 and 32-bit elements, the doubleword bswap below will
+ * reverse the order of the elements.  Perform a larger order
+ * swap to put them back into place.  For the 128-bit "element",
+ * finish the bswap by swapping the doublewords.
+ */
+switch (es) {
+case MO_16:
+tcg_gen_hswap_i64(t0, t0);
+tcg_gen_hswap_i64(t1, t1);
+break;
+case MO_32:
+tcg_gen_wswap_i64(t0, t0);
+tcg_gen_wswap_i64(t1, t1);
+break;
+case MO_64:
+break;
+case MO_128:
+tt = t0, t0 = t1, t1 = tt;
+break;
+default:
+g_assert_not_reached();
+}
+
+tcg_gen_qemu_st_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
+gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+tcg_gen_qemu_st_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
+
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_vste(DisasContext *s, DisasOps *o)
 {
 const uint8_t es = s->insn->data;
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index b524541a7d..ee6e1dc9e5 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1027,6 +1027,8 @@
 F(0xe756, VLR, VRR_a, V,   0, 0, 0, 0, vlr, 0, IF_VEC)
 /* VECTOR LOAD AND REPLICATE */
 F(0xe705, VLREP,   VRX,   V,   la2, 0, 0, 0, vlrep, 0, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENTS */
+F(0xe606, VLBR,VRX,   VE2, la2, 0, 0, 0, vlbr, 0, IF_VEC)
 /* VECTOR LOAD ELEMENT */
 E(0xe700, VLEB,VRX,   V,   la2, 0, 0, 0, vle, 0, ES_8, IF_VEC)
 E(0xe701, VLEH,VRX,   V,   la2, 0, 0, 0, vle, 0, ES_16, IF_VEC)
@@ -1079,6 +1081,8 @@
 F(0xe75f, VSEG,VRR_a, V,   0, 0, 0, 0, vseg, 0, IF_VEC)
 /* VECTOR STORE */
 F(0xe70e, VST, VRX,   V,   la2, 0, 0, 0, vst, 0, IF_VEC)
+/* VECTOR STORE BYTE REVERSED ELEMENTS */
+F(0xe60e, VSTBR,VRX,   VE2, la2, 0, 0, 0, vstbr, 0, IF_VEC)
 /* VECTOR STORE ELEMENT */
 E(0xe708, VSTEB,   VRX,   V,

[PATCH v3 06/11] target/s390x: vxeh2: vector {load, store} elements reversed

From: David Miller 

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-5-dmiller...@gmail.com>
[rth: Use new hswap and wswap tcg expanders.]
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate_vx.c.inc | 84 +
 target/s390x/tcg/insn-data.def  |  4 ++
 2 files changed, 88 insertions(+)
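
For comparison with the byte-reversed variants, the element-reversing
load in op_vler() can be modelled as follows. This is a sketch under
assumptions: reverse_within_dword(), vler() and reference() are
hypothetical names, and the element-size codes follow the ES16/ES32/ES64
defines from the tests in this series.

```python
ES_16, ES_32, ES_64 = 1, 2, 3


def reverse_within_dword(x: int, es: int) -> int:
    """Reverse the order of the (8 << es)-bit elements in a 64-bit value."""
    bits = 8 << es
    mask = (1 << bits) - 1
    out = 0
    for _ in range(64 // bits):
        out = (out << bits) | (x & mask)
        x >>= bits
    return out


def vler(mem: bytes, es: int) -> bytes:
    """Model: load 16 bytes with the element order reversed."""
    if es not in (ES_16, ES_32, ES_64):
        raise ValueError("specification exception")
    # Begin with the two doublewords swapped...
    t1 = int.from_bytes(mem[0:8], "big")
    t0 = int.from_bytes(mem[8:16], "big")
    # ...then swap smaller elements within the doublewords as required.
    t0 = reverse_within_dword(t0, es)
    t1 = reverse_within_dword(t1, es)
    return t0.to_bytes(8, "big") + t1.to_bytes(8, "big")


def reference(mem: bytes, es: int) -> bytes:
    """Straightforward definition: reverse the list of elements."""
    n = 1 << es
    elems = [mem[i:i + n] for i in range(0, 16, n)]
    return b"".join(reversed(elems))


mem = bytes(range(16))
for es in (ES_16, ES_32, ES_64):
    assert vler(mem, es) == reference(mem, es)
print(vler(mem, ES_32).hex())   # 0c0d0e0f08090a0b0405060700010203
```

Unlike VLBR, the bytes inside each element keep their order; only the
position of each element within the vector changes.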

diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index a5283ef2f8..ac807122a3 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -492,6 +492,46 @@ static DisasJumpType op_vlei(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_vler(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = get_field(s, m3);
+
+if (es < ES_16 || es > ES_64) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+TCGv_i64 t0 = tcg_temp_new_i64();
+TCGv_i64 t1 = tcg_temp_new_i64();
+
+/* Begin with the two doublewords swapped... */
+tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_TEUQ);
+gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_TEUQ);
+
+/* ... then swap smaller elements within the doublewords as required. */
+switch (es) {
+case MO_16:
+tcg_gen_hswap_i64(t1, t1);
+tcg_gen_hswap_i64(t0, t0);
+break;
+case MO_32:
+tcg_gen_wswap_i64(t1, t1);
+tcg_gen_wswap_i64(t0, t0);
+break;
+case MO_64:
+break;
+default:
+g_assert_not_reached();
+}
+
+write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_vlgv(DisasContext *s, DisasOps *o)
 {
 const uint8_t es = get_field(s, m4);
@@ -976,6 +1016,50 @@ static DisasJumpType op_vste(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType op_vster(DisasContext *s, DisasOps *o)
+{
+const uint8_t es = get_field(s, m3);
+TCGv_i64 t0, t1;
+
+if (es < ES_16 || es > ES_64) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+/* Probe write access before actually modifying memory */
+gen_helper_probe_write_access(cpu_env, o->addr1, tcg_constant_i64(16));
+
+/* Begin with the two doublewords swapped... */
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+read_vec_element_i64(t1,  get_field(s, v1), 0, ES_64);
+read_vec_element_i64(t0,  get_field(s, v1), 1, ES_64);
+
+/* ... then swap smaller elements within the doublewords as required. */
+switch (es) {
+case MO_16:
+tcg_gen_hswap_i64(t1, t1);
+tcg_gen_hswap_i64(t0, t0);
+break;
+case MO_32:
+tcg_gen_wswap_i64(t1, t1);
+tcg_gen_wswap_i64(t0, t0);
+break;
+case MO_64:
+break;
+default:
+g_assert_not_reached();
+}
+
+tcg_gen_qemu_st_i64(t0, o->addr1, get_mem_index(s), MO_TEUQ);
+gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+tcg_gen_qemu_st_i64(t1, o->addr1, get_mem_index(s), MO_TEUQ);
+
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+return DISAS_NEXT;
+}
+
 static DisasJumpType op_vstm(DisasContext *s, DisasOps *o)
 {
 const uint8_t v3 = get_field(s, v3);
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 98a31a557d..b524541a7d 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1037,6 +1037,8 @@
 E(0xe741, VLEIH,   VRI_a, V,   0, 0, 0, 0, vlei, 0, ES_16, IF_VEC)
 E(0xe743, VLEIF,   VRI_a, V,   0, 0, 0, 0, vlei, 0, ES_32, IF_VEC)
 E(0xe742, VLEIG,   VRI_a, V,   0, 0, 0, 0, vlei, 0, ES_64, IF_VEC)
+/* VECTOR LOAD ELEMENTS REVERSED */
+F(0xe607, VLER,VRX,   VE2, la2, 0, 0, 0, vler, 0, IF_VEC)
 /* VECTOR LOAD GR FROM VR ELEMENT */
 F(0xe721, VLGV,VRS_c, V,   la2, 0, r1, 0, vlgv, 0, IF_VEC)
 /* VECTOR LOAD LOGICAL ELEMENT AND ZERO */
@@ -1082,6 +1084,8 @@
 E(0xe709, VSTEH,   VRX,   V,   la2, 0, 0, 0, vste, 0, ES_16, IF_VEC)
 E(0xe70b, VSTEF,   VRX,   V,   la2, 0, 0, 0, vste, 0, ES_32, IF_VEC)
 E(0xe70a, VSTEG,   VRX,   V,   la2, 0, 0, 0, vste, 0, ES_64, IF_VEC)
+/* VECTOR STORE ELEMENTS REVERSED */
+F(0xe60f, VSTER,   VRX,   VE2, la2, 0, 0, 0, vster, 0, IF_VEC)
 /* VECTOR STORE MULTIPLE */
 F(0xe73e, VSTM,VRS_a, V,   la2, 0, 0, 0, vstm, 0, IF_VEC)
 /* VECTOR STORE WITH LENGTH */
-- 
2.25.1

[PATCH v3 03/11] target/s390x: vxeh2: vector string search

From: David Miller 

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-3-dmiller...@gmail.com>
[rth: Rewrite helpers; fix validation of m6.]
Signed-off-by: Richard Henderson 
---

The substring search was incorrect, in that it didn't properly
restart the search when a match failed.  Split the helper into
multiple, so that the memory accesses can be optimized.
---
 target/s390x/helper.h|   6 ++
 target/s390x/tcg/translate.c |   3 +-
 target/s390x/tcg/vec_string_helper.c | 101 +++
 target/s390x/tcg/translate_vx.c.inc  |  26 +++
 target/s390x/tcg/insn-data.def   |   2 +
 5 files changed, 137 insertions(+), 1 deletion(-)
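
To illustrate what "restart the search when a match failed" means, here
is a simplified Python model of the search loop. It is a sketch, not the
helper: vstrs_model() is a hypothetical name, the zero-search (ZS)
handling is left out, and the condition codes (0 = no match, 2 = full
match, 3 = partial match) are taken from the comments in the helper.

```python
def vstrs_model(string: bytes, substr: bytes) -> tuple:
    """Return (cc, k): condition code and index of the first match."""
    n = len(string)
    m = len(substr)
    if m == 0:
        return 2, 0          # degenerate case: empty substring
    for k in range(n):
        # Compare as much of the substring as still fits in the vector.
        span = min(m, n - k)
        if string[k:k + span] == substr[:span]:
            return (2 if span == m else 3), k
        # Mismatch: restart at k + 1, not after the compared elements.
    return 0, n


print(vstrs_model(b"aab" + b"." * 13, b"ab"))   # (2, 1)
print(vstrs_model(b"." * 15 + b"a", b"ab"))     # (3, 15)
print(vstrs_model(b"." * 16, b"ab"))            # (0, 16)
```

The first call is the interesting one: the comparison starting at index
0 fails on its second element, and only a search that restarts at index
1 finds the match.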

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 7cbcbd7f0b..7412130883 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -246,6 +246,12 @@ DEF_HELPER_6(gvec_vstrc_cc32, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_6(gvec_vstrc_cc_rt8, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_6(gvec_vstrc_cc_rt16, void, ptr, cptr, cptr, cptr, env, i32)
 DEF_HELPER_6(gvec_vstrc_cc_rt32, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_8, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_16, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_32, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_zs8, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_zs16, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_zs32, void, ptr, cptr, cptr, cptr, env, i32)
 
 /* === Vector Floating-Point Instructions */
 DEF_HELPER_FLAGS_5(gvec_vfa32, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 904b51542f..d9ac29573d 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -6222,7 +6222,8 @@ enum DisasInsnEnum {
 #define FAC_PCI S390_FEAT_ZPCI /* z/PCI facility */
 #define FAC_AIS S390_FEAT_ADAPTER_INT_SUPPRESSION
 #define FAC_V   S390_FEAT_VECTOR /* vector facility */
-#define FAC_VE  S390_FEAT_VECTOR_ENH /* vector enhancements facility 1 */
+#define FAC_VE  S390_FEAT_VECTOR_ENH  /* vector enhancements facility 1 */
+#define FAC_VE2 S390_FEAT_VECTOR_ENH2 /* vector enhancements facility 2 */
 #define FAC_MIE2    S390_FEAT_MISC_INSTRUCTION_EXT2 /* miscellaneous-instruction-extensions facility 2 */
 #define FAC_MIE3    S390_FEAT_MISC_INSTRUCTION_EXT3 /* miscellaneous-instruction-extensions facility 3 */
 
diff --git a/target/s390x/tcg/vec_string_helper.c b/target/s390x/tcg/vec_string_helper.c
index ac315eb095..6c0476ecc1 100644
--- a/target/s390x/tcg/vec_string_helper.c
+++ b/target/s390x/tcg/vec_string_helper.c
@@ -471,3 +471,104 @@ void HELPER(gvec_vstrc_cc_rt##BITS)(void *v1, const void *v2, const void *v3,  \
 DEF_VSTRC_CC_RT_HELPER(8)
 DEF_VSTRC_CC_RT_HELPER(16)
 DEF_VSTRC_CC_RT_HELPER(32)
+
+static int vstrs(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
+ const S390Vector *v4, uint8_t es, bool zs)
+{
+int substr_elen, substr_0, str_elen, i, j, k, cc;
+int nelem = 16 >> es;
+bool eos = false;
+
+substr_elen = s390_vec_read_element8(v4, 7) >> es;
+
+/* If ZS, bound substr length by min(nelem, strlen(v3)). */
+if (zs) {
+int i;
+for (i = 0; i < nelem; i++) {
+if (s390_vec_read_element(v3, i, es) == 0) {
+break;
+}
+}
+if (i < substr_elen) {
+substr_elen = i;
+}
+}
+
+if (substr_elen == 0) {
+cc = 2; /* full match for degenerate case of empty substr */
+k = 0;
+goto done;
+}
+
+/* If ZS, look for eos in the searched string. */
+if (zs) {
+for (k = 0; k < nelem; k++) {
+if (s390_vec_read_element(v2, k, es) == 0) {
+eos = true;
+break;
+}
+}
+str_elen = k;
+} else {
+str_elen = nelem;
+}
+
+substr_0 = s390_vec_read_element(v3, 0, es);
+
+for (k = 0; ; k++) {
+for (; k < str_elen; k++) {
+if (s390_vec_read_element(v2, k, es) == substr_0) {
+break;
+}
+}
+
+/* If we reached the end of the string, no match. */
+if (k == str_elen) {
+cc = eos; /* no match (with or without zero char) */
+goto done;
+}
+
+/* If the substring is only one char, match. */
+if (substr_elen == 1) {
+cc = 2; /* full match */
+goto done;
+}
+
+/* If the match begins at the last char, we have a partial match. */
+if (k == str_elen - 1) {
+cc = 3; /* partial match */
+goto done;
+}
+
+i = MIN(nelem, k + substr_elen);
+for (j = k + 1; j < i; j++) {
+uint32_t e2 = s390_vec_read_element(v2, j,

[PATCH v3 02/11] target/s390x: vxeh2: vector convert short/32b

From: David Miller 

Signed-off-by: David Miller 
Reviewed-by: Richard Henderson 
Message-Id: <20220307020327.3003-2-dmiller...@gmail.com>
Signed-off-by: Richard Henderson 
---
 target/s390x/helper.h   |  4 +++
 target/s390x/tcg/vec_fpu_helper.c   | 31 
 target/s390x/tcg/translate_vx.c.inc | 44 ++---
 3 files changed, 75 insertions(+), 4 deletions(-)
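
The NaN handling added for the 32-bit float-to-integer conversions can
be sketched in Python. This is a model under assumptions, not the
softfloat code: vcgd32_model() and vclgd32_model() are hypothetical
names, and round-to-nearest-even with saturation on overflow is assumed
for the non-NaN path.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
UINT32_MAX = 2**32 - 1


def vcgd32_model(x: float) -> int:
    """Signed float-to-int32 conversion with the NaN override."""
    if math.isnan(x):
        return INT32_MIN          # mirrors the float32_is_any_nan() branch
    if math.isinf(x):
        return INT32_MAX if x > 0 else INT32_MIN
    # Assumed: round to nearest, ties to even, then saturate.
    return max(INT32_MIN, min(INT32_MAX, round(x)))


def vclgd32_model(x: float) -> int:
    """Unsigned float-to-uint32 conversion with the NaN override."""
    if math.isnan(x):
        return 0                  # mirrors the float32_is_any_nan() branch
    if math.isinf(x):
        return UINT32_MAX if x > 0 else 0
    return max(0, min(UINT32_MAX, round(x)))


print([vcgd32_model(x) for x in (3.987, 5.123, 4.499, 0.512)])  # [4, 5, 4, 1]
print(vcgd32_model(float("nan")))                               # -2147483648
print(vclgd32_model(float("nan")))                              # 0
```

The four inputs are the values used by tests/tcg/s390x/vxeh2_vcvt.c in
this series, which expects 4, 5, 4 and 1.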

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 69f69cf718..7cbcbd7f0b 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -275,6 +275,10 @@ DEF_HELPER_FLAGS_5(gvec_vfche64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32
 DEF_HELPER_5(gvec_vfche64_cc, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_FLAGS_5(gvec_vfche128, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
 DEF_HELPER_5(gvec_vfche128_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdg32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdlg32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcgd32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vclgd32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcdlg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
 DEF_HELPER_FLAGS_4(gvec_vcgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
diff --git a/target/s390x/tcg/vec_fpu_helper.c b/target/s390x/tcg/vec_fpu_helper.c
index 1a77993471..6834dbc540 100644
--- a/target/s390x/tcg/vec_fpu_helper.c
+++ b/target/s390x/tcg/vec_fpu_helper.c
@@ -176,6 +176,30 @@ static void vop128_2(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
 *v1 = tmp;
 }
 
+static float32 vcdg32(float32 a, float_status *s)
+{
+return int32_to_float32(a, s);
+}
+
+static float32 vcdlg32(float32 a, float_status *s)
+{
+return uint32_to_float32(a, s);
+}
+
+static float32 vcgd32(float32 a, float_status *s)
+{
+const float32 tmp = float32_to_int32(a, s);
+
+return float32_is_any_nan(a) ? INT32_MIN : tmp;
+}
+
+static float32 vclgd32(float32 a, float_status *s)
+{
+const float32 tmp = float32_to_uint32(a, s);
+
+return float32_is_any_nan(a) ? 0 : tmp;
+}
+
 static float64 vcdg64(float64 a, float_status *s)
 {
 return int64_to_float64(a, s);
@@ -211,6 +235,9 @@ void HELPER(gvec_##NAME##BITS)(void *v1, const void *v2, CPUS390XState *env,   \
 vop##BITS##_2(v1, v2, env, se, XxC, erm, FN, GETPC()); \
 }
 
+#define DEF_GVEC_VOP2_32(NAME) \
+DEF_GVEC_VOP2_FN(NAME, NAME##32, 32)
+
 #define DEF_GVEC_VOP2_64(NAME) \
 DEF_GVEC_VOP2_FN(NAME, NAME##64, 64)
 
@@ -219,6 +246,10 @@ DEF_GVEC_VOP2_FN(NAME, float32_##OP, 32)   \
 DEF_GVEC_VOP2_FN(NAME, float64_##OP, 64)   \
 DEF_GVEC_VOP2_FN(NAME, float128_##OP, 128)
 
+DEF_GVEC_VOP2_32(vcdg)
+DEF_GVEC_VOP2_32(vcdlg)
+DEF_GVEC_VOP2_32(vcgd)
+DEF_GVEC_VOP2_32(vclgd)
 DEF_GVEC_VOP2_64(vcdg)
 DEF_GVEC_VOP2_64(vcdlg)
 DEF_GVEC_VOP2_64(vcgd)
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 98eb7710a4..ea28e40d4f 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2720,23 +2720,59 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
 
 switch (s->fields.op2) {
 case 0xc3:
-if (fpf == FPF_LONG) {
+switch (fpf) {
+case FPF_LONG:
 fn = gen_helper_gvec_vcdg64;
+break;
+case FPF_SHORT:
+if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+fn = gen_helper_gvec_vcdg32;
+}
+break;
+default:
+break;
 }
 break;
 case 0xc1:
-if (fpf == FPF_LONG) {
+switch (fpf) {
+case FPF_LONG:
 fn = gen_helper_gvec_vcdlg64;
+break;
+case FPF_SHORT:
+if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+fn = gen_helper_gvec_vcdlg32;
+}
+break;
+default:
+break;
 }
 break;
 case 0xc2:
-if (fpf == FPF_LONG) {
+switch (fpf) {
+case FPF_LONG:
 fn = gen_helper_gvec_vcgd64;
+break;
+case FPF_SHORT:
+if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+fn = gen_helper_gvec_vcgd32;
+}
+break;
+default:
+break;
 }
 break;
 case 0xc0:
-if (fpf == FPF_LONG) {
+switch (fpf) {
+case FPF_LONG:
 fn = gen_helper_gvec_vclgd64;
+break;
+case FPF_SHORT:
+if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+fn = gen_helper_gvec_vclgd32;
+}
+break;
+

[PATCH v3 3/5] iotests: Remove explicit checks for qemu_img() == 0

qemu_img() returning zero ought to be the rule, not the
exception. Remove all explicit checks against the condition in
preparation for making non-zero returns an Exception.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/163  |  9 +++--
 tests/qemu-iotests/216  |  6 +++---
 tests/qemu-iotests/218  |  2 +-
 tests/qemu-iotests/224  | 11 +--
 tests/qemu-iotests/228  | 12 ++--
 tests/qemu-iotests/257  |  3 +--
 tests/qemu-iotests/258  |  4 ++--
 tests/qemu-iotests/310  | 13 ++---
 tests/qemu-iotests/tests/block-status-cache |  3 +--
 tests/qemu-iotests/tests/graph-changes-while-io |  7 +++
 tests/qemu-iotests/tests/image-fleecing | 10 +-
 tests/qemu-iotests/tests/mirror-ready-cancel-error  |  6 ++
 tests/qemu-iotests/tests/mirror-top-perms   |  3 +--
 tests/qemu-iotests/tests/remove-bitmap-from-backing |  8 
 tests/qemu-iotests/tests/stream-error-on-reset  |  4 ++--
 15 files changed, 45 insertions(+), 56 deletions(-)
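
The direction this patch prepares for, a wrapper that raises instead of
returning a status to be asserted on, can be sketched as follows. This
is an illustration only: run_checked() is a hypothetical stand-in, not
the iotests qemu_img() function, and the use of
subprocess.CalledProcessError is an assumption rather than something
this patch states.

```python
import subprocess
import sys


def run_checked(*args: str) -> int:
    """Run a command; raise on a non-zero exit instead of returning it."""
    proc = subprocess.run(args, check=False)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, args)
    return proc.returncode


# Success needs no explicit "== 0" check at the call site.
run_checked(sys.executable, "-c", "pass")

# Failure surfaces as an exception carrying the exit status.
try:
    run_checked(sys.executable, "-c", "raise SystemExit(3)")
except subprocess.CalledProcessError as exc:
    print("command failed with status", exc.returncode)
```

With such a wrapper, call sites shrink to the bare call, which is the
shape the hunks below move towards.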

diff --git a/tests/qemu-iotests/163 b/tests/qemu-iotests/163
index b8bfc95358..e4cd4b230f 100755
--- a/tests/qemu-iotests/163
+++ b/tests/qemu-iotests/163
@@ -107,8 +107,7 @@ class ShrinkBaseClass(iotests.QMPTestCase):
 
 if iotests.imgfmt == 'raw':
 return
-self.assertEqual(qemu_img('check', test_img), 0,
- "Verifying image corruption")
+qemu_img('check', test_img)
 
 def test_empty_image(self):
 qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
@@ -130,8 +129,7 @@ class ShrinkBaseClass(iotests.QMPTestCase):
 qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
  self.shrink_size)
 
-self.assertEqual(qemu_img("compare", test_img, check_img), 0,
- "Verifying image content")
+qemu_img("compare", test_img, check_img)
 
 self.image_verify()
 
@@ -146,8 +144,7 @@ class ShrinkBaseClass(iotests.QMPTestCase):
 qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
  self.shrink_size)
 
-self.assertEqual(qemu_img("compare", test_img, check_img), 0,
- "Verifying image content")
+qemu_img("compare", test_img, check_img)
 
 self.image_verify()
 
diff --git a/tests/qemu-iotests/216 b/tests/qemu-iotests/216
index c02f8d2880..88b385afa3 100755
--- a/tests/qemu-iotests/216
+++ b/tests/qemu-iotests/216
@@ -51,10 +51,10 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Setting up images ---')
 log('')
 
-assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
+qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M')
 assert qemu_io_silent(base_img_path, '-c', 'write -P 1 0M 1M') == 0
-assert qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
-'-F', iotests.imgfmt, top_img_path) == 0
+qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
+ '-F', iotests.imgfmt, top_img_path)
 assert qemu_io_silent(top_img_path,  '-c', 'write -P 2 1M 1M') == 0
 
 log('Done')
diff --git a/tests/qemu-iotests/218 b/tests/qemu-iotests/218
index 4922b4d3b6..853ed52b34 100755
--- a/tests/qemu-iotests/218
+++ b/tests/qemu-iotests/218
@@ -145,7 +145,7 @@ log('')
 with iotests.VM() as vm, \
  iotests.FilePath('src.img') as src_img_path:
 
-assert qemu_img('create', '-f', iotests.imgfmt, src_img_path, '64M') == 0
+qemu_img('create', '-f', iotests.imgfmt, src_img_path, '64M')
 assert qemu_io_silent('-f', iotests.imgfmt, src_img_path,
   '-c', 'write -P 42 0M 64M') == 0
 
diff --git a/tests/qemu-iotests/224 b/tests/qemu-iotests/224
index 38dd153625..c31c55b49d 100755
--- a/tests/qemu-iotests/224
+++ b/tests/qemu-iotests/224
@@ -47,12 +47,11 @@ for filter_node_name in False, True:
  iotests.FilePath('top.img') as top_img_path, \
  iotests.VM() as vm:
 
-assert qemu_img('create', '-f', iotests.imgfmt,
-base_img_path, '64M') == 0
-assert qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
-'-F', iotests.imgfmt, mid_img_path) == 0
-assert qemu_img('create', '-f', iotests.imgfmt, '-b', mid_img_path,
-'-F', iotests.imgfmt, top_img_path) == 0
+qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M')
+qemu_img('create', '-f', iotests.imgfmt, '-b', base_img_path,
+ '-F', iotests.imgfmt, mid_img_path)
+qemu_img('create', '-f', iotests.imgfmt, '-b', mid_img_path,
+ '-F', iotests.imgfmt, top_img_path)
 
 # Something to commit
 assert

[PATCH v3 4/5] iotests: make qemu_img raise on non-zero rc by default

Rewrite qemu_img() as a function that will by default raise a
VerboseProcessError (extended from CalledProcessError) on
non-zero return codes. This will produce a stack trace that will show
the command line arguments and return code from the failed process run.

Users that want something more flexible (there appears to be only one)
can use check=False and manage the return themselves. However, when the
return code is negative, the Exception will be raised no matter what.
This is done under the belief that there's no legitimate reason, even in
negative tests, to see a crash from qemu-img.
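
A minimal sketch of the return-code policy described above. The helper
names run_checked() and exit_with() are hypothetical, the Python
interpreter stands in for qemu-img, and the stdlib CalledProcessError is
raised in place of the VerboseProcessError this series adds:

```python
import subprocess
import sys


def run_checked(*args: str, check: bool = True) -> subprocess.CompletedProcess:
    """Hypothetical stand-in for qemu_img(); not the iotests helper itself."""
    subp = subprocess.run(
        list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    # Non-zero exits raise only when check=True; a negative return code
    # (death by signal on POSIX) raises regardless of 'check'.
    if (check and subp.returncode) or subp.returncode < 0:
        raise subprocess.CalledProcessError(
            subp.returncode, list(args), output=subp.stdout)
    return subp


def exit_with(code: int) -> list:
    """Build a command line that exits with the given status."""
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]
```

A caller that expects a non-zero exit passes check=False and inspects
returncode itself, which is what the one flexible user in test 257 does.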

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/257|  8 +++--
 tests/qemu-iotests/iotests.py | 56 ++-
 2 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/257 b/tests/qemu-iotests/257
index fb5359c581..e7e7a2317e 100755
--- a/tests/qemu-iotests/257
+++ b/tests/qemu-iotests/257
@@ -241,11 +241,13 @@ def compare_images(image, reference, baseimg=None, expected_match=True):
 expected_ret = 0 if expected_match else 1
 if baseimg:
 qemu_img("rebase", "-u", "-b", baseimg, '-F', iotests.imgfmt, image)
-ret = qemu_img("compare", image, reference)
+
+sub = qemu_img("compare", image, reference, check=False)
+
 log('qemu_img compare "{:s}" "{:s}" ==> {:s}, {:s}'.format(
 image, reference,
-"Identical" if ret == 0 else "Mismatch",
-"OK!" if ret == expected_ret else "ERROR!"),
+"Identical" if sub.returncode == 0 else "Mismatch",
+"OK!" if sub.returncode == expected_ret else "ERROR!"),
 filters=[iotests.filter_testfiles])
 
 def test_bitmap_sync(bsync_mode, msync_mode='bitmap', failure=None):
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 508adade9e..ec4568b24a 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -37,9 +37,10 @@
 
 from contextlib import contextmanager
 
+from qemu.aqmp.legacy import QEMUMonitorProtocol
 from qemu.machine import qtest
 from qemu.qmp import QMPMessage
-from qemu.aqmp.legacy import QEMUMonitorProtocol
+from qemu.utils import VerboseProcessError
 
 # Use this logger for logging messages directly from the iotests module
 logger = logging.getLogger('qemu.iotests')
@@ -215,9 +216,49 @@ def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
 return qemu_tool_pipe_and_status('qemu-img', full_args,
  drop_successful_output=is_create)
 
-def qemu_img(*args: str) -> int:
-'''Run qemu-img and return the exit code'''
-return qemu_img_pipe_and_status(*args)[1]
+def qemu_img(*args: str, check: bool = True, combine_stdio: bool = True
+ ) -> subprocess.CompletedProcess[str]:
+"""
+Run qemu_img and return the status code and console output.
+
+This function always prepends QEMU_IMG_OPTIONS and may further alter
+the args for 'create' commands.
+
+:param args: command-line arguments to qemu-img.
+:param check: Enforce a return code of zero.
+:param combine_stdio: set to False to keep stdout/stderr separated.
+
+:raise VerboseProcessError:
+When the return code is negative, or on any non-zero exit code
+when 'check=True' was provided (the default). This exception has
+'stdout', 'stderr', and 'returncode' properties that may be
+inspected to show greater detail. If this exception is not
+handled, the command-line, return code, and all console output
+will be included at the bottom of the stack trace.
+
+:return: a CompletedProcess. This object has args, returncode, and
+stdout properties. If streams are not combined, it will also
+have a stderr property.
+"""
+full_args = qemu_img_args + qemu_img_create_prepare_args(list(args))
+
+subp = subprocess.run(
+full_args,
+stdout=subprocess.PIPE,
+stderr=subprocess.STDOUT if combine_stdio else subprocess.PIPE,
+universal_newlines=True,
+check=False
+)
+
+if check and subp.returncode or (subp.returncode < 0):
+raise VerboseProcessError(
+subp.returncode, full_args,
+output=subp.stdout,
+stderr=subp.stderr,
+)
+
+return subp
+
 
 def ordered_qmp(qmsg, conv_keys=True):
 # Dictionaries are not ordered prior to 3.6, therefore:
@@ -232,7 +273,7 @@ def ordered_qmp(qmsg, conv_keys=True):
 return od
 return qmsg
 
-def qemu_img_create(*args):
+def qemu_img_create(*args: str) -> subprocess.CompletedProcess[str]:
 return qemu_img('create', *args)
 
 def qemu_img_measure(*args):
@@ -467,8 +508,9 @@ def qemu_nbd_popen(*args):
 
 def compare_images(img1, img2, fmt1=imgfmt, fmt2=imgfmt):
 '''Return True if two image files are identical'''
-return qemu_img('compare', '-f', fmt1,
-'-F', fmt2, img1, img2) == 0
+res =

[PATCH v3 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to cpu max

From: David Miller 

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-7-dmiller...@gmail.com>
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
---
 target/s390x/gen-features.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 22846121c4..499a3b10a8 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -740,7 +740,9 @@ static uint16_t qemu_V6_2[] = {
 
 static uint16_t qemu_LATEST[] = {
 S390_FEAT_MISC_INSTRUCTION_EXT3,
+S390_FEAT_VECTOR_ENH2,
 };
+
 /* add all new definitions before this point */
 static uint16_t qemu_MAX[] = {
 /* generates a dependency warning, leave it out for now */
-- 
2.25.1

[PATCH v3 2/5] python/utils: add VerboseProcessError

This adds an Exception that extends the Python stdlib
subprocess.CalledProcessError.

The difference is that the str() method of this exception also adds the
stdout/stderr logs. In effect, if this exception goes unhandled, Python
will print the output in a visually distinct wrapper to the terminal so
that it's easy to spot in a sea of traceback information.
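
A rough sketch of the idea, assuming a hypothetical class name
LoudProcessError and plain indentation in place of the
add_visual_margin() decoration the actual patch uses:

```python
import subprocess
import textwrap


class LoudProcessError(subprocess.CalledProcessError):
    """Hypothetical example class: CalledProcessError plus captured output."""

    def __str__(self) -> str:
        # Start from the stock one-line summary, then append the output.
        summary = super().__str__()
        output = self.stdout if self.stdout else "N/A"
        return summary + "\n" + textwrap.indent("output:\n" + output, "  ")


# The command line here is illustrative only; nothing is executed.
err = LoudProcessError(1, ["qemu-img", "check", "test.img"],
                       output="No such file\n")
print(err)
```

Because the subclass only overrides str(), existing handlers written for
CalledProcessError keep working unchanged.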

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
---
 python/qemu/utils/__init__.py | 36 +++
 1 file changed, 36 insertions(+)

diff --git a/python/qemu/utils/__init__.py b/python/qemu/utils/__init__.py
index 5babf40df2..355ac550bc 100644
--- a/python/qemu/utils/__init__.py
+++ b/python/qemu/utils/__init__.py
@@ -18,6 +18,7 @@
 import os
 import re
 import shutil
+from subprocess import CalledProcessError
 import textwrap
 from typing import Optional
 
@@ -26,6 +27,7 @@
 
 
 __all__ = (
+'VerboseProcessError',
 'add_visual_margin',
 'get_info_usernet_hostfwd_port',
 'kvm_available',
@@ -121,3 +123,37 @@ def _wrap(line: str) -> str:
 os.linesep.join(_wrap(line) for line in content.splitlines()),
 _bar(None, top=False),
 ))
+
+
+class VerboseProcessError(CalledProcessError):
+"""
+The same as CalledProcessError, but more verbose.
+
+This is useful for debugging failed calls during test executions.
+The return code, signal (if any), and terminal output will be displayed
+on unhandled exceptions.
+"""
+def summary(self) -> str:
+"""Return the normal CalledProcessError str() output."""
+return super().__str__()
+
+def __str__(self) -> str:
+lmargin = '  '
+width = -len(lmargin)
+sections = []
+
+name = 'output' if self.stderr is None else 'stdout'
+if self.stdout:
+sections.append(add_visual_margin(self.stdout, width, name))
+else:
+sections.append(f"{name}: N/A")
+
+if self.stderr:
+sections.append(add_visual_margin(self.stderr, width, 'stderr'))
+elif self.stderr is not None:
+sections.append("stderr: N/A")
+
+return os.linesep.join((
+self.summary(),
+textwrap.indent(os.linesep.join(sections), prefix=lmargin),
+))
-- 
2.34.1

[PULL 15/15] qemu-io: Allow larger write zeroes under no fallback

When writing zeroes can fall back to a slow write, permitting an
overly large request can become an amplification denial of service
attack in triggering a large amount of work from a small request.  But
the whole point of the no fallback flag is to quickly determine if
writing an entire device to zero can be done quickly (such as when it
is already known that the device started with zero contents); in those
cases, artificially capping things at 2G in qemu-io itself doesn't
help us.
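
The relaxed length check can be sketched as follows. The function name
length_allowed() is hypothetical, and the cap value is an assumption
chosen to sit just under 2G; it stands in for BDRV_REQUEST_MAX_BYTES:

```python
# Assumed cap (just under 2G); stands in for qemu's BDRV_REQUEST_MAX_BYTES.
ASSUMED_MAX_BYTES = 2**31 - 512


def length_allowed(count: int, no_fallback: bool,
                   max_bytes: int = ASSUMED_MAX_BYTES) -> bool:
    """Return True if a write of 'count' bytes passes the length check."""
    if count < 0:
        # Negative counts are conversion errors and are always rejected.
        return False
    # Oversized requests are accepted only when no slow fallback can occur.
    return count <= max_bytes or no_fallback
```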

Signed-off-by: Eric Blake 
Message-Id: <20211203231539.3900865-4-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qemu-io-cmds.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 954955c12fb9..45a957093369 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -603,10 +603,6 @@ static int do_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
 .done   = false,
 };

-if (bytes > INT_MAX) {
-return -ERANGE;
-}
-
 co = qemu_coroutine_create(co_pwrite_zeroes_entry, &data);
 bdrv_coroutine_enter(blk_bs(blk), co);
 while (!data.done) {
@@ -1160,8 +1156,9 @@ static int write_f(BlockBackend *blk, int argc, char **argv)
 if (count < 0) {
 print_cvtnum_err(count, argv[optind]);
 return count;
-} else if (count > BDRV_REQUEST_MAX_BYTES) {
-printf("length cannot exceed %" PRIu64 ", given %s\n",
+} else if (count > BDRV_REQUEST_MAX_BYTES &&
+   !(flags & BDRV_REQ_NO_FALLBACK)) {
+printf("length cannot exceed %" PRIu64 " without -n, given %s\n",
(uint64_t)BDRV_REQUEST_MAX_BYTES, argv[optind]);
 return -EINVAL;
 }
-- 
2.35.1

[PATCH v3 05/11] target/s390x: vxeh2: vector shift double by bit

From: David Miller 

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-4-dmiller...@gmail.com>
[rth: Split out of larger patch.]
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate_vx.c.inc | 47 ++---
 target/s390x/tcg/insn-data.def  |  6 +++-
 2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 967f6213d8..a5283ef2f8 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2056,11 +2056,19 @@ static DisasJumpType op_vsrl(DisasContext *s, DisasOps *o)
 gen_helper_gvec_vsrl_ve2);
 }
 
-static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o)
+static DisasJumpType op_vsld(DisasContext *s, DisasOps *o)
 {
-const uint8_t i4 = get_field(s, i4) & 0xf;
-const int left_shift = (i4 & 7) * 8;
-const int right_shift = 64 - left_shift;
+const bool byte = s->insn->data;
+const uint8_t mask = byte ? 15 : 7;
+const uint8_t mul  = byte ?  8 : 1;
+const uint8_t i4   = get_field(s, i4);
+const int right_shift = 64 - (i4 & 7) * mul;
+
+if (i4 & ~mask) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
 TCGv_i64 t0 = tcg_temp_new_i64();
 TCGv_i64 t1 = tcg_temp_new_i64();
 TCGv_i64 t2 = tcg_temp_new_i64();
@@ -2074,8 +2082,39 @@ static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o)
 read_vec_element_i64(t1, get_field(s, v3), 0, ES_64);
 read_vec_element_i64(t2, get_field(s, v3), 1, ES_64);
 }
+
 tcg_gen_extract2_i64(t0, t1, t0, right_shift);
 tcg_gen_extract2_i64(t1, t2, t1, right_shift);
+
+write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
+return DISAS_NEXT;
+}
+
+static DisasJumpType op_vsrd(DisasContext *s, DisasOps *o)
+{
+const uint8_t i4 = get_field(s, i4);
+
+if (i4 & ~7) {
+gen_program_exception(s, PGM_SPECIFICATION);
+return DISAS_NORETURN;
+}
+
+TCGv_i64 t0 = tcg_temp_new_i64();
+TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 t2 = tcg_temp_new_i64();
+
+read_vec_element_i64(t0, get_field(s, v2), 1, ES_64);
+read_vec_element_i64(t1, get_field(s, v3), 0, ES_64);
+read_vec_element_i64(t2, get_field(s, v3), 1, ES_64);
+
+tcg_gen_extract2_i64(t0, t1, t0, i4);
+tcg_gen_extract2_i64(t1, t2, t1, i4);
+
 write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
 write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
 
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index f487a64abf..98a31a557d 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1207,12 +1207,16 @@
 E(0xe774, VSL, VRR_c, V,   0, 0, 0, 0, vsl, 0, 0, IF_VEC)
 /* VECTOR SHIFT LEFT BY BYTE */
 E(0xe775, VSLB,VRR_c, V,   0, 0, 0, 0, vsl, 0, 1, IF_VEC)
+/* VECTOR SHIFT LEFT DOUBLE BY BIT */
+E(0xe786, VSLD,VRI_d, VE2, 0, 0, 0, 0, vsld, 0, 0, IF_VEC)
 /* VECTOR SHIFT LEFT DOUBLE BY BYTE */
-F(0xe777, VSLDB,   VRI_d, V,   0, 0, 0, 0, vsldb, 0, IF_VEC)
+E(0xe777, VSLDB,   VRI_d, V,   0, 0, 0, 0, vsld, 0, 1, IF_VEC)
 /* VECTOR SHIFT RIGHT ARITHMETIC */
 E(0xe77e, VSRA,VRR_c, V,   0, 0, 0, 0, vsra, 0, 0, IF_VEC)
 /* VECTOR SHIFT RIGHT ARITHMETIC BY BYTE */
 E(0xe77f, VSRAB,   VRR_c, V,   0, 0, 0, 0, vsra, 0, 1, IF_VEC)
+/* VECTOR SHIFT RIGHT DOUBLE BY BIT */
+F(0xe787, VSRD,VRI_d, VE2, 0, 0, 0, 0, vsrd, 0, IF_VEC)
 /* VECTOR SHIFT RIGHT LOGICAL */
 E(0xe77c, VSRL,VRR_c, V,   0, 0, 0, 0, vsrl, 0, 0, IF_VEC)
 /* VECTOR SHIFT RIGHT LOGICAL BY BYTE */
-- 
2.25.1

[PULL 14/15] qemu-io: Utilize 64-bit status during map

The block layer has supported 64-bit block status from drivers since
commit 86a3d5c688 ("block: Add .bdrv_co_block_status() callback",
v2.12) and friends, with individual driver callbacks responsible for
capping things where necessary.  Artificially capping things below 2G
in the qemu-io 'map' command, added in commit d6a644bbfe ("block: Make
bdrv_is_allocated() byte-based", v2.10) is thus no longer necessary.

One way to test this is with qemu-nbd as server on a raw file larger
than 4G (the entire file should show as allocated), plus 'qemu-io -f
raw -c map nbd://localhost --trace=nbd_\*' as client.  Prior to this
patch, the NBD_CMD_BLOCK_STATUS requests are fragmented at 0x7e00
distances; with this patch, the fragmenting changes to 0x7fff
(since the NBD protocol is currently still limited to 32-bit
transactions - see block/nbd.c:nbd_client_co_block_status).  Then in
later patches, once I add an NBD extension for a 64-bit block status,
the same map command completes with just one NBD_CMD_BLOCK_STATUS.

Signed-off-by: Eric Blake 
Message-Id: <20211203231539.3900865-3-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qemu-io-cmds.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 46593d632d8f..954955c12fb9 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1993,11 +1993,9 @@ static int map_is_allocated(BlockDriverState *bs, int64_t offset,
 int64_t bytes, int64_t *pnum)
 {
 int64_t num;
-int num_checked;
 int ret, firstret;

-num_checked = MIN(bytes, BDRV_REQUEST_MAX_BYTES);
-ret = bdrv_is_allocated(bs, offset, num_checked, &num);
+ret = bdrv_is_allocated(bs, offset, bytes, &num);
 if (ret < 0) {
 return ret;
 }
@@ -2009,8 +2007,7 @@ static int map_is_allocated(BlockDriverState *bs, int64_t offset,
 offset += num;
 bytes -= num;

-num_checked = MIN(bytes, BDRV_REQUEST_MAX_BYTES);
-ret = bdrv_is_allocated(bs, offset, num_checked, &num);
+ret = bdrv_is_allocated(bs, offset, bytes, &num);
 if (ret == firstret && num) {
 *pnum += num;
 } else {
-- 
2.35.1

[PATCH v3 11/11] target/s390x: Fix writeback to v1 in helper_vstl

Copy-paste error from vector load length -- do not write
zeros back to v1 after storing from v1.

Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/vec_helper.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/s390x/tcg/vec_helper.c b/target/s390x/tcg/vec_helper.c
index ededf13cf0..48d86722b2 100644
--- a/target/s390x/tcg/vec_helper.c
+++ b/target/s390x/tcg/vec_helper.c
@@ -200,7 +200,6 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, uint64_t addr,
 addr = wrap_address(env, addr + 8);
 cpu_stq_data_ra(env, addr, s390_vec_read_element64(v1, 1), GETPC());
 } else {
-S390Vector tmp = {};
 int i;
 
 for (i = 0; i < bytes; i++) {
@@ -209,6 +208,5 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, uint64_t addr,
 cpu_stb_data_ra(env, addr, byte, GETPC());
 addr = wrap_address(env, addr + 1);
 }
-*(S390Vector *)v1 = tmp;
 }
 }
-- 
2.25.1

[PATCH v3 5/5] iotests: fortify compare_images() against crashes

Fortify compare_images() to be more discerning about the status codes it
receives. If qemu_img() returns an exit code that implies it didn't
actually perform the comparison, treat that as an exceptional
circumstance and force the caller to be aware of the peril.

If a negative test is desired (perhaps to test how qemu_img compare
behaves on malformed images, for instance), it is still possible to
catch the exception in the test and deal with that circumstance
manually.
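
The status-code triage can be sketched as below, assuming a hypothetical
images_identical() helper and a fake comparison callable in place of a
real qemu-img run:

```python
import subprocess


def images_identical(run_compare) -> bool:
    """
    Hypothetical helper: 'run_compare' is any callable that raises
    CalledProcessError on a non-zero exit, as a checked qemu_img() would.
    """
    try:
        run_compare()
        return True          # exit 0: images are identical
    except subprocess.CalledProcessError as exc:
        if exc.returncode == 1:
            return False     # exit 1: images differ
        raise                # anything else: the comparison did not run


def fake_compare(returncode: int):
    """Build a fake comparison that 'exits' with the given status."""
    def run() -> None:
        if returncode != 0:
            raise subprocess.CalledProcessError(
                returncode, ["qemu-img", "compare"])
    return run
```

A negative test would wrap the call in its own try/except and inspect
exc.returncode for the status it expects.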

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/iotests.py | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ec4568b24a..7057db0686 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -506,11 +506,22 @@ def qemu_nbd_popen(*args):
 p.kill()
 p.wait()
 
-def compare_images(img1, img2, fmt1=imgfmt, fmt2=imgfmt):
-'''Return True if two image files are identical'''
-res = qemu_img('compare', '-f', fmt1,
-   '-F', fmt2, img1, img2, check=False)
-return res.returncode == 0
+def compare_images(img1: str, img2: str,
+   fmt1: str = imgfmt, fmt2: str = imgfmt) -> bool:
+"""
+Compare two images with QEMU_IMG; return True if they are identical.
+
+:raise CalledProcessError:
+when qemu-img crashes or returns a status code of anything other
+than 0 (identical) or 1 (different).
+"""
+try:
+qemu_img('compare', '-f', fmt1, '-F', fmt2, img1, img2)
+return True
+except subprocess.CalledProcessError as exc:
+if exc.returncode == 1:
+return False
+raise
 
 def create_image(name, size):
 '''Create a fully-allocated raw image with sector markers'''
-- 
2.34.1

[PATCH v3 01/11] tcg: Implement tcg_gen_{h,w}swap_{i32,i64}

Swap half-words (16-bit) and words (32-bit) within a larger value.
Mirrors functions of the same names within include/qemu/bitops.h.
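
For reference, the swaps can be modelled on plain integers. This is a
sketch in Python rather than TCG ops; the function names follow the
bitops.h naming, and the 64-bit half-word mask is assumed to be the one
hswap64 uses there:

```python
MASK64 = (1 << 64) - 1


def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit value left by 'shift' bits."""
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def wswap64(value: int) -> int:
    """Swap the two 32-bit words: a rotate by 32."""
    return rol64(value, 32)


def hswap64(value: int) -> int:
    """Reverse the order of the four 16-bit half-words."""
    m = 0x0000FFFF0000FFFF
    h = rol64(value, 32)
    return ((h & m) << 16) | ((h >> 16) & m)


def hswap32(value: int) -> int:
    """Swap the two 16-bit half-words of a 32-bit value: a rotate by 16."""
    return ((value << 16) | (value >> 16)) & 0xFFFFFFFF
```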

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op.h |  6 ++
 tcg/tcg-op.c | 30 ++
 2 files changed, 36 insertions(+)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index caa0a63612..b09b8b4a05 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -332,6 +332,7 @@ void tcg_gen_ext8u_i32(TCGv_i32 ret, TCGv_i32 arg);
 void tcg_gen_ext16u_i32(TCGv_i32 ret, TCGv_i32 arg);
 void tcg_gen_bswap16_i32(TCGv_i32 ret, TCGv_i32 arg, int flags);
 void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_hswap_i32(TCGv_i32 ret, TCGv_i32 arg);
 void tcg_gen_smin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
 void tcg_gen_smax_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
 void tcg_gen_umin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
@@ -531,6 +532,8 @@ void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg);
 void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg, int flags);
 void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg, int flags);
 void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_hswap_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_wswap_i64(TCGv_i64 ret, TCGv_i64 arg);
 void tcg_gen_smin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
 void tcg_gen_smax_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
 void tcg_gen_umin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
@@ -1077,6 +1080,8 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t);
 #define tcg_gen_bswap32_tl tcg_gen_bswap32_i64
 #define tcg_gen_bswap64_tl tcg_gen_bswap64_i64
 #define tcg_gen_bswap_tl tcg_gen_bswap64_i64
+#define tcg_gen_hswap_tl tcg_gen_hswap_i64
+#define tcg_gen_wswap_tl tcg_gen_wswap_i64
 #define tcg_gen_concat_tl_i64 tcg_gen_concat32_i64
 #define tcg_gen_extr_i64_tl tcg_gen_extr32_i64
 #define tcg_gen_andc_tl tcg_gen_andc_i64
@@ -1192,6 +1197,7 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t);
 #define tcg_gen_bswap16_tl tcg_gen_bswap16_i32
 #define tcg_gen_bswap32_tl(D, S, F) tcg_gen_bswap32_i32(D, S)
 #define tcg_gen_bswap_tl tcg_gen_bswap32_i32
+#define tcg_gen_hswap_tl tcg_gen_hswap_i32
 #define tcg_gen_concat_tl_i64 tcg_gen_concat_i32_i64
 #define tcg_gen_extr_i64_tl tcg_gen_extr_i64_i32
 #define tcg_gen_andc_tl tcg_gen_andc_i32
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 65e1c94c2d..379adb4b9f 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -1056,6 +1056,12 @@ void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg)
 }
 }
 
+void tcg_gen_hswap_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+/* Swapping 2 16-bit elements is a rotate. */
+tcg_gen_rotli_i32(ret, arg, 16);
+}
+
 void tcg_gen_smin_i32(TCGv_i32 ret, TCGv_i32 a, TCGv_i32 b)
 {
 tcg_gen_movcond_i32(TCG_COND_LT, ret, a, b, a, b);
@@ -1792,6 +1798,30 @@ void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
 }
 }
 
+void tcg_gen_hswap_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+uint64_t m = 0x0000ffff0000ffffull;
+TCGv_i64 t0 = tcg_temp_new_i64();
+TCGv_i64 t1 = tcg_temp_new_i64();
+
+/* See include/qemu/bitops.h, hswap64. */
+tcg_gen_rotli_i64(t1, arg, 32);
+tcg_gen_andi_i64(t0, t1, m);
+tcg_gen_shri_i64(t1, t1, 16);
+tcg_gen_shli_i64(t0, t0, 16);
+tcg_gen_andi_i64(t1, t1, m);
+tcg_gen_or_i64(ret, t0, t1);
+
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+}
+
+void tcg_gen_wswap_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+/* Swapping 2 32-bit elements is a rotate. */
+tcg_gen_rotli_i64(ret, arg, 32);
+}
+
 void tcg_gen_not_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
 if (TCG_TARGET_REG_BITS == 32) {
-- 
2.25.1

[PULL 10/15] tests/qemu-iotests: validate NBD TLS with hostname mismatch

From: Daniel P. Berrangé 

This validates that connections to an NBD server where the certificate
hostname does not match will fail. It further validates that using the
new 'tls-hostname' override option can solve the failure.

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-11-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/common.tls |  7 ---
 tests/qemu-iotests/233| 18 ++
 tests/qemu-iotests/233.out| 16 
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/common.tls b/tests/qemu-iotests/common.tls
index 6ba28a78d3c8..4a5760949d0f 100644
--- a/tests/qemu-iotests/common.tls
+++ b/tests/qemu-iotests/common.tls
@@ -118,12 +118,13 @@ tls_x509_create_server()
 caname=$1
 name=$2

+# We don't include 'localhost' in the cert, as
+# we want to keep it unlisted to let tests
+# validate hostname override
 mkdir -p "${tls_dir}/$name"
 cat > "${tls_dir}/cert.info" <&1 | _filter_qemu_nbd_exports

+echo
+echo "== check TLS fail over TCP with mismatched hostname =="
+obj1=tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0
+$QEMU_IMG info --image-opts --object $obj1 \
+driver=nbd,host=localhost,port=$nbd_tcp_port,tls-creds=tls0 \
+2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -b localhost -p $nbd_tcp_port --object $obj1 \
+--tls-creds=tls0 | _filter_qemu_nbd_exports
+
+echo
+echo "== check TLS works over TCP with mismatched hostname and override =="
+obj1=tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0
+$QEMU_IMG info --image-opts --object $obj1 \
+driver=nbd,host=localhost,port=$nbd_tcp_port,tls-creds=tls0,tls-hostname=127.0.0.1 \
+2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -b localhost -p $nbd_tcp_port --object $obj1 \
+--tls-creds=tls0 --tls-hostname=127.0.0.1 | _filter_qemu_nbd_exports
+
 echo
 echo "== check TLS with different CA fails =="
 obj=tls-creds-x509,dir=${tls_dir}/client2,endpoint=client,id=tls0
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index 67a027d87986..d42611bf74a6 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -38,6 +38,20 @@ exports available: 1
   size:  67108864
   min block: 1

+== check TLS fail over TCP with mismatched hostname ==
+qemu-img: Could not open 'driver=nbd,host=localhost,port=PORT,tls-creds=tls0': Certificate does not match the hostname localhost
+qemu-nbd: Certificate does not match the hostname localhost
+
+== check TLS works over TCP with mismatched hostname and override ==
+image: nbd://localhost:PORT
+file format: nbd
+virtual size: 64 MiB (67108864 bytes)
+disk size: unavailable
+exports available: 1
+ export: ''
+  size:  67108864
+  min block: 1
+
 == check TLS with different CA fails ==
 qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': The certificate hasn't got a known issuer
 qemu-nbd: The certificate hasn't got a known issuer
@@ -55,6 +69,8 @@ qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': F
 qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read option reply: Cannot read from TLS channel: Software caused connection abort

 == final server log ==
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
-- 
2.35.1

[PATCH v3 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2

Hi David,

I've split up the patches a bit, made some improvements to
the shifts and reversals, and fixed a few bugs.

Please especially review vector string search, as that
has had major changes.


r~


David Miller (9):
  target/s390x: vxeh2: vector convert short/32b
  target/s390x: vxeh2: vector string search
  target/s390x: vxeh2: Update for changes to vector shifts
  target/s390x: vxeh2: vector shift double by bit
  target/s390x: vxeh2: vector {load, store} elements reversed
  target/s390x: vxeh2: vector {load, store} byte reversed elements
  target/s390x: vxeh2: vector {load, store} byte reversed element
  target/s390x: add S390_FEAT_VECTOR_ENH2 to cpu max
  tests/tcg/s390x: Tests for Vector Enhancements Facility 2

Richard Henderson (2):
  tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
  target/s390x: Fix writeback to v1 in helper_vstl

 include/tcg/tcg-op.h |   6 +
 target/s390x/helper.h|  13 +
 target/s390x/gen-features.c  |   2 +
 target/s390x/tcg/translate.c |   3 +-
 target/s390x/tcg/vec_fpu_helper.c|  31 ++
 target/s390x/tcg/vec_helper.c|   2 -
 target/s390x/tcg/vec_int_helper.c|  58 
 target/s390x/tcg/vec_string_helper.c | 101 ++
 tcg/tcg-op.c |  30 ++
 tests/tcg/s390x/vxeh2_vcvt.c |  97 ++
 tests/tcg/s390x/vxeh2_vlstr.c| 146 +
 tests/tcg/s390x/vxeh2_vs.c   |  91 ++
 target/s390x/tcg/translate_vx.c.inc  | 442 ---
 target/s390x/tcg/insn-data.def   |  40 ++-
 tests/tcg/s390x/Makefile.target  |   8 +
 15 files changed, 1018 insertions(+), 52 deletions(-)
 create mode 100644 tests/tcg/s390x/vxeh2_vcvt.c
 create mode 100644 tests/tcg/s390x/vxeh2_vlstr.c
 create mode 100644 tests/tcg/s390x/vxeh2_vs.c

-- 
2.25.1

[PATCH v3 0/5] iotests: add enhanced debugging info to qemu-img failures

V3:
 - Rebase on origin/master
 - Expand 3/5 to cover new uses upstream
 - Fix reflow nit by eblake on 3/5

V2:
 - Rebase on top of kwolf's latest PR.
 - Adjust tests/graph-changes-while-io in patch 3/5
 - Drop eblake's r-b on 3/5.

This is a series I started in response to Thomas Huth's encountering a
failure in qemu-img because of missing zstd support. This series changes
the qemu_img() function in iotests.py to one that raises an Exception on
non-zero return code by default.

Alongside this, the Exception object itself is also augmented so that it
prints the stdout/stderr logs to screen if the exception goes unhandled
so that failure cases are very obvious and easy to spot in the middle of
python tracebacks.

(Test this out yourself: Disable zstd support and then run qcow2 iotest
065 before and after this patchset. It makes a real difference!)

NOTES:

(1) I have another 13-ish patches that go the rest of the way and ensure
that *every* call to qemu-img goes through this new qemu_img() function,
but for the sake of doing the most good in the shortest amount of time,
I am sending just the first 5 patches, and the rest will be sent
later. I think this is a very good series to get in before freeze so
that we have it during the heavy testing season.

(2) ... And then another 10 or so to give the same treatment to all
qemu_io() calls.

(3) ... And I'm working on the same for qemu_nbd(). Ultimately I want to
make every last subprocess call one that's checked and can produce nice
diagnostic info to the terminal if it goes unhandled.

John Snow (5):
  python/utils: add add_visual_margin() text decoration utility
  python/utils: add VerboseProcessError
  iotests: Remove explicit checks for qemu_img() == 0
  iotests: make qemu_img raise on non-zero rc by default
  iotests: fortify compare_images() against crashes

 python/qemu/utils/__init__.py | 114 ++
 tests/qemu-iotests/163|   9 +-
 tests/qemu-iotests/216|   6 +-
 tests/qemu-iotests/218|   2 +-
 tests/qemu-iotests/224|  11 +-
 tests/qemu-iotests/228|  12 +-
 tests/qemu-iotests/257|  11 +-
 tests/qemu-iotests/258|   4 +-
 tests/qemu-iotests/310|  13 +-
 tests/qemu-iotests/iotests.py |  71 +--
 tests/qemu-iotests/tests/block-status-cache   |   3 +-
 .../qemu-iotests/tests/graph-changes-while-io |   7 +-
 tests/qemu-iotests/tests/image-fleecing   |  10 +-
 .../tests/mirror-ready-cancel-error   |   6 +-
 tests/qemu-iotests/tests/mirror-top-perms |   3 +-
 .../tests/remove-bitmap-from-backing  |   8 +-
 .../qemu-iotests/tests/stream-error-on-reset  |   4 +-
 17 files changed, 226 insertions(+), 68 deletions(-)

-- 
2.34.1

[PATCH v3 04/11] target/s390x: vxeh2: Update for changes to vector shifts

From: David Miller 

Prior to vector enhancements 2, the shift count was supposed to be equal
for each byte lest the result be unpredictable, which allowed us to assume
that the shift count was the same, and optimize accordingly.

With vector enhancements 2, the shift count is allowed to be different
for each byte, and we must cope with that.
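
As a rough model of what the new *_ve2 helpers in this patch compute (my
reading of the C code, not a quote from the architecture document), here is a
small Python sketch. The function names are hypothetical; byte 0 is the
leftmost byte of the vector.

```python
def vsl_per_byte(src: bytes, counts: bytes) -> bytes:
    """Shift each byte left by its own count (0-7); bits come in from the right neighbour."""
    assert len(src) == len(counts) == 16
    out = bytearray(16)
    for i in range(16):
        sh = counts[i] & 7
        right = src[i + 1] if i < 15 else 0  # zeros enter the last byte
        out[i] = ((((src[i] << 8) | right) << sh) >> 8) & 0xFF
    return bytes(out)


def vsr_per_byte(src: bytes, counts: bytes, arithmetic: bool = False) -> bytes:
    """Shift each byte right by its own count (0-7); bits come in from the left neighbour."""
    assert len(src) == len(counts) == 16
    out = bytearray(16)
    for i in range(16):
        sh = counts[i] & 7
        if i > 0:
            left = src[i - 1]
        else:
            # First byte: zeros for a logical shift, copies of the sign bit
            # for an arithmetic shift.
            left = 0xFF if (arithmetic and src[0] & 0x80) else 0
        out[i] = (((left << 8) | src[i]) >> sh) & 0xFF
    return bytes(out)
```

With the same count in every byte this degenerates to the old whole-vector
shift by 0-7 bits, which is a handy sanity check for the helpers.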

Signed-off-by: David Miller 
Message-Id: <20220307020327.3003-4-dmiller...@gmail.com>
[rth: Split out of larger patch; simplify shift/merge code.]
Signed-off-by: Richard Henderson 
---
 target/s390x/helper.h   |  3 ++
 target/s390x/tcg/vec_int_helper.c   | 58 ++
 target/s390x/tcg/translate_vx.c.inc | 77 -
 target/s390x/tcg/insn-data.def  | 12 ++---
 4 files changed, 99 insertions(+), 51 deletions(-)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 7412130883..bf33d86f74 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -203,8 +203,11 @@ DEF_HELPER_FLAGS_3(gvec_vpopct16, TCG_CALL_NO_RWG, void, ptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_verim8, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_verim16, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vsl, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32)
+DEF_HELPER_FLAGS_4(gvec_vsl_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vsra, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32)
+DEF_HELPER_FLAGS_4(gvec_vsra_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vsrl, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32)
+DEF_HELPER_FLAGS_4(gvec_vsrl_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vscbi8, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vscbi16, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
 DEF_HELPER_4(gvec_vtm, void, ptr, cptr, env, i32)
diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c
index 5561b3ed90..a881d5d267 100644
--- a/target/s390x/tcg/vec_int_helper.c
+++ b/target/s390x/tcg/vec_int_helper.c
@@ -540,18 +540,76 @@ void HELPER(gvec_vsl)(void *v1, const void *v2, uint64_t count,
 s390_vec_shl(v1, v2, count);
 }
 
+void HELPER(gvec_vsl_ve2)(void *v1, const void *v2, const void *v3,
+  uint32_t desc)
+{
+S390Vector tmp;
+uint32_t sh, e0, e1 = 0;
+
+for (int i = 15; i >= 0; --i, e1 = e0 << 24) {
+e0 = s390_vec_read_element8(v2, i);
+sh = s390_vec_read_element8(v3, i) & 7;
+
+s390_vec_write_element8(&tmp, i, rol32(e0 | e1, sh));
+}
+
+*(S390Vector *)v1 = tmp;
+}
+
 void HELPER(gvec_vsra)(void *v1, const void *v2, uint64_t count,
uint32_t desc)
 {
 s390_vec_sar(v1, v2, count);
 }
 
+void HELPER(gvec_vsra_ve2)(void *v1, const void *v2, const void *v3,
+   uint32_t desc)
+{
+S390Vector tmp;
+uint32_t sh, e0, e1;
+int i = 0;
+
+e0 = s390_vec_read_element8(v2, 0);
+e1 = -(e0 >> 7) << 8;
+
+for (;;) {
+sh = s390_vec_read_element8(v3, i) & 7;
+
+s390_vec_write_element8(&tmp, i, (e0 | e1) >> sh);
+
+if (++i >= 16) {
+break;
+}
+
+e1 = e0 << 8;
+e0 = s390_vec_read_element8(v2, i);
+}
+
+*(S390Vector *)v1 = tmp;
+}
+
 void HELPER(gvec_vsrl)(void *v1, const void *v2, uint64_t count,
uint32_t desc)
 {
 s390_vec_shr(v1, v2, count);
 }
 
+void HELPER(gvec_vsrl_ve2)(void *v1, const void *v2, const void *v3,
+   uint32_t desc)
+{
+S390Vector tmp;
+uint32_t sh, e0, e1 = 0;
+
+for (int i = 0; i < 16; ++i, e1 = e0 << 8) {
+e0 = s390_vec_read_element8(v2, i);
+sh = s390_vec_read_element8(v3, i) & 7;
+
+s390_vec_write_element8(&tmp, i, (e0 | e1) >> sh);
+}
+
+*(S390Vector *)v1 = tmp;
+}
+
 #define DEF_VSCBI(BITS)                                                        \
 void HELPER(gvec_vscbi##BITS)(void *v1, const void *v2, const void *v3,        \
                               uint32_t desc)                                   \
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index d514e8b218..967f6213d8 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2018,21 +2018,42 @@ static DisasJumpType op_ves(DisasContext *s, DisasOps *o)
 return DISAS_NEXT;
 }
 
+static DisasJumpType gen_vsh_bit_byte(DisasContext *s, DisasOps *o,
+  gen_helper_gvec_2i *gen,
+  gen_helper_gvec_3 *gen_ve2)
+{
+bool byte = s->insn->data;
+
+if (!byte && s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+gen_gvec_3_ool(get_field(s, v1), get_field(s, v2),
+   get_field(s, v3), 0, gen_ve2);
+} else {
+TCGv_i64 shift = tcg_temp_new_i64();
+
+read_vec_element_i64(shift, get_field(s, v3), 7, ES_8);
+

[PULL 12/15] tests/qemu-iotests: validate NBD TLS with UNIX sockets and PSK

From: Daniel P. Berrangé 

This validates that connections to an NBD server running on a UNIX
socket can use TLS with pre-shared keys (PSK).

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-13-berra...@redhat.com>
[eblake: squash in rebase fix]
Tested-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/common.tls | 24 
 tests/qemu-iotests/233| 28 
 tests/qemu-iotests/233.out| 18 ++
 3 files changed, 70 insertions(+)

diff --git a/tests/qemu-iotests/common.tls b/tests/qemu-iotests/common.tls
index 4a5760949d0f..b9c546298610 100644
--- a/tests/qemu-iotests/common.tls
+++ b/tests/qemu-iotests/common.tls
@@ -24,6 +24,7 @@ tls_x509_cleanup()
 {
 rm -f "${tls_dir}"/*.pem
 rm -f "${tls_dir}"/*/*.pem
+rm -f "${tls_dir}"/*/*.psk
 rmdir "${tls_dir}"/*
 rmdir "${tls_dir}"
 }
@@ -40,6 +41,18 @@ tls_certtool()
 rm -f "${tls_dir}"/certtool.log
 }

+tls_psktool()
+{
+psktool "$@" 1>"${tls_dir}"/psktool.log 2>&1
+if test "$?" = 0; then
+  head -1 "${tls_dir}"/psktool.log
+else
+  cat "${tls_dir}"/psktool.log
+fi
+rm -f "${tls_dir}"/psktool.log
+}
+
+
 tls_x509_init()
 {
 (certtool --help) >/dev/null 2>&1 || \
@@ -176,3 +189,14 @@ EOF

 rm -f "${tls_dir}/cert.info"
 }
+
+tls_psk_create_creds()
+{
+name=$1
+
+mkdir -p "${tls_dir}/$name"
+
+tls_psktool \
+   --pskfile "${tls_dir}/$name/keys.psk" \
+   --username "$name"
+}
diff --git a/tests/qemu-iotests/233 b/tests/qemu-iotests/233
index 442fd1378c1d..55db5b3811fd 100755
--- a/tests/qemu-iotests/233
+++ b/tests/qemu-iotests/233
@@ -61,6 +61,8 @@ tls_x509_create_server "ca1" "server1"
 tls_x509_create_client "ca1" "client1"
 tls_x509_create_client "ca2" "client2"
 tls_x509_create_client "ca1" "client3"
+tls_psk_create_creds "psk1"
+tls_psk_create_creds "psk2"

 echo
 echo "== preparing image =="
@@ -191,6 +193,32 @@ $QEMU_IMG info --image-opts --object $obj1 \
 $QEMU_NBD_PROG -L -k $nbd_unix_socket --object $obj1 \
 --tls-creds=tls0 --tls-hostname=127.0.0.1  2>&1 | _filter_qemu_nbd_exports

+
+echo
+echo "== check TLS works over UNIX with PSK =="
+nbd_server_stop
+
+nbd_server_start_unix_socket \
+--object tls-creds-psk,dir=${tls_dir}/psk1,endpoint=server,id=tls0,verify-peer=on \
+--tls-creds tls0 \
+-f $IMGFMT "$TEST_IMG" 2>> "$TEST_DIR/server.log"
+
+obj1=tls-creds-psk,dir=${tls_dir}/psk1,username=psk1,endpoint=client,id=tls0
+$QEMU_IMG info --image-opts --object $obj1 \
+driver=nbd,path=$nbd_unix_socket,tls-creds=tls0 \
+2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -k $nbd_unix_socket --object $obj1 \
+--tls-creds=tls0 2>&1 | _filter_qemu_nbd_exports
+
+echo
+echo "== check TLS fails over UNIX with mismatch PSK =="
+obj1=tls-creds-psk,dir=${tls_dir}/psk2,username=psk2,endpoint=client,id=tls0
+$QEMU_IMG info --image-opts --object $obj1 \
+driver=nbd,path=$nbd_unix_socket,tls-creds=tls0 \
+2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -k $nbd_unix_socket --object $obj1 \
+--tls-creds=tls0 2>&1 | _filter_qemu_nbd_exports
+
 echo
 echo "== final server log =="
 cat "$TEST_DIR/server.log" | _filter_authz_check_tls
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index 6e55be779946..237c82767ea3 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -7,6 +7,8 @@ Generating a signed certificate...
 Generating a signed certificate...
 Generating a signed certificate...
 Generating a signed certificate...
+Generating a random key for user 'psk1'
+Generating a random key for user 'psk2'

 == preparing image ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
@@ -82,6 +84,20 @@ exports available: 1
   size:  67108864
   min block: 1

+== check TLS works over UNIX with PSK ==
+image: nbd+unix://?socket=SOCK_DIR/qemu-nbd.sock
+file format: nbd
+virtual size: 64 MiB (67108864 bytes)
+disk size: unavailable
+exports available: 1
+ export: ''
+  size:  67108864
+  min block: 1
+
+== check TLS fails over UNIX with mismatch PSK ==
+qemu-img: Could not open 'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0': TLS handshake failed: The TLS connection was non-properly terminated.
+qemu-nbd: TLS handshake failed: The TLS connection was non-properly terminated.
+
 == final server log ==
 qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
 qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
@@ -91,4 +107,6 @@ qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME
 qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
 qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
 qemu-nbd: option negotiation failed:

Re: [PATCH v4 03/12] mm: Introduce memfile_notifier

2022-03-07 Thread Chao Peng

On Mon, Mar 07, 2022 at 04:42:08PM +0100, Vlastimil Babka wrote:
> On 1/18/22 14:21, Chao Peng wrote:
> > This patch introduces memfile_notifier facility so existing memory file
> > subsystems (e.g. tmpfs/hugetlbfs) can provide memory pages to allow a
> > third kernel component to make use of memory bookmarked in the memory
> > file and gets notified when the pages in the memory file become
> > allocated/invalidated.
> > 
> > It will be used for KVM to use a file descriptor as the guest memory
> > backing store and KVM will use this memfile_notifier interface to
> > interact with memory file subsystems. In the future there might be other
> > consumers (e.g. VFIO with encrypted device memory).
> > 
> > It consists of two sets of callbacks:
> >   - memfile_notifier_ops: callbacks for memory backing store to notify
> > KVM when memory gets allocated/invalidated.
> >   - memfile_pfn_ops: callbacks for KVM to call into memory backing store
> > to request memory pages for guest private memory.
> > 
> > Userspace is in charge of guest memory lifecycle: it first allocates
> > pages in memory backing store and then passes the fd to KVM and lets KVM
> > register each memory slot to memory backing store via
> > memfile_register_notifier.
> > 
> > The supported memory backing store should maintain a memfile_notifier list
> > and provide routine for memfile_notifier to get the list head address and
> > memfile_pfn_ops callbacks for memfile_register_notifier. It also should call
> > memfile_notifier_fallocate/memfile_notifier_invalidate when the bookmarked
> > memory gets allocated/invalidated.
> > 
> > Signed-off-by: Kirill A. Shutemov 
> 
> Process nitpick:
> Here and in patch 4/12 you have Kirill's S-o-b so there should probably be
> also "From: Kirill ..." as was in v3? Or in case you modified the original
> patches so much to become the primary author, you should add
> "Co-developed-by: Kirill ..." here before his S-o-b.

Thanks. 3/12 has been largely rewritten, so the latter case applies.
4/12 should keep Kirill as the primary author.

Chao

[PULL 09/15] tests/qemu-iotests: convert NBD TLS test to use standard filters

From: Daniel P. Berrangé 

Using standard filters is more future proof than rolling our own.

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-10-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/233 | 29 -
 tests/qemu-iotests/233.out |  8 
 2 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/tests/qemu-iotests/233 b/tests/qemu-iotests/233
index 9ca7b68f42cf..050267298d67 100755
--- a/tests/qemu-iotests/233
+++ b/tests/qemu-iotests/233
@@ -65,7 +65,7 @@ tls_x509_create_client "ca1" "client3"
 echo
 echo "== preparing image =="
 _make_test_img 64M
-$QEMU_IO -c 'w -P 0x11 1m 1m' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'w -P 0x11 1m 1m' "$TEST_IMG" 2>&1 | _filter_qemu_io

 echo
 echo "== check TLS client to plain server fails =="
@@ -74,9 +74,9 @@ nbd_server_start_tcp_socket -f $IMGFMT "$TEST_IMG" 2> "$TEST_DIR/server.log"
 obj=tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0
 $QEMU_IMG info --image-opts --object $obj \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
-2>&1 | sed "s/$nbd_tcp_port/PORT/g"
+2>&1 | _filter_nbd
 $QEMU_NBD_PROG -L -b $nbd_tcp_addr -p $nbd_tcp_port --object $obj \
---tls-creds=tls0
+--tls-creds=tls0 2>&1 | _filter_qemu_nbd_exports

 nbd_server_stop

@@ -88,8 +88,10 @@ nbd_server_start_tcp_socket \
 --tls-creds tls0 \
 -f $IMGFMT "$TEST_IMG" 2>> "$TEST_DIR/server.log"

-$QEMU_IMG info nbd://localhost:$nbd_tcp_port 2>&1 | sed "s/$nbd_tcp_port/PORT/g"
-$QEMU_NBD_PROG -L -b $nbd_tcp_addr -p $nbd_tcp_port
+$QEMU_IMG info nbd://localhost:$nbd_tcp_port \
+2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -b $nbd_tcp_addr -p $nbd_tcp_port \
+2>&1 | _filter_qemu_nbd_exports

 echo
 echo "== check TLS works =="
@@ -97,21 +99,21 @@ obj1=tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0
 obj2=tls-creds-x509,dir=${tls_dir}/client3,endpoint=client,id=tls0
 $QEMU_IMG info --image-opts --object $obj1 \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
-2>&1 | sed "s/$nbd_tcp_port/PORT/g"
+2>&1 | _filter_nbd
 $QEMU_IMG info --image-opts --object $obj2 \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
-2>&1 | sed "s/$nbd_tcp_port/PORT/g"
+2>&1 | _filter_nbd
 $QEMU_NBD_PROG -L -b $nbd_tcp_addr -p $nbd_tcp_port --object $obj1 \
---tls-creds=tls0
+--tls-creds=tls0 2>&1 | _filter_qemu_nbd_exports

 echo
 echo "== check TLS with different CA fails =="
 obj=tls-creds-x509,dir=${tls_dir}/client2,endpoint=client,id=tls0
 $QEMU_IMG info --image-opts --object $obj \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
-2>&1 | sed "s/$nbd_tcp_port/PORT/g"
+2>&1 | _filter_nbd
 $QEMU_NBD_PROG -L -b $nbd_tcp_addr -p $nbd_tcp_port --object $obj \
---tls-creds=tls0
+--tls-creds=tls0 2>&1 | _filter_qemu_nbd_exports

 echo
 echo "== perform I/O over TLS =="
@@ -121,7 +123,8 @@ $QEMU_IO -c 'r -P 0x11 1m 1m' -c 'w -P 0x22 1m 1m' --image-opts \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
 2>&1 | _filter_qemu_io

-$QEMU_IO -f $IMGFMT -r -U -c 'r -P 0x22 1m 1m' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -f $IMGFMT -r -U -c 'r -P 0x22 1m 1m' "$TEST_IMG" \
+2>&1 | _filter_qemu_io

 echo
 echo "== check TLS with authorization =="
@@ -139,12 +142,12 @@ nbd_server_start_tcp_socket \
 $QEMU_IMG info --image-opts \
 --object tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0 \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
-2>&1 | sed "s/$nbd_tcp_port/PORT/g"
+2>&1 | _filter_nbd

 $QEMU_IMG info --image-opts \
 --object tls-creds-x509,dir=${tls_dir}/client3,endpoint=client,id=tls0 \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
-2>&1 | sed "s/$nbd_tcp_port/PORT/g"
+2>&1 | _filter_nbd

 echo
 echo "== final server log =="
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index 4b1f6a0e1513..67a027d87986 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -17,15 +17,12 @@ wrote 1048576/1048576 bytes at offset 1048576
 qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Denied by server for option 5 (starttls)
 server reported: TLS not configured
 qemu-nbd: Denied by server for option 5 (starttls)
-server reported: TLS not configured

 == check plain client to TLS server fails ==
 qemu-img: Could not open 'nbd://localhost:PORT': TLS negotiation required before option 7 (go)
 Did you forget a valid tls-creds?
 server reported: Option 0x7 not permitted before TLS
 qemu-nbd: TLS negotiation required before option 3 (list)
-Did you forget a valid tls-creds?
-server reported: Option 0x3 not permitted before TLS

 == check TLS works ==
 image: nbd://127.0.0.1:PORT
@@ -39,12 +36,7 @@ disk size: unavailable
 exports available: 1
  export: ''
   size:  67108864
-

[PULL 05/15] block/nbd: don't restrict TLS usage to IP sockets

From: Daniel P. Berrangé 

The TLS usage for NBD was restricted to IP sockets because validating
x509 certificates requires knowledge of the hostname that the client
is connecting to.

TLS does not have to use x509 certificates though, as PSK (pre-shared
keys) provide an alternative credential option. These have no
requirement for a hostname and can thus be trivially used for UNIX
sockets.

Furthermore, with the ability to override the default hostname for
TLS validation in the previous patch, it is now also valid to want
to use x509 certificates with FD passing and UNIX sockets.
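
The resulting hostname selection in the block driver can be modelled roughly
as follows. This is only a sketch of the logic in the diff below; the function
name and the string address types are assumptions for illustration.

```python
from typing import Optional


def pick_tls_hostname(explicit: Optional[str], addr_type: str,
                      inet_host: Optional[str] = None) -> Optional[str]:
    """Return the name to validate an x509 certificate against, if any."""
    if explicit is not None:
        # An explicit tls-hostname option always wins.
        return explicit
    if addr_type == "inet":
        # Only IP sockets carry a usable default hostname.
        return inet_host
    # UNIX sockets and passed FDs have no inherent hostname; this is fine
    # for PSK credentials, which do not need one.
    return None
```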

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-6-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 block/nbd.c| 8 ++--
 blockdev-nbd.c | 6 --
 qemu-nbd.c | 8 +++-
 3 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 0a9b6cde5bd3..34b9429de387 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1839,13 +1839,9 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
 goto error;
 }

-/* TODO SOCKET_ADDRESS_KIND_FD where fd has AF_INET or AF_INET6 */
-if (s->saddr->type != SOCKET_ADDRESS_TYPE_INET) {
-error_setg(errp, "TLS only supported over IP sockets");
-goto error;
-}
 s->tlshostname = g_strdup(qemu_opt_get(opts, "tls-hostname"));
-if (!s->tlshostname) {
+if (!s->tlshostname &&
+s->saddr->type == SOCKET_ADDRESS_TYPE_INET) {
 s->tlshostname = g_strdup(s->saddr->u.inet.host);
 }
 }
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index bdfa7ed3a5a9..9840d25a8298 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -148,12 +148,6 @@ void nbd_server_start(SocketAddress *addr, const char *tls_creds,
 if (!nbd_server->tlscreds) {
 goto error;
 }
-
-/* TODO SOCKET_ADDRESS_TYPE_FD where fd has AF_INET or AF_INET6 */
-if (addr->type != SOCKET_ADDRESS_TYPE_INET) {
-error_setg(errp, "TLS is only supported with IPv4/IPv6");
-goto error;
-}
 }

 nbd_server->tlsauthz = g_strdup(tls_authz);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 18d281aba3d1..713e7557a9eb 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -808,7 +808,9 @@ int main(int argc, char **argv)

 socket_activation = check_socket_activation();
 if (socket_activation == 0) {
-setup_address_and_port(&bindto, &port);
+if (!sockpath) {
+setup_address_and_port(&bindto, &port);
+}
 } else {
 /* Using socket activation - check user didn't use -p etc. */
 const char *err_msg = socket_activation_validate_opts(device, sockpath,
@@ -829,10 +831,6 @@ int main(int argc, char **argv)
 }

 if (tlscredsid) {
-if (sockpath) {
-error_report("TLS is only supported with IPv4/IPv6");
-exit(EXIT_FAILURE);
-}
 if (device) {
 error_report("TLS is not supported with a host device");
 exit(EXIT_FAILURE);
-- 
2.35.1

[PULL 13/15] nbd/server: Minor cleanups

Spelling fixes, grammar improvements and consistent spacing, noticed
while preparing other patches in this file.

Signed-off-by: Eric Blake 
Message-Id: <20211203231539.3900865-2-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 nbd/server.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 9fb2f264023e..ba6f71e15d49 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2084,11 +2084,10 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
  * Add extent to NBDExtentArray. If extent can't be added (no available space),
  * return -1.
  * For safety, when returning -1 for the first time, .can_add is set to false,
- * further call to nbd_extent_array_add() will crash.
- * (to avoid the situation, when after failing to add an extent (returned -1),
- * user miss this failure and add another extent, which is successfully added
- * (array is full, but new extent may be squashed into the last one), then we
- * have invalid array with skipped extent)
+ * and further calls to nbd_extent_array_add() will crash.
+ * (this avoids the situation where a caller ignores failure to add one extent,
+ * where adding another extent that would squash into the last array entry
+ * would result in an incorrect range reported to the client)
  */
 static int nbd_extent_array_add(NBDExtentArray *ea,
 uint32_t length, uint32_t flags)
@@ -2287,7 +2286,7 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request,
 assert(client->recv_coroutine == qemu_coroutine_self());
 ret = nbd_receive_request(client, request, errp);
 if (ret < 0) {
-return  ret;
+return ret;
 }

 trace_nbd_co_receive_request_decode_type(request->handle, request->type,
@@ -2647,7 +2646,7 @@ static coroutine_fn void nbd_trip(void *opaque)
 }

 if (ret < 0) {
-/* It wans't -EIO, so, according to nbd_co_receive_request()
+/* It wasn't -EIO, so, according to nbd_co_receive_request()
  * semantics, we should return the error to the client. */
 Error *export_err = local_err;

-- 
2.35.1

[PULL 02/15] block: pass desired TLS hostname through from block driver client

From: Daniel P. Berrangé 

In

  commit a71d597b989fd701b923f09b3c20ac4fcaa55e81
  Author: Vladimir Sementsov-Ogievskiy 
  Date:   Thu Jun 10 13:08:00 2021 +0300

block/nbd: reuse nbd_co_do_establish_connection() in nbd_open()

the use of the 'hostname' field from the BDRVNBDState struct was
lost, and 'nbd_connect' just hardcoded it to match the IP socket
address. This was a harmless bug at the time, since we block TLS use
with anything other than IP sockets.

Shortly though, we want to allow the caller to override the hostname
used in the TLS certificate checks. This is to allow for TLS
when doing port forwarding or tunneling. Thus we need to reinstate
the passing along of the 'hostname'.

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-3-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 include/block/nbd.h |  3 ++-
 block/nbd.c |  7 ---
 nbd/client-connection.c | 12 +---
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 78d101b77488..a98eb665da04 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -415,7 +415,8 @@ NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
bool do_negotiation,
const char *export_name,
const char *x_dirty_bitmap,
-   QCryptoTLSCreds *tlscreds);
+   QCryptoTLSCreds *tlscreds,
+   const char *tlshostname);
 void nbd_client_connection_release(NBDClientConnection *conn);

 QIOChannel *coroutine_fn
diff --git a/block/nbd.c b/block/nbd.c
index 146d25660e86..f04634905584 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -92,7 +92,7 @@ typedef struct BDRVNBDState {
 SocketAddress *saddr;
 char *export, *tlscredsid;
 QCryptoTLSCreds *tlscreds;
-const char *hostname;
+const char *tlshostname;
 char *x_dirty_bitmap;
 bool alloc_depth;

@@ -1836,7 +1836,7 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
 error_setg(errp, "TLS only supported over IP sockets");
 goto error;
 }
-s->hostname = s->saddr->u.inet.host;
+s->tlshostname = s->saddr->u.inet.host;
 }

 s->x_dirty_bitmap = g_strdup(qemu_opt_get(opts, "x-dirty-bitmap"));
@@ -1876,7 +1876,8 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
 }

 s->conn = nbd_client_connection_new(s->saddr, true, s->export,
-s->x_dirty_bitmap, s->tlscreds);
+s->x_dirty_bitmap, s->tlscreds,
+s->tlshostname);

 if (s->open_timeout) {
 nbd_client_connection_enable_retry(s->conn);
diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 2bda42641dc8..2a632931c393 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -33,6 +33,7 @@ struct NBDClientConnection {
 /* Initialization constants, never change */
 SocketAddress *saddr; /* address to connect to */
 QCryptoTLSCreds *tlscreds;
+char *tlshostname;
 NBDExportInfo initial_info;
 bool do_negotiation;
 bool do_retry;
@@ -77,7 +78,8 @@ NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
bool do_negotiation,
const char *export_name,
const char *x_dirty_bitmap,
-   QCryptoTLSCreds *tlscreds)
+   QCryptoTLSCreds *tlscreds,
+   const char *tlshostname)
 {
 NBDClientConnection *conn = g_new(NBDClientConnection, 1);

@@ -85,6 +87,7 @@ NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
 *conn = (NBDClientConnection) {
 .saddr = QAPI_CLONE(SocketAddress, saddr),
 .tlscreds = tlscreds,
+.tlshostname = g_strdup(tlshostname),
 .do_negotiation = do_negotiation,

 .initial_info.request_sizes = true,
@@ -107,6 +110,7 @@ static void nbd_client_connection_do_free(NBDClientConnection *conn)
 }
 error_free(conn->err);
 qapi_free_SocketAddress(conn->saddr);
+g_free(conn->tlshostname);
 object_unref(OBJECT(conn->tlscreds));
 g_free(conn->initial_info.x_dirty_bitmap);
 g_free(conn->initial_info.name);
@@ -120,6 +124,7 @@ static void nbd_client_connection_do_free(NBDClientConnection *conn)
  */
 static int nbd_connect(QIOChannelSocket *sioc, SocketAddress *addr,
NBDExportInfo *info, QCryptoTLSCreds *tlscreds,
+   const char

[PULL 07/15] tests/qemu-iotests: expand _filter_nbd rules

From: Daniel P. Berrangé 

Some tests will want to use 'localhost' instead of '127.0.0.1', and
some will use the image options syntax rather than the classic URI
syntax.
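
For readers who do not speak sed fluently, the four added substitutions
behave roughly like the Python below. It is an illustration only: the
function name is made up, and the SOCK_DIR rule uses a literal replace here,
whereas sed treats the expanded $SOCK_DIR as a regular expression.

```python
import re


def filter_nbd_new_rules(text: str, sock_dir: str) -> str:
    """Apply Python equivalents of the four sed rules added by this patch."""
    # URI syntax using 'localhost' rather than 127.0.0.1
    text = re.sub(r"localhost:[0-9]*", "localhost:PORT", text)
    # Image options syntax: host=...,port=...
    text = re.sub(r"host=127\.0\.0\.1,port=[0-9]*",
                  "host=127.0.0.1,port=PORT", text)
    text = re.sub(r"host=localhost,port=[0-9]*",
                  "host=localhost,port=PORT", text)
    # Image options syntax for UNIX sockets: path=...
    return text.replace(f"path={sock_dir}", "path=SOCK_DIR")
```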

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-8-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/common.filter | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 21819db9c3a5..f53d8cbb9daa 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -301,6 +301,10 @@ _filter_nbd()
 # Filter out the TCP port number since this changes between runs.
 sed -e '/nbd\/.*\.c:/d' \
 -e 's#127\.0\.0\.1:[0-9]*#127.0.0.1:PORT#g' \
+-e 's#localhost:[0-9]*#localhost:PORT#g' \
+-e 's#host=127\.0\.0\.1,port=[0-9]*#host=127.0.0.1,port=PORT#g' \
+-e 's#host=localhost,port=[0-9]*#host=localhost,port=PORT#g' \
+-e "s#path=$SOCK_DIR#path=SOCK_DIR#g" \
 -e "s#?socket=$SOCK_DIR#?socket=SOCK_DIR#g" \
 -e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
 }
-- 
2.35.1

[PULL 04/15] qemu-nbd: add --tls-hostname option for TLS certificate validation

From: Daniel P. Berrangé 

When using the --list option, qemu-nbd acts as an NBD client rather
than a server. As such when using TLS, it has a need to validate
the server certificate. This adds a --tls-hostname option which can
be used to override the default hostname used for certificate
validation.
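
The option checks added to main() amount to something like the sketch below.
It mirrors the error strings from the diff, but the function itself and its
parameter names are hypothetical, and it ignores every other option.

```python
from typing import Optional


def effective_tls_hostname(list_mode: bool, tls_creds: Optional[str],
                           tls_hostname: Optional[str],
                           bindto: Optional[str]) -> Optional[str]:
    """Validate --tls-hostname usage and return the name used in list mode."""
    if tls_hostname is not None:
        if tls_creds is None:
            raise ValueError(
                "--tls-hostname is not permitted without --tls-creds")
        if not list_mode:
            raise ValueError(
                "TLS hostname is only supported with export list")
    # An explicit hostname overrides the bind address as the default.
    return tls_hostname if tls_hostname else bindto
```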

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-5-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 docs/tools/qemu-nbd.rst | 13 +
 qemu-nbd.c  | 17 -
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 6031f9689312..2b8c90c35498 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -169,6 +169,19 @@ driver options if ``--image-opts`` is specified.
   option; or provide the credentials needed for connecting as a client
   in list mode.

+.. option:: --tls-hostname=hostname
+
+  When validating an x509 certificate received over a TLS connection,
+  the hostname that the NBD client used to connect will be checked
+  against information in the server provided certificate. Sometimes
+  it might be required to override the hostname used to perform this
+  check. For example, if the NBD client is using a tunnel from localhost
+  to connect to the remote server, the `--tls-hostname` option should
+  be used to set the officially expected hostname of the remote NBD
+  server. This can also be used if accessing NBD over a UNIX socket
+  where there is no inherent hostname available. This is only permitted
+  when acting as a NBD client with the `--list` option.
+
 .. option:: --fork

   Fork off the server process and exit the parent once the server is running.
diff --git a/qemu-nbd.c b/qemu-nbd.c
index c6c20df68a4d..18d281aba3d1 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -69,6 +69,7 @@
 #define QEMU_NBD_OPT_TLSAUTHZ  264
 #define QEMU_NBD_OPT_PID_FILE  265
 #define QEMU_NBD_OPT_SELINUX_LABEL 266
+#define QEMU_NBD_OPT_TLSHOSTNAME   267

 #define MBR_SIZE 512

@@ -542,6 +543,7 @@ int main(int argc, char **argv)
 { "export-name", required_argument, NULL, 'x' },
 { "description", required_argument, NULL, 'D' },
 { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS },
+{ "tls-hostname", required_argument, NULL, QEMU_NBD_OPT_TLSHOSTNAME },
 { "tls-authz", required_argument, NULL, QEMU_NBD_OPT_TLSAUTHZ },
 { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS },
 { "trace", required_argument, NULL, 'T' },
@@ -568,6 +570,7 @@ int main(int argc, char **argv)
 strList *bitmaps = NULL;
 bool alloc_depth = false;
 const char *tlscredsid = NULL;
+const char *tlshostname = NULL;
 bool imageOpts = false;
 bool writethrough = false; /* Client will flush as needed. */
 bool fork_process = false;
@@ -747,6 +750,9 @@ int main(int argc, char **argv)
 case QEMU_NBD_OPT_TLSCREDS:
 tlscredsid = optarg;
 break;
+case QEMU_NBD_OPT_TLSHOSTNAME:
+tlshostname = optarg;
+break;
 case QEMU_NBD_OPT_IMAGE_OPTS:
 imageOpts = true;
 break;
@@ -835,6 +841,10 @@ int main(int argc, char **argv)
 error_report("TLS authorization is incompatible with export list");
 exit(EXIT_FAILURE);
 }
+if (tlshostname && !list) {
+error_report("TLS hostname is only supported with export list");
+exit(EXIT_FAILURE);
+}
 tlscreds = nbd_get_tls_creds(tlscredsid, list, &local_err);
 if (local_err) {
 error_reportf_err(local_err, "Failed to get TLS creds: ");
@@ -845,6 +855,10 @@ int main(int argc, char **argv)
 error_report("--tls-authz is not permitted without --tls-creds");
 exit(EXIT_FAILURE);
 }
+if (tlshostname) {
+error_report("--tls-hostname is not permitted without --tls-creds");
+exit(EXIT_FAILURE);
+}
 }

 if (selinux_label) {
@@ -861,7 +875,8 @@ int main(int argc, char **argv)

 if (list) {
 saddr = nbd_build_socket_address(sockpath, bindto, port);
-return qemu_nbd_client_list(saddr, tlscreds, bindto);
+return qemu_nbd_client_list(saddr, tlscreds,
+tlshostname ? tlshostname : bindto);
 }

 #if !HAVE_NBD_DEVICE
-- 
2.35.1

[PULL 08/15] tests/qemu-iotests: introduce filter for qemu-nbd export list

From: Daniel P. Berrangé 

Introduce a filter for the output of qemu-nbd export list so it can be
reused in multiple tests.

The filter is a bit more permissive than what test 241 currently uses,
as it allows printing of the export count, along with any possible
error messages that might be emitted.
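
In Python terms the grep in the new filter keeps lines roughly like this.
The function name is made up for the example, and the sample field names in
the comment are only what I would expect the tool to print.

```python
import re

# Same alternatives as the grep pattern in the patch, each followed by ':'.
_KEEP = re.compile(r"(exports available|export|size|min block|qemu-nbd):")


def filter_qemu_nbd_exports(text: str) -> str:
    """Keep only the stable lines of a qemu-nbd --list report."""
    # Lines such as flags or other block size hints are dropped, while
    # 'qemu-nbd:' error messages pass through.
    return "\n".join(line for line in text.splitlines() if _KEEP.search(line))
```

Note that follow-up lines of an error which do not start with "qemu-nbd:"
are dropped as well.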

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-9-berra...@redhat.com>
Tested-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/common.filter | 5 +
 tests/qemu-iotests/241   | 6 +++---
 tests/qemu-iotests/241.out   | 6 ++
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index f53d8cbb9daa..9790411bf0e4 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -309,6 +309,11 @@ _filter_nbd()
 -e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
 }

+_filter_qemu_nbd_exports()
+{
+grep '\(exports available\|export\|size\|min block\|qemu-nbd\):'
+}
+
 _filter_qmp_empty_return()
 {
 grep -v '{"return": {}}'
diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
index c962c8b6075d..f196650afad0 100755
--- a/tests/qemu-iotests/241
+++ b/tests/qemu-iotests/241
@@ -58,7 +58,7 @@ echo

 nbd_server_start_unix_socket -f $IMGFMT "$TEST_IMG_FILE"

-$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | _filter_qemu_nbd_exports
 $QEMU_IMG map -f raw --output=json "$TEST_IMG" | _filter_qemu_img_map
 $QEMU_IO -f raw -c map "$TEST_IMG"
 nbd_server_stop
@@ -71,7 +71,7 @@ echo
 # sector alignment, here at the server.
 nbd_server_start_unix_socket "$TEST_IMG_FILE" 2> "$TEST_DIR/server.log"

-$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | _filter_qemu_nbd_exports
 $QEMU_IMG map -f raw --output=json "$TEST_IMG" | _filter_qemu_img_map
 $QEMU_IO -f raw -c map "$TEST_IMG"
 nbd_server_stop
@@ -84,7 +84,7 @@ echo
 # Now force sector alignment at the client.
 nbd_server_start_unix_socket -f $IMGFMT "$TEST_IMG_FILE"

-$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
+$QEMU_NBD_PROG --list -k $nbd_unix_socket | _filter_qemu_nbd_exports
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
 $QEMU_IO -c map "$TEST_IMG"
 nbd_server_stop
diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out
index 56e95b599a3d..88e8cfcd7e25 100644
--- a/tests/qemu-iotests/241.out
+++ b/tests/qemu-iotests/241.out
@@ -2,6 +2,8 @@ QA output created by 241

 === Exporting unaligned raw image, natural alignment ===

+exports available: 1
+ export: ''
   size:  1024
   min block: 1
 [{ "start": 0, "length": 1000, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
@@ -10,6 +12,8 @@ QA output created by 241

 === Exporting unaligned raw image, forced server sector alignment ===

+exports available: 1
+ export: ''
   size:  1024
   min block: 512
 [{ "start": 0, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET}]
@@ -20,6 +24,8 @@ WARNING: Image format was not specified for 'TEST_DIR/t.raw' and probing guessed

 === Exporting unaligned raw image, forced client sector alignment ===

+exports available: 1
+ export: ''
   size:  1024
   min block: 1
 [{ "start": 0, "length": 1000, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-- 
2.35.1

[PULL 03/15] block/nbd: support override of hostname for TLS certificate validation

From: Daniel P. Berrangé 

When connecting to an NBD server with TLS and x509 credentials,
the client must validate the hostname it uses for the connection,
against that published in the server's certificate. If the client
is tunnelling its connection over some other channel, however, the
hostname it uses may not match the info reported in the server's
certificate. In such a case, the user needs to explicitly set an
override for the hostname to use for certificate validation.

This is achieved by adding a 'tls-hostname' property to the NBD
block driver.
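
The selection rule implemented in the block/nbd.c hunk below can be sketched
in a few lines of Python. This is only an illustration: the helper name
`effective_tls_hostname` and the example host names are hypothetical, and the
C code in the patch is authoritative.

```python
from typing import Optional


def effective_tls_hostname(override: Optional[str], inet_host: str) -> str:
    """Pick the name that is checked against the server certificate."""
    # Mirrors the NULL check in the patch: only a missing 'tls-hostname'
    # option falls back to the host used for the socket connection.
    return override if override is not None else inet_host


# Hypothetical tunnelled setup: the client connects to a local forwarder,
# but the certificate was issued for the real server name.
tunnelled = effective_tls_hostname("nbd.example.com", "localhost")
direct = effective_tls_hostname(None, "nbd.example.com")
```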

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-4-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 qapi/block-core.json |  3 +++
 block/nbd.c  | 18 +++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index f13b5ff942b6..e89f2dfb5be7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4079,6 +4079,8 @@
 #
 # @tls-creds: TLS credentials ID
 #
+# @tls-hostname: TLS hostname override for certificate validation (Since 7.0)
+#
 # @x-dirty-bitmap: A metadata context name such as "qemu:dirty-bitmap:NAME"
 #  or "qemu:allocation-depth" to query in place of the
 #  traditional "base:allocation" block status (see
@@ -4109,6 +4111,7 @@
   'data': { 'server': 'SocketAddress',
 '*export': 'str',
 '*tls-creds': 'str',
+'*tls-hostname': 'str',
 '*x-dirty-bitmap': { 'type': 'str', 'features': [ 'unstable' ] },
 '*reconnect-delay': 'uint32',
 '*open-timeout': 'uint32' } }
diff --git a/block/nbd.c b/block/nbd.c
index f04634905584..0a9b6cde5bd3 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -90,9 +90,10 @@ typedef struct BDRVNBDState {
 uint32_t reconnect_delay;
 uint32_t open_timeout;
 SocketAddress *saddr;
-char *export, *tlscredsid;
+char *export;
+char *tlscredsid;
 QCryptoTLSCreds *tlscreds;
-const char *tlshostname;
+char *tlshostname;
 char *x_dirty_bitmap;
 bool alloc_depth;

@@ -121,6 +122,8 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)
 s->export = NULL;
 g_free(s->tlscredsid);
 s->tlscredsid = NULL;
+g_free(s->tlshostname);
+s->tlshostname = NULL;
 g_free(s->x_dirty_bitmap);
 s->x_dirty_bitmap = NULL;
 }
@@ -1765,6 +1768,11 @@ static QemuOptsList nbd_runtime_opts = {
 .type = QEMU_OPT_STRING,
 .help = "ID of the TLS credentials to use",
 },
+{
+.name = "tls-hostname",
+.type = QEMU_OPT_STRING,
+.help = "Override hostname for validating TLS x509 certificate",
+},
 {
 .name = "x-dirty-bitmap",
 .type = QEMU_OPT_STRING,
@@ -1836,7 +1844,10 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
 error_setg(errp, "TLS only supported over IP sockets");
 goto error;
 }
-s->tlshostname = s->saddr->u.inet.host;
+s->tlshostname = g_strdup(qemu_opt_get(opts, "tls-hostname"));
+if (!s->tlshostname) {
+s->tlshostname = g_strdup(s->saddr->u.inet.host);
+}
 }

 s->x_dirty_bitmap = g_strdup(qemu_opt_get(opts, "x-dirty-bitmap"));
@@ -2038,6 +2049,7 @@ static const char *const nbd_strong_runtime_opts[] = {
 "port",
 "export",
 "tls-creds",
+"tls-hostname",
 "server.",

 NULL
-- 
2.35.1

[PULL 11/15] tests/qemu-iotests: validate NBD TLS with UNIX sockets

From: Daniel P. Berrangé 

This validates that connections to an NBD server running on a UNIX
socket can use TLS, and require a TLS hostname override to pass
certificate validation.

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-12-berra...@redhat.com>
[eblake: squash in rebase fix]
Tested-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/233 | 24 
 tests/qemu-iotests/233.out | 16 
 2 files changed, 40 insertions(+)

diff --git a/tests/qemu-iotests/233 b/tests/qemu-iotests/233
index c24d877be88e..442fd1378c1d 100755
--- a/tests/qemu-iotests/233
+++ b/tests/qemu-iotests/233
@@ -167,6 +167,30 @@ $QEMU_IMG info --image-opts \
 driver=nbd,host=$nbd_tcp_addr,port=$nbd_tcp_port,tls-creds=tls0 \
 2>&1 | _filter_nbd

+nbd_server_stop
+
+nbd_server_start_unix_socket \
+--object tls-creds-x509,dir=${tls_dir}/server1,endpoint=server,id=tls0,verify-peer=on \
+--tls-creds tls0 \
+-f $IMGFMT "$TEST_IMG" 2>> "$TEST_DIR/server.log"
+
+echo
+echo "== check TLS fail over UNIX with no hostname =="
+obj1=tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0
+$QEMU_IMG info --image-opts --object $obj1 \
+driver=nbd,path=$nbd_unix_socket,tls-creds=tls0 2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -k $nbd_unix_socket --object $obj1 --tls-creds=tls0 \
+2>&1 | _filter_qemu_nbd_exports
+
+echo
+echo "== check TLS works over UNIX with hostname override =="
+obj1=tls-creds-x509,dir=${tls_dir}/client1,endpoint=client,id=tls0
+$QEMU_IMG info --image-opts --object $obj1 \
+driver=nbd,path=$nbd_unix_socket,tls-creds=tls0,tls-hostname=127.0.0.1 \
+2>&1 | _filter_nbd
+$QEMU_NBD_PROG -L -k $nbd_unix_socket --object $obj1 \
+--tls-creds=tls0 --tls-hostname=127.0.0.1  2>&1 | _filter_qemu_nbd_exports
+
 echo
 echo "== final server log =="
 cat "$TEST_DIR/server.log" | _filter_authz_check_tls
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index d42611bf74a6..6e55be779946 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -68,6 +68,20 @@ read 1048576/1048576 bytes at offset 1048576
 qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read option reply: Cannot read from TLS channel: Software caused connection abort
 qemu-img: Could not open 'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read option reply: Cannot read from TLS channel: Software caused connection abort

+== check TLS fail over UNIX with no hostname ==
+qemu-img: Could not open 'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0': No hostname for certificate validation
+qemu-nbd: No hostname for certificate validation
+
+== check TLS works over UNIX with hostname override ==
+image: nbd+unix://?socket=SOCK_DIR/qemu-nbd.sock
+file format: nbd
+virtual size: 64 MiB (67108864 bytes)
+disk size: unavailable
+exports available: 1
+ export: ''
+  size:  67108864
+  min block: 1
+
 == final server log ==
 qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
 qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
@@ -75,4 +89,6 @@ qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
 qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
+qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
 *** done
-- 
2.35.1

[PULL 00/15] NBD patches for 7.0-rc0

The following changes since commit b49872aa8fc0f3f5a3036cc37aa2cb5c92866f33:

  Merge remote-tracking branch 'remotes/hreitz-gitlab/tags/pull-block-2022-03-07' into staging (2022-03-07 17:14:09 +)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2022-03-07

for you to fetch changes up to 395aecd037dc35d110b8e1e8cc7d20c1082894b5:

  qemu-io: Allow larger write zeroes under no fallback (2022-03-07 19:28:00 -0600)

I'm also trying to get v3 patches posted for my NBD_CAN_MULTI_CONN
patch series, but given the close proximity of soft freeze, getting
that into 7.0 may not be feasible.


nbd patches for 2022-03-07

- Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets
- Eric Blake: Minor cleanups related to 64-bit block operations


Daniel P. Berrangé (12):
  crypto: mandate a hostname when checking x509 creds on a client
  block: pass desired TLS hostname through from block driver client
  block/nbd: support override of hostname for TLS certificate validation
  qemu-nbd: add --tls-hostname option for TLS certificate validation
  block/nbd: don't restrict TLS usage to IP sockets
  tests/qemu-iotests: add QEMU_IOTESTS_REGEN=1 to update reference file
  tests/qemu-iotests: expand _filter_nbd rules
  tests/qemu-iotests: introduce filter for qemu-nbd export list
  tests/qemu-iotests: convert NBD TLS test to use standard filters
  tests/qemu-iotests: validate NBD TLS with hostname mismatch
  tests/qemu-iotests: validate NBD TLS with UNIX sockets
  tests/qemu-iotests: validate NBD TLS with UNIX sockets and PSK

Eric Blake (3):
  nbd/server: Minor cleanups
  qemu-io: Utilize 64-bit status during map
  qemu-io: Allow larger write zeroes under no fallback

 docs/tools/qemu-nbd.rst  | 13 ++
 qapi/block-core.json |  3 ++
 include/block/nbd.h  |  3 +-
 block/nbd.c  | 25 ++
 blockdev-nbd.c   |  6 ---
 crypto/tlssession.c  |  6 +++
 nbd/client-connection.c  | 12 +++--
 nbd/server.c | 13 +++---
 qemu-io-cmds.c   | 16 ++-
 qemu-nbd.c   | 25 +++---
 tests/qemu-iotests/common.filter |  9 
 tests/qemu-iotests/common.tls| 31 +++--
 tests/qemu-iotests/233   | 99 ++--
 tests/qemu-iotests/233.out   | 58 +++
 tests/qemu-iotests/241   |  6 +--
 tests/qemu-iotests/241.out   |  6 +++
 tests/qemu-iotests/testrunner.py |  6 +++
 17 files changed, 268 insertions(+), 69 deletions(-)

-- 
2.35.1

[PULL 01/15] crypto: mandate a hostname when checking x509 creds on a client

From: Daniel P. Berrangé 

Currently the TLS session object assumes that the caller will always
provide a hostname when using x509 creds on a client endpoint. This
relies on the caller to detect and report an error if the user has
configured QEMU with x509 credentials on a UNIX socket. The migration
code has such a check, but it is too broad, reporting an error when
the user has configured QEMU with PSK credentials on a UNIX socket,
where hostnames are irrelevant.

Putting the check into the TLS session object credentials validation
code ensures we report errors in only the scenario that matters.
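
The decision described above can be sketched as follows. This is a simplified
model, not the QEMU code: the function and exception names are hypothetical,
and credential types and endpoints are represented as plain strings.

```python
from typing import Optional


class TLSCheckError(Exception):
    """Hypothetical stand-in for the error reported through errp."""


def check_credentials(creds_type: str, endpoint: str,
                      hostname: Optional[str]) -> None:
    # PSK credentials never reach the certificate check, so a missing
    # hostname is irrelevant for them.
    if creds_type != "x509":
        return
    # With x509 credentials, a client must know which name to validate.
    if hostname is None and endpoint == "client":
        raise TLSCheckError("No hostname for certificate validation")


def is_rejected(creds_type: str, endpoint: str,
                hostname: Optional[str]) -> bool:
    try:
        check_credentials(creds_type, endpoint, hostname)
    except TLSCheckError:
        return True
    return False
```

Only the x509 client without a hostname is rejected; PSK over a UNIX socket,
the case the broader migration check used to refuse, passes.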

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-2-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 crypto/tlssession.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/crypto/tlssession.c b/crypto/tlssession.c
index a8db8c76d138..b302d835d215 100644
--- a/crypto/tlssession.c
+++ b/crypto/tlssession.c
@@ -373,6 +373,12 @@ qcrypto_tls_session_check_certificate(QCryptoTLSSession *session,
session->hostname);
 goto error;
 }
+} else {
+if (session->creds->endpoint ==
+QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT) {
+error_setg(errp, "No hostname for certificate validation");
+goto error;
+}
 }
 }

-- 
2.35.1

[PULL 06/15] tests/qemu-iotests: add QEMU_IOTESTS_REGEN=1 to update reference file

From: Daniel P. Berrangé 

When developing an I/O test it is typical to add some logic to the
test script, run it to view the output diff, and then apply the
output diff to the reference file. This can be drastically simplified
by letting the test runner update the reference file in place.

By setting 'QEMU_IOTESTS_REGEN=1', the test runner will report the
failure and show the diff, but at the same time update the reference
file. So next time the I/O test is run it will succeed.

Continuing to display the diff when updating the reference gives the
developer a chance to review what was changed.
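
A self-contained sketch of the regeneration step, modelled on the testrunner.py
hunk below, might look like this. The helper name `maybe_regen_reference` and
the temporary file names are assumptions for the example; in the real runner
the logic sits inline in `do_run_test()`.

```python
import os
import shutil
import tempfile


def maybe_regen_reference(f_bad, f_reference, environ=os.environ):
    """Copy the actual output over the reference file if regen is requested."""
    # As in the patch, the variable only has to be set; its value is ignored.
    if environ.get("QEMU_IOTESTS_REGEN") is not None:
        shutil.copyfile(f_bad, f_reference)
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    bad = os.path.join(tmp, "example.out.bad")  # hypothetical file names
    ref = os.path.join(tmp, "example.out")
    with open(bad, "w") as f:
        f.write("new output\n")
    with open(ref, "w") as f:
        f.write("old output\n")

    untouched = maybe_regen_reference(bad, ref, environ={})
    with open(ref) as f:
        ref_without_regen = f.read()

    updated = maybe_regen_reference(bad, ref, environ={"QEMU_IOTESTS_REGEN": "1"})
    with open(ref) as f:
        ref_with_regen = f.read()
```

Note that the test is still reported as failed on the run that updates the
reference; only the following run is expected to pass.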

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220304193610.3293146-7-berra...@redhat.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/testrunner.py | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 41083ff9c6f7..5c207225b140 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -25,6 +25,7 @@
 import contextlib
 import json
 import termios
+import shutil
 import sys
 from multiprocessing import Pool
 from contextlib import contextmanager
@@ -322,6 +323,11 @@ def do_run_test(self, test: str, mp: bool) -> TestResult:

 diff = file_diff(str(f_reference), str(f_bad))
 if diff:
+if os.environ.get("QEMU_IOTESTS_REGEN", None) is not None:
+shutil.copyfile(str(f_bad), str(f_reference))
+print("")
+print("#REFERENCE FILE UPDATED#")
+print("")
 return TestResult(status='fail', elapsed=elapsed,
   description=f'output mismatch (see {f_bad})',
   diff=diff, casenotrun=casenotrun)
-- 
2.35.1

[PULL v2 00/16] MIPS patches for 2022-03-07

2022-03-07 Thread Philippe Mathieu-Daudé

From: Philippe Mathieu-Daudé 

The following changes since commit b49872aa8fc0f3f5a3036cc37aa2cb5c92866f33:

  Merge remote-tracking branch 'remotes/hreitz-gitlab/tags/pull-block-2022-03-07' into staging (2022-03-07 17:14:09 +)

are available in the Git repository at:

  https://github.com/philmd/qemu.git tags/mips-20220308

for you to fetch changes up to c35fef9a9c7fd5397bc624d5bba05cef514b5737:

  tests/avocado/linux_ssh_mips_malta.py: add missing accel (tcg) tag (2022-03-07 20:38:41 +0100)

Since v1:
- Corrected last patch (screwed during git-am conflict)


MIPS patches queue

- Fix CP0 cycle counter timing
- Fix VMState of gt64120 IRQs
- Long due PIIX4 QOM cleanups
- ISA IRQ QOM'ification / cleanups



Bernhard Beschow (13):
  hw/mips/gt64xxx_pci: Fix PCI IRQ levels to be preserved during
migration
  malta: Move PCI interrupt handling from gt64xxx_pci to piix4
  hw/isa/piix4: Resolve redundant i8259[] attribute
  hw/isa/piix4: Pass PIIX4State as opaque parameter for piix4_set_irq()
  hw/isa/piix4: Resolve global instance variable
  hw/isa/piix4: Replace some magic IRQ constants
  hw/mips/gt64xxx_pci: Resolve gt64120_register()
  hw/rtc/mc146818rtc: QOM'ify IRQ number
  hw/rtc/m48t59-isa: QOM'ify IRQ number
  hw/input/pckbd: QOM'ify IRQ numbers
  hw/isa/isa-bus: Remove isabus_dev_print()
  hw/isa: Drop unused attributes from ISADevice
  hw/isa: Inline and remove one-line isa_init_irq()

Cleber Rosa (1):
  tests/avocado/linux_ssh_mips_malta.py: add missing accel (tcg) tag

Philippe Mathieu-Daudé (1):
  target/mips: Remove duplicated MIPSCPU::cp0_count_rate

Simon Burge (1):
  target/mips: Fix cycle counter timing calculations

 hw/audio/cs4231a.c|  2 +-
 hw/audio/gus.c|  2 +-
 hw/audio/sb16.c   |  2 +-
 hw/block/fdc-isa.c|  2 +-
 hw/char/parallel.c|  2 +-
 hw/char/serial-isa.c  |  2 +-
 hw/ide/isa.c  |  2 +-
 hw/input/pckbd.c  | 26 +++--
 hw/ipmi/isa_ipmi_bt.c |  2 +-
 hw/ipmi/isa_ipmi_kcs.c|  2 +-
 hw/isa/isa-bus.c  | 37 +
 hw/isa/piix4.c| 56 +--
 hw/mips/gt64xxx_pci.c | 80 +++
 hw/mips/malta.c   |  7 +--
 hw/net/ne2000-isa.c   |  2 +-
 hw/rtc/m48t59-isa.c   |  9 ++-
 hw/rtc/mc146818rtc.c  | 13 -
 hw/tpm/tpm_tis_isa.c  |  2 +-
 include/hw/isa/isa.h  |  3 -
 include/hw/mips/mips.h|  3 -
 include/hw/rtc/mc146818rtc.h  |  1 +
 include/hw/southbridge/piix.h |  2 -
 target/mips/cpu.c | 11 +---
 target/mips/cpu.h |  9 ---
 target/mips/internal.h|  9 +++
 tests/avocado/linux_ssh_mips_malta.py |  3 +
 tests/qemu-iotests/172.out| 26 -
 27 files changed, 127 insertions(+), 190 deletions(-)

-- 
2.34.1

[PULL v2 15/16] hw/isa: Inline and remove one-line isa_init_irq()

From: Bernhard Beschow 

isa_init_irq() has become a trivial one-line wrapper for isa_get_irq().
It can therefore be removed.

Signed-off-by: Bernhard Beschow 
Reviewed-by: Stefan Berger  (tpm_tis_isa)
Acked-by: Corey Minyard  (isa_ipmi_bt, isa_ipmi_kcs)
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Gerd Hoffmann 
Message-Id: <20220301220037.76555-8-shen...@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220307134353.1950-14-philippe.mathieu.da...@gmail.com>
Reviewed-by: Bernhard Beschow 
---
 hw/audio/cs4231a.c | 2 +-
 hw/audio/gus.c | 2 +-
 hw/audio/sb16.c| 2 +-
 hw/block/fdc-isa.c | 2 +-
 hw/char/parallel.c | 2 +-
 hw/char/serial-isa.c   | 2 +-
 hw/ide/isa.c   | 2 +-
 hw/input/pckbd.c   | 4 ++--
 hw/ipmi/isa_ipmi_bt.c  | 2 +-
 hw/ipmi/isa_ipmi_kcs.c | 2 +-
 hw/isa/isa-bus.c   | 8 +---
 hw/isa/piix4.c | 2 +-
 hw/net/ne2000-isa.c| 2 +-
 hw/rtc/m48t59-isa.c| 2 +-
 hw/tpm/tpm_tis_isa.c   | 2 +-
 include/hw/isa/isa.h   | 1 -
 16 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/hw/audio/cs4231a.c b/hw/audio/cs4231a.c
index 7d60ce6f0e..0723e39430 100644
--- a/hw/audio/cs4231a.c
+++ b/hw/audio/cs4231a.c
@@ -677,7 +677,7 @@ static void cs4231a_realizefn (DeviceState *dev, Error **errp)
 return;
 }
 
-isa_init_irq(d, &s->pic, s->irq);
+s->pic = isa_get_irq(d, s->irq);
 k = ISADMA_GET_CLASS(s->isa_dma);
 k->register_channel(s->isa_dma, s->dma, cs_dma_read, s);
 
diff --git a/hw/audio/gus.c b/hw/audio/gus.c
index e8719ee117..42f010b671 100644
--- a/hw/audio/gus.c
+++ b/hw/audio/gus.c
@@ -282,7 +282,7 @@ static void gus_realizefn (DeviceState *dev, Error **errp)
 s->emu.himemaddr = s->himem;
 s->emu.gusdatapos = s->emu.himemaddr + 1024 * 1024 + 32;
 s->emu.opaque = s;
-isa_init_irq (d, &s->pic, s->emu.gusirq);
+s->pic = isa_get_irq(d, s->emu.gusirq);
 
 AUD_set_active_out (s->voice, 1);
 }
diff --git a/hw/audio/sb16.c b/hw/audio/sb16.c
index 60f1f75e3a..2215386ddb 100644
--- a/hw/audio/sb16.c
+++ b/hw/audio/sb16.c
@@ -1408,7 +1408,7 @@ static void sb16_realizefn (DeviceState *dev, Error **errp)
 return;
 }
 
-isa_init_irq (isadev, &s->pic, s->irq);
+s->pic = isa_get_irq(isadev, s->irq);
 
 s->mixer_regs[0x80] = magic_of_irq (s->irq);
 s->mixer_regs[0x81] = (1 << s->dma) | (1 << s->hdma);
diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index ab663dce93..fa20450747 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -94,7 +94,7 @@ static void isabus_fdc_realize(DeviceState *dev, Error **errp)
  isa->iobase, fdc_portio_list, fdctrl,
  "fdc");
 
-isa_init_irq(isadev, &fdctrl->irq, isa->irq);
+fdctrl->irq = isa_get_irq(isadev, isa->irq);
 fdctrl->dma_chann = isa->dma;
 if (fdctrl->dma_chann != -1) {
 IsaDmaClass *k;
diff --git a/hw/char/parallel.c b/hw/char/parallel.c
index b45e67bfbb..adb9bd9be3 100644
--- a/hw/char/parallel.c
+++ b/hw/char/parallel.c
@@ -553,7 +553,7 @@ static void parallel_isa_realizefn(DeviceState *dev, Error **errp)
 index++;
 
 base = isa->iobase;
-isa_init_irq(isadev, &s->irq, isa->isairq);
+s->irq = isa_get_irq(isadev, isa->isairq);
 qemu_register_reset(parallel_reset, s);
 
 qemu_chr_fe_set_handlers(&s->chr, parallel_can_receive, NULL,
diff --git a/hw/char/serial-isa.c b/hw/char/serial-isa.c
index 1b8b303079..7a7ed239cd 100644
--- a/hw/char/serial-isa.c
+++ b/hw/char/serial-isa.c
@@ -75,7 +75,7 @@ static void serial_isa_realizefn(DeviceState *dev, Error **errp)
 }
 index++;
 
-isa_init_irq(isadev, &s->irq, isa->isairq);
+s->irq = isa_get_irq(isadev, isa->isairq);
 qdev_realize(DEVICE(s), NULL, errp);
 qdev_set_legacy_instance_id(dev, isa->iobase, 3);
 
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 24bbde24c2..8bedbd13f1 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -75,7 +75,7 @@ static void isa_ide_realizefn(DeviceState *dev, Error **errp)
 
 ide_bus_init(&s->bus, sizeof(s->bus), dev, 0, 2);
 ide_init_ioport(&s->bus, isadev, s->iobase, s->iobase2);
-isa_init_irq(isadev, &s->irq, s->isairq);
+s->irq = isa_get_irq(isadev, s->isairq);
 ide_init2(&s->bus, s->irq);
 vmstate_register(VMSTATE_IF(dev), 0, &vmstate_ide_isa, s);
 ide_register_restart_cb(&s->bus);
diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index eb77e12f6f..1773db0d25 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -749,8 +749,8 @@ static void i8042_realizefn(DeviceState *dev, Error **errp)
 return;
 }
 
-isa_init_irq(isadev, &s->irq_kbd, isa_s->kbd_irq);
-isa_init_irq(isadev, &s->irq_mouse, isa_s->mouse_irq);
+s->irq_kbd = isa_get_irq(isadev, isa_s->kbd_irq);
+s->irq_mouse = isa_get_irq(isadev, isa_s->mouse_irq);
 
 isa_register_ioport(isadev, isa_s->io + 0, 0x60);
 isa_register_ioport(isadev, isa_s->io + 1, 0x64);
diff --git a/hw/ipmi/isa_ipmi_bt.c b/hw/ipmi/isa_ipmi_bt.c
index

Re: [PULL v2 00/47] virtio,pc,pci: features, cleanups, fixes

On Mon, Mar 07, 2022 at 05:13:16PM +, Peter Maydell wrote:
> On Mon, 7 Mar 2022 at 17:06, Peter Maydell  wrote:
> >
> > On Mon, 7 Mar 2022 at 10:01, Michael S. Tsirkin  wrote:
> > >
> > > The following changes since commit 
> > > 6629bf78aac7e53f83fd0bcbdbe322e2302dfd1f:
> > >
> > >   Merge remote-tracking branch 
> > > 'remotes/pmaydell/tags/pull-target-arm-20220302' into staging (2022-03-03 
> > > 14:46:48 +)
> > >
> > > are available in the Git repository at:
> > >
> > >   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
> > >
> > > for you to fetch changes up to 41d137fc631bd9315ff84727d780757d25054c58:
> > >
> > >   hw/acpi/microvm: turn on 8042 bit in FADT boot architecture flags if 
> > > present (2022-03-06 16:06:16 -0500)
> > >
> > > 
> > > virtio,pc,pci: features, cleanups, fixes
> > >
> > > vhost-user enabled on non-linux systems
> > > beginning of nvme sriov support
> > > bigger tx queue for vdpa
> > > virtio iommu bypass
> > > An FADT flag to detect legacy keyboards.
> > >
> > > Fixes, cleanups all over the place
> > >
> > > Signed-off-by: Michael S. Tsirkin 
> >
> > Fails to build on the build-system-centos job:
> >
> > libqemu-ppc64-softmmu.fa.p/hw_virtio_virtio.c.o: In function
> > `qmp_decode_features':
> > /builds/qemu-project/qemu/build/../hw/virtio/virtio.c:4155: undefined
> > reference to `gpu_map'
> > /builds/qemu-project/qemu/build/../hw/virtio/virtio.c:4155: undefined
> > reference to `gpu_map'
> > collect2: error: ld returned 1 exit status
> >
> > https://gitlab.com/qemu-project/qemu/-/jobs/2172339948
> 
> Also fails on cross-win64-system:
> 
> https://gitlab.com/qemu-project/qemu/-/jobs/2172339938
> 
> ../hw/virtio/virtio.c: In function 'qmp_x_query_virtio_vhost_queue_status':
> ../hw/virtio/virtio.c:4358:30: error: cast from pointer to integer of
> different size [-Werror=pointer-to-int-cast]
> 4358 | status->desc = (uint64_t)(unsigned long)hdev->vqs[queue].desc;
> | ^
> ../hw/virtio/virtio.c:4359:31: error: cast from pointer to integer of
> different size [-Werror=pointer-to-int-cast]
> 4359 | status->avail = (uint64_t)(unsigned long)hdev->vqs[queue].avail;
> | ^
> ../hw/virtio/virtio.c:4360:30: error: cast from pointer to integer of
> different size [-Werror=pointer-to-int-cast]
> 4360 | status->used = (uint64_t)(unsigned long)hdev->vqs[queue].used;
> | ^
> cc1: all warnings being treated as errors
> 
> -- PMM

I dropped these for now, but I really question the value of this warning:
as you can see, the reason we have the buggy cast to unsigned long
is that someone wanted to silence the warning on a 32-bit system.

Now, I could maybe get behind this if it simply warned about a cast that
loses information (a cast to a smaller integer), or about an integer/pointer
cast that does not go through uintptr_t, without regard to size.
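
To make the size mismatch concrete, here is a small Python model of the two
casts. It is only a sketch: the address is a made-up value, and the 32-bit
width for unsigned long reflects the Win64 (LLP64) data model rather than
anything measured on the failing CI job.

```python
import ctypes


def truncate_to_width(value: int, bits: int) -> int:
    """Model a C cast to an unsigned integer type that is `bits` wide."""
    return value & ((1 << bits) - 1)


# Widths on the machine running this sketch, for comparison only.
pointer_bits = 8 * ctypes.sizeof(ctypes.c_void_p)
ulong_bits = 8 * ctypes.sizeof(ctypes.c_ulong)

# Hypothetical 64-bit address above the 4 GiB boundary.
addr = 0x00007FFE12345678

# On Win64, unsigned long is 32 bits wide, so the upper half is lost.
via_ulong_llp64 = truncate_to_width(addr, 32)
# uintptr_t is as wide as a pointer, so the full address survives.
via_uintptr = truncate_to_width(addr, 64)
```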

> 
> 
> >
> > thanks
> > -- PMM
> 
> 
> 

[PULL v4 46/47] tests/acpi: i386: update FACP table differences

From: Liav Albani 

After changing the IAPC boot flags register to indicate support of i8042
in the machine chipset to help the guest OS to determine its existence
"faster", we need to have the updated FACP ACPI binary images in tree.

The ASL changes introduced are shown by the following diff:

@@ -42,35 +42,35 @@
 [059h 0089   1] PM1 Control Block Length : 02
 [05Ah 0090   1] PM2 Control Block Length : 00
 [05Bh 0091   1]PM Timer Block Length : 04
 [05Ch 0092   1]GPE0 Block Length : 10
 [05Dh 0093   1]GPE1 Block Length : 00
 [05Eh 0094   1] GPE1 Base Offset : 00
 [05Fh 0095   1] _CST Support : 00
 [060h 0096   2]   C2 Latency : 0FFF
 [062h 0098   2]   C3 Latency : 0FFF
 [064h 0100   2]   CPU Cache Size : 
 [066h 0102   2]   Cache Flush Stride : 
 [068h 0104   1]Duty Cycle Offset : 00
 [069h 0105   1] Duty Cycle Width : 00
 [06Ah 0106   1]  RTC Day Alarm Index : 00
 [06Bh 0107   1]RTC Month Alarm Index : 00
 [06Ch 0108   1]RTC Century Index : 32
-[06Dh 0109   2]   Boot Flags (decoded below) : 
+[06Dh 0109   2]   Boot Flags (decoded below) : 0002
Legacy Devices Supported (V2) : 0
-8042 Present on ports 60/64 (V2) : 0
+8042 Present on ports 60/64 (V2) : 1
 VGA Not Present (V4) : 0
   MSI Not Supported (V4) : 0
 PCIe ASPM Not Supported (V4) : 0
CMOS RTC Not Present (V5) : 0
 [06Fh 0111   1] Reserved : 00
 [070h 0112   4]Flags (decoded below) : 84A5
   WBINVD instruction is operational (V1) : 1
   WBINVD flushes all caches (V1) : 0
 All CPUs support C1 (V1) : 1
   C2 works on MP system (V1) : 0
 Control Method Power Button (V1) : 0
 Control Method Sleep Button (V1) : 1
 RTC wake not in fixed reg space (V1) : 0
 RTC can wake system from S4 (V1) : 1
 32-bit PM Timer (V1) : 0
   Docking Supported (V1) : 0

Signed-off-by: Liav Albani 
Acked-by: Ani Sinha 
Message-Id: <20220304154032.2071585-4-...@anisinha.ca>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h |   4 
 tests/data/acpi/q35/FACP| Bin 244 -> 244 bytes
 tests/data/acpi/q35/FACP.nosmm  | Bin 244 -> 244 bytes
 tests/data/acpi/q35/FACP.slic   | Bin 244 -> 244 bytes
 tests/data/acpi/q35/FACP.xapic  | Bin 244 -> 244 bytes
 5 files changed, 4 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 7570e39369..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,5 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/q35/FACP",
-"tests/data/acpi/q35/FACP.nosmm",
-"tests/data/acpi/q35/FACP.slic",
-"tests/data/acpi/q35/FACP.xapic",
diff --git a/tests/data/acpi/q35/FACP b/tests/data/acpi/q35/FACP
index f6a864cc863c7763f6c09d3814ad184a658fa0a0..a8f6a8961109d01059aceef9f1869cde09a2f10c 100644
GIT binary patch
delta 23
ecmeyu_=S

[PULL v4 45/47] hw/acpi: add indication for i8042 in IA-PC boot flags of the FADT table

From: Liav Albani 

This allows the guest OS to determine more easily whether an i8042
controller is present in the system, so it does not need to probe for the
controller and can instead initialize it immediately, before enumerating
the ACPI AML namespace.

The 8042 bit in IAPC_BOOT_ARCH was introduced in ACPI spec v2 (FADT
revision 2 and above). Therefore, this change only enables the bit for
x86/q35 machine types, since x86/i440fx machines use a FADT ACPI table with
revision 1.
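
The resulting field value can be sketched in Python as follows. The helper
names are hypothetical; the bit position and the revision check follow the
patch below, and the two-byte little-endian encoding is an assumption based on
the usual ACPI table layout.

```python
I8042_BIT = 1 << 1  # bit offset 1 of IAPC_BOOT_ARCH: 8042 present


def iapc_boot_arch(fadt_rev: int, i8042_present: bool) -> int:
    """Value written to the 2-byte IAPC_BOOT_ARCH field of the FADT."""
    if fadt_rev == 1:
        # The field is not defined before ACPI v2, so it stays zero.
        return 0
    return I8042_BIT if i8042_present else 0


def encode_le16(value: int) -> bytes:
    """Encode a 16-bit field as it would appear in the table bytes."""
    return value.to_bytes(2, "little")


q35_with_kbd = iapc_boot_arch(fadt_rev=3, i8042_present=True)
i440fx_with_kbd = iapc_boot_arch(fadt_rev=1, i8042_present=True)
```

With a keyboard controller present and a FADT revision above 1, the value is
0x0002, which is the "Boot Flags" value shown in the FACP table diff of the
companion test patch.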

Signed-off-by: Liav Albani 
Signed-off-by: Ani Sinha 
Message-Id: <20220304154032.2071585-3-...@anisinha.ca>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/acpi/acpi-defs.h |  1 +
 include/hw/input/i8042.h| 15 +++
 hw/acpi/aml-build.c |  8 +++-
 hw/i386/acpi-build.c|  8 
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index c97e8633ad..2b42e4192b 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -77,6 +77,7 @@ typedef struct AcpiFadtData {
 uint16_t plvl2_lat;/* P_LVL2_LAT */
 uint16_t plvl3_lat;/* P_LVL3_LAT */
 uint16_t arm_boot_arch;/* ARM_BOOT_ARCH */
+uint16_t iapc_boot_arch;   /* IAPC_BOOT_ARCH */
 uint8_t minor_ver; /* FADT Minor Version */
 
 /*
diff --git a/include/hw/input/i8042.h b/include/hw/input/i8042.h
index 1d90432dae..e070f546e4 100644
--- a/include/hw/input/i8042.h
+++ b/include/hw/input/i8042.h
@@ -23,4 +23,19 @@ void i8042_mm_init(qemu_irq kbd_irq, qemu_irq mouse_irq,
 void i8042_isa_mouse_fake_event(ISAKBDState *isa);
 void i8042_setup_a20_line(ISADevice *dev, qemu_irq a20_out);
 
+static inline bool i8042_present(void)
+{
+bool amb = false;
+return object_resolve_path_type("", TYPE_I8042, &amb) || amb;
+}
+
+/*
+ * ACPI v2, Table 5-10 - Fixed ACPI Description Table Boot Architecture
+ * Flags, bit offset 1 - 8042.
+ */
+static inline uint16_t iapc_boot_arch_8042(void)
+{
+return i8042_present() ? 0x1 << 1 : 0x0 ;
+}
+
 #endif /* HW_INPUT_I8042_H */
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 8966e16320..1773cf55f1 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2152,7 +2152,13 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
 build_append_int_noprefix(tbl, 0, 1); /* DAY_ALRM */
 build_append_int_noprefix(tbl, 0, 1); /* MON_ALRM */
 build_append_int_noprefix(tbl, f->rtc_century, 1); /* CENTURY */
-build_append_int_noprefix(tbl, 0, 2); /* IAPC_BOOT_ARCH */
+/* IAPC_BOOT_ARCH */
+if (f->rev == 1) {
+build_append_int_noprefix(tbl, 0, 2);
+} else {
+/* since ACPI v2.0 */
+build_append_int_noprefix(tbl, f->iapc_boot_arch, 2);
+}
 build_append_int_noprefix(tbl, 0, 1); /* Reserved */
 build_append_int_noprefix(tbl, f->flags, 4); /* Flags */
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebd47aa26f..4ad4d7286c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -38,6 +38,7 @@
 #include "hw/nvram/fw_cfg.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/isa/isa.h"
+#include "hw/input/i8042.h"
 #include "hw/block/fdc.h"
 #include "hw/acpi/memory_hotplug.h"
 #include "sysemu/tpm.h"
@@ -192,6 +193,13 @@ static void init_common_fadt_data(MachineState *ms, Object *o,
 .address = object_property_get_uint(o, ACPI_PM_PROP_GPE0_BLK, NULL)
 },
 };
+
+/*
+ * ACPI v2, Table 5-10 - Fixed ACPI Description Table Boot Architecture
+ * Flags, bit offset 1 - 8042.
+ */
+fadt.iapc_boot_arch = iapc_boot_arch_8042();
+
 *data = fadt;
 }
 
-- 
MST

[PULL v4 43/47] docs: vhost-user: add subsection for non-Linux platforms

From: Sergio Lopez 

Add a section explaining how vhost-user is supported on platforms
other than Linux.

Signed-off-by: Sergio Lopez 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20220304100854.14829-5-...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 docs/interop/vhost-user.rst | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index edc3ad84a3..4dbc84fd00 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -38,6 +38,26 @@ conventions `.
 *Master* and *slave* can be either a client (i.e. connecting) or
 server (listening) in the socket communication.
 
+Support for platforms other than Linux
+--------------------------------------
+
+While vhost-user was initially developed targeting Linux, nowadays it
+is supported on any platform that provides the following features:
+
+- A way for requesting shared memory represented by a file descriptor
+  so it can be passed over a UNIX domain socket and then mapped by the
+  other process.
+
+- AF_UNIX sockets with SCM_RIGHTS, so QEMU and the other process can
+  exchange messages through it, including ancillary data when needed.
+
+- Either eventfd or pipe/pipe2. On platforms where eventfd is not
+  available, QEMU will automatically fall back to pipe2 or, as a last
+  resort, pipe. Each file descriptor will be used for receiving or
+  sending events by reading or writing (respectively) an 8-byte value
+  to the corresponding file descriptor. The 8-byte value itself has no
+  meaning and should not be interpreted.
+
 Message Specification
 =====================
 
-- 
MST
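
The event convention described in the documentation hunk above (read or write an 8-byte value whose content carries no meaning) can be sketched in plain POSIX C. This is only an illustration under that assumption, not QEMU's implementation. The names event_send() and event_receive() are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: signal one event by writing an 8-byte value.
 * The payload is arbitrary; the receiver must not interpret it.
 */
static inline int event_send(int fd)
{
    const uint64_t value = 1;
    ssize_t ret = write(fd, &value, sizeof(value));

    return ret == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Hypothetical helper: consume one event by reading 8 bytes. */
static inline int event_receive(int fd)
{
    uint64_t ignored;
    ssize_t ret = read(fd, &ignored, sizeof(ignored));

    return ret == (ssize_t)sizeof(ignored) ? 0 : -1;
}
```

The same two helpers work for an eventfd, where one descriptor serves both directions, and for a pipe, where the sender gets the write end and the receiver the read end.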

Re: [PATCH v13 0/8] hmp,qmp: Add commands to introspect virtio devices

On Mon, Mar 07, 2022 at 08:08:33AM -0500, Jonah Palmer wrote:
> This series introduces new QMP/HMP commands to dump the status of a
> virtio device at different levels.


Fails to build on the build-system-centos job:

libqemu-ppc64-softmmu.fa.p/hw_virtio_virtio.c.o: In function
`qmp_decode_features':
/builds/qemu-project/qemu/build/../hw/virtio/virtio.c:4155: undefined
reference to `gpu_map'
/builds/qemu-project/qemu/build/../hw/virtio/virtio.c:4155: undefined
reference to `gpu_map'
collect2: error: ld returned 1 exit status

https://gitlab.com/qemu-project/qemu/-/jobs/2172339948


Also fails on cross-win64-system:

https://gitlab.com/qemu-project/qemu/-/jobs/2172339938

../hw/virtio/virtio.c: In function 'qmp_x_query_virtio_vhost_queue_status':
../hw/virtio/virtio.c:4358:30: error: cast from pointer to integer of
different size [-Werror=pointer-to-int-cast]
4358 | status->desc = (uint64_t)(unsigned long)hdev->vqs[queue].desc;
| ^
../hw/virtio/virtio.c:4359:31: error: cast from pointer to integer of
different size [-Werror=pointer-to-int-cast]
4359 | status->avail = (uint64_t)(unsigned long)hdev->vqs[queue].avail;
| ^
../hw/virtio/virtio.c:4360:30: error: cast from pointer to integer of
different size [-Werror=pointer-to-int-cast]
4360 | status->used = (uint64_t)(unsigned long)hdev->vqs[queue].used;
| ^
cc1: all warnings being treated as errors
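
For context on the win64 failure: 64-bit Windows uses the LLP64 model, so unsigned long is 32 bits wide while pointers are 64 bits, and the intermediate cast truncates. One common portable idiom is to go through uintptr_t instead. The sketch below is only an illustration. ptr_to_u64() is a hypothetical helper, and this is not necessarily the fix the series should adopt.

```c
#include <assert.h>
#include <stdint.h>

/* The widening below is only lossless if pointers fit in 64 bits. */
static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "pointer values must fit in a uint64_t");

/*
 * Hypothetical helper: convert a pointer to a 64-bit integer without
 * assuming anything about the width of 'unsigned long'.
 */
static inline uint64_t ptr_to_u64(const void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}
```

A call such as ptr_to_u64(hdev->vqs[queue].desc) would then compile without -Wpointer-to-int-cast on both LP64 and LLP64 targets.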

Reported-by: PMM

-- 
MST

[PULL v4 41/47] vhost: use wfd on functions setting vring call fd

From: Sergio Lopez 

When ioeventfd is emulated using qemu_pipe(), only EventNotifier's wfd
can be used for writing.

Use the recently introduced event_notifier_get_wfd() function to
obtain the fd that our peer must use to signal the vring.

Signed-off-by: Sergio Lopez 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20220304100854.14829-3-...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/vhost.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7b03efccec..b643f42ea4 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1287,7 +1287,7 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
 return r;
 }
 
-file.fd = event_notifier_get_fd(&vq->masked_notifier);
+file.fd = event_notifier_get_wfd(&vq->masked_notifier);
 r = dev->vhost_ops->vhost_set_vring_call(dev, &file);
 if (r) {
 VHOST_OPS_DEBUG(r, "vhost_set_vring_call failed");
@@ -1542,9 +1542,9 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
 
 if (mask) {
 assert(vdev->use_guest_notifier_mask);
-file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier);
+file.fd = event_notifier_get_wfd(&hdev->vqs[index].masked_notifier);
 } else {
-file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
+file.fd = event_notifier_get_wfd(virtio_queue_get_guest_notifier(vvq));
 }
 
 file.index = hdev->vhost_ops->vhost_get_vq_index(hdev, n);
-- 
MST
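
To illustrate why the patch above hands out the write fd: when a notifier is backed by a pipe, its two ends are distinct descriptors, and the peer that signals the vring needs the write end. The sketch below models that with a made-up struct demo_notifier. It is not QEMU's EventNotifier, and all names in it are assumptions for the example.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical notifier: rfd and wfd differ when backed by a pipe. */
struct demo_notifier {
    int rfd; /* end that the event consumer reads from */
    int wfd; /* end that the signalling peer writes to */
};

/* Create a pipe-backed notifier; returns 0 on success, -1 on failure. */
static inline int demo_notifier_init_pipe(struct demo_notifier *n)
{
    int fds[2];

    assert(n != NULL);
    if (pipe(fds) < 0) {
        return -1;
    }
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* fd to poll or read locally. */
static inline int demo_notifier_get_rfd(const struct demo_notifier *n)
{
    return n->rfd;
}

/* fd to pass to the peer that must signal events. */
static inline int demo_notifier_get_wfd(const struct demo_notifier *n)
{
    return n->wfd;
}
```

With an eventfd-backed notifier both accessors would return the same descriptor, which is why the difference only shows up on the pipe fallback path.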

[PULL v4 39/47] pci: drop COMPAT_PROP_PCP for 2.0 machine types

From: Igor Mammedov 

COMPAT_PROP_PCP is 'on' by default, and the compat machinery uses it
to turn off the PCP capability on PCIe slots for 2.0 machine types.
Drop this unneeded compat glue, since Q35 supports migration starting
from the 2.4 machine types.

Signed-off-by: Igor Mammedov 
Message-Id: <20220222102504.3080104-1-imamm...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/pc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 32bf12421e..fd55fc725c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -318,8 +318,6 @@ GlobalProperty pc_compat_2_0[] = {
 { "pci-serial-4x", "prog_if", "0" },
 { "virtio-net-pci", "guest_announce", "off" },
 { "ICH9-LPC", "memory-hotplug-support", "off" },
-{ "xio3130-downstream", COMPAT_PROP_PCP, "off" },
-{ "ioh3420", COMPAT_PROP_PCP, "off" },
 };
 const size_t pc_compat_2_0_len = G_N_ELEMENTS(pc_compat_2_0);
 
-- 
MST

[PULL v4 38/47] hw/smbios: Add table 4 parameter, "processor-id"