date:20240407

Re: [PATCH 1/1] virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR

2024-04-07 Thread Jason Wang

On Sun, Apr 7, 2024 at 3:00 PM Cindy Lu  wrote:
>
> On Sun, Apr 7, 2024 at 12:20 PM Jason Wang  wrote:
> >
> > On Tue, Apr 2, 2024 at 11:02 PM Cindy Lu  wrote:
> > >
> > > When the guest calls virtio_stop and then virtio_reset,
> >
> > Guests could not call those functions directly, it is triggered by for
> > example writing to some of the registers like reset or others.
> >
> sure , Will fix this
> > > the vector will change
> > > to VIRTIO_NO_VECTOR and the IRQFD for this vector will be released. After 
> > > that
> > > If you want to change the vector back,
> >
> > What do you mean by "change the vector back"? Something like
> >
> > assign VIRTIO_MSI_NO_VECTOR to vector 0
> > assign X to vector 0
> >
> yes, the process is something  like
> 
> set config_vector = VIRTIO_MSI_NO_VECTOR
> ...
> set config_vector = 0
> > And I guess what you meant is to configure the vector after DRIVER_OK.
>
> >
> >
> > > it will cause a crash.
> > >
> > > To fix this, we need to call the function 
> > > "kvm_virtio_pci_vector_use_one()"
> > > when the vector changes back from VIRTIO_NO_VECTOR
> > >
> > > Signed-off-by: Cindy Lu 
> > > ---
> > >  hw/virtio/virtio-pci.c | 31 ---
> > >  1 file changed, 28 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index e433879542..45f3ab38c3 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -874,12 +874,14 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy 
> > > *proxy, int queue_no,
> > >  return 0;
> > >  }
> > >
> > > -static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > > queue_no)
> > > +static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > > queue_no,
> > > + bool recovery)
> > >  {
> > >  unsigned int vector;
> > >  int ret;
> > >  EventNotifier *n;
> > >  PCIDevice *dev = >pci_dev;
> > > +VirtIOIRQFD *irqfd;
> > >  VirtIODevice *vdev = virtio_bus_get_device(>bus);
> > >  VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > >
> > > @@ -890,10 +892,21 @@ static int 
> > > kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no)
> > >  if (vector >= msix_nr_vectors_allocated(dev)) {
> > >  return 0;
> > >  }
> > > +/*
> > > + * if this is recovery and irqfd still in use, means the irqfd was 
> > > not
> > > + * release before and don't need to set up again
> > > + */
> > > +if (recovery) {
> > > +irqfd = >vector_irqfd[vector];
> > > +if (irqfd->users != 0) {
> > > +return 0;
> > > +}
> > > +}
> > >  ret = kvm_virtio_pci_vq_vector_use(proxy, vector);
> > >  if (ret < 0) {
> > >  goto undo;
> > >  }
> > > +
> > >  /*
> > >   * If guest supports masking, set up irqfd now.
> > >   * Otherwise, delay until unmasked in the frontend.
> > > @@ -932,14 +945,14 @@ static int 
> > > kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs)
> > >  if (!virtio_queue_get_num(vdev, queue_no)) {
> > >  return -1;
> > >  }
> > > -ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
> > > +ret = kvm_virtio_pci_vector_use_one(proxy, queue_no, false);
> > >  }
> > >  return ret;
> > >  }
> > >
> > >  static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
> > >  {
> > > -return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> > > +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > > false);
> > >  }
> > >
> > >  static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
> > > @@ -1570,7 +1583,13 @@ static void virtio_pci_common_write(void *opaque, 
> > > hwaddr addr,
> > >  } else {
> > >  val = VIRTIO_NO_VECTOR;
> > >  }
> > > +vector = vdev->config_vector;
> > >  vdev->config_vector = val;
> > > +/*check if the vector need to recovery*/
> > > +if ((val != VIRTIO_NO_VECTOR) && (vector == VIRTIO_NO_VECTOR) &&
> > > +(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > > true);
> > > +}
> >
> > This looks too tricky.
> >
> > Think hard of this. I think it's better to split this into two parts:
> >
> > 1) a series that disables config irqfd for vhost-net, this series
> > needs to be backported to -stable which needs to be conservative. It
> > looks more like your V1, but let's add a boolean for pci proxy.
> sure, I wll try this
>
> > 2) a series that deal with the msix vector configuration after
> > driver_ok, we probably need some refactoring to do per vq use instead
> > of the current loop in DRIVER_OK
> >
> Sorry I didn't get what you mean , Do you mean we need to move the check
> to inside kvm_virtio_pci_vector_vq_use()？
> Thanks
> cindy

I meant try to do

Re: [RFC QEMU PATCH v8 2/2] virtio-pci: implement No_Soft_Reset bit

2024-04-07 Thread Jason Wang

On Sun, Apr 7, 2024 at 7:50 PM Michael S. Tsirkin  wrote:
>
> On Sun, Apr 07, 2024 at 11:20:57AM +0800, Jason Wang wrote:
> > On Tue, Apr 2, 2024 at 11:03 AM Chen, Jiqian  wrote:
> > >
> > > On 2024/3/29 18:44, Michael S. Tsirkin wrote:
> > > > On Fri, Mar 29, 2024 at 03:20:59PM +0800, Jason Wang wrote:
> > > >> On Fri, Mar 29, 2024 at 3:07 PM Chen, Jiqian  
> > > >> wrote:
> > > >>>
> > > >>> On 2024/3/28 20:36, Michael S. Tsirkin wrote:
> > > >>> +}
> > > >>> +
> > > >>>  static void virtio_pci_bus_reset_hold(Object *obj)
> > > >>>  {
> > > >>>  PCIDevice *dev = PCI_DEVICE(obj);
> > > >>>  DeviceState *qdev = DEVICE(obj);
> > > >>>
> > > >>> +if (virtio_pci_no_soft_reset(dev)) {
> > > >>> +return;
> > > >>> +}
> > > >>> +
> > > >>>  virtio_pci_reset(qdev);
> > > >>>
> > > >>>  if (pci_is_express(dev)) {
> > > >>> @@ -2484,6 +2511,8 @@ static Property virtio_pci_properties[] = {
> > > >>>  VIRTIO_PCI_FLAG_INIT_LNKCTL_BIT, true),
> > > >>>  DEFINE_PROP_BIT("x-pcie-pm-init", VirtIOPCIProxy, flags,
> > > >>>  VIRTIO_PCI_FLAG_INIT_PM_BIT, true),
> > > >>> +DEFINE_PROP_BIT("x-pcie-pm-no-soft-reset", VirtIOPCIProxy, 
> > > >>> flags,
> > > >>> +VIRTIO_PCI_FLAG_PM_NO_SOFT_RESET_BIT, false),
> > > >>
> > > >> Why does it come with an x prefix?
> > > >>
> > > >>>  DEFINE_PROP_BIT("x-pcie-flr-init", VirtIOPCIProxy, flags,
> > > >>>  VIRTIO_PCI_FLAG_INIT_FLR_BIT, true),
> > > >>>  DEFINE_PROP_BIT("aer", VirtIOPCIProxy, flags,
> > > >>
> > > >> I am a bit confused about this part.
> > > >> Do you want to make this software controllable?
> > > > Yes, because even the real hardware, this bit is not always set.
> > > >>
> > > >> We are talking about emulated devices here.
> > > >>
> > > 
> > >  So which virtio devices should and which should not set this bit?
> > > >>> This depends on the scenario the virtio-device is used, if we want to 
> > > >>> trigger an internal soft reset for the virtio-device during S3, this 
> > > >>> bit shouldn't be set.
> > > >>
> > > >> If the device doesn't need reset, why bother the driver for this?
> > > >>
> > > >> Btw, no_soft_reset is insufficient for some cases, there's a proposal
> > > >> for the virtio-spec. I think we need to wait until it is done.
> > > >
> > > > That seems orthogonal or did I miss something?
> > > Yes, I looked the detail of the proposal, I also think they are unrelated.
> >
> > The point is the proposal said
> >
> > """
> > Without a mechanism to
> > suspend/resume virtio devices when the driver is suspended/resumed in
> > the early phase of suspend/late phase of resume, there is a window where
> > interrupts can be lost.
> > """
> >
> > It looks safe to enable it with the suspend bit. Or if you think it's
> > wrong, please comment on the virtio spec patch.
> >
> > > I will set the default value of No_Soft_Reset bit to true in next version 
> > > according to your opinion.
> > > About the compatibility of old machine types, which types should I 
> > > consider? Does the same as x-pcie-pm-init(hw_compat_2_8)?
> > > Forgive me for not knowing much about compatibility.
> >
> > "x" means no compatibility at all, please drop the "x" prefix. And it
> > looks more safe to start as "false" by default.
> >
> > Thanks
>
>
> Not sure I agree. External flags are for when users want to tweak them.
> When would users want it to be off?

If I understand the suspending status proposal correctly, there are at
least 1 device that is not safe. And this series does not mention
which device it has tested.

It means if we enable it unconditionally, guests may break.

Thanks

> What is done here is I feel sane, just add machine compat machinery
> to change to off for old machine types.
>
>
> > > >
> > > >>> In my use case on my environment, I don't want to reset virtio-gpu 
> > > >>> during S3,
> > > >>> because once the display resources are destroyed, there are not 
> > > >>> enough information to re-create them, so this bit should be set.
> > > >>> Making this bit software controllable is convenient for users to take 
> > > >>> their own choices.
> > > >>
> > > >> Thanks
> > > >>
> > > >>>
> > > 
> > > >> Or should this be set to true by default and then
> > > >> changed to false for old machine types?
> > > > How can I do so?
> > > > Do you mean set this to true by default, and if old machine types 
> > > > don't need this bit, they can pass false config to qemu when 
> > > > running qemu?
> > > 
> > >  No, you would use compat machinery. See how is x-pcie-flr-init 
> > >  handled.
> > > 
> > > 
> > > >>>
> > > >>> --
> > > >>> Best regards,
> > > >>> Jiqian Chen.
> > > >
> > >
> > > --
> > > Best regards,
> > > Jiqian Chen.
>

Re: [PATCH 1/1] virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR

2024-04-07 Thread Jason Wang

On Sun, Apr 7, 2024 at 7:53 PM Michael S. Tsirkin  wrote:
>
> On Sun, Apr 07, 2024 at 12:19:57PM +0800, Jason Wang wrote:
> > On Tue, Apr 2, 2024 at 11:02 PM Cindy Lu  wrote:
> > >
> > > When the guest calls virtio_stop and then virtio_reset,
> >
> > Guests could not call those functions directly, it is triggered by for
> > example writing to some of the registers like reset or others.
> >
> > > the vector will change
> > > to VIRTIO_NO_VECTOR and the IRQFD for this vector will be released. After 
> > > that
> > > If you want to change the vector back,
> >
> > What do you mean by "change the vector back"? Something like
> >
> > assign VIRTIO_MSI_NO_VECTOR to vector 0
> > assign X to vector 0
> >
> > And I guess what you meant is to configure the vector after DRIVER_OK.
> >
> >
> > > it will cause a crash.
> > >
> > > To fix this, we need to call the function 
> > > "kvm_virtio_pci_vector_use_one()"
> > > when the vector changes back from VIRTIO_NO_VECTOR
> > >
> > > Signed-off-by: Cindy Lu 
> > > ---
> > >  hw/virtio/virtio-pci.c | 31 ---
> > >  1 file changed, 28 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index e433879542..45f3ab38c3 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -874,12 +874,14 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy 
> > > *proxy, int queue_no,
> > >  return 0;
> > >  }
> > >
> > > -static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > > queue_no)
> > > +static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > > queue_no,
> > > + bool recovery)
> > >  {
> > >  unsigned int vector;
> > >  int ret;
> > >  EventNotifier *n;
> > >  PCIDevice *dev = >pci_dev;
> > > +VirtIOIRQFD *irqfd;
> > >  VirtIODevice *vdev = virtio_bus_get_device(>bus);
> > >  VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > >
> > > @@ -890,10 +892,21 @@ static int 
> > > kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no)
> > >  if (vector >= msix_nr_vectors_allocated(dev)) {
> > >  return 0;
> > >  }
> > > +/*
> > > + * if this is recovery and irqfd still in use, means the irqfd was 
> > > not
> > > + * release before and don't need to set up again
> > > + */
> > > +if (recovery) {
> > > +irqfd = >vector_irqfd[vector];
> > > +if (irqfd->users != 0) {
> > > +return 0;
> > > +}
> > > +}
> > >  ret = kvm_virtio_pci_vq_vector_use(proxy, vector);
> > >  if (ret < 0) {
> > >  goto undo;
> > >  }
> > > +
> > >  /*
> > >   * If guest supports masking, set up irqfd now.
> > >   * Otherwise, delay until unmasked in the frontend.
> > > @@ -932,14 +945,14 @@ static int 
> > > kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs)
> > >  if (!virtio_queue_get_num(vdev, queue_no)) {
> > >  return -1;
> > >  }
> > > -ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
> > > +ret = kvm_virtio_pci_vector_use_one(proxy, queue_no, false);
> > >  }
> > >  return ret;
> > >  }
> > >
> > >  static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
> > >  {
> > > -return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> > > +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > > false);
> > >  }
> > >
> > >  static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
> > > @@ -1570,7 +1583,13 @@ static void virtio_pci_common_write(void *opaque, 
> > > hwaddr addr,
> > >  } else {
> > >  val = VIRTIO_NO_VECTOR;
> > >  }
> > > +vector = vdev->config_vector;
> > >  vdev->config_vector = val;
> > > +/*check if the vector need to recovery*/
> > > +if ((val != VIRTIO_NO_VECTOR) && (vector == VIRTIO_NO_VECTOR) &&
> > > +(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > +kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > > true);
> > > +}
> >
> > This looks too tricky.
> >
> > Think hard of this. I think it's better to split this into two parts:
> >
> > 1) a series that disables config irqfd for vhost-net, this series
> > needs to be backported to -stable which needs to be conservative. It
> > looks more like your V1, but let's add a boolean for pci proxy.
>
> I don't get it. Looks like a huge change to do in stable.
> All as a replacement to a small 20 line patch?

For example, this patch needs more tweak so it won't be that tiny:

1) needs to check if we can use guest notifiers (irqfd)
2) the switching from X to NO_VECTOR might need
kvm_virtio_pci_vq_vector_release()

>
> Generally I think irqfd is best used everywhere.

Only when msix and kvm are both enabled.

>
>
> > 2) a series that deal with the msix vector configuration after
> >

Re: [PATCH 1/6] hw/acpi/GI: Fix trivial parameter alignment issue.

2024-04-07 Thread Ankit Agrawal

> Before making additional modification, tidy up this misleading indentation.

Thanks for fixing it.
Reviewed-by: Ankit Agrawal

[PATCH v2] vdpa-dev: Fix the issue of device status not updating when configuration interruption is triggered

2024-04-07 Thread lyx634449800

The set_config callback function vhost_vdpa_device_get_config in
vdpa-dev does not fetch the current device status from the hardware
device, causing the guest os to not receive the latest device status
information.

The hardware updates the config status of the vdpa device and then
notifies the os. The guest os receives an interrupt notification,
triggering a get_config access in the kernel, which then enters qemu
internally. Ultimately, the vhost_vdpa_device_get_config function of
vdpa-dev is called

One scenario encountered is when the device needs to bring down the
vdpa net device. After modifying the status field of virtio_net_config
in the hardware, it sends an interrupt notification. However, the guest
os always receives the STATUS field as VIRTIO_NET_S_LINK_UP.

Signed-off-by: Yuxue Liu 
Acked-by: Jason Wang 
---
V2: Amending the capitalization issue in the last commit message

 hw/virtio/vdpa-dev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 13e87f06f6..64b96b226c 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -195,7 +195,14 @@ static void
 vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
 {
 VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+int ret;
 
+ret = vhost_dev_get_config(>dev, s->config, s->config_size,
+NULL);
+if (ret < 0) {
+error_report("get device config space failed");
+return;
+}
 memcpy(config, s->config, s->config_size);
 }
 
-- 
2.43.0

Re: [PATCH v9 16/20] virtio-net: Do not write hashes to peer buffer

2024-04-07 Thread Akihiko Odaki


On 2024/04/08 7:09, Yuri Benditovich wrote:

On Wed, Apr 3, 2024 at 2:12 PM Akihiko Odaki  wrote:


The peer buffer is qualified with const and not meant to be modified.


IMHO, this buffer is not so 'const' (although the prototype states so),
it is allocated in net.c
btw, another procedure in this file also modifies the buffer
(work_around_broken_dhclient)


Right but it has a FIXME comment.




It also prevents enabling VIRTIO_NET_F_HASH_REPORT for peers without
virtio-net header support.


Does it mean _this commit_ prevents enabling VIRTIO_NET_F_HASH_REPORT
for peers without
virtio-net header support? Where?


No, but I meant that this patch fixes such a problem.

Regards,
Akihiko Odaki

Re: [PATCH v9 13/20] virtio-net: Return an error when vhost cannot enable RSS

2024-04-07 Thread Akihiko Odaki


On 2024/04/08 6:46, Yuri Benditovich wrote:

On Wed, Apr 3, 2024 at 2:11 PM Akihiko Odaki  wrote:


vhost requires eBPF for RSS. When eBPF is not available, virtio-net
implicitly disables RSS even if the user explicitly requests it. Return
an error instead of implicitly disabling RSS if RSS is requested but not
available.

Signed-off-by: Akihiko Odaki 
---
  hw/net/virtio-net.c | 97 ++---
  1 file changed, 48 insertions(+), 49 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 61b49e335dea..3d53eba88cfc 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -793,9 +793,6 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, 
uint64_t features,
  return features;
  }

-if (!ebpf_rss_is_loaded(>ebpf_rss)) {
-virtio_clear_feature(, VIRTIO_NET_F_RSS);
-}
  features = vhost_net_get_features(get_vhost_net(nc->peer), features);
  vdev->backend_features = features;

@@ -3591,6 +3588,50 @@ static bool failover_hide_primary_device(DeviceListener 
*listener,
  return qatomic_read(>failover_primary_hidden);
  }

+static void virtio_net_device_unrealize(DeviceState *dev)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+VirtIONet *n = VIRTIO_NET(dev);
+int i, max_queue_pairs;
+
+if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
+virtio_net_unload_ebpf(n);
+}
+
+/* This will stop vhost backend if appropriate. */
+virtio_net_set_status(vdev, 0);
+
+g_free(n->netclient_name);
+n->netclient_name = NULL;
+g_free(n->netclient_type);
+n->netclient_type = NULL;
+
+g_free(n->mac_table.macs);
+g_free(n->vlans);
+
+if (n->failover) {
+qobject_unref(n->primary_opts);
+device_listener_unregister(>primary_listener);
+migration_remove_notifier(>migration_state);
+} else {
+assert(n->primary_opts == NULL);
+}
+
+max_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
+for (i = 0; i < max_queue_pairs; i++) {
+virtio_net_del_queue(n, i);
+}
+/* delete also control vq */
+virtio_del_queue(vdev, max_queue_pairs * 2);
+qemu_announce_timer_del(>announce_timer, false);
+g_free(n->vqs);
+qemu_del_nic(n->nic);
+virtio_net_rsc_cleanup(n);
+g_free(n->rss_data.indirections_table);
+net_rx_pkt_uninit(n->rx_pkt);
+virtio_cleanup(vdev);
+}
+
  static void virtio_net_device_realize(DeviceState *dev, Error **errp)
  {
  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -3760,53 +3801,11 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)

  net_rx_pkt_init(>rx_pkt);

-if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
-virtio_net_load_ebpf(n);
-}
-}
-
-static void virtio_net_device_unrealize(DeviceState *dev)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(dev);
-VirtIONet *n = VIRTIO_NET(dev);
-int i, max_queue_pairs;
-
-if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
-virtio_net_unload_ebpf(n);
+if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS) &&


I disagree with this change of qemu behavior.
 From my point of view:
- this is not a major problem and it should not be a reason to stop VM execution
- it is enough to disable the RSS feature and continue working. Depending on
   other qemu parameters (number of queues, number of cpus) this might be just
   suboptimal. might be a minor problem and might be not a problem at all


The reasoning is that we shouldn't disable what the user explicitly 
requested. c.f., 
https://lore.kernel.org/all/20231102091717-mutt-send-email-...@kernel.org/



- this change defines rss as _only_ feature whose absence breaks the VM start,
   _all_ other features are dropped silently and only rss is not. Why??


I'm following what QEMU does in the other places rather than what it 
does just in virtio-net. I have pointed out virtio-gpu raises errors in 
such a situation. c.f., 
https://lore.kernel.org/all/8880b6f9-f556-46f7-a191-eeec0fe20...@daynix.com



- the series has a title 'Fixes and improvements' . This is not a fix and not an
   improvement, this is significant behavioral change that should be discussed 
in
   light of future plans regarding rss
- I suggest to remove this change from the series, submit it separately
   and discuss from all the sides


We should have already discussed about these matters; I responded all 
past replies in the previous versions months ago and had no update after 
that. Let's focus on matters that were not previously pointed out.


Regards,
Akihiko Odaki

Re: [PATCH v9 16/20] virtio-net: Do not write hashes to peer buffer

2024-04-07 Thread Yuri Benditovich

On Wed, Apr 3, 2024 at 2:12 PM Akihiko Odaki  wrote:
>
> The peer buffer is qualified with const and not meant to be modified.

IMHO, this buffer is not so 'const' (although the prototype states so),
it is allocated in net.c
btw, another procedure in this file also modifies the buffer
(work_around_broken_dhclient)

> It also prevents enabling VIRTIO_NET_F_HASH_REPORT for peers without
> virtio-net header support.

Does it mean _this commit_ prevents enabling VIRTIO_NET_F_HASH_REPORT
for peers without
virtio-net header support? Where?

> Signed-off-by: Akihiko Odaki 
> ---
>  hw/net/virtio-net.c | 36 +---
>  1 file changed, 17 insertions(+), 19 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 2de073ce18fd..ff1884564d0d 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1823,16 +1823,9 @@ static uint8_t virtio_net_get_hash_type(bool hasip4,
>  return 0xff;
>  }
>
> -static void virtio_set_packet_hash(const uint8_t *buf, uint8_t report,
> -   uint32_t hash)
> -{
> -struct virtio_net_hdr_v1_hash *hdr = (void *)buf;
> -hdr->hash_value = hash;
> -hdr->hash_report = report;
> -}
> -
>  static int virtio_net_process_rss(NetClientState *nc, const uint8_t *buf,
> -  size_t size)
> +  size_t size,
> +  struct virtio_net_hdr_v1_hash *hdr)
>  {
>  VirtIONet *n = qemu_get_nic_opaque(nc);
>  unsigned int index = nc->queue_index, new_index = index;
> @@ -1863,7 +1856,8 @@ static int virtio_net_process_rss(NetClientState *nc, 
> const uint8_t *buf,
>   n->rss_data.hash_types);
>  if (net_hash_type > NetPktRssIpV6UdpEx) {
>  if (n->rss_data.populate_hash) {
> -virtio_set_packet_hash(buf, VIRTIO_NET_HASH_REPORT_NONE, 0);
> +hdr->hash_value = VIRTIO_NET_HASH_REPORT_NONE;
> +hdr->hash_report = 0;
>  }
>  return n->rss_data.redirect ? n->rss_data.default_queue : -1;
>  }
> @@ -1871,7 +1865,8 @@ static int virtio_net_process_rss(NetClientState *nc, 
> const uint8_t *buf,
>  hash = net_rx_pkt_calc_rss_hash(pkt, net_hash_type, n->rss_data.key);
>
>  if (n->rss_data.populate_hash) {
> -virtio_set_packet_hash(buf, reports[net_hash_type], hash);
> +hdr->hash_value = hash;
> +hdr->hash_report = reports[net_hash_type];
>  }
>
>  if (n->rss_data.redirect) {
> @@ -1891,7 +1886,7 @@ static ssize_t virtio_net_receive_rcu(NetClientState 
> *nc, const uint8_t *buf,
>  VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
>  size_t lens[VIRTQUEUE_MAX_SIZE];
>  struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
> -struct virtio_net_hdr_mrg_rxbuf mhdr;
> +struct virtio_net_hdr_v1_hash extra_hdr;
>  unsigned mhdr_cnt = 0;
>  size_t offset, i, guest_offset, j;
>  ssize_t err;
> @@ -1901,7 +1896,7 @@ static ssize_t virtio_net_receive_rcu(NetClientState 
> *nc, const uint8_t *buf,
>  }
>
>  if (!no_rss && n->rss_data.enabled && n->rss_data.enabled_software_rss) {
> -int index = virtio_net_process_rss(nc, buf, size);
> +int index = virtio_net_process_rss(nc, buf, size, _hdr);
>  if (index >= 0) {
>  NetClientState *nc2 = qemu_get_subqueue(n->nic, index);
>  return virtio_net_receive_rcu(nc2, buf, size, true);
> @@ -1961,15 +1956,17 @@ static ssize_t virtio_net_receive_rcu(NetClientState 
> *nc, const uint8_t *buf,
>  if (n->mergeable_rx_bufs) {
>  mhdr_cnt = iov_copy(mhdr_sg, ARRAY_SIZE(mhdr_sg),
>  sg, elem->in_num,
> -offsetof(typeof(mhdr), num_buffers),
> -sizeof(mhdr.num_buffers));
> +offsetof(typeof(extra_hdr), 
> hdr.num_buffers),
> +sizeof(extra_hdr.hdr.num_buffers));
>  }
>
>  receive_header(n, sg, elem->in_num, buf, size);
>  if (n->rss_data.populate_hash) {
> -offset = sizeof(mhdr);
> +offset = offsetof(typeof(extra_hdr), hash_value);
>  iov_from_buf(sg, elem->in_num, offset,
> - buf + offset, n->host_hdr_len - sizeof(mhdr));
> + (char *)_hdr + offset,
> + sizeof(extra_hdr.hash_value) +
> + sizeof(extra_hdr.hash_report));
>  }
>  offset = n->host_hdr_len;
>  total += n->guest_hdr_len;
> @@ -2015,10 +2012,11 @@ static ssize_t virtio_net_receive_rcu(NetClientState 
> *nc, const uint8_t *buf,
>  }
>
>  if (mhdr_cnt) {
> -virtio_stw_p(vdev, _buffers, i);
> +virtio_stw_p(vdev, _hdr.hdr.num_buffers, i);
>  iov_from_buf(mhdr_sg,

Re: [PATCH v9 13/20] virtio-net: Return an error when vhost cannot enable RSS

2024-04-07 Thread Yuri Benditovich

On Wed, Apr 3, 2024 at 2:11 PM Akihiko Odaki  wrote:
>
> vhost requires eBPF for RSS. When eBPF is not available, virtio-net
> implicitly disables RSS even if the user explicitly requests it. Return
> an error instead of implicitly disabling RSS if RSS is requested but not
> available.
>
> Signed-off-by: Akihiko Odaki 
> ---
>  hw/net/virtio-net.c | 97 
> ++---
>  1 file changed, 48 insertions(+), 49 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 61b49e335dea..3d53eba88cfc 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -793,9 +793,6 @@ static uint64_t virtio_net_get_features(VirtIODevice 
> *vdev, uint64_t features,
>  return features;
>  }
>
> -if (!ebpf_rss_is_loaded(>ebpf_rss)) {
> -virtio_clear_feature(, VIRTIO_NET_F_RSS);
> -}
>  features = vhost_net_get_features(get_vhost_net(nc->peer), features);
>  vdev->backend_features = features;
>
> @@ -3591,6 +3588,50 @@ static bool 
> failover_hide_primary_device(DeviceListener *listener,
>  return qatomic_read(>failover_primary_hidden);
>  }
>
> +static void virtio_net_device_unrealize(DeviceState *dev)
> +{
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +VirtIONet *n = VIRTIO_NET(dev);
> +int i, max_queue_pairs;
> +
> +if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
> +virtio_net_unload_ebpf(n);
> +}
> +
> +/* This will stop vhost backend if appropriate. */
> +virtio_net_set_status(vdev, 0);
> +
> +g_free(n->netclient_name);
> +n->netclient_name = NULL;
> +g_free(n->netclient_type);
> +n->netclient_type = NULL;
> +
> +g_free(n->mac_table.macs);
> +g_free(n->vlans);
> +
> +if (n->failover) {
> +qobject_unref(n->primary_opts);
> +device_listener_unregister(>primary_listener);
> +migration_remove_notifier(>migration_state);
> +} else {
> +assert(n->primary_opts == NULL);
> +}
> +
> +max_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> +for (i = 0; i < max_queue_pairs; i++) {
> +virtio_net_del_queue(n, i);
> +}
> +/* delete also control vq */
> +virtio_del_queue(vdev, max_queue_pairs * 2);
> +qemu_announce_timer_del(>announce_timer, false);
> +g_free(n->vqs);
> +qemu_del_nic(n->nic);
> +virtio_net_rsc_cleanup(n);
> +g_free(n->rss_data.indirections_table);
> +net_rx_pkt_uninit(n->rx_pkt);
> +virtio_cleanup(vdev);
> +}
> +
>  static void virtio_net_device_realize(DeviceState *dev, Error **errp)
>  {
>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ -3760,53 +3801,11 @@ static void virtio_net_device_realize(DeviceState 
> *dev, Error **errp)
>
>  net_rx_pkt_init(>rx_pkt);
>
> -if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
> -virtio_net_load_ebpf(n);
> -}
> -}
> -
> -static void virtio_net_device_unrealize(DeviceState *dev)
> -{
> -VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> -VirtIONet *n = VIRTIO_NET(dev);
> -int i, max_queue_pairs;
> -
> -if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
> -virtio_net_unload_ebpf(n);
> +if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS) &&

I disagree with this change of qemu behavior.
>From my point of view:
- this is not a major problem and it should not be a reason to stop VM execution
- it is enough to disable the RSS feature and continue working. Depending on
  other qemu parameters (number of queues, number of cpus) this might be just
  suboptimal. might be a minor problem and might be not a problem at all
- this change defines rss as _only_ feature whose absence breaks the VM start,
  _all_ other features are dropped silently and only rss is not. Why??
- the series has a title 'Fixes and improvements' . This is not a fix and not an
  improvement, this is significant behavioral change that should be discussed in
  light of future plans regarding rss
- I suggest to remove this change from the series, submit it separately
  and discuss from all the sides




> +!virtio_net_load_ebpf(n) && get_vhost_net(nc->peer)) {
> +virtio_net_device_unrealize(dev);
> +error_setg(errp, "Can't load eBPF RSS for vhost");
>  }
> -
> -/* This will stop vhost backend if appropriate. */
> -virtio_net_set_status(vdev, 0);
> -
> -g_free(n->netclient_name);
> -n->netclient_name = NULL;
> -g_free(n->netclient_type);
> -n->netclient_type = NULL;
> -
> -g_free(n->mac_table.macs);
> -g_free(n->vlans);
> -
> -if (n->failover) {
> -qobject_unref(n->primary_opts);
> -device_listener_unregister(>primary_listener);
> -migration_remove_notifier(>migration_state);
> -} else {
> -assert(n->primary_opts == NULL);
> -}
> -
> -max_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> -for (i = 0; i < max_queue_pairs; i++) {
> -

Re: [PATCH] target/sh4: add missing CHECK_NOT_DELAY_SLOT

2024-04-07 Thread Richard Henderson


On 4/7/24 05:07, Zack Buhman wrote:

CHECK_NOT_DELAY_SLOT is correctly applied to the branch-related
instructions, but not to the PC-relative mov* instructions.

I verified the existence of an illegal slot exception on a SH7091 when
any of these instructions are attempted inside a delay slot.

This also matches the behavior described in the SH-4 ISA manual.

Signed-off-by: Zack Buhman
---
  target/sh4/translate.c | 3 +++
  1 file changed, 3 insertions(+)


Reviewed-by: Richard Henderson 

r~

[PULL 2/2] MAINTAINERS: Adjust migration documentation files

2024-04-07 Thread peterx

From: Avihai Horon 

Commit 8cb2f8b172e7 ("docs/migration: Create migration/ directory")
changed migration documentation file structure but forgot to update the
entries in the MAINTAINERS file.

Commit 4c6f8a79ae53 ("docs/migration: Split 'dirty limit'") extracted
dirty limit documentation to a new file without updating dirty limit
section in MAINTAINERS file.

Fix the above.

Fixes: 8cb2f8b172e7 ("docs/migration: Create migration/ directory")
Fixes: 4c6f8a79ae53 ("docs/migration: Split 'dirty limit'")
Signed-off-by: Avihai Horon 
Link: https://lore.kernel.org/r/20240407081125.13951-1-avih...@nvidia.com
Signed-off-by: Peter Xu 
---
 MAINTAINERS | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index e71183eef9..d3fc2a06e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2170,7 +2170,7 @@ S: Supported
 F: hw/vfio/*
 F: include/hw/vfio/
 F: docs/igd-assign.txt
-F: docs/devel/vfio-migration.rst
+F: docs/devel/migration/vfio.rst
 
 vfio-ccw
 M: Eric Farman 
@@ -2231,6 +2231,7 @@ F: qapi/virtio.json
 F: net/vhost-user.c
 F: include/hw/virtio/
 F: docs/devel/virtio*
+F: docs/devel/migration/virtio.rst
 
 virtio-balloon
 M: Michael S. Tsirkin 
@@ -3422,7 +3423,7 @@ F: migration/
 F: scripts/vmstate-static-checker.py
 F: tests/vmstate-static-checker-data/
 F: tests/qtest/migration-test.c
-F: docs/devel/migration.rst
+F: docs/devel/migration/
 F: qapi/migration.json
 F: tests/migration/
 F: util/userfaultfd.c
@@ -3442,6 +3443,7 @@ F: include/sysemu/dirtylimit.h
 F: migration/dirtyrate.c
 F: migration/dirtyrate.h
 F: include/sysemu/dirtyrate.h
+F: docs/devel/migration/dirty-limit.rst
 
 Detached LUKS header
 M: Hyman Huang 
-- 
2.44.0

[PULL 0/2] Migration 20240407 patches

2024-04-07 Thread peterx

From: Peter Xu 

The following changes since commit ce64e6224affb8b4e4b019f76d2950270b391af5:

  Merge tag 'qemu-sparc-20240404' of https://github.com/mcayland/qemu into 
staging (2024-04-04 15:28:06 +0100)

are available in the Git repository at:

  https://gitlab.com/peterx/qemu.git tags/migration-20240407-pull-request

for you to fetch changes up to 8e0b21e375f0f6e6dbaeaecc1d52e2220f163e40:

  MAINTAINERS: Adjust migration documentation files (2024-04-07 14:40:55 -0400)


Migration pull for 9.0-rc3

- Wei/Lei's fix on a rare postcopy race that can hang the channel (since 8.0)
- Avihai's fix on maintainers file, points to the right doc links



Avihai Horon (1):
  MAINTAINERS: Adjust migration documentation files

Wei Wang (1):
  migration/postcopy: ensure preempt channel is ready before loading
states

 MAINTAINERS|  6 --
 migration/savevm.c | 21 +
 2 files changed, 25 insertions(+), 2 deletions(-)

-- 
2.44.0

[PULL 1/2] migration/postcopy: ensure preempt channel is ready before loading states

2024-04-07 Thread peterx

From: Wei Wang 

Before loading the guest states, ensure that the preempt channel has been
ready to use, as some of the states (e.g. via virtio_load) might trigger
page faults that will be handled through the preempt channel. So yield to
the main thread in the case that the channel create event hasn't been
dispatched.

Cc: qemu-stable 
Fixes: 9358982744 ("migration: Send requested page directly in rp-return 
thread")
Originally-by: Lei Wang 
Link: 
https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced...@intel.com/T/
Signed-off-by: Lei Wang 
Signed-off-by: Wei Wang 
Link: https://lore.kernel.org/r/20240405034056.23933-1-wei.w.w...@intel.com
[peterx: add a todo section, add Fixes and copy stable for 8.0+]
Signed-off-by: Peter Xu 
---
 migration/savevm.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/migration/savevm.c b/migration/savevm.c
index 388d7af7cd..e7c1215671 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2342,6 +2342,27 @@ static int 
loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
 
 QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
 
+/*
+ * Before loading the guest states, ensure that the preempt channel has
+ * been ready to use, as some of the states (e.g. via virtio_load) might
+ * trigger page faults that will be handled through the preempt channel.
+ * So yield to the main thread in the case that the channel create event
+ * hasn't been dispatched.
+ *
+ * TODO: if we can move migration loadvm out of main thread, then we
+ * won't block main thread from polling the accept() fds.  We can drop
+ * this as a whole when that is done.
+ */
+do {
+if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
+mis->postcopy_qemufile_dst) {
+break;
+}
+
+aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
+qemu_coroutine_yield();
+} while (1);
+
 ret = qemu_loadvm_state_main(packf, mis);
 trace_loadvm_handle_cmd_packaged_main(ret);
 qemu_fclose(packf);
-- 
2.44.0

Re: [PATCH for-9.0] MAINTAINERS: Adjust migration documentation files

2024-04-07 Thread Peter Xu

On Sun, Apr 07, 2024 at 11:11:25AM +0300, Avihai Horon wrote:
> Commit 8cb2f8b172e7 ("docs/migration: Create migration/ directory")
> changed migration documentation file structure but forgot to update the
> entries in the MAINTAINERS file.
> 
> Commit 4c6f8a79ae53 ("docs/migration: Split 'dirty limit'") extracted
> dirty limit documentation to a new file without updating dirty limit
> section in MAINTAINERS file.
> 
> Fix the above.
> 
> Fixes: 8cb2f8b172e7 ("docs/migration: Create migration/ directory")
> Fixes: 4c6f8a79ae53 ("docs/migration: Split 'dirty limit'")
> Signed-off-by: Avihai Horon 

queued for rc3, thanks.

-- 
Peter Xu

Re: [PATCH net v4] virtio_net: Do not send RSS key if it is not supported

2024-04-07 Thread patchwork-bot+netdevbpf

Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller :

On Wed,  3 Apr 2024 08:43:12 -0700 you wrote:
> There is a bug when setting the RSS options in virtio_net that can break
> the whole machine, getting the kernel into an infinite loop.
> 
> Running the following command in any QEMU virtual machine with virtionet
> will reproduce this problem:
> 
> # ethtool -X eth0  hfunc toeplitz
> 
> [...]

Here is the summary with links:
  - [net,v4] virtio_net: Do not send RSS key if it is not supported
https://git.kernel.org/netdev/net/c/059a49aa2e25

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

[PATCH] target/sh4: add missing CHECK_NOT_DELAY_SLOT

2024-04-07 Thread Zack Buhman

CHECK_NOT_DELAY_SLOT is correctly applied to the branch-related
instructions, but not to the PC-relative mov* instructions.

I verified the existence of an illegal slot exception on a SH7091 when
any of these instructions are attempted inside a delay slot.

This also matches the behavior described in the SH-4 ISA manual.

Signed-off-by: Zack Buhman 
---
 target/sh4/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 6643c14dde..ebb6c901bf 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -523,6 +523,7 @@ static void _decode_opc(DisasContext * ctx)
 tcg_gen_movi_i32(REG(B11_8), B7_0s);
 return;
 case 0x9000: /* mov.w @(disp,PC),Rn */
+CHECK_NOT_DELAY_SLOT
 {
 TCGv addr = tcg_constant_i32(ctx->base.pc_next + 4 + B7_0 * 2);
 tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
@@ -530,6 +531,7 @@ static void _decode_opc(DisasContext * ctx)
 }
 return;
 case 0xd000: /* mov.l @(disp,PC),Rn */
+CHECK_NOT_DELAY_SLOT
 {
 TCGv addr = tcg_constant_i32((ctx->base.pc_next + 4 + B7_0 * 4) & 
~3);
 tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
@@ -1236,6 +1238,7 @@ static void _decode_opc(DisasContext * ctx)
 }
 return;
 case 0xc700: /* mova @(disp,PC),R0 */
+CHECK_NOT_DELAY_SLOT
 tcg_gen_movi_i32(REG(0), ((ctx->base.pc_next & 0xfffc) +
   4 + B7_0 * 4) & ~3);
 return;
-- 
2.41.0

Thanks! Reply: [PATCH]vdpa-dev: Fix the issue of device status not updating when configuration interruption is triggered

2024-04-07 Thread Gavin Liu

Thanks

- Original Message -
From: Jason Wang jasow...@redhat.com
Sent: April 7, 2024 11:46 AM
To: Gavin Liu gavin@jaguarmicro.com
Cc: epere...@redhat.com; sgarz...@redhat.com; m...@redhat.com; 
qemu-sta...@nongnu.org; qemu-devel@nongnu.org; kw...@redhat.com
Subject: Re: [PATCH] vdpa-dev: Fix the issue of device status not updating when 
configuration interruption is triggered
External Mail: This email originated from OUTSIDE of the organization!
Do not click links, open attachments or provide ANY information unless you 
recognize the sender and know the content is safe.


On Sun, Apr 7, 2024 at 11:22 AM lyx634449800  wrote:
>
> The set_config callback function vhost_vdpa_device_get_config in 
> vdpa-dev does not fetch the current device status from the hardware 
> device, causing the GUEST OS to not receive the latest device status

nit: no need for upper case here.

> information.
>
> The hardware updates the config status of the vdpa device and then 
> notifies the OS. The GUEST OS receives an interrupt notification, 
> triggering a get_config access in the kernel, which then enters qemu 
> internally. Ultimately, the vhost_vdpa_device_get_config function of 
> vdpa-dev is called
>
> One scenario encountered is when the device needs to bring down the 
> vdpa net device. After modifying the status field of virtio_net_config 
> in the hardware, it sends an interrupt notification. However, the 
> guest OS always receives the STATUS field as VIRTIO_NET_S_LINK_UP.
>
> Signed-off-by: Yuxue Liu 

This aligns with the vhost-net support for vDPA.

Acked-by: Jason Wang 

Thanks

> ---
>  hw/virtio/vdpa-dev.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c index 
> 13e87f06f6..64b96b226c 100644
> --- a/hw/virtio/vdpa-dev.c
> +++ b/hw/virtio/vdpa-dev.c
> @@ -195,7 +195,14 @@ static void
>  vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)  {
>  VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> +int ret;
>
> +ret = vhost_dev_get_config(>dev, s->config, s->config_size,
> +NULL);
> +if (ret < 0) {
> +error_report("get device config space failed");
> +return;
> +}
>  memcpy(config, s->config, s->config_size);
>  }
>
> --
> 2.43.0
>

[PATCH 1/2] Fix typo to allow migrate_qmp_fail command with 'channels' argument

2024-04-07 Thread Het Gala

Fixes: (tests/qtest/migration: Add negative tests to validate migration QAPIs)
Signed-off-by: Het Gala 
---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index d03a655f83..584d7c496f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1724,7 +1724,7 @@ static void test_precopy_common(MigrateCommon *args)
 }
 
 if (args->result == MIG_TEST_QMP_ERROR) {
-migrate_qmp_fail(from, args->connect_uri, args->connect_uri, "{}");
+migrate_qmp_fail(from, args->connect_uri, args->connect_channels, 
"{}");
 goto finish;
 }
 
-- 
2.22.3

[PATCH 0/2] Fix: qtest/migration: Improve multifd_tcp_channels_none test

2024-04-07 Thread Het Gala

With the introduction of new patchset to have 'channels' as the start
argument of migrate QAPIs instead of 'uri' (tests/qtest/migration: Add
tests for introducing 'channels' argument in migrate QAPIs), a few minor
typos got went unnoticed, which were caught while trying to introduce
similar qtests in migration.
Fix multifd_tcp_channels_none qtest to actually utilise 'channels' arg
in migrate QAPIs

This patchset is built on top of (tests/qtest/migration: Add tests for
introducing 'channels' argument in migrate QAPIs)

Het Gala (2):
  Fix typo to allow migrate_qmp_fail command with 'channels' argument
  Call args->connect_channels to actually test multifd_tcp_channels_none
qtest

 tests/qtest/migration-helpers.c | 2 --
 tests/qtest/migration-test.c| 4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

-- 
2.22.3

[PATCH 2/2] Call args->connect_channels to actually test multifd_tcp_channels_none qtest

2024-04-07 Thread Het Gala

Earlier, without args->connect_channels, multifd_tcp_channels_none would
call uri internally even though connect_channels was introduced in
function definition. To actually call 'migrate' QAPI with modified syntax,
args->connect_channels need to be passed.
Double free happens while setting correct migration ports. Fix that.

Fixes: (tests/qtest/migration: Add multifd_tcp_plain test using list of
channels instead of uri)
Signed-off-by: Het Gala 
---
 tests/qtest/migration-helpers.c | 2 --
 tests/qtest/migration-test.c| 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index b2a90469fb..b1d06187ab 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -146,8 +146,6 @@ static void migrate_set_ports(QTestState *to, QList 
*channel_list)
 qdict_put_str(addrdict, "port", addr_port);
 }
 }
-
-qobject_unref(addr);
 }
 
 bool migrate_watch_for_events(QTestState *who, const char *name,
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 584d7c496f..5d6d8cd634 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1728,7 +1728,7 @@ static void test_precopy_common(MigrateCommon *args)
 goto finish;
 }
 
-migrate_qmp(from, to, args->connect_uri, NULL, "{}");
+migrate_qmp(from, to, args->connect_uri, args->connect_channels, "{}");
 
 if (args->result != MIG_TEST_SUCCEED) {
 bool allow_active = args->result == MIG_TEST_FAIL;
-- 
2.22.3

[PATCH] vdpa-dev: Fix the issue of device status not updating when configuration interruption is triggered

2024-04-07 Thread lyx634449800

The set_config callback function vhost_vdpa_device_get_config in
vdpa-dev does not fetch the current device status from the hardware
device, causing the GUEST OS to not receive the latest device status
information.

The hardware updates the config status of the vdpa device and then
notifies the OS. The GUEST OS receives an interrupt notification,
triggering a get_config access in the kernel, which then enters qemu
internally. Ultimately, the vhost_vdpa_device_get_config function of
vdpa-dev is called

One scenario encountered is when the device needs to bring down the
vdpa net device. After modifying the status field of virtio_net_config
in the hardware, it sends an interrupt notification. However, the guest
OS always receives the STATUS field as VIRTIO_NET_S_LINK_UP.

Signed-off-by: Yuxue Liu 
---
 hw/virtio/vdpa-dev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 13e87f06f6..64b96b226c 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -195,7 +195,14 @@ static void
 vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
 {
 VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+int ret;
 
+ret = vhost_dev_get_config(>dev, s->config, s->config_size,
+NULL);
+if (ret < 0) {
+error_report("get device config space failed");
+return;
+}
 memcpy(config, s->config, s->config_size);
 }
 
-- 
2.43.0

[PATCH] hw/intc/riscv_aplic: APLICs should add child earlier than realize

2024-04-07 Thread yang.zhang

From: "yang.zhang" 

Since only root APLICs can have hw IRQ lines, aplic->parent should
be initialized first.

Signed-off-by: yang.zhang 
---
 hw/intc/riscv_aplic.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
index fc5df0d598..32edd6d07b 100644
--- a/hw/intc/riscv_aplic.c
+++ b/hw/intc/riscv_aplic.c
@@ -1000,16 +1000,16 @@ DeviceState *riscv_aplic_create(hwaddr addr, hwaddr 
size,
 qdev_prop_set_bit(dev, "msimode", msimode);
 qdev_prop_set_bit(dev, "mmode", mmode);
 
+if (parent) {
+riscv_aplic_add_child(parent, dev);
+}
+
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal);
 
 if (!is_kvm_aia(msimode)) {
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, addr);
 }
 
-if (parent) {
-riscv_aplic_add_child(parent, dev);
-}
-
 if (!msimode) {
 for (i = 0; i < num_harts; i++) {
 CPUState *cpu = cpu_by_arch_id(hartid_base + i);
-- 
2.25.1

Re: [PATCH 1/1] virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR

2024-04-07 Thread Michael S. Tsirkin

On Sun, Apr 07, 2024 at 12:19:57PM +0800, Jason Wang wrote:
> On Tue, Apr 2, 2024 at 11:02 PM Cindy Lu  wrote:
> >
> > When the guest calls virtio_stop and then virtio_reset,
> 
> Guests could not call those functions directly, it is triggered by for
> example writing to some of the registers like reset or others.
> 
> > the vector will change
> > to VIRTIO_NO_VECTOR and the IRQFD for this vector will be released. After 
> > that
> > If you want to change the vector back,
> 
> What do you mean by "change the vector back"? Something like
> 
> assign VIRTIO_MSI_NO_VECTOR to vector 0
> assign X to vector 0
> 
> And I guess what you meant is to configure the vector after DRIVER_OK.
> 
> 
> > it will cause a crash.
> >
> > To fix this, we need to call the function "kvm_virtio_pci_vector_use_one()"
> > when the vector changes back from VIRTIO_NO_VECTOR
> >
> > Signed-off-by: Cindy Lu 
> > ---
> >  hw/virtio/virtio-pci.c | 31 ---
> >  1 file changed, 28 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index e433879542..45f3ab38c3 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -874,12 +874,14 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy 
> > *proxy, int queue_no,
> >  return 0;
> >  }
> >
> > -static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > queue_no)
> > +static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > queue_no,
> > + bool recovery)
> >  {
> >  unsigned int vector;
> >  int ret;
> >  EventNotifier *n;
> >  PCIDevice *dev = >pci_dev;
> > +VirtIOIRQFD *irqfd;
> >  VirtIODevice *vdev = virtio_bus_get_device(>bus);
> >  VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> >
> > @@ -890,10 +892,21 @@ static int 
> > kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no)
> >  if (vector >= msix_nr_vectors_allocated(dev)) {
> >  return 0;
> >  }
> > +/*
> > + * if this is recovery and irqfd still in use, means the irqfd was not
> > + * release before and don't need to set up again
> > + */
> > +if (recovery) {
> > +irqfd = >vector_irqfd[vector];
> > +if (irqfd->users != 0) {
> > +return 0;
> > +}
> > +}
> >  ret = kvm_virtio_pci_vq_vector_use(proxy, vector);
> >  if (ret < 0) {
> >  goto undo;
> >  }
> > +
> >  /*
> >   * If guest supports masking, set up irqfd now.
> >   * Otherwise, delay until unmasked in the frontend.
> > @@ -932,14 +945,14 @@ static int 
> > kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs)
> >  if (!virtio_queue_get_num(vdev, queue_no)) {
> >  return -1;
> >  }
> > -ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
> > +ret = kvm_virtio_pci_vector_use_one(proxy, queue_no, false);
> >  }
> >  return ret;
> >  }
> >
> >  static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
> >  {
> > -return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> > +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > false);
> >  }
> >
> >  static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
> > @@ -1570,7 +1583,13 @@ static void virtio_pci_common_write(void *opaque, 
> > hwaddr addr,
> >  } else {
> >  val = VIRTIO_NO_VECTOR;
> >  }
> > +vector = vdev->config_vector;
> >  vdev->config_vector = val;
> > +/*check if the vector need to recovery*/
> > +if ((val != VIRTIO_NO_VECTOR) && (vector == VIRTIO_NO_VECTOR) &&
> > +(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > true);
> > +}
> 
> This looks too tricky.
> 
> Think hard of this. I think it's better to split this into two parts:
> 
> 1) a series that disables config irqfd for vhost-net, this series
> needs to be backported to -stable which needs to be conservative. It
> looks more like your V1, but let's add a boolean for pci proxy.

I don't get it. Looks like a huge change to do in stable.
All as a replacement to a small 20 line patch?

Generally I think irqfd is best used everywhere.


> 2) a series that deal with the msix vector configuration after
> driver_ok, we probably need some refactoring to do per vq use instead
> of the current loop in DRIVER_OK
> 
> Does this make sense?
> 
> Thanks


Not really let's fix the bug for starters, refactoring can be done later
as appropriate.

> >  break;
> >  case VIRTIO_PCI_COMMON_STATUS:
> >  if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > @@ -1611,6 +1630,12 @@ static void virtio_pci_common_write(void *opaque, 
> > hwaddr addr,
> >  val = VIRTIO_NO_VECTOR;
> >  }
> >  virtio_queue_set_vector(vdev, vdev->queue_sel, val);
> > +

Re: [RFC QEMU PATCH v8 2/2] virtio-pci: implement No_Soft_Reset bit

2024-04-07 Thread Michael S. Tsirkin

On Sun, Apr 07, 2024 at 11:20:57AM +0800, Jason Wang wrote:
> On Tue, Apr 2, 2024 at 11:03 AM Chen, Jiqian  wrote:
> >
> > On 2024/3/29 18:44, Michael S. Tsirkin wrote:
> > > On Fri, Mar 29, 2024 at 03:20:59PM +0800, Jason Wang wrote:
> > >> On Fri, Mar 29, 2024 at 3:07 PM Chen, Jiqian  wrote:
> > >>>
> > >>> On 2024/3/28 20:36, Michael S. Tsirkin wrote:
> > >>> +}
> > >>> +
> > >>>  static void virtio_pci_bus_reset_hold(Object *obj)
> > >>>  {
> > >>>  PCIDevice *dev = PCI_DEVICE(obj);
> > >>>  DeviceState *qdev = DEVICE(obj);
> > >>>
> > >>> +if (virtio_pci_no_soft_reset(dev)) {
> > >>> +return;
> > >>> +}
> > >>> +
> > >>>  virtio_pci_reset(qdev);
> > >>>
> > >>>  if (pci_is_express(dev)) {
> > >>> @@ -2484,6 +2511,8 @@ static Property virtio_pci_properties[] = {
> > >>>  VIRTIO_PCI_FLAG_INIT_LNKCTL_BIT, true),
> > >>>  DEFINE_PROP_BIT("x-pcie-pm-init", VirtIOPCIProxy, flags,
> > >>>  VIRTIO_PCI_FLAG_INIT_PM_BIT, true),
> > >>> +DEFINE_PROP_BIT("x-pcie-pm-no-soft-reset", VirtIOPCIProxy, 
> > >>> flags,
> > >>> +VIRTIO_PCI_FLAG_PM_NO_SOFT_RESET_BIT, false),
> > >>
> > >> Why does it come with an x prefix?
> > >>
> > >>>  DEFINE_PROP_BIT("x-pcie-flr-init", VirtIOPCIProxy, flags,
> > >>>  VIRTIO_PCI_FLAG_INIT_FLR_BIT, true),
> > >>>  DEFINE_PROP_BIT("aer", VirtIOPCIProxy, flags,
> > >>
> > >> I am a bit confused about this part.
> > >> Do you want to make this software controllable?
> > > Yes, because even the real hardware, this bit is not always set.
> > >>
> > >> We are talking about emulated devices here.
> > >>
> > 
> >  So which virtio devices should and which should not set this bit?
> > >>> This depends on the scenario the virtio-device is used, if we want to 
> > >>> trigger an internal soft reset for the virtio-device during S3, this 
> > >>> bit shouldn't be set.
> > >>
> > >> If the device doesn't need reset, why bother the driver for this?
> > >>
> > >> Btw, no_soft_reset is insufficient for some cases, there's a proposal
> > >> for the virtio-spec. I think we need to wait until it is done.
> > >
> > > That seems orthogonal or did I miss something?
> > Yes, I looked the detail of the proposal, I also think they are unrelated.
> 
> The point is the proposal said
> 
> """
> Without a mechanism to
> suspend/resume virtio devices when the driver is suspended/resumed in
> the early phase of suspend/late phase of resume, there is a window where
> interrupts can be lost.
> """
> 
> It looks safe to enable it with the suspend bit. Or if you think it's
> wrong, please comment on the virtio spec patch.
> 
> > I will set the default value of No_Soft_Reset bit to true in next version 
> > according to your opinion.
> > About the compatibility of old machine types, which types should I 
> > consider? Does the same as x-pcie-pm-init(hw_compat_2_8)?
> > Forgive me for not knowing much about compatibility.
> 
> "x" means no compatibility at all, please drop the "x" prefix. And it
> looks more safe to start as "false" by default.
> 
> Thanks


Not sure I agree. External flags are for when users want to tweak them.
When would users want it to be off?
What is done here is I feel sane, just add machine compat machinery
to change to off for old machine types.


> > >
> > >>> In my use case on my environment, I don't want to reset virtio-gpu 
> > >>> during S3,
> > >>> because once the display resources are destroyed, there are not enough 
> > >>> information to re-create them, so this bit should be set.
> > >>> Making this bit software controllable is convenient for users to take 
> > >>> their own choices.
> > >>
> > >> Thanks
> > >>
> > >>>
> > 
> > >> Or should this be set to true by default and then
> > >> changed to false for old machine types?
> > > How can I do so?
> > > Do you mean set this to true by default, and if old machine types 
> > > don't need this bit, they can pass false config to qemu when running 
> > > qemu?
> > 
> >  No, you would use compat machinery. See how is x-pcie-flr-init handled.
> > 
> > 
> > >>>
> > >>> --
> > >>> Best regards,
> > >>> Jiqian Chen.
> > >
> >
> > --
> > Best regards,
> > Jiqian Chen.

[PATCH 0/1] qemu-options.hx: Fix typo for interleave-granularity of CFMW

2024-04-07 Thread Yuquan Wang

This patch fixes the unit typo of interleave-granularity of
CXL Fixed Memory Window in qemu-option.hx.

Yuquan Wang (1):
  qemu-options.hx: Fix typo for interleave-granularity of CFMW

 qemu-options.hx | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.34.1

[PATCH 1/1] qemu-options.hx: Fix typo for interleave-granularity of CFMW

2024-04-07 Thread Yuquan Wang

This patch fixes the unit typo of interleave-granularity of
CXL Fixed Memory Window in qemu-option.hx.

Signed-off-by: Yuquan Wang wangyuquan1...@phytium.com.cn
---
 qemu-options.hx | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 7fd1713fa8..e1b272d51a 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -151,14 +151,14 @@ SRST
 platform and configuration dependent.
 
 ``interleave-granularity=granularity`` sets the granularity of
-interleave. Default 256KiB. Only 256KiB, 512KiB, 1024KiB, 2048KiB
-4096KiB, 8192KiB and 16384KiB granularities supported.
+interleave. Default 256(bytes). Only 256, 512, 1k, 2k
+4k, 8k and 16k granularities supported.
 
 Example:
 
 ::
 
--machine 
cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.targets.1=cxl.1,cxl-fmw.0.size=128G,cxl-fmw.0.interleave-granularity=512k
+-machine 
cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.targets.1=cxl.1,cxl-fmw.0.size=128G,cxl-fmw.0.interleave-granularity=512
 ERST
 
 DEF("M", HAS_ARG, QEMU_OPTION_M,
-- 
2.34.1

[PATCH v13 09/24] target/arm: Handle PSTATE.ALLINT on taking an exception

2024-04-07 Thread Jinjie Ruan via

Set or clear PSTATE.ALLINT on taking an exception to ELx according to the
SCTLR_ELx.SPINTMASK bit.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v9:
- Not check SCTLR_NMI in arm_cpu_do_interrupt_aarch64().
v3:
- Add Reviewed-by.
---
 target/arm/helper.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0e7eefd7e5..65f2ddfa56 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11729,6 +11729,14 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
 }
 }
 
+if (cpu_isar_feature(aa64_nmi, cpu)) {
+if (!(env->cp15.sctlr_el[new_el] & SCTLR_SPINTMASK)) {
+new_mode |= PSTATE_ALLINT;
+} else {
+new_mode &= ~PSTATE_ALLINT;
+}
+}
+
 pstate_write(env, PSTATE_DAIF | new_mode);
 env->aarch64 = true;
 aarch64_restore_sp(env, new_el);
-- 
2.34.1

[PATCH v13 22/24] hw/intc/arm_gicv3: Report the VINMI interrupt

2024-04-07 Thread Jinjie Ruan via

In vCPU Interface, if the vIRQ has the non-maskable property, report
vINMI to the corresponding vPE.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v12:
- Do not check nmi_support repetitively.
- Add Reviewed-by.
v10:
- Update the commit message, superpriority -> non-maskable.
v9:
- Update the commit subject and message, vNMI -> vINMI.
v6:
- Add Reviewed-by.
---
 hw/intc/arm_gicv3_cpuif.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 60e2d9ec9c..0de7c41f96 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -480,6 +480,7 @@ void gicv3_cpuif_virt_irq_fiq_update(GICv3CPUState *cs)
 int idx;
 int irqlevel = 0;
 int fiqlevel = 0;
+int nmilevel = 0;
 
 idx = hppvi_index(cs);
 trace_gicv3_cpuif_virt_update(gicv3_redist_affid(cs), idx,
@@ -497,9 +498,17 @@ void gicv3_cpuif_virt_irq_fiq_update(GICv3CPUState *cs)
 uint64_t lr = cs->ich_lr_el2[idx];
 
 if (icv_hppi_can_preempt(cs, lr)) {
-/* Virtual interrupts are simple: G0 are always FIQ, and G1 IRQ */
+/*
+ * Virtual interrupts are simple: G0 are always FIQ, and G1 are
+ * IRQ or NMI which depends on the ICH_LR_EL2.NMI to have
+ * non-maskable property.
+ */
 if (lr & ICH_LR_EL2_GROUP) {
-irqlevel = 1;
+if (lr & ICH_LR_EL2_NMI) {
+nmilevel = 1;
+} else {
+irqlevel = 1;
+}
 } else {
 fiqlevel = 1;
 }
@@ -509,6 +518,7 @@ void gicv3_cpuif_virt_irq_fiq_update(GICv3CPUState *cs)
 trace_gicv3_cpuif_virt_set_irqs(gicv3_redist_affid(cs), fiqlevel, 
irqlevel);
 qemu_set_irq(cs->parent_vfiq, fiqlevel);
 qemu_set_irq(cs->parent_virq, irqlevel);
+qemu_set_irq(cs->parent_vnmi, nmilevel);
 }
 
 static void gicv3_cpuif_virt_update(GICv3CPUState *cs)
-- 
2.34.1

[PATCH v13 11/24] hw/arm/virt: Wire NMI and VINMI irq lines from GIC to CPU

2024-04-07 Thread Jinjie Ruan via

Wire the new NMI and VINMI interrupt line from the GIC to each CPU if it
is not GICv2.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
---
v13:
- Adjust to after "hw/intc/arm_gicv3: Add external IRQ lines for NMI" to fix
  the unexpected error with patchseries at this point.
- Only connect these up if vms->gic_version is not VIRT_GIC_VERSION_2 to fix
  the gic-version=2 unexpected error.
v9:
- Rename ARM_CPU_VNMI to ARM_CPU_VINMI.
- Update the commit message.
v4:
- Add Reviewed-by.
v3:
- Also add VNMI wire.
---
 hw/arm/virt.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a9a913aead..dca509d082 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -821,7 +821,8 @@ static void create_gic(VirtMachineState *vms, MemoryRegion 
*mem)
 
 /* Wire the outputs from each CPU's generic timer and the GICv3
  * maintenance interrupt signal to the appropriate GIC PPI inputs,
- * and the GIC's IRQ/FIQ/VIRQ/VFIQ interrupt outputs to the CPU's inputs.
+ * and the GIC's IRQ/FIQ/VIRQ/VFIQ/NMI/VINMI interrupt outputs to the
+ * CPU's inputs.
  */
 for (i = 0; i < smp_cpus; i++) {
 DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
@@ -865,6 +866,13 @@ static void create_gic(VirtMachineState *vms, MemoryRegion 
*mem)
qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
 sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
+
+if (vms->gic_version != VIRT_GIC_VERSION_2) {
+sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus,
+   qdev_get_gpio_in(cpudev, ARM_CPU_NMI));
+sysbus_connect_irq(gicbusdev, i + 5 * smp_cpus,
+   qdev_get_gpio_in(cpudev, ARM_CPU_VINMI));
+}
 }
 
 fdt_add_gic_node(vms);
-- 
2.34.1

[PATCH v13 19/24] hw/intc/arm_gicv3: Handle icv_nmiar1_read() for icc_nmiar1_read()

2024-04-07 Thread Jinjie Ruan via

Implement icv_nmiar1_read() for icc_nmiar1_read(), so add definition for
ICH_LR_EL2.NMI and ICH_AP1R_EL2.NMI bit.

If FEAT_GICv3_NMI is supported, ich_ap_write() should consider ICV_AP1R_EL1.NMI
bit. In icv_activate_irq() and icv_eoir_write(), the ICV_AP1R_EL1.NMI bit
should be set or clear according to the Non-maskable property. And the RPR
priority should also update the NMI bit according to the APR priority NMI bit.

By the way, add gicv3_icv_nmiar1_read trace event.

If the hpp irq is a NMI, the icv iar read should return 1022 and trap for
NMI again

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Enforce RES0 bit in NMI field when FEAT_GICv3_NMI is not implemented for
  ICH_LR_EL2.
- Add Reviewed-by.
v12:
- When NMI is 1, the virtual interrupt's priority is 0x0.
- Make the thisnmi logic more concisely in hppvi_index().
- Use is_nmi to simplify the code and check is_nmi before comparing vpmr.
- Remove redundant nmi_support check in ich_highest_active_virt_prio(),
  hppvi_index(), icv_hppi_can_preempt(), icv_rpr_read() and icv_activate_irq().
- Also check sctlrx.NMI in icv_iar_read().
- Check icv_hppi_can_preempt() for icv_nmiar1_read().
- Check ICH_LR_EL2.NMI after check icv_hppi_can_preempt() as icv_iar_read()
  do it in icv_nmiar1_read().
- Fix the comment style in icv_nmiar1_read().
- Correct thisnmi to bool in icv_eoir_write().
- Check thisnmi and nmi both true instead of identical in icv_eoir_write().
v11:
- Deal with NMI in the callers instead of ich_highest_active_virt_prio().
- Set either NMI or a group-priority bit, not both.
- Only set AP NMI bits in the 0 reg.
- Handle NMI in hppvi_index(), icv_hppi_can_preempt() and icv_eoir_write().
v10:
- Rename ICH_AP1R_EL2_NMI to ICV_AP1R_EL1_NMI.
- Add ICV_RPR_EL1_NMI definition.
- Set ICV_RPR_EL1.NMI according to the ICV_AP1R_EL1.NMI in
  ich_highest_active_virt_prio().
v9:
- Correct the INTID_NMI logic.
v8:
- Fix an unexpected interrupt bug when sending VNMI by running qemu VM.
v7:
- Add Reviewed-by.
v6:
- Implement icv_nmiar1_read().
---
 hw/intc/arm_gicv3_cpuif.c | 105 +-
 hw/intc/gicv3_internal.h  |   4 ++
 hw/intc/trace-events  |   1 +
 3 files changed, 98 insertions(+), 12 deletions(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index a5a1ef93ca..dff88c4283 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -157,6 +157,10 @@ static int ich_highest_active_virt_prio(GICv3CPUState *cs)
 int i;
 int aprmax = ich_num_aprs(cs);
 
+if (cs->ich_apr[GICV3_G1NS][0] & ICV_AP1R_EL1_NMI) {
+return 0x0;
+}
+
 for (i = 0; i < aprmax; i++) {
 uint32_t apr = cs->ich_apr[GICV3_G0][i] |
 cs->ich_apr[GICV3_G1NS][i];
@@ -191,6 +195,7 @@ static int hppvi_index(GICv3CPUState *cs)
  * correct behaviour.
  */
 int prio = 0xff;
+bool nmi = false;
 
 if (!(cs->ich_vmcr_el2 & (ICH_VMCR_EL2_VENG0 | ICH_VMCR_EL2_VENG1))) {
 /* Both groups disabled, definitely nothing to do */
@@ -199,6 +204,7 @@ static int hppvi_index(GICv3CPUState *cs)
 
 for (i = 0; i < cs->num_list_regs; i++) {
 uint64_t lr = cs->ich_lr_el2[i];
+bool thisnmi;
 int thisprio;
 
 if (ich_lr_state(lr) != ICH_LR_EL2_STATE_PENDING) {
@@ -217,10 +223,12 @@ static int hppvi_index(GICv3CPUState *cs)
 }
 }
 
+thisnmi = lr & ICH_LR_EL2_NMI;
 thisprio = ich_lr_prio(lr);
 
-if (thisprio < prio) {
+if ((thisprio < prio) || ((thisprio == prio) && (thisnmi & (!nmi {
 prio = thisprio;
+nmi = thisnmi;
 idx = i;
 }
 }
@@ -289,6 +297,7 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, 
uint64_t lr)
  * equivalent of these checks.
  */
 int grp;
+bool is_nmi;
 uint32_t mask, prio, rprio, vpmr;
 
 if (!(cs->ich_hcr_el2 & ICH_HCR_EL2_EN)) {
@@ -301,10 +310,11 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, 
uint64_t lr)
  */
 
 prio = ich_lr_prio(lr);
+is_nmi = lr & ICH_LR_EL2_NMI;
 vpmr = extract64(cs->ich_vmcr_el2, ICH_VMCR_EL2_VPMR_SHIFT,
  ICH_VMCR_EL2_VPMR_LENGTH);
 
-if (prio >= vpmr) {
+if (!is_nmi && prio >= vpmr) {
 /* Priority mask masks this interrupt */
 return false;
 }
@@ -326,6 +336,11 @@ static bool icv_hppi_can_preempt(GICv3CPUState *cs, 
uint64_t lr)
 return true;
 }
 
+if ((prio & mask) == (rprio & mask) && is_nmi &&
+!(cs->ich_apr[GICV3_G1NS][0] & ICV_AP1R_EL1_NMI)) {
+return true;
+}
+
 return false;
 }
 
@@ -550,7 +565,11 @@ static void icv_ap_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 
 trace_gicv3_icv_ap_write(ri->crm & 1, regno, gicv3_redist_affid(cs), 
value);
 
-cs->ich_apr[grp][regno] = value & 0xU;
+if (cs->gic->nmi_support) {
+cs->ich_apr[grp][regno] = value & (0xU |

[PATCH v13 13/24] hw/intc/arm_gicv3: Add has-nmi property to GICv3 device

2024-04-07 Thread Jinjie Ruan via

Add a property has-nmi to the GICv3 device, and use this to set
the NMI bit in the GICD_TYPER register. This isn't visible to
guests yet because the property defaults to false and we won't
set it in the board code until we've landed all of the changes
needed to implement FEAT_GICV3_NMI.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v12:
- Update the subject and commit message.
- Add Reviewed-by.
v10:
- Adjust to before add irq non-maskable property.
v4:
- Add Reviewed-by.
---
 hw/intc/arm_gicv3_common.c | 1 +
 hw/intc/arm_gicv3_dist.c   | 2 ++
 hw/intc/gicv3_internal.h   | 1 +
 include/hw/intc/arm_gicv3_common.h | 1 +
 4 files changed, 5 insertions(+)

diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index c52f060026..2d2cea6858 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -569,6 +569,7 @@ static Property arm_gicv3_common_properties[] = {
 DEFINE_PROP_UINT32("num-irq", GICv3State, num_irq, 32),
 DEFINE_PROP_UINT32("revision", GICv3State, revision, 3),
 DEFINE_PROP_BOOL("has-lpi", GICv3State, lpi_enable, 0),
+DEFINE_PROP_BOOL("has-nmi", GICv3State, nmi_support, 0),
 DEFINE_PROP_BOOL("has-security-extensions", GICv3State, security_extn, 0),
 /*
  * Compatibility property: force 8 bits of physical priority, even
diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c
index 35e850685c..22ddc0d666 100644
--- a/hw/intc/arm_gicv3_dist.c
+++ b/hw/intc/arm_gicv3_dist.c
@@ -389,6 +389,7 @@ static bool gicd_readl(GICv3State *s, hwaddr offset,
  *  by GICD_TYPER.IDbits)
  * MBIS == 0 (message-based SPIs not supported)
  * SecurityExtn == 1 if security extns supported
+ * NMI = 1 if Non-maskable interrupt property is supported
  * CPUNumber == 0 since for us ARE is always 1
  * ITLinesNumber == (((max SPI IntID + 1) / 32) - 1)
  */
@@ -402,6 +403,7 @@ static bool gicd_readl(GICv3State *s, hwaddr offset,
 bool dvis = s->revision >= 4;
 
 *data = (1 << 25) | (1 << 24) | (dvis << 18) | (sec_extn << 10) |
+(s->nmi_support << GICD_TYPER_NMI_SHIFT) |
 (s->lpi_enable << GICD_TYPER_LPIS_SHIFT) |
 (0xf << 19) | itlinesnumber;
 return true;
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 29d5cdc1b6..8f4ebed2f4 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -68,6 +68,7 @@
 #define GICD_CTLR_E1NWF (1U << 7)
 #define GICD_CTLR_RWP   (1U << 31)
 
+#define GICD_TYPER_NMI_SHIFT   9
 #define GICD_TYPER_LPIS_SHIFT  17
 
 /* 16 bits EventId */
diff --git a/include/hw/intc/arm_gicv3_common.h 
b/include/hw/intc/arm_gicv3_common.h
index 7324c7d983..4358c5319c 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -249,6 +249,7 @@ struct GICv3State {
 uint32_t num_irq;
 uint32_t revision;
 bool lpi_enable;
+bool nmi_support;
 bool security_extn;
 bool force_8bit_prio;
 bool irq_reset_nonsecure;
-- 
2.34.1

[PATCH v13 23/24] target/arm: Add FEAT_NMI to max

2024-04-07 Thread Jinjie Ruan via

Enable FEAT_NMI on the 'max' CPU.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v12:
- Add Reviewed-by.
v3:
- Add Reviewed-by.
- Sorted to last.
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/tcg/cpu64.c| 1 +
 2 files changed, 2 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 2a7bbb82dc..a9ae7ede9f 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -64,6 +64,7 @@ the following architecture extensions:
 - FEAT_MTE (Memory Tagging Extension)
 - FEAT_MTE2 (Memory Tagging Extension)
 - FEAT_MTE3 (MTE Asymmetric Fault Handling)
+- FEAT_NMI (Non-maskable Interrupt)
 - FEAT_NV (Nested Virtualization)
 - FEAT_NV2 (Enhanced nested virtualization support)
 - FEAT_PACIMP (Pointer authentication - IMPLEMENTATION DEFINED algorithm)
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 9f7a9f3d2c..62c4663512 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1175,6 +1175,7 @@ void aarch64_max_tcg_initfn(Object *obj)
 t = FIELD_DP64(t, ID_AA64PFR1, RAS_FRAC, 0);  /* FEAT_RASv1p1 + 
FEAT_DoubleFault */
 t = FIELD_DP64(t, ID_AA64PFR1, SME, 1);   /* FEAT_SME */
 t = FIELD_DP64(t, ID_AA64PFR1, CSV2_FRAC, 0); /* FEAT_CSV2_2 */
+t = FIELD_DP64(t, ID_AA64PFR1, NMI, 1);   /* FEAT_NMI */
 cpu->isar.id_aa64pfr1 = t;
 
 t = cpu->isar.id_aa64mmfr0;
-- 
2.34.1

[PATCH v13 10/24] hw/intc/arm_gicv3: Add external IRQ lines for NMI

2024-04-07 Thread Jinjie Ruan via

Augment the GICv3's QOM device interface by adding one
new set of sysbus IRQ line, to signal NMI to each CPU.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v11：
- Add new Reviewed-by.
v4:
- Add Reviewed-by.
v3:
- Add support for VNMI.
---
 hw/intc/arm_gicv3_common.c | 6 ++
 include/hw/intc/arm_gic_common.h   | 2 ++
 include/hw/intc/arm_gicv3_common.h | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index cb55c72681..c52f060026 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -299,6 +299,12 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, 
qemu_irq_handler handler,
 for (i = 0; i < s->num_cpu; i++) {
 sysbus_init_irq(sbd, >cpu[i].parent_vfiq);
 }
+for (i = 0; i < s->num_cpu; i++) {
+sysbus_init_irq(sbd, >cpu[i].parent_nmi);
+}
+for (i = 0; i < s->num_cpu; i++) {
+sysbus_init_irq(sbd, >cpu[i].parent_vnmi);
+}
 
 memory_region_init_io(>iomem_dist, OBJECT(s), ops, s,
   "gicv3_dist", 0x1);
diff --git a/include/hw/intc/arm_gic_common.h b/include/hw/intc/arm_gic_common.h
index 7080375008..97fea4102d 100644
--- a/include/hw/intc/arm_gic_common.h
+++ b/include/hw/intc/arm_gic_common.h
@@ -71,6 +71,8 @@ struct GICState {
 qemu_irq parent_fiq[GIC_NCPU];
 qemu_irq parent_virq[GIC_NCPU];
 qemu_irq parent_vfiq[GIC_NCPU];
+qemu_irq parent_nmi[GIC_NCPU];
+qemu_irq parent_vnmi[GIC_NCPU];
 qemu_irq maintenance_irq[GIC_NCPU];
 
 /* GICD_CTLR; for a GIC with the security extensions the NS banked version
diff --git a/include/hw/intc/arm_gicv3_common.h 
b/include/hw/intc/arm_gicv3_common.h
index 4e2fb518e7..7324c7d983 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -155,6 +155,8 @@ struct GICv3CPUState {
 qemu_irq parent_fiq;
 qemu_irq parent_virq;
 qemu_irq parent_vfiq;
+qemu_irq parent_nmi;
+qemu_irq parent_vnmi;
 
 /* Redistributor */
 uint32_t level;  /* Current IRQ level */
-- 
2.34.1

[PATCH v13 04/24] target/arm: Implement ALLINT MSR (immediate)

2024-04-07 Thread Jinjie Ruan via

Add ALLINT MSR (immediate) to decodetree, in which the CRm is 0b000x. The
EL0 check is necessary to ALLINT, and the EL1 check is necessary when
imm == 1. So implement it inline for EL2/3, or EL1 with imm==0. Avoid the
unconditional write to pc and use raise_exception_ra to unwind.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v10:
- Correct the exception_target_el(env) to 2, since it is a hypervisor trap
  from EL1 to EL2.
v7:
- Add Reviewed-by.
v6:
- Fix DISAS_TOO_MANY to DISAS_UPDATE_EXIT and add the comment.
v5:
- Drop the & 1 in trans_MSR_i_ALLINT().
- Simplify and merge msr_i_allint() and allint_check().
- Rename msr_i_allint() to msr_set_allint_el1().
v4:
- Fix the ALLINT MSR (immediate) decodetree implementation.
- Remove arm_is_el2_enabled() check in allint_check().
- Update env->allint to env->pstate.
- Only call allint_check() when imm == 1.
- Simplify the allint_check() to not pass "op" and extract.
- Implement it inline for EL2/3, or EL1 with imm==0.
- Pass (a->imm & 1) * PSTATE_ALLINT (i64) to simplfy the ALLINT set/clear.
v3:
- Remove EL0 check in allint_check().
- Add TALLINT check for EL1 in allint_check().
- Remove unnecessarily arm_rebuild_hflags() in msr_i_allint helper.
---
 target/arm/tcg/a64.decode  |  1 +
 target/arm/tcg/helper-a64.c| 12 
 target/arm/tcg/helper-a64.h|  1 +
 target/arm/tcg/translate-a64.c | 19 +++
 4 files changed, 33 insertions(+)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 8a20dce3c8..0e7656fd15 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -207,6 +207,7 @@ MSR_i_DIT   1101 0101  0 011 0100  010 1 
@msr_i
 MSR_i_TCO   1101 0101  0 011 0100  100 1 @msr_i
 MSR_i_DAIFSET   1101 0101  0 011 0100  110 1 @msr_i
 MSR_i_DAIFCLEAR 1101 0101  0 011 0100  111 1 @msr_i
+MSR_i_ALLINT1101 0101  0 001 0100 000 imm:1 000 1
 MSR_i_SVCR  1101 0101  0 011 0100 0 mask:2 imm:1 011 1
 
 # MRS, MSR (register), SYS, SYSL. These are all essentially the
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
index 29f3ef274a..0ea8668ab4 100644
--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -66,6 +66,18 @@ void HELPER(msr_i_spsel)(CPUARMState *env, uint32_t imm)
 update_spsel(env, imm);
 }
 
+void HELPER(msr_set_allint_el1)(CPUARMState *env)
+{
+/* ALLINT update to PSTATE. */
+if (arm_hcrx_el2_eff(env) & HCRX_TALLINT) {
+raise_exception_ra(env, EXCP_UDEF,
+   syn_aa64_sysregtrap(0, 1, 0, 4, 1, 0x1f, 0), 2,
+   GETPC());
+}
+
+env->pstate |= PSTATE_ALLINT;
+}
+
 static void daif_check(CPUARMState *env, uint32_t op,
uint32_t imm, uintptr_t ra)
 {
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index 575a5dab7d..0518165399 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -22,6 +22,7 @@ DEF_HELPER_FLAGS_1(rbit64, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_2(msr_i_spsel, void, env, i32)
 DEF_HELPER_2(msr_i_daifset, void, env, i32)
 DEF_HELPER_2(msr_i_daifclear, void, env, i32)
+DEF_HELPER_1(msr_set_allint_el1, void, env)
 DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, ptr)
 DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, ptr)
 DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 340265beb0..21758b290d 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -2036,6 +2036,25 @@ static bool trans_MSR_i_DAIFCLEAR(DisasContext *s, arg_i 
*a)
 return true;
 }
 
+static bool trans_MSR_i_ALLINT(DisasContext *s, arg_i *a)
+{
+if (!dc_isar_feature(aa64_nmi, s) || s->current_el == 0) {
+return false;
+}
+
+if (a->imm == 0) {
+clear_pstate_bits(PSTATE_ALLINT);
+} else if (s->current_el > 1) {
+set_pstate_bits(PSTATE_ALLINT);
+} else {
+gen_helper_msr_set_allint_el1(tcg_env);
+}
+
+/* Exit the cpu loop to re-evaluate pending IRQs. */
+s->base.is_jmp = DISAS_UPDATE_EXIT;
+return true;
+}
+
 static bool trans_MSR_i_SVCR(DisasContext *s, arg_MSR_i_SVCR *a)
 {
 if (!dc_isar_feature(aa64_sme, s) || a->mask == 0) {
-- 
2.34.1

[PATCH v13 15/24] hw/intc/arm_gicv3: Add irq non-maskable property

2024-04-07 Thread Jinjie Ruan via

A SPI, PPI or SGI interrupt can have non-maskable property. So maintain
non-maskable property in PendingIrq and GICR/GICD. Since add new device
state, it also needs to be migrated, so also save NMI info in
vmstate_gicv3_cpu and vmstate_gicv3.

Signed-off-by: Jinjie Ruan 
Acked-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v12:
- nmi_needed -> gicv3_cpu_nmi_needed.
- needed_nmi -> gicv3_nmi_needed.
- Add Reviewed-by.
v11:
- Put vmstate_gicv3_cpu_nmi and vmstate_gicv3_gicd_nmi into existing list.
- Remove the excess != 0.
v10:
- superprio -> nmi, gicr_isuperprio -> gicr_inmir0.
- Save NMI state in vmstate_gicv3_cpu and vmstate_gicv3.
- Update the commit message.
v3:
- Place this ahead of implement GICR_INMIR.
- Add Acked-by.
---
 hw/intc/arm_gicv3_common.c | 38 ++
 include/hw/intc/arm_gicv3_common.h |  4 
 2 files changed, 42 insertions(+)

diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 2d2cea6858..9810558b07 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -164,6 +164,24 @@ const VMStateDescription vmstate_gicv3_gicv4 = {
 }
 };
 
+static bool gicv3_cpu_nmi_needed(void *opaque)
+{
+GICv3CPUState *cs = opaque;
+
+return cs->gic->nmi_support;
+}
+
+static const VMStateDescription vmstate_gicv3_cpu_nmi = {
+.name = "arm_gicv3_cpu/nmi",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = gicv3_cpu_nmi_needed,
+.fields = (const VMStateField[]) {
+VMSTATE_UINT32(gicr_inmir0, GICv3CPUState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_gicv3_cpu = {
 .name = "arm_gicv3_cpu",
 .version_id = 1,
@@ -196,6 +214,7 @@ static const VMStateDescription vmstate_gicv3_cpu = {
 _gicv3_cpu_virt,
 _gicv3_cpu_sre_el1,
 _gicv3_gicv4,
+_gicv3_cpu_nmi,
 NULL
 }
 };
@@ -238,6 +257,24 @@ const VMStateDescription 
vmstate_gicv3_gicd_no_migration_shift_bug = {
 }
 };
 
+static bool gicv3_nmi_needed(void *opaque)
+{
+GICv3State *cs = opaque;
+
+return cs->nmi_support;
+}
+
+const VMStateDescription vmstate_gicv3_gicd_nmi = {
+.name = "arm_gicv3/gicd_nmi",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = gicv3_nmi_needed,
+.fields = (const VMStateField[]) {
+VMSTATE_UINT32_ARRAY(nmi, GICv3State, GICV3_BMP_SIZE),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_gicv3 = {
 .name = "arm_gicv3",
 .version_id = 1,
@@ -266,6 +303,7 @@ static const VMStateDescription vmstate_gicv3 = {
 },
 .subsections = (const VMStateDescription * const []) {
 _gicv3_gicd_no_migration_shift_bug,
+_gicv3_gicd_nmi,
 NULL
 }
 };
diff --git a/include/hw/intc/arm_gicv3_common.h 
b/include/hw/intc/arm_gicv3_common.h
index 4358c5319c..88533749eb 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -146,6 +146,7 @@ typedef struct {
 int irq;
 uint8_t prio;
 int grp;
+bool nmi;
 } PendingIrq;
 
 struct GICv3CPUState {
@@ -172,6 +173,7 @@ struct GICv3CPUState {
 uint32_t gicr_ienabler0;
 uint32_t gicr_ipendr0;
 uint32_t gicr_iactiver0;
+uint32_t gicr_inmir0;
 uint32_t edge_trigger; /* ICFGR0 and ICFGR1 even bits */
 uint32_t gicr_igrpmodr0;
 uint32_t gicr_nsacr;
@@ -275,6 +277,7 @@ struct GICv3State {
 GIC_DECLARE_BITMAP(active);   /* GICD_ISACTIVER */
 GIC_DECLARE_BITMAP(level);/* Current level */
 GIC_DECLARE_BITMAP(edge_trigger); /* GICD_ICFGR even bits */
+GIC_DECLARE_BITMAP(nmi);  /* GICD_INMIR */
 uint8_t gicd_ipriority[GICV3_MAXIRQ];
 uint64_t gicd_irouter[GICV3_MAXIRQ];
 /* Cached information: pointer to the cpu i/f for the CPUs specified
@@ -314,6 +317,7 @@ GICV3_BITMAP_ACCESSORS(pending)
 GICV3_BITMAP_ACCESSORS(active)
 GICV3_BITMAP_ACCESSORS(level)
 GICV3_BITMAP_ACCESSORS(edge_trigger)
+GICV3_BITMAP_ACCESSORS(nmi)
 
 #define TYPE_ARM_GICV3_COMMON "arm-gicv3-common"
 typedef struct ARMGICv3CommonClass ARMGICv3CommonClass;
-- 
2.34.1

[PATCH v13 20/24] hw/intc/arm_gicv3: Implement NMI interrupt priority

2024-04-07 Thread Jinjie Ruan via

If GICD_CTLR_DS bit is zero and the NMI is non-secure, the NMI priority is
higher than 0x80, otherwise it is higher than 0x0. And save the interrupt
non-maskable property in hppi.nmi to deliver NMI exception. Since both GICR
and GICD can deliver NMI, it is both necessary to check whether the pending
irq is NMI in gicv3_redist_update_noirqset and gicv3_update_noirqset.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Swap the order of the irq and prio args in gicv3_get_priority() to make
  input before output.
- Update the commit message.
v12:
- Fix the typo, "prioirty" -> "priority"
- Update the commit message, hppi.superprio -> hppi.nmi
  super priority -> non-maskable property.
- Add Reviewed-by.
v10:
- has_superprio -> nmi.
- superpriority -> non-maskable property.
- gicr_isuperprio -> gicr_inmir0.
- superprio -> nmi.
v8:
- Add Reviewed-by.
v7:
- Reorder the irqbetter() code for clarity.
- Eliminate the has_superprio local variable for gicv3_get_priority().
- false -> cs->hpplpi.superprio in gicv3_redist_update_noirqset().
- 0x0 -> false in arm_gicv3_common_reset_hold().
- Clear superprio in several places for hppi, hpplpi and hppvlpi.
v6:
- Put the "extract superprio info" logic into gicv3_get_priority().
- Update the comment in irqbetter().
- Reset the cs->hppi.superprio to 0x0.
- Set hppi.superprio to false for LPI.
v4:
- Replace is_nmi with has_superprio to not a mix NMI and superpriority.
- Update the comment in irqbetter().
- Extract gicv3_get_priority() to avoid code repeat.
---
v3:
- Add missing brace
---
 hw/intc/arm_gicv3.c| 67 +-
 hw/intc/arm_gicv3_common.c |  3 ++
 hw/intc/arm_gicv3_redist.c |  3 ++
 3 files changed, 64 insertions(+), 9 deletions(-)

diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
index 0b8f79a122..58e18fff54 100644
--- a/hw/intc/arm_gicv3.c
+++ b/hw/intc/arm_gicv3.c
@@ -21,7 +21,7 @@
 #include "hw/intc/arm_gicv3.h"
 #include "gicv3_internal.h"
 
-static bool irqbetter(GICv3CPUState *cs, int irq, uint8_t prio)
+static bool irqbetter(GICv3CPUState *cs, int irq, uint8_t prio, bool nmi)
 {
 /* Return true if this IRQ at this priority should take
  * precedence over the current recorded highest priority
@@ -30,14 +30,23 @@ static bool irqbetter(GICv3CPUState *cs, int irq, uint8_t 
prio)
  * is the same as this one (a property which the calling code
  * relies on).
  */
-if (prio < cs->hppi.prio) {
-return true;
+if (prio != cs->hppi.prio) {
+return prio < cs->hppi.prio;
+}
+
+/*
+ * The same priority IRQ with non-maskable property should signal to
+ * the CPU as it have the priority higher than the labelled 0x80 or 0x00.
+ */
+if (nmi != cs->hppi.nmi) {
+return nmi;
 }
+
 /* If multiple pending interrupts have the same priority then it is an
  * IMPDEF choice which of them to signal to the CPU. We choose to
  * signal the one with the lowest interrupt number.
  */
-if (prio == cs->hppi.prio && irq <= cs->hppi.irq) {
+if (irq <= cs->hppi.irq) {
 return true;
 }
 return false;
@@ -129,6 +138,40 @@ static uint32_t gicr_int_pending(GICv3CPUState *cs)
 return pend;
 }
 
+static bool gicv3_get_priority(GICv3CPUState *cs, bool is_redist, int irq,
+   uint8_t *prio)
+{
+uint32_t nmi = 0x0;
+
+if (is_redist) {
+nmi = extract32(cs->gicr_inmir0, irq, 1);
+} else {
+nmi = *gic_bmp_ptr32(cs->gic->nmi, irq);
+nmi = nmi & (1 << (irq & 0x1f));
+}
+
+if (nmi) {
+/* DS = 0 & Non-secure NMI */
+if (!(cs->gic->gicd_ctlr & GICD_CTLR_DS) &&
+((is_redist && extract32(cs->gicr_igroupr0, irq, 1)) ||
+ (!is_redist && gicv3_gicd_group_test(cs->gic, irq {
+*prio = 0x80;
+} else {
+*prio = 0x0;
+}
+
+return true;
+}
+
+if (is_redist) {
+*prio = cs->gicr_ipriorityr[irq];
+} else {
+*prio = cs->gic->gicd_ipriority[irq];
+}
+
+return false;
+}
+
 /* Update the interrupt status after state in a redistributor
  * or CPU interface has changed, but don't tell the CPU i/f.
  */
@@ -141,6 +184,7 @@ static void gicv3_redist_update_noirqset(GICv3CPUState *cs)
 uint8_t prio;
 int i;
 uint32_t pend;
+bool nmi = false;
 
 /* Find out which redistributor interrupts are eligible to be
  * signaled to the CPU interface.
@@ -152,10 +196,11 @@ static void gicv3_redist_update_noirqset(GICv3CPUState 
*cs)
 if (!(pend & (1 << i))) {
 continue;
 }
-prio = cs->gicr_ipriorityr[i];
-if (irqbetter(cs, i, prio)) {
+nmi = gicv3_get_priority(cs, true, i, );
+if (irqbetter(cs, i, prio, nmi)) {
 cs->hppi.irq = i;
 cs->hppi.prio = prio;
+cs->hppi.nmi = nmi;

[PATCH v13 18/24] hw/intc/arm_gicv3: Add NMI handling CPU interface registers

2024-04-07 Thread Jinjie Ruan via

Add the NMIAR CPU interface registers which deal with acknowledging NMI.

When introduce NMI interrupt, there are some updates to the semantics for the
register ICC_IAR1_EL1 and ICC_HPPIR1_EL1. For ICC_IAR1_EL1 register, it
should return 1022 if the intid has non-maskable property. And for
ICC_NMIAR1_EL1 register, it should return 1023 if the intid do not have
non-maskable property. Howerever, these are not necessary for ICC_HPPIR1_EL1
register.

And the APR and RPR has NMI bits which should be handled correctly.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v12:
- pPriority<63> = ICC_AP1R_EL1NS<63> if HaveNMIExt() and HaveEL(EL3) and
  (IsNonSecure(), fix the wrong writing.
- Do not check nmi_support repetitively in icc_hppi_can_preempt()
  and icc_activate_irq().
- Check hppi.nmi after check icc_hppi_can_preempt() for icc_iar1_read() and
  icc_nmiar1_read().
v11:
- Handle NMI priority in icc_highest_active_prio() and handle NMI RPR in
  icc_rpr_read() separately.
- Only set NMI bit for a NMI and and ordinary priority bit for a non-NMI in
  icc_activate_irq().
- Only clear APR bit for AP1R0 in icc_drop_prio().
- Check special INTID_* in callers instead of passing two extra boolean args
  for ack functions.
- Handle NMI in icc_hppi_can_preempt() and icc_highest_active_group().
- Also check icc_hppi_can_preempt() for icc_nmiar1_read().
v10:
- is_nmi -> nmi.
- is_hppi -> hppi.
- Exchange the order of nmi and hppi parameters.
- superprio -> nmi.
- Handle APR and RPR NMI bits.
- Update the commit message, super priority -> non-maskable property.
v7:
- Add Reviewed-by.
v4:
- Define ICC_NMIAR1_EL1 only if FEAT_GICv3_NMI is implemented.
- Check sctrl_elx.SCTLR_NMI to return 1022 for icc_iar1_read().
- Add gicv3_icc_nmiar1_read() trace event.
- Do not check icc_hppi_can_preempt() for icc_nmiar1_read().
- Add icv_nmiar1_read() and call it when EL2Enabled() and HCR_EL2.IMO == '1'
---
 hw/intc/arm_gicv3_cpuif.c | 137 --
 hw/intc/gicv3_internal.h  |   5 ++
 hw/intc/trace-events  |   1 +
 3 files changed, 138 insertions(+), 5 deletions(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index e1a60d8c15..a5a1ef93ca 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -795,6 +795,13 @@ static uint64_t icv_iar_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 return intid;
 }
 
+static uint64_t icv_nmiar1_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+/* todo */
+uint64_t intid = INTID_SPURIOUS;
+return intid;
+}
+
 static uint32_t icc_fullprio_mask(GICv3CPUState *cs)
 {
 /*
@@ -832,6 +839,23 @@ static int icc_highest_active_prio(GICv3CPUState *cs)
  */
 int i;
 
+if (cs->gic->nmi_support) {
+/*
+ * If an NMI is active this takes precedence over anything else
+ * for priority purposes; the NMI bit is only in the AP1R0 bit.
+ * We return here the effective priority of the NMI, which is
+ * either 0x0 or 0x80. Callers will need to check NMI again for
+ * purposes of either setting the RPR register bits or for
+ * prioritization of NMI vs non-NMI.
+ */
+if (cs->icc_apr[GICV3_G1][0] & ICC_AP1R_EL1_NMI) {
+return 0;
+}
+if (cs->icc_apr[GICV3_G1NS][0] & ICC_AP1R_EL1_NMI) {
+return (cs->gic->gicd_ctlr & GICD_CTLR_DS) ? 0 : 0x80;
+}
+}
+
 for (i = 0; i < icc_num_aprs(cs); i++) {
 uint32_t apr = cs->icc_apr[GICV3_G0][i] |
 cs->icc_apr[GICV3_G1][i] | cs->icc_apr[GICV3_G1NS][i];
@@ -898,12 +922,24 @@ static bool icc_hppi_can_preempt(GICv3CPUState *cs)
  */
 int rprio;
 uint32_t mask;
+ARMCPU *cpu = ARM_CPU(cs->cpu);
+CPUARMState *env = >env;
 
 if (icc_no_enabled_hppi(cs)) {
 return false;
 }
 
-if (cs->hppi.prio >= cs->icc_pmr_el1) {
+if (cs->hppi.nmi) {
+if (!(cs->gic->gicd_ctlr & GICD_CTLR_DS) &&
+cs->hppi.grp == GICV3_G1NS) {
+if (cs->icc_pmr_el1 < 0x80) {
+return false;
+}
+if (arm_is_secure(env) && cs->icc_pmr_el1 == 0x80) {
+return false;
+}
+}
+} else if (cs->hppi.prio >= cs->icc_pmr_el1) {
 /* Priority mask masks this interrupt */
 return false;
 }
@@ -923,6 +959,12 @@ static bool icc_hppi_can_preempt(GICv3CPUState *cs)
 return true;
 }
 
+if (cs->hppi.nmi && (cs->hppi.prio & mask) == (rprio & mask)) {
+if (!(cs->icc_apr[cs->hppi.grp][0] & ICC_AP1R_EL1_NMI)) {
+return true;
+}
+}
+
 return false;
 }
 
@@ -1044,8 +1086,13 @@ static void icc_activate_irq(GICv3CPUState *cs, int irq)
 int aprbit = prio >> (8 - cs->prebits);
 int regno = aprbit / 32;
 int regbit = aprbit % 32;
+bool nmi = cs->hppi.nmi;
 
-cs->icc_apr[cs->hppi.grp][regno] |= (1 << regbit);

[PATCH v13 05/24] target/arm: Support MSR access to ALLINT

2024-04-07 Thread Jinjie Ruan via

Support ALLINT msr access as follow:
mrs , ALLINT// read allint
msr ALLINT, // write allint with imm

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v9:
- Move nmi_reginfo and related functions inside an existing ifdef
  TARGET_AARCH64 to solve the --target-list=aarch64-softmmu,arm-softmmu
  compilation problem.
- Check 'isread' when writing to ALLINT.
v5:
- Add Reviewed-by.
v4:
- Remove arm_is_el2_enabled() check in allint_check().
- Change to env->pstate instead of env->allint.
v3:
- Remove EL0 check in aa64_allint_access() which alreay checks in .access
  PL1_RW.
- Use arm_hcrx_el2_eff() in aa64_allint_access() instead of env->cp15.hcrx_el2.
- Make ALLINT msr access function controlled by aa64_nmi.
---
 target/arm/helper.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 408922c94d..5ed3eacbea 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7496,6 +7496,37 @@ static const ARMCPRegInfo rme_mte_reginfo[] = {
   .opc0 = 1, .opc1 = 6, .crn = 7, .crm = 14, .opc2 = 5,
   .access = PL3_W, .type = ARM_CP_NOP },
 };
+
+static void aa64_allint_write(CPUARMState *env, const ARMCPRegInfo *ri,
+  uint64_t value)
+{
+env->pstate = (env->pstate & ~PSTATE_ALLINT) | (value & PSTATE_ALLINT);
+}
+
+static uint64_t aa64_allint_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+return env->pstate & PSTATE_ALLINT;
+}
+
+static CPAccessResult aa64_allint_access(CPUARMState *env,
+ const ARMCPRegInfo *ri, bool isread)
+{
+if (!isread && arm_current_el(env) == 1 &&
+(arm_hcrx_el2_eff(env) & HCRX_TALLINT)) {
+return CP_ACCESS_TRAP_EL2;
+}
+return CP_ACCESS_OK;
+}
+
+static const ARMCPRegInfo nmi_reginfo[] = {
+{ .name = "ALLINT", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .opc2 = 0, .crn = 4, .crm = 3,
+  .type = ARM_CP_NO_RAW,
+  .access = PL1_RW, .accessfn = aa64_allint_access,
+  .fieldoffset = offsetof(CPUARMState, pstate),
+  .writefn = aa64_allint_write, .readfn = aa64_allint_read,
+  .resetfn = arm_cp_reset_ignore },
+};
 #endif /* TARGET_AARCH64 */
 
 static void define_pmu_regs(ARMCPU *cpu)
@@ -9890,6 +9921,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 if (cpu_isar_feature(aa64_nv2, cpu)) {
 define_arm_cp_regs(cpu, nv2_reginfo);
 }
+
+if (cpu_isar_feature(aa64_nmi, cpu)) {
+define_arm_cp_regs(cpu, nmi_reginfo);
+}
 #endif
 
 if (cpu_isar_feature(any_predinv, cpu)) {
-- 
2.34.1

[PATCH v13 03/24] target/arm: Add support for FEAT_NMI, Non-maskable Interrupt

2024-04-07 Thread Jinjie Ruan via

Add support for FEAT_NMI. NMI (FEAT_NMI) is an mandatory feature in
ARMv8.8-A and ARM v9.3-A.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v3:
- Add Reviewed-by.
- Adjust to before the MSR patches.
---
 target/arm/internals.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index dd3da211a3..516e0584bf 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1229,6 +1229,9 @@ static inline uint32_t aarch64_pstate_valid_mask(const 
ARMISARegisters *id)
 if (isar_feature_aa64_mte(id)) {
 valid |= PSTATE_TCO;
 }
+if (isar_feature_aa64_nmi(id)) {
+valid |= PSTATE_ALLINT;
+}
 
 return valid;
 }
-- 
2.34.1

[PATCH v13 08/24] target/arm: Handle IS/FS in ISR_EL1 for NMI, VINMI and VFNMI

2024-04-07 Thread Jinjie Ruan via

Add IS and FS bit in ISR_EL1 and handle the read. With CPU_INTERRUPT_NMI or
CPU_INTERRUPT_VINMI, both CPSR_I and ISR_IS must be set. With
CPU_INTERRUPT_VFNMI, both CPSR_F and ISR_FS must be set.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v9:
- CPU_INTERRUPT_VNMI -> CPU_INTERRUPT_VINMI.
- Handle CPSR_F and ISR_FS according to CPU_INTERRUPT_VFNMI instead of
  CPU_INTERRUPT_VFIQ and HCRX_EL2.VFNMI.
- Update the commit message.
v7:
- env->cp15.hcrx_el2 -> arm_hcrx_el2_eff().
- Add Reviewed-by.
v6:
- Verify that HCR_EL2.VF is set before checking VFNMI.
v4；
- Also handle VNMI.
v3:
- CPU_INTERRUPT_NMI do not set FIQ, so remove it.
- With CPU_INTERRUPT_NMI, both CPSR_I and ISR_IS must be set.
---
 target/arm/cpu.h|  2 ++
 target/arm/helper.c | 13 +
 2 files changed, 15 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 08a6bc50de..97997dbd08 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1398,6 +1398,8 @@ void pmu_init(ARMCPU *cpu);
 #define CPSR_N (1U << 31)
 #define CPSR_NZCV (CPSR_N | CPSR_Z | CPSR_C | CPSR_V)
 #define CPSR_AIF (CPSR_A | CPSR_I | CPSR_F)
+#define ISR_FS (1U << 9)
+#define ISR_IS (1U << 10)
 
 #define CPSR_IT (CPSR_IT_0_1 | CPSR_IT_2_7)
 #define CACHED_CPSR_BITS (CPSR_T | CPSR_AIF | CPSR_GE | CPSR_IT | CPSR_Q \
diff --git a/target/arm/helper.c b/target/arm/helper.c
index d9814433e1..0e7eefd7e5 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2021,16 +2021,29 @@ static uint64_t isr_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 if (cs->interrupt_request & CPU_INTERRUPT_VIRQ) {
 ret |= CPSR_I;
 }
+if (cs->interrupt_request & CPU_INTERRUPT_VINMI) {
+ret |= ISR_IS;
+ret |= CPSR_I;
+}
 } else {
 if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
 ret |= CPSR_I;
 }
+
+if (cs->interrupt_request & CPU_INTERRUPT_NMI) {
+ret |= ISR_IS;
+ret |= CPSR_I;
+}
 }
 
 if (hcr_el2 & HCR_FMO) {
 if (cs->interrupt_request & CPU_INTERRUPT_VFIQ) {
 ret |= CPSR_F;
 }
+if (cs->interrupt_request & CPU_INTERRUPT_VFNMI) {
+ret |= ISR_FS;
+ret |= CPSR_F;
+}
 } else {
 if (cs->interrupt_request & CPU_INTERRUPT_FIQ) {
 ret |= CPSR_F;
-- 
2.34.1

[PATCH v13 12/24] target/arm: Handle NMI in arm_cpu_do_interrupt_aarch64()

2024-04-07 Thread Jinjie Ruan via

According to Arm GIC section 4.6.3 Interrupt superpriority, the interrupt
with superpriority is always IRQ, never FIQ, so the NMI exception trap entry
behave like IRQ. And VINMI(vIRQ with Superpriority) can be raised from the
GIC or come from the hcrx_el2.HCRX_VINMI bit, VFNMI(vFIQ with Superpriority)
come from the hcrx_el2.HCRX_VFNMI bit.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v9:
- Update the commit message.
- Handle VINMI and VFNMI.
v7:
- Add Reviewed-by.
v6:
- Not combine VFNMI with CPU_INTERRUPT_VNMI.
v4:
- Also handle VNMI in arm_cpu_do_interrupt_aarch64().
v3:
- Remove the FIQ NMI handle.
---
 target/arm/helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 65f2ddfa56..0455f20ccc 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11649,10 +11649,13 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
 break;
 case EXCP_IRQ:
 case EXCP_VIRQ:
+case EXCP_NMI:
+case EXCP_VINMI:
 addr += 0x80;
 break;
 case EXCP_FIQ:
 case EXCP_VFIQ:
+case EXCP_VFNMI:
 addr += 0x100;
 break;
 case EXCP_VSERR:
-- 
2.34.1

[PATCH v13 17/24] hw/intc/arm_gicv3: Implement GICD_INMIR

2024-04-07 Thread Jinjie Ruan via

Add GICD_INMIR, GICD_INMIRnE register and support access GICD_INMIR0.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v11:
- Add new Reviewed-by.
v10:
- superprio -> nmi.
v4:
- Make the GICD_INMIR implementation more clearer.
- Udpate the commit message.
v3:
- Add Reviewed-by.
---
 hw/intc/arm_gicv3_dist.c | 34 ++
 hw/intc/gicv3_internal.h |  2 ++
 2 files changed, 36 insertions(+)

diff --git a/hw/intc/arm_gicv3_dist.c b/hw/intc/arm_gicv3_dist.c
index 22ddc0d666..d8207acb22 100644
--- a/hw/intc/arm_gicv3_dist.c
+++ b/hw/intc/arm_gicv3_dist.c
@@ -89,6 +89,29 @@ static int gicd_ns_access(GICv3State *s, int irq)
 return extract32(s->gicd_nsacr[irq / 16], (irq % 16) * 2, 2);
 }
 
+static void gicd_write_bitmap_reg(GICv3State *s, MemTxAttrs attrs,
+  uint32_t *bmp, maskfn *maskfn,
+  int offset, uint32_t val)
+{
+/*
+ * Helper routine to implement writing to a "set" register
+ * (GICD_INMIR, etc).
+ * Semantics implemented here:
+ * RAZ/WI for SGIs, PPIs, unimplemented IRQs
+ * Bits corresponding to Group 0 or Secure Group 1 interrupts RAZ/WI.
+ * offset should be the offset in bytes of the register from the start
+ * of its group.
+ */
+int irq = offset * 8;
+
+if (irq < GIC_INTERNAL || irq >= s->num_irq) {
+return;
+}
+val &= mask_group_and_nsacr(s, attrs, maskfn, irq);
+*gic_bmp_ptr32(bmp, irq) = val;
+gicv3_update(s, irq, 32);
+}
+
 static void gicd_write_set_bitmap_reg(GICv3State *s, MemTxAttrs attrs,
   uint32_t *bmp,
   maskfn *maskfn,
@@ -545,6 +568,11 @@ static bool gicd_readl(GICv3State *s, hwaddr offset,
 /* RAZ/WI since affinity routing is always enabled */
 *data = 0;
 return true;
+case GICD_INMIR ... GICD_INMIR + 0x7f:
+*data = (!s->nmi_support) ? 0 :
+gicd_read_bitmap_reg(s, attrs, s->nmi, NULL,
+ offset - GICD_INMIR);
+return true;
 case GICD_IROUTER ... GICD_IROUTER + 0x1fdf:
 {
 uint64_t r;
@@ -754,6 +782,12 @@ static bool gicd_writel(GICv3State *s, hwaddr offset,
 case GICD_SPENDSGIR ... GICD_SPENDSGIR + 0xf:
 /* RAZ/WI since affinity routing is always enabled */
 return true;
+case GICD_INMIR ... GICD_INMIR + 0x7f:
+if (s->nmi_support) {
+gicd_write_bitmap_reg(s, attrs, s->nmi, NULL,
+  offset - GICD_INMIR, value);
+}
+return true;
 case GICD_IROUTER ... GICD_IROUTER + 0x1fdf:
 {
 uint64_t r;
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 21697ecf39..8d793243f4 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -52,6 +52,8 @@
 #define GICD_SGIR0x0F00
 #define GICD_CPENDSGIR   0x0F10
 #define GICD_SPENDSGIR   0x0F20
+#define GICD_INMIR   0x0F80
+#define GICD_INMIRnE 0x3B00
 #define GICD_IROUTER 0x6000
 #define GICD_IDREGS  0xFFD0
 
-- 
2.34.1

[PATCH v13 01/24] target/arm: Handle HCR_EL2 accesses for bits introduced with FEAT_NMI

2024-04-07 Thread Jinjie Ruan via

FEAT_NMI defines another three new bits in HCRX_EL2: TALLINT, HCRX_VINMI and
HCRX_VFNMI. When the feature is enabled, allow these bits to be written in
HCRX_EL2.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v12:
- Remove the redundant blank line.
v9:
- Declare cpu variable to reuse latter.
v4:
- Update the comment for FEAT_NMI in hcrx_write().
- Update the commit message, s/thress/three/g.
v3:
- Add Reviewed-by.
- Add HCRX_VINMI and HCRX_VFNMI support in HCRX_EL2.
- Upate the commit messsage.
---
 target/arm/cpu-features.h | 5 +
 target/arm/helper.c   | 8 +++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index e5758d9fbc..b300d0446d 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -681,6 +681,11 @@ static inline bool isar_feature_aa64_sme(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, SME) != 0;
 }
 
+static inline bool isar_feature_aa64_nmi(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, NMI) != 0;
+}
+
 static inline bool isar_feature_aa64_tgran4_lpa2(const ARMISARegisters *id)
 {
 return FIELD_SEX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN4) >= 1;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3f3a5b55d4..408922c94d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6183,13 +6183,19 @@ bool el_is_in_host(CPUARMState *env, int el)
 static void hcrx_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
 {
+ARMCPU *cpu = env_archcpu(env);
 uint64_t valid_mask = 0;
 
 /* FEAT_MOPS adds MSCEn and MCE2 */
-if (cpu_isar_feature(aa64_mops, env_archcpu(env))) {
+if (cpu_isar_feature(aa64_mops, cpu)) {
 valid_mask |= HCRX_MSCEN | HCRX_MCE2;
 }
 
+/* FEAT_NMI adds TALLINT, VINMI and VFNMI */
+if (cpu_isar_feature(aa64_nmi, cpu)) {
+valid_mask |= HCRX_TALLINT | HCRX_VINMI | HCRX_VFNMI;
+}
+
 /* Clear RES0 bits.  */
 env->cp15.hcrx_el2 = value & valid_mask;
 }
-- 
2.34.1

[PATCH v13 06/24] target/arm: Add support for Non-maskable Interrupt

2024-04-07 Thread Jinjie Ruan via

This only implements the external delivery method via the GICv3.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v12:
- Correct the comment style in arm_cpu_initfn().
v10:
- In arm_cpu_exec_interrupt(), if SCTLR_ELx.NMI is 0, NMI -> IRQ,
  VINMI -> VIRQ, VFNMI -> VFIQ.
- Make arm_cpu_update_virq() and arm_cpu_update_vfiq() check that it is not a
  VINMI/VFNMI, so only set 1 bit in interrupt_request, not 2.
v9:
- Update the GPIOs passed in the arm_cpu_kvm_set_irq, and update the comment.
- Definitely not merge VINMI and VFNMI into EXCP_VNMI.
- Update VINMI and VFNMI when writing HCR_EL2 or HCRX_EL2.
v8:
- Fix the rcu stall after sending a VNMI in qemu VM.
v7:
- Add Reviewed-by.
v6:
- env->cp15.hcr_el2 -> arm_hcr_el2_eff().
- env->cp15.hcrx_el2 -> arm_hcrx_el2_eff().
- Not include VF && VFNMI in CPU_INTERRUPT_VNMI.
v4:
- Accept NMI unconditionally for arm_cpu_has_work() but add comment.
- Change from & to && for EXCP_IRQ or EXCP_FIQ.
- Refator nmi mask in arm_excp_unmasked().
- Also handle VNMI in arm_cpu_exec_interrupt() and arm_cpu_set_irq().
- Rename virtual to Virtual.
v3:
- Not include CPU_INTERRUPT_NMI when FEAT_NMI not enabled
- Add ARM_CPU_VNMI.
- Refator nmi mask in arm_excp_unmasked().
- Test SCTLR_ELx.NMI for ALLINT mask for NMI.
---
 target/arm/cpu-qom.h   |   5 +-
 target/arm/cpu.c   | 147 ++---
 target/arm/cpu.h   |   6 ++
 target/arm/helper.c|  33 +++--
 target/arm/internals.h |  18 +
 5 files changed, 193 insertions(+), 16 deletions(-)

diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 8e032691db..b497667d61 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -36,11 +36,14 @@ DECLARE_CLASS_CHECKERS(AArch64CPUClass, AARCH64_CPU,
 #define ARM_CPU_TYPE_SUFFIX "-" TYPE_ARM_CPU
 #define ARM_CPU_TYPE_NAME(name) (name ARM_CPU_TYPE_SUFFIX)
 
-/* Meanings of the ARMCPU object's four inbound GPIO lines */
+/* Meanings of the ARMCPU object's seven inbound GPIO lines */
 #define ARM_CPU_IRQ 0
 #define ARM_CPU_FIQ 1
 #define ARM_CPU_VIRQ 2
 #define ARM_CPU_VFIQ 3
+#define ARM_CPU_NMI 4
+#define ARM_CPU_VINMI 5
+#define ARM_CPU_VFNMI 6
 
 /* For M profile, some registers are banked secure vs non-secure;
  * these are represented as a 2-element array where the first element
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index ab8d007a86..d2dfd36fd4 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -122,6 +122,13 @@ void arm_restore_state_to_opc(CPUState *cs,
 }
 #endif /* CONFIG_TCG */
 
+/*
+ * With SCTLR_ELx.NMI == 0, IRQ with Superpriority is masked identically with
+ * IRQ without Superpriority. Moreover, if the GIC is configured so that
+ * FEAT_GICv3_NMI is only set if FEAT_NMI is set, then we won't ever see
+ * CPU_INTERRUPT_*NMI anyway. So we might as well accept NMI here
+ * unconditionally.
+ */
 static bool arm_cpu_has_work(CPUState *cs)
 {
 ARMCPU *cpu = ARM_CPU(cs);
@@ -129,6 +136,7 @@ static bool arm_cpu_has_work(CPUState *cs)
 return (cpu->power_state != PSCI_OFF)
 && cs->interrupt_request &
 (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
+ | CPU_INTERRUPT_NMI | CPU_INTERRUPT_VINMI | CPU_INTERRUPT_VFNMI
  | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ | CPU_INTERRUPT_VSERR
  | CPU_INTERRUPT_EXITTB);
 }
@@ -668,6 +676,7 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned 
int excp_idx,
 CPUARMState *env = cpu_env(cs);
 bool pstate_unmasked;
 bool unmasked = false;
+bool allIntMask = false;
 
 /*
  * Don't take exceptions if they target a lower EL.
@@ -678,13 +687,36 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
unsigned int excp_idx,
 return false;
 }
 
+if (cpu_isar_feature(aa64_nmi, env_archcpu(env)) &&
+env->cp15.sctlr_el[target_el] & SCTLR_NMI && cur_el == target_el) {
+allIntMask = env->pstate & PSTATE_ALLINT ||
+ ((env->cp15.sctlr_el[target_el] & SCTLR_SPINTMASK) &&
+  (env->pstate & PSTATE_SP));
+}
+
 switch (excp_idx) {
+case EXCP_NMI:
+pstate_unmasked = !allIntMask;
+break;
+
+case EXCP_VINMI:
+if (!(hcr_el2 & HCR_IMO) || (hcr_el2 & HCR_TGE)) {
+/* VINMIs are only taken when hypervized.  */
+return false;
+}
+return !allIntMask;
+case EXCP_VFNMI:
+if (!(hcr_el2 & HCR_FMO) || (hcr_el2 & HCR_TGE)) {
+/* VFNMIs are only taken when hypervized.  */
+return false;
+}
+return !allIntMask;
 case EXCP_FIQ:
-pstate_unmasked = !(env->daif & PSTATE_F);
+pstate_unmasked = (!(env->daif & PSTATE_F)) && (!allIntMask);
 break;
 
 case EXCP_IRQ:
-pstate_unmasked = !(env->daif & PSTATE_I);
+pstate_unmasked = (!(env->daif & PSTATE_I)) && (!allIntMask);
 break;
 
 case EXCP_VFIQ:
@@ -692,13 +724,13 @@ static

[PATCH v13 02/24] target/arm: Add PSTATE.ALLINT

2024-04-07 Thread Jinjie Ruan via

When PSTATE.ALLINT is set, an IRQ or FIQ interrupt that is targeted to
ELx, with or without superpriority is masked. As Richard suggested, place
ALLINT bit in PSTATE in env->pstate.

In the pseudocode, AArch64.ExceptionReturn() calls SetPSTATEFromPSR(), which
treats PSTATE.ALLINT as one of the bits which are reinstated from SPSR to
PSTATE regardless of whether this is an illegal exception return or not. So
handle PSTATE.ALLINT the same way as PSTATE.DAIF in the illegal_return exit
path of the exception_return helper. With the change, exception entry and
return are automatically handled.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Handle PSTATE.ALLINT the same way as PSTATE.DAIF in the illegal_return
  exit path.
- Update the commit message.
- Add Reviewed-by.
v5:
- Remove the ALLINT comment, as it is covered by "all other bits".
- Add Reviewed-by.
v4:
- Keep PSTATE.ALLINT in env->pstate but not env->allint.
- Update the commit message.
v3:
- Remove ALLINT dump in aarch64_cpu_dump_state().
- Update the commit message.
---
 target/arm/cpu.h| 1 +
 target/arm/tcg/helper-a64.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bc0c84873f..de740d223f 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1430,6 +1430,7 @@ void pmu_init(ARMCPU *cpu);
 #define PSTATE_D (1U << 9)
 #define PSTATE_BTYPE (3U << 10)
 #define PSTATE_SSBS (1U << 12)
+#define PSTATE_ALLINT (1U << 13)
 #define PSTATE_IL (1U << 20)
 #define PSTATE_SS (1U << 21)
 #define PSTATE_PAN (1U << 22)
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
index ebaa7f00df..29f3ef274a 100644
--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -892,8 +892,8 @@ illegal_return:
  */
 env->pstate |= PSTATE_IL;
 env->pc = new_pc;
-spsr &= PSTATE_NZCV | PSTATE_DAIF;
-spsr |= pstate_read(env) & ~(PSTATE_NZCV | PSTATE_DAIF);
+spsr &= PSTATE_NZCV | PSTATE_DAIF | PSTATE_ALLINT;
+spsr |= pstate_read(env) & ~(PSTATE_NZCV | PSTATE_DAIF | PSTATE_ALLINT);
 pstate_write(env, spsr);
 if (!arm_singlestep_active(env)) {
 env->pstate &= ~PSTATE_SS;
-- 
2.34.1

[PATCH v13 07/24] target/arm: Add support for NMI in arm_phys_excp_target_el()

2024-04-07 Thread Jinjie Ruan via

According to Arm GIC section 4.6.3 Interrupt superpriority, the interrupt
with superpriority is always IRQ, never FIQ, so handle NMI same as IRQ in
arm_phys_excp_target_el().

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v13:
- Add Reviewed-by.
v4:
- Add Reviewed-by.
v3:
- Remove nmi_is_irq flag in CPUARMState.
- Handle NMI same as IRQ in arm_phys_excp_target_el().
---
 target/arm/helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index b82792f251..d9814433e1 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10759,6 +10759,7 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t 
excp_idx,
 hcr_el2 = arm_hcr_el2_eff(env);
 switch (excp_idx) {
 case EXCP_IRQ:
+case EXCP_NMI:
 scr = ((env->cp15.scr_el3 & SCR_IRQ) == SCR_IRQ);
 hcr = hcr_el2 & HCR_IMO;
 break;
-- 
2.34.1

[PATCH v13 24/24] hw/arm/virt: Add FEAT_GICv3_NMI feature support in virt GIC

2024-04-07 Thread Jinjie Ruan via

FEAT_GICv3_NMI introduces GIC support for non-maskable interrupts (NMIs).
A PE that implements FEAT_NMI and FEAT_GICv3 also implements FEAT_GICv3_NMI.
A PE that does not implement FEAT_NMI, does not implement FEAT_GICv3_NMI.

So included support FEAT_GICv3_NMI feature as part of virt platform
GIC initialization if FEAT_NMI and FEAT_GICv3 supported.

And as Peter suggested, neither KVM nor hvf support FEAT_NMI yet, so add
tcg_enabled() to the condition. Defaulting QEMU to not trying to enable NMI
in the GIC device is the safe option. As and when those accelerators get NMI
support, we can add the handling to QEMU and update this code in the virt board.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Suggested-by: Peter Maydell 
---
v13:
- Check tcg_enabled() for gicv3_nmi_present().
- Update the comment for gicv3_nmi_present().
- Update the commit message.
- Add Suggested-by.
v4:
- Add Reviewed-by.
v3:
- Adjust to be the last after add FEAT_NMI to max.
- Check whether support FEAT_NMI and FEAT_GICv3 for FEAT_GICv3_NMI.
---
 hw/arm/virt.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index dca509d082..a8c9156b24 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -729,6 +729,18 @@ static void create_v2m(VirtMachineState *vms)
 vms->msi_controller = VIRT_MSI_CTRL_GICV2M;
 }
 
+/*
+ * A PE that implements FEAT_NMI and FEAT_GICv3 also implements FEAT_GICv3_NMI.
+ * A PE that does not implement FEAT_NMI, does not implement FEAT_GICv3_NMI.
+ */
+static bool gicv3_nmi_present(VirtMachineState *vms)
+{
+ARMCPU *cpu = ARM_CPU(qemu_get_cpu(0));
+
+return tcg_enabled() && cpu_isar_feature(aa64_nmi, cpu) &&
+   (vms->gic_version != VIRT_GIC_VERSION_2);
+}
+
 static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
 {
 MachineState *ms = MACHINE(vms);
@@ -802,6 +814,11 @@ static void create_gic(VirtMachineState *vms, MemoryRegion 
*mem)
   vms->virt);
 }
 }
+
+if (gicv3_nmi_present(vms)) {
+qdev_prop_set_bit(vms->gic, "has-nmi", true);
+}
+
 gicbusdev = SYS_BUS_DEVICE(vms->gic);
 sysbus_realize_and_unref(gicbusdev, _fatal);
 sysbus_mmio_map(gicbusdev, 0, vms->memmap[VIRT_GIC_DIST].base);
-- 
2.34.1

[PATCH v13 16/24] hw/intc/arm_gicv3_redist: Implement GICR_INMIR0

2024-04-07 Thread Jinjie Ruan via

Add GICR_INMIR0 register and support access GICR_INMIR0.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v11:
- Add new Reviewed-by.
v10:
- gicr_isuperprio -> gicr_inmir0.
v6:
- Add Reviewed-by.
v4:
- Make the GICR_INMIR0 implementation more clearer.
---
 hw/intc/arm_gicv3_redist.c | 19 +++
 hw/intc/gicv3_internal.h   |  1 +
 2 files changed, 20 insertions(+)

diff --git a/hw/intc/arm_gicv3_redist.c b/hw/intc/arm_gicv3_redist.c
index 8153525849..ed1f9d1e44 100644
--- a/hw/intc/arm_gicv3_redist.c
+++ b/hw/intc/arm_gicv3_redist.c
@@ -35,6 +35,15 @@ static int gicr_ns_access(GICv3CPUState *cs, int irq)
 return extract32(cs->gicr_nsacr, irq * 2, 2);
 }
 
+static void gicr_write_bitmap_reg(GICv3CPUState *cs, MemTxAttrs attrs,
+  uint32_t *reg, uint32_t val)
+{
+/* Helper routine to implement writing to a "set" register */
+val &= mask_group(cs, attrs);
+*reg = val;
+gicv3_redist_update(cs);
+}
+
 static void gicr_write_set_bitmap_reg(GICv3CPUState *cs, MemTxAttrs attrs,
   uint32_t *reg, uint32_t val)
 {
@@ -406,6 +415,10 @@ static MemTxResult gicr_readl(GICv3CPUState *cs, hwaddr 
offset,
 *data = value;
 return MEMTX_OK;
 }
+case GICR_INMIR0:
+*data = cs->gic->nmi_support ?
+gicr_read_bitmap_reg(cs, attrs, cs->gicr_inmir0) : 0;
+return MEMTX_OK;
 case GICR_ICFGR0:
 case GICR_ICFGR1:
 {
@@ -555,6 +568,12 @@ static MemTxResult gicr_writel(GICv3CPUState *cs, hwaddr 
offset,
 gicv3_redist_update(cs);
 return MEMTX_OK;
 }
+case GICR_INMIR0:
+if (cs->gic->nmi_support) {
+gicr_write_bitmap_reg(cs, attrs, >gicr_inmir0, value);
+}
+return MEMTX_OK;
+
 case GICR_ICFGR0:
 /* Register is all RAZ/WI or RAO/WI bits */
 return MEMTX_OK;
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 8f4ebed2f4..21697ecf39 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -110,6 +110,7 @@
 #define GICR_ICFGR1   (GICR_SGI_OFFSET + 0x0C04)
 #define GICR_IGRPMODR0(GICR_SGI_OFFSET + 0x0D00)
 #define GICR_NSACR(GICR_SGI_OFFSET + 0x0E00)
+#define GICR_INMIR0   (GICR_SGI_OFFSET + 0x0F80)
 
 /* VLPI redistributor registers, offsets from VLPI_base */
 #define GICR_VPROPBASER   (GICR_VLPI_OFFSET + 0x70)
-- 
2.34.1

[PATCH v13 14/24] hw/intc/arm_gicv3_kvm: Not set has-nmi=true for the KVM GICv3

2024-04-07 Thread Jinjie Ruan via

So far, there is no FEAT_GICv3_NMI support in the in-kernel GIC, so make it
an error to try to set has-nmi=true for the KVM GICv3.

Signed-off-by: Jinjie Ruan 
Suggested-by: Peter Maydell 
---
 hw/intc/arm_gicv3_kvm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 77eb37e131..00a383079b 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -805,6 +805,11 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
+if (s->nmi_support) {
+error_setg(errp, "NMI is not supported with the in-kernel GIC");
+return;
+}
+
 gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL);
 
 for (i = 0; i < s->num_cpu; i++) {
-- 
2.34.1

[PATCH v13 21/24] hw/intc/arm_gicv3: Report the NMI interrupt in gicv3_cpuif_update()

2024-04-07 Thread Jinjie Ruan via

In CPU Interface, if the IRQ has the non-maskable property, report NMI to
the corresponding PE.

Signed-off-by: Jinjie Ruan 
Reviewed-by: Richard Henderson 
Reviewed-by: Peter Maydell 
---
v12:
- Add Reviewed-by.
v10:
- superprio -> nmi.
- Update the commit message, superpriority -> non-maskable.
v6:
- Add Reviewed-by.
v4:
- Swap the ordering of the IFs.
v3:
- Remove handling nmi_is_irq flag.
---
 hw/intc/arm_gicv3_cpuif.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index dff88c4283..60e2d9ec9c 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -1037,6 +1037,7 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
 /* Tell the CPU about its highest priority pending interrupt */
 int irqlevel = 0;
 int fiqlevel = 0;
+int nmilevel = 0;
 ARMCPU *cpu = ARM_CPU(cs->cpu);
 CPUARMState *env = >env;
 
@@ -1075,6 +1076,8 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
 
 if (isfiq) {
 fiqlevel = 1;
+} else if (cs->hppi.nmi) {
+nmilevel = 1;
 } else {
 irqlevel = 1;
 }
@@ -1084,6 +1087,7 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
 
 qemu_set_irq(cs->parent_fiq, fiqlevel);
 qemu_set_irq(cs->parent_irq, irqlevel);
+qemu_set_irq(cs->parent_nmi, nmilevel);
 }
 
 static uint64_t icc_pmr_read(CPUARMState *env, const ARMCPRegInfo *ri)
-- 
2.34.1

[PATCH v13 00/24] target/arm: Implement FEAT_NMI and FEAT_GICv3_NMI

2024-04-07 Thread Jinjie Ruan via

This patch set implements FEAT_NMI and FEAT_GICv3_NMI for ARMv8. These
introduce support for a new category of interrupts in the architecture
which we can use to provide NMI like functionality.

There are two modes for using this FEAT_NMI. When PSTATE.ALLINT or
PSTATE.SP & SCTLR_ELx.SCTLR_SPINTMASK is set, any entry to ELx causes all
interrupts including those with superpriority to be masked on entry to ELn
until the mask is explicitly removed by software or hardware. PSTATE.ALLINT
can be managed by software using the new register control ALLINT.ALLINT.
Independent controls are provided for this feature at each EL, usage at EL1
should not disrupt EL2 or EL3.

I have tested it with the following Linux patches which try to support
FEAT_NMI in Linux kernel:


https://lore.kernel.org/linux-arm-kernel/Y4sH5qX5bK9xfEBp@lpieralisi/T/#mb4ba4a2c045bf72c10c2202c1dd1b82d3240dc88

In the test, SGI, PPI and SPI interrupts can all be set to have super priority
to be converted to a hardware NMI interrupt. The SGI is tested with kernel
IPI as NMI framework, softlockup, hardlockup and kgdb test cases, and the PPI
interrupt is tested with "perf top" command with hardware NMI enabled, and
the SPI interrupt is tested with a custom test module, in which NMI interrupts
can be received and sent normally.

And the Virtual NMI(VNMI) SGI, PPI and SPI interrupts has also been tested in
nested QEMU Virtual Machine with host "virtualization=true". The SGI VNMI is
tested by accessing GICR_INMIR0 and GICR_ISPENDR0 with devmem command, as well
as hardlockup and kgdb testcases. The PPI VNMI is tested by accessing
GICR_INMIR0 and GICR_ISPENDR0 with devmem command, as well as "perf top"
command with hardware NMI enabled, which works well. The SPI VNMI is tested
with a custom test module, in which SPI VNMI can be sent from the GIC and
received normally.

 +-+
 |   Distributor   |
 +-+
 SPI |  NMI|  NMI
\/\/
++ +---+
| Redist | | Redist|
++ +---+
SGI  | NMI PPI | NMI
\/\/
  +-+ +---+
  |CPU Interface|   ...   | CPU Interface |
  +-+ +---+
   | NMI | NMI
  \/\/
+-+   +-+
|  PE |   |  PE |
+-+   +-+

Changes in v13:
- Handle PSTATE.ALLINT the same way as PSTATE.DAIF in the illegal_return
  exit path.
- Adjust "hw/arm/virt: Wire NMI and VINMI irq lines from GIC to CPU" to after
  "hw/intc/arm_gicv3: Add external IRQ lines for NMI" to fix the unexpected
  error with patchseries at this point.
- Only connect NMI irq lines from GIC to CPU if vms->gic_version is not
  VIRT_GIC_VERSION_2 to fix the gic-version=2 unexpected error.
- Enforce RES0 bit in NMI field when FEAT_GICv3_NMI is not implemented for
  ICH_LR_EL2.
- Swap the order of the "irq" and "prio" args in gicv3_get_priority() to make
  input before output.
- Check tcg_enabled() for gicv3_nmi_present().
- Update the comment for gicv3_nmi_present().
- Update the commit message.
- Add Reviewed-by.
- Add Suggested-by.

Changes in v12:
- pPriority<63> = ICC_AP1R_EL1NS<63> if HaveNMIExt() and HaveEL(EL3) and
  (IsNonSecure(), fix the wrong writing.
- Do not check nmi_support repetitively in icc_hppi_can_preempt(),
  and icc_activate_irq, ich_highest_active_virt_prio(), hppvi_index(),
  icv_hppi_can_preempt(), icv_rpr_read() and icv_activate_irq(),
  gicv3_cpuif_virt_irq_fiq_update().
- Check hppi.nmi after check icc_hppi_can_preempt() for icc_iar1_read() and
  icc_nmiar1_read().
- When NMI is 1, the virtual interrupt's priority is 0x0.
- Make the thisnmi logic more concisely in hppvi_index().
- Use is_nmi to simplify code and check is_nmi before comparing vpmr.
- Also check sctlrx.NMI in icv_iar_read().
- Check icv_hppi_can_preempt() for icv_nmiar1_read().
- Check ICH_LR_EL2.NMI after check icv_hppi_can_preempt() as icv_iar_read()
  do it in icv_nmiar1_read().
- Correct thisnmi to bool in icv_eoir_write().
- Check thisnmi and nmi both true instead of identical in icv_eoir_write().
- nmi_needed -> gicv3_cpu_nmi_needed, needed_nmi -> gicv3_nmi_needed.
- Correct the comment style in arm_cpu_initfn() and icv_nmiar1_read().
- Fix the typo, "prioirty" -> "priority".
- Remove the redundant blank line.
- Update the subject and commit message, hppi.superprio -> hppi.nmi,
  super priority -> non-maskable property.
- Add Reviewed-by.

Changes in v11:
- Put vmstate_gicv3_cpu_nmi and

[PATCH for-9.0] MAINTAINERS: Adjust migration documentation files

2024-04-07 Thread Avihai Horon

Commit 8cb2f8b172e7 ("docs/migration: Create migration/ directory")
changed migration documentation file structure but forgot to update the
entries in the MAINTAINERS file.

Commit 4c6f8a79ae53 ("docs/migration: Split 'dirty limit'") extracted
dirty limit documentation to a new file without updating dirty limit
section in MAINTAINERS file.

Fix the above.

Fixes: 8cb2f8b172e7 ("docs/migration: Create migration/ directory")
Fixes: 4c6f8a79ae53 ("docs/migration: Split 'dirty limit'")
Signed-off-by: Avihai Horon 
---
 MAINTAINERS | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index e71183eef9..d3fc2a06e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2170,7 +2170,7 @@ S: Supported
 F: hw/vfio/*
 F: include/hw/vfio/
 F: docs/igd-assign.txt
-F: docs/devel/vfio-migration.rst
+F: docs/devel/migration/vfio.rst
 
 vfio-ccw
 M: Eric Farman 
@@ -2231,6 +2231,7 @@ F: qapi/virtio.json
 F: net/vhost-user.c
 F: include/hw/virtio/
 F: docs/devel/virtio*
+F: docs/devel/migration/virtio.rst
 
 virtio-balloon
 M: Michael S. Tsirkin 
@@ -3422,7 +3423,7 @@ F: migration/
 F: scripts/vmstate-static-checker.py
 F: tests/vmstate-static-checker-data/
 F: tests/qtest/migration-test.c
-F: docs/devel/migration.rst
+F: docs/devel/migration/
 F: qapi/migration.json
 F: tests/migration/
 F: util/userfaultfd.c
@@ -3442,6 +3443,7 @@ F: include/sysemu/dirtylimit.h
 F: migration/dirtyrate.c
 F: migration/dirtyrate.h
 F: include/sysemu/dirtyrate.h
+F: docs/devel/migration/dirty-limit.rst
 
 Detached LUKS header
 M: Hyman Huang 
-- 
2.26.3

Re: [PATCH net v4] virtio_net: Do not send RSS key if it is not supported

2024-04-07 Thread Xuan Zhuo

On Wed,  3 Apr 2024 08:43:12 -0700, Breno Leitao  wrote:
> There is a bug when setting the RSS options in virtio_net that can break
> the whole machine, getting the kernel into an infinite loop.
>
> Running the following command in any QEMU virtual machine with virtionet
> will reproduce this problem:
>
> # ethtool -X eth0  hfunc toeplitz
>
> This is how the problem happens:
>
> 1) ethtool_set_rxfh() calls virtnet_set_rxfh()
>
> 2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
>
> 3) virtnet_commit_rss_command() populates 4 entries for the rss
> scatter-gather
>
> 4) Since the command above does not have a key, then the last
> scatter-gatter entry will be zeroed, since rss_key_size == 0.
> sg_buf_size = vi->rss_key_size;
>
> 5) This buffer is passed to qemu, but qemu is not happy with a buffer
> with zero length, and do the following in virtqueue_map_desc() (QEMU
> function):
>
>   if (!sz) {
>   virtio_error(vdev, "virtio: zero sized buffers are not allowed");
>
> 6) virtio_error() (also QEMU function) set the device as broken
>
> vdev->broken = true;
>
> 7) Qemu bails out, and do not repond this crazy kernel.
>
> 8) The kernel is waiting for the response to come back (function
> virtnet_send_command())
>
> 9) The kernel is waiting doing the following :
>
>   while (!virtqueue_get_buf(vi->cvq, ) &&
>!virtqueue_is_broken(vi->cvq))
> cpu_relax();
>
> 10) None of the following functions above is true, thus, the kernel
> loops here forever. Keeping in mind that virtqueue_is_broken() does
> not look at the qemu `vdev->broken`, so, it never realizes that the
> vitio is broken at QEMU side.
>
> Fix it by not sending RSS commands if the feature is not available in
> the device.
>
> Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
> Cc: sta...@vger.kernel.org
> Cc: qemu-devel@nongnu.org
> Signed-off-by: Breno Leitao 
> Reviewed-by: Heng Qi 

Reviewed-by: Xuan Zhuo 

> ---
> Changelog:
>
> V2:
>   * Moved from creating a valid packet, by rejecting the request
> completely.
> V3:
>   * Got some good feedback from and Xuan Zhuo and Heng Qi, and reworked
> the rejection path.
> V4:
>   * Added a comment in an "if" clause, as suggested by Michael S. Tsirkin.
>
> ---
>  drivers/net/virtio_net.c | 26 ++
>  1 file changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c22d1118a133..115c3c5414f2 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3807,6 +3807,7 @@ static int virtnet_set_rxfh(struct net_device *dev,
>   struct netlink_ext_ack *extack)
>  {
>   struct virtnet_info *vi = netdev_priv(dev);
> + bool update = false;
>   int i;
>
>   if (rxfh->hfunc != ETH_RSS_HASH_NO_CHANGE &&
> @@ -3814,13 +3815,28 @@ static int virtnet_set_rxfh(struct net_device *dev,
>   return -EOPNOTSUPP;
>
>   if (rxfh->indir) {
> + if (!vi->has_rss)
> + return -EOPNOTSUPP;
> +
>   for (i = 0; i < vi->rss_indir_table_size; ++i)
>   vi->ctrl->rss.indirection_table[i] = rxfh->indir[i];
> + update = true;
>   }
> - if (rxfh->key)
> +
> + if (rxfh->key) {
> + /* If either _F_HASH_REPORT or _F_RSS are negotiated, the
> +  * device provides hash calculation capabilities, that is,
> +  * hash_key is configured.
> +  */
> + if (!vi->has_rss && !vi->has_rss_hash_report)
> + return -EOPNOTSUPP;
> +
>   memcpy(vi->ctrl->rss.key, rxfh->key, vi->rss_key_size);
> + update = true;
> + }
>
> - virtnet_commit_rss_command(vi);
> + if (update)
> + virtnet_commit_rss_command(vi);
>
>   return 0;
>  }
> @@ -4729,13 +4745,15 @@ static int virtnet_probe(struct virtio_device *vdev)
>   if (virtio_has_feature(vdev, VIRTIO_NET_F_HASH_REPORT))
>   vi->has_rss_hash_report = true;
>
> - if (virtio_has_feature(vdev, VIRTIO_NET_F_RSS))
> + if (virtio_has_feature(vdev, VIRTIO_NET_F_RSS)) {
>   vi->has_rss = true;
>
> - if (vi->has_rss || vi->has_rss_hash_report) {
>   vi->rss_indir_table_size =
>   virtio_cread16(vdev, offsetof(struct virtio_net_config,
>   rss_max_indirection_table_length));
> + }
> +
> + if (vi->has_rss || vi->has_rss_hash_report) {
>   vi->rss_key_size =
>   virtio_cread8(vdev, offsetof(struct virtio_net_config, 
> rss_max_key_size));
>
> --
> 2.43.0
>

Re: [PATCH 1/1] virtio-pci: Fix the crash when the vector changes back from VIRTIO_NO_VECTOR

2024-04-07 Thread Cindy Lu

On Sun, Apr 7, 2024 at 12:20 PM Jason Wang  wrote:
>
> On Tue, Apr 2, 2024 at 11:02 PM Cindy Lu  wrote:
> >
> > When the guest calls virtio_stop and then virtio_reset,
>
> Guests could not call those functions directly, it is triggered by for
> example writing to some of the registers like reset or others.
>
sure , Will fix this
> > the vector will change
> > to VIRTIO_NO_VECTOR and the IRQFD for this vector will be released. After 
> > that
> > If you want to change the vector back,
>
> What do you mean by "change the vector back"? Something like
>
> assign VIRTIO_MSI_NO_VECTOR to vector 0
> assign X to vector 0
>
yes, the process is something  like

set config_vector = VIRTIO_MSI_NO_VECTOR
...
set config_vector = 0
> And I guess what you meant is to configure the vector after DRIVER_OK.

>
>
> > it will cause a crash.
> >
> > To fix this, we need to call the function "kvm_virtio_pci_vector_use_one()"
> > when the vector changes back from VIRTIO_NO_VECTOR
> >
> > Signed-off-by: Cindy Lu 
> > ---
> >  hw/virtio/virtio-pci.c | 31 ---
> >  1 file changed, 28 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index e433879542..45f3ab38c3 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -874,12 +874,14 @@ static int virtio_pci_get_notifier(VirtIOPCIProxy 
> > *proxy, int queue_no,
> >  return 0;
> >  }
> >
> > -static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > queue_no)
> > +static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int 
> > queue_no,
> > + bool recovery)
> >  {
> >  unsigned int vector;
> >  int ret;
> >  EventNotifier *n;
> >  PCIDevice *dev = >pci_dev;
> > +VirtIOIRQFD *irqfd;
> >  VirtIODevice *vdev = virtio_bus_get_device(>bus);
> >  VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> >
> > @@ -890,10 +892,21 @@ static int 
> > kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no)
> >  if (vector >= msix_nr_vectors_allocated(dev)) {
> >  return 0;
> >  }
> > +/*
> > + * if this is recovery and irqfd still in use, means the irqfd was not
> > + * release before and don't need to set up again
> > + */
> > +if (recovery) {
> > +irqfd = >vector_irqfd[vector];
> > +if (irqfd->users != 0) {
> > +return 0;
> > +}
> > +}
> >  ret = kvm_virtio_pci_vq_vector_use(proxy, vector);
> >  if (ret < 0) {
> >  goto undo;
> >  }
> > +
> >  /*
> >   * If guest supports masking, set up irqfd now.
> >   * Otherwise, delay until unmasked in the frontend.
> > @@ -932,14 +945,14 @@ static int 
> > kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs)
> >  if (!virtio_queue_get_num(vdev, queue_no)) {
> >  return -1;
> >  }
> > -ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
> > +ret = kvm_virtio_pci_vector_use_one(proxy, queue_no, false);
> >  }
> >  return ret;
> >  }
> >
> >  static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
> >  {
> > -return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
> > +return kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > false);
> >  }
> >
> >  static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
> > @@ -1570,7 +1583,13 @@ static void virtio_pci_common_write(void *opaque, 
> > hwaddr addr,
> >  } else {
> >  val = VIRTIO_NO_VECTOR;
> >  }
> > +vector = vdev->config_vector;
> >  vdev->config_vector = val;
> > +/*check if the vector need to recovery*/
> > +if ((val != VIRTIO_NO_VECTOR) && (vector == VIRTIO_NO_VECTOR) &&
> > +(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +kvm_virtio_pci_vector_use_one(proxy, VIRTIO_CONFIG_IRQ_IDX, 
> > true);
> > +}
>
> This looks too tricky.
>
> Think hard of this. I think it's better to split this into two parts:
>
> 1) a series that disables config irqfd for vhost-net, this series
> needs to be backported to -stable which needs to be conservative. It
> looks more like your V1, but let's add a boolean for pci proxy.
sure, I wll try this

> 2) a series that deal with the msix vector configuration after
> driver_ok, we probably need some refactoring to do per vq use instead
> of the current loop in DRIVER_OK
>
Sorry I didn't get what you mean , Do you mean we need to move the check
to inside kvm_virtio_pci_vector_vq_use()？
Thanks
cindy
> Does this make sense?
>
> Thanks
>
> >  break;
> >  case VIRTIO_PCI_COMMON_STATUS:
> >  if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > @@ -1611,6 +1630,12 @@ static void virtio_pci_common_write(void *opaque, 
> > hwaddr addr,
> >  val = VIRTIO_NO_VECTOR;
> >  }
> >  virtio_queue_set_vector(vdev,

Re: [PATCH v12 23/23] hw/arm/virt: Add FEAT_GICv3_NMI feature support in virt GIC

2024-04-07 Thread Jinjie Ruan via




On 2024/4/5 21:48, Peter Maydell wrote:
> On Wed, 3 Apr 2024 at 11:18, Jinjie Ruan  wrote:
>>
>> A PE that implements FEAT_NMI and FEAT_GICv3 also implements
>> FEAT_GICv3_NMI. A PE that does not implement FEAT_NMI, does not implement
>> FEAT_GICv3_NMI
> 
> This is true but not really relevant here -- FEAT_GICv3_NMI
> is not "NMI support in the GIC", it's "does the CPU interface
> support NMIs". (And so I'm wondering if the code in arm_gicv3_cpuif.c
> should be checking cpu_isar_feature(aa64_nmi, cpu) rather than
> cs->gic->nmi_support; but I need to think through the consequences
> of that first.)

In "1.1.2 GIC architecture extensions", it said:

FEAT_GICv3_NMI introduces GIC support for non-maskable interrupts (NMIs).

So in my opinion, it is relevant here.

> 
> The justification for "enable NMIs in the GIC device if the
> CPU has FEAT_NMI" is that (a) it's only OK to have a GIC with
> NMI support if the CPU also has NMI support and (b) if we
> can turn on NMI support in the GIC we should, so that we can
> provide the feature to the guest.
> 
>> So included support FEAT_GICv3_NMI feature as part of virt platform
>> GIC initialization if FEAT_NMI and FEAT_GICv3 supported.
>>
>> Signed-off-by: Jinjie Ruan 
>> Reviewed-by: Richard Henderson 
>> ---
>> v4:
>> - Add Reviewed-by.
>> v3:
>> - Adjust to be the last after add FEAT_NMI to max.
>> - Check whether support FEAT_NMI and FEAT_GICv3 for FEAT_GICv3_NMI.
>> ---
>>  hw/arm/virt.c | 18 ++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index ef2e6c2c4d..63d9f5b553 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -729,6 +729,19 @@ static void create_v2m(VirtMachineState *vms)
>>  vms->msi_controller = VIRT_MSI_CTRL_GICV2M;
>>  }
>>
>> +/*
>> + * A PE that implements FEAT_NMI and FEAT_GICv3 also implements
>> + * FEAT_GICv3_NMI. A PE that does not implement FEAT_NMI, does not implement
>> + * FEAT_GICv3_NMI.
>> + */
>> +static bool gicv3_nmi_present(VirtMachineState *vms)
>> +{
>> +ARMCPU *cpu = ARM_CPU(qemu_get_cpu(0));
>> +
>> +return cpu_isar_feature(aa64_nmi, cpu) &&
>> +   (vms->gic_version != VIRT_GIC_VERSION_2);
> 
> I think we should add tcg_enabled() to this condition:
> neither KVM nor hvf support FEAT_NMI yet. Defaulting QEMU to
> not trying to enable NMI in the GIC device is the safe
> option. As and when those accelerators get NMI support, we
> can add the handling to QEMU and update this code in the virt board.
> 
> thanks
> -- PMM

55 matches

Mail list logo