On Thu, Mar 28, 2024 at 12:14 PM Jason Wang <jasow...@redhat.com> wrote:
>
> On Wed, Mar 27, 2024 at 5:44 PM Cindy Lu <l...@redhat.com> wrote:
> >
> > On Wed, Mar 27, 2024 at 5:13 PM Jason Wang <jasow...@redhat.com> wrote:
> > >
> > > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang <jasow...@redhat.com> wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu <l...@redhat.com> wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang <jasow...@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu <l...@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang <jasow...@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Hi Cindy:
> > > > > > > >
> > > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu <l...@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > We hit a crash with a non-standard image; here is the Jira for
> > > > > > > > > this: https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > > The root cause of the issue is that an irqfd was used without
> > > > > > > > > initialization.
> > > > > > > > >
> > > > > > > > > During the boot process of the Vyatta image, the behavior
> > > > > > > > > of the called functions in qemu is as follows:
> > > > > > > > >
> > > > > > > > > 1. vhost_net_stop() was called; this calls
> > > > > > > > > virtio_pci_set_guest_notifiers() with assign=false, and
> > > > > > > > > virtio_pci_set_guest_notifiers() releases the irqfd for
> > > > > > > > > vector 0
> > > > > > > >
> > > > > > > > Before vhost_net_stop(), do we know which vector is used by
> > > > > > > > which queue?
> > > > > > > >
> > > > > > > Before this stop, vdev->config_vector is obtained from
> > > > > > > virtio_pci_common_read/virtio_pci_common_write;
> > > > > > > it was set to vector 0.
> > > > > >
> > > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > > >
> > > > > Really sorry for this. The vqs' vectors are 1 and 2, and they are not
> > > > > shared with the configure vector.
> > > > >
> > > > > > > > >
> > > > > > > > > 2. virtio_reset() was called --> sets the configure vector to
> > > > > > > > > VIRTIO_NO_VECTOR
> > > > > > > > >
> > > > > > > > > 3. vhost_net_start() was called (at this time the configure
> > > > > > > > > vector is still VIRTIO_NO_VECTOR) and calls
> > > > > > > > > virtio_pci_set_guest_notifiers() with assign=true, so the
> > > > > > > > > irqfd for vector 0 was not "init" during this process
> > > > > > > >
> > > > > > > > How does the configure vector differ from the virtqueue vector
> > > > > > > > here?
> > > > > > > >
> > > > > > > All the vectors are VIRTIO_NO_VECTOR (including the vqs'). Any call
> > > > > > > to msix_fire_vector_notifier() will cause the crash at this time.
> > > > > >
> > > > > > Won't virtio_pci_set_guest_notifiers() try to allocate the irqfd
> > > > > > when the assignment is true?
> > > > > >
> > > > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff).
> > > > >
> > > > > Then it will call kvm_virtio_pci_vector_use_one().
> > > > >
> > > > > In this function, there is a check:
> > > > >
> > > > > if (vector >= msix_nr_vectors_allocated(dev))
> > > > > { return 0; }
> > > > >
> > > > > So it will return.
> > > >
> > > > How about let's just fix this?
> > >
> > > Btw, another question, how does vDPA work here?
> > >
> > > Thanks
> > >
> > The RHEL/Fedora guest image will not call vrtio_stop and virtio_reset
> > during boot,
> > so the vector will not change to VIRTIO_NO_VECTOR. So the vDPA
> > configure interrupt
> > should work OK and there is no crash.
>
> I mean:
>
> 1) if vDPA can work with the image you used to reproduce the issue
> 2) if current Qemu can work on old kernel without configure interrupt
> support for vDPA
>
Really sorry, let me try to answer these:
1. The vDPA device also cannot work with this image, because the irqfd for vector 0 has been released and the guest image then unmasks this vector. None of this code depends on vDPA-related processing.
2. Current QEMU can work with an old kernel because, from my debugging, the old kernel won't call msix_fire_vector_notifier for vector 0, and it will not unmask vector 0.
Thanks
Cindy
> Thanks
>
> > Thanks
> > cindy
> >
> > > >
> > > > Btw, it's better to explain in detail like the above in the next
> > > > version.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > > >
> > > > > > > So I think this should
> > > > > > > be a bug in this guest image
> > > > > >
> > > > > > The point is Qemu should not crash even if the guest driver is
> > > > > > buggy.
> > > > > >
> > > > > > It would be nice if we can have a qtest for this on top.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > Sure, got it. I have run the qtest, and it passed;
> > > > > here is the result:
> > > > >
> > > > > Ok: 794
> > > > > Expected Fail: 0
> > > > > Fail: 0
> > > > > Unexpected Pass: 0
> > > > > Skipped: 32
> > > > > Timeout: 0
> > > > >
> > > > > > > > >
> > > > > > > > > 4. The system continues to boot, msix_fire_vector_notifier() was
> > > > > > > > > called to unmask vector 0, and then the crash occurred
> > > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > > > >
> > > > > > > > > The reason for not reproducing in the RHEL/Fedora guest image is
> > > > > > > > > because
> > > > > > > > > RHEL/Fedora doesn't have the behavior of calling
> > > > > > > > > vhost_net_stop and then virtio_reset, and also won't call
> > > > > > > > > msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > > > >
> > > > > > > > > The reason for not reproducing before configure interrupt
> > > > > > > > > support is because
> > > > > > > > > vector 0 is for the configure interrupt; before the support for
> > > > > > > > > configure interrupts, the notifier process will not handle
> > > > > > > > > vector 0.
> > > > > > > > >
> > > > > > > > > The device Vyatta is using doesn't support configure
> > > > > > > > > interrupts at all, so we plan to disable the configure
> > > > > > > > > interrupt on unsupported devices
> > > > > > > >
> > > > > > > > Btw, let's tweak the changelog, it's a little bit hard to
> > > > > > > > understand.
> > > > > > > >
> > > > > > > sure will do
> > > > > > > thanks
> > > > > > > Cindy
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Signed-off-by: Cindy Lu <l...@redhat.com>
> > > > > > > > >
> > > > > > > > > Cindy Lu (2):
> > > > > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > > > > >
> > > > > > > > >  hw/net/virtio-net.c        |  5 ++++-
> > > > > > > > >  hw/virtio/virtio-pci.c     | 41 +++++++++++++++++++++++------------------
> > > > > > > > >  hw/virtio/virtio.c         |  1 +
> > > > > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.43.0