Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-28 Thread Jason Wang
On Fri, Mar 29, 2024 at 11:02 AM Cindy Lu  wrote:
>
> On Thu, Mar 28, 2024 at 12:12 PM Jason Wang  wrote:
> >
> > On Wed, Mar 27, 2024 at 5:33 PM Cindy Lu  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi Cindy:
> > > > > > > >
> > > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > we need a crash in Non-standard image, here is the jira for 
> > > > > > > > > this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > > > > initialization..
> > > > > > > > >
> > > > > > > > > During the booting process of the Vyatta image, the behavior 
> > > > > > > > > of the called function in qemu is as follows:
> > > > > > > > >
> > > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for 
> > > > > > > > > vector 0
> > > > > > > >
> > > > > > > > Before vhost_net_stop(), do we know which vector is used by 
> > > > > > > > which queue?
> > > > > > > >
> > > > > > > before this stop, vdev->config_verctor is get from
> > > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > > it was set to vector 0
> > > > > >
> > > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > > >
> > > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > > configure vector
> > > > > > > > >
> > > > > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > > > > VIRTIO_NO_VECTORt
> > > > > > > > >
> > > > > > > > > 3.vhost_net_start() was called (at this time the configure 
> > > > > > > > > vector is
> > > > > > > > > still VIRTIO_NO_VECTOR) and call 
> > > > > > > > > virtio_pci_set_guest_notifiers() with
> > > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during 
> > > > > > > > > this process
> > > > > > > >
> > > > > > > > How does the configure vector differ from the virtqueue vector 
> > > > > > > > here?
> > > > > > > >
> > > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > > msix_fire_vector_notifier()
> > > > > > > been called will cause the crash at this time.
> > > > > >
> > > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd 
> > > > > > when
> > > > > > the assignment is true?
> > > > > >
> > > > > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff)
> > > > >
> > > > > then it will called kvm_virtio_pci_vector_use_one()
> > > > >
> > > > > in this function, there is a check for
> > > > >
> > > > > if (vector >= msix_nr_vectors_allocated(dev))
> > > > >
> > > > > { return 0; }
> > > > >
> > > > > So it will return.
> > > >
> > > > How about let's just fix this?
> > > >
> > > > Btw, it's better to explain in detail like the above in the next 
> > > > version.
> > > >
> > > > Thanks
> > > >
> > > The problem is I think the behavior here is correct, The vector here is
> > >  VIRTIO_NO_VECTOR and we should return,
> >
> > So if I understand correctly, the configure vector is configured after
> > DRIVER_OK?
> >
> sorry I didn't get your point, Do you mean set_guest_notifiers()?,
> this was called during the system boot
>  but for the value of vdev->config_vector/vq vector, this is changed
> by virtio_pci_common_read/virtio_pci_common_write
> and these function will not check the process  DRIVER_OK.

I basically mean that QEMU behaves based on the guest's behaviour.

So what you've described looks like a guest trying to configure the
config vector after it sets DRIVER_OK. As a result, QEMU tries to use
the irqfd without initialization.
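
To make the ordering concrete, here is a rough toy model of the sequence
described in this thread. It is only a sketch and not QEMU code: the event
names and first_unsafe_unmask() are made up for illustration, and the model
assumes the guest had picked vector 0 for config before the stop, as
reported earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_VECTOR 0xffff  /* same value as VIRTIO_NO_VECTOR */

/* Hypothetical events, one per step discussed in the thread. */
enum toy_event {
    EV_NOTIFIERS_OFF,   /* set_guest_notifiers(assign = false) */
    EV_RESET,           /* device reset: config vector -> NO_VECTOR */
    EV_NOTIFIERS_ON,    /* set_guest_notifiers(assign = true) */
    EV_GUEST_SET_VEC0,  /* guest writes 0 to the config vector register */
    EV_UNMASK_VEC0,     /* guest unmasks MSI-X vector 0 */
};

/*
 * Replay a trace and return the index of the first unmask of vector 0
 * that happens while no irqfd is set up for it, or -1 if there is none.
 */
static int first_unsafe_unmask(const enum toy_event *ev, size_t n)
{
    uint16_t config_vector = 0;  /* assumption: guest picked vector 0 */
    bool irqfd_ready = true;     /* and its irqfd was set up */

    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case EV_NOTIFIERS_OFF:
            irqfd_ready = false;          /* irqfd released */
            break;
        case EV_RESET:
            config_vector = NO_VECTOR;
            break;
        case EV_NOTIFIERS_ON:
            /* Setup is skipped while the vector is NO_VECTOR. */
            if (config_vector == 0) {
                irqfd_ready = true;
            }
            break;
        case EV_GUEST_SET_VEC0:
            /* Only the value changes; nothing sets up the irqfd here. */
            config_vector = 0;
            break;
        case EV_UNMASK_VEC0:
            if (!irqfd_ready) {
                return (int)i;
            }
            break;
        }
    }
    return -1;
}
```

Replaying stop, reset, start, vector write, unmask flags the final unmask,
while a plain stop, start, unmask trace does not.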

> > Spec doesn't forbid this, this is something we need to support.
> >
> > It looks to me the correct fix is to kvm_virtio_pci_vector_use_one()
> > when guest is writing to msix_vector after DRIVER_OK?
> >
> if I understand correctly. do you mean
> when  function   virtio_pci_common_read/virtio_pci_common_write was called
> we need to check the number of  vdev->config_vector/vq vector, if this
> was changed and also DRIVER_OK was set
> then we need to call virtio_pci_set_guest_notifiers() to re-init the irqfd?

It is not re-init, as it has been freed.

A quick fix would be to call kvm_virtio_pci_vector_use/unuse_one() when
a guest assigns/deassigns a vector after DRIVER_OK.
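
Roughly, the idea could look like the toy model below. This is only a
sketch under assumptions, not the real virtio-pci code: struct toy_dev and
all toy_* helpers are made up for illustration, and it assumes the config
vector is not shared with any virtqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NO_VECTOR  0xffff  /* same value as VIRTIO_NO_VECTOR */
#define TOY_NR_VECTORS 4       /* hypothetical number of MSI-X vectors */

struct toy_dev {
    bool driver_ok;                    /* guest has set DRIVER_OK */
    uint16_t config_vector;            /* vector chosen by the guest */
    bool irqfd_ready[TOY_NR_VECTORS];  /* stands in for a set-up irqfd */
};

/* Stand-in for the "use" side: set up the irqfd for a valid vector. */
static void toy_vector_use(struct toy_dev *d, uint16_t vector)
{
    if (vector >= TOY_NR_VECTORS) {    /* also covers TOY_NO_VECTOR */
        return;
    }
    d->irqfd_ready[vector] = true;
}

/* Stand-in for the "unuse" side: tear the irqfd down again. */
static void toy_vector_unuse(struct toy_dev *d, uint16_t vector)
{
    if (vector >= TOY_NR_VECTORS) {
        return;
    }
    d->irqfd_ready[vector] = false;
}

/* Guest writes the config vector register. */
static void toy_write_config_vector(struct toy_dev *d, uint16_t vector)
{
    assert(d);
    uint16_t old = d->config_vector;

    d->config_vector = vector;
    if (!d->driver_ok || old == vector) {
        return;  /* before DRIVER_OK the normal start path handles it */
    }
    toy_vector_unuse(d, old);   /* deassign the previous vector */
    toy_vector_use(d, vector);  /* assign the new one */
}

/* An unmask is only safe if the irqfd for that vector is set up. */
static bool toy_unmask_is_safe(const struct toy_dev *d, uint16_t vector)
{
    return vector < TOY_NR_VECTORS && d->irqfd_ready[vector];
}
```

With this, a vector written after DRIVER_OK gets its irqfd set up at write
time, so a later unmask of that vector no longer hits an unprepared irqfd.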

Thanks

> Thanks
> cindy
> > Thanks
> >
> > > the fix could work maybe is we try get to know if this was changed
> > > from another value
> > > and use that one? this seems strange.
> > > 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-28 Thread Cindy Lu
On Thu, Mar 28, 2024 at 12:12 PM Jason Wang  wrote:
>
> On Wed, Mar 27, 2024 at 5:33 PM Cindy Lu  wrote:
> >
> > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi Cindy:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > > > > >
> > > > > > > > we need a crash in Non-standard image, here is the jira for 
> > > > > > > > this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > > > initialization..
> > > > > > > >
> > > > > > > > During the booting process of the Vyatta image, the behavior of 
> > > > > > > > the called function in qemu is as follows:
> > > > > > > >
> > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for 
> > > > > > > > vector 0
> > > > > > >
> > > > > > > Before vhost_net_stop(), do we know which vector is used by which 
> > > > > > > queue?
> > > > > > >
> > > > > > before this stop, vdev->config_verctor is get from
> > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > it was set to vector 0
> > > > >
> > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > >
> > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > configure vector
> > > > > > > >
> > > > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > > > VIRTIO_NO_VECTORt
> > > > > > > >
> > > > > > > > 3.vhost_net_start() was called (at this time the configure 
> > > > > > > > vector is
> > > > > > > > still VIRTIO_NO_VECTOR) and call 
> > > > > > > > virtio_pci_set_guest_notifiers() with
> > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during 
> > > > > > > > this process
> > > > > > >
> > > > > > > How does the configure vector differ from the virtqueue vector 
> > > > > > > here?
> > > > > > >
> > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > msix_fire_vector_notifier()
> > > > > > been called will cause the crash at this time.
> > > > >
> > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > > the assignment is true?
> > > > >
> > > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff)
> > > >
> > > > then it will called kvm_virtio_pci_vector_use_one()
> > > >
> > > > in this function, there is a check for
> > > >
> > > > if (vector >= msix_nr_vectors_allocated(dev))
> > > >
> > > > { return 0; }
> > > >
> > > > So it will return.
> > >
> > > How about let's just fix this?
> > >
> > > Btw, it's better to explain in detail like the above in the next version.
> > >
> > > Thanks
> > >
> > The problem is I think the behavior here is correct, The vector here is
> >  VIRTIO_NO_VECTOR and we should return,
>
> So if I understand correctly, the configure vector is configured after
> DRIVER_OK?
>
Sorry, I didn't get your point. Do you mean set_guest_notifiers()?
That was called during system boot,
but the values of vdev->config_vector and the vq vectors are changed
by virtio_pci_common_read()/virtio_pci_common_write(),
and these functions do not check whether DRIVER_OK is set.
> Spec doesn't forbid this, this is something we need to support.
>
> It looks to me the correct fix is to kvm_virtio_pci_vector_use_one()
> when guest is writing to msix_vector after DRIVER_OK?
>
If I understand correctly, do you mean that
when virtio_pci_common_read()/virtio_pci_common_write() is called,
we need to check the value of vdev->config_vector or the vq vector, and if
it has changed and DRIVER_OK is also set,
we need to call virtio_pci_set_guest_notifiers() to re-init the irqfd?
Thanks
cindy
> Thanks
>
> > the fix could work maybe is we try get to know if this was changed
> > from another value
> > and use that one? this seems strange.
> > Thanks
> > cindy
> > > >
> > > > > > So I think this should
> > > > > > be a bug in this guest image
> > > > >
> > > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > > >
> > > > > It would be nice if we can have a qtest for this on top.
> > > > >
> > > > > Thanks
> > > > >
> > > > sure, got it, I have done the Qtest, and it passed
> > > > here is the result
> > > >
> > > > Ok: 794
> > > > Expected Fail:  0
> > > > Fail:   0
> > > > Unexpected Pass:0
> > > > Skipped:32
> > > > Timeout:0
> > > >
> > > > > > > >
> > > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() 
> > > > > > > > was
> > > > > > > > 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-28 Thread Cindy Lu
On Thu, Mar 28, 2024 at 12:14 PM Jason Wang  wrote:
>
> On Wed, Mar 27, 2024 at 5:44 PM Cindy Lu  wrote:
> >
> > On Wed, Mar 27, 2024 at 5:13 PM Jason Wang  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi Cindy:
> > > > > > > >
> > > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > we need a crash in Non-standard image, here is the jira for 
> > > > > > > > > this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > > > > initialization..
> > > > > > > > >
> > > > > > > > > During the booting process of the Vyatta image, the behavior 
> > > > > > > > > of the called function in qemu is as follows:
> > > > > > > > >
> > > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for 
> > > > > > > > > vector 0
> > > > > > > >
> > > > > > > > Before vhost_net_stop(), do we know which vector is used by 
> > > > > > > > which queue?
> > > > > > > >
> > > > > > > before this stop, vdev->config_verctor is get from
> > > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > > it was set to vector 0
> > > > > >
> > > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > > >
> > > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > > configure vector
> > > > > > > > >
> > > > > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > > > > VIRTIO_NO_VECTORt
> > > > > > > > >
> > > > > > > > > 3.vhost_net_start() was called (at this time the configure 
> > > > > > > > > vector is
> > > > > > > > > still VIRTIO_NO_VECTOR) and call 
> > > > > > > > > virtio_pci_set_guest_notifiers() with
> > > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during 
> > > > > > > > > this process
> > > > > > > >
> > > > > > > > How does the configure vector differ from the virtqueue vector 
> > > > > > > > here?
> > > > > > > >
> > > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > > msix_fire_vector_notifier()
> > > > > > > been called will cause the crash at this time.
> > > > > >
> > > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd 
> > > > > > when
> > > > > > the assignment is true?
> > > > > >
> > > > > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff)
> > > > >
> > > > > then it will called kvm_virtio_pci_vector_use_one()
> > > > >
> > > > > in this function, there is a check for
> > > > >
> > > > > if (vector >= msix_nr_vectors_allocated(dev))
> > > > >
> > > > > { return 0; }
> > > > >
> > > > > So it will return.
> > > >
> > > > How about let's just fix this?
> > >
> > > Btw, another question, how does vDPA work here?
> > >
> > > Thanks
> > >
> > the rhel/fedroa guest image will not call  vrtio_stop and virtio_reset
> > during the boot
> > So vector will not change to  VIRTIO_NO_VECTOR. So the vdpa's
> > configure interrupt
> > Should work ok and there is no crash
>
> I mean:
>
> 1) if vDPA can work with the image you used to reproduce the issue
> 2) if current Qemu can work on old kernel without configure interrupt
> support for vDPA
>
Really sorry for the confusion, let me try to answer:
1. The vDPA device also does not work with this image,
because the irqfd for vector 0 has been released and the guest image
then unmasks this vector. None of this code depends on vDPA-related
processing.

2. The current QEMU can work with the old kernel
because, as far as I can see from debugging, with the old kernel
msix_fire_vector_notifier() is not called for vector 0, and vector 0
is not unmasked.
Thanks
cindy

> Thanks
>
> > Thanks
> > cindy
> >
> > > >
> > > > Btw, it's better to explain in detail like the above in the next 
> > > > version.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > > > So I think this should
> > > > > > > be a bug in this guest image
> > > > > >
> > > > > > The point is Qemu should not crash even if the guest driver is 
> > > > > > buggy.
> > > > > >
> > > > > > It would be nice if we can have a qtest for this on top.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > sure, got it, I have done the Qtest, and it passed
> > > > > here is the result
> > > > >
> > > > > Ok: 794
> > > > > Expected Fail:  0
> > > > > Fail:   0
> > > > > Unexpected Pass:0
> > > > > Skipped:32
> > > > > Timeout:0
> > > > >
> > > > > > > > >
> > > > > > > > > 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Jason Wang
On Wed, Mar 27, 2024 at 5:44 PM Cindy Lu  wrote:
>
> On Wed, Mar 27, 2024 at 5:13 PM Jason Wang  wrote:
> >
> > On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi Cindy:
> > > > > > >
> > > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > > > > >
> > > > > > > > we need a crash in Non-standard image, here is the jira for 
> > > > > > > > this https://issues.redhat.com/browse/RHEL-28522
> > > > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > > > initialization..
> > > > > > > >
> > > > > > > > During the booting process of the Vyatta image, the behavior of 
> > > > > > > > the called function in qemu is as follows:
> > > > > > > >
> > > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for 
> > > > > > > > vector 0
> > > > > > >
> > > > > > > Before vhost_net_stop(), do we know which vector is used by which 
> > > > > > > queue?
> > > > > > >
> > > > > > before this stop, vdev->config_verctor is get from
> > > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > > it was set to vector 0
> > > > >
> > > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > > >
> > > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > > configure vector
> > > > > > > >
> > > > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > > > VIRTIO_NO_VECTORt
> > > > > > > >
> > > > > > > > 3.vhost_net_start() was called (at this time the configure 
> > > > > > > > vector is
> > > > > > > > still VIRTIO_NO_VECTOR) and call 
> > > > > > > > virtio_pci_set_guest_notifiers() with
> > > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during 
> > > > > > > > this process
> > > > > > >
> > > > > > > How does the configure vector differ from the virtqueue vector 
> > > > > > > here?
> > > > > > >
> > > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > > msix_fire_vector_notifier()
> > > > > > been called will cause the crash at this time.
> > > > >
> > > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > > the assignment is true?
> > > > >
> > > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff)
> > > >
> > > > then it will called kvm_virtio_pci_vector_use_one()
> > > >
> > > > in this function, there is a check for
> > > >
> > > > if (vector >= msix_nr_vectors_allocated(dev))
> > > >
> > > > { return 0; }
> > > >
> > > > So it will return.
> > >
> > > How about let's just fix this?
> >
> > Btw, another question, how does vDPA work here?
> >
> > Thanks
> >
> the rhel/fedroa guest image will not call  vrtio_stop and virtio_reset
> during the boot
> So vector will not change to  VIRTIO_NO_VECTOR. So the vdpa's
> configure interrupt
> Should work ok and there is no crash

I mean:

1) if vDPA can work with the image you used to reproduce the issue
2) if current Qemu can work on old kernel without configure interrupt
support for vDPA

Thanks

> Thanks
> cindy
>
> > >
> > > Btw, it's better to explain in detail like the above in the next version.
> > >
> > > Thanks
> > >
> > > >
> > > > > > So I think this should
> > > > > > be a bug in this guest image
> > > > >
> > > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > > >
> > > > > It would be nice if we can have a qtest for this on top.
> > > > >
> > > > > Thanks
> > > > >
> > > > sure, got it, I have done the Qtest, and it passed
> > > > here is the result
> > > >
> > > > Ok: 794
> > > > Expected Fail:  0
> > > > Fail:   0
> > > > Unexpected Pass:0
> > > > Skipped:32
> > > > Timeout:0
> > > >
> > > > > > > >
> > > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() 
> > > > > > > > was
> > > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > > >
> > > > > > > > The reason for not reproducing in RHEL/fedora guest image is 
> > > > > > > > because
> > > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop 
> > > > > > > > and then virtio_reset, and also won't call 
> > > > > > > > msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > > >
> > > > > > > > The reason for not reproducing before configure interrupt 
> > > > > > > > support is because
> > > > > > > > vector 0 is 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Jason Wang
On Wed, Mar 27, 2024 at 5:33 PM Cindy Lu  wrote:
>
> On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
> >
> > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > Hi Cindy:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > > > >
> > > > > > > we need a crash in Non-standard image, here is the jira for this 
> > > > > > > https://issues.redhat.com/browse/RHEL-28522
> > > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > > initialization..
> > > > > > >
> > > > > > > During the booting process of the Vyatta image, the behavior of 
> > > > > > > the called function in qemu is as follows:
> > > > > > >
> > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for 
> > > > > > > vector 0
> > > > > >
> > > > > > Before vhost_net_stop(), do we know which vector is used by which 
> > > > > > queue?
> > > > > >
> > > > > before this stop, vdev->config_verctor is get from
> > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > it was set to vector 0
> > > >
> > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > >
> > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > configure vector
> > > > > > >
> > > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > > VIRTIO_NO_VECTORt
> > > > > > >
> > > > > > > 3.vhost_net_start() was called (at this time the configure vector 
> > > > > > > is
> > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() 
> > > > > > > with
> > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during 
> > > > > > > this process
> > > > > >
> > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > >
> > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > msix_fire_vector_notifier()
> > > > > been called will cause the crash at this time.
> > > >
> > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > the assignment is true?
> > > >
> > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff)
> > >
> > > then it will called kvm_virtio_pci_vector_use_one()
> > >
> > > in this function, there is a check for
> > >
> > > if (vector >= msix_nr_vectors_allocated(dev))
> > >
> > > { return 0; }
> > >
> > > So it will return.
> >
> > How about let's just fix this?
> >
> > Btw, it's better to explain in detail like the above in the next version.
> >
> > Thanks
> >
> The problem is I think the behavior here is correct, The vector here is
>  VIRTIO_NO_VECTOR and we should return,

So if I understand correctly, the configure vector is configured after
DRIVER_OK?

The spec doesn't forbid this, so this is something we need to support.

It looks to me like the correct fix is to call
kvm_virtio_pci_vector_use_one() when the guest writes to msix_vector
after DRIVER_OK?

Thanks

> the fix could work maybe is we try get to know if this was changed
> from another value
> and use that one? this seems strange.
> Thanks
> cindy
> > >
> > > > > So I think this should
> > > > > be a bug in this guest image
> > > >
> > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > >
> > > > It would be nice if we can have a qtest for this on top.
> > > >
> > > > Thanks
> > > >
> > > sure, got it, I have done the Qtest, and it passed
> > > here is the result
> > >
> > > Ok: 794
> > > Expected Fail:  0
> > > Fail:   0
> > > Unexpected Pass:0
> > > Skipped:32
> > > Timeout:0
> > >
> > > > > > >
> > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() 
> > > > > > > was
> > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > >
> > > > > > > The reason for not reproducing in RHEL/fedora guest image is 
> > > > > > > because
> > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop 
> > > > > > > and then virtio_reset, and also won't call 
> > > > > > > msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > >
> > > > > > > The reason for not reproducing before configure interrupt support 
> > > > > > > is because
> > > > > > > vector 0 is for configure interrupt,  before the support for 
> > > > > > > configure interrupts, the notifier process will not handle vector 
> > > > > > > 0.
> > > > > > >
> > > > > > > For the device Vyatta using, it doesn't support configure 
> > 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Cindy Lu
On Wed, Mar 27, 2024 at 5:13 PM Jason Wang  wrote:
>
> On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
> >
> > On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > Hi Cindy:
> > > > > >
> > > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > > > >
> > > > > > > we need a crash in Non-standard image, here is the jira for this 
> > > > > > > https://issues.redhat.com/browse/RHEL-28522
> > > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > > initialization..
> > > > > > >
> > > > > > > During the booting process of the Vyatta image, the behavior of 
> > > > > > > the called function in qemu is as follows:
> > > > > > >
> > > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for 
> > > > > > > vector 0
> > > > > >
> > > > > > Before vhost_net_stop(), do we know which vector is used by which 
> > > > > > queue?
> > > > > >
> > > > > before this stop, vdev->config_verctor is get from
> > > > > virtio_pci_common_read/virtio_pci_common_write
> > > > > it was set to vector 0
> > > >
> > > > I basically meant if vector 0 is shared with some virtqueues here.
> > > >
> > > Really sorry for this, vq's vector is 1,2, and will not share with the
> > > configure vector
> > > > > > >
> > > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > > VIRTIO_NO_VECTORt
> > > > > > >
> > > > > > > 3.vhost_net_start() was called (at this time the configure vector 
> > > > > > > is
> > > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() 
> > > > > > > with
> > > > > > > assgin= true, so the irqfd for vector 0 was not "init" during 
> > > > > > > this process
> > > > > >
> > > > > > How does the configure vector differ from the virtqueue vector here?
> > > > > >
> > > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > > msix_fire_vector_notifier()
> > > > > been called will cause the crash at this time.
> > > >
> > > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > > the assignment is true?
> > > >
> > > It will allocate, but the vector is VIRTIO_NO_VECTOR (0xffff)
> > >
> > > then it will called kvm_virtio_pci_vector_use_one()
> > >
> > > in this function, there is a check for
> > >
> > > if (vector >= msix_nr_vectors_allocated(dev))
> > >
> > > { return 0; }
> > >
> > > So it will return.
> >
> > How about let's just fix this?
>
> Btw, another question, how does vDPA work here?
>
> Thanks
>
The RHEL/Fedora guest image does not trigger vhost_net_stop() and
virtio_reset() during boot,
so the vector does not change to VIRTIO_NO_VECTOR. The vDPA
configure interrupt
should therefore work, and there is no crash.
Thanks
cindy

> >
> > Btw, it's better to explain in detail like the above in the next version.
> >
> > Thanks
> >
> > >
> > > > > So I think this should
> > > > > be a bug in this guest image
> > > >
> > > > The point is Qemu should not crash even if the guest driver is buggy.
> > > >
> > > > It would be nice if we can have a qtest for this on top.
> > > >
> > > > Thanks
> > > >
> > > sure, got it, I have done the Qtest, and it passed
> > > here is the result
> > >
> > > Ok: 794
> > > Expected Fail:  0
> > > Fail:   0
> > > Unexpected Pass:0
> > > Skipped:32
> > > Timeout:0
> > >
> > > > > > >
> > > > > > > 4. The system continues to boot and msix_fire_vector_notifier() 
> > > > > > > was
> > > > > > > called unmask the vector 0 and then met the crash
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > > >
> > > > > > > The reason for not reproducing in RHEL/fedora guest image is 
> > > > > > > because
> > > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop 
> > > > > > > and then virtio_reset, and also won't call 
> > > > > > > msix_fire_vector_notifier for vector 0 during system boot.
> > > > > > >
> > > > > > > The reason for not reproducing before configure interrupt support 
> > > > > > > is because
> > > > > > > vector 0 is for configure interrupt,  before the support for 
> > > > > > > configure interrupts, the notifier process will not handle vector 
> > > > > > > 0.
> > > > > > >
> > > > > > > For the device Vyatta using, it doesn't support configure 
> > > > > > > interrupts at all, So we plan to disable the configure interrupts 
> > > > > > > in unsupported device
> > > > > >
> > > > > > Btw, let's tweak the changelog, it's a little bit hard to 
> > > > > > understand.
> > > > > >
> > > > > sure will do
> 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Cindy Lu
On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
>
> On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> >
> > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  wrote:
> > > > >
> > > > > Hi Cindy:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > > >
> > > > > > we need a crash in Non-standard image, here is the jira for this 
> > > > > > https://issues.redhat.com/browse/RHEL-28522
> > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > initialization..
> > > > > >
> > > > > > During the booting process of the Vyatta image, the behavior of the 
> > > > > > called function in qemu is as follows:
> > > > > >
> > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > >
> > > > > Before vhost_net_stop(), do we know which vector is used by which 
> > > > > queue?
> > > > >
> > > > before this stop, vdev->config_verctor is get from
> > > > virtio_pci_common_read/virtio_pci_common_write
> > > > it was set to vector 0
> > >
> > > I basically meant if vector 0 is shared with some virtqueues here.
> > >
> > Really sorry for this, vq's vector is 1,2, and will not share with the
> > configure vector
> > > > > >
> > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > VIRTIO_NO_VECTORt
> > > > > >
> > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() 
> > > > > > with
> > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this 
> > > > > > process
> > > > >
> > > > > How does the configure vector differ from the virtqueue vector here?
> > > > >
> > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > msix_fire_vector_notifier()
> > > > been called will cause the crash at this time.
> > >
> > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > the assignment is true?
> > >
> > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0x)
> >
> > then it will called kvm_virtio_pci_vector_use_one()
> >
> > in this function, there is a check for
> >
> > if (vector >= msix_nr_vectors_allocated(dev))
> >
> > { return 0; }
> >
> > So it will return.
>
> How about let's just fix this?
>
> Btw, it's better to explain in detail like the above in the next version.
>
> Thanks
>
The problem is that I think the behavior here is correct: the vector here is
VIRTIO_NO_VECTOR, so we should return.
The only fix that might work is to find out whether the vector was changed
from another value and use that one, but that seems strange.
Thanks
Cindy
> >
> > > > So I think this should
> > > > be a bug in this guest image
> > >
> > > The point is Qemu should not crash even if the guest driver is buggy.
> > >
> > > It would be nice if we can have a qtest for this on top.
> > >
> > > Thanks
> > >
> > sure, got it, I have done the Qtest, and it passed
> > here is the result
> >
> > Ok: 794
> > Expected Fail:  0
> > Fail:   0
> > Unexpected Pass:0
> > Skipped:32
> > Timeout:0
> >
> > > > > >
> > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > called unmask the vector 0 and then met the crash
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > >
> > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and 
> > > > > > then virtio_reset, and also won't call msix_fire_vector_notifier 
> > > > > > for vector 0 during system boot.
> > > > > >
> > > > > > The reason for not reproducing before configure interrupt support 
> > > > > > is because
> > > > > > vector 0 is for configure interrupt,  before the support for 
> > > > > > configure interrupts, the notifier process will not handle vector 0.
> > > > > >
> > > > > > For the device Vyatta using, it doesn't support configure 
> > > > > > interrupts at all, So we plan to disable the configure interrupts 
> > > > > > in unsupported device
> > > > >
> > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > >
> > > > sure will do
> > > > thanks
> > > > Cindy
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Signed-off-by: Cindy Lu 
> > > > > >
> > > > > > Cindy Lu (2):
> > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > >
> > > > > >  hw/net/virtio-net.c|  5 -
> > > > > >  hw/virtio/virtio-pci.c | 41 
> > > > > > 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Jason Wang
On Wed, Mar 27, 2024 at 5:12 PM Jason Wang  wrote:
>
> On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
> >
> > On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > > >
> > > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  wrote:
> > > > >
> > > > > Hi Cindy:
> > > > >
> > > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > > >
> > > > > > we need a crash in Non-standard image, here is the jira for this 
> > > > > > https://issues.redhat.com/browse/RHEL-28522
> > > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > > initialization..
> > > > > >
> > > > > > During the booting process of the Vyatta image, the behavior of the 
> > > > > > called function in qemu is as follows:
> > > > > >
> > > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > > >
> > > > > Before vhost_net_stop(), do we know which vector is used by which 
> > > > > queue?
> > > > >
> > > > before this stop, vdev->config_verctor is get from
> > > > virtio_pci_common_read/virtio_pci_common_write
> > > > it was set to vector 0
> > >
> > > I basically meant if vector 0 is shared with some virtqueues here.
> > >
> > Really sorry for this, vq's vector is 1,2, and will not share with the
> > configure vector
> > > > > >
> > > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > > VIRTIO_NO_VECTORt
> > > > > >
> > > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() 
> > > > > > with
> > > > > > assgin= true, so the irqfd for vector 0 was not "init" during this 
> > > > > > process
> > > > >
> > > > > How does the configure vector differ from the virtqueue vector here?
> > > > >
> > > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > > msix_fire_vector_notifier()
> > > > been called will cause the crash at this time.
> > >
> > > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > > the assignment is true?
> > >
> > It will allocate, but  the vector is VIRTIO_NO_VECTOR (0x)
> >
> > then it will called kvm_virtio_pci_vector_use_one()
> >
> > in this function, there is a check for
> >
> > if (vector >= msix_nr_vectors_allocated(dev))
> >
> > { return 0; }
> >
> > So it will return.
>
> How about let's just fix this?

Btw, another question, how does vDPA work here?

Thanks

>
> Btw, it's better to explain in detail like the above in the next version.
>
> Thanks
>
> >
> > > > So I think this should
> > > > be a bug in this guest image
> > >
> > > The point is Qemu should not crash even if the guest driver is buggy.
> > >
> > > It would be nice if we can have a qtest for this on top.
> > >
> > > Thanks
> > >
> > sure, got it, I have done the Qtest, and it passed
> > here is the result
> >
> > Ok: 794
> > Expected Fail:  0
> > Fail:   0
> > Unexpected Pass:0
> > Skipped:32
> > Timeout:0
> >
> > > > > >
> > > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > > called unmask the vector 0 and then met the crash
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > > >
> > > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and 
> > > > > > then virtio_reset, and also won't call msix_fire_vector_notifier 
> > > > > > for vector 0 during system boot.
> > > > > >
> > > > > > The reason for not reproducing before configure interrupt support 
> > > > > > is because
> > > > > > vector 0 is for configure interrupt,  before the support for 
> > > > > > configure interrupts, the notifier process will not handle vector 0.
> > > > > >
> > > > > > For the device Vyatta using, it doesn't support configure 
> > > > > > interrupts at all, So we plan to disable the configure interrupts 
> > > > > > in unsupported device
> > > > >
> > > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > > >
> > > > sure will do
> > > > thanks
> > > > Cindy
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Signed-off-by: Cindy Lu 
> > > > > >
> > > > > > Cindy Lu (2):
> > > > > >   virtio-net: disable the configure interrupt for not support device
> > > > > >   virtio-pci: check if the configure interrupt enable
> > > > > >
> > > > > >  hw/net/virtio-net.c|  5 -
> > > > > >  hw/virtio/virtio-pci.c | 41 
> > > > > > +-
> > > > > >  hw/virtio/virtio.c |  1 +
> > > > > >  include/hw/virtio/virtio.h |  1 +
> > > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > 

Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Jason Wang
On Wed, Mar 27, 2024 at 4:28 PM Cindy Lu  wrote:
>
> On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
> >
> > On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> > >
> > > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  wrote:
> > > >
> > > > Hi Cindy:
> > > >
> > > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > > >
> > > > > we need a crash in Non-standard image, here is the jira for this 
> > > > > https://issues.redhat.com/browse/RHEL-28522
> > > > > The root cause of the issue is that an IRQFD was used without 
> > > > > initialization..
> > > > >
> > > > > During the booting process of the Vyatta image, the behavior of the 
> > > > > called function in qemu is as follows:
> > > > >
> > > > > 1. vhost_net_stop() was called, this will call the function
> > > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > > >
> > > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > > >
> > > before this stop, vdev->config_verctor is get from
> > > virtio_pci_common_read/virtio_pci_common_write
> > > it was set to vector 0
> >
> > I basically meant if vector 0 is shared with some virtqueues here.
> >
> Really sorry for this, vq's vector is 1,2, and will not share with the
> configure vector
> > > > >
> > > > > 2. virtio_reset() was called -->set configure vector to 
> > > > > VIRTIO_NO_VECTORt
> > > > >
> > > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > > assgin= true, so the irqfd for vector 0 was not "init" during this 
> > > > > process
> > > >
> > > > How does the configure vector differ from the virtqueue vector here?
> > > >
> > > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > > msix_fire_vector_notifier()
> > > been called will cause the crash at this time.
> >
> > Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> > the assignment is true?
> >
> It will allocate, but  the vector is VIRTIO_NO_VECTOR (0x)
>
> then it will called kvm_virtio_pci_vector_use_one()
>
> in this function, there is a check for
>
> if (vector >= msix_nr_vectors_allocated(dev))
>
> { return 0; }
>
> So it will return.

How about we just fix this?

Btw, it's better to explain in detail like the above in the next version.

Thanks

>
> > > So I think this should
> > > be a bug in this guest image
> >
> > The point is Qemu should not crash even if the guest driver is buggy.
> >
> > It would be nice if we can have a qtest for this on top.
> >
> > Thanks
> >
> sure, got it, I have done the Qtest, and it passed
> here is the result
>
> Ok: 794
> Expected Fail:  0
> Fail:   0
> Unexpected Pass:0
> Skipped:32
> Timeout:0
>
> > > > >
> > > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > > called unmask the vector 0 and then met the crash
> > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > > >
> > > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and 
> > > > > then virtio_reset, and also won't call msix_fire_vector_notifier for 
> > > > > vector 0 during system boot.
> > > > >
> > > > > The reason for not reproducing before configure interrupt support is 
> > > > > because
> > > > > vector 0 is for configure interrupt,  before the support for 
> > > > > configure interrupts, the notifier process will not handle vector 0.
> > > > >
> > > > > For the device Vyatta using, it doesn't support configure interrupts 
> > > > > at all, So we plan to disable the configure interrupts in unsupported 
> > > > > device
> > > >
> > > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > > >
> > > sure will do
> > > thanks
> > > Cindy
> > > > Thanks
> > > >
> > > > >
> > > > > Signed-off-by: Cindy Lu 
> > > > >
> > > > > Cindy Lu (2):
> > > > >   virtio-net: disable the configure interrupt for not support device
> > > > >   virtio-pci: check if the configure interrupt enable
> > > > >
> > > > >  hw/net/virtio-net.c|  5 -
> > > > >  hw/virtio/virtio-pci.c | 41 
> > > > > +-
> > > > >  hw/virtio/virtio.c |  1 +
> > > > >  include/hw/virtio/virtio.h |  1 +
> > > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > > >
> > > > > --
> > > > > 2.43.0
> > > > >
> > > >
> > >
> >
>




Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Cindy Lu
On Wed, Mar 27, 2024 at 3:54 PM Jason Wang  wrote:
>
> On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
> >
> > On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  wrote:
> > >
> > > Hi Cindy:
> > >
> > > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > > >
> > > > we need a crash in Non-standard image, here is the jira for this 
> > > > https://issues.redhat.com/browse/RHEL-28522
> > > > The root cause of the issue is that an IRQFD was used without 
> > > > initialization..
> > > >
> > > > During the booting process of the Vyatta image, the behavior of the 
> > > > called function in qemu is as follows:
> > > >
> > > > 1. vhost_net_stop() was called, this will call the function
> > > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> > >
> > > Before vhost_net_stop(), do we know which vector is used by which queue?
> > >
> > before this stop, vdev->config_verctor is get from
> > virtio_pci_common_read/virtio_pci_common_write
> > it was set to vector 0
>
> I basically meant if vector 0 is shared with some virtqueues here.
>
Really sorry for the confusion. The vq vectors are 1 and 2, so they are not
shared with the configure vector.
> > > >
> > > > 2. virtio_reset() was called -->set configure vector to 
> > > > VIRTIO_NO_VECTORt
> > > >
> > > > 3.vhost_net_start() was called (at this time the configure vector is
> > > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > > assgin= true, so the irqfd for vector 0 was not "init" during this 
> > > > process
> > >
> > > How does the configure vector differ from the virtqueue vector here?
> > >
> > All the vectors are VIRTIO_NO_VECTOR (including vq). any
> > msix_fire_vector_notifier()
> > been called will cause the crash at this time.
>
> Won't virtio_pci_set_guest_notifiers() will try to allocate irqfd when
> the assignment is true?
>
It will try to allocate, but the vector is VIRTIO_NO_VECTOR (0x)

It will then call kvm_virtio_pci_vector_use_one().

In this function, there is a check:

if (vector >= msix_nr_vectors_allocated(dev)) {
    return 0;
}

So it returns without setting up the irqfd.
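
To make the early return concrete, here is a minimal stand-alone sketch of the
check described above. It is not the QEMU code: struct fake_dev,
fake_vector_use_one() and FAKE_MAX_VECTORS are hypothetical names made up for
illustration, and the only value taken from QEMU is VIRTIO_NO_VECTOR (0xffff
in the virtio headers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value of VIRTIO_NO_VECTOR in QEMU's virtio headers. */
#define VIRTIO_NO_VECTOR 0xffff
/* Hypothetical upper bound for this model only. */
#define FAKE_MAX_VECTORS 8

/* Hypothetical stand-in for a PCI device's MSI-X state. */
struct fake_dev {
    unsigned int nr_vectors;             /* what msix_nr_vectors_allocated() would report */
    bool irqfd_ready[FAKE_MAX_VECTORS];  /* true once an irqfd was set up for the vector */
};

/* Simplified model of the bounds check in kvm_virtio_pci_vector_use_one(). */
int fake_vector_use_one(struct fake_dev *dev, uint16_t vector)
{
    if (vector >= dev->nr_vectors) {
        return 0;   /* out of range: nothing is set up, but no error is reported */
    }
    assert(vector < FAKE_MAX_VECTORS);
    dev->irqfd_ready[vector] = true;
    return 0;
}
```

With nr_vectors = 3, passing VIRTIO_NO_VECTOR returns 0 and leaves every
irqfd_ready entry false, which is the state the later unmask of vector 0 runs
into.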

> > So I think this should
> > be a bug in this guest image
>
> The point is Qemu should not crash even if the guest driver is buggy.
>
> It would be nice if we can have a qtest for this on top.
>
> Thanks
>
Sure, got it. I have run the qtests and they passed.
Here is the result:

Ok:                 794
Expected Fail:      0
Fail:               0
Unexpected Pass:    0
Skipped:            32
Timeout:            0

> > > >
> > > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > > called unmask the vector 0 and then met the crash
> > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > > >
> > > > The reason for not reproducing in RHEL/fedora guest image is because
> > > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and 
> > > > then virtio_reset, and also won't call msix_fire_vector_notifier for 
> > > > vector 0 during system boot.
> > > >
> > > > The reason for not reproducing before configure interrupt support is 
> > > > because
> > > > vector 0 is for configure interrupt,  before the support for configure 
> > > > interrupts, the notifier process will not handle vector 0.
> > > >
> > > > For the device Vyatta using, it doesn't support configure interrupts at 
> > > > all, So we plan to disable the configure interrupts in unsupported 
> > > > device
> > >
> > > Btw, let's tweak the changelog, it's a little bit hard to understand.
> > >
> > sure will do
> > thanks
> > Cindy
> > > Thanks
> > >
> > > >
> > > > Signed-off-by: Cindy Lu 
> > > >
> > > > Cindy Lu (2):
> > > >   virtio-net: disable the configure interrupt for not support device
> > > >   virtio-pci: check if the configure interrupt enable
> > > >
> > > >  hw/net/virtio-net.c|  5 -
> > > >  hw/virtio/virtio-pci.c | 41 +-
> > > >  hw/virtio/virtio.c |  1 +
> > > >  include/hw/virtio/virtio.h |  1 +
> > > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > > >
> > > > --
> > > > 2.43.0
> > > >
> > >
> >
>




Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Jason Wang
On Wed, Mar 27, 2024 at 2:03 PM Cindy Lu  wrote:
>
> On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  wrote:
> >
> > Hi Cindy:
> >
> > On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> > >
> > > we need a crash in Non-standard image, here is the jira for this 
> > > https://issues.redhat.com/browse/RHEL-28522
> > > The root cause of the issue is that an IRQFD was used without 
> > > initialization..
> > >
> > > During the booting process of the Vyatta image, the behavior of the 
> > > called function in qemu is as follows:
> > >
> > > 1. vhost_net_stop() was called, this will call the function
> > > virtio_pci_set_guest_notifiers() with assgin= false, and
> > > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
> >
> > Before vhost_net_stop(), do we know which vector is used by which queue?
> >
> before this stop, vdev->config_verctor is get from
> virtio_pci_common_read/virtio_pci_common_write
> it was set to vector 0

I basically meant if vector 0 is shared with some virtqueues here.

> > >
> > > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> > >
> > > 3.vhost_net_start() was called (at this time the configure vector is
> > > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > > assgin= true, so the irqfd for vector 0 was not "init" during this process
> >
> > How does the configure vector differ from the virtqueue vector here?
> >
> All the vectors are VIRTIO_NO_VECTOR (including vq). any
> msix_fire_vector_notifier()
> been called will cause the crash at this time.

Won't virtio_pci_set_guest_notifiers() try to allocate an irqfd when
assign is true?

> So I think this should
> be a bug in this guest image

The point is that QEMU should not crash even if the guest driver is buggy.

It would be nice if we can have a qtest for this on top.
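
As a rough illustration of "should not crash", one possible shape of a guard
is sketched below. This is only an assumption about how such a check could
look, not the posted patches and not existing QEMU code: struct
guarded_vector and guarded_unmask() are hypothetical names, and only the
VIRTIO_NO_VECTOR value (0xffff) comes from QEMU's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NO_VECTOR 0xffff   /* value used by QEMU's virtio headers */

/* Hypothetical per-vector state; not a QEMU structure. */
struct guarded_vector {
    bool irqfd_ready;   /* set when an irqfd was attached */
    int unmask_count;   /* how many unmasks were actually forwarded */
};

/*
 * Hypothetical guarded unmask: forward the unmask only when the device
 * still owns a vector and an irqfd is attached to it.
 * Returns true if the unmask was forwarded, false if it was ignored.
 */
bool guarded_unmask(struct guarded_vector *v, uint16_t device_vector)
{
    assert(v);
    if (device_vector == VIRTIO_NO_VECTOR || !v->irqfd_ready) {
        return false;   /* unexpected guest behavior: ignore instead of crashing */
    }
    v->unmask_count++;
    return true;
}
```

The design choice here is simply to treat an unmask on an unowned or
uninitialized vector as a no-op, so guest behavior alone cannot take the
process down.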

Thanks

> > >
> > > 4. The system continues to boot and msix_fire_vector_notifier() was
> > > called unmask the vector 0 and then met the crash
> > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> > >
> > > The reason for not reproducing in RHEL/fedora guest image is because
> > > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then 
> > > virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 
> > > during system boot.
> > >
> > > The reason for not reproducing before configure interrupt support is 
> > > because
> > > vector 0 is for configure interrupt,  before the support for configure 
> > > interrupts, the notifier process will not handle vector 0.
> > >
> > > For the device Vyatta using, it doesn't support configure interrupts at 
> > > all, So we plan to disable the configure interrupts in unsupported device
> >
> > Btw, let's tweak the changelog, it's a little bit hard to understand.
> >
> sure will do
> thanks
> Cindy
> > Thanks
> >
> > >
> > > Signed-off-by: Cindy Lu 
> > >
> > > Cindy Lu (2):
> > >   virtio-net: disable the configure interrupt for not support device
> > >   virtio-pci: check if the configure interrupt enable
> > >
> > >  hw/net/virtio-net.c|  5 -
> > >  hw/virtio/virtio-pci.c | 41 +-
> > >  hw/virtio/virtio.c |  1 +
> > >  include/hw/virtio/virtio.h |  1 +
> > >  4 files changed, 29 insertions(+), 19 deletions(-)
> > >
> > > --
> > > 2.43.0
> > >
> >
>




Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-27 Thread Cindy Lu
On Wed, Mar 27, 2024 at 11:05 AM Jason Wang  wrote:
>
> Hi Cindy:
>
> On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
> >
> > we need a crash in Non-standard image, here is the jira for this 
> > https://issues.redhat.com/browse/RHEL-28522
> > The root cause of the issue is that an IRQFD was used without 
> > initialization..
> >
> > During the booting process of the Vyatta image, the behavior of the called 
> > function in qemu is as follows:
> >
> > 1. vhost_net_stop() was called, this will call the function
> > virtio_pci_set_guest_notifiers() with assgin= false, and
> > virtio_pci_set_guest_notifiers() will release the irqfd for vector 0
>
> Before vhost_net_stop(), do we know which vector is used by which queue?
>
Before this stop, vdev->config_vector is obtained via
virtio_pci_common_read()/virtio_pci_common_write();
it was set to vector 0.
> >
> > 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
> >
> > 3.vhost_net_start() was called (at this time the configure vector is
> > still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> > assgin= true, so the irqfd for vector 0 was not "init" during this process
>
> How does the configure vector differ from the virtqueue vector here?
>
All the vectors are VIRTIO_NO_VECTOR (including the vq vectors). Any call to
msix_fire_vector_notifier() at this point will cause the crash. So I think
this is a bug in this guest image.
> >
> > 4. The system continues to boot and msix_fire_vector_notifier() was
> > called unmask the vector 0 and then met the crash
> > [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> > [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
> >
> > The reason for not reproducing in RHEL/fedora guest image is because
> > REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then 
> > virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 
> > during system boot.
> >
> > The reason for not reproducing before configure interrupt support is because
> > vector 0 is for configure interrupt,  before the support for configure 
> > interrupts, the notifier process will not handle vector 0.
> >
> > For the device Vyatta using, it doesn't support configure interrupts at 
> > all, So we plan to disable the configure interrupts in unsupported device
>
> Btw, let's tweak the changelog, it's a little bit hard to understand.
>
sure will do
thanks
Cindy
> Thanks
>
> >
> > Signed-off-by: Cindy Lu 
> >
> > Cindy Lu (2):
> >   virtio-net: disable the configure interrupt for not support device
> >   virtio-pci: check if the configure interrupt enable
> >
> >  hw/net/virtio-net.c|  5 -
> >  hw/virtio/virtio-pci.c | 41 +-
> >  hw/virtio/virtio.c |  1 +
> >  include/hw/virtio/virtio.h |  1 +
> >  4 files changed, 29 insertions(+), 19 deletions(-)
> >
> > --
> > 2.43.0
> >
>




Re: [RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-26 Thread Jason Wang
Hi Cindy:

On Wed, Mar 27, 2024 at 9:29 AM Cindy Lu  wrote:
>
> we need a crash in Non-standard image, here is the jira for this 
> https://issues.redhat.com/browse/RHEL-28522
> The root cause of the issue is that an IRQFD was used without initialization..
>
> During the booting process of the Vyatta image, the behavior of the called 
> function in qemu is as follows:
>
> 1. vhost_net_stop() was called, this will call the function
> virtio_pci_set_guest_notifiers() with assgin= false, and
> virtio_pci_set_guest_notifiers() will release the irqfd for vector 0

Before vhost_net_stop(), do we know which vector is used by which queue?

>
> 2. virtio_reset() was called -->set configure vector to VIRTIO_NO_VECTORt
>
> 3.vhost_net_start() was called (at this time the configure vector is
> still VIRTIO_NO_VECTOR) and call virtio_pci_set_guest_notifiers() with
> assgin= true, so the irqfd for vector 0 was not "init" during this process

How does the configure vector differ from the virtqueue vector here?

>
> 4. The system continues to boot and msix_fire_vector_notifier() was
> called unmask the vector 0 and then met the crash
> [msix_fire_vector_notifier] 112 called vector 0 is_masked 1
> [msix_fire_vector_notifier] 112 called vector 0 is_masked 0
>
> The reason for not reproducing in RHEL/fedora guest image is because
> REHL/Fedora doesn't have the behavior of calling vhost_net_stop and then 
> virtio_reset, and also won't call msix_fire_vector_notifier for vector 0 
> during system boot.
>
> The reason for not reproducing before configure interrupt support is because
> vector 0 is for configure interrupt,  before the support for configure 
> interrupts, the notifier process will not handle vector 0.
>
> For the device Vyatta using, it doesn't support configure interrupts at all, 
> So we plan to disable the configure interrupts in unsupported device

Btw, let's tweak the changelog, it's a little bit hard to understand.

Thanks

>
> Signed-off-by: Cindy Lu 
>
> Cindy Lu (2):
>   virtio-net: disable the configure interrupt for not support device
>   virtio-pci: check if the configure interrupt enable
>
>  hw/net/virtio-net.c|  5 -
>  hw/virtio/virtio-pci.c | 41 +-
>  hw/virtio/virtio.c |  1 +
>  include/hw/virtio/virtio.h |  1 +
>  4 files changed, 29 insertions(+), 19 deletions(-)
>
> --
> 2.43.0
>




[RFC 0/2] disable the configuration interrupt for the unsupported device

2024-03-26 Thread Cindy Lu
We hit a crash with a non-standard guest image; here is the Jira for it:
https://issues.redhat.com/browse/RHEL-28522
The root cause of the issue is that an irqfd was used without initialization.

While the Vyatta image boots, the relevant functions in QEMU are called as
follows:

1. vhost_net_stop() is called. This calls virtio_pci_set_guest_notifiers()
with assign = false, and virtio_pci_set_guest_notifiers() releases the irqfd
for vector 0.

2. virtio_reset() is called, which sets the configure vector to
VIRTIO_NO_VECTOR.

3. vhost_net_start() is called (at this time the configure vector is
still VIRTIO_NO_VECTOR) and calls virtio_pci_set_guest_notifiers() with
assign = true, so the irqfd for vector 0 is not initialized during this
process.

4. The system continues to boot, msix_fire_vector_notifier() is called to
unmask vector 0, and QEMU crashes:
[msix_fire_vector_notifier] 112 called vector 0 is_masked 1
[msix_fire_vector_notifier] 112 called vector 0 is_masked 0
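
The four steps above can be modeled with a small stand-alone sketch. It is a
toy state machine, not QEMU code: struct model_dev, the model_*() helpers and
MODEL_NR_VECTORS are hypothetical names, and the layout (vector 0 for the
configure interrupt, vectors 1 and 2 for the virtqueues) is an assumption
used only for illustration. VIRTIO_NO_VECTOR is 0xffff as in QEMU's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NO_VECTOR 0xffff   /* value used by QEMU's virtio headers */
#define MODEL_NR_VECTORS 3        /* assumed: vector 0 config, 1 and 2 virtqueues */

/* Hypothetical model of the device state relevant to the crash. */
struct model_dev {
    uint16_t config_vector;
    bool irqfd_ready[MODEL_NR_VECTORS];
};

/* Step 1: stopping vhost releases the irqfd bound to the configure vector. */
void model_stop(struct model_dev *d)
{
    if (d->config_vector < MODEL_NR_VECTORS) {
        d->irqfd_ready[d->config_vector] = false;
    }
}

/* Step 2: a device reset forgets the vector assignment. */
void model_reset(struct model_dev *d)
{
    d->config_vector = VIRTIO_NO_VECTOR;
}

/* Step 3: starting vhost only sets up an irqfd for an in-range vector. */
void model_start(struct model_dev *d)
{
    if (d->config_vector < MODEL_NR_VECTORS) {
        d->irqfd_ready[d->config_vector] = true;
    }
}

/* Step 4: true when an unmask would use an irqfd that was never set up. */
bool model_unmask_hits_stale_irqfd(const struct model_dev *d, unsigned int vector)
{
    assert(d);
    return vector < MODEL_NR_VECTORS && !d->irqfd_ready[vector];
}
```

Starting from config_vector = 0 with all irqfds ready, calling model_stop(),
model_reset() and model_start() in that order leaves irqfd_ready[0] false, so
a later unmask of vector 0 reports the stale state.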

This does not reproduce with a RHEL/Fedora guest image because RHEL/Fedora
does not trigger vhost_net_stop() followed by virtio_reset(), and also does
not trigger msix_fire_vector_notifier() for vector 0 during system boot.

This did not reproduce before configure interrupt support was added because
vector 0 is used for the configure interrupt; before that support, the
notifier process did not handle vector 0.

The device that Vyatta uses does not support configure interrupts at all, so
we plan to disable the configure interrupt for devices that do not support it.

Signed-off-by: Cindy Lu 

Cindy Lu (2):
  virtio-net: disable the configure interrupt for not support device
  virtio-pci: check if the configure interrupt enable

 hw/net/virtio-net.c|  5 -
 hw/virtio/virtio-pci.c | 41 +-
 hw/virtio/virtio.c |  1 +
 include/hw/virtio/virtio.h |  1 +
 4 files changed, 29 insertions(+), 19 deletions(-)

-- 
2.43.0