Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-25 Thread Zhu Yanjun



在 2024/1/26 11:11, Zhu Yanjun 写道:



在 2024/1/22 15:02, Xuan Zhuo 写道:

On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  wrote:

On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  wrote:

On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  wrote:

On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  wrote:

On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  wrote:

On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  wrote:

On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  wrote:

在 2024/1/20 1:29, Andrew Lunn 写道:

while (!virtqueue_get_buf(vi->cvq, &tmp) &&
-   !virtqueue_is_broken(vi->cvq))
+   !virtqueue_is_broken(vi->cvq)) {
+if (timeout)
+timeout--;

This is not really a timeout, just a loop counter. 200 iterations could
be a very short time on reasonable H/W. I guess this avoid the soft
lockup, but possibly (likely?) breaks the functionality when we need to
loop for some non negligible time.

I fear we need a more complex solution, as mentioned by Micheal in the
thread you quoted.

Got it. I also look forward to the more complex solution to this problem.

Can we add a device capability (new feature bit) such as ctrq_wait_timeout
to get a reasonable timeout?

The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().

I read carefully the functions read_poll_timeout() and
read_poll_timeout_atomic(). The timeout is set by the caller of the 2
functions.

FYI, in order to avoid a swtich of atomic or not, we need convert rx
mode setting to workqueue first:

https://www.mail-archive.com/[email protected]/msg60298.html


As such, can we add a module parameter to customize this timeout value
by the user?

Who is the "user" here, or how can the "user" know the value?


Or this timeout value is stored in device register, virtio_net driver
will read this timeout value at initialization?

See another thread. The design needs to be general, or you can post a RFC.

In another thought, we've already had a tx watchdog, maybe we can have
something similar to cvq and use timeout + reset in that case.

But we may block by the reset ^_^ if the device is broken?

I mean vq reset here.

I see.

I mean when the deivce is broken, the vq reset also many be blocked.

 void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, 
u16 index)
 {
 struct virtio_pci_modern_common_cfg __iomem *cfg;

 cfg = (struct virtio_pci_modern_common_cfg __iomem 
*)mdev->common;

 vp_iowrite16(index, &cfg->cfg.queue_select);
 vp_iowrite16(1, &cfg->queue_reset);

 while (vp_ioread16(&cfg->queue_reset))
 msleep(1);

 while (vp_ioread16(&cfg->cfg.queue_enable))
 msleep(1);
 }
 EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);

In this function, for the broken device, we can not expect something.

Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.

Thanks


It looks like we have multiple goals here

1) avoid lockups, using workqueue + cond_resched() seems to be
sufficient, it has issue but nothing new
2) recover from the unresponsive device, the issue for timeout is that
it needs to deal with false positives

I agree.

But I want to add a new goal, cvq async. In the netdim, we will
send many requests via the cvq, so the cvq async will be nice.

Then you need an interrupt for cvq.

FYI, I've posted a series that use interrupt for cvq in the past:

https://lore.kernel.org/lkml/[email protected]/t/

I know this. But the interrupt maybe not a good solution without new space.


Haven't found time in working on this anymore, maybe we can start from
this or not.

I said async, but my aim is to put many requests to the cvq before getting the
response.

Heng Qi posted 
thishttps://lore.kernel.org/all/[email protected]/


Sorry. This mail is rejected by netdev maillist. So I have to resend it.


Thanks a lot. I read Heng Qi's commits carefully. This patch series are 
similiar with the NIC feature xmit_more.


But if cvq command is urgent, can we let this urgent cvq command be 
passed ASAP?


I mean, can we set a flag similar to xmit_more? if cvq command is not 
urgent, it can be queued. If it is urgent,


this cvq command is passed ASAP.

Zhu Yanjun


Zhu Yanjun


Thanks.



Thanks


Thanks.



Thanks


Thanks.



Thans


Zhu Yanjun


   Andrew




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-22 Thread Xuan Zhuo
On Mon, 22 Jan 2024 16:32:46 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 4:04 PM Xuan Zhuo  wrote:
> >
> > On Mon, 22 Jan 2024 15:57:08 +0800, Jason Wang  wrote:
> > > On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang  
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo 
> > > > >  wrote:
> > > > > >
> > > > > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang 
> > > > > >  wrote:
> > > > > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang 
> > > > > > > > >  wrote:
> > > > > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun 
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) 
> > > > > > > > > > > > > > &&
> > > > > > > > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > > > > +if (timeout)
> > > > > > > > > > > > > > +timeout--;
> > > > > > > > > > > > >  This is not really a timeout, just a loop 
> > > > > > > > > > > > >  counter. 200 iterations could
> > > > > > > > > > > > >  be a very short time on reasonable H/W. I guess 
> > > > > > > > > > > > >  this avoid the soft
> > > > > > > > > > > > >  lockup, but possibly (likely?) breaks the 
> > > > > > > > > > > > >  functionality when we need to
> > > > > > > > > > > > >  loop for some non negligible time.
> > > > > > > > > > > > > 
> > > > > > > > > > > > >  I fear we need a more complex solution, as 
> > > > > > > > > > > > >  mentioned by Micheal in the
> > > > > > > > > > > > >  thread you quoted.
> > > > > > > > > > > > > >>> Got it. I also look forward to the more complex 
> > > > > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > > > > >> Can we add a device capability (new feature bit) 
> > > > > > > > > > > > > >> such as ctrq_wait_timeout
> > > > > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > > > > The usual solution to this is 
> > > > > > > > > > > > > > include/linux/iopoll.h. If you can sleep
> > > > > > > > > > > > > > read_poll_timeout() otherwise 
> > > > > > > > > > > > > > read_poll_timeout_atomic().
> > > > > > > > > > > > >
> > > > > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the 
> > > > > > > > > > > > > caller of the 2
> > > > > > > > > > > > > functions.
> > > > > > > > > > > >
> > > > > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we 
> > > > > > > > > > > > need convert rx
> > > > > > > > > > > > mode setting to workqueue first:
> > > > > > > > > > > >
> > > > > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > As such, can we add a module parameter to customize 
> > > > > > > > > > > > > this timeout value
> > > > > > > > > > > > > by the user?
> > > > > > > > > > > >
> > > > > > > > > > > > Who is the "user" here, or how can the "user" know the 
> > > > > > > > > > > > value?
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Or this timeout value is stored in device register, 
> > > > > > > > > > > > > virtio_net driver
> > > > > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > > > > >
> > > > > > > > > > > > See another thread. The design needs to be general, or 
> > > > > > > > > > > > you can post a RFC.
> > > > > > > > > > > >
> > > > > > > > > > > > In another thought, we've already had a tx watchdog, 
> > > > > > > > > > > > maybe we can have
> > > > > > > > > > > > something similar to cvq and use timeout + reset in 
> > > > > > > > > > > > that case.
> > > > > > > > > > >
> > > > > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > > > > >
> > > > > > > > > > I mean vq reset here.
> > > > > > > > >
> > > > > > > > > I see.
> > > > > > > > >
> > > > > > > > > I mean when the deivce is broken, the vq reset also many be 
> > > > > > > > > blocked.
> > > > > > > > >
> > > > > > > > > void vp_modern_set_queue_reset(struct 
> > > > > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > > > > {
> > > > > > > > > struct virtio_pci_modern_common_cfg __iom

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-22 Thread Jason Wang
On Mon, Jan 22, 2024 at 4:04 PM Xuan Zhuo  wrote:
>
> On Mon, 22 Jan 2024 15:57:08 +0800, Jason Wang  wrote:
> > On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang  
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo  
> > > > wrote:
> > > > >
> > > > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  
> > > > > wrote:
> > > > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang 
> > > > > > > >  wrote:
> > > > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang 
> > > > > > > > > >  wrote:
> > > > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > > > +if (timeout)
> > > > > > > > > > > > > +timeout--;
> > > > > > > > > > > >  This is not really a timeout, just a loop counter. 
> > > > > > > > > > > >  200 iterations could
> > > > > > > > > > > >  be a very short time on reasonable H/W. I guess 
> > > > > > > > > > > >  this avoid the soft
> > > > > > > > > > > >  lockup, but possibly (likely?) breaks the 
> > > > > > > > > > > >  functionality when we need to
> > > > > > > > > > > >  loop for some non negligible time.
> > > > > > > > > > > > 
> > > > > > > > > > > >  I fear we need a more complex solution, as 
> > > > > > > > > > > >  mentioned by Micheal in the
> > > > > > > > > > > >  thread you quoted.
> > > > > > > > > > > > >>> Got it. I also look forward to the more complex 
> > > > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > > > >> Can we add a device capability (new feature bit) 
> > > > > > > > > > > > >> such as ctrq_wait_timeout
> > > > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > > > The usual solution to this is include/linux/iopoll.h. 
> > > > > > > > > > > > > If you can sleep
> > > > > > > > > > > > > read_poll_timeout() otherwise 
> > > > > > > > > > > > > read_poll_timeout_atomic().
> > > > > > > > > > > >
> > > > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the 
> > > > > > > > > > > > caller of the 2
> > > > > > > > > > > > functions.
> > > > > > > > > > >
> > > > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need 
> > > > > > > > > > > convert rx
> > > > > > > > > > > mode setting to workqueue first:
> > > > > > > > > > >
> > > > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > As such, can we add a module parameter to customize 
> > > > > > > > > > > > this timeout value
> > > > > > > > > > > > by the user?
> > > > > > > > > > >
> > > > > > > > > > > Who is the "user" here, or how can the "user" know the 
> > > > > > > > > > > value?
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Or this timeout value is stored in device register, 
> > > > > > > > > > > > virtio_net driver
> > > > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > > > >
> > > > > > > > > > > See another thread. The design needs to be general, or 
> > > > > > > > > > > you can post a RFC.
> > > > > > > > > > >
> > > > > > > > > > > In another thought, we've already had a tx watchdog, 
> > > > > > > > > > > maybe we can have
> > > > > > > > > > > something similar to cvq and use timeout + reset in that 
> > > > > > > > > > > case.
> > > > > > > > > >
> > > > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > > > >
> > > > > > > > > I mean vq reset here.
> > > > > > > >
> > > > > > > > I see.
> > > > > > > >
> > > > > > > > I mean when the deivce is broken, the vq reset also many be 
> > > > > > > > blocked.
> > > > > > > >
> > > > > > > > void vp_modern_set_queue_reset(struct 
> > > > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > > > {
> > > > > > > > struct virtio_pci_modern_common_cfg __iomem 
> > > > > > > > *cfg;
> > > > > > > >
> > > > > > > > cfg = (struct virtio_pci_modern_common_cfg 
> > > > > > > > __iomem *)mdev->common;
> > > > > > > >
> > > > > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > > > > vp_iowrit

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-22 Thread Xuan Zhuo
On Mon, 22 Jan 2024 15:57:08 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo  wrote:
> >
> > On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang  wrote:
> > > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang 
> > > > > > >  wrote:
> > > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang 
> > > > > > > > >  wrote:
> > > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > > +if (timeout)
> > > > > > > > > > > > +timeout--;
> > > > > > > > > > >  This is not really a timeout, just a loop counter. 
> > > > > > > > > > >  200 iterations could
> > > > > > > > > > >  be a very short time on reasonable H/W. I guess this 
> > > > > > > > > > >  avoid the soft
> > > > > > > > > > >  lockup, but possibly (likely?) breaks the 
> > > > > > > > > > >  functionality when we need to
> > > > > > > > > > >  loop for some non negligible time.
> > > > > > > > > > > 
> > > > > > > > > > >  I fear we need a more complex solution, as mentioned 
> > > > > > > > > > >  by Micheal in the
> > > > > > > > > > >  thread you quoted.
> > > > > > > > > > > >>> Got it. I also look forward to the more complex 
> > > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > > >> Can we add a device capability (new feature bit) such 
> > > > > > > > > > > >> as ctrq_wait_timeout
> > > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > > The usual solution to this is include/linux/iopoll.h. 
> > > > > > > > > > > > If you can sleep
> > > > > > > > > > > > read_poll_timeout() otherwise 
> > > > > > > > > > > > read_poll_timeout_atomic().
> > > > > > > > > > >
> > > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the 
> > > > > > > > > > > caller of the 2
> > > > > > > > > > > functions.
> > > > > > > > > >
> > > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need 
> > > > > > > > > > convert rx
> > > > > > > > > > mode setting to workqueue first:
> > > > > > > > > >
> > > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > As such, can we add a module parameter to customize this 
> > > > > > > > > > > timeout value
> > > > > > > > > > > by the user?
> > > > > > > > > >
> > > > > > > > > > Who is the "user" here, or how can the "user" know the 
> > > > > > > > > > value?
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Or this timeout value is stored in device register, 
> > > > > > > > > > > virtio_net driver
> > > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > > >
> > > > > > > > > > See another thread. The design needs to be general, or you 
> > > > > > > > > > can post a RFC.
> > > > > > > > > >
> > > > > > > > > > In another thought, we've already had a tx watchdog, maybe 
> > > > > > > > > > we can have
> > > > > > > > > > something similar to cvq and use timeout + reset in that 
> > > > > > > > > > case.
> > > > > > > > >
> > > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > > >
> > > > > > > > I mean vq reset here.
> > > > > > >
> > > > > > > I see.
> > > > > > >
> > > > > > > I mean when the deivce is broken, the vq reset also many be 
> > > > > > > blocked.
> > > > > > >
> > > > > > > void vp_modern_set_queue_reset(struct 
> > > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > > {
> > > > > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > > > > >
> > > > > > > cfg = (struct virtio_pci_modern_common_cfg 
> > > > > > > __iomem *)mdev->common;
> > > > > > >
> > > > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > > > vp_iowrite16(1, &cfg->queue_reset);
> > > > > > >
> > > > > > > while (vp_ioread16(&cfg->queue_reset))
> > > > > > > msleep(1);
> > > > > > >
> > > > > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > > > > msle

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo  wrote:
>
> On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang  wrote:
> > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  wrote:
> > > > >
> > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo 
> > > > >  wrote:
> > > > > >
> > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang 
> > > > > >  wrote:
> > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang 
> > > > > > > >  wrote:
> > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > +if (timeout)
> > > > > > > > > > > +timeout--;
> > > > > > > > > >  This is not really a timeout, just a loop counter. 200 
> > > > > > > > > >  iterations could
> > > > > > > > > >  be a very short time on reasonable H/W. I guess this 
> > > > > > > > > >  avoid the soft
> > > > > > > > > >  lockup, but possibly (likely?) breaks the 
> > > > > > > > > >  functionality when we need to
> > > > > > > > > >  loop for some non negligible time.
> > > > > > > > > > 
> > > > > > > > > >  I fear we need a more complex solution, as mentioned 
> > > > > > > > > >  by Micheal in the
> > > > > > > > > >  thread you quoted.
> > > > > > > > > > >>> Got it. I also look forward to the more complex 
> > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > > > > > > >> ctrq_wait_timeout
> > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > The usual solution to this is include/linux/iopoll.h. If 
> > > > > > > > > > > you can sleep
> > > > > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > > > > >
> > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the 
> > > > > > > > > > caller of the 2
> > > > > > > > > > functions.
> > > > > > > > >
> > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need 
> > > > > > > > > convert rx
> > > > > > > > > mode setting to workqueue first:
> > > > > > > > >
> > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > As such, can we add a module parameter to customize this 
> > > > > > > > > > timeout value
> > > > > > > > > > by the user?
> > > > > > > > >
> > > > > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Or this timeout value is stored in device register, 
> > > > > > > > > > virtio_net driver
> > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > >
> > > > > > > > > See another thread. The design needs to be general, or you 
> > > > > > > > > can post a RFC.
> > > > > > > > >
> > > > > > > > > In another thought, we've already had a tx watchdog, maybe we 
> > > > > > > > > can have
> > > > > > > > > something similar to cvq and use timeout + reset in that case.
> > > > > > > >
> > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > >
> > > > > > > I mean vq reset here.
> > > > > >
> > > > > > I see.
> > > > > >
> > > > > > I mean when the deivce is broken, the vq reset also many be blocked.
> > > > > >
> > > > > > void vp_modern_set_queue_reset(struct 
> > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > {
> > > > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > > > >
> > > > > > cfg = (struct virtio_pci_modern_common_cfg __iomem 
> > > > > > *)mdev->common;
> > > > > >
> > > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > > vp_iowrite16(1, &cfg->queue_reset);
> > > > > >
> > > > > > while (vp_ioread16(&cfg->queue_reset))
> > > > > > msleep(1);
> > > > > >
> > > > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > > > msleep(1);
> > > > > > }
> > > > > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > > > > >
> > > > > > In this function, for the broken device, we can not expect 
> > > > > > something.
> > > > >
> > > > > Yes, it's best effort, there's no guarantee then. But it doesn't harm 
> > > > > to try.
> > > > >
> > > > > Thanks
> > > > >
> > > > 

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Xuan Zhuo
On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo  wrote:
> >
> > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  wrote:
> > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  wrote:
> > > >
> > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  
> > > > wrote:
> > > > >
> > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  
> > > > > wrote:
> > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang 
> > > > > > >  wrote:
> > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > +if (timeout)
> > > > > > > > > > +timeout--;
> > > > > > > > >  This is not really a timeout, just a loop counter. 200 
> > > > > > > > >  iterations could
> > > > > > > > >  be a very short time on reasonable H/W. I guess this 
> > > > > > > > >  avoid the soft
> > > > > > > > >  lockup, but possibly (likely?) breaks the functionality 
> > > > > > > > >  when we need to
> > > > > > > > >  loop for some non negligible time.
> > > > > > > > > 
> > > > > > > > >  I fear we need a more complex solution, as mentioned by 
> > > > > > > > >  Micheal in the
> > > > > > > > >  thread you quoted.
> > > > > > > > > >>> Got it. I also look forward to the more complex solution 
> > > > > > > > > >>> to this problem.
> > > > > > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > > > > > >> ctrq_wait_timeout
> > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > The usual solution to this is include/linux/iopoll.h. If 
> > > > > > > > > > you can sleep
> > > > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > > > >
> > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the caller 
> > > > > > > > > of the 2
> > > > > > > > > functions.
> > > > > > > >
> > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need 
> > > > > > > > convert rx
> > > > > > > > mode setting to workqueue first:
> > > > > > > >
> > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > >
> > > > > > > > >
> > > > > > > > > As such, can we add a module parameter to customize this 
> > > > > > > > > timeout value
> > > > > > > > > by the user?
> > > > > > > >
> > > > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Or this timeout value is stored in device register, 
> > > > > > > > > virtio_net driver
> > > > > > > > > will read this timeout value at initialization?
> > > > > > > >
> > > > > > > > See another thread. The design needs to be general, or you can 
> > > > > > > > post a RFC.
> > > > > > > >
> > > > > > > > In another thought, we've already had a tx watchdog, maybe we 
> > > > > > > > can have
> > > > > > > > something similar to cvq and use timeout + reset in that case.
> > > > > > >
> > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > >
> > > > > > I mean vq reset here.
> > > > >
> > > > > I see.
> > > > >
> > > > > I mean when the deivce is broken, the vq reset also many be blocked.
> > > > >
> > > > > void vp_modern_set_queue_reset(struct 
> > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > {
> > > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > > >
> > > > > cfg = (struct virtio_pci_modern_common_cfg __iomem 
> > > > > *)mdev->common;
> > > > >
> > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > vp_iowrite16(1, &cfg->queue_reset);
> > > > >
> > > > > while (vp_ioread16(&cfg->queue_reset))
> > > > > msleep(1);
> > > > >
> > > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > > msleep(1);
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > > > >
> > > > > In this function, for the broken device, we can not expect something.
> > > >
> > > > Yes, it's best effort, there's no guarantee then. But it doesn't harm 
> > > > to try.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > > It looks like we have multiple goals here
> > > > > >
> > > > > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > > > > sufficient, it has issue but nothing new
> > > > > > 2) recover from the unresponsive device, the issue for timeout is 

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo  wrote:
>
> On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  wrote:
> > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  wrote:
> > >
> > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo 
> > > > >  wrote:
> > > > > >
> > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang 
> > > > > >  wrote:
> > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > +if (timeout)
> > > > > > > > > +timeout--;
> > > > > > > >  This is not really a timeout, just a loop counter. 200 
> > > > > > > >  iterations could
> > > > > > > >  be a very short time on reasonable H/W. I guess this avoid 
> > > > > > > >  the soft
> > > > > > > >  lockup, but possibly (likely?) breaks the functionality 
> > > > > > > >  when we need to
> > > > > > > >  loop for some non negligible time.
> > > > > > > > 
> > > > > > > >  I fear we need a more complex solution, as mentioned by 
> > > > > > > >  Micheal in the
> > > > > > > >  thread you quoted.
> > > > > > > > >>> Got it. I also look forward to the more complex solution to 
> > > > > > > > >>> this problem.
> > > > > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > > > > >> ctrq_wait_timeout
> > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > The usual solution to this is include/linux/iopoll.h. If you 
> > > > > > > > > can sleep
> > > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > > >
> > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of 
> > > > > > > > the 2
> > > > > > > > functions.
> > > > > > >
> > > > > > > FYI, in order to avoid a swtich of atomic or not, we need convert 
> > > > > > > rx
> > > > > > > mode setting to workqueue first:
> > > > > > >
> > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > >
> > > > > > > >
> > > > > > > > As such, can we add a module parameter to customize this 
> > > > > > > > timeout value
> > > > > > > > by the user?
> > > > > > >
> > > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > > >
> > > > > > > >
> > > > > > > > Or this timeout value is stored in device register, virtio_net 
> > > > > > > > driver
> > > > > > > > will read this timeout value at initialization?
> > > > > > >
> > > > > > > See another thread. The design needs to be general, or you can 
> > > > > > > post a RFC.
> > > > > > >
> > > > > > > In another thought, we've already had a tx watchdog, maybe we can 
> > > > > > > have
> > > > > > > something similar to cvq and use timeout + reset in that case.
> > > > > >
> > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > >
> > > > > I mean vq reset here.
> > > >
> > > > I see.
> > > >
> > > > I mean when the deivce is broken, the vq reset also many be blocked.
> > > >
> > > > void vp_modern_set_queue_reset(struct virtio_pci_modern_device 
> > > > *mdev, u16 index)
> > > > {
> > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > >
> > > > cfg = (struct virtio_pci_modern_common_cfg __iomem 
> > > > *)mdev->common;
> > > >
> > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > vp_iowrite16(1, &cfg->queue_reset);
> > > >
> > > > while (vp_ioread16(&cfg->queue_reset))
> > > > msleep(1);
> > > >
> > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > msleep(1);
> > > > }
> > > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > > >
> > > > In this function, for the broken device, we can not expect something.
> > >
> > > Yes, it's best effort, there's no guarantee then. But it doesn't harm to 
> > > try.
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > > >
> > > > > It looks like we have multiple goals here
> > > > >
> > > > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > > > sufficient, it has issue but nothing new
> > > > > 2) recover from the unresponsive device, the issue for timeout is that
> > > > > it needs to deal with false positives
> > > >
> > > >
> > > > I agree.
> > > >
> > > > But I want to add a new goal, cvq async. In the netdim, we will
> > > > send many requests via the cvq, so the cvq async will be nice.
> >
> > Then you need an interrupt for cvq.
> >
> > F

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Xuan Zhuo
On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  wrote:
> >
> > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  
> > > > wrote:
> > > > >
> > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  
> > > > > wrote:
> > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  
> > > > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > +if (timeout)
> > > > > > > > +timeout--;
> > > > > > >  This is not really a timeout, just a loop counter. 200 
> > > > > > >  iterations could
> > > > > > >  be a very short time on reasonable H/W. I guess this avoid 
> > > > > > >  the soft
> > > > > > >  lockup, but possibly (likely?) breaks the functionality when 
> > > > > > >  we need to
> > > > > > >  loop for some non negligible time.
> > > > > > > 
> > > > > > >  I fear we need a more complex solution, as mentioned by 
> > > > > > >  Micheal in the
> > > > > > >  thread you quoted.
> > > > > > > >>> Got it. I also look forward to the more complex solution to 
> > > > > > > >>> this problem.
> > > > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > > > >> ctrq_wait_timeout
> > > > > > > >> to get a reasonable timeout?
> > > > > > > > The usual solution to this is include/linux/iopoll.h. If you 
> > > > > > > > can sleep
> > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > >
> > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of 
> > > > > > > the 2
> > > > > > > functions.
> > > > > >
> > > > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > > > mode setting to workqueue first:
> > > > > >
> > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > >
> > > > > > >
> > > > > > > As such, can we add a module parameter to customize this timeout 
> > > > > > > value
> > > > > > > by the user?
> > > > > >
> > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > >
> > > > > > >
> > > > > > > Or this timeout value is stored in device register, virtio_net 
> > > > > > > driver
> > > > > > > will read this timeout value at initialization?
> > > > > >
> > > > > > See another thread. The design needs to be general, or you can post 
> > > > > > a RFC.
> > > > > >
> > > > > > In another thought, we've already had a tx watchdog, maybe we can 
> > > > > > have
> > > > > > something similar to cvq and use timeout + reset in that case.
> > > > >
> > > > > But we may block by the reset ^_^ if the device is broken?
> > > >
> > > > I mean vq reset here.
> > >
> > > I see.
> > >
> > > I mean when the deivce is broken, the vq reset also many be blocked.
> > >
> > > void vp_modern_set_queue_reset(struct virtio_pci_modern_device 
> > > *mdev, u16 index)
> > > {
> > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > >
> > > cfg = (struct virtio_pci_modern_common_cfg __iomem 
> > > *)mdev->common;
> > >
> > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > vp_iowrite16(1, &cfg->queue_reset);
> > >
> > > while (vp_ioread16(&cfg->queue_reset))
> > > msleep(1);
> > >
> > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > msleep(1);
> > > }
> > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > >
> > > In this function, for the broken device, we can not expect something.
> >
> > Yes, it's best effort, there's no guarantee then. But it doesn't harm to 
> > try.
> >
> > Thanks
> >
> > >
> > >
> > > >
> > > > It looks like we have multiple goals here
> > > >
> > > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > > sufficient, it has issue but nothing new
> > > > 2) recover from the unresponsive device, the issue for timeout is that
> > > > it needs to deal with false positives
> > >
> > >
> > > I agree.
> > >
> > > But I want to add a new goal, cvq async. In the netdim, we will
> > > send many requests via the cvq, so the cvq async will be nice.
>
> Then you need an interrupt for cvq.
>
> FYI, I've posted a series that use interrupt for cvq in the past:
>
> https://lore.kernel.org/lkml/[email protected]/t/

I know this. But the interrupt maybe not a good solution without new space.

>
> Haven't found time in working on this anymore, maybe we can start from
> this or not.

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Xuan Zhuo
On Mon, 22 Jan 2024 14:55:46 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  wrote:
> >
> > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  wrote:
> > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  
> > > > > wrote:
> > > > > >
> > > > > >
> > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > +if (timeout)
> > > > > > > +timeout--;
> > > > > >  This is not really a timeout, just a loop counter. 200 
> > > > > >  iterations could
> > > > > >  be a very short time on reasonable H/W. I guess this avoid the 
> > > > > >  soft
> > > > > >  lockup, but possibly (likely?) breaks the functionality when 
> > > > > >  we need to
> > > > > >  loop for some non negligible time.
> > > > > > 
> > > > > >  I fear we need a more complex solution, as mentioned by 
> > > > > >  Micheal in the
> > > > > >  thread you quoted.
> > > > > > >>> Got it. I also look forward to the more complex solution to 
> > > > > > >>> this problem.
> > > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > > >> ctrq_wait_timeout
> > > > > > >> to get a reasonable timeout?
> > > > > > > The usual solution to this is include/linux/iopoll.h. If you can 
> > > > > > > sleep
> > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > >
> > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of the 
> > > > > > 2
> > > > > > functions.
> > > > >
> > > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > > mode setting to workqueue first:
> > > > >
> > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > >
> > > > > >
> > > > > > As such, can we add a module parameter to customize this timeout 
> > > > > > value
> > > > > > by the user?
> > > > >
> > > > > Who is the "user" here, or how can the "user" know the value?
> > > > >
> > > > > >
> > > > > > Or this timeout value is stored in device register, virtio_net 
> > > > > > driver
> > > > > > will read this timeout value at initialization?
> > > > >
> > > > > See another thread. The design needs to be general, or you can post a 
> > > > > RFC.
> > > > >
> > > > > In another thought, we've already had a tx watchdog, maybe we can have
> > > > > something similar to cvq and use timeout + reset in that case.
> > > >
> > > > But we may block by the reset ^_^ if the device is broken?
> > >
> > > I mean vq reset here.
> >
> > I see.
> >
> > I mean when the deivce is broken, the vq reset also many be blocked.
> >
> > void vp_modern_set_queue_reset(struct virtio_pci_modern_device 
> > *mdev, u16 index)
> > {
> > struct virtio_pci_modern_common_cfg __iomem *cfg;
> >
> > cfg = (struct virtio_pci_modern_common_cfg __iomem 
> > *)mdev->common;
> >
> > vp_iowrite16(index, &cfg->cfg.queue_select);
> > vp_iowrite16(1, &cfg->queue_reset);
> >
> > while (vp_ioread16(&cfg->queue_reset))
> > msleep(1);
> >
> > while (vp_ioread16(&cfg->cfg.queue_enable))
> > msleep(1);
> > }
> > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> >
> > In this function, for the broken device, we can not expect something.
>
> Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
>

YES. I agree.

Thanks.


> Thanks
>
> >
> >
> > >
> > > It looks like we have multiple goals here
> > >
> > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > sufficient, it has issue but nothing new
> > > 2) recover from the unresponsive device, the issue for timeout is that
> > > it needs to deal with false positives
> >
> >
> > I agree.
> >
> > But I want to add a new goal, cvq async. In the netdim, we will
> > send many requests via the cvq, so the cvq async will be nice.
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > Thans
> > > > >
> > > > > >
> > > > > > Zhu Yanjun
> > > > > >
> > > > > > >
> > > > > > >   Andrew
> > > > > >
> > > > >
> > > >
> > >
> >
>



Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 22, 2024 at 2:55 PM Jason Wang  wrote:
>
> On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  wrote:
> >
> > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  wrote:
> > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  
> > > > > wrote:
> > > > > >
> > > > > >
> > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > > +if (timeout)
> > > > > > > +timeout--;
> > > > > >  This is not really a timeout, just a loop counter. 200 
> > > > > >  iterations could
> > > > > >  be a very short time on reasonable H/W. I guess this avoid the 
> > > > > >  soft
> > > > > >  lockup, but possibly (likely?) breaks the functionality when 
> > > > > >  we need to
> > > > > >  loop for some non negligible time.
> > > > > > 
> > > > > >  I fear we need a more complex solution, as mentioned by 
> > > > > >  Micheal in the
> > > > > >  thread you quoted.
> > > > > > >>> Got it. I also look forward to the more complex solution to 
> > > > > > >>> this problem.
> > > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > > >> ctrq_wait_timeout
> > > > > > >> to get a reasonable timeout?
> > > > > > > The usual solution to this is include/linux/iopoll.h. If you can 
> > > > > > > sleep
> > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > >
> > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of the 
> > > > > > 2
> > > > > > functions.
> > > > >
> > > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > > mode setting to workqueue first:
> > > > >
> > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > >
> > > > > >
> > > > > > As such, can we add a module parameter to customize this timeout 
> > > > > > value
> > > > > > by the user?
> > > > >
> > > > > Who is the "user" here, or how can the "user" know the value?
> > > > >
> > > > > >
> > > > > > Or this timeout value is stored in device register, virtio_net 
> > > > > > driver
> > > > > > will read this timeout value at initialization?
> > > > >
> > > > > See another thread. The design needs to be general, or you can post a 
> > > > > RFC.
> > > > >
> > > > > In another thought, we've already had a tx watchdog, maybe we can have
> > > > > something similar to cvq and use timeout + reset in that case.
> > > >
> > > > But we may block by the reset ^_^ if the device is broken?
> > >
> > > I mean vq reset here.
> >
> > I see.
> >
> > I mean when the deivce is broken, the vq reset also many be blocked.
> >
> > void vp_modern_set_queue_reset(struct virtio_pci_modern_device 
> > *mdev, u16 index)
> > {
> > struct virtio_pci_modern_common_cfg __iomem *cfg;
> >
> > cfg = (struct virtio_pci_modern_common_cfg __iomem 
> > *)mdev->common;
> >
> > vp_iowrite16(index, &cfg->cfg.queue_select);
> > vp_iowrite16(1, &cfg->queue_reset);
> >
> > while (vp_ioread16(&cfg->queue_reset))
> > msleep(1);
> >
> > while (vp_ioread16(&cfg->cfg.queue_enable))
> > msleep(1);
> > }
> > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> >
> > In this function, for the broken device, we can not expect something.
>
> Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
>
> Thanks
>
> >
> >
> > >
> > > It looks like we have multiple goals here
> > >
> > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > sufficient, it has issue but nothing new
> > > 2) recover from the unresponsive device, the issue for timeout is that
> > > it needs to deal with false positives
> >
> >
> > I agree.
> >
> > But I want to add a new goal, cvq async. In the netdim, we will
> > send many requests via the cvq, so the cvq async will be nice.

Then you need an interrupt for cvq.

FYI, I've posted a series that use interrupt for cvq in the past:

https://lore.kernel.org/lkml/[email protected]/t/

Haven't found time in working on this anymore, maybe we can start from
this or not.

Thanks

> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > Thans
> > > > >
> > > > > >
> > > > > > Zhu Yanjun
> > > > > >
> > > > > > >
> > > > > > >   Andrew
> > > > > >
> > > > >
> > > >
> > >
> >




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo  wrote:
>
> On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  wrote:
> > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  
> > > > wrote:
> > > > >
> > > > >
> > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > > +if (timeout)
> > > > > > +timeout--;
> > > > >  This is not really a timeout, just a loop counter. 200 
> > > > >  iterations could
> > > > >  be a very short time on reasonable H/W. I guess this avoid the 
> > > > >  soft
> > > > >  lockup, but possibly (likely?) breaks the functionality when we 
> > > > >  need to
> > > > >  loop for some non negligible time.
> > > > > 
> > > > >  I fear we need a more complex solution, as mentioned by Micheal 
> > > > >  in the
> > > > >  thread you quoted.
> > > > > >>> Got it. I also look forward to the more complex solution to this 
> > > > > >>> problem.
> > > > > >> Can we add a device capability (new feature bit) such as 
> > > > > >> ctrq_wait_timeout
> > > > > >> to get a reasonable timeout?
> > > > > > The usual solution to this is include/linux/iopoll.h. If you can 
> > > > > > sleep
> > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > >
> > > > > I read carefully the functions read_poll_timeout() and
> > > > > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > > > > functions.
> > > >
> > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > mode setting to workqueue first:
> > > >
> > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > >
> > > > >
> > > > > As such, can we add a module parameter to customize this timeout value
> > > > > by the user?
> > > >
> > > > Who is the "user" here, or how can the "user" know the value?
> > > >
> > > > >
> > > > > Or this timeout value is stored in device register, virtio_net driver
> > > > > will read this timeout value at initialization?
> > > >
> > > > See another thread. The design needs to be general, or you can post a 
> > > > RFC.
> > > >
> > > > In another thought, we've already had a tx watchdog, maybe we can have
> > > > something similar to cvq and use timeout + reset in that case.
> > >
> > > But we may block by the reset ^_^ if the device is broken?
> >
> > I mean vq reset here.
>
> I see.
>
> I mean when the deivce is broken, the vq reset also many be blocked.
>
> void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, 
> u16 index)
> {
> struct virtio_pci_modern_common_cfg __iomem *cfg;
>
> cfg = (struct virtio_pci_modern_common_cfg __iomem 
> *)mdev->common;
>
> vp_iowrite16(index, &cfg->cfg.queue_select);
> vp_iowrite16(1, &cfg->queue_reset);
>
> while (vp_ioread16(&cfg->queue_reset))
> msleep(1);
>
> while (vp_ioread16(&cfg->cfg.queue_enable))
> msleep(1);
> }
> EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
>
> In this function, for the broken device, we can not expect something.

Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.

Thanks

>
>
> >
> > It looks like we have multiple goals here
> >
> > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > sufficient, it has issue but nothing new
> > 2) recover from the unresponsive device, the issue for timeout is that
> > it needs to deal with false positives
>
>
> I agree.
>
> But I want to add a new goal, cvq async. In the netdim, we will
> send many requests via the cvq, so the cvq async will be nice.
>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thans
> > > >
> > > > >
> > > > > Zhu Yanjun
> > > > >
> > > > > >
> > > > > >   Andrew
> > > > >
> > > >
> > >
> >
>




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Xuan Zhuo
On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  wrote:
> >
> > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  wrote:
> > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  wrote:
> > > >
> > > >
> > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > -   !virtqueue_is_broken(vi->cvq))
> > > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > > +if (timeout)
> > > > > +timeout--;
> > > >  This is not really a timeout, just a loop counter. 200 iterations 
> > > >  could
> > > >  be a very short time on reasonable H/W. I guess this avoid the soft
> > > >  lockup, but possibly (likely?) breaks the functionality when we 
> > > >  need to
> > > >  loop for some non negligible time.
> > > > 
> > > >  I fear we need a more complex solution, as mentioned by Micheal in 
> > > >  the
> > > >  thread you quoted.
> > > > >>> Got it. I also look forward to the more complex solution to this 
> > > > >>> problem.
> > > > >> Can we add a device capability (new feature bit) such as 
> > > > >> ctrq_wait_timeout
> > > > >> to get a reasonable timeout?
> > > > > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > >
> > > > I read carefully the functions read_poll_timeout() and
> > > > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > > > functions.
> > >
> > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > mode setting to workqueue first:
> > >
> > > https://www.mail-archive.com/[email protected]/msg60298.html
> > >
> > > >
> > > > As such, can we add a module parameter to customize this timeout value
> > > > by the user?
> > >
> > > Who is the "user" here, or how can the "user" know the value?
> > >
> > > >
> > > > Or this timeout value is stored in device register, virtio_net driver
> > > > will read this timeout value at initialization?
> > >
> > > See another thread. The design needs to be general, or you can post a RFC.
> > >
> > > In another thought, we've already had a tx watchdog, maybe we can have
> > > something similar to cvq and use timeout + reset in that case.
> >
> > But we may block by the reset ^_^ if the device is broken?
>
> I mean vq reset here.

I see.

I mean when the deivce is broken, the vq reset also many be blocked.

void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, 
u16 index)
{
struct virtio_pci_modern_common_cfg __iomem *cfg;

cfg = (struct virtio_pci_modern_common_cfg __iomem 
*)mdev->common;

vp_iowrite16(index, &cfg->cfg.queue_select);
vp_iowrite16(1, &cfg->queue_reset);

while (vp_ioread16(&cfg->queue_reset))
msleep(1);

while (vp_ioread16(&cfg->cfg.queue_enable))
msleep(1);
}
EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);

In this function, for the broken device, we can not expect something.


>
> It looks like we have multiple goals here
>
> 1) avoid lockups, using workqueue + cond_resched() seems to be
> sufficient, it has issue but nothing new
> 2) recover from the unresponsive device, the issue for timeout is that
> it needs to deal with false positives


I agree.

But I want to add a new goal, cvq async. In the netdim, we will
send many requests via the cvq, so the cvq async will be nice.

Thanks.


>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thans
> > >
> > > >
> > > > Zhu Yanjun
> > > >
> > > > >
> > > > >   Andrew
> > > >
> > >
> >
>



Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Heng Qi




在 2024/1/22 上午11:08, Jason Wang 写道:

On Fri, Jan 19, 2024 at 10:27 PM Heng Qi  wrote:



在 2024/1/18 下午8:01, Zhu Yanjun 写道:

在 2024/1/16 20:04, Paolo Abeni 写道:

On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:

From: Zhu Yanjun 

Some devices emulate the virtio_net hardwares. When virtio_net
driver sends commands to the emulated hardware, normally the
hardware needs time to response. Sometimes the time is very
long. Thus, the following will appear. Then the whole system
will hang.
The similar problems also occur in Intel NICs and Mellanox NICs.
As such, the similar solution is borrowed from them. A timeout
value is added and the timeout value as large as possible is set
to ensure that the driver gets the maximum possible response from
the hardware.

"
[  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s!
[(udev-worker):3157]
[  213.796114] Modules linked in: virtio_net(+) net_failover
failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common
intel_uncore_frequency intel_uncore_frequency_common intel_ifs
i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp
coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support
vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev
intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci
isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus
i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf
ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update
fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel
polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3
bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi
pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
[  213.796194] irq event stamp: 67740
[  213.796195] hardirqs last  enabled at (67739):
[] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796203] hardirqs last disabled at (67740):
[] sysvec_apic_timer_interrupt+0xe/0x90
[  213.796208] softirqs last  enabled at (67686):
[] __irq_exit_rcu+0xbe/0xe0
[  213.796214] softirqs last disabled at (67681):
[] __irq_exit_rcu+0xbe/0xe0
[  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded
Not tainted 6.7.0+ #9
[  213.796220] Hardware name: Intel Corporation
M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926
11/14/2022
[  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89
43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87
44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48
89 e8 5b
[  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
[  213.796233] RAX:  RBX: ff2f15095896f000 RCX:
0001
[  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI:
ff2f15095896f000
[  213.796236] RBP:  R08:  R09:

[  213.796237] R10: 0003 R11: ff2f15095893cc40 R12:
0002
[  213.796239] R13: 0004 R14:  R15:
ff2f1509534f3000
[  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0()
knlGS:
[  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
[  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4:
00f71ef0
[  213.796245] DR0:  DR1:  DR2:

[  213.796246] DR3:  DR6: fffe07f0 DR7:
0400
[  213.796247] PKRU: 5554
[  213.796249] Call Trace:
[  213.796250]  
[  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
[  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
[  213.796269]  ? hrtimer_interrupt+0xf8/0x230
[  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
[  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  213.796282]  
[  213.796284]  
[  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
[  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
[  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
[  213.796328]  virtio_dev_probe+0x195/0x230
[  213.796333]  really_probe+0x19f/0x400
[  213.796338]  ? __pfx___driver_attach+0x10/0x10
[  213.796340]  __driver_probe_device+0x78/0x160
[  213.796343]  driver_probe_device+0x1f/0x90
[  213.796346]  __driver_attach+0xd6/0x1d0
[  213.796349]  bus_for_each_dev+0x8c/0xe0
[  213.796355]  bus_add_driver+0x119/0x220
[  213.796359]  driver_register+0x59/0x100
[  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
[  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
[  213.796375]  do_one_initcall+0x6f/0x380
[  213.796384]  do_init_module+0x60/0x240
[  213.796388]  init_module_from_file+0x86/0xc0
[  213.796396]  idempotent_init_module+0x129/0x2c0
[  213.796406]  __x64_sys_finit_module+0x5e/0xb0
[  213.796409]  do_sy

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo  wrote:
>
> On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  wrote:
> > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  wrote:
> > >
> > >
> > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > -   !virtqueue_is_broken(vi->cvq))
> > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > +if (timeout)
> > > > +timeout--;
> > >  This is not really a timeout, just a loop counter. 200 iterations 
> > >  could
> > >  be a very short time on reasonable H/W. I guess this avoid the soft
> > >  lockup, but possibly (likely?) breaks the functionality when we need 
> > >  to
> > >  loop for some non negligible time.
> > > 
> > >  I fear we need a more complex solution, as mentioned by Micheal in 
> > >  the
> > >  thread you quoted.
> > > >>> Got it. I also look forward to the more complex solution to this 
> > > >>> problem.
> > > >> Can we add a device capability (new feature bit) such as 
> > > >> ctrq_wait_timeout
> > > >> to get a reasonable timeout?
> > > > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > >
> > > I read carefully the functions read_poll_timeout() and
> > > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > > functions.
> >
> > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > mode setting to workqueue first:
> >
> > https://www.mail-archive.com/[email protected]/msg60298.html
> >
> > >
> > > As such, can we add a module parameter to customize this timeout value
> > > by the user?
> >
> > Who is the "user" here, or how can the "user" know the value?
> >
> > >
> > > Or this timeout value is stored in device register, virtio_net driver
> > > will read this timeout value at initialization?
> >
> > See another thread. The design needs to be general, or you can post a RFC.
> >
> > In another thought, we've already had a tx watchdog, maybe we can have
> > something similar to cvq and use timeout + reset in that case.
>
> But we may block by the reset ^_^ if the device is broken?

I mean vq reset here.

It looks like we have multiple goals here

1) avoid lockups, using workqueue + cond_resched() seems to be
sufficient, it has issue but nothing new
2) recover from the unresponsive device, the issue for timeout is that
it needs to deal with false positives

Thanks

>
> Thanks.
>
>
> >
> > Thans
> >
> > >
> > > Zhu Yanjun
> > >
> > > >
> > > >   Andrew
> > >
> >
>




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Xuan Zhuo
On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang  wrote:
> On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  wrote:
> >
> >
> > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > -   !virtqueue_is_broken(vi->cvq))
> > > +   !virtqueue_is_broken(vi->cvq)) {
> > > +if (timeout)
> > > +timeout--;
> >  This is not really a timeout, just a loop counter. 200 iterations could
> >  be a very short time on reasonable H/W. I guess this avoid the soft
> >  lockup, but possibly (likely?) breaks the functionality when we need to
> >  loop for some non negligible time.
> > 
> >  I fear we need a more complex solution, as mentioned by Micheal in the
> >  thread you quoted.
> > >>> Got it. I also look forward to the more complex solution to this 
> > >>> problem.
> > >> Can we add a device capability (new feature bit) such as 
> > >> ctrq_wait_timeout
> > >> to get a reasonable timeout?
> > > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> >
> > I read carefully the functions read_poll_timeout() and
> > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > functions.
>
> FYI, in order to avoid a swtich of atomic or not, we need convert rx
> mode setting to workqueue first:
>
> https://www.mail-archive.com/[email protected]/msg60298.html
>
> >
> > As such, can we add a module parameter to customize this timeout value
> > by the user?
>
> Who is the "user" here, or how can the "user" know the value?
>
> >
> > Or this timeout value is stored in device register, virtio_net driver
> > will read this timeout value at initialization?
>
> See another thread. The design needs to be general, or you can post a RFC.
>
> In another thought, we've already had a tx watchdog, maybe we can have
> something similar to cvq and use timeout + reset in that case.

But we may block by the reset ^_^ if the device is broken?

Thanks.


>
> Thans
>
> >
> > Zhu Yanjun
> >
> > >
> > >   Andrew
> >
>



Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun  wrote:
>
>
> 在 2024/1/20 1:29, Andrew Lunn 写道:
> >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > -   !virtqueue_is_broken(vi->cvq))
> > +   !virtqueue_is_broken(vi->cvq)) {
> > +if (timeout)
> > +timeout--;
>  This is not really a timeout, just a loop counter. 200 iterations could
>  be a very short time on reasonable H/W. I guess this avoid the soft
>  lockup, but possibly (likely?) breaks the functionality when we need to
>  loop for some non negligible time.
> 
>  I fear we need a more complex solution, as mentioned by Micheal in the
>  thread you quoted.
> >>> Got it. I also look forward to the more complex solution to this problem.
> >> Can we add a device capability (new feature bit) such as ctrq_wait_timeout
> >> to get a reasonable timeout?
> > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > read_poll_timeout() otherwise read_poll_timeout_atomic().
>
> I read carefully the functions read_poll_timeout() and
> read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> functions.

FYI, in order to avoid a swtich of atomic or not, we need convert rx
mode setting to workqueue first:

https://www.mail-archive.com/[email protected]/msg60298.html

>
> As such, can we add a module parameter to customize this timeout value
> by the user?

Who is the "user" here, or how can the "user" know the value?

>
> Or this timeout value is stored in device register, virtio_net driver
> will read this timeout value at initialization?

See another thread. The design needs to be general, or you can post a RFC.

In another thought, we've already had a tx watchdog, maybe we can have
something similar to cvq and use timeout + reset in that case.

Thans

>
> Zhu Yanjun
>
> >
> >   Andrew
>




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Fri, Jan 19, 2024 at 10:27 PM Heng Qi  wrote:
>
>
>
> 在 2024/1/18 下午8:01, Zhu Yanjun 写道:
> >
> > 在 2024/1/16 20:04, Paolo Abeni 写道:
> >> On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:
> >>> From: Zhu Yanjun 
> >>>
> >>> Some devices emulate the virtio_net hardwares. When virtio_net
> >>> driver sends commands to the emulated hardware, normally the
> >>> hardware needs time to response. Sometimes the time is very
> >>> long. Thus, the following will appear. Then the whole system
> >>> will hang.
> >>> The similar problems also occur in Intel NICs and Mellanox NICs.
> >>> As such, the similar solution is borrowed from them. A timeout
> >>> value is added and the timeout value as large as possible is set
> >>> to ensure that the driver gets the maximum possible response from
> >>> the hardware.
> >>>
> >>> "
> >>> [  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s!
> >>> [(udev-worker):3157]
> >>> [  213.796114] Modules linked in: virtio_net(+) net_failover
> >>> failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common
> >>> intel_uncore_frequency intel_uncore_frequency_common intel_ifs
> >>> i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp
> >>> coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support
> >>> vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev
> >>> intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci
> >>> isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus
> >>> i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf
> >>> ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update
> >>> fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel
> >>> polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3
> >>> bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi
> >>> pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
> >>> [  213.796194] irq event stamp: 67740
> >>> [  213.796195] hardirqs last  enabled at (67739):
> >>> [] asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >>> [  213.796203] hardirqs last disabled at (67740):
> >>> [] sysvec_apic_timer_interrupt+0xe/0x90
> >>> [  213.796208] softirqs last  enabled at (67686):
> >>> [] __irq_exit_rcu+0xbe/0xe0
> >>> [  213.796214] softirqs last disabled at (67681):
> >>> [] __irq_exit_rcu+0xbe/0xe0
> >>> [  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded
> >>> Not tainted 6.7.0+ #9
> >>> [  213.796220] Hardware name: Intel Corporation
> >>> M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926
> >>> 11/14/2022
> >>> [  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
> >>> [  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89
> >>> 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87
> >>> 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48
> >>> 89 e8 5b
> >>> [  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
> >>> [  213.796233] RAX:  RBX: ff2f15095896f000 RCX:
> >>> 0001
> >>> [  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI:
> >>> ff2f15095896f000
> >>> [  213.796236] RBP:  R08:  R09:
> >>> 
> >>> [  213.796237] R10: 0003 R11: ff2f15095893cc40 R12:
> >>> 0002
> >>> [  213.796239] R13: 0004 R14:  R15:
> >>> ff2f1509534f3000
> >>> [  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0()
> >>> knlGS:
> >>> [  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
> >>> [  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4:
> >>> 00f71ef0
> >>> [  213.796245] DR0:  DR1:  DR2:
> >>> 
> >>> [  213.796246] DR3:  DR6: fffe07f0 DR7:
> >>> 0400
> >>> [  213.796247] PKRU: 5554
> >>> [  213.796249] Call Trace:
> >>> [  213.796250]  
> >>> [  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
> >>> [  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
> >>> [  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
> >>> [  213.796269]  ? hrtimer_interrupt+0xf8/0x230
> >>> [  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> >>> [  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> >>> [  213.796282]  
> >>> [  213.796284]  
> >>> [  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >>> [  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
> >>> [  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
> >>> [  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
> >>> [  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
> >>> [  213.796328]  virtio_dev_probe+0x195/0x230
> >>> [  213.796333]  really_probe+0x19f/0x400
> >>> [  213.796338]  ? __pfx___driver_attach+0x10/0x10
> >>> [  213.796340]  __driver_probe_device+0x78/0x160
> >>> [  213.796343]  driver_probe_device+0x1f/0x90
> >>> [  213.796346]  __driver_attach+0xd6/0x1d

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Jason Wang
On Mon, Jan 15, 2024 at 6:22 PM Zhu Yanjun  wrote:
>
>
> 在 2024/1/15 10:20, Jason Wang 写道:
>
> On Mon, Jan 15, 2024 at 9:35 AM Zhu Yanjun  wrote:
>
> From: Zhu Yanjun 
>
> Some devices emulate the virtio_net hardwares. When virtio_net
> driver sends commands to the emulated hardware, normally the
> hardware needs time to response. Sometimes the time is very
> long. Thus, the following will appear. Then the whole system
> will hang.
> The similar problems also occur in Intel NICs and Mellanox NICs.
> As such, the similar solution is borrowed from them. A timeout
> value is added and the timeout value as large as possible is set
> to ensure that the driver gets the maximum possible response from
> the hardware.
>
> "
> [  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
> [(udev-worker):3157]
> [  213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr 
> rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency 
> intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm 
> x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt 
> dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry 
> pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me 
> isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec 
> idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf 
> ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop 
> zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni 
> polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 
> sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg 
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
> [  213.796194] irq event stamp: 67740
> [  213.796195] hardirqs last  enabled at (67739): [] 
> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [  213.796203] hardirqs last disabled at (67740): [] 
> sysvec_apic_timer_interrupt+0xe/0x90
> [  213.796208] softirqs last  enabled at (67686): [] 
> __irq_exit_rcu+0xbe/0xe0
> [  213.796214] softirqs last disabled at (67681): [] 
> __irq_exit_rcu+0xbe/0xe0
> [  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not 
> tainted 6.7.0+ #9
> [  213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, 
> BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022
> [  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
> [  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 
> 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 
> e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
> [  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
> [  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 
> 0001
> [  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: 
> ff2f15095896f000
> [  213.796236] RBP:  R08:  R09: 
> 
> [  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 
> 0002
> [  213.796239] R13: 0004 R14:  R15: 
> ff2f1509534f3000
> [  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
> knlGS:
> [  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
> [  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 
> 00f71ef0
> [  213.796245] DR0:  DR1:  DR2: 
> 
> [  213.796246] DR3:  DR6: fffe07f0 DR7: 
> 0400
> [  213.796247] PKRU: 5554
> [  213.796249] Call Trace:
> [  213.796250]  
> [  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
> [  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
> [  213.796269]  ? hrtimer_interrupt+0xf8/0x230
> [  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> [  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> [  213.796282]  
> [  213.796284]  
> [  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
> [  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
> [  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
> [  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
> [  213.796328]  virtio_dev_probe+0x195/0x230
> [  213.796333]  really_probe+0x19f/0x400
> [  213.796338]  ? __pfx___driver_attach+0x10/0x10
> [  213.796340]  __driver_probe_device+0x78/0x160
> [  213.796343]  driver_probe_device+0x1f/0x90
> [  213.796346]  __driver_attach+0xd6/0x1d0
> [  213.796349]  bus_for_each_dev+0x8c/0xe0
> [  213.796355]  bus_add_driver+0x119/0x220
> [  213.796359]  driver_register+0x59/0x100
> [  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
> [  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
> [  213.796375]  do_one_initcall+0x6f/0x380
> [  213.796384]  do_init_module+0x60/0x240
> [  213.796388]  init

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-21 Thread Zhu Yanjun



在 2024/1/20 1:29, Andrew Lunn 写道:

   while (!virtqueue_get_buf(vi->cvq, &tmp) &&
-   !virtqueue_is_broken(vi->cvq))
+   !virtqueue_is_broken(vi->cvq)) {
+    if (timeout)
+    timeout--;

This is not really a timeout, just a loop counter. 200 iterations could
be a very short time on reasonable H/W. I guess this avoid the soft
lockup, but possibly (likely?) breaks the functionality when we need to
loop for some non negligible time.

I fear we need a more complex solution, as mentioned by Micheal in the
thread you quoted.

Got it. I also look forward to the more complex solution to this problem.

Can we add a device capability (new feature bit) such as ctrq_wait_timeout
to get a reasonable timeout?

The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().


I read carefully the functions read_poll_timeout() and 
read_poll_timeout_atomic(). The timeout is set by the caller of the 2 
functions.


As such, can we add a module parameter to customize this timeout value 
by the user?


Or this timeout value is stored in device register, virtio_net driver 
will read this timeout value at initialization?


Zhu Yanjun



Andrew




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-19 Thread Zhu Yanjun



在 2024/1/20 1:29, Andrew Lunn 写道:

   while (!virtqueue_get_buf(vi->cvq, &tmp) &&
-   !virtqueue_is_broken(vi->cvq))
+   !virtqueue_is_broken(vi->cvq)) {
+    if (timeout)
+    timeout--;

This is not really a timeout, just a loop counter. 200 iterations could
be a very short time on reasonable H/W. I guess this avoid the soft
lockup, but possibly (likely?) breaks the functionality when we need to
loop for some non negligible time.

I fear we need a more complex solution, as mentioned by Micheal in the
thread you quoted.

Got it. I also look forward to the more complex solution to this problem.

Can we add a device capability (new feature bit) such as ctrq_wait_timeout
to get a reasonable timeout?

The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().


Thanks. The 2 functions read_poll_timeout() and 
read_poll_timeout_atomic() are interesting.


Zhu Yanjun



Andrew




Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-19 Thread Andrew Lunn
> > > >   while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > -   !virtqueue_is_broken(vi->cvq))
> > > > +   !virtqueue_is_broken(vi->cvq)) {
> > > > +    if (timeout)
> > > > +    timeout--;
> > > This is not really a timeout, just a loop counter. 200 iterations could
> > > be a very short time on reasonable H/W. I guess this avoid the soft
> > > lockup, but possibly (likely?) breaks the functionality when we need to
> > > loop for some non negligible time.
> > > 
> > > I fear we need a more complex solution, as mentioned by Micheal in the
> > > thread you quoted.
> > 
> > Got it. I also look forward to the more complex solution to this problem.
> 
> Can we add a device capability (new feature bit) such as ctrq_wait_timeout
> to get a reasonable timeout?

The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().

Andrew



Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-19 Thread Heng Qi




在 2024/1/18 下午8:01, Zhu Yanjun 写道:


在 2024/1/16 20:04, Paolo Abeni 写道:

On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:

From: Zhu Yanjun 

Some devices emulate the virtio_net hardwares. When virtio_net
driver sends commands to the emulated hardware, normally the
hardware needs time to response. Sometimes the time is very
long. Thus, the following will appear. Then the whole system
will hang.
The similar problems also occur in Intel NICs and Mellanox NICs.
As such, the similar solution is borrowed from them. A timeout
value is added and the timeout value as large as possible is set
to ensure that the driver gets the maximum possible response from
the hardware.

"
[  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
[(udev-worker):3157]
[  213.796114] Modules linked in: virtio_net(+) net_failover 
failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common 
intel_uncore_frequency intel_uncore_frequency_common intel_ifs 
i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp 
coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support 
vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev 
intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci 
isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus 
i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf 
ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update 
fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel 
polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 
bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi 
pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath

[  213.796194] irq event stamp: 67740
[  213.796195] hardirqs last  enabled at (67739): 
[] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796203] hardirqs last disabled at (67740): 
[] sysvec_apic_timer_interrupt+0xe/0x90
[  213.796208] softirqs last  enabled at (67686): 
[] __irq_exit_rcu+0xbe/0xe0
[  213.796214] softirqs last disabled at (67681): 
[] __irq_exit_rcu+0xbe/0xe0
[  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded 
Not tainted 6.7.0+ #9
[  213.796220] Hardware name: Intel Corporation 
M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 
11/14/2022

[  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 
43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 
44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 
89 e8 5b

[  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
[  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 
0001
[  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: 
ff2f15095896f000
[  213.796236] RBP:  R08:  R09: 

[  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 
0002
[  213.796239] R13: 0004 R14:  R15: 
ff2f1509534f3000
[  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
knlGS:

[  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
[  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 
00f71ef0
[  213.796245] DR0:  DR1:  DR2: 

[  213.796246] DR3:  DR6: fffe07f0 DR7: 
0400

[  213.796247] PKRU: 5554
[  213.796249] Call Trace:
[  213.796250]  
[  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
[  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
[  213.796269]  ? hrtimer_interrupt+0xf8/0x230
[  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
[  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  213.796282]  
[  213.796284]  
[  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
[  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
[  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
[  213.796328]  virtio_dev_probe+0x195/0x230
[  213.796333]  really_probe+0x19f/0x400
[  213.796338]  ? __pfx___driver_attach+0x10/0x10
[  213.796340]  __driver_probe_device+0x78/0x160
[  213.796343]  driver_probe_device+0x1f/0x90
[  213.796346]  __driver_attach+0xd6/0x1d0
[  213.796349]  bus_for_each_dev+0x8c/0xe0
[  213.796355]  bus_add_driver+0x119/0x220
[  213.796359]  driver_register+0x59/0x100
[  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
[  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
[  213.796375]  do_one_initcall+0x6f/0x380
[  213.796384]  do_init_module+0x60/0x240
[  213.796388]  init_module_from_file+0x86/0xc0
[  213.796396]  idempotent_init_module+0x129/0x2c0
[  213.796406]  __x64_sys_finit_module+0x5e/0xb0
[  213.796409]  do_syscall_64+0x60/0xe0
[  213.796415]  ? do_syscall_64

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-18 Thread Xuan Zhuo
On Thu, 18 Jan 2024 08:14:21 -0500, "Michael S. Tsirkin"  
wrote:
> On Tue, Jan 16, 2024 at 01:04:49PM +0100, Paolo Abeni wrote:
> > On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:
> > > From: Zhu Yanjun 
> > >
> > > Some devices emulate the virtio_net hardwares. When virtio_net
> > > driver sends commands to the emulated hardware, normally the
> > > hardware needs time to response. Sometimes the time is very
> > > long. Thus, the following will appear. Then the whole system
> > > will hang.
> > > The similar problems also occur in Intel NICs and Mellanox NICs.
> > > As such, the similar solution is borrowed from them. A timeout
> > > value is added and the timeout value as large as possible is set
> > > to ensure that the driver gets the maximum possible response from
> > > the hardware.
> > >
> > > "
> > > [  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
> > > [(udev-worker):3157]
> > > [  213.796114] Modules linked in: virtio_net(+) net_failover failover 
> > > qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common 
> > > intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac 
> > > nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt 
> > > rapl intel_pmc_bxt dax_hmem iTCO_vendor_support vfat cxl_acpi 
> > > intel_cstate pmt_telemetry pmt_class intel_sdsi joydev intel_uncore 
> > > cxl_core fat pcspkr mei_me isst_if_mbox_pci isst_if_mmio idxd i2c_i801 
> > > isst_if_common mei intel_vsec idxd_bus i2c_smbus i2c_ismt ipmi_ssif 
> > > acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter 
> > > pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul crc32_pclmul 
> > > crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel 
> > > sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core 
> > > i2c_algo_bit wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua 
> > > dm_multipath
> > > [  213.796194] irq event stamp: 67740
> > > [  213.796195] hardirqs last  enabled at (67739): [] 
> > > asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > > [  213.796203] hardirqs last disabled at (67740): [] 
> > > sysvec_apic_timer_interrupt+0xe/0x90
> > > [  213.796208] softirqs last  enabled at (67686): [] 
> > > __irq_exit_rcu+0xbe/0xe0
> > > [  213.796214] softirqs last disabled at (67681): [] 
> > > __irq_exit_rcu+0xbe/0xe0
> > > [  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not 
> > > tainted 6.7.0+ #9
> > > [  213.796220] Hardware name: Intel Corporation 
> > > M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 
> > > 11/14/2022
> > > [  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
> > > [  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 
> > > 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 
> > > 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
> > > [  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
> > > [  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 
> > > 0001
> > > [  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: 
> > > ff2f15095896f000
> > > [  213.796236] RBP:  R08:  R09: 
> > > 
> > > [  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 
> > > 0002
> > > [  213.796239] R13: 0004 R14:  R15: 
> > > ff2f1509534f3000
> > > [  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
> > > knlGS:
> > > [  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
> > > [  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 
> > > 00f71ef0
> > > [  213.796245] DR0:  DR1:  DR2: 
> > > 
> > > [  213.796246] DR3:  DR6: fffe07f0 DR7: 
> > > 0400
> > > [  213.796247] PKRU: 5554
> > > [  213.796249] Call Trace:
> > > [  213.796250]  
> > > [  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
> > > [  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
> > > [  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
> > > [  213.796269]  ? hrtimer_interrupt+0xf8/0x230
> > > [  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> > > [  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> > > [  213.796282]  
> > > [  213.796284]  
> > > [  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > > [  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
> > > [  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
> > > [  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
> > > [  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
> > > [  213.796328]  virtio_dev_probe+0x195/0x230
> > > [  213.796333]  really_probe+0x19f/0x400
> > > [  213.796338]  ? __pfx___driver_attach+0x10/0x10
> > > [  213.796340]  __driver_probe_device+0x78/0x160
> > > [  213.796343]  driver_probe_device+0x1f/0x90
> > > [  213.796346]  __driv

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-18 Thread Michael S. Tsirkin
On Tue, Jan 16, 2024 at 01:04:49PM +0100, Paolo Abeni wrote:
> On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:
> > From: Zhu Yanjun 
> > 
> > Some devices emulate the virtio_net hardwares. When virtio_net
> > driver sends commands to the emulated hardware, normally the
> > hardware needs time to response. Sometimes the time is very
> > long. Thus, the following will appear. Then the whole system
> > will hang.
> > The similar problems also occur in Intel NICs and Mellanox NICs.
> > As such, the similar solution is borrowed from them. A timeout
> > value is added and the timeout value as large as possible is set
> > to ensure that the driver gets the maximum possible response from
> > the hardware.
> > 
> > "
> > [  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
> > [(udev-worker):3157]
> > [  213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr 
> > rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency 
> > intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm 
> > x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt 
> > dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry 
> > pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me 
> > isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec 
> > idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf 
> > ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse 
> > loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni 
> > polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 
> > sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg 
> > scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
> > [  213.796194] irq event stamp: 67740
> > [  213.796195] hardirqs last  enabled at (67739): [] 
> > asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > [  213.796203] hardirqs last disabled at (67740): [] 
> > sysvec_apic_timer_interrupt+0xe/0x90
> > [  213.796208] softirqs last  enabled at (67686): [] 
> > __irq_exit_rcu+0xbe/0xe0
> > [  213.796214] softirqs last disabled at (67681): [] 
> > __irq_exit_rcu+0xbe/0xe0
> > [  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not 
> > tainted 6.7.0+ #9
> > [  213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, 
> > BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022
> > [  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
> > [  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 
> > f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 
> > <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
> > [  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
> > [  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 
> > 0001
> > [  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: 
> > ff2f15095896f000
> > [  213.796236] RBP:  R08:  R09: 
> > 
> > [  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 
> > 0002
> > [  213.796239] R13: 0004 R14:  R15: 
> > ff2f1509534f3000
> > [  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
> > knlGS:
> > [  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
> > [  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 
> > 00f71ef0
> > [  213.796245] DR0:  DR1:  DR2: 
> > 
> > [  213.796246] DR3:  DR6: fffe07f0 DR7: 
> > 0400
> > [  213.796247] PKRU: 5554
> > [  213.796249] Call Trace:
> > [  213.796250]  
> > [  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
> > [  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
> > [  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
> > [  213.796269]  ? hrtimer_interrupt+0xf8/0x230
> > [  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> > [  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> > [  213.796282]  
> > [  213.796284]  
> > [  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > [  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
> > [  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
> > [  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
> > [  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
> > [  213.796328]  virtio_dev_probe+0x195/0x230
> > [  213.796333]  really_probe+0x19f/0x400
> > [  213.796338]  ? __pfx___driver_attach+0x10/0x10
> > [  213.796340]  __driver_probe_device+0x78/0x160
> > [  213.796343]  driver_probe_device+0x1f/0x90
> > [  213.796346]  __driver_attach+0xd6/0x1d0
> > [  213.796349]  bus_for_each_dev+0x8c/0xe0
> > [  213.796355]  bus_add_driver+0x119/0x220
> > [  213.796359]  driver_register+0x59/0x100
> > [  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
> > [  213.796369]  vi

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-18 Thread Zhu Yanjun



在 2024/1/16 20:04, Paolo Abeni 写道:

On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:

From: Zhu Yanjun 

Some devices emulate the virtio_net hardwares. When virtio_net
driver sends commands to the emulated hardware, normally the
hardware needs time to response. Sometimes the time is very
long. Thus, the following will appear. Then the whole system
will hang.
The similar problems also occur in Intel NICs and Mellanox NICs.
As such, the similar solution is borrowed from them. A timeout
value is added and the timeout value as large as possible is set
to ensure that the driver gets the maximum possible response from
the hardware.

"
[  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
[(udev-worker):3157]
[  213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr 
rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency 
intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm 
x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt 
dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry pmt_class 
intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci 
isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus i2c_smbus 
i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad 
acpi_power_meter pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul 
crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel 
sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit 
wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
[  213.796194] irq event stamp: 67740
[  213.796195] hardirqs last  enabled at (67739): [] 
asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796203] hardirqs last disabled at (67740): [] 
sysvec_apic_timer_interrupt+0xe/0x90
[  213.796208] softirqs last  enabled at (67686): [] 
__irq_exit_rcu+0xbe/0xe0
[  213.796214] softirqs last disabled at (67681): [] 
__irq_exit_rcu+0xbe/0xe0
[  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not tainted 
6.7.0+ #9
[  213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS 
SE5C741.86B.01.01.0001.2211140926 11/14/2022
[  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 43 78 
01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 e8 5b 5d 
c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
[  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
[  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 0001
[  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: ff2f15095896f000
[  213.796236] RBP:  R08:  R09: 
[  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 0002
[  213.796239] R13: 0004 R14:  R15: ff2f1509534f3000
[  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
knlGS:
[  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
[  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 00f71ef0
[  213.796245] DR0:  DR1:  DR2: 
[  213.796246] DR3:  DR6: fffe07f0 DR7: 0400
[  213.796247] PKRU: 5554
[  213.796249] Call Trace:
[  213.796250]  
[  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
[  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
[  213.796269]  ? hrtimer_interrupt+0xf8/0x230
[  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
[  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  213.796282]  
[  213.796284]  
[  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
[  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
[  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
[  213.796328]  virtio_dev_probe+0x195/0x230
[  213.796333]  really_probe+0x19f/0x400
[  213.796338]  ? __pfx___driver_attach+0x10/0x10
[  213.796340]  __driver_probe_device+0x78/0x160
[  213.796343]  driver_probe_device+0x1f/0x90
[  213.796346]  __driver_attach+0xd6/0x1d0
[  213.796349]  bus_for_each_dev+0x8c/0xe0
[  213.796355]  bus_add_driver+0x119/0x220
[  213.796359]  driver_register+0x59/0x100
[  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
[  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
[  213.796375]  do_one_initcall+0x6f/0x380
[  213.796384]  do_init_module+0x60/0x240
[  213.796388]  init_module_from_file+0x86/0xc0
[  213.796396]  idempotent_init_module+0x129/0x2c0
[  213.796406]  __x64_sys_finit_module+0x5e/0xb0
[  213.796409]  do_syscall_64+0x60/0xe0
[  213.796415]  ? do_syscall_64+0x6f/0xe0
[  213.796418]  ? lockdep_hardirqs_on_prepar

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-16 Thread Paolo Abeni
On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:
> From: Zhu Yanjun 
> 
> Some devices emulate the virtio_net hardwares. When virtio_net
> driver sends commands to the emulated hardware, normally the
> hardware needs time to response. Sometimes the time is very
> long. Thus, the following will appear. Then the whole system
> will hang.
> The similar problems also occur in Intel NICs and Mellanox NICs.
> As such, the similar solution is borrowed from them. A timeout
> value is added and the timeout value as large as possible is set
> to ensure that the driver gets the maximum possible response from
> the hardware.
> 
> "
> [  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
> [(udev-worker):3157]
> [  213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr 
> rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency 
> intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm 
> x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt 
> dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry 
> pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me 
> isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec 
> idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf 
> ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop 
> zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni 
> polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 
> sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg 
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
> [  213.796194] irq event stamp: 67740
> [  213.796195] hardirqs last  enabled at (67739): [] 
> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [  213.796203] hardirqs last disabled at (67740): [] 
> sysvec_apic_timer_interrupt+0xe/0x90
> [  213.796208] softirqs last  enabled at (67686): [] 
> __irq_exit_rcu+0xbe/0xe0
> [  213.796214] softirqs last disabled at (67681): [] 
> __irq_exit_rcu+0xbe/0xe0
> [  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not 
> tainted 6.7.0+ #9
> [  213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, 
> BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022
> [  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
> [  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 
> 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 
> e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
> [  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
> [  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 
> 0001
> [  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: 
> ff2f15095896f000
> [  213.796236] RBP:  R08:  R09: 
> 
> [  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 
> 0002
> [  213.796239] R13: 0004 R14:  R15: 
> ff2f1509534f3000
> [  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
> knlGS:
> [  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
> [  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 
> 00f71ef0
> [  213.796245] DR0:  DR1:  DR2: 
> 
> [  213.796246] DR3:  DR6: fffe07f0 DR7: 
> 0400
> [  213.796247] PKRU: 5554
> [  213.796249] Call Trace:
> [  213.796250]  
> [  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
> [  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
> [  213.796269]  ? hrtimer_interrupt+0xf8/0x230
> [  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> [  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> [  213.796282]  
> [  213.796284]  
> [  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
> [  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
> [  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
> [  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
> [  213.796328]  virtio_dev_probe+0x195/0x230
> [  213.796333]  really_probe+0x19f/0x400
> [  213.796338]  ? __pfx___driver_attach+0x10/0x10
> [  213.796340]  __driver_probe_device+0x78/0x160
> [  213.796343]  driver_probe_device+0x1f/0x90
> [  213.796346]  __driver_attach+0xd6/0x1d0
> [  213.796349]  bus_for_each_dev+0x8c/0xe0
> [  213.796355]  bus_add_driver+0x119/0x220
> [  213.796359]  driver_register+0x59/0x100
> [  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
> [  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
> [  213.796375]  do_one_initcall+0x6f/0x380
> [  213.796384]  do_init_module+0x60/0x240
> [  213.796388]  init_module_from_file+0x86/0xc0
> [  213.796396]  idempotent_init_module+0x129/0x2c0
> [  213.796

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-15 Thread Zhu Yanjun



在 2024/1/15 10:20, Jason Wang 写道:

On Mon, Jan 15, 2024 at 9:35 AM Zhu Yanjun  wrote:

From: Zhu Yanjun 

Some devices emulate the virtio_net hardwares. When virtio_net
driver sends commands to the emulated hardware, normally the
hardware needs time to response. Sometimes the time is very
long. Thus, the following will appear. Then the whole system
will hang.
The similar problems also occur in Intel NICs and Mellanox NICs.
As such, the similar solution is borrowed from them. A timeout
value is added and the timeout value as large as possible is set
to ensure that the driver gets the maximum possible response from
the hardware.

"
[  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
[(udev-worker):3157]
[  213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr 
rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency 
intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm 
x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt 
dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry pmt_class 
intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci 
isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus i2c_smbus 
i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad 
acpi_power_meter pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul 
crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel 
sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit 
wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
[  213.796194] irq event stamp: 67740
[  213.796195] hardirqs last  enabled at (67739): [] 
asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796203] hardirqs last disabled at (67740): [] 
sysvec_apic_timer_interrupt+0xe/0x90
[  213.796208] softirqs last  enabled at (67686): [] 
__irq_exit_rcu+0xbe/0xe0
[  213.796214] softirqs last disabled at (67681): [] 
__irq_exit_rcu+0xbe/0xe0
[  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not tainted 
6.7.0+ #9
[  213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS 
SE5C741.86B.01.01.0001.2211140926 11/14/2022
[  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 43 78 
01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 e8 5b 5d 
c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
[  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
[  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 0001
[  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: ff2f15095896f000
[  213.796236] RBP:  R08:  R09: 
[  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 0002
[  213.796239] R13: 0004 R14:  R15: ff2f1509534f3000
[  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
knlGS:
[  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
[  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 00f71ef0
[  213.796245] DR0:  DR1:  DR2: 
[  213.796246] DR3:  DR6: fffe07f0 DR7: 0400
[  213.796247] PKRU: 5554
[  213.796249] Call Trace:
[  213.796250]  
[  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
[  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
[  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
[  213.796269]  ? hrtimer_interrupt+0xf8/0x230
[  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
[  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[  213.796282]  
[  213.796284]  
[  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
[  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
[  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
[  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
[  213.796328]  virtio_dev_probe+0x195/0x230
[  213.796333]  really_probe+0x19f/0x400
[  213.796338]  ? __pfx___driver_attach+0x10/0x10
[  213.796340]  __driver_probe_device+0x78/0x160
[  213.796343]  driver_probe_device+0x1f/0x90
[  213.796346]  __driver_attach+0xd6/0x1d0
[  213.796349]  bus_for_each_dev+0x8c/0xe0
[  213.796355]  bus_add_driver+0x119/0x220
[  213.796359]  driver_register+0x59/0x100
[  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
[  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
[  213.796375]  do_one_initcall+0x6f/0x380
[  213.796384]  do_init_module+0x60/0x240
[  213.796388]  init_module_from_file+0x86/0xc0
[  213.796396]  idempotent_init_module+0x129/0x2c0
[  213.796406]  __x64_sys_finit_module+0x5e/0xb0
[  213.796409]  do_syscall_64+0x60/0xe0
[  213.796415]  ? do_syscall_64+0x6f/0xe0
[  213.796418]  ? lockdep_hardirqs_on_prepare+0

Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang

2024-01-14 Thread Jason Wang
On Mon, Jan 15, 2024 at 9:35 AM Zhu Yanjun  wrote:
>
> From: Zhu Yanjun 
>
> Some devices emulate the virtio_net hardwares. When virtio_net
> driver sends commands to the emulated hardware, normally the
> hardware needs time to response. Sometimes the time is very
> long. Thus, the following will appear. Then the whole system
> will hang.
> The similar problems also occur in Intel NICs and Mellanox NICs.
> As such, the similar solution is borrowed from them. A timeout
> value is added and the timeout value as large as possible is set
> to ensure that the driver gets the maximum possible response from
> the hardware.
>
> "
> [  213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! 
> [(udev-worker):3157]
> [  213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr 
> rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency 
> intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm 
> x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt 
> dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry 
> pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me 
> isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec 
> idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf 
> ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop 
> zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni 
> polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 
> sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg 
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
> [  213.796194] irq event stamp: 67740
> [  213.796195] hardirqs last  enabled at (67739): [] 
> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [  213.796203] hardirqs last disabled at (67740): [] 
> sysvec_apic_timer_interrupt+0xe/0x90
> [  213.796208] softirqs last  enabled at (67686): [] 
> __irq_exit_rcu+0xbe/0xe0
> [  213.796214] softirqs last disabled at (67681): [] 
> __irq_exit_rcu+0xbe/0xe0
> [  213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not 
> tainted 6.7.0+ #9
> [  213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, 
> BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022
> [  213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110
> [  213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 
> 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 
> e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b
> [  213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246
> [  213.796233] RAX:  RBX: ff2f15095896f000 RCX: 
> 0001
> [  213.796235] RDX:  RSI: ff4bbb362306f9cc RDI: 
> ff2f15095896f000
> [  213.796236] RBP:  R08:  R09: 
> 
> [  213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 
> 0002
> [  213.796239] R13: 0004 R14:  R15: 
> ff2f1509534f3000
> [  213.796240] FS:  7f775847d0c0() GS:ff2f1528bac0() 
> knlGS:
> [  213.796242] CS:  0010 DS:  ES:  CR0: 80050033
> [  213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 
> 00f71ef0
> [  213.796245] DR0:  DR1:  DR2: 
> 
> [  213.796246] DR3:  DR6: fffe07f0 DR7: 
> 0400
> [  213.796247] PKRU: 5554
> [  213.796249] Call Trace:
> [  213.796250]  
> [  213.796252]  ? watchdog_timer_fn+0x1c0/0x220
> [  213.796258]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [  213.796261]  ? __hrtimer_run_queues+0x1af/0x380
> [  213.796269]  ? hrtimer_interrupt+0xf8/0x230
> [  213.796274]  ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> [  213.796279]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> [  213.796282]  
> [  213.796284]  
> [  213.796285]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [  213.796293]  ? virtqueue_get_buf_ctx_split+0x8d/0x110
> [  213.796297]  virtnet_send_command+0x18a/0x1f0 [virtio_net]
> [  213.796310]  _virtnet_set_queues+0xc6/0x120 [virtio_net]
> [  213.796319]  virtnet_probe+0xa06/0xd50 [virtio_net]
> [  213.796328]  virtio_dev_probe+0x195/0x230
> [  213.796333]  really_probe+0x19f/0x400
> [  213.796338]  ? __pfx___driver_attach+0x10/0x10
> [  213.796340]  __driver_probe_device+0x78/0x160
> [  213.796343]  driver_probe_device+0x1f/0x90
> [  213.796346]  __driver_attach+0xd6/0x1d0
> [  213.796349]  bus_for_each_dev+0x8c/0xe0
> [  213.796355]  bus_add_driver+0x119/0x220
> [  213.796359]  driver_register+0x59/0x100
> [  213.796362]  ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net]
> [  213.796369]  virtio_net_driver_init+0x8e/0xff0 [virtio_net]
> [  213.796375]  do_one_initcall+0x6f/0x380
> [  213.796384]  do_init_module+0x60/0x240
> [  213.796388]  init_module_from_file+0x86/0xc0
> [  213.796396]  idempotent_init_module+0x129/0x2c0
> [  213.79640