Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/26 11:11, Zhu Yanjun 写道:
在 2024/1/22 15:02, Xuan Zhuo 写道:
On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang wrote:
On Mon, Jan 22, 2024 at 2:55 PM Jason Wang wrote:
On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo wrote:
On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang wrote:
On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo wrote:
On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang wrote:
On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun wrote:
在 2024/1/20 1:29, Andrew Lunn 写道:
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
- !virtqueue_is_broken(vi->cvq))
+ !virtqueue_is_broken(vi->cvq)) {
+if (timeout)
+timeout--;
This is not really a timeout, just a loop counter. 200 iterations could
be a very short time on reasonable H/W. I guess this avoid the soft
lockup, but possibly (likely?) breaks the functionality when we need to
loop for some non negligible time.
I fear we need a more complex solution, as mentioned by Micheal in the
thread you quoted.
Got it. I also look forward to the more complex solution to this problem.
Can we add a device capability (new feature bit) such as ctrq_wait_timeout
to get a reasonable timeout?
The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().
I read carefully the functions read_poll_timeout() and
read_poll_timeout_atomic(). The timeout is set by the caller of the 2
functions.
FYI, in order to avoid a swtich of atomic or not, we need convert rx
mode setting to workqueue first:
https://www.mail-archive.com/[email protected]/msg60298.html
As such, can we add a module parameter to customize this timeout value
by the user?
Who is the "user" here, or how can the "user" know the value?
Or this timeout value is stored in device register, virtio_net driver
will read this timeout value at initialization?
See another thread. The design needs to be general, or you can post a RFC.
In another thought, we've already had a tx watchdog, maybe we can have
something similar to cvq and use timeout + reset in that case.
But we may block by the reset ^_^ if the device is broken?
I mean vq reset here.
I see.
I mean when the deivce is broken, the vq reset also many be blocked.
void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev,
u16 index)
{
struct virtio_pci_modern_common_cfg __iomem *cfg;
cfg = (struct virtio_pci_modern_common_cfg __iomem
*)mdev->common;
vp_iowrite16(index, &cfg->cfg.queue_select);
vp_iowrite16(1, &cfg->queue_reset);
while (vp_ioread16(&cfg->queue_reset))
msleep(1);
while (vp_ioread16(&cfg->cfg.queue_enable))
msleep(1);
}
EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
In this function, for the broken device, we can not expect something.
Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
Thanks
It looks like we have multiple goals here
1) avoid lockups, using workqueue + cond_resched() seems to be
sufficient, it has issue but nothing new
2) recover from the unresponsive device, the issue for timeout is that
it needs to deal with false positives
I agree.
But I want to add a new goal, cvq async. In the netdim, we will
send many requests via the cvq, so the cvq async will be nice.
Then you need an interrupt for cvq.
FYI, I've posted a series that use interrupt for cvq in the past:
https://lore.kernel.org/lkml/[email protected]/t/
I know this. But the interrupt maybe not a good solution without new space.
Haven't found time in working on this anymore, maybe we can start from
this or not.
I said async, but my aim is to put many requests to the cvq before getting the
response.
Heng Qi posted
thishttps://lore.kernel.org/all/[email protected]/
Sorry. This mail is rejected by netdev maillist. So I have to resend it.
Thanks a lot. I read Heng Qi's commits carefully. This patch series are
similiar with the NIC feature xmit_more.
But if cvq command is urgent, can we let this urgent cvq command be
passed ASAP?
I mean, can we set a flag similar to xmit_more? if cvq command is not
urgent, it can be queued. If it is urgent,
this cvq command is passed ASAP.
Zhu Yanjun
Zhu Yanjun
Thanks.
Thanks
Thanks.
Thanks
Thanks.
Thans
Zhu Yanjun
Andrew
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 16:32:46 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 4:04 PM Xuan Zhuo wrote:
> >
> > On Mon, 22 Jan 2024 15:57:08 +0800, Jason Wang wrote:
> > > On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang
> > > > > > wrote:
> > > > > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp)
> > > > > > > > > > > > > > &&
> > > > > > > > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > > > > +if (timeout)
> > > > > > > > > > > > > > +timeout--;
> > > > > > > > > > > > > This is not really a timeout, just a loop
> > > > > > > > > > > > > counter. 200 iterations could
> > > > > > > > > > > > > be a very short time on reasonable H/W. I guess
> > > > > > > > > > > > > this avoid the soft
> > > > > > > > > > > > > lockup, but possibly (likely?) breaks the
> > > > > > > > > > > > > functionality when we need to
> > > > > > > > > > > > > loop for some non negligible time.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I fear we need a more complex solution, as
> > > > > > > > > > > > > mentioned by Micheal in the
> > > > > > > > > > > > > thread you quoted.
> > > > > > > > > > > > > >>> Got it. I also look forward to the more complex
> > > > > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > > > > >> Can we add a device capability (new feature bit)
> > > > > > > > > > > > > >> such as ctrq_wait_timeout
> > > > > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > > > > The usual solution to this is
> > > > > > > > > > > > > > include/linux/iopoll.h. If you can sleep
> > > > > > > > > > > > > > read_poll_timeout() otherwise
> > > > > > > > > > > > > > read_poll_timeout_atomic().
> > > > > > > > > > > > >
> > > > > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the
> > > > > > > > > > > > > caller of the 2
> > > > > > > > > > > > > functions.
> > > > > > > > > > > >
> > > > > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we
> > > > > > > > > > > > need convert rx
> > > > > > > > > > > > mode setting to workqueue first:
> > > > > > > > > > > >
> > > > > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > As such, can we add a module parameter to customize
> > > > > > > > > > > > > this timeout value
> > > > > > > > > > > > > by the user?
> > > > > > > > > > > >
> > > > > > > > > > > > Who is the "user" here, or how can the "user" know the
> > > > > > > > > > > > value?
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Or this timeout value is stored in device register,
> > > > > > > > > > > > > virtio_net driver
> > > > > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > > > > >
> > > > > > > > > > > > See another thread. The design needs to be general, or
> > > > > > > > > > > > you can post a RFC.
> > > > > > > > > > > >
> > > > > > > > > > > > In another thought, we've already had a tx watchdog,
> > > > > > > > > > > > maybe we can have
> > > > > > > > > > > > something similar to cvq and use timeout + reset in
> > > > > > > > > > > > that case.
> > > > > > > > > > >
> > > > > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > > > > >
> > > > > > > > > > I mean vq reset here.
> > > > > > > > >
> > > > > > > > > I see.
> > > > > > > > >
> > > > > > > > > I mean when the deivce is broken, the vq reset also many be
> > > > > > > > > blocked.
> > > > > > > > >
> > > > > > > > > void vp_modern_set_queue_reset(struct
> > > > > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > > > > {
> > > > > > > > > struct virtio_pci_modern_common_cfg __iom
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 4:04 PM Xuan Zhuo wrote:
>
> On Mon, 22 Jan 2024 15:57:08 +0800, Jason Wang wrote:
> > On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo
> > > > wrote:
> > > > >
> > > > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang
> > > > > wrote:
> > > > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > > > > > > wrote:
> > > > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > > > +if (timeout)
> > > > > > > > > > > > > +timeout--;
> > > > > > > > > > > > This is not really a timeout, just a loop counter.
> > > > > > > > > > > > 200 iterations could
> > > > > > > > > > > > be a very short time on reasonable H/W. I guess
> > > > > > > > > > > > this avoid the soft
> > > > > > > > > > > > lockup, but possibly (likely?) breaks the
> > > > > > > > > > > > functionality when we need to
> > > > > > > > > > > > loop for some non negligible time.
> > > > > > > > > > > >
> > > > > > > > > > > > I fear we need a more complex solution, as
> > > > > > > > > > > > mentioned by Micheal in the
> > > > > > > > > > > > thread you quoted.
> > > > > > > > > > > > >>> Got it. I also look forward to the more complex
> > > > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > > > >> Can we add a device capability (new feature bit)
> > > > > > > > > > > > >> such as ctrq_wait_timeout
> > > > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > > > The usual solution to this is include/linux/iopoll.h.
> > > > > > > > > > > > > If you can sleep
> > > > > > > > > > > > > read_poll_timeout() otherwise
> > > > > > > > > > > > > read_poll_timeout_atomic().
> > > > > > > > > > > >
> > > > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the
> > > > > > > > > > > > caller of the 2
> > > > > > > > > > > > functions.
> > > > > > > > > > >
> > > > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need
> > > > > > > > > > > convert rx
> > > > > > > > > > > mode setting to workqueue first:
> > > > > > > > > > >
> > > > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > As such, can we add a module parameter to customize
> > > > > > > > > > > > this timeout value
> > > > > > > > > > > > by the user?
> > > > > > > > > > >
> > > > > > > > > > > Who is the "user" here, or how can the "user" know the
> > > > > > > > > > > value?
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Or this timeout value is stored in device register,
> > > > > > > > > > > > virtio_net driver
> > > > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > > > >
> > > > > > > > > > > See another thread. The design needs to be general, or
> > > > > > > > > > > you can post a RFC.
> > > > > > > > > > >
> > > > > > > > > > > In another thought, we've already had a tx watchdog,
> > > > > > > > > > > maybe we can have
> > > > > > > > > > > something similar to cvq and use timeout + reset in that
> > > > > > > > > > > case.
> > > > > > > > > >
> > > > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > > > >
> > > > > > > > > I mean vq reset here.
> > > > > > > >
> > > > > > > > I see.
> > > > > > > >
> > > > > > > > I mean when the deivce is broken, the vq reset also many be
> > > > > > > > blocked.
> > > > > > > >
> > > > > > > > void vp_modern_set_queue_reset(struct
> > > > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > > > {
> > > > > > > > struct virtio_pci_modern_common_cfg __iomem
> > > > > > > > *cfg;
> > > > > > > >
> > > > > > > > cfg = (struct virtio_pci_modern_common_cfg
> > > > > > > > __iomem *)mdev->common;
> > > > > > > >
> > > > > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > > > > vp_iowrit
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 15:57:08 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo wrote:
> >
> > On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang wrote:
> > > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang
> > > > > wrote:
> > > > > >
> > > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > > > > > wrote:
> > > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > > +if (timeout)
> > > > > > > > > > > > +timeout--;
> > > > > > > > > > > This is not really a timeout, just a loop counter.
> > > > > > > > > > > 200 iterations could
> > > > > > > > > > > be a very short time on reasonable H/W. I guess this
> > > > > > > > > > > avoid the soft
> > > > > > > > > > > lockup, but possibly (likely?) breaks the
> > > > > > > > > > > functionality when we need to
> > > > > > > > > > > loop for some non negligible time.
> > > > > > > > > > >
> > > > > > > > > > > I fear we need a more complex solution, as mentioned
> > > > > > > > > > > by Micheal in the
> > > > > > > > > > > thread you quoted.
> > > > > > > > > > > >>> Got it. I also look forward to the more complex
> > > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > > >> Can we add a device capability (new feature bit) such
> > > > > > > > > > > >> as ctrq_wait_timeout
> > > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > > The usual solution to this is include/linux/iopoll.h.
> > > > > > > > > > > > If you can sleep
> > > > > > > > > > > > read_poll_timeout() otherwise
> > > > > > > > > > > > read_poll_timeout_atomic().
> > > > > > > > > > >
> > > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the
> > > > > > > > > > > caller of the 2
> > > > > > > > > > > functions.
> > > > > > > > > >
> > > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need
> > > > > > > > > > convert rx
> > > > > > > > > > mode setting to workqueue first:
> > > > > > > > > >
> > > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > As such, can we add a module parameter to customize this
> > > > > > > > > > > timeout value
> > > > > > > > > > > by the user?
> > > > > > > > > >
> > > > > > > > > > Who is the "user" here, or how can the "user" know the
> > > > > > > > > > value?
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Or this timeout value is stored in device register,
> > > > > > > > > > > virtio_net driver
> > > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > > >
> > > > > > > > > > See another thread. The design needs to be general, or you
> > > > > > > > > > can post a RFC.
> > > > > > > > > >
> > > > > > > > > > In another thought, we've already had a tx watchdog, maybe
> > > > > > > > > > we can have
> > > > > > > > > > something similar to cvq and use timeout + reset in that
> > > > > > > > > > case.
> > > > > > > > >
> > > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > > >
> > > > > > > > I mean vq reset here.
> > > > > > >
> > > > > > > I see.
> > > > > > >
> > > > > > > I mean when the deivce is broken, the vq reset also many be
> > > > > > > blocked.
> > > > > > >
> > > > > > > void vp_modern_set_queue_reset(struct
> > > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > > {
> > > > > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > > > > >
> > > > > > > cfg = (struct virtio_pci_modern_common_cfg
> > > > > > > __iomem *)mdev->common;
> > > > > > >
> > > > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > > > vp_iowrite16(1, &cfg->queue_reset);
> > > > > > >
> > > > > > > while (vp_ioread16(&cfg->queue_reset))
> > > > > > > msleep(1);
> > > > > > >
> > > > > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > > > > msle
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 3:36 PM Xuan Zhuo wrote:
>
> On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang wrote:
> > On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang wrote:
> > > > >
> > > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > > > > wrote:
> > > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > > > > wrote:
> > > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > > +if (timeout)
> > > > > > > > > > > +timeout--;
> > > > > > > > > > This is not really a timeout, just a loop counter. 200
> > > > > > > > > > iterations could
> > > > > > > > > > be a very short time on reasonable H/W. I guess this
> > > > > > > > > > avoid the soft
> > > > > > > > > > lockup, but possibly (likely?) breaks the
> > > > > > > > > > functionality when we need to
> > > > > > > > > > loop for some non negligible time.
> > > > > > > > > >
> > > > > > > > > > I fear we need a more complex solution, as mentioned
> > > > > > > > > > by Micheal in the
> > > > > > > > > > thread you quoted.
> > > > > > > > > > >>> Got it. I also look forward to the more complex
> > > > > > > > > > >>> solution to this problem.
> > > > > > > > > > >> Can we add a device capability (new feature bit) such as
> > > > > > > > > > >> ctrq_wait_timeout
> > > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > > The usual solution to this is include/linux/iopoll.h. If
> > > > > > > > > > > you can sleep
> > > > > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > > > > >
> > > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the
> > > > > > > > > > caller of the 2
> > > > > > > > > > functions.
> > > > > > > > >
> > > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need
> > > > > > > > > convert rx
> > > > > > > > > mode setting to workqueue first:
> > > > > > > > >
> > > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > As such, can we add a module parameter to customize this
> > > > > > > > > > timeout value
> > > > > > > > > > by the user?
> > > > > > > > >
> > > > > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Or this timeout value is stored in device register,
> > > > > > > > > > virtio_net driver
> > > > > > > > > > will read this timeout value at initialization?
> > > > > > > > >
> > > > > > > > > See another thread. The design needs to be general, or you
> > > > > > > > > can post a RFC.
> > > > > > > > >
> > > > > > > > > In another thought, we've already had a tx watchdog, maybe we
> > > > > > > > > can have
> > > > > > > > > something similar to cvq and use timeout + reset in that case.
> > > > > > > >
> > > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > > >
> > > > > > > I mean vq reset here.
> > > > > >
> > > > > > I see.
> > > > > >
> > > > > > I mean when the deivce is broken, the vq reset also many be blocked.
> > > > > >
> > > > > > void vp_modern_set_queue_reset(struct
> > > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > > {
> > > > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > > > >
> > > > > > cfg = (struct virtio_pci_modern_common_cfg __iomem
> > > > > > *)mdev->common;
> > > > > >
> > > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > > vp_iowrite16(1, &cfg->queue_reset);
> > > > > >
> > > > > > while (vp_ioread16(&cfg->queue_reset))
> > > > > > msleep(1);
> > > > > >
> > > > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > > > msleep(1);
> > > > > > }
> > > > > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > > > > >
> > > > > > In this function, for the broken device, we can not expect
> > > > > > something.
> > > > >
> > > > > Yes, it's best effort, there's no guarantee then. But it doesn't harm
> > > > > to try.
> > > > >
> > > > > Thanks
> > > > >
> > > >
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 15:19:12 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo wrote:
> >
> > On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang wrote:
> > > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang wrote:
> > > >
> > > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > > > wrote:
> > > > >
> > > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > > > wrote:
> > > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > > > wrote:
> > > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > > +if (timeout)
> > > > > > > > > > +timeout--;
> > > > > > > > > This is not really a timeout, just a loop counter. 200
> > > > > > > > > iterations could
> > > > > > > > > be a very short time on reasonable H/W. I guess this
> > > > > > > > > avoid the soft
> > > > > > > > > lockup, but possibly (likely?) breaks the functionality
> > > > > > > > > when we need to
> > > > > > > > > loop for some non negligible time.
> > > > > > > > >
> > > > > > > > > I fear we need a more complex solution, as mentioned by
> > > > > > > > > Micheal in the
> > > > > > > > > thread you quoted.
> > > > > > > > > >>> Got it. I also look forward to the more complex solution
> > > > > > > > > >>> to this problem.
> > > > > > > > > >> Can we add a device capability (new feature bit) such as
> > > > > > > > > >> ctrq_wait_timeout
> > > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > > The usual solution to this is include/linux/iopoll.h. If
> > > > > > > > > > you can sleep
> > > > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > > > >
> > > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > > read_poll_timeout_atomic(). The timeout is set by the caller
> > > > > > > > > of the 2
> > > > > > > > > functions.
> > > > > > > >
> > > > > > > > FYI, in order to avoid a swtich of atomic or not, we need
> > > > > > > > convert rx
> > > > > > > > mode setting to workqueue first:
> > > > > > > >
> > > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > > >
> > > > > > > > >
> > > > > > > > > As such, can we add a module parameter to customize this
> > > > > > > > > timeout value
> > > > > > > > > by the user?
> > > > > > > >
> > > > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Or this timeout value is stored in device register,
> > > > > > > > > virtio_net driver
> > > > > > > > > will read this timeout value at initialization?
> > > > > > > >
> > > > > > > > See another thread. The design needs to be general, or you can
> > > > > > > > post a RFC.
> > > > > > > >
> > > > > > > > In another thought, we've already had a tx watchdog, maybe we
> > > > > > > > can have
> > > > > > > > something similar to cvq and use timeout + reset in that case.
> > > > > > >
> > > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > > >
> > > > > > I mean vq reset here.
> > > > >
> > > > > I see.
> > > > >
> > > > > I mean when the deivce is broken, the vq reset also many be blocked.
> > > > >
> > > > > void vp_modern_set_queue_reset(struct
> > > > > virtio_pci_modern_device *mdev, u16 index)
> > > > > {
> > > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > > >
> > > > > cfg = (struct virtio_pci_modern_common_cfg __iomem
> > > > > *)mdev->common;
> > > > >
> > > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > > vp_iowrite16(1, &cfg->queue_reset);
> > > > >
> > > > > while (vp_ioread16(&cfg->queue_reset))
> > > > > msleep(1);
> > > > >
> > > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > > msleep(1);
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > > > >
> > > > > In this function, for the broken device, we can not expect something.
> > > >
> > > > Yes, it's best effort, there's no guarantee then. But it doesn't harm
> > > > to try.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > > It looks like we have multiple goals here
> > > > > >
> > > > > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > > > > sufficient, it has issue but nothing new
> > > > > > 2) recover from the unresponsive device, the issue for timeout is
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 3:07 PM Xuan Zhuo wrote:
>
> On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang wrote:
> > On Mon, Jan 22, 2024 at 2:55 PM Jason Wang wrote:
> > >
> > > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > > wrote:
> > > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > > +if (timeout)
> > > > > > > > > +timeout--;
> > > > > > > > This is not really a timeout, just a loop counter. 200
> > > > > > > > iterations could
> > > > > > > > be a very short time on reasonable H/W. I guess this avoid
> > > > > > > > the soft
> > > > > > > > lockup, but possibly (likely?) breaks the functionality
> > > > > > > > when we need to
> > > > > > > > loop for some non negligible time.
> > > > > > > >
> > > > > > > > I fear we need a more complex solution, as mentioned by
> > > > > > > > Micheal in the
> > > > > > > > thread you quoted.
> > > > > > > > >>> Got it. I also look forward to the more complex solution to
> > > > > > > > >>> this problem.
> > > > > > > > >> Can we add a device capability (new feature bit) such as
> > > > > > > > >> ctrq_wait_timeout
> > > > > > > > >> to get a reasonable timeout?
> > > > > > > > > The usual solution to this is include/linux/iopoll.h. If you
> > > > > > > > > can sleep
> > > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > > >
> > > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of
> > > > > > > > the 2
> > > > > > > > functions.
> > > > > > >
> > > > > > > FYI, in order to avoid a swtich of atomic or not, we need convert
> > > > > > > rx
> > > > > > > mode setting to workqueue first:
> > > > > > >
> > > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > > >
> > > > > > > >
> > > > > > > > As such, can we add a module parameter to customize this
> > > > > > > > timeout value
> > > > > > > > by the user?
> > > > > > >
> > > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > > >
> > > > > > > >
> > > > > > > > Or this timeout value is stored in device register, virtio_net
> > > > > > > > driver
> > > > > > > > will read this timeout value at initialization?
> > > > > > >
> > > > > > > See another thread. The design needs to be general, or you can
> > > > > > > post a RFC.
> > > > > > >
> > > > > > > In another thought, we've already had a tx watchdog, maybe we can
> > > > > > > have
> > > > > > > something similar to cvq and use timeout + reset in that case.
> > > > > >
> > > > > > But we may block by the reset ^_^ if the device is broken?
> > > > >
> > > > > I mean vq reset here.
> > > >
> > > > I see.
> > > >
> > > > I mean when the deivce is broken, the vq reset also many be blocked.
> > > >
> > > > void vp_modern_set_queue_reset(struct virtio_pci_modern_device
> > > > *mdev, u16 index)
> > > > {
> > > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > > >
> > > > cfg = (struct virtio_pci_modern_common_cfg __iomem
> > > > *)mdev->common;
> > > >
> > > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > > vp_iowrite16(1, &cfg->queue_reset);
> > > >
> > > > while (vp_ioread16(&cfg->queue_reset))
> > > > msleep(1);
> > > >
> > > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > > msleep(1);
> > > > }
> > > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > > >
> > > > In this function, for the broken device, we can not expect something.
> > >
> > > Yes, it's best effort, there's no guarantee then. But it doesn't harm to
> > > try.
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > > >
> > > > > It looks like we have multiple goals here
> > > > >
> > > > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > > > sufficient, it has issue but nothing new
> > > > > 2) recover from the unresponsive device, the issue for timeout is that
> > > > > it needs to deal with false positives
> > > >
> > > >
> > > > I agree.
> > > >
> > > > But I want to add a new goal, cvq async. In the netdim, we will
> > > > send many requests via the cvq, so the cvq async will be nice.
> >
> > Then you need an interrupt for cvq.
> >
> > F
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 14:58:09 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 2:55 PM Jason Wang wrote:
> >
> > On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > > wrote:
> > > > >
> > > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > > wrote:
> > > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > > +if (timeout)
> > > > > > > > +timeout--;
> > > > > > > This is not really a timeout, just a loop counter. 200
> > > > > > > iterations could
> > > > > > > be a very short time on reasonable H/W. I guess this avoid
> > > > > > > the soft
> > > > > > > lockup, but possibly (likely?) breaks the functionality when
> > > > > > > we need to
> > > > > > > loop for some non negligible time.
> > > > > > >
> > > > > > > I fear we need a more complex solution, as mentioned by
> > > > > > > Micheal in the
> > > > > > > thread you quoted.
> > > > > > > >>> Got it. I also look forward to the more complex solution to
> > > > > > > >>> this problem.
> > > > > > > >> Can we add a device capability (new feature bit) such as
> > > > > > > >> ctrq_wait_timeout
> > > > > > > >> to get a reasonable timeout?
> > > > > > > > The usual solution to this is include/linux/iopoll.h. If you
> > > > > > > > can sleep
> > > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > > >
> > > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of
> > > > > > > the 2
> > > > > > > functions.
> > > > > >
> > > > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > > > mode setting to workqueue first:
> > > > > >
> > > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > > >
> > > > > > >
> > > > > > > As such, can we add a module parameter to customize this timeout
> > > > > > > value
> > > > > > > by the user?
> > > > > >
> > > > > > Who is the "user" here, or how can the "user" know the value?
> > > > > >
> > > > > > >
> > > > > > > Or this timeout value is stored in device register, virtio_net
> > > > > > > driver
> > > > > > > will read this timeout value at initialization?
> > > > > >
> > > > > > See another thread. The design needs to be general, or you can post
> > > > > > a RFC.
> > > > > >
> > > > > > In another thought, we've already had a tx watchdog, maybe we can
> > > > > > have
> > > > > > something similar to cvq and use timeout + reset in that case.
> > > > >
> > > > > But we may block by the reset ^_^ if the device is broken?
> > > >
> > > > I mean vq reset here.
> > >
> > > I see.
> > >
> > > I mean when the deivce is broken, the vq reset also many be blocked.
> > >
> > > void vp_modern_set_queue_reset(struct virtio_pci_modern_device
> > > *mdev, u16 index)
> > > {
> > > struct virtio_pci_modern_common_cfg __iomem *cfg;
> > >
> > > cfg = (struct virtio_pci_modern_common_cfg __iomem
> > > *)mdev->common;
> > >
> > > vp_iowrite16(index, &cfg->cfg.queue_select);
> > > vp_iowrite16(1, &cfg->queue_reset);
> > >
> > > while (vp_ioread16(&cfg->queue_reset))
> > > msleep(1);
> > >
> > > while (vp_ioread16(&cfg->cfg.queue_enable))
> > > msleep(1);
> > > }
> > > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> > >
> > > In this function, for the broken device, we can not expect something.
> >
> > Yes, it's best effort, there's no guarantee then. But it doesn't harm to
> > try.
> >
> > Thanks
> >
> > >
> > >
> > > >
> > > > It looks like we have multiple goals here
> > > >
> > > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > > sufficient, it has issue but nothing new
> > > > 2) recover from the unresponsive device, the issue for timeout is that
> > > > it needs to deal with false positives
> > >
> > >
> > > I agree.
> > >
> > > But I want to add a new goal, cvq async. In the netdim, we will
> > > send many requests via the cvq, so the cvq async will be nice.
>
> Then you need an interrupt for cvq.
>
> FYI, I've posted a series that use interrupt for cvq in the past:
>
> https://lore.kernel.org/lkml/[email protected]/t/
I know this. But the interrupt maybe not a good solution without new space.
>
> Haven't found time in working on this anymore, maybe we can start from
> this or not.
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 14:55:46 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo wrote:
> >
> > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang wrote:
> > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > wrote:
> > > > > >
> > > > > >
> > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > +if (timeout)
> > > > > > > +timeout--;
> > > > > > This is not really a timeout, just a loop counter. 200
> > > > > > iterations could
> > > > > > be a very short time on reasonable H/W. I guess this avoid the
> > > > > > soft
> > > > > > lockup, but possibly (likely?) breaks the functionality when
> > > > > > we need to
> > > > > > loop for some non negligible time.
> > > > > >
> > > > > > I fear we need a more complex solution, as mentioned by
> > > > > > Micheal in the
> > > > > > thread you quoted.
> > > > > > >>> Got it. I also look forward to the more complex solution to
> > > > > > >>> this problem.
> > > > > > >> Can we add a device capability (new feature bit) such as
> > > > > > >> ctrq_wait_timeout
> > > > > > >> to get a reasonable timeout?
> > > > > > > The usual solution to this is include/linux/iopoll.h. If you can
> > > > > > > sleep
> > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > >
> > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of the
> > > > > > 2
> > > > > > functions.
> > > > >
> > > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > > mode setting to workqueue first:
> > > > >
> > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > >
> > > > > >
> > > > > > As such, can we add a module parameter to customize this timeout
> > > > > > value
> > > > > > by the user?
> > > > >
> > > > > Who is the "user" here, or how can the "user" know the value?
> > > > >
> > > > > >
> > > > > > Or this timeout value is stored in device register, virtio_net
> > > > > > driver
> > > > > > will read this timeout value at initialization?
> > > > >
> > > > > See another thread. The design needs to be general, or you can post a
> > > > > RFC.
> > > > >
> > > > > In another thought, we've already had a tx watchdog, maybe we can have
> > > > > something similar to cvq and use timeout + reset in that case.
> > > >
> > > > But we may block by the reset ^_^ if the device is broken?
> > >
> > > I mean vq reset here.
> >
> > I see.
> >
> > I mean when the deivce is broken, the vq reset also many be blocked.
> >
> > void vp_modern_set_queue_reset(struct virtio_pci_modern_device
> > *mdev, u16 index)
> > {
> > struct virtio_pci_modern_common_cfg __iomem *cfg;
> >
> > cfg = (struct virtio_pci_modern_common_cfg __iomem
> > *)mdev->common;
> >
> > vp_iowrite16(index, &cfg->cfg.queue_select);
> > vp_iowrite16(1, &cfg->queue_reset);
> >
> > while (vp_ioread16(&cfg->queue_reset))
> > msleep(1);
> >
> > while (vp_ioread16(&cfg->cfg.queue_enable))
> > msleep(1);
> > }
> > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> >
> > In this function, for the broken device, we can not expect something.
>
> Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
>
YES. I agree.
Thanks.
> Thanks
>
> >
> >
> > >
> > > It looks like we have multiple goals here
> > >
> > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > sufficient, it has issue but nothing new
> > > 2) recover from the unresponsive device, the issue for timeout is that
> > > it needs to deal with false positives
> >
> >
> > I agree.
> >
> > But I want to add a new goal, cvq async. In the netdim, we will
> > send many requests via the cvq, so the cvq async will be nice.
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > Thans
> > > > >
> > > > > >
> > > > > > Zhu Yanjun
> > > > > >
> > > > > > >
> > > > > > > Andrew
> > > > > >
> > > > >
> > > >
> > >
> >
>
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 2:55 PM Jason Wang wrote:
>
> On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo wrote:
> >
> > On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang wrote:
> > > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > > wrote:
> > > >
> > > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > > wrote:
> > > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > > wrote:
> > > > > >
> > > > > >
> > > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > > +if (timeout)
> > > > > > > +timeout--;
> > > > > > This is not really a timeout, just a loop counter. 200
> > > > > > iterations could
> > > > > > be a very short time on reasonable H/W. I guess this avoid the
> > > > > > soft
> > > > > > lockup, but possibly (likely?) breaks the functionality when
> > > > > > we need to
> > > > > > loop for some non negligible time.
> > > > > >
> > > > > > I fear we need a more complex solution, as mentioned by
> > > > > > Micheal in the
> > > > > > thread you quoted.
> > > > > > >>> Got it. I also look forward to the more complex solution to
> > > > > > >>> this problem.
> > > > > > >> Can we add a device capability (new feature bit) such as
> > > > > > >> ctrq_wait_timeout
> > > > > > >> to get a reasonable timeout?
> > > > > > > The usual solution to this is include/linux/iopoll.h. If you can
> > > > > > > sleep
> > > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > > >
> > > > > > I read carefully the functions read_poll_timeout() and
> > > > > > read_poll_timeout_atomic(). The timeout is set by the caller of the
> > > > > > 2
> > > > > > functions.
> > > > >
> > > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > > mode setting to workqueue first:
> > > > >
> > > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > > >
> > > > > >
> > > > > > As such, can we add a module parameter to customize this timeout
> > > > > > value
> > > > > > by the user?
> > > > >
> > > > > Who is the "user" here, or how can the "user" know the value?
> > > > >
> > > > > >
> > > > > > Or this timeout value is stored in device register, virtio_net
> > > > > > driver
> > > > > > will read this timeout value at initialization?
> > > > >
> > > > > See another thread. The design needs to be general, or you can post a
> > > > > RFC.
> > > > >
> > > > > In another thought, we've already had a tx watchdog, maybe we can have
> > > > > something similar to cvq and use timeout + reset in that case.
> > > >
> > > > But we may block by the reset ^_^ if the device is broken?
> > >
> > > I mean vq reset here.
> >
> > I see.
> >
> > I mean when the deivce is broken, the vq reset also many be blocked.
> >
> > void vp_modern_set_queue_reset(struct virtio_pci_modern_device
> > *mdev, u16 index)
> > {
> > struct virtio_pci_modern_common_cfg __iomem *cfg;
> >
> > cfg = (struct virtio_pci_modern_common_cfg __iomem
> > *)mdev->common;
> >
> > vp_iowrite16(index, &cfg->cfg.queue_select);
> > vp_iowrite16(1, &cfg->queue_reset);
> >
> > while (vp_ioread16(&cfg->queue_reset))
> > msleep(1);
> >
> > while (vp_ioread16(&cfg->cfg.queue_enable))
> > msleep(1);
> > }
> > EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
> >
> > In this function, for the broken device, we can not expect something.
>
> Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
>
> Thanks
>
> >
> >
> > >
> > > It looks like we have multiple goals here
> > >
> > > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > > sufficient, it has issue but nothing new
> > > 2) recover from the unresponsive device, the issue for timeout is that
> > > it needs to deal with false positives
> >
> >
> > I agree.
> >
> > But I want to add a new goal, cvq async. In the netdim, we will
> > send many requests via the cvq, so the cvq async will be nice.
Then you need an interrupt for cvq.
FYI, I've posted a series that use interrupt for cvq in the past:
https://lore.kernel.org/lkml/[email protected]/t/
Haven't found time in working on this anymore, maybe we can start from
this or not.
Thanks
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > >
> > > > > Thans
> > > > >
> > > > > >
> > > > > > Zhu Yanjun
> > > > > >
> > > > > > >
> > > > > > > Andrew
> > > > > >
> > > > >
> > > >
> > >
> >
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 2:20 PM Xuan Zhuo wrote:
>
> On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang wrote:
> > On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo
> > wrote:
> > >
> > > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang
> > > wrote:
> > > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun
> > > > wrote:
> > > > >
> > > > >
> > > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > > +if (timeout)
> > > > > > +timeout--;
> > > > > This is not really a timeout, just a loop counter. 200
> > > > > iterations could
> > > > > be a very short time on reasonable H/W. I guess this avoid the
> > > > > soft
> > > > > lockup, but possibly (likely?) breaks the functionality when we
> > > > > need to
> > > > > loop for some non negligible time.
> > > > >
> > > > > I fear we need a more complex solution, as mentioned by Micheal
> > > > > in the
> > > > > thread you quoted.
> > > > > >>> Got it. I also look forward to the more complex solution to this
> > > > > >>> problem.
> > > > > >> Can we add a device capability (new feature bit) such as
> > > > > >> ctrq_wait_timeout
> > > > > >> to get a reasonable timeout?
> > > > > > The usual solution to this is include/linux/iopoll.h. If you can
> > > > > > sleep
> > > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > > >
> > > > > I read carefully the functions read_poll_timeout() and
> > > > > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > > > > functions.
> > > >
> > > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > > mode setting to workqueue first:
> > > >
> > > > https://www.mail-archive.com/[email protected]/msg60298.html
> > > >
> > > > >
> > > > > As such, can we add a module parameter to customize this timeout value
> > > > > by the user?
> > > >
> > > > Who is the "user" here, or how can the "user" know the value?
> > > >
> > > > >
> > > > > Or this timeout value is stored in device register, virtio_net driver
> > > > > will read this timeout value at initialization?
> > > >
> > > > See another thread. The design needs to be general, or you can post a
> > > > RFC.
> > > >
> > > > In another thought, we've already had a tx watchdog, maybe we can have
> > > > something similar to cvq and use timeout + reset in that case.
> > >
> > > But we may block by the reset ^_^ if the device is broken?
> >
> > I mean vq reset here.
>
> I see.
>
> I mean when the deivce is broken, the vq reset also many be blocked.
>
> void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev,
> u16 index)
> {
> struct virtio_pci_modern_common_cfg __iomem *cfg;
>
> cfg = (struct virtio_pci_modern_common_cfg __iomem
> *)mdev->common;
>
> vp_iowrite16(index, &cfg->cfg.queue_select);
> vp_iowrite16(1, &cfg->queue_reset);
>
> while (vp_ioread16(&cfg->queue_reset))
> msleep(1);
>
> while (vp_ioread16(&cfg->cfg.queue_enable))
> msleep(1);
> }
> EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
>
> In this function, for the broken device, we can not expect something.
Yes, it's best effort, there's no guarantee then. But it doesn't harm to try.
Thanks
>
>
> >
> > It looks like we have multiple goals here
> >
> > 1) avoid lockups, using workqueue + cond_resched() seems to be
> > sufficient, it has issue but nothing new
> > 2) recover from the unresponsive device, the issue for timeout is that
> > it needs to deal with false positives
>
>
> I agree.
>
> But I want to add a new goal, cvq async. In the netdim, we will
> send many requests via the cvq, so the cvq async will be nice.
>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > > Thans
> > > >
> > > > >
> > > > > Zhu Yanjun
> > > > >
> > > > > >
> > > > > > Andrew
> > > > >
> > > >
> > >
> >
>
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 12:16:27 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo wrote:
> >
> > On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang wrote:
> > > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun wrote:
> > > >
> > > >
> > > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > > - !virtqueue_is_broken(vi->cvq))
> > > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > > +if (timeout)
> > > > > +timeout--;
> > > > This is not really a timeout, just a loop counter. 200 iterations
> > > > could
> > > > be a very short time on reasonable H/W. I guess this avoid the soft
> > > > lockup, but possibly (likely?) breaks the functionality when we
> > > > need to
> > > > loop for some non negligible time.
> > > >
> > > > I fear we need a more complex solution, as mentioned by Micheal in
> > > > the
> > > > thread you quoted.
> > > > >>> Got it. I also look forward to the more complex solution to this
> > > > >>> problem.
> > > > >> Can we add a device capability (new feature bit) such as
> > > > >> ctrq_wait_timeout
> > > > >> to get a reasonable timeout?
> > > > > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > > >
> > > > I read carefully the functions read_poll_timeout() and
> > > > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > > > functions.
> > >
> > > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > > mode setting to workqueue first:
> > >
> > > https://www.mail-archive.com/[email protected]/msg60298.html
> > >
> > > >
> > > > As such, can we add a module parameter to customize this timeout value
> > > > by the user?
> > >
> > > Who is the "user" here, or how can the "user" know the value?
> > >
> > > >
> > > > Or this timeout value is stored in device register, virtio_net driver
> > > > will read this timeout value at initialization?
> > >
> > > See another thread. The design needs to be general, or you can post a RFC.
> > >
> > > In another thought, we've already had a tx watchdog, maybe we can have
> > > something similar to cvq and use timeout + reset in that case.
> >
> > But we may block by the reset ^_^ if the device is broken?
>
> I mean vq reset here.
I see.
I mean when the deivce is broken, the vq reset also many be blocked.
void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev,
u16 index)
{
struct virtio_pci_modern_common_cfg __iomem *cfg;
cfg = (struct virtio_pci_modern_common_cfg __iomem
*)mdev->common;
vp_iowrite16(index, &cfg->cfg.queue_select);
vp_iowrite16(1, &cfg->queue_reset);
while (vp_ioread16(&cfg->queue_reset))
msleep(1);
while (vp_ioread16(&cfg->cfg.queue_enable))
msleep(1);
}
EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
In this function, for the broken device, we can not expect something.
>
> It looks like we have multiple goals here
>
> 1) avoid lockups, using workqueue + cond_resched() seems to be
> sufficient, it has issue but nothing new
> 2) recover from the unresponsive device, the issue for timeout is that
> it needs to deal with false positives
I agree.
But I want to add a new goal, cvq async. In the netdim, we will
send many requests via the cvq, so the cvq async will be nice.
Thanks.
>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thans
> > >
> > > >
> > > > Zhu Yanjun
> > > >
> > > > >
> > > > > Andrew
> > > >
> > >
> >
>
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/22 上午11:08, Jason Wang 写道: On Fri, Jan 19, 2024 at 10:27 PM Heng Qi wrote: 在 2024/1/18 下午8:01, Zhu Yanjun 写道: 在 2024/1/16 20:04, Paolo Abeni 写道: On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: From: Zhu Yanjun Some devices emulate the virtio_net hardwares. When virtio_net driver sends commands to the emulated hardware, normally the hardware needs time to response. Sometimes the time is very long. Thus, the following will appear. Then the whole system will hang. The similar problems also occur in Intel NICs and Mellanox NICs. As such, the similar solution is borrowed from them. A timeout value is added and the timeout value as large as possible is set to ensure that the driver gets the maximum possible response from the hardware. " [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! [(udev-worker):3157] [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath [ 213.796194] irq event stamp: 67740 [ 213.796195] hardirqs last enabled at (67739): [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796203] hardirqs last disabled at (67740): [] sysvec_apic_timer_interrupt+0xe/0x90 [ 213.796208] softirqs last enabled at (67686): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796214] softirqs last disabled at (67681): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not tainted 6.7.0+ #9 [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: 0001 [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: ff2f15095896f000 [ 213.796236] RBP: R08: R09: [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 0002 [ 213.796239] R13: 0004 R14: R15: ff2f1509534f3000 [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() knlGS: [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 00f71ef0 [ 213.796245] DR0: DR1: DR2: [ 213.796246] DR3: DR6: fffe07f0 DR7: 0400 [ 213.796247] PKRU: 5554 [ 213.796249] Call Trace: [ 213.796250] [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 [ 213.796282] [ 213.796284] [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] [ 213.796328] virtio_dev_probe+0x195/0x230 [ 213.796333] really_probe+0x19f/0x400 [ 213.796338] ? __pfx___driver_attach+0x10/0x10 [ 213.796340] __driver_probe_device+0x78/0x160 [ 213.796343] driver_probe_device+0x1f/0x90 [ 213.796346] __driver_attach+0xd6/0x1d0 [ 213.796349] bus_for_each_dev+0x8c/0xe0 [ 213.796355] bus_add_driver+0x119/0x220 [ 213.796359] driver_register+0x59/0x100 [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] [ 213.796375] do_one_initcall+0x6f/0x380 [ 213.796384] do_init_module+0x60/0x240 [ 213.796388] init_module_from_file+0x86/0xc0 [ 213.796396] idempotent_init_module+0x129/0x2c0 [ 213.796406] __x64_sys_finit_module+0x5e/0xb0 [ 213.796409] do_sy
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 12:00 PM Xuan Zhuo wrote:
>
> On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang wrote:
> > On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun wrote:
> > >
> > >
> > > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > - !virtqueue_is_broken(vi->cvq))
> > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > +if (timeout)
> > > > +timeout--;
> > > This is not really a timeout, just a loop counter. 200 iterations
> > > could
> > > be a very short time on reasonable H/W. I guess this avoid the soft
> > > lockup, but possibly (likely?) breaks the functionality when we need
> > > to
> > > loop for some non negligible time.
> > >
> > > I fear we need a more complex solution, as mentioned by Micheal in
> > > the
> > > thread you quoted.
> > > >>> Got it. I also look forward to the more complex solution to this
> > > >>> problem.
> > > >> Can we add a device capability (new feature bit) such as
> > > >> ctrq_wait_timeout
> > > >> to get a reasonable timeout?
> > > > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> > >
> > > I read carefully the functions read_poll_timeout() and
> > > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > > functions.
> >
> > FYI, in order to avoid a swtich of atomic or not, we need convert rx
> > mode setting to workqueue first:
> >
> > https://www.mail-archive.com/[email protected]/msg60298.html
> >
> > >
> > > As such, can we add a module parameter to customize this timeout value
> > > by the user?
> >
> > Who is the "user" here, or how can the "user" know the value?
> >
> > >
> > > Or this timeout value is stored in device register, virtio_net driver
> > > will read this timeout value at initialization?
> >
> > See another thread. The design needs to be general, or you can post a RFC.
> >
> > In another thought, we've already had a tx watchdog, maybe we can have
> > something similar to cvq and use timeout + reset in that case.
>
> But we may block by the reset ^_^ if the device is broken?
I mean vq reset here.
It looks like we have multiple goals here
1) avoid lockups, using workqueue + cond_resched() seems to be
sufficient, it has issue but nothing new
2) recover from the unresponsive device, the issue for timeout is that
it needs to deal with false positives
Thanks
>
> Thanks.
>
>
> >
> > Thans
> >
> > >
> > > Zhu Yanjun
> > >
> > > >
> > > > Andrew
> > >
> >
>
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 22 Jan 2024 11:14:30 +0800, Jason Wang wrote:
> On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun wrote:
> >
> >
> > 在 2024/1/20 1:29, Andrew Lunn 写道:
> > >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > - !virtqueue_is_broken(vi->cvq))
> > > + !virtqueue_is_broken(vi->cvq)) {
> > > +if (timeout)
> > > +timeout--;
> > This is not really a timeout, just a loop counter. 200 iterations could
> > be a very short time on reasonable H/W. I guess this avoid the soft
> > lockup, but possibly (likely?) breaks the functionality when we need to
> > loop for some non negligible time.
> >
> > I fear we need a more complex solution, as mentioned by Micheal in the
> > thread you quoted.
> > >>> Got it. I also look forward to the more complex solution to this
> > >>> problem.
> > >> Can we add a device capability (new feature bit) such as
> > >> ctrq_wait_timeout
> > >> to get a reasonable timeout?
> > > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > > read_poll_timeout() otherwise read_poll_timeout_atomic().
> >
> > I read carefully the functions read_poll_timeout() and
> > read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> > functions.
>
> FYI, in order to avoid a swtich of atomic or not, we need convert rx
> mode setting to workqueue first:
>
> https://www.mail-archive.com/[email protected]/msg60298.html
>
> >
> > As such, can we add a module parameter to customize this timeout value
> > by the user?
>
> Who is the "user" here, or how can the "user" know the value?
>
> >
> > Or this timeout value is stored in device register, virtio_net driver
> > will read this timeout value at initialization?
>
> See another thread. The design needs to be general, or you can post a RFC.
>
> In another thought, we've already had a tx watchdog, maybe we can have
> something similar to cvq and use timeout + reset in that case.
But we may block by the reset ^_^ if the device is broken?
Thanks.
>
> Thans
>
> >
> > Zhu Yanjun
> >
> > >
> > > Andrew
> >
>
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 22, 2024 at 10:12 AM Zhu Yanjun wrote:
>
>
> 在 2024/1/20 1:29, Andrew Lunn 写道:
> >while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > - !virtqueue_is_broken(vi->cvq))
> > + !virtqueue_is_broken(vi->cvq)) {
> > +if (timeout)
> > +timeout--;
> This is not really a timeout, just a loop counter. 200 iterations could
> be a very short time on reasonable H/W. I guess this avoid the soft
> lockup, but possibly (likely?) breaks the functionality when we need to
> loop for some non negligible time.
>
> I fear we need a more complex solution, as mentioned by Micheal in the
> thread you quoted.
> >>> Got it. I also look forward to the more complex solution to this problem.
> >> Can we add a device capability (new feature bit) such as ctrq_wait_timeout
> >> to get a reasonable timeout?
> > The usual solution to this is include/linux/iopoll.h. If you can sleep
> > read_poll_timeout() otherwise read_poll_timeout_atomic().
>
> I read carefully the functions read_poll_timeout() and
> read_poll_timeout_atomic(). The timeout is set by the caller of the 2
> functions.
FYI, in order to avoid a swtich of atomic or not, we need convert rx
mode setting to workqueue first:
https://www.mail-archive.com/[email protected]/msg60298.html
>
> As such, can we add a module parameter to customize this timeout value
> by the user?
Who is the "user" here, or how can the "user" know the value?
>
> Or this timeout value is stored in device register, virtio_net driver
> will read this timeout value at initialization?
See another thread. The design needs to be general, or you can post a RFC.
In another thought, we've already had a tx watchdog, maybe we can have
something similar to cvq and use timeout + reset in that case.
Thans
>
> Zhu Yanjun
>
> >
> > Andrew
>
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Fri, Jan 19, 2024 at 10:27 PM Heng Qi wrote: > > > > 在 2024/1/18 下午8:01, Zhu Yanjun 写道: > > > > 在 2024/1/16 20:04, Paolo Abeni 写道: > >> On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: > >>> From: Zhu Yanjun > >>> > >>> Some devices emulate the virtio_net hardwares. When virtio_net > >>> driver sends commands to the emulated hardware, normally the > >>> hardware needs time to response. Sometimes the time is very > >>> long. Thus, the following will appear. Then the whole system > >>> will hang. > >>> The similar problems also occur in Intel NICs and Mellanox NICs. > >>> As such, the similar solution is borrowed from them. A timeout > >>> value is added and the timeout value as large as possible is set > >>> to ensure that the driver gets the maximum possible response from > >>> the hardware. > >>> > >>> " > >>> [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! > >>> [(udev-worker):3157] > >>> [ 213.796114] Modules linked in: virtio_net(+) net_failover > >>> failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common > >>> intel_uncore_frequency intel_uncore_frequency_common intel_ifs > >>> i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp > >>> coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support > >>> vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev > >>> intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci > >>> isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus > >>> i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf > >>> ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update > >>> fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel > >>> polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 > >>> bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi > >>> pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath > >>> [ 213.796194] irq event stamp: 67740 > >>> [ 213.796195] hardirqs last enabled at (67739): > >>> [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 > >>> [ 213.796203] hardirqs last disabled at (67740): > >>> [] sysvec_apic_timer_interrupt+0xe/0x90 > >>> [ 213.796208] softirqs last enabled at (67686): > >>> [] __irq_exit_rcu+0xbe/0xe0 > >>> [ 213.796214] softirqs last disabled at (67681): > >>> [] __irq_exit_rcu+0xbe/0xe0 > >>> [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded > >>> Not tainted 6.7.0+ #9 > >>> [ 213.796220] Hardware name: Intel Corporation > >>> M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 > >>> 11/14/2022 > >>> [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > >>> [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 > >>> 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 > >>> 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 > >>> 89 e8 5b > >>> [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 > >>> [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: > >>> 0001 > >>> [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: > >>> ff2f15095896f000 > >>> [ 213.796236] RBP: R08: R09: > >>> > >>> [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: > >>> 0002 > >>> [ 213.796239] R13: 0004 R14: R15: > >>> ff2f1509534f3000 > >>> [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() > >>> knlGS: > >>> [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 > >>> [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: > >>> 00f71ef0 > >>> [ 213.796245] DR0: DR1: DR2: > >>> > >>> [ 213.796246] DR3: DR6: fffe07f0 DR7: > >>> 0400 > >>> [ 213.796247] PKRU: 5554 > >>> [ 213.796249] Call Trace: > >>> [ 213.796250] > >>> [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > >>> [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > >>> [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > >>> [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > >>> [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > >>> [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > >>> [ 213.796282] > >>> [ 213.796284] > >>> [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > >>> [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > >>> [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > >>> [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > >>> [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > >>> [ 213.796328] virtio_dev_probe+0x195/0x230 > >>> [ 213.796333] really_probe+0x19f/0x400 > >>> [ 213.796338] ? __pfx___driver_attach+0x10/0x10 > >>> [ 213.796340] __driver_probe_device+0x78/0x160 > >>> [ 213.796343] driver_probe_device+0x1f/0x90 > >>> [ 213.796346] __driver_attach+0xd6/0x1d
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 15, 2024 at 6:22 PM Zhu Yanjun wrote: > > > 在 2024/1/15 10:20, Jason Wang 写道: > > On Mon, Jan 15, 2024 at 9:35 AM Zhu Yanjun wrote: > > From: Zhu Yanjun > > Some devices emulate the virtio_net hardwares. When virtio_net > driver sends commands to the emulated hardware, normally the > hardware needs time to response. Sometimes the time is very > long. Thus, the following will appear. Then the whole system > will hang. > The similar problems also occur in Intel NICs and Mellanox NICs. > As such, the similar solution is borrowed from them. A timeout > value is added and the timeout value as large as possible is set > to ensure that the driver gets the maximum possible response from > the hardware. > > " > [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! > [(udev-worker):3157] > [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr > rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency > intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm > x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt > dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry > pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me > isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec > idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf > ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop > zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni > polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 > sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg > scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath > [ 213.796194] irq event stamp: 67740 > [ 213.796195] hardirqs last enabled at (67739): [] > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 213.796203] hardirqs last disabled at (67740): [] > sysvec_apic_timer_interrupt+0xe/0x90 > [ 213.796208] softirqs last enabled at (67686): [] > __irq_exit_rcu+0xbe/0xe0 > [ 213.796214] softirqs last disabled at (67681): [] > __irq_exit_rcu+0xbe/0xe0 > [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not > tainted 6.7.0+ #9 > [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, > BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 > [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 > 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 > e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b > [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 > [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: > 0001 > [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: > ff2f15095896f000 > [ 213.796236] RBP: R08: R09: > > [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: > 0002 > [ 213.796239] R13: 0004 R14: R15: > ff2f1509534f3000 > [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() > knlGS: > [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 > [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: > 00f71ef0 > [ 213.796245] DR0: DR1: DR2: > > [ 213.796246] DR3: DR6: fffe07f0 DR7: > 0400 > [ 213.796247] PKRU: 5554 > [ 213.796249] Call Trace: > [ 213.796250] > [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > [ 213.796282] > [ 213.796284] > [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > [ 213.796328] virtio_dev_probe+0x195/0x230 > [ 213.796333] really_probe+0x19f/0x400 > [ 213.796338] ? __pfx___driver_attach+0x10/0x10 > [ 213.796340] __driver_probe_device+0x78/0x160 > [ 213.796343] driver_probe_device+0x1f/0x90 > [ 213.796346] __driver_attach+0xd6/0x1d0 > [ 213.796349] bus_for_each_dev+0x8c/0xe0 > [ 213.796355] bus_add_driver+0x119/0x220 > [ 213.796359] driver_register+0x59/0x100 > [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] > [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] > [ 213.796375] do_one_initcall+0x6f/0x380 > [ 213.796384] do_init_module+0x60/0x240 > [ 213.796388] init
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/20 1:29, Andrew Lunn 写道:
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
- !virtqueue_is_broken(vi->cvq))
+ !virtqueue_is_broken(vi->cvq)) {
+ if (timeout)
+ timeout--;
This is not really a timeout, just a loop counter. 200 iterations could
be a very short time on reasonable H/W. I guess this avoid the soft
lockup, but possibly (likely?) breaks the functionality when we need to
loop for some non negligible time.
I fear we need a more complex solution, as mentioned by Micheal in the
thread you quoted.
Got it. I also look forward to the more complex solution to this problem.
Can we add a device capability (new feature bit) such as ctrq_wait_timeout
to get a reasonable timeout?
The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().
I read carefully the functions read_poll_timeout() and
read_poll_timeout_atomic(). The timeout is set by the caller of the 2
functions.
As such, can we add a module parameter to customize this timeout value
by the user?
Or this timeout value is stored in device register, virtio_net driver
will read this timeout value at initialization?
Zhu Yanjun
Andrew
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/20 1:29, Andrew Lunn 写道:
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
- !virtqueue_is_broken(vi->cvq))
+ !virtqueue_is_broken(vi->cvq)) {
+ if (timeout)
+ timeout--;
This is not really a timeout, just a loop counter. 200 iterations could
be a very short time on reasonable H/W. I guess this avoid the soft
lockup, but possibly (likely?) breaks the functionality when we need to
loop for some non negligible time.
I fear we need a more complex solution, as mentioned by Micheal in the
thread you quoted.
Got it. I also look forward to the more complex solution to this problem.
Can we add a device capability (new feature bit) such as ctrq_wait_timeout
to get a reasonable timeout?
The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().
Thanks. The 2 functions read_poll_timeout() and
read_poll_timeout_atomic() are interesting.
Zhu Yanjun
Andrew
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
> > > > while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > > - !virtqueue_is_broken(vi->cvq))
> > > > + !virtqueue_is_broken(vi->cvq)) {
> > > > + if (timeout)
> > > > + timeout--;
> > > This is not really a timeout, just a loop counter. 200 iterations could
> > > be a very short time on reasonable H/W. I guess this avoid the soft
> > > lockup, but possibly (likely?) breaks the functionality when we need to
> > > loop for some non negligible time.
> > >
> > > I fear we need a more complex solution, as mentioned by Micheal in the
> > > thread you quoted.
> >
> > Got it. I also look forward to the more complex solution to this problem.
>
> Can we add a device capability (new feature bit) such as ctrq_wait_timeout
> to get a reasonable timeout?
The usual solution to this is include/linux/iopoll.h. If you can sleep
read_poll_timeout() otherwise read_poll_timeout_atomic().
Andrew
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/18 下午8:01, Zhu Yanjun 写道: 在 2024/1/16 20:04, Paolo Abeni 写道: On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: From: Zhu Yanjun Some devices emulate the virtio_net hardwares. When virtio_net driver sends commands to the emulated hardware, normally the hardware needs time to response. Sometimes the time is very long. Thus, the following will appear. Then the whole system will hang. The similar problems also occur in Intel NICs and Mellanox NICs. As such, the similar solution is borrowed from them. A timeout value is added and the timeout value as large as possible is set to ensure that the driver gets the maximum possible response from the hardware. " [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! [(udev-worker):3157] [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath [ 213.796194] irq event stamp: 67740 [ 213.796195] hardirqs last enabled at (67739): [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796203] hardirqs last disabled at (67740): [] sysvec_apic_timer_interrupt+0xe/0x90 [ 213.796208] softirqs last enabled at (67686): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796214] softirqs last disabled at (67681): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not tainted 6.7.0+ #9 [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: 0001 [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: ff2f15095896f000 [ 213.796236] RBP: R08: R09: [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 0002 [ 213.796239] R13: 0004 R14: R15: ff2f1509534f3000 [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() knlGS: [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 00f71ef0 [ 213.796245] DR0: DR1: DR2: [ 213.796246] DR3: DR6: fffe07f0 DR7: 0400 [ 213.796247] PKRU: 5554 [ 213.796249] Call Trace: [ 213.796250] [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 [ 213.796282] [ 213.796284] [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] [ 213.796328] virtio_dev_probe+0x195/0x230 [ 213.796333] really_probe+0x19f/0x400 [ 213.796338] ? __pfx___driver_attach+0x10/0x10 [ 213.796340] __driver_probe_device+0x78/0x160 [ 213.796343] driver_probe_device+0x1f/0x90 [ 213.796346] __driver_attach+0xd6/0x1d0 [ 213.796349] bus_for_each_dev+0x8c/0xe0 [ 213.796355] bus_add_driver+0x119/0x220 [ 213.796359] driver_register+0x59/0x100 [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] [ 213.796375] do_one_initcall+0x6f/0x380 [ 213.796384] do_init_module+0x60/0x240 [ 213.796388] init_module_from_file+0x86/0xc0 [ 213.796396] idempotent_init_module+0x129/0x2c0 [ 213.796406] __x64_sys_finit_module+0x5e/0xb0 [ 213.796409] do_syscall_64+0x60/0xe0 [ 213.796415] ? do_syscall_64
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Thu, 18 Jan 2024 08:14:21 -0500, "Michael S. Tsirkin" wrote: > On Tue, Jan 16, 2024 at 01:04:49PM +0100, Paolo Abeni wrote: > > On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: > > > From: Zhu Yanjun > > > > > > Some devices emulate the virtio_net hardwares. When virtio_net > > > driver sends commands to the emulated hardware, normally the > > > hardware needs time to response. Sometimes the time is very > > > long. Thus, the following will appear. Then the whole system > > > will hang. > > > The similar problems also occur in Intel NICs and Mellanox NICs. > > > As such, the similar solution is borrowed from them. A timeout > > > value is added and the timeout value as large as possible is set > > > to ensure that the driver gets the maximum possible response from > > > the hardware. > > > > > > " > > > [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! > > > [(udev-worker):3157] > > > [ 213.796114] Modules linked in: virtio_net(+) net_failover failover > > > qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common > > > intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac > > > nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt > > > rapl intel_pmc_bxt dax_hmem iTCO_vendor_support vfat cxl_acpi > > > intel_cstate pmt_telemetry pmt_class intel_sdsi joydev intel_uncore > > > cxl_core fat pcspkr mei_me isst_if_mbox_pci isst_if_mmio idxd i2c_i801 > > > isst_if_common mei intel_vsec idxd_bus i2c_smbus i2c_ismt ipmi_ssif > > > acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter > > > pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul crc32_pclmul > > > crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel > > > sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core > > > i2c_algo_bit wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua > > > dm_multipath > > > [ 213.796194] irq event stamp: 67740 > > > [ 213.796195] hardirqs last enabled at (67739): [] > > > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > > > [ 213.796203] hardirqs last disabled at (67740): [] > > > sysvec_apic_timer_interrupt+0xe/0x90 > > > [ 213.796208] softirqs last enabled at (67686): [] > > > __irq_exit_rcu+0xbe/0xe0 > > > [ 213.796214] softirqs last disabled at (67681): [] > > > __irq_exit_rcu+0xbe/0xe0 > > > [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not > > > tainted 6.7.0+ #9 > > > [ 213.796220] Hardware name: Intel Corporation > > > M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 > > > 11/14/2022 > > > [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > > > [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 > > > 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 > > > 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b > > > [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 > > > [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: > > > 0001 > > > [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: > > > ff2f15095896f000 > > > [ 213.796236] RBP: R08: R09: > > > > > > [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: > > > 0002 > > > [ 213.796239] R13: 0004 R14: R15: > > > ff2f1509534f3000 > > > [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() > > > knlGS: > > > [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 > > > [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: > > > 00f71ef0 > > > [ 213.796245] DR0: DR1: DR2: > > > > > > [ 213.796246] DR3: DR6: fffe07f0 DR7: > > > 0400 > > > [ 213.796247] PKRU: 5554 > > > [ 213.796249] Call Trace: > > > [ 213.796250] > > > [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > > > [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > > > [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > > > [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > > > [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > > > [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > > > [ 213.796282] > > > [ 213.796284] > > > [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > > > [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > > > [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > > > [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > > > [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > > > [ 213.796328] virtio_dev_probe+0x195/0x230 > > > [ 213.796333] really_probe+0x19f/0x400 > > > [ 213.796338] ? __pfx___driver_attach+0x10/0x10 > > > [ 213.796340] __driver_probe_device+0x78/0x160 > > > [ 213.796343] driver_probe_device+0x1f/0x90 > > > [ 213.796346] __driv
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Tue, Jan 16, 2024 at 01:04:49PM +0100, Paolo Abeni wrote: > On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: > > From: Zhu Yanjun > > > > Some devices emulate the virtio_net hardwares. When virtio_net > > driver sends commands to the emulated hardware, normally the > > hardware needs time to response. Sometimes the time is very > > long. Thus, the following will appear. Then the whole system > > will hang. > > The similar problems also occur in Intel NICs and Mellanox NICs. > > As such, the similar solution is borrowed from them. A timeout > > value is added and the timeout value as large as possible is set > > to ensure that the driver gets the maximum possible response from > > the hardware. > > > > " > > [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! > > [(udev-worker):3157] > > [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr > > rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency > > intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm > > x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt > > dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry > > pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me > > isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec > > idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf > > ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse > > loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni > > polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 > > sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg > > scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath > > [ 213.796194] irq event stamp: 67740 > > [ 213.796195] hardirqs last enabled at (67739): [] > > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > > [ 213.796203] hardirqs last disabled at (67740): [] > > sysvec_apic_timer_interrupt+0xe/0x90 > > [ 213.796208] softirqs last enabled at (67686): [] > > __irq_exit_rcu+0xbe/0xe0 > > [ 213.796214] softirqs last disabled at (67681): [] > > __irq_exit_rcu+0xbe/0xe0 > > [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not > > tainted 6.7.0+ #9 > > [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, > > BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 > > [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > > [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 > > f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 > > <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b > > [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 > > [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: > > 0001 > > [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: > > ff2f15095896f000 > > [ 213.796236] RBP: R08: R09: > > > > [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: > > 0002 > > [ 213.796239] R13: 0004 R14: R15: > > ff2f1509534f3000 > > [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() > > knlGS: > > [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 > > [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: > > 00f71ef0 > > [ 213.796245] DR0: DR1: DR2: > > > > [ 213.796246] DR3: DR6: fffe07f0 DR7: > > 0400 > > [ 213.796247] PKRU: 5554 > > [ 213.796249] Call Trace: > > [ 213.796250] > > [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > > [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > > [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > > [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > > [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > > [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > > [ 213.796282] > > [ 213.796284] > > [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > > [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > > [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > > [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > > [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > > [ 213.796328] virtio_dev_probe+0x195/0x230 > > [ 213.796333] really_probe+0x19f/0x400 > > [ 213.796338] ? __pfx___driver_attach+0x10/0x10 > > [ 213.796340] __driver_probe_device+0x78/0x160 > > [ 213.796343] driver_probe_device+0x1f/0x90 > > [ 213.796346] __driver_attach+0xd6/0x1d0 > > [ 213.796349] bus_for_each_dev+0x8c/0xe0 > > [ 213.796355] bus_add_driver+0x119/0x220 > > [ 213.796359] driver_register+0x59/0x100 > > [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] > > [ 213.796369] vi
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/16 20:04, Paolo Abeni 写道: On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: From: Zhu Yanjun Some devices emulate the virtio_net hardwares. When virtio_net driver sends commands to the emulated hardware, normally the hardware needs time to response. Sometimes the time is very long. Thus, the following will appear. Then the whole system will hang. The similar problems also occur in Intel NICs and Mellanox NICs. As such, the similar solution is borrowed from them. A timeout value is added and the timeout value as large as possible is set to ensure that the driver gets the maximum possible response from the hardware. " [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! [(udev-worker):3157] [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath [ 213.796194] irq event stamp: 67740 [ 213.796195] hardirqs last enabled at (67739): [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796203] hardirqs last disabled at (67740): [] sysvec_apic_timer_interrupt+0xe/0x90 [ 213.796208] softirqs last enabled at (67686): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796214] softirqs last disabled at (67681): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not tainted 6.7.0+ #9 [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: 0001 [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: ff2f15095896f000 [ 213.796236] RBP: R08: R09: [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 0002 [ 213.796239] R13: 0004 R14: R15: ff2f1509534f3000 [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() knlGS: [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 00f71ef0 [ 213.796245] DR0: DR1: DR2: [ 213.796246] DR3: DR6: fffe07f0 DR7: 0400 [ 213.796247] PKRU: 5554 [ 213.796249] Call Trace: [ 213.796250] [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 [ 213.796282] [ 213.796284] [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] [ 213.796328] virtio_dev_probe+0x195/0x230 [ 213.796333] really_probe+0x19f/0x400 [ 213.796338] ? __pfx___driver_attach+0x10/0x10 [ 213.796340] __driver_probe_device+0x78/0x160 [ 213.796343] driver_probe_device+0x1f/0x90 [ 213.796346] __driver_attach+0xd6/0x1d0 [ 213.796349] bus_for_each_dev+0x8c/0xe0 [ 213.796355] bus_add_driver+0x119/0x220 [ 213.796359] driver_register+0x59/0x100 [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] [ 213.796375] do_one_initcall+0x6f/0x380 [ 213.796384] do_init_module+0x60/0x240 [ 213.796388] init_module_from_file+0x86/0xc0 [ 213.796396] idempotent_init_module+0x129/0x2c0 [ 213.796406] __x64_sys_finit_module+0x5e/0xb0 [ 213.796409] do_syscall_64+0x60/0xe0 [ 213.796415] ? do_syscall_64+0x6f/0xe0 [ 213.796418] ? lockdep_hardirqs_on_prepar
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote: > From: Zhu Yanjun > > Some devices emulate the virtio_net hardwares. When virtio_net > driver sends commands to the emulated hardware, normally the > hardware needs time to response. Sometimes the time is very > long. Thus, the following will appear. Then the whole system > will hang. > The similar problems also occur in Intel NICs and Mellanox NICs. > As such, the similar solution is borrowed from them. A timeout > value is added and the timeout value as large as possible is set > to ensure that the driver gets the maximum possible response from > the hardware. > > " > [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! > [(udev-worker):3157] > [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr > rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency > intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm > x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt > dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry > pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me > isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec > idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf > ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop > zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni > polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 > sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg > scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath > [ 213.796194] irq event stamp: 67740 > [ 213.796195] hardirqs last enabled at (67739): [] > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 213.796203] hardirqs last disabled at (67740): [] > sysvec_apic_timer_interrupt+0xe/0x90 > [ 213.796208] softirqs last enabled at (67686): [] > __irq_exit_rcu+0xbe/0xe0 > [ 213.796214] softirqs last disabled at (67681): [] > __irq_exit_rcu+0xbe/0xe0 > [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not > tainted 6.7.0+ #9 > [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, > BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 > [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 > 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 > e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b > [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 > [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: > 0001 > [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: > ff2f15095896f000 > [ 213.796236] RBP: R08: R09: > > [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: > 0002 > [ 213.796239] R13: 0004 R14: R15: > ff2f1509534f3000 > [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() > knlGS: > [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 > [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: > 00f71ef0 > [ 213.796245] DR0: DR1: DR2: > > [ 213.796246] DR3: DR6: fffe07f0 DR7: > 0400 > [ 213.796247] PKRU: 5554 > [ 213.796249] Call Trace: > [ 213.796250] > [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > [ 213.796282] > [ 213.796284] > [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > [ 213.796328] virtio_dev_probe+0x195/0x230 > [ 213.796333] really_probe+0x19f/0x400 > [ 213.796338] ? __pfx___driver_attach+0x10/0x10 > [ 213.796340] __driver_probe_device+0x78/0x160 > [ 213.796343] driver_probe_device+0x1f/0x90 > [ 213.796346] __driver_attach+0xd6/0x1d0 > [ 213.796349] bus_for_each_dev+0x8c/0xe0 > [ 213.796355] bus_add_driver+0x119/0x220 > [ 213.796359] driver_register+0x59/0x100 > [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] > [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] > [ 213.796375] do_one_initcall+0x6f/0x380 > [ 213.796384] do_init_module+0x60/0x240 > [ 213.796388] init_module_from_file+0x86/0xc0 > [ 213.796396] idempotent_init_module+0x129/0x2c0 > [ 213.796
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
在 2024/1/15 10:20, Jason Wang 写道: On Mon, Jan 15, 2024 at 9:35 AM Zhu Yanjun wrote: From: Zhu Yanjun Some devices emulate the virtio_net hardwares. When virtio_net driver sends commands to the emulated hardware, normally the hardware needs time to response. Sometimes the time is very long. Thus, the following will appear. Then the whole system will hang. The similar problems also occur in Intel NICs and Mellanox NICs. As such, the similar solution is borrowed from them. A timeout value is added and the timeout value as large as possible is set to ensure that the driver gets the maximum possible response from the hardware. " [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! [(udev-worker):3157] [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath [ 213.796194] irq event stamp: 67740 [ 213.796195] hardirqs last enabled at (67739): [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796203] hardirqs last disabled at (67740): [] sysvec_apic_timer_interrupt+0xe/0x90 [ 213.796208] softirqs last enabled at (67686): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796214] softirqs last disabled at (67681): [] __irq_exit_rcu+0xbe/0xe0 [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not tainted 6.7.0+ #9 [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: 0001 [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: ff2f15095896f000 [ 213.796236] RBP: R08: R09: [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: 0002 [ 213.796239] R13: 0004 R14: R15: ff2f1509534f3000 [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() knlGS: [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: 00f71ef0 [ 213.796245] DR0: DR1: DR2: [ 213.796246] DR3: DR6: fffe07f0 DR7: 0400 [ 213.796247] PKRU: 5554 [ 213.796249] Call Trace: [ 213.796250] [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 [ 213.796282] [ 213.796284] [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] [ 213.796328] virtio_dev_probe+0x195/0x230 [ 213.796333] really_probe+0x19f/0x400 [ 213.796338] ? __pfx___driver_attach+0x10/0x10 [ 213.796340] __driver_probe_device+0x78/0x160 [ 213.796343] driver_probe_device+0x1f/0x90 [ 213.796346] __driver_attach+0xd6/0x1d0 [ 213.796349] bus_for_each_dev+0x8c/0xe0 [ 213.796355] bus_add_driver+0x119/0x220 [ 213.796359] driver_register+0x59/0x100 [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] [ 213.796375] do_one_initcall+0x6f/0x380 [ 213.796384] do_init_module+0x60/0x240 [ 213.796388] init_module_from_file+0x86/0xc0 [ 213.796396] idempotent_init_module+0x129/0x2c0 [ 213.796406] __x64_sys_finit_module+0x5e/0xb0 [ 213.796409] do_syscall_64+0x60/0xe0 [ 213.796415] ? do_syscall_64+0x6f/0xe0 [ 213.796418] ? lockdep_hardirqs_on_prepare+0
Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
On Mon, Jan 15, 2024 at 9:35 AM Zhu Yanjun wrote: > > From: Zhu Yanjun > > Some devices emulate the virtio_net hardwares. When virtio_net > driver sends commands to the emulated hardware, normally the > hardware needs time to response. Sometimes the time is very > long. Thus, the following will appear. Then the whole system > will hang. > The similar problems also occur in Intel NICs and Mellanox NICs. > As such, the similar solution is borrowed from them. A timeout > value is added and the timeout value as large as possible is set > to ensure that the driver gets the maximum possible response from > the hardware. > > " > [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s! > [(udev-worker):3157] > [ 213.796114] Modules linked in: virtio_net(+) net_failover failover qrtr > rfkill sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency > intel_uncore_frequency_common intel_ifs i10nm_edac nfit libnvdimm > x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt rapl intel_pmc_bxt > dax_hmem iTCO_vendor_support vfat cxl_acpi intel_cstate pmt_telemetry > pmt_class intel_sdsi joydev intel_uncore cxl_core fat pcspkr mei_me > isst_if_mbox_pci isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec > idxd_bus i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf > ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update fuse loop > zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni > polyval_generic ghash_clmulni_intel sha512_ssse3 bnxt_en sha256_ssse3 > sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi pinctrl_emmitsburg > scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath > [ 213.796194] irq event stamp: 67740 > [ 213.796195] hardirqs last enabled at (67739): [] > asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 213.796203] hardirqs last disabled at (67740): [] > sysvec_apic_timer_interrupt+0xe/0x90 > [ 213.796208] softirqs last enabled at (67686): [] > __irq_exit_rcu+0xbe/0xe0 > [ 213.796214] softirqs last disabled at (67681): [] > __irq_exit_rcu+0xbe/0xe0 > [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded Not > tainted 6.7.0+ #9 > [ 213.796220] Hardware name: Intel Corporation M50FCP2SBSTD/M50FCP2SBSTD, > BIOS SE5C741.86B.01.01.0001.2211140926 11/14/2022 > [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 43 50 f6 > 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 44 51 04 <48> 89 > e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 89 e8 5b > [ 213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 0246 > [ 213.796233] RAX: RBX: ff2f15095896f000 RCX: > 0001 > [ 213.796235] RDX: RSI: ff4bbb362306f9cc RDI: > ff2f15095896f000 > [ 213.796236] RBP: R08: R09: > > [ 213.796237] R10: 0003 R11: ff2f15095893cc40 R12: > 0002 > [ 213.796239] R13: 0004 R14: R15: > ff2f1509534f3000 > [ 213.796240] FS: 7f775847d0c0() GS:ff2f1528bac0() > knlGS: > [ 213.796242] CS: 0010 DS: ES: CR0: 80050033 > [ 213.796243] CR2: 557f987b6e70 CR3: 002098602006 CR4: > 00f71ef0 > [ 213.796245] DR0: DR1: DR2: > > [ 213.796246] DR3: DR6: fffe07f0 DR7: > 0400 > [ 213.796247] PKRU: 5554 > [ 213.796249] Call Trace: > [ 213.796250] > [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > [ 213.796282] > [ 213.796284] > [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > [ 213.796328] virtio_dev_probe+0x195/0x230 > [ 213.796333] really_probe+0x19f/0x400 > [ 213.796338] ? __pfx___driver_attach+0x10/0x10 > [ 213.796340] __driver_probe_device+0x78/0x160 > [ 213.796343] driver_probe_device+0x1f/0x90 > [ 213.796346] __driver_attach+0xd6/0x1d0 > [ 213.796349] bus_for_each_dev+0x8c/0xe0 > [ 213.796355] bus_add_driver+0x119/0x220 > [ 213.796359] driver_register+0x59/0x100 > [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] > [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] > [ 213.796375] do_one_initcall+0x6f/0x380 > [ 213.796384] do_init_module+0x60/0x240 > [ 213.796388] init_module_from_file+0x86/0xc0 > [ 213.796396] idempotent_init_module+0x129/0x2c0 > [ 213.79640
