On Fri, 19 Apr 2024 at 20:15, Haiyang Zhang <[email protected]> wrote:
>
>
>
> > -----Original Message-----
> > From: Francois <[email protected]>
> > Sent: Sunday, April 14, 2024 1:15 PM
> > To: [email protected]
> > Subject: Nic flaps for 1 minute when reconnecting
> >
> > (Nic Connected) sent to Nic
> > 40DBAAF6-D408-452F-BC2E-B76AAF065732--B670C9DF-AB50-49C4-...
> > 24/12/2023 09:39:33 220 Information Status change
> > (Nic Connected) sent to Nic
> > 40DBAAF6-D408-452F-BC2E-B76AAF065732--B670C9DF-AB50-49C4-...
> > 24/12/2023 09:39:33 220 Information Status change
> > (Nic Disconnected) sent to Nic
> > 40DBAAF6-D408-452F-BC2E-B76AAF065732--B670C9DF-AB50-49...
> > 24/12/2023 09:39:33 220 Information Status change
> > (Nic Disconnected) sent to Nic
> > 40DBAAF6-D408-452F-BC2E-B76AAF065732--B670C9DF-AB50-49...
> > Thanks!
> > Francois
>
> The 2 seconds delay is necessary for the upper layers, like link_watch
> infrastructure, and userspace to handle the status change properly.
>
Hi, thanks for your response!
I understand the need to split a "change" event into 2 separate events, but I
don't really understand why there needs to be a 2 second delay between
them. Surely other network drivers do not artificially add that delay?
In my case a lot of events are received (instead of a single
disconnect/reconnect) and they are all queued and processed sequentially, so
in practice the VM is not usable for a minute or so. It happens "by
surprise"; I have no idea what is causing this.
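To put rough numbers on this, here is a back-of-the-envelope sketch (plain
userspace C, not driver code). It assumes one queued event is handled per
2 second interval, as described above; the helper names and the depth of 30
are made up for illustration, chosen only because 30 events would match the
"minute or so" I am seeing.

```c
#include <stdio.h>

/* Assumption: one queued event is handled per delay interval. */
#define LINKCHANGE_DELAY_SEC 2u

/* Hypothetical helper: rough time needed to drain `queued` events. */
static unsigned int drain_time_sec(unsigned int queued, unsigned int delay_sec)
{
	return queued * delay_sec;
}

/* Hypothetical helper: print the estimate for a given queue depth. */
static void report_drain_time(unsigned int queued)
{
	printf("depth=%u -> about %u s before the link settles\n",
	       queued, drain_time_sec(queued, LINKCHANGE_DELAY_SEC));
}
```

Under that assumption, drain_time_sec(30, LINKCHANGE_DELAY_SEC) gives 60, so
a burst of a few dozen events is enough to explain the outage.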
I don't think I have a way to dig into how Windows or Hyper-V sends
these events, so I am living with the patch to reduce the delay. What would
you think of first adding a log message along these lines
> --- a/drivers/net/hyperv/netvsc_drv.c 2024-04-20 08:48:09.105928816 +0200
> +++ b/drivers/net/hyperv/netvsc_drv.c 2024-04-20 08:57:28.254412513 +0200
> @@ -2080,6 +2080,10 @@
> ndev_ctx->last_reconfig = jiffies;
>
> spin_lock_irqsave(&ndev_ctx->lock, flags);
> + size_t len = list_count_nodes(&ndev_ctx->reconfig_events);
> + if (len > 5) {
> + netdev_warn(net, "handle storm depth=%zu\n", len);
> + }
> if (!list_empty(&ndev_ctx->reconfig_events)) {
> event = list_first_entry(&ndev_ctx->reconfig_events,
> struct netvsc_reconfig, list);
to inform the user that something is wrong and events are piling up
unnecessarily? Hopefully someone will notice, and more users will be able to
chime in and report.
>
> Thanks,
> - Haiyang
>
Thanks!
Francois