Re: [PATCH net] hv_netvsc: Fix hibernation for mlx5 VF driver

2020-09-06 Thread Jakub Kicinski
On Sun, 6 Sep 2020 03:05:48 + Dexuan Cui wrote:
> > > @@ -2635,6 +2632,10 @@ static int netvsc_resume(struct hv_device *dev)
> > >  	netvsc_devinfo_put(device_info);
> > >  	net_device_ctx->saved_netvsc_dev_info = NULL;
> > >
> > > +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> > > +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> > > +		ret = -EINVAL;
> > 
> > Should you perhaps remove the VF in case of the failure?  
> IMO this failure actually should not happen since we're resuming the netvsc
> NIC, so we're sure we have a valid pointer to the netvsc net device, and
> netvsc_vf_changed() should be able to find the netvsc pointer and return
> NOTIFY_OK. In case of a failure, something really bad must be happening,
> and I'm not sure if it's safe to simply remove the VF, so I just return
> -EINVAL for simplicity, since I believe the failure should not happen in 
> practice.

Okay, I see that the errors propagated by netvsc_vf_changed() aren't
actually coming from netvsc_switch_datapath(), so you're right. The
failures here won't be meaningful.

> I would rather keep the code as-is, but I'm OK to add a WARN_ON(1) if you
> think that's necessary.

No need, I think the core will complain when the resume callback fails.
That should be sufficient.


RE: [PATCH net] hv_netvsc: Fix hibernation for mlx5 VF driver

2020-09-05 Thread Dexuan Cui
> From: Jakub Kicinski 
> Sent: Saturday, September 5, 2020 4:27 PM
> [...]
> On Fri,  4 Sep 2020 19:52:18 -0700 Dexuan Cui wrote:
> > mlx5_suspend()/resume() keep the network interface, so during hibernation
> > netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> > netvsc_resume() should call netvsc_vf_changed() to switch the data path
> > back to the VF after hibernation.
> 
> Does suspending the system automatically switch back to the synthetic
> datapath? 
Yes. 

For mlx4, since the VF network interface is explicitly destroyed and re-created
during hibernation (i.e. suspend + resume), hv_netvsc explicitly switches the
data path from and to the VF.

For mlx5, the VF network interface persists across hibernation, so there is no
explicit switch-over, but after we close and re-open the vmbus channel of
the netvsc NIC in netvsc_suspend() and netvsc_resume(), the data path is
implicitly switched to the netvsc NIC, and with this patch netvsc_resume() ->
netvsc_vf_changed() switches the data path back to the mlx5 NIC.
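
To illustrate the two cases described above, here is a minimal userspace
sketch in C. It is not kernel code: struct model_nic, the model_*() helpers
and the run_*() scenarios are hypothetical names made up for this
illustration, and the ordering of events is simplified. It only models the
idea that re-opening the channel leaves the data path on the synthetic NIC
unless something switches it back to the VF.

```c
#include <stdbool.h>

/* Hypothetical model; none of these types exist in the kernel. */
enum datapath { DP_SYNTHETIC, DP_VF };

struct model_nic {
	enum datapath dp;	/* path that currently carries traffic */
	bool vf_present;	/* stands in for ndev_ctx->vf_netdev != NULL */
};

/* Closing the channel: traffic implicitly falls back to the synthetic NIC. */
static void model_suspend(struct model_nic *nic)
{
	nic->dp = DP_SYNTHETIC;
}

/* Re-opening the channel; with the fix, switch back if a VF is still there. */
static void model_resume(struct model_nic *nic, bool patched)
{
	nic->dp = DP_SYNTHETIC;
	if (patched && nic->vf_present)
		nic->dp = DP_VF;
}

/* mlx4-like drivers destroy the VF netdev: explicit switch away from the VF. */
static void model_vf_unregister(struct model_nic *nic)
{
	nic->vf_present = false;
	nic->dp = DP_SYNTHETIC;
}

/* mlx4-like drivers re-create the VF netdev: explicit switch to the VF. */
static void model_vf_register(struct model_nic *nic)
{
	nic->vf_present = true;
	nic->dp = DP_VF;
}

/* mlx5-like: the VF netdev persists across hibernation. */
static enum datapath run_mlx5_like(bool patched)
{
	struct model_nic nic = { .dp = DP_VF, .vf_present = true };

	model_suspend(&nic);
	model_resume(&nic, patched);
	return nic.dp;
}

/* mlx4-like: the VF netdev is gone while the synthetic NIC suspends/resumes. */
static enum datapath run_mlx4_like(bool patched)
{
	struct model_nic nic = { .dp = DP_VF, .vf_present = true };

	model_vf_unregister(&nic);
	model_suspend(&nic);
	model_resume(&nic, patched);
	model_vf_register(&nic);
	return nic.dp;
}
```

In this model, run_mlx5_like(false) ends on the synthetic path, while
run_mlx5_like(true) ends on the VF path. run_mlx4_like() ends on the VF path
either way, because the explicit re-registration performs the switch.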

> Please clarify this in the commit message and/or add a code
> comment.
I will add a comment in the commit message and the code.
 
> > @@ -2587,7 +2587,7 @@ static int netvsc_remove(struct hv_device *dev)
> >  static int netvsc_suspend(struct hv_device *dev)
> >  {
> >  	struct net_device_context *ndev_ctx;
> > -	struct net_device *vf_netdev, *net;
> > +	struct net_device *net;
> >  	struct netvsc_device *nvdev;
> >  	int ret;
> 
> Please keep reverse xmas tree variable ordering.

Will do.
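
For reference, "reverse xmas tree" means local variable declarations are
ordered from the longest line to the shortest. The sketch below is a small
userspace check written for illustration only; is_reverse_xmas_tree() and
both arrays are hypothetical helpers, and the "reordered" array is just one
ordering that would satisfy the convention, not necessarily what a later
revision of the patch uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when no declaration line is longer than the one before it. */
static bool is_reverse_xmas_tree(const char *const decls[], size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (strlen(decls[i]) > strlen(decls[i - 1]))
			return false;
	}
	return true;
}

/* Declarations in netvsc_suspend() as left by the posted hunk. */
static const char *const as_posted[] = {
	"struct net_device_context *ndev_ctx;",
	"struct net_device *net;",
	"struct netvsc_device *nvdev;",
	"int ret;",
};

/* One possible reordering (assumption): longest line first. */
static const char *const reordered[] = {
	"struct net_device_context *ndev_ctx;",
	"struct netvsc_device *nvdev;",
	"struct net_device *net;",
	"int ret;",
};
```

With these inputs the check fails for as_posted, because the nvdev line is
longer than the net line above it, and passes for reordered.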

> > @@ -2635,6 +2632,10 @@ static int netvsc_resume(struct hv_device *dev)
> >  	netvsc_devinfo_put(device_info);
> >  	net_device_ctx->saved_netvsc_dev_info = NULL;
> >
> > +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> > +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> > +		ret = -EINVAL;
> 
> Should you perhaps remove the VF in case of the failure?
IMO this failure actually should not happen since we're resuming the netvsc
NIC, so we're sure we have a valid pointer to the netvsc net device, and
netvsc_vf_changed() should be able to find the netvsc pointer and return
NOTIFY_OK. In case of a failure, something really bad must be happening,
and I'm not sure if it's safe to simply remove the VF, so I just return
-EINVAL for simplicity, since I believe the failure should not happen in 
practice.

I would rather keep the code as-is, but I'm OK to add a WARN_ON(1) if you
think that's necessary.

Thanks,
-- Dexuan


Re: [PATCH net] hv_netvsc: Fix hibernation for mlx5 VF driver

2020-09-05 Thread Jakub Kicinski
On Fri,  4 Sep 2020 19:52:18 -0700 Dexuan Cui wrote:
> mlx5_suspend()/resume() keep the network interface, so during hibernation
> netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> netvsc_resume() should call netvsc_vf_changed() to switch the data path
> back to the VF after hibernation.

Does suspending the system automatically switch back to the synthetic
datapath? Please clarify this in the commit message and/or add a code
comment.

> Similarly, netvsc_suspend() should not call netvsc_unregister_vf().
> 
> BTW, mlx4_suspend()/resume() are different in that they destroy and
> re-create the network device, so netvsc_register_vf() and
> netvsc_unregister_vf() are automatically called. Note: mlx4 can also work
> with the changes here because in netvsc_suspend()/resume()
> ndev_ctx->vf_netdev is NULL for mlx4.
> 
> Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
> Signed-off-by: Dexuan Cui 
> ---
>  drivers/net/hyperv/netvsc_drv.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 64b0a74c1523..f896059a9588 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2587,7 +2587,7 @@ static int netvsc_remove(struct hv_device *dev)
>  static int netvsc_suspend(struct hv_device *dev)
>  {
>  	struct net_device_context *ndev_ctx;
> -	struct net_device *vf_netdev, *net;
> +	struct net_device *net;
>  	struct netvsc_device *nvdev;
>  	int ret;

Please keep reverse xmas tree variable ordering.

> @@ -2604,10 +2604,6 @@ static int netvsc_suspend(struct hv_device *dev)
>  		goto out;
>  	}
>  
> -	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
> -	if (vf_netdev)
> -		netvsc_unregister_vf(vf_netdev);
> -
>  	/* Save the current config info */
>  	ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
>  
> @@ -2623,6 +2619,7 @@ static int netvsc_resume(struct hv_device *dev)
>  	struct net_device *net = hv_get_drvdata(dev);
>  	struct net_device_context *net_device_ctx;
>  	struct netvsc_device_info *device_info;
> +	struct net_device *vf_netdev;
>  	int ret;
>  
>  	rtnl_lock();
> @@ -2635,6 +2632,10 @@ static int netvsc_resume(struct hv_device *dev)
>  	netvsc_devinfo_put(device_info);
>  	net_device_ctx->saved_netvsc_dev_info = NULL;
>  
> +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> +		ret = -EINVAL;

Should you perhaps remove the VF in case of the failure?

>  	rtnl_unlock();
>  
>  	return ret;



[PATCH net] hv_netvsc: Fix hibernation for mlx5 VF driver

2020-09-04 Thread Dexuan Cui
mlx5_suspend()/resume() keep the network interface, so during hibernation
netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
netvsc_resume() should call netvsc_vf_changed() to switch the data path
back to the VF after hibernation. Similarly, netvsc_suspend() should
not call netvsc_unregister_vf().

BTW, mlx4_suspend()/resume() are different in that they destroy and
re-create the network device, so netvsc_register_vf() and
netvsc_unregister_vf() are automatically called. Note: mlx4 can also work
with the changes here because in netvsc_suspend()/resume()
ndev_ctx->vf_netdev is NULL for mlx4.

Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation")
Signed-off-by: Dexuan Cui 
---
 drivers/net/hyperv/netvsc_drv.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 64b0a74c1523..f896059a9588 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2587,7 +2587,7 @@ static int netvsc_remove(struct hv_device *dev)
 static int netvsc_suspend(struct hv_device *dev)
 {
 	struct net_device_context *ndev_ctx;
-	struct net_device *vf_netdev, *net;
+	struct net_device *net;
 	struct netvsc_device *nvdev;
 	int ret;
 
@@ -2604,10 +2604,6 @@ static int netvsc_suspend(struct hv_device *dev)
 		goto out;
 	}
 
-	vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
-	if (vf_netdev)
-		netvsc_unregister_vf(vf_netdev);
-
 	/* Save the current config info */
 	ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
 
@@ -2623,6 +2619,7 @@ static int netvsc_resume(struct hv_device *dev)
 	struct net_device *net = hv_get_drvdata(dev);
 	struct net_device_context *net_device_ctx;
 	struct netvsc_device_info *device_info;
+	struct net_device *vf_netdev;
 	int ret;
 
 	rtnl_lock();
@@ -2635,6 +2632,10 @@ static int netvsc_resume(struct hv_device *dev)
 	netvsc_devinfo_put(device_info);
 	net_device_ctx->saved_netvsc_dev_info = NULL;
 
+	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
+	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
+		ret = -EINVAL;
+
 	rtnl_unlock();
 
 	return ret;
-- 
2.19.1