On Wed, Feb 28, 2018 at 04:47:30PM +0000, Ophir Munk wrote: > The following scenario causes a crash in function mlx4_get_ifname > 1. On testpmd startup mlx4 device is probed and started > 2. mlx4 sriov is disabled. As a result an RMV event is sent to > testpmd which closes the device and nullify the priv struct > members. > 3. Running 'show port info all' in testpmd results in segmentation > fault because of accessing NULL pointer priv->ctx > > The fix is to return with an error from mlx4_get_ifname() if priv->ctx > member is NULL.
So priv->ctx is NULL after mlx4_dev_close() was called. In my opinion the fact testpmd is allowed to call rte_eth_dev_info_get() on a closed port is fishy, there are quite a few other ethdev callbacks that may end up crashing in such a scenario. Since commit 284c908cc588 ("app/testpmd: request device removal interrupt"), testpmd automatically calls rte_eth_dev_close() and rte_eal_dev_detach() after receiving a removal event on a port. rte_eth_dev_close() documentation says: "Close a stopped Ethernet device. The device cannot be restarted! The function frees all resources except for needed by the closed state. To free these resources, call rte_eth_dev_detach()." Unfortunately testpmd calls rte_*eal*_dev_detach() not rte_*eth*_dev_detach(), the former doesn't release ethdev ports while the latter does. I think it's a mistake, there's no point in keeping a closed device around after it's been physically unplugged. In short, rmv_event_callback() should call detach_port() instead of rte_eal_dev_detach(). > Fixes: 61cbdd419478 ("net/mlx4: separate device control functions") > Cc: sta...@dpdk.org > > Signed-off-by: Ophir Munk <ophi...@mellanox.com> The above suggests the problem is actually in testpmd and was introduced in v17.05 by commit 284c908cc588. The proposed patch is a workaround that doesn't address the underlying issue, thus NACK unless proven otherwise :) > --- > drivers/net/mlx4/mlx4_ethdev.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c > index 3bc6927..cca5223 100644 > --- a/drivers/net/mlx4/mlx4_ethdev.c > +++ b/drivers/net/mlx4/mlx4_ethdev.c > @@ -67,6 +67,9 @@ mlx4_get_ifname(const struct priv *priv, char > (*ifname)[IF_NAMESIZE]) > char match[IF_NAMESIZE] = ""; > > { > + if (priv->ctx == NULL) > + return -ENOENT; > + > MKSTR(path, "%s/device/net", priv->ctx->device->ibdev_path); > > dir = opendir(path); > -- > 2.7.4 > -- Adrien Mazarguil 6WIND