> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Friday, November 23, 2018 6:05 PM
> To: Stojaczyk, Dariusz <dariusz.stojac...@intel.com>; dev@dpdk.org
> Cc: gaetan.ri...@6wind.com; tho...@monjalon.net
> Subject: Re: [dpdk-dev] [PATCH v3] dev: don't remove devargs that are still
> referenced
>
> Hi,
>
> On 11/23/18 4:43 PM, Darek Stojaczyk wrote:
> > Even if a device failed to plug, it's still a device
> > object that references the devargs. Those devargs will
> > be freed automatically together with the device, but
> > freeing them any earlier - like it's done in the hotplug
> > error handling path right now - will give us a dangling
> > pointer and a segfault scenario.
> >
> > Consider the following case:
> > * secondary process receives the hotplug request IPC message
> > * devargs are either created or updated
> > * the bus is scanned
> > * a new device object is created with the latest devargs
> > * the device can't be plugged for whatever reason,
> >   bus->plug returns error
> > * the devargs are freed, even though they're still referenced
> >   by the device object on the bus
> >
> > For PCI devices, the generic device name comes from
> > a buffer within the devargs. Freeing those will make
> > EAL segfault whenever the device name is checked.
> >
> > This patch just prevents the hotplug error handling
> > path from removing the devargs when there's a device
> > that references them. This is done by simply exiting
> > early from the hotplug function. As mentioned in the
> > beginning, those devargs will be freed later, together
> > with the device itself.
> >
> > Fixes: 7e8b26650146 ("eal: fix hotplug add / remove")
>
> Should you also cc stable?
> Above commit is in since v17.08.
>

Hi Maxime,
Stable could use a similar patch, but not exactly this one as it is now.
I'll resubmit for stable once the one here gets approved.
Thank you,
D.

> > Cc: gaetan.ri...@6wind.com
> > Cc: tho...@monjalon.net
> >
> > Signed-off-by: Darek Stojaczyk <dariusz.stojac...@intel.com>
> > ---
> > Changes since v2:
> >  * added an extra comment (Gaetan)
> >
> > Changes since v1:
> >  * described the failing scenario in commit msg (Thomas)
> >
> >  lib/librte_eal/common/eal_common_dev.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
> > index 1fdc9ab17..d7950bc9a 100644
> > --- a/lib/librte_eal/common/eal_common_dev.c
> > +++ b/lib/librte_eal/common/eal_common_dev.c
> > @@ -166,14 +166,17 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
> >  		ret = -ENODEV;
> >  		goto err_devarg;
> >  	}
> > +	/* Since there is a matching device, it is now its responsibility
> > +	 * to manage the devargs we've just inserted. From this point
> > +	 * those devargs shouldn't be removed manually anymore.
> > +	 */
> >
> >  	ret = dev->bus->plug(dev);
> >  	if (ret) {
> > -		if (rte_dev_is_probed(dev)) /* if already succeeded earlier */
> > -			return ret; /* no rollback */
> > -		RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
> > -			dev->name);
> > -		goto err_devarg;
> > +		if (!rte_dev_is_probed(dev)) /* if hasn't succeeded earlier */
> > +			RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
> > +				dev->name);
> > +		return ret;
> >  	}
> >
> >  	*new_dev = dev;
> >
>
> Other than that, it looks good to me:
> Acked-by: Maxime Coquelin <maxime.coque...@redhat.com>
>
> Regards,
> Maxime
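
For readers following the thread, the ownership rule the patch relies on can be sketched outside of EAL. This is only a toy model: `toy_devargs`, `toy_device`, `toy_plug`, `toy_probe` and `toy_device_free` are hypothetical names made up for illustration, not DPDK APIs, and the sketch hands the device back to the caller even on failure to stand in for "the device object stays on the bus". The point is that once a device references the devargs, a failed plug returns early without freeing them, and they are released together with the device.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real rte_devargs / rte_device. */
struct toy_devargs {
	char name[32];               /* buffer the device name points into */
};

struct toy_device {
	const char *name;            /* borrowed from devargs->name */
	struct toy_devargs *devargs; /* owned by the device once attached */
	int probed;
};

/* Pretend bus->plug(): fails on request, otherwise marks the device probed. */
static int toy_plug(struct toy_device *dev, int should_fail)
{
	if (should_fail)
		return -EIO;
	dev->probed = 1;
	return 0;
}

/*
 * Models the patched flow. Assumption for this sketch only: *out is set
 * before plugging, to represent the device already sitting on the bus.
 */
int toy_probe(const char *name, int plug_fails, struct toy_device **out)
{
	struct toy_devargs *da = calloc(1, sizeof(*da));
	struct toy_device *dev;
	int ret;

	if (da == NULL)
		return -ENOMEM;
	snprintf(da->name, sizeof(da->name), "%s", name);

	dev = calloc(1, sizeof(*dev));
	if (dev == NULL) {
		free(da); /* nothing references the devargs yet: safe to free */
		return -ENOMEM;
	}

	/* From here on the device is responsible for the devargs. */
	dev->devargs = da;
	dev->name = da->name;
	*out = dev;

	ret = toy_plug(dev, plug_fails);
	if (ret) {
		if (!dev->probed)
			fprintf(stderr, "Driver cannot attach the device (%s)\n",
				dev->name);
		return ret; /* early exit, no free(da): dev->name still uses it */
	}
	return 0;
}

/* The devargs go away together with the device, never earlier. */
void toy_device_free(struct toy_device *dev)
{
	if (dev == NULL)
		return;
	free(dev->devargs);
	free(dev);
}
```

Calling `toy_probe("0000:00:01.0", 1, &dev)` returns an error but leaves `dev->name` readable until `toy_device_free(dev)` is called. Freeing the devargs in the error path instead would leave `dev->name` dangling, which is the segfault scenario described in the commit message.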