On Wed, Jul 8, 2015 at 9:44 PM, Dmitry Torokhov <[email protected]> wrote: > On Wed, Jul 08, 2015 at 06:00:41PM -0700, Dan Williams wrote: >> On Wed, Jul 8, 2015 at 5:49 PM, Dmitry Torokhov >> <[email protected]> wrote: >> > On Wed, Jul 08, 2015 at 05:36:04PM -0700, Dan Williams wrote: >> >> On Mon, Jul 6, 2015 at 4:40 PM, Dmitry Torokhov >> >> <[email protected]> wrote: >> >> > On Tue, Jul 07, 2015 at 01:23:15AM +0200, Luis R. Rodriguez wrote: >> >> >> On Sat, Jul 04, 2015 at 07:09:19AM -0700, Dan Williams wrote: >> >> >> > On Fri, Jul 3, 2015 at 11:30 AM, Luis R. Rodriguez <[email protected]> >> >> >> > wrote: >> >> >> > > On Sat, Jun 27, 2015 at 04:45:25PM -0700, Dan Williams wrote: >> >> >> > >> On Mon, Mar 30, 2015 at 4:20 PM, Dmitry Torokhov >> >> >> > >> <[email protected]> wrote: >> >> >> > >> > Some devices take a long time when initializing, and not all >> >> >> > >> > drivers are >> >> >> > >> > suited to initialize their devices when they are open. For >> >> >> > >> > example, >> >> >> > >> > input drivers need to interrogate their devices in order to >> >> >> > >> > publish >> >> >> > >> > device's capabilities before userspace will open them. When >> >> >> > >> > such drivers >> >> >> > >> > are compiled into kernel they may stall entire kernel >> >> >> > >> > initialization. >> >> >> > >> > >> >> >> > >> > This change allows drivers request for their probe functions to >> >> >> > >> > be >> >> >> > >> > called asynchronously during driver and device registration >> >> >> > >> > (manual >> >> >> > >> > binding is still synchronous). Because async_schedule is used >> >> >> > >> > to perform >> >> >> > >> > asynchronous calls module loading will still wait for the >> >> >> > >> > probing to >> >> >> > >> > complete. >> >> >> > >> > >> >> >> > >> > Note that the end goal is to make the probing asynchronous by >> >> >> > >> > default, >> >> >> > >> > so annotating drivers with PROBE_PREFER_ASYNCHRONOUS is a >> >> >> > >> > temporary >> >> >> > >> > measure that allows us to speed up boot process while we >> >> >> > >> > validating and >> >> >> > >> > fixing the rest of the drivers and preparing userspace. >> >> >> > >> > >> >> >> > >> > This change is based on earlier patch by "Luis R. Rodriguez" >> >> >> > >> > <[email protected]> >> >> >> > >> > >> >> >> > >> > Signed-off-by: Dmitry Torokhov <[email protected]> >> >> >> > >> > --- >> >> >> > >> > drivers/base/base.h | 1 + >> >> >> > >> > drivers/base/bus.c | 31 +++++++--- >> >> >> > >> > drivers/base/dd.c | 149 >> >> >> > >> > ++++++++++++++++++++++++++++++++++++++++++------- >> >> >> > >> > include/linux/device.h | 28 ++++++++++ >> >> >> > >> > 4 files changed, 182 insertions(+), 27 deletions(-) >> >> >> > >> >> >> >> > >> Just noticed this patch. It caught my eye because I had a hard >> >> >> > >> time >> >> >> > >> getting an open coded implementation of asynchronous probing to >> >> >> > >> work >> >> >> > >> in the new libnvdimm subsystem. Especially the messy races of >> >> >> > >> tearing >> >> >> > >> things down while probing is still in flight. I ended up >> >> >> > >> implementing >> >> >> > >> asynchronous device registration which eliminated a lot of >> >> >> > >> complexity >> >> >> > >> and of course the bugs. In general I tend to think that async >> >> >> > >> registration is less risky than async probe since it keeps wider >> >> >> > >> portions of the traditional device model synchronous >> >> >> > > >> >> >> > > but its not see -DEFER_PROBE even before async probe. >> >> >> > >> >> >> > Except in that case you know probe has been seen by the driver at >> >> >> > least once. So I see that as less of a surprise, but point taken. >> >> >> > >> >> >> > >> and leverages the >> >> >> > >> fact that the device model is already well prepared for >> >> >> > >> asynchronous >> >> >> > >> arrival of devices due to hotplug. >> >> >> > > >> >> >> > > I think this sounds reasonable, do you have your code upstream or >> >> >> > > posted? >> >> >> > >> >> >> > Yes, see nd_device_register() in drivers/nvdimm/bus.c >> >> >> >> >> >> It should be I think rather easy for Dmitry to see if he can convert >> >> >> this input >> >> >> driver (not yet upstream) to this API and see if the same issues are >> >> >> fixed. >> >> > >> >> > No, I would rather not as it means we lose error handling on device >> >> > registration. >> >> > >> >> >> >> I think this is a red herring as I don't see how async probing is any >> >> better at handling device registration errors. The error is logged >> >> and "handled" by the fact that a device fails to appear, what other >> >> action would you take? In fact libnvdimm does detect registration >> >> failures and reports that in a parent device attribute (at least for a >> >> region device and their namespace child devices). >> > >> > What is libnvdimm behavior if you try to unload a module that tries to >> > register a device but it failed? Memory leak or crash, right? >> >> No, in the case of the "region" driver it is part of the core >> libnvdimm and it is pinned while any region device is active. > > No, not quite. Let's take a look for example at nd_btt_probe(). It calls > __nd_btt_create() which in turn calls __nd_device_register() which > returns void and asynchronously schedules device registration. Now > consider the device registration fails. The async code will drop 2 > references to the device, effectively freeing it. In the mean time > nd_btt_probe() stores the device pointer which may or may no longer be > valid and goes on it's merry way using it.
nd_btt_probe() is the driver probe for the btt device. If registration fails then the device is never probed. > The similar thing in nvdimm_create which returns a pointer that may no > longer be valid Exactly, which is why we fail the entire bus probe if any of the nvdimm devices failed to register. See nvdimm_bus_check_dimm_count(). > I have not traced enough through the code to make sure > if it can blow up, but this kind of situation is not desirable, > especially if the async registration pattern is applied generally > throughout the kernel. I do appreciate the review, but I don't think this signals the death knell for async registration just yet. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

