[CC Greg]

On Tuesday, January 15, 2019 1:04:02 AM CET Rafael J. Wysocki wrote:
> [CC Lukas and linux-pm]
> 
> On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <raf...@kernel.org> wrote:
> >
> > On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> > <li...@armlinux.org.uk> wrote:
> 
> [cut]
> 
> > >
> > > This thread is discussing how to deal with Armada DRM, its use of the
> > > component helper, TDA998x's hybrid use of the component helper and
> > > DRM bridges.
> > >
> > > Currently, DRM bridges register into the DRM system and are added to
> > > a global list of bridges.  When a DRM driver binds, it looks up the
> > > DRM bridge and attaches to it.  There is no way in generic DRM code
> > > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > > consequently a DRM driver may continue trying to call functions in
> > > the DRM bridge driver using memory that has already been freed.
> > >
> > > We had similar issues with imx-drm, and a number of years ago at a
> > > kernel summit, this was discussed, and I proposed a system which is
> > > now known as the component helper.  This handles the problem of a
> > > multi-component system.
> > >
> > > However, DRM bridge was already established, and there appears to be
> > > no desire to convert DRM bridges and DRM drivers to use the component
> > > helper.
> > >
> > > We are presently in the situation where Armada DRM is incompatible
> > > with the DRM bridges way of doing things, and making it compatible
> > > essentially means introducing potential use-after-free bugs into the
> > > code.
> > >
> > > Device links in their stateful form appear to provide an alternative
> > > to the component helper, in that a stateful device link will remove
> > > consumers of a resource when the supplier is going away - which is
> > > exactly the problem which the component helper is solving.  The
> > > difference is that device links look like being a cleaner solution.
> > >
> > > Just like the component helper, a stateful link would unbind the
> > > consumer of a resource when the supplier goes away - which is exactly
> > > the behaviour we're wanting.
> > >
> > > The problem is that the connection between various drivers is only
> > > really known when the drivers obtain their resources, and the
> > > following can happen:
> > >
> > > supplier                consumer
> > >
> > > probe()
> > >  alloc
> > >                         probe()
> > >  publish
> > >                         obtain supplier's resource
> > >  return
> > >
> > > Where things go wrong is if a stateful link is created when the
> > > consumer obtains the suppliers resource before the supplier has
> > > finished probing - which from what's been said is illegal.
> >
> > It just doesn't work (which means "invalid" rather than "illegal" I
> > guess, but whatever :-)).
> >
> > Admittedly, the original design didn't take this particular case into
> > account and I'm not actually sure how it can be taken into account
> > either.
> >
> > If the link is created by the consumer in the scenario above, its
> > status will be updated twice in a row after the consumer probe returns
> > and after the supplier probe returns.  It looks like this update would
> > need to work regardless of the order it which this happened which
> > sounds somewhat challenging.  I would need to think about that.
> 
> So if I'm not mistaken it can be made work with the help of the
> (completely untested) attached patch (of course, the documentation
> would need to be updated too).
> 
> The key observation here is that it should be fine to create a link
> from the consumer driver's probe while the supplier is still probing
> if the consumer has some way to find out that the supplier is
> functional at that point (like when it has published itself already in
> your example above).  The initial state of the link can be "consumer
> probe" in that case and I don't see a reason why that might not work.
> 

Below is a more complete patch that should take all of the corner cases
into account unless I have missed anything.  Testing it would be
appreciated. :-)

---
From: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
Subject: [PATCH] driver core: Fix adding device links to probing suppliers

Currently, it is not valid to add a device link from a consumer
driver ->probe() callback to a supplier that is still probing too, but
generally this is a valid use case.  For example, if the consumer has
just acquired a resource that can only be available when the supplier
is functional, adding a device link to that supplier right away
should be safe (and even desirable arguably), but device_link_add()
doesn't handle that case correctly and the initial state of the link
created by it is wrong then.

To address this problem, change the initial state of device links
added between a probing supplier and a probing consumer to
DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
skip such links on the supplier side.

With this change, if the supplier probe completes first,
device_links_driver_bound() called for it will skip the link state
update and when it is called for the consumer, the link state will
be updated to "active".  In turn, if the consumer probe completes
first, device_links_driver_bound() called for it will change the
state of the link to "active" and when it is called for the
supplier, the link status update will be skipped.

However, in principle the supplier or consumer probe may still fail
after the link has been added, so modify device_links_no_driver() to
change device links in the "active" or "consumer probe" state to
"dormant" on the supplier side and update __device_links_no_driver()
to change the link state to "available" only if it is "consumer
probe" or "active".

Then, if the supplier probe fails first, the leftover link to the
probing consumer will become "dormant" and device_links_no_driver()
called for the consumer (when its probe fails) will clean it up.
In turn, if the consumer probe fails first, it will either drop the
link, or change its state to "available" and, in the latter case,
when device_links_no_driver() is called for the supplier, it will
update the link state to "dormant".  [If the supplier probe fails,
but the consumer probe succeeds, which should not happen as long as
the consumer driver is correct, the link still will be around, but
it will be "dormant" until the supplier is probed again.]

Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
---
 Documentation/driver-api/device_link.rst |   10 ++--
 drivers/base/core.c                      |   74 +++++++++++++++++++++++++++----
 2 files changed, 73 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -260,17 +260,26 @@ struct device_link *device_link_add(stru
                link->status = DL_STATE_NONE;
        } else {
                switch (supplier->links.status) {
-               case DL_DEV_DRIVER_BOUND:
+               case DL_DEV_PROBING:
                        switch (consumer->links.status) {
                        case DL_DEV_PROBING:
                                /*
-                                * Some callers expect the link creation during
-                                * consumer driver probe to resume the supplier
-                                * even without DL_FLAG_RPM_ACTIVE.
+                                * A consumer driver can create a link to a
+                                * supplier that has not completed its probing
+                                * yet as long as it knows that the supplier is
+                                * already functional (for example, it has just
+                                * acquired some resources from the supplier).
                                 */
-                               if (flags & DL_FLAG_PM_RUNTIME)
-                                       pm_runtime_resume(supplier);
-
+                               link->status = DL_STATE_CONSUMER_PROBE;
+                               break;
+                       default:
+                               link->status = DL_STATE_DORMANT;
+                               break;
+                       }
+                       break;
+               case DL_DEV_DRIVER_BOUND:
+                       switch (consumer->links.status) {
+                       case DL_DEV_PROBING:
                                link->status = DL_STATE_CONSUMER_PROBE;
                                break;
                        case DL_DEV_DRIVER_BOUND:
@@ -291,6 +300,14 @@ struct device_link *device_link_add(stru
        }
 
        /*
+        * Some callers expect the link creation during consumer driver probe to
+        * resume the supplier even without DL_FLAG_RPM_ACTIVE.
+        */
+       if (link->status == DL_STATE_CONSUMER_PROBE &&
+           flags & DL_FLAG_PM_RUNTIME)
+               pm_runtime_resume(supplier);
+
+       /*
         * Move the consumer and all of the devices depending on it to the end
         * of dpm_list and the devices_kset list.
         *
@@ -474,6 +491,16 @@ void device_links_driver_bound(struct de
                if (link->flags & DL_FLAG_STATELESS)
                        continue;
 
+               /*
+                * Links created during consumer probe may be in the "consumer
+                * probe" state to start with if the supplier is still probing
+                * when they are created and they may become "active" if the
+                * consumer probe returns first.  Skip them here.
+                */
+               if (link->status == DL_STATE_CONSUMER_PROBE ||
+                   link->status == DL_STATE_ACTIVE)
+                       continue;
+
                WARN_ON(link->status != DL_STATE_DORMANT);
                WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
        }
@@ -513,17 +540,48 @@ static void __device_links_no_driver(str
 
                if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER)
                        kref_put(&link->kref, __device_link_del);
-               else if (link->status != DL_STATE_SUPPLIER_UNBIND)
+               else if (link->status == DL_STATE_CONSUMER_PROBE ||
+                        link->status == DL_STATE_ACTIVE)
                        WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
        }
 
        dev->links.status = DL_DEV_NO_DRIVER;
 }
 
+/**
+ * device_links_no_driver - Update links after failing driver probe.
+ * @dev: Device whose driver has just failed to probe.
+ *
+ * Clean up leftover links to consumers for @dev and invoke
+ * %__device_links_no_driver() to update links to suppliers for it as
+ * appropriate.
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
 void device_links_no_driver(struct device *dev)
 {
+       struct device_link *link;
+
        device_links_write_lock();
+
+       list_for_each_entry(link, &dev->links.consumers, s_node) {
+               if (link->flags & DL_FLAG_STATELESS)
+                       continue;
+
+               /*
+                * The probe has failed, so if the status of the link is
+                * "consumer probe" or "active", it must have been added by
+                * a probing consumer while this device was still probing.
+                * Change its state to "dormant", as it represents a valid
+                * relationship, but it is not functionally meaningful.
+                */
+               if (link->status == DL_STATE_CONSUMER_PROBE ||
+                   link->status == DL_STATE_ACTIVE)
+                       WRITE_ONCE(link->status, DL_STATE_DORMANT);
+       }
+
        __device_links_no_driver(dev);
+
        device_links_write_unlock();
 }
 
Index: linux-pm/Documentation/driver-api/device_link.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/device_link.rst
+++ linux-pm/Documentation/driver-api/device_link.rst
@@ -59,11 +59,15 @@ device ``->probe`` callback or a boot-ti
 
 Another example for an inconsistent state would be a device link that
 represents a driver presence dependency, yet is added from the consumer's
-``->probe`` callback while the supplier hasn't probed yet:  Had the driver
-core known about the device link earlier, it wouldn't have probed the
+``->probe`` callback while the supplier hasn't started to probe yet:  Had the
+driver core known about the device link earlier, it wouldn't have probed the
 consumer in the first place.  The onus is thus on the consumer to check
 presence of the supplier after adding the link, and defer probing on
-non-presence.
+non-presence.  [Note that it is valid to create a link from the consumer's
+``->probe`` callback while the supplier is still probing, but the consumer must
+know that the supplier is functional already at the link creation time (that is
+the case, for instance, if the consumer has just acquired some resources that
+would not have been available had the supplier not been functional then).]
 
 If a device link is added in the ``->probe`` callback of the supplier or
 consumer driver, it is typically deleted in its ``->remove`` callback for

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reply via email to