Re: dev->of_node overwrite can cause device loading with different driver
Hi, Markus Pargmann writes: > On Sat, Sep 14, 2013 at 01:28:09PM +0100, Russell King - ARM Linux wrote: > > On Sat, Sep 14, 2013 at 05:17:29AM -0700, Greg Kroah-Hartman wrote: > > > On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > > > > 3. We could fix up all drivers that change the of_node. But there are > > > >ARM DT frameworks that require a device struct as parameter instead > > > >of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a > > > >driver core, initialized by a glue driver with DT bindings, has to > > > >set dev->of_node to use those frameworks. I think it is strange to > > > >have such DT framework interfaces if a driver is not supposed to > > > >overwrite dev->of_node permanently. > > > > > > How about any driver that does muck with this structure, restore it > > > properly if their probe() function fails? Yes, you show that this is > > > going to be tricky in some places (i.e. musb), but it makes sense that > > > the burden of fixing this issue would rest on them, as they are the ones > > > causing this problem, right? > > > > It's not about overwriting at all. > > musb does not overwrite of_node, but other drivers do, e.g. USB chipidea > core driver which uses its parent of_node. When probe fails in this > case, we could end up with similar issues. > This has already been fixed in commit: e98b44e9 usb: chipidea: prevent endless loop registering platform_devices when probe fails in linux-next Lothar Waßmann -- ___ Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10 Geschäftsführer: Matthias Kaussen Handelsregistereintrag: Amtsgericht Aachen, HRB 4996 www.karo-electronics.de | i...@karo-electronics.de ___ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev->of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 12:20:56PM +0100, Russell King - ARM Linux wrote: > On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > > I think there are three options to solve this: > > > > 1. Break out of the driver list iteration loop as soon as a driver probe > >function fails. This way there is no possibility for another driver > >to be probed with a device struct that was changed by the first > >driver. But it would remove the support to fall back to > >another driver for a device. I am not aware of any device that is > >supported by multiple drivers. > > What if the incorrect driver (the one which created this platform device) > is the one which was matched first? Yes that would lead to the same recursion. But I think it is better to assign the of_node within driver probe which wouldn't be a problem for (1), e.g. by using dev->parent or by passing the of_node via platform data. > > > 2. We could restore the device struct completely or partially (only > >of_node) after a probe failure. This would avoid driver probes with > >unclean device structs, but introduces some overhead. > > I don't think it's about changing an existing device structure. The > problem is more to do with this: > > - We have a platform device with an associated of_node. > - The of_node is used to match it to its device driver. > - This device driver spawns a new platform device, and copies the > of_node to the new platform device (so that the _intended_ driver > can get access to the DT properties) > - The DD layer tries to match this new platform device with a driver, > and _can_ match this new platform device against the device driver > which created it due to the of_node matching. > > There's two solutions here: > > 1. get rid of this yucky "lets spawn a new device" stuff - if you >actually work out what's going on with MUSB and its use of this >platform device, it's _really_ horrid, and that's putting it >mildly. Let's call the device being probed, devA, and briefly: > > new_dev = platform_device_alloc(); > > devA->platform_data = glue > glue->dev = devA > glue->musb = new_dev > new_dev->parent = devA > > set_drvdata(devA, glue) > > platform_device_add(new_dev); > >musb->controller is the new device, callbacks into this layer do: > > glue = get_drvdata(musb->controller->parent) > >that's not too bad, because this is accessing devA's driver data >from within its owning driver... until you find this: > > musb = glue_to_musb(glue) > >which is defined as: > > #define glue_to_musb(g) platform_get_drvdata(g->musb) > >glue->musb is the _child_ platform device, and we're accessing the >child's driver data here. > >This seems to me to be a layering violation, and also rather racy when >you consider that glue_to_musb() gets used from workqueue contexts >(and I don't see a flush_workqueue() call in these stub drivers.) >What if the new platform device gets unbound just before the workqueue >fires? > >Another thing to note here is that platform_device_add_data() takes a >copy of the data - in the case of omap2430, this is kzalloc'd but >is pointlessly kept around until this driver is removed (via the devm_ >mechanisms.) > >The last thing I don't like in these drivers is this: > > memset(musb_resources, 0x00, sizeof(*musb_resources) * > ARRAY_SIZE(musb_resources)); > > musb_resources[0].name = pdev->resource[0].name; > musb_resources[0].start = pdev->resource[0].start; > musb_resources[0].end = pdev->resource[0].end; > musb_resources[0].flags = pdev->resource[0].flags; > > musb_resources[1].name = pdev->resource[1].name; > musb_resources[1].start = pdev->resource[1].start; > musb_resources[1].end = pdev->resource[1].end; > musb_resources[1].flags = pdev->resource[1].flags; > > musb_resources[2].name = pdev->resource[2].name; > musb_resources[2].start = pdev->resource[2].start; > musb_resources[2].end = pdev->resource[2].end; > musb_resources[2].flags = pdev->resource[2].flags; > > ret = platform_device_add_resources(musb, musb_resources, > ARRAY_SIZE(musb_resources)); > >A little knowledge of the driver model will reveal that the above >is utterly pointless - and can be simplified to: > > ret = platform_device_add_resources(musb, pdev->resource, > pdev->num_resources); > >as platform_device_add_resources() copies the passed resource >structures via kmemdup() already. There's no reason for this >device driver to make its own copy of the resources just to have >them re-copied. It's also a lot safer in case fewer than three >resources are supplied. It's also a lot less hastle if
Re: dev->of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 01:28:09PM +0100, Russell King - ARM Linux wrote: > On Sat, Sep 14, 2013 at 05:17:29AM -0700, Greg Kroah-Hartman wrote: > > On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > > > 3. We could fix up all drivers that change the of_node. But there are > > >ARM DT frameworks that require a device struct as parameter instead > > >of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a > > >driver core, initialized by a glue driver with DT bindings, has to > > >set dev->of_node to use those frameworks. I think it is strange to > > >have such DT framework interfaces if a driver is not supposed to > > >overwrite dev->of_node permanently. > > > > How about any driver that does muck with this structure, restore it > > properly if their probe() function fails? Yes, you show that this is > > going to be tricky in some places (i.e. musb), but it makes sense that > > the burden of fixing this issue would rest on them, as they are the ones > > causing this problem, right? > > It's not about overwriting at all. musb does not overwrite of_node, but other drivers do, e.g. USB chipidea core driver which uses its parent of_node. When probe fails in this case, we could end up with similar issues. Regards, Markus > > It's quite simple: > > 1. OF creates a platform device and attaches an of_node to it. > 2. This platform device is matched using the data in the of_node structure >against one of the MUSB stub drivers. > 3. The MUSB stub driver creates a new platform device, and copies the >of_node to it, and registers it. > 4. This new platform device _can_ itself be matched against the stub >driver using the of_node structure. When this happens, go to step >2 and repeat 2-4. > > That's where the problem is - it's not about overwriting an existing > platform device's of_node pointer with something that the driver has > dreamt up at all. > > If we're lucky, step 3.5 would be "match against the 'musb-hdrc' driver > and successfully probe it" which is the only thing that would break > the above loop. > -- Pengutronix e.K. | | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev-of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 01:28:09PM +0100, Russell King - ARM Linux wrote: On Sat, Sep 14, 2013 at 05:17:29AM -0700, Greg Kroah-Hartman wrote: On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: 3. We could fix up all drivers that change the of_node. But there are ARM DT frameworks that require a device struct as parameter instead of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a driver core, initialized by a glue driver with DT bindings, has to set dev-of_node to use those frameworks. I think it is strange to have such DT framework interfaces if a driver is not supposed to overwrite dev-of_node permanently. How about any driver that does muck with this structure, restore it properly if their probe() function fails? Yes, you show that this is going to be tricky in some places (i.e. musb), but it makes sense that the burden of fixing this issue would rest on them, as they are the ones causing this problem, right? It's not about overwriting at all. musb does not overwrite of_node, but other drivers do, e.g. USB chipidea core driver which uses its parent of_node. When probe fails in this case, we could end up with similar issues. Regards, Markus It's quite simple: 1. OF creates a platform device and attaches an of_node to it. 2. This platform device is matched using the data in the of_node structure against one of the MUSB stub drivers. 3. The MUSB stub driver creates a new platform device, and copies the of_node to it, and registers it. 4. This new platform device _can_ itself be matched against the stub driver using the of_node structure. When this happens, go to step 2 and repeat 2-4. That's where the problem is - it's not about overwriting an existing platform device's of_node pointer with something that the driver has dreamt up at all. If we're lucky, step 3.5 would be match against the 'musb-hdrc' driver and successfully probe it which is the only thing that would break the above loop. -- Pengutronix e.K. | | Industrial Linux Solutions | http://www.pengutronix.de/ | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev-of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 12:20:56PM +0100, Russell King - ARM Linux wrote: On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: I think there are three options to solve this: 1. Break out of the driver list iteration loop as soon as a driver probe function fails. This way there is no possibility for another driver to be probed with a device struct that was changed by the first driver. But it would remove the support to fall back to another driver for a device. I am not aware of any device that is supported by multiple drivers. What if the incorrect driver (the one which created this platform device) is the one which was matched first? Yes that would lead to the same recursion. But I think it is better to assign the of_node within driver probe which wouldn't be a problem for (1), e.g. by using dev-parent or by passing the of_node via platform data. 2. We could restore the device struct completely or partially (only of_node) after a probe failure. This would avoid driver probes with unclean device structs, but introduces some overhead. I don't think it's about changing an existing device structure. The problem is more to do with this: - We have a platform device with an associated of_node. - The of_node is used to match it to its device driver. - This device driver spawns a new platform device, and copies the of_node to the new platform device (so that the _intended_ driver can get access to the DT properties) - The DD layer tries to match this new platform device with a driver, and _can_ match this new platform device against the device driver which created it due to the of_node matching. There's two solutions here: 1. get rid of this yucky lets spawn a new device stuff - if you actually work out what's going on with MUSB and its use of this platform device, it's _really_ horrid, and that's putting it mildly. Let's call the device being probed, devA, and briefly: new_dev = platform_device_alloc(); devA-platform_data = glue glue-dev = devA glue-musb = new_dev new_dev-parent = devA set_drvdata(devA, glue) platform_device_add(new_dev); musb-controller is the new device, callbacks into this layer do: glue = get_drvdata(musb-controller-parent) that's not too bad, because this is accessing devA's driver data from within its owning driver... until you find this: musb = glue_to_musb(glue) which is defined as: #define glue_to_musb(g) platform_get_drvdata(g-musb) glue-musb is the _child_ platform device, and we're accessing the child's driver data here. This seems to me to be a layering violation, and also rather racy when you consider that glue_to_musb() gets used from workqueue contexts (and I don't see a flush_workqueue() call in these stub drivers.) What if the new platform device gets unbound just before the workqueue fires? Another thing to note here is that platform_device_add_data() takes a copy of the data - in the case of omap2430, this is kzalloc'd but is pointlessly kept around until this driver is removed (via the devm_ mechanisms.) The last thing I don't like in these drivers is this: memset(musb_resources, 0x00, sizeof(*musb_resources) * ARRAY_SIZE(musb_resources)); musb_resources[0].name = pdev-resource[0].name; musb_resources[0].start = pdev-resource[0].start; musb_resources[0].end = pdev-resource[0].end; musb_resources[0].flags = pdev-resource[0].flags; musb_resources[1].name = pdev-resource[1].name; musb_resources[1].start = pdev-resource[1].start; musb_resources[1].end = pdev-resource[1].end; musb_resources[1].flags = pdev-resource[1].flags; musb_resources[2].name = pdev-resource[2].name; musb_resources[2].start = pdev-resource[2].start; musb_resources[2].end = pdev-resource[2].end; musb_resources[2].flags = pdev-resource[2].flags; ret = platform_device_add_resources(musb, musb_resources, ARRAY_SIZE(musb_resources)); A little knowledge of the driver model will reveal that the above is utterly pointless - and can be simplified to: ret = platform_device_add_resources(musb, pdev-resource, pdev-num_resources); as platform_device_add_resources() copies the passed resource structures via kmemdup() already. There's no reason for this device driver to make its own copy of the resources just to have them re-copied. It's also a lot safer in case fewer than three resources are supplied. It's also a lot less hastle if additional resources are added (like what's happened recently to these drivers.) 2. don't pass the of_node via the platform_device, but as part of
Re: dev-of_node overwrite can cause device loading with different driver
Hi, Markus Pargmann writes: On Sat, Sep 14, 2013 at 01:28:09PM +0100, Russell King - ARM Linux wrote: On Sat, Sep 14, 2013 at 05:17:29AM -0700, Greg Kroah-Hartman wrote: On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: 3. We could fix up all drivers that change the of_node. But there are ARM DT frameworks that require a device struct as parameter instead of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a driver core, initialized by a glue driver with DT bindings, has to set dev-of_node to use those frameworks. I think it is strange to have such DT framework interfaces if a driver is not supposed to overwrite dev-of_node permanently. How about any driver that does muck with this structure, restore it properly if their probe() function fails? Yes, you show that this is going to be tricky in some places (i.e. musb), but it makes sense that the burden of fixing this issue would rest on them, as they are the ones causing this problem, right? It's not about overwriting at all. musb does not overwrite of_node, but other drivers do, e.g. USB chipidea core driver which uses its parent of_node. When probe fails in this case, we could end up with similar issues. This has already been fixed in commit: e98b44e9 usb: chipidea: prevent endless loop registering platform_devices when probe fails in linux-next Lothar Waßmann -- ___ Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10 Geschäftsführer: Matthias Kaussen Handelsregistereintrag: Amtsgericht Aachen, HRB 4996 www.karo-electronics.de | i...@karo-electronics.de ___ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev->of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 05:17:29AM -0700, Greg Kroah-Hartman wrote: > On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > > 3. We could fix up all drivers that change the of_node. But there are > >ARM DT frameworks that require a device struct as parameter instead > >of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a > >driver core, initialized by a glue driver with DT bindings, has to > >set dev->of_node to use those frameworks. I think it is strange to > >have such DT framework interfaces if a driver is not supposed to > >overwrite dev->of_node permanently. > > How about any driver that does muck with this structure, restore it > properly if their probe() function fails? Yes, you show that this is > going to be tricky in some places (i.e. musb), but it makes sense that > the burden of fixing this issue would rest on them, as they are the ones > causing this problem, right? It's not about overwriting at all. It's quite simple: 1. OF creates a platform device and attaches an of_node to it. 2. This platform device is matched using the data in the of_node structure against one of the MUSB stub drivers. 3. The MUSB stub driver creates a new platform device, and copies the of_node to it, and registers it. 4. This new platform device _can_ itself be matched against the stub driver using the of_node structure. When this happens, go to step 2 and repeat 2-4. That's where the problem is - it's not about overwriting an existing platform device's of_node pointer with something that the driver has dreamt up at all. If we're lucky, step 3.5 would be "match against the 'musb-hdrc' driver and successfully probe it" which is the only thing that would break the above loop. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev->of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > I think there are three options to solve this: > > 1. Break out of the driver list iteration loop as soon as a driver probe >function fails. This way there is no possibility for another driver >to be probed with a device struct that was changed by the first >driver. But it would remove the support to fall back to >another driver for a device. I am not aware of any device that is >supported by multiple drivers. What if the incorrect driver (the one which created this platform device) is the one which was matched first? > 2. We could restore the device struct completely or partially (only >of_node) after a probe failure. This would avoid driver probes with >unclean device structs, but introduces some overhead. I don't think it's about changing an existing device structure. The problem is more to do with this: - We have a platform device with an associated of_node. - The of_node is used to match it to its device driver. - This device driver spawns a new platform device, and copies the of_node to the new platform device (so that the _intended_ driver can get access to the DT properties) - The DD layer tries to match this new platform device with a driver, and _can_ match this new platform device against the device driver which created it due to the of_node matching. There's two solutions here: 1. get rid of this yucky "lets spawn a new device" stuff - if you actually work out what's going on with MUSB and its use of this platform device, it's _really_ horrid, and that's putting it mildly. Let's call the device being probed, devA, and briefly: new_dev = platform_device_alloc(); devA->platform_data = glue glue->dev = devA glue->musb = new_dev new_dev->parent = devA set_drvdata(devA, glue) platform_device_add(new_dev); musb->controller is the new device, callbacks into this layer do: glue = get_drvdata(musb->controller->parent) that's not too bad, because this is accessing devA's driver data from within its owning driver... until you find this: musb = glue_to_musb(glue) which is defined as: #define glue_to_musb(g) platform_get_drvdata(g->musb) glue->musb is the _child_ platform device, and we're accessing the child's driver data here. This seems to me to be a layering violation, and also rather racy when you consider that glue_to_musb() gets used from workqueue contexts (and I don't see a flush_workqueue() call in these stub drivers.) What if the new platform device gets unbound just before the workqueue fires? Another thing to note here is that platform_device_add_data() takes a copy of the data - in the case of omap2430, this is kzalloc'd but is pointlessly kept around until this driver is removed (via the devm_ mechanisms.) The last thing I don't like in these drivers is this: memset(musb_resources, 0x00, sizeof(*musb_resources) * ARRAY_SIZE(musb_resources)); musb_resources[0].name = pdev->resource[0].name; musb_resources[0].start = pdev->resource[0].start; musb_resources[0].end = pdev->resource[0].end; musb_resources[0].flags = pdev->resource[0].flags; musb_resources[1].name = pdev->resource[1].name; musb_resources[1].start = pdev->resource[1].start; musb_resources[1].end = pdev->resource[1].end; musb_resources[1].flags = pdev->resource[1].flags; musb_resources[2].name = pdev->resource[2].name; musb_resources[2].start = pdev->resource[2].start; musb_resources[2].end = pdev->resource[2].end; musb_resources[2].flags = pdev->resource[2].flags; ret = platform_device_add_resources(musb, musb_resources, ARRAY_SIZE(musb_resources)); A little knowledge of the driver model will reveal that the above is utterly pointless - and can be simplified to: ret = platform_device_add_resources(musb, pdev->resource, pdev->num_resources); as platform_device_add_resources() copies the passed resource structures via kmemdup() already. There's no reason for this device driver to make its own copy of the resources just to have them re-copied. It's also a lot safer in case fewer than three resources are supplied. It's also a lot less hastle if additional resources are added (like what's happened recently to these drivers.) 2. don't pass the of_node via the platform_device, but as part of the platform device's data. Is there any reason why musb isn't implemented as a library which these stub drivers can hook into? I'd much prefer (1), because it gets rid of the horrid dma_mask stuff in these drivers, turning it more into a "conventional" driver. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Re: dev-of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: I think there are three options to solve this: 1. Break out of the driver list iteration loop as soon as a driver probe function fails. This way there is no possibility for another driver to be probed with a device struct that was changed by the first driver. But it would remove the support to fall back to another driver for a device. I am not aware of any device that is supported by multiple drivers. What if the incorrect driver (the one which created this platform device) is the one which was matched first? 2. We could restore the device struct completely or partially (only of_node) after a probe failure. This would avoid driver probes with unclean device structs, but introduces some overhead. I don't think it's about changing an existing device structure. The problem is more to do with this: - We have a platform device with an associated of_node. - The of_node is used to match it to its device driver. - This device driver spawns a new platform device, and copies the of_node to the new platform device (so that the _intended_ driver can get access to the DT properties) - The DD layer tries to match this new platform device with a driver, and _can_ match this new platform device against the device driver which created it due to the of_node matching. There's two solutions here: 1. get rid of this yucky lets spawn a new device stuff - if you actually work out what's going on with MUSB and its use of this platform device, it's _really_ horrid, and that's putting it mildly. Let's call the device being probed, devA, and briefly: new_dev = platform_device_alloc(); devA-platform_data = glue glue-dev = devA glue-musb = new_dev new_dev-parent = devA set_drvdata(devA, glue) platform_device_add(new_dev); musb-controller is the new device, callbacks into this layer do: glue = get_drvdata(musb-controller-parent) that's not too bad, because this is accessing devA's driver data from within its owning driver... until you find this: musb = glue_to_musb(glue) which is defined as: #define glue_to_musb(g) platform_get_drvdata(g-musb) glue-musb is the _child_ platform device, and we're accessing the child's driver data here. This seems to me to be a layering violation, and also rather racy when you consider that glue_to_musb() gets used from workqueue contexts (and I don't see a flush_workqueue() call in these stub drivers.) What if the new platform device gets unbound just before the workqueue fires? Another thing to note here is that platform_device_add_data() takes a copy of the data - in the case of omap2430, this is kzalloc'd but is pointlessly kept around until this driver is removed (via the devm_ mechanisms.) The last thing I don't like in these drivers is this: memset(musb_resources, 0x00, sizeof(*musb_resources) * ARRAY_SIZE(musb_resources)); musb_resources[0].name = pdev-resource[0].name; musb_resources[0].start = pdev-resource[0].start; musb_resources[0].end = pdev-resource[0].end; musb_resources[0].flags = pdev-resource[0].flags; musb_resources[1].name = pdev-resource[1].name; musb_resources[1].start = pdev-resource[1].start; musb_resources[1].end = pdev-resource[1].end; musb_resources[1].flags = pdev-resource[1].flags; musb_resources[2].name = pdev-resource[2].name; musb_resources[2].start = pdev-resource[2].start; musb_resources[2].end = pdev-resource[2].end; musb_resources[2].flags = pdev-resource[2].flags; ret = platform_device_add_resources(musb, musb_resources, ARRAY_SIZE(musb_resources)); A little knowledge of the driver model will reveal that the above is utterly pointless - and can be simplified to: ret = platform_device_add_resources(musb, pdev-resource, pdev-num_resources); as platform_device_add_resources() copies the passed resource structures via kmemdup() already. There's no reason for this device driver to make its own copy of the resources just to have them re-copied. It's also a lot safer in case fewer than three resources are supplied. It's also a lot less hastle if additional resources are added (like what's happened recently to these drivers.) 2. don't pass the of_node via the platform_device, but as part of the platform device's data. Is there any reason why musb isn't implemented as a library which these stub drivers can hook into? I'd much prefer (1), because it gets rid of the horrid dma_mask stuff in these drivers, turning it more into a conventional driver. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to
Re: dev-of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 05:17:29AM -0700, Greg Kroah-Hartman wrote: On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: 3. We could fix up all drivers that change the of_node. But there are ARM DT frameworks that require a device struct as parameter instead of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a driver core, initialized by a glue driver with DT bindings, has to set dev-of_node to use those frameworks. I think it is strange to have such DT framework interfaces if a driver is not supposed to overwrite dev-of_node permanently. How about any driver that does muck with this structure, restore it properly if their probe() function fails? Yes, you show that this is going to be tricky in some places (i.e. musb), but it makes sense that the burden of fixing this issue would rest on them, as they are the ones causing this problem, right? It's not about overwriting at all. It's quite simple: 1. OF creates a platform device and attaches an of_node to it. 2. This platform device is matched using the data in the of_node structure against one of the MUSB stub drivers. 3. The MUSB stub driver creates a new platform device, and copies the of_node to it, and registers it. 4. This new platform device _can_ itself be matched against the stub driver using the of_node structure. When this happens, go to step 2 and repeat 2-4. That's where the problem is - it's not about overwriting an existing platform device's of_node pointer with something that the driver has dreamt up at all. If we're lucky, step 3.5 would be match against the 'musb-hdrc' driver and successfully probe it which is the only thing that would break the above loop. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev->of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: > > Ok, so what do you suggest we do for stuff like this? Fix up the musb > > driver? Or something else? > > I think there are three options to solve this: > > 1. Break out of the driver list iteration loop as soon as a driver probe >function fails. This way there is no possibility for another driver >to be probed with a device struct that was changed by the first >driver. But it would remove the support to fall back to >another driver for a device. I am not aware of any device that is >supported by multiple drivers. No, that's not ok, lots of drivers say "I support all devices, send them to me!" and then fail their probe function when they realize they don't really support a specific device that was sent to them. So that wouldn't work at all, as you would break all of those situations. > 2. We could restore the device struct completely or partially (only >of_node) after a probe failure. This would avoid driver probes with >unclean device structs, but introduces some overhead. Overhead isn't an issue here, this is not "performance critical" code paths. But it would be messy. > 3. We could fix up all drivers that change the of_node. But there are >ARM DT frameworks that require a device struct as parameter instead >of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a >driver core, initialized by a glue driver with DT bindings, has to >set dev->of_node to use those frameworks. I think it is strange to >have such DT framework interfaces if a driver is not supposed to >overwrite dev->of_node permanently. How about any driver that does muck with this structure, restore it properly if their probe() function fails? Yes, you show that this is going to be tricky in some places (i.e. musb), but it makes sense that the burden of fixing this issue would rest on them, as they are the ones causing this problem, right? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev->of_node overwrite can cause device loading with different driver
On Fri, Sep 13, 2013 at 10:10:46AM -0700, Greg Kroah-Hartman wrote: > On Fri, Sep 13, 2013 at 05:53:31PM +0200, Markus Pargmann wrote: > > Hi, > > > > I ran into a serious problem with overwriting device of_node property as > > it is done in many drivers for ARM. If probing fails in such a > > situation, the device could be loaded with a different driver. > > > > This is an example situation, based on balbi's tag usb-for-v3.12: > > > > > > File drivers/usb/musb/musb_dsps.c: > > > > static int dsps_musb_init(struct musb *musb) > > { > > ... > > musb->xceiv = devm_usb_get_phy_by_phandle(dev, "phys", 0); > > if (IS_ERR(musb->xceiv)) > > return PTR_ERR(musb->xceiv); <-- This can return -EPROBE_DEFER > > ... > > } > > ... > > static int dsps_create_musb_pdev(struct dsps_glue *glue, > > struct platform_device *parent) > > { > > ... > > /* allocate the child platform device */ > > musb = platform_device_alloc("musb-hdrc", PLATFORM_DEVID_AUTO); > > if (!musb) { > > dev_err(dev, "failed to allocate musb device\n"); > > return -ENOMEM; > > } > > > > musb->dev.parent= dev; > > musb->dev.dma_mask = _dmamask; > > musb->dev.coherent_dma_mask = musb_dmamask; > > musb->dev.of_node = of_node_get(dn); <-- Overwrites the > > device of_node > > ... > > ret = platform_device_add(musb); > > ... > > } > > static int dsps_probe(struct platform_device *pdev) > > { > > ... > > ret = dsps_create_musb_pdev(glue, pdev); > > ... > > } > > > > > > File drivers/usb/musb/musb_core.c: > > > > static int > > musb_init_controller(struct device *dev, int nIrq, void __iomem *ctrl) > > { > > ... > > status = musb_platform_init(musb); <-- This calls dsps_musb_init > > if (status < 0) > > goto fail1; > > ... > > return status; > > > > } > > static int musb_probe(struct platform_device *pdev) > > { > > ... > > return musb_init_controller(dev, irq, base); > > } > > > > > > > > musb_dsps is a glue driver that creates a core device in the probe > > function and assigns it's own of_node to the new device. > > Starting at the platform_device_add call, this is the function call > > tree: > > > > platform_device_add() > > > > ... > > > > device_attach() in drivers/base/dd.c, which iterates through a list of > > drivers, calls __device_attach() on each of them. The list contains both > > drivers, musb_dsps and musb_core. This is where this example splits into > > two cases: > > > > 1. We find the first matching driver, musb_dsps: > > > > __device_attach() > > platform_match() /* for the musb_core, detecting a match. */ > > driver_probe_device() > > really_probe() > > musb_probe() is called, which returns -EPROBE_DEFER > > > > /* really_probe drops the return value and makes some cleanups > > */ > > > > 2. The error code does not reach the driver list iteration loop. It > > is not supposed to do so because the drivercore tries to find another > > matching driver. Now it tries the musb_dsps driver: > > > > __device_attach() > > platform_match() > > /* succeeds again, because the device has the > >of_node from its parent which claims that this > >is a musb_dsps device. */ > > driver_probe_device() > > really_probe() > > dsps_probe() ... /* from here on it starts from the > > beginning. */ > > > > This recursion continued until the kernel had no memory left. This is a > > special situation but there are many drivers that overwrite the of_node > > property in their probe function. So they can actually match with a > > different driver on the second or third probe attempt. > > Ok, so what do you suggest we do for stuff like this? Fix up the musb > driver? Or something else? I think there are three options to solve this: 1. Break out of the driver list iteration loop as soon as a driver probe function fails. This way there is no possibility for another driver to be probed with a device struct that was changed by the first driver. But it would remove the support to fall back to another driver for a device. I am not aware of any device that is supported by multiple drivers. 2. We could restore the device struct completely or partially (only of_node) after a probe failure. This would avoid driver probes with unclean device structs, but introduces some overhead. 3. We could fix up all drivers that change the of_node. But there are ARM DT frameworks that require a device struct as parameter instead of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a driver
Re: dev-of_node overwrite can cause device loading with different driver
On Fri, Sep 13, 2013 at 10:10:46AM -0700, Greg Kroah-Hartman wrote: On Fri, Sep 13, 2013 at 05:53:31PM +0200, Markus Pargmann wrote: Hi, I ran into a serious problem with overwriting device of_node property as it is done in many drivers for ARM. If probing fails in such a situation, the device could be loaded with a different driver. This is an example situation, based on balbi's tag usb-for-v3.12: File drivers/usb/musb/musb_dsps.c: static int dsps_musb_init(struct musb *musb) { ... musb-xceiv = devm_usb_get_phy_by_phandle(dev, phys, 0); if (IS_ERR(musb-xceiv)) return PTR_ERR(musb-xceiv); -- This can return -EPROBE_DEFER ... } ... static int dsps_create_musb_pdev(struct dsps_glue *glue, struct platform_device *parent) { ... /* allocate the child platform device */ musb = platform_device_alloc(musb-hdrc, PLATFORM_DEVID_AUTO); if (!musb) { dev_err(dev, failed to allocate musb device\n); return -ENOMEM; } musb-dev.parent= dev; musb-dev.dma_mask = musb_dmamask; musb-dev.coherent_dma_mask = musb_dmamask; musb-dev.of_node = of_node_get(dn); -- Overwrites the device of_node ... ret = platform_device_add(musb); ... } static int dsps_probe(struct platform_device *pdev) { ... ret = dsps_create_musb_pdev(glue, pdev); ... } File drivers/usb/musb/musb_core.c: static int musb_init_controller(struct device *dev, int nIrq, void __iomem *ctrl) { ... status = musb_platform_init(musb); -- This calls dsps_musb_init if (status 0) goto fail1; ... return status; } static int musb_probe(struct platform_device *pdev) { ... return musb_init_controller(dev, irq, base); } musb_dsps is a glue driver that creates a core device in the probe function and assigns it's own of_node to the new device. Starting at the platform_device_add call, this is the function call tree: platform_device_add() ... device_attach() in drivers/base/dd.c, which iterates through a list of drivers, calls __device_attach() on each of them. The list contains both drivers, musb_dsps and musb_core. This is where this example splits into two cases: 1. We find the first matching driver, musb_dsps: __device_attach() platform_match() /* for the musb_core, detecting a match. */ driver_probe_device() really_probe() musb_probe() is called, which returns -EPROBE_DEFER /* really_probe drops the return value and makes some cleanups */ 2. The error code does not reach the driver list iteration loop. It is not supposed to do so because the drivercore tries to find another matching driver. Now it tries the musb_dsps driver: __device_attach() platform_match() /* succeeds again, because the device has the of_node from its parent which claims that this is a musb_dsps device. */ driver_probe_device() really_probe() dsps_probe() ... /* from here on it starts from the beginning. */ This recursion continued until the kernel had no memory left. This is a special situation but there are many drivers that overwrite the of_node property in their probe function. So they can actually match with a different driver on the second or third probe attempt. Ok, so what do you suggest we do for stuff like this? Fix up the musb driver? Or something else? I think there are three options to solve this: 1. Break out of the driver list iteration loop as soon as a driver probe function fails. This way there is no possibility for another driver to be probed with a device struct that was changed by the first driver. But it would remove the support to fall back to another driver for a device. I am not aware of any device that is supported by multiple drivers. 2. We could restore the device struct completely or partially (only of_node) after a probe failure. This would avoid driver probes with unclean device structs, but introduces some overhead. 3. We could fix up all drivers that change the of_node. But there are ARM DT frameworks that require a device struct as parameter instead of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a driver core, initialized by a glue driver with DT bindings, has to set dev-of_node to use those frameworks. I think it is strange to have such DT framework interfaces if a driver is not supposed to overwrite dev-of_node permanently.
Re: dev-of_node overwrite can cause device loading with different driver
On Sat, Sep 14, 2013 at 09:16:53AM +0200, Markus Pargmann wrote: Ok, so what do you suggest we do for stuff like this? Fix up the musb driver? Or something else? I think there are three options to solve this: 1. Break out of the driver list iteration loop as soon as a driver probe function fails. This way there is no possibility for another driver to be probed with a device struct that was changed by the first driver. But it would remove the support to fall back to another driver for a device. I am not aware of any device that is supported by multiple drivers. No, that's not ok, lots of drivers say I support all devices, send them to me! and then fail their probe function when they realize they don't really support a specific device that was sent to them. So that wouldn't work at all, as you would break all of those situations. 2. We could restore the device struct completely or partially (only of_node) after a probe failure. This would avoid driver probes with unclean device structs, but introduces some overhead. Overhead isn't an issue here, this is not performance critical code paths. But it would be messy. 3. We could fix up all drivers that change the of_node. But there are ARM DT frameworks that require a device struct as parameter instead of a device_node parameter (e.g. soc-generic-dmaengine-pcm). So a driver core, initialized by a glue driver with DT bindings, has to set dev-of_node to use those frameworks. I think it is strange to have such DT framework interfaces if a driver is not supposed to overwrite dev-of_node permanently. How about any driver that does muck with this structure, restore it properly if their probe() function fails? Yes, you show that this is going to be tricky in some places (i.e. musb), but it makes sense that the burden of fixing this issue would rest on them, as they are the ones causing this problem, right? thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev->of_node overwrite can cause device loading with different driver
On Fri, Sep 13, 2013 at 05:53:31PM +0200, Markus Pargmann wrote: > Hi, > > I ran into a serious problem with overwriting device of_node property as > it is done in many drivers for ARM. If probing fails in such a > situation, the device could be loaded with a different driver. > > This is an example situation, based on balbi's tag usb-for-v3.12: > > > File drivers/usb/musb/musb_dsps.c: > > static int dsps_musb_init(struct musb *musb) > { > ... > musb->xceiv = devm_usb_get_phy_by_phandle(dev, "phys", 0); > if (IS_ERR(musb->xceiv)) > return PTR_ERR(musb->xceiv); <-- This can return -EPROBE_DEFER > ... > } > ... > static int dsps_create_musb_pdev(struct dsps_glue *glue, > struct platform_device *parent) > { > ... > /* allocate the child platform device */ > musb = platform_device_alloc("musb-hdrc", PLATFORM_DEVID_AUTO); > if (!musb) { > dev_err(dev, "failed to allocate musb device\n"); > return -ENOMEM; > } > > musb->dev.parent= dev; > musb->dev.dma_mask = _dmamask; > musb->dev.coherent_dma_mask = musb_dmamask; > musb->dev.of_node = of_node_get(dn); <-- Overwrites the > device of_node > ... > ret = platform_device_add(musb); > ... > } > static int dsps_probe(struct platform_device *pdev) > { > ... > ret = dsps_create_musb_pdev(glue, pdev); > ... > } > > > File drivers/usb/musb/musb_core.c: > > static int > musb_init_controller(struct device *dev, int nIrq, void __iomem *ctrl) > { > ... > status = musb_platform_init(musb); <-- This calls dsps_musb_init > if (status < 0) > goto fail1; > ... > return status; > > } > static int musb_probe(struct platform_device *pdev) > { > ... > return musb_init_controller(dev, irq, base); > } > > > > musb_dsps is a glue driver that creates a core device in the probe > function and assigns it's own of_node to the new device. > Starting at the platform_device_add call, this is the function call > tree: > > platform_device_add() > > ... > > device_attach() in drivers/base/dd.c, which iterates through a list of > drivers, calls __device_attach() on each of them. The list contains both > drivers, musb_dsps and musb_core. This is where this example splits into > two cases: > > 1. We find the first matching driver, musb_dsps: > > __device_attach() > platform_match() /* for the musb_core, detecting a match. */ > driver_probe_device() > really_probe() > musb_probe() is called, which returns -EPROBE_DEFER > > /* really_probe drops the return value and makes some cleanups */ > > 2. The error code does not reach the driver list iteration loop. It > is not supposed to do so because the drivercore tries to find another > matching driver. Now it tries the musb_dsps driver: > > __device_attach() > platform_match() > /* succeeds again, because the device has the > of_node from its parent which claims that this > is a musb_dsps device. */ > driver_probe_device() > really_probe() > dsps_probe() ... /* from here on it starts from the beginning. > */ > > This recursion continued until the kernel had no memory left. This is a > special situation but there are many drivers that overwrite the of_node > property in their probe function. So they can actually match with a > different driver on the second or third probe attempt. Ok, so what do you suggest we do for stuff like this? Fix up the musb driver? Or something else? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: dev-of_node overwrite can cause device loading with different driver
On Fri, Sep 13, 2013 at 05:53:31PM +0200, Markus Pargmann wrote: Hi, I ran into a serious problem with overwriting device of_node property as it is done in many drivers for ARM. If probing fails in such a situation, the device could be loaded with a different driver. This is an example situation, based on balbi's tag usb-for-v3.12: File drivers/usb/musb/musb_dsps.c: static int dsps_musb_init(struct musb *musb) { ... musb-xceiv = devm_usb_get_phy_by_phandle(dev, phys, 0); if (IS_ERR(musb-xceiv)) return PTR_ERR(musb-xceiv); -- This can return -EPROBE_DEFER ... } ... static int dsps_create_musb_pdev(struct dsps_glue *glue, struct platform_device *parent) { ... /* allocate the child platform device */ musb = platform_device_alloc(musb-hdrc, PLATFORM_DEVID_AUTO); if (!musb) { dev_err(dev, failed to allocate musb device\n); return -ENOMEM; } musb-dev.parent= dev; musb-dev.dma_mask = musb_dmamask; musb-dev.coherent_dma_mask = musb_dmamask; musb-dev.of_node = of_node_get(dn); -- Overwrites the device of_node ... ret = platform_device_add(musb); ... } static int dsps_probe(struct platform_device *pdev) { ... ret = dsps_create_musb_pdev(glue, pdev); ... } File drivers/usb/musb/musb_core.c: static int musb_init_controller(struct device *dev, int nIrq, void __iomem *ctrl) { ... status = musb_platform_init(musb); -- This calls dsps_musb_init if (status 0) goto fail1; ... return status; } static int musb_probe(struct platform_device *pdev) { ... return musb_init_controller(dev, irq, base); } musb_dsps is a glue driver that creates a core device in the probe function and assigns it's own of_node to the new device. Starting at the platform_device_add call, this is the function call tree: platform_device_add() ... device_attach() in drivers/base/dd.c, which iterates through a list of drivers, calls __device_attach() on each of them. The list contains both drivers, musb_dsps and musb_core. This is where this example splits into two cases: 1. We find the first matching driver, musb_dsps: __device_attach() platform_match() /* for the musb_core, detecting a match. */ driver_probe_device() really_probe() musb_probe() is called, which returns -EPROBE_DEFER /* really_probe drops the return value and makes some cleanups */ 2. The error code does not reach the driver list iteration loop. It is not supposed to do so because the drivercore tries to find another matching driver. Now it tries the musb_dsps driver: __device_attach() platform_match() /* succeeds again, because the device has the of_node from its parent which claims that this is a musb_dsps device. */ driver_probe_device() really_probe() dsps_probe() ... /* from here on it starts from the beginning. */ This recursion continued until the kernel had no memory left. This is a special situation but there are many drivers that overwrite the of_node property in their probe function. So they can actually match with a different driver on the second or third probe attempt. Ok, so what do you suggest we do for stuff like this? Fix up the musb driver? Or something else? thanks, greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/