Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Fri, Aug 07, 2020 at 01:02:44PM +0200, Thierry Reding wrote:
> On Thu, Aug 06, 2020 at 07:09:16PM -0700, John Stultz wrote:
> > On Thu, Aug 6, 2020 at 6:52 AM Thierry Reding wrote:
> > >
> > > On Wed, Apr 22, 2020 at 08:32:43PM +, John Stultz wrote:
> > > > This patch addresses a regression in 5.7-rc1+
> > > >
> > > > In commit c8c43cee29f6 ("driver core: Fix
> > > > driver_deferred_probe_check_state() logic"), we both cleaned up
> > > > the logic and also set the default driver_deferred_probe_timeout
> > > > value to 30 seconds to allow for drivers that are missing
> > > > dependencies to have some time so that the dependency may be
> > > > loaded from userland after initcalls_done is set.
> > > >
> > > > However, Yoshihiro Shimoda reported that on his device that
> > > > expects to have unmet dependencies (due to "optional links" in
> > > > its devicetree), was failing to mount the NFS root.
> > > >
> > > > In digging further, it seemed the problem was that while the
> > > > device properly probes after waiting 30 seconds for any missing
> > > > modules to load, the ip_auto_config() had already failed,
> > > > resulting in NFS to fail. This was due to ip_auto_config()
> > > > calling wait_for_device_probe() which doesn't wait for the
> > > > driver_deferred_probe_timeout to fire.
> > > >
> > > > Fixing that issue is possible, but could also introduce 30
> > > > second delays in bootups for users who don't have any
> > > > missing dependencies, which is not ideal.
> > > >
> > > > So I think the best solution to avoid any regressions is to
> > > > revert back to a default timeout value of zero, and allow
> > > > systems that need to utilize the timeout in order for userland
> > > > to load any modules that supply missing dependencies in the dts
> > > > to specify the timeout length via the existing documented boot
> > > > argument.
> > > >
> > > > Thanks to Geert for chasing down that ip_auto_config was why NFS
> > > > was failing in this case!
> > > >
> > > > Cc: "David S. Miller"
> > > > Cc: Alexey Kuznetsov
> > > > Cc: Hideaki YOSHIFUJI
> > > > Cc: Jakub Kicinski
> > > > Cc: Greg Kroah-Hartman
> > > > Cc: Rafael J. Wysocki
> > > > Cc: Rob Herring
> > > > Cc: Geert Uytterhoeven
> > > > Cc: Yoshihiro Shimoda
> > > > Cc: Robin Murphy
> > > > Cc: Andy Shevchenko
> > > > Cc: Sudeep Holla
> > > > Cc: Andy Shevchenko
> > > > Cc: Naresh Kamboju
> > > > Cc: Basil Eljuse
> > > > Cc: Ferry Toth
> > > > Cc: Arnd Bergmann
> > > > Cc: Anders Roxell
> > > > Cc: netdev
> > > > Cc: linux...@vger.kernel.org
> > > > Reported-by: Yoshihiro Shimoda
> > > > Tested-by: Yoshihiro Shimoda
> > > > Fixes: c8c43cee29f6 ("driver core: Fix
> > > > driver_deferred_probe_check_state() logic")
> > > > Signed-off-by: John Stultz
> > > > ---
> > > >  drivers/base/dd.c | 13 ++---
> > > >  1 file changed, 2 insertions(+), 11 deletions(-)
> > >
> > > Sorry for being a bit late to the party, but this breaks suspend/resume
> > > support on various Tegra devices. I've only noticed now because, well,
> > > suspend/resume have been broken for other reasons for a little while and
> > > it's taken us a bit to resolve those issues.
> > >
> > > But now that those other issues have been fixed, I've started seeing an
> > > issue where after resume from suspend some of the I2C controllers are no
> > > longer working. The reason for this is that they share pins with DP AUX
> > > controllers via the pinctrl framework. The DP AUX driver registers as
> > > part of the DRM/KMS driver, which usually happens in userspace. Since
> > > the deferred probe timeout was set to 0 by default this no longer works
> > > because no pinctrl states are assigned to the I2C controller and
> > > therefore upon resume the pins cannot be configured for I2C operation.
> >
> > Oof. My apologies!
> >
> > > I'm also somewhat confused by this patch and a few before because they
> > > claim that they restore previous default behaviour, but that's just not
> > > true. Originally when this timeout was introduced it was -1, which meant
> > > that there was no timeout at all and hence users had to opt-in if they
> > > wanted to use a deferred probe timeout.
> >
> > I don't think that's quite true, since the point of my original
> > changes were to avoid troubles I was seeing with drivers not loading
> > because once the timeout fired after init, driver loading would fail
> > with ENODEV instead of returning EPROBE_DEFER. The logic that existed
> > was buggy so the timeout handling didn't really work (changing the
> > boot argument wouldn't help, because after init the logic would return
> > ENODEV before it checked the timeout value).
> >
> > That said, looking at it now, I do realize the
> > driver_deferred_probe_check_state_continue() logic in effect never
> > returned ETIMEDOUT before it was consolidated in the earlier changes, and
> > now we've backed the default timeout to 0, old users (see bec6c0ecb243)
> > will now get ETIMEDOUT where th
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Thu, Aug 06, 2020 at 07:09:16PM -0700, John Stultz wrote:
> On Thu, Aug 6, 2020 at 6:52 AM Thierry Reding wrote:
> >
> > On Wed, Apr 22, 2020 at 08:32:43PM +, John Stultz wrote:
> > > This patch addresses a regression in 5.7-rc1+
> > >
> > > In commit c8c43cee29f6 ("driver core: Fix
> > > driver_deferred_probe_check_state() logic"), we both cleaned up
> > > the logic and also set the default driver_deferred_probe_timeout
> > > value to 30 seconds to allow for drivers that are missing
> > > dependencies to have some time so that the dependency may be
> > > loaded from userland after initcalls_done is set.
> > >
> > > However, Yoshihiro Shimoda reported that on his device that
> > > expects to have unmet dependencies (due to "optional links" in
> > > its devicetree), was failing to mount the NFS root.
> > >
> > > In digging further, it seemed the problem was that while the
> > > device properly probes after waiting 30 seconds for any missing
> > > modules to load, the ip_auto_config() had already failed,
> > > resulting in NFS to fail. This was due to ip_auto_config()
> > > calling wait_for_device_probe() which doesn't wait for the
> > > driver_deferred_probe_timeout to fire.
> > >
> > > Fixing that issue is possible, but could also introduce 30
> > > second delays in bootups for users who don't have any
> > > missing dependencies, which is not ideal.
> > >
> > > So I think the best solution to avoid any regressions is to
> > > revert back to a default timeout value of zero, and allow
> > > systems that need to utilize the timeout in order for userland
> > > to load any modules that supply missing dependencies in the dts
> > > to specify the timeout length via the existing documented boot
> > > argument.
> > >
> > > Thanks to Geert for chasing down that ip_auto_config was why NFS
> > > was failing in this case!
> > >
> > > Cc: "David S. Miller"
> > > Cc: Alexey Kuznetsov
> > > Cc: Hideaki YOSHIFUJI
> > > Cc: Jakub Kicinski
> > > Cc: Greg Kroah-Hartman
> > > Cc: Rafael J. Wysocki
> > > Cc: Rob Herring
> > > Cc: Geert Uytterhoeven
> > > Cc: Yoshihiro Shimoda
> > > Cc: Robin Murphy
> > > Cc: Andy Shevchenko
> > > Cc: Sudeep Holla
> > > Cc: Andy Shevchenko
> > > Cc: Naresh Kamboju
> > > Cc: Basil Eljuse
> > > Cc: Ferry Toth
> > > Cc: Arnd Bergmann
> > > Cc: Anders Roxell
> > > Cc: netdev
> > > Cc: linux...@vger.kernel.org
> > > Reported-by: Yoshihiro Shimoda
> > > Tested-by: Yoshihiro Shimoda
> > > Fixes: c8c43cee29f6 ("driver core: Fix
> > > driver_deferred_probe_check_state() logic")
> > > Signed-off-by: John Stultz
> > > ---
> > >  drivers/base/dd.c | 13 ++---
> > >  1 file changed, 2 insertions(+), 11 deletions(-)
> >
> > Sorry for being a bit late to the party, but this breaks suspend/resume
> > support on various Tegra devices. I've only noticed now because, well,
> > suspend/resume have been broken for other reasons for a little while and
> > it's taken us a bit to resolve those issues.
> >
> > But now that those other issues have been fixed, I've started seeing an
> > issue where after resume from suspend some of the I2C controllers are no
> > longer working. The reason for this is that they share pins with DP AUX
> > controllers via the pinctrl framework. The DP AUX driver registers as
> > part of the DRM/KMS driver, which usually happens in userspace. Since
> > the deferred probe timeout was set to 0 by default this no longer works
> > because no pinctrl states are assigned to the I2C controller and
> > therefore upon resume the pins cannot be configured for I2C operation.
>
> Oof. My apologies!
>
> > I'm also somewhat confused by this patch and a few before because they
> > claim that they restore previous default behaviour, but that's just not
> > true. Originally when this timeout was introduced it was -1, which meant
> > that there was no timeout at all and hence users had to opt-in if they
> > wanted to use a deferred probe timeout.
>
> I don't think that's quite true, since the point of my original
> changes were to avoid troubles I was seeing with drivers not loading
> because once the timeout fired after init, driver loading would fail
> with ENODEV instead of returning EPROBE_DEFER. The logic that existed
> was buggy so the timeout handling didn't really work (changing the
> boot argument wouldn't help, because after init the logic would return
> ENODEV before it checked the timeout value).
>
> That said, looking at it now, I do realize the
> driver_deferred_probe_check_state_continue() logic in effect never
> returned ETIMEDOUT before it was consolidated in the earlier changes, and
> now we've backed the default timeout to 0, old users (see bec6c0ecb243)
> will now get ETIMEDOUT where they wouldn't before.
>
> So would the following fix it up for you? (sorry it's whitespace corrupted)
>
> diff --git a/drivers/pinctrl/devicetree.c b/drivers/pinctrl/devicetree.c
> index c6fe7d64c913..c7448be64d07 100644
> --- a/drivers/pinctrl/devicetree.c
> +++ b/drive
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Thu, Aug 6, 2020 at 6:52 AM Thierry Reding wrote:
>
> On Wed, Apr 22, 2020 at 08:32:43PM +, John Stultz wrote:
> > This patch addresses a regression in 5.7-rc1+
> >
> > In commit c8c43cee29f6 ("driver core: Fix
> > driver_deferred_probe_check_state() logic"), we both cleaned up
> > the logic and also set the default driver_deferred_probe_timeout
> > value to 30 seconds to allow for drivers that are missing
> > dependencies to have some time so that the dependency may be
> > loaded from userland after initcalls_done is set.
> >
> > However, Yoshihiro Shimoda reported that on his device that
> > expects to have unmet dependencies (due to "optional links" in
> > its devicetree), was failing to mount the NFS root.
> >
> > In digging further, it seemed the problem was that while the
> > device properly probes after waiting 30 seconds for any missing
> > modules to load, the ip_auto_config() had already failed,
> > resulting in NFS to fail. This was due to ip_auto_config()
> > calling wait_for_device_probe() which doesn't wait for the
> > driver_deferred_probe_timeout to fire.
> >
> > Fixing that issue is possible, but could also introduce 30
> > second delays in bootups for users who don't have any
> > missing dependencies, which is not ideal.
> >
> > So I think the best solution to avoid any regressions is to
> > revert back to a default timeout value of zero, and allow
> > systems that need to utilize the timeout in order for userland
> > to load any modules that supply missing dependencies in the dts
> > to specify the timeout length via the existing documented boot
> > argument.
> >
> > Thanks to Geert for chasing down that ip_auto_config was why NFS
> > was failing in this case!
> >
> > Cc: "David S. Miller"
> > Cc: Alexey Kuznetsov
> > Cc: Hideaki YOSHIFUJI
> > Cc: Jakub Kicinski
> > Cc: Greg Kroah-Hartman
> > Cc: Rafael J. Wysocki
> > Cc: Rob Herring
> > Cc: Geert Uytterhoeven
> > Cc: Yoshihiro Shimoda
> > Cc: Robin Murphy
> > Cc: Andy Shevchenko
> > Cc: Sudeep Holla
> > Cc: Andy Shevchenko
> > Cc: Naresh Kamboju
> > Cc: Basil Eljuse
> > Cc: Ferry Toth
> > Cc: Arnd Bergmann
> > Cc: Anders Roxell
> > Cc: netdev
> > Cc: linux...@vger.kernel.org
> > Reported-by: Yoshihiro Shimoda
> > Tested-by: Yoshihiro Shimoda
> > Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
> > logic")
> > Signed-off-by: John Stultz
> > ---
> >  drivers/base/dd.c | 13 ++---
> >  1 file changed, 2 insertions(+), 11 deletions(-)
>
> Sorry for being a bit late to the party, but this breaks suspend/resume
> support on various Tegra devices. I've only noticed now because, well,
> suspend/resume have been broken for other reasons for a little while and
> it's taken us a bit to resolve those issues.
>
> But now that those other issues have been fixed, I've started seeing an
> issue where after resume from suspend some of the I2C controllers are no
> longer working. The reason for this is that they share pins with DP AUX
> controllers via the pinctrl framework. The DP AUX driver registers as
> part of the DRM/KMS driver, which usually happens in userspace. Since
> the deferred probe timeout was set to 0 by default this no longer works
> because no pinctrl states are assigned to the I2C controller and
> therefore upon resume the pins cannot be configured for I2C operation.

Oof. My apologies!

> I'm also somewhat confused by this patch and a few before because they
> claim that they restore previous default behaviour, but that's just not
> true. Originally when this timeout was introduced it was -1, which meant
> that there was no timeout at all and hence users had to opt-in if they
> wanted to use a deferred probe timeout.

I don't think that's quite true, since the point of my original
changes were to avoid troubles I was seeing with drivers not loading
because once the timeout fired after init, driver loading would fail
with ENODEV instead of returning EPROBE_DEFER. The logic that existed
was buggy so the timeout handling didn't really work (changing the
boot argument wouldn't help, because after init the logic would return
ENODEV before it checked the timeout value).

That said, looking at it now, I do realize the
driver_deferred_probe_check_state_continue() logic in effect never
returned ETIMEDOUT before it was consolidated in the earlier changes, and
now we've backed the default timeout to 0, old users (see bec6c0ecb243)
will now get ETIMEDOUT where they wouldn't before.

So would the following fix it up for you? (sorry it's whitespace corrupted)

diff --git a/drivers/pinctrl/devicetree.c b/drivers/pinctrl/devicetree.c
index c6fe7d64c913..c7448be64d07 100644
--- a/drivers/pinctrl/devicetree.c
+++ b/drivers/pinctrl/devicetree.c
@@ -129,9 +129,8 @@ static int dt_to_map_one_config(struct pinctrl *p,
                if (!np_pctldev || of_node_is_root(np_pctldev)) {
                        of_node_put(np_pctldev);
                        ret = driver_deferred_probe_check_state(p->dev
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Wed, Apr 22, 2020 at 08:32:43PM +, John Stultz wrote:
> This patch addresses a regression in 5.7-rc1+
>
> In commit c8c43cee29f6 ("driver core: Fix
> driver_deferred_probe_check_state() logic"), we both cleaned up
> the logic and also set the default driver_deferred_probe_timeout
> value to 30 seconds to allow for drivers that are missing
> dependencies to have some time so that the dependency may be
> loaded from userland after initcalls_done is set.
>
> However, Yoshihiro Shimoda reported that on his device that
> expects to have unmet dependencies (due to "optional links" in
> its devicetree), was failing to mount the NFS root.
>
> In digging further, it seemed the problem was that while the
> device properly probes after waiting 30 seconds for any missing
> modules to load, the ip_auto_config() had already failed,
> resulting in NFS to fail. This was due to ip_auto_config()
> calling wait_for_device_probe() which doesn't wait for the
> driver_deferred_probe_timeout to fire.
>
> Fixing that issue is possible, but could also introduce 30
> second delays in bootups for users who don't have any
> missing dependencies, which is not ideal.
>
> So I think the best solution to avoid any regressions is to
> revert back to a default timeout value of zero, and allow
> systems that need to utilize the timeout in order for userland
> to load any modules that supply missing dependencies in the dts
> to specify the timeout length via the existing documented boot
> argument.
>
> Thanks to Geert for chasing down that ip_auto_config was why NFS
> was failing in this case!
>
> Cc: "David S. Miller"
> Cc: Alexey Kuznetsov
> Cc: Hideaki YOSHIFUJI
> Cc: Jakub Kicinski
> Cc: Greg Kroah-Hartman
> Cc: Rafael J. Wysocki
> Cc: Rob Herring
> Cc: Geert Uytterhoeven
> Cc: Yoshihiro Shimoda
> Cc: Robin Murphy
> Cc: Andy Shevchenko
> Cc: Sudeep Holla
> Cc: Andy Shevchenko
> Cc: Naresh Kamboju
> Cc: Basil Eljuse
> Cc: Ferry Toth
> Cc: Arnd Bergmann
> Cc: Anders Roxell
> Cc: netdev
> Cc: linux...@vger.kernel.org
> Reported-by: Yoshihiro Shimoda
> Tested-by: Yoshihiro Shimoda
> Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
> logic")
> Signed-off-by: John Stultz
> ---
>  drivers/base/dd.c | 13 ++---
>  1 file changed, 2 insertions(+), 11 deletions(-)

Sorry for being a bit late to the party, but this breaks suspend/resume
support on various Tegra devices. I've only noticed now because, well,
suspend/resume have been broken for other reasons for a little while and
it's taken us a bit to resolve those issues.

But now that those other issues have been fixed, I've started seeing an
issue where after resume from suspend some of the I2C controllers are no
longer working. The reason for this is that they share pins with DP AUX
controllers via the pinctrl framework. The DP AUX driver registers as
part of the DRM/KMS driver, which usually happens in userspace. Since
the deferred probe timeout was set to 0 by default this no longer works
because no pinctrl states are assigned to the I2C controller and
therefore upon resume the pins cannot be configured for I2C operation.

I'm also somewhat confused by this patch and a few before because they
claim that they restore previous default behaviour, but that's just not
true. Originally when this timeout was introduced it was -1, which meant
that there was no timeout at all and hence users had to opt-in if they
wanted to use a deferred probe timeout. But now after this series the
default is for there to be a very short timeout, which in turn causes
existing use-cases to potentially break.

I'm also going to suggest here that in most cases a driver will require
the resources that it asks for, so the case that Yoshihiro described and
that this patch is meant to fix sounds to me like it's the odd one out
rather than the other way around.

But I realize that that's not very constructive. So perhaps we can find
some other way for drivers to advertise that their dependencies are
optional? I came up with the below patch, which restores suspend/resume
on Tegra and could be used in conjunction with a patch that opts into
this behaviour for the problematic driver in Yoshihiro's case to make
this again work for everyone.

--- >8 ---
From a95f8f41b8a32dee3434db4f0515af7376d1873a Mon Sep 17 00:00:00 2001
From: Thierry Reding
Date: Thu, 6 Aug 2020 14:51:59 +0200
Subject: [PATCH] driver core: Do not ignore dependencies by default

Many drivers do require the resources that they ask for and timing out
may not always be an option. While there is a way to allow probing to
continue to be deferred for some time after the system has booted, the
fact that this is controlled via a command-line parameter is undesired
because it requires manual intervention, whereas it can be avoided in
the majority of cases.

Instead of requiring users to edit the kernel command-line, add a way
for drivers to specify whether or not their dependencies are optional so
that the
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Wed, Apr 29, 2020 at 6:52 AM Mark Brown wrote:
> On Wed, Apr 29, 2020 at 03:46:04PM +0200, Marek Szyprowski wrote:
> > On 22.04.2020 22:32, John Stultz wrote:
>
> > > Fixes: c8c43cee29f6 ("driver core: Fix
> > > driver_deferred_probe_check_state() logic")
> > > Signed-off-by: John Stultz
>
> > Please also revert dca0b44957e5 "regulator: Use
> > driver_deferred_probe_timeout for regulator_init_complete_work" then,
> > because now with the default 0 timeout some regulators get disabled
> > during boot, before their supplies get instantiated.
>
> Yes, please - I requested this when the revert was originally proposed :(

Oh, my apologies. I misunderstood what you were suggesting earlier.
Sorry for being thick headed. I'll spin up a revert here shortly.

thanks
-john
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
On Wed, Apr 29, 2020 at 03:46:04PM +0200, Marek Szyprowski wrote:
> On 22.04.2020 22:32, John Stultz wrote:

> > Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
> > logic")
> > Signed-off-by: John Stultz

> Please also revert dca0b44957e5 "regulator: Use
> driver_deferred_probe_timeout for regulator_init_complete_work" then,
> because now with the default 0 timeout some regulators get disabled
> during boot, before their supplies get instantiated.

Yes, please - I requested this when the revert was originally proposed :(
Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
Hi John,

On 22.04.2020 22:32, John Stultz wrote:
> This patch addresses a regression in 5.7-rc1+
>
> In commit c8c43cee29f6 ("driver core: Fix
> driver_deferred_probe_check_state() logic"), we both cleaned up
> the logic and also set the default driver_deferred_probe_timeout
> value to 30 seconds to allow for drivers that are missing
> dependencies to have some time so that the dependency may be
> loaded from userland after initcalls_done is set.
>
> However, Yoshihiro Shimoda reported that on his device that
> expects to have unmet dependencies (due to "optional links" in
> its devicetree), was failing to mount the NFS root.
>
> In digging further, it seemed the problem was that while the
> device properly probes after waiting 30 seconds for any missing
> modules to load, the ip_auto_config() had already failed,
> resulting in NFS to fail. This was due to ip_auto_config()
> calling wait_for_device_probe() which doesn't wait for the
> driver_deferred_probe_timeout to fire.
>
> Fixing that issue is possible, but could also introduce 30
> second delays in bootups for users who don't have any
> missing dependencies, which is not ideal.
>
> So I think the best solution to avoid any regressions is to
> revert back to a default timeout value of zero, and allow
> systems that need to utilize the timeout in order for userland
> to load any modules that supply missing dependencies in the dts
> to specify the timeout length via the existing documented boot
> argument.
>
> Thanks to Geert for chasing down that ip_auto_config was why NFS
> was failing in this case!
>
> Cc: "David S. Miller"
> Cc: Alexey Kuznetsov
> Cc: Hideaki YOSHIFUJI
> Cc: Jakub Kicinski
> Cc: Greg Kroah-Hartman
> Cc: Rafael J. Wysocki
> Cc: Rob Herring
> Cc: Geert Uytterhoeven
> Cc: Yoshihiro Shimoda
> Cc: Robin Murphy
> Cc: Andy Shevchenko
> Cc: Sudeep Holla
> Cc: Andy Shevchenko
> Cc: Naresh Kamboju
> Cc: Basil Eljuse
> Cc: Ferry Toth
> Cc: Arnd Bergmann
> Cc: Anders Roxell
> Cc: netdev
> Cc: linux...@vger.kernel.org
> Reported-by: Yoshihiro Shimoda
> Tested-by: Yoshihiro Shimoda
> Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
> logic")
> Signed-off-by: John Stultz

Please also revert dca0b44957e5 "regulator: Use
driver_deferred_probe_timeout for regulator_init_complete_work" then,
because now with the default 0 timeout some regulators get disabled
during boot, before their supplies get instantiated.

This patch broke booting of Samsung Exynos5800-based Peach-Pi Chromebook
with the default multi_v7_defconfig.

> ---
>  drivers/base/dd.c | 13 ++---
>  1 file changed, 2 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 06ec0e851fa1..908ae4d7805e 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -224,16 +224,7 @@ static int deferred_devs_show(struct seq_file *s, void *data)
>  }
>  DEFINE_SHOW_ATTRIBUTE(deferred_devs);
>
> -#ifdef CONFIG_MODULES
> -/*
> - * In the case of modules, set the default probe timeout to
> - * 30 seconds to give userland some time to load needed modules
> - */
> -int driver_deferred_probe_timeout = 30;
> -#else
> -/* In the case of !modules, no probe timeout needed */
> -int driver_deferred_probe_timeout = -1;
> -#endif
> +int driver_deferred_probe_timeout;
>  EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
>
>  static int __init deferred_probe_timeout_setup(char *str)
> @@ -266,7 +257,7 @@ int driver_deferred_probe_check_state(struct device *dev)
>  		return -ENODEV;
>  	}
>
> -	if (!driver_deferred_probe_timeout) {
> +	if (!driver_deferred_probe_timeout && initcalls_done) {
>  		dev_WARN(dev, "deferred probe timeout, ignoring dependency");
>  		return -ETIMEDOUT;
>  	}

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
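
For readers following the thread, the decision made by the patched driver_deferred_probe_check_state() can be modelled in plain userspace C. This is only a rough sketch, not the kernel source: struct probe_env and model_check_state() are hypothetical names, the condition guarding the -ENODEV branch is an assumption (the hunk above shows only its body), and EPROBE_DEFER is defined locally because it is a kernel-internal errno.

```c
/* Userspace model of the post-patch decision; not kernel code. */
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno, defined locally for this model. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for the globals the kernel function reads. */
struct probe_env {
	bool modules_enabled; /* assumed to mirror CONFIG_MODULES */
	bool initcalls_done;  /* set once built-in initcalls have run */
	int timeout;          /* driver_deferred_probe_timeout; default 0 */
};

int model_check_state(const struct probe_env *env)
{
	/* Assumption: the hunk shows only "return -ENODEV;", not this test. */
	if (!env->modules_enabled && env->initcalls_done)
		return -ENODEV;

	/* Condition taken from the hunk: zero timeout only matters after init. */
	if (!env->timeout && env->initcalls_done)
		return -ETIMEDOUT;

	/* Otherwise the dependency may still appear; keep deferring. */
	return -EPROBE_DEFER;
}
```

With the new default of 0, a modular kernel returns -ETIMEDOUT for any dependency still missing once initcalls are done, which appears to be the situation described for the Tegra pinctrl and Exynos regulator cases; a non-zero timeout keeps returning -EPROBE_DEFER in this model.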