Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/15/14 at 11:55am, Vivek Goyal wrote: > On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote: > > > watchdog and crash dump really conflicts to some degree, from the watchdog > > > point of view it can reboot system whhen kdump kernel hangs. But from > > > kdump > > > point of view it want ensure saving the vmcore for later debugging. > > > > > > Maybe we can only select only one in this case. > > > > You want to be able to make a decision at runtime which to use. > > > > > > - if it can be stopped, you can open and stop it > > > > > > For the last one since crashing happens we have no chance to open and > > > stop. > > > > When you decide you need to set up to catch a core rather than just > > crash you can open and stop the watchdog (if supported), and you can then > > set up for a kdump and then at some point later if it crashes capture the > > dump. > > Disabling watchdog if kdump serice starts will not make many happy. If > kernel hangs, we don't have a functionality to reboot it. > > What about other idea of keeping watchdog interval long enough that new > kernel can boot, driver can load and then new driver/user space can > continue to kick the watchdog. And if second kernel hangs, watchdog will > reboot the system. I would agree this way. Thanks Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/15/14 at 11:46am, Vivek Goyal wrote: > On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote: > > [..] > > > I thought this problem was resolved (atleast conceptually) last time > > > when Don Zickus brought it up. > > > > > > He mentioned that it was concluded that keep watchdog interval long > > > enough, say 60 seconds and keep on kicking it fast enough, say every > > > 10-20 seconds. That would ensure that after the crash, there is atleast > > > 60 - 20 = 40 seconds left before watchdog expires. And in that duration > > > we should try to boot into second kernel load watchdog driver early enough > > > from initramfs which can start kicking watchdog again. > > > > Some drivers did stop the watchdog while module loading such as iTCO_wdt. > > Instead of stopping why not keep on kicking it till user space takes > over this job. This will also make sure that if kdump kernel hangs, > watchdog wil do the job it is supposed to do? Yes, rethinking about this problem, kicking it is better than stopping it. But we will also have more uncertern time between userspace kicking before and after panic. Thanks Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote: > > watchdog and crash dump really conflicts to some degree, from the watchdog > > point of view it can reboot system whhen kdump kernel hangs. But from kdump > > point of view it want ensure saving the vmcore for later debugging. > > > > Maybe we can only select only one in this case. > > You want to be able to make a decision at runtime which to use. > > > > - if it can be stopped, you can open and stop it > > > > For the last one since crashing happens we have no chance to open and stop. > > When you decide you need to set up to catch a core rather than just > crash you can open and stop the watchdog (if supported), and you can then > set up for a kdump and then at some point later if it crashes capture the > dump. Disabling watchdog if kdump serice starts will not make many happy. If kernel hangs, we don't have a functionality to reboot it. What about other idea of keeping watchdog interval long enough that new kernel can boot, driver can load and then new driver/user space can continue to kick the watchdog. And if second kernel hangs, watchdog will reboot the system. Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote: [..] > > I thought this problem was resolved (atleast conceptually) last time > > when Don Zickus brought it up. > > > > He mentioned that it was concluded that keep watchdog interval long > > enough, say 60 seconds and keep on kicking it fast enough, say every > > 10-20 seconds. That would ensure that after the crash, there is atleast > > 60 - 20 = 40 seconds left before watchdog expires. And in that duration > > we should try to boot into second kernel load watchdog driver early enough > > from initramfs which can start kicking watchdog again. > > Some drivers did stop the watchdog while module loading such as iTCO_wdt. Instead of stopping why not keep on kicking it till user space takes over this job. This will also make sure that if kdump kernel hangs, watchdog wil do the job it is supposed to do? Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
> watchdog and crash dump really conflicts to some degree, from the watchdog > point of view it can reboot system whhen kdump kernel hangs. But from kdump > point of view it want ensure saving the vmcore for later debugging. > > Maybe we can only select only one in this case. You want to be able to make a decision at runtime which to use. > > - if it can be stopped, you can open and stop it > > For the last one since crashing happens we have no chance to open and stop. When you decide you need to set up to catch a core rather than just crash you can open and stop the watchdog (if supported), and you can then set up for a kdump and then at some point later if it crashes capture the dump. As you say the two are basically incompatible models of operating, but that also means if you are about to take a dump you want ensure you will not be disturbed. So as far as I can see if you might need to take a dump, turn the watchdog off in advance. Worrying about it as you take a dump is too late. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
watchdog and crash dump really conflicts to some degree, from the watchdog point of view it can reboot system whhen kdump kernel hangs. But from kdump point of view it want ensure saving the vmcore for later debugging. Maybe we can only select only one in this case. You want to be able to make a decision at runtime which to use. - if it can be stopped, you can open and stop it For the last one since crashing happens we have no chance to open and stop. When you decide you need to set up to catch a core rather than just crash you can open and stop the watchdog (if supported), and you can then set up for a kdump and then at some point later if it crashes capture the dump. As you say the two are basically incompatible models of operating, but that also means if you are about to take a dump you want ensure you will not be disturbed. So as far as I can see if you might need to take a dump, turn the watchdog off in advance. Worrying about it as you take a dump is too late. Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote: [..] I thought this problem was resolved (atleast conceptually) last time when Don Zickus brought it up. He mentioned that it was concluded that keep watchdog interval long enough, say 60 seconds and keep on kicking it fast enough, say every 10-20 seconds. That would ensure that after the crash, there is atleast 60 - 20 = 40 seconds left before watchdog expires. And in that duration we should try to boot into second kernel load watchdog driver early enough from initramfs which can start kicking watchdog again. Some drivers did stop the watchdog while module loading such as iTCO_wdt. Instead of stopping why not keep on kicking it till user space takes over this job. This will also make sure that if kdump kernel hangs, watchdog wil do the job it is supposed to do? Thanks Vivek -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote: watchdog and crash dump really conflicts to some degree, from the watchdog point of view it can reboot system whhen kdump kernel hangs. But from kdump point of view it want ensure saving the vmcore for later debugging. Maybe we can only select only one in this case. You want to be able to make a decision at runtime which to use. - if it can be stopped, you can open and stop it For the last one since crashing happens we have no chance to open and stop. When you decide you need to set up to catch a core rather than just crash you can open and stop the watchdog (if supported), and you can then set up for a kdump and then at some point later if it crashes capture the dump. Disabling watchdog if kdump serice starts will not make many happy. If kernel hangs, we don't have a functionality to reboot it. What about other idea of keeping watchdog interval long enough that new kernel can boot, driver can load and then new driver/user space can continue to kick the watchdog. And if second kernel hangs, watchdog will reboot the system. Thanks Vivek -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/15/14 at 11:46am, Vivek Goyal wrote: On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote: [..] I thought this problem was resolved (atleast conceptually) last time when Don Zickus brought it up. He mentioned that it was concluded that keep watchdog interval long enough, say 60 seconds and keep on kicking it fast enough, say every 10-20 seconds. That would ensure that after the crash, there is atleast 60 - 20 = 40 seconds left before watchdog expires. And in that duration we should try to boot into second kernel load watchdog driver early enough from initramfs which can start kicking watchdog again. Some drivers did stop the watchdog while module loading such as iTCO_wdt. Instead of stopping why not keep on kicking it till user space takes over this job. This will also make sure that if kdump kernel hangs, watchdog wil do the job it is supposed to do? Yes, rethinking about this problem, kicking it is better than stopping it. But we will also have more uncertern time between userspace kicking before and after panic. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/15/14 at 11:55am, Vivek Goyal wrote: On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote: watchdog and crash dump really conflicts to some degree, from the watchdog point of view it can reboot system whhen kdump kernel hangs. But from kdump point of view it want ensure saving the vmcore for later debugging. Maybe we can only select only one in this case. You want to be able to make a decision at runtime which to use. - if it can be stopped, you can open and stop it For the last one since crashing happens we have no chance to open and stop. When you decide you need to set up to catch a core rather than just crash you can open and stop the watchdog (if supported), and you can then set up for a kdump and then at some point later if it crashes capture the dump. Disabling watchdog if kdump serice starts will not make many happy. If kernel hangs, we don't have a functionality to reboot it. What about other idea of keeping watchdog interval long enough that new kernel can boot, driver can load and then new driver/user space can continue to kick the watchdog. And if second kernel hangs, watchdog will reboot the system. I would agree this way. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 11:24am, Vivek Goyal wrote: > On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote: > > On Tue, 14 Jan 2014 16:23:23 +0800 > > Dave Young wrote: > > > > > In kdump kernel watchdog could interrupt vmcore capturing because we > > > have no way to disable/stop it while crashing happens. > > > > Lots of watchdogs cannot be stopped. > > > > > Add a module parameter stop_before_register so watchdog can be stopped > > > before register in driver loading path. Thus we can try to load the > > > watchdog driver as early as possible in kdump kernel to ensure vmcore > > > capturing. > > > > If you want to kdump then don't start the watchdog. The goal of the > > watchdog is to make sure the system never gets stuck. Adding conditions > > and special cases simply increases the odds of something bad not > > triggering the watchdog. > > > > If you have a system that can stop the watchdog then providing no way out > > is not set you can open it and stop it. > > > > I don't see the need for any kernel change here > > > > - if it can't be stopped you lost > > - if "nowayout" is set then by design you lost > > - if it can be stopped, you can open and stop it > > > > Now whether in the !nowayout case the watchdog core should catch whatever > > hooks/notifiers are available and stop any watchdogs it can on a > > kexec/kdump is a more interesting question and probably needs to default > > to not doing so but with the option to force otherwise for debugging work. > > Hi All, > > I thought this problem was resolved (atleast conceptually) last time > when Don Zickus brought it up. > > He mentioned that it was concluded that keep watchdog interval long > enough, say 60 seconds and keep on kicking it fast enough, say every > 10-20 seconds. That would ensure that after the crash, there is atleast > 60 - 20 = 40 seconds left before watchdog expires. And in that duration > we should try to boot into second kernel load watchdog driver early enough > from initramfs which can start kicking watchdog again. Some drivers did stop the watchdog while module loading such as iTCO_wdt. so we can load them as early as possible and not necessary to kick them again. Thus I wrote this patch to add the stop to generic code so more drivers can be covered. But as Alan said I begin to feel that this is not a good design. the iTCO_wdt nowayout become useless because of this stop_before_register.. For wdts which can not be stopped we can still continue working on kicking them early in initramfs. > > I am wondering what happened to this idea. Dave, did we try to implement/ > experiment with this? No, we are just begin with iTCO_wdt and is trying to add iTCO_wdt firstly. Other devices has not been investigated. Thanks Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 12:16pm, One Thousand Gnomes wrote: > On Tue, 14 Jan 2014 16:23:23 +0800 > Dave Young wrote: > > > In kdump kernel watchdog could interrupt vmcore capturing because we > > have no way to disable/stop it while crashing happens. > > Lots of watchdogs cannot be stopped. > > > Add a module parameter stop_before_register so watchdog can be stopped > > before register in driver loading path. Thus we can try to load the > > watchdog driver as early as possible in kdump kernel to ensure vmcore > > capturing. > > If you want to kdump then don't start the watchdog. The goal of the > watchdog is to make sure the system never gets stuck. Adding conditions > and special cases simply increases the odds of something bad not > triggering the watchdog. watchdog and crash dump really conflicts to some degree, from the watchdog point of view it can reboot system whhen kdump kernel hangs. But from kdump point of view it want ensure saving the vmcore for later debugging. Maybe we can only select only one in this case. > > If you have a system that can stop the watchdog then providing no way out > is not set you can open it and stop it. > > I don't see the need for any kernel change here > > - if it can't be stopped you lost > - if "nowayout" is set then by design you lost > - if it can be stopped, you can open and stop it For the last one since crashing happens we have no chance to open and stop. > > Now whether in the !nowayout case the watchdog core should catch whatever > hooks/notifiers are available and stop any watchdogs it can on a > kexec/kdump is a more interesting question and probably needs to default > to not doing so but with the option to force otherwise for debugging work. Unfortunately in crash path there's no chance to do so. It's not good to add more logic in that path as well. Thanks Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote: > On Tue, 14 Jan 2014 16:23:23 +0800 > Dave Young wrote: > > > In kdump kernel watchdog could interrupt vmcore capturing because we > > have no way to disable/stop it while crashing happens. > > Lots of watchdogs cannot be stopped. > > > Add a module parameter stop_before_register so watchdog can be stopped > > before register in driver loading path. Thus we can try to load the > > watchdog driver as early as possible in kdump kernel to ensure vmcore > > capturing. > > If you want to kdump then don't start the watchdog. The goal of the > watchdog is to make sure the system never gets stuck. Adding conditions > and special cases simply increases the odds of something bad not > triggering the watchdog. > > If you have a system that can stop the watchdog then providing no way out > is not set you can open it and stop it. > > I don't see the need for any kernel change here > > - if it can't be stopped you lost > - if "nowayout" is set then by design you lost > - if it can be stopped, you can open and stop it > > Now whether in the !nowayout case the watchdog core should catch whatever > hooks/notifiers are available and stop any watchdogs it can on a > kexec/kdump is a more interesting question and probably needs to default > to not doing so but with the option to force otherwise for debugging work. Hi All, I thought this problem was resolved (atleast conceptually) last time when Don Zickus brought it up. He mentioned that it was concluded that keep watchdog interval long enough, say 60 seconds and keep on kicking it fast enough, say every 10-20 seconds. That would ensure that after the crash, there is atleast 60 - 20 = 40 seconds left before watchdog expires. And in that duration we should try to boot into second kernel load watchdog driver early enough from initramfs which can start kicking watchdog again. I am wondering what happened to this idea. Dave, did we try to implement/ experiment with this? Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Tue, 14 Jan 2014 16:23:23 +0800 Dave Young wrote: > In kdump kernel watchdog could interrupt vmcore capturing because we > have no way to disable/stop it while crashing happens. Lots of watchdogs cannot be stopped. > Add a module parameter stop_before_register so watchdog can be stopped > before register in driver loading path. Thus we can try to load the > watchdog driver as early as possible in kdump kernel to ensure vmcore > capturing. If you want to kdump then don't start the watchdog. The goal of the watchdog is to make sure the system never gets stuck. Adding conditions and special cases simply increases the odds of something bad not triggering the watchdog. If you have a system that can stop the watchdog then providing no way out is not set you can open it and stop it. I don't see the need for any kernel change here - if it can't be stopped you lost - if "nowayout" is set then by design you lost - if it can be stopped, you can open and stop it Now whether in the !nowayout case the watchdog core should catch whatever hooks/notifiers are available and stop any watchdogs it can on a kexec/kdump is a more interesting question and probably needs to default to not doing so but with the option to force otherwise for debugging work. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 04:41pm, Dave Young wrote: > On 01/14/14 at 09:26am, Wim Van Sebroeck wrote: > > Hi Dave, > > > > > In kdump kernel watchdog could interrupt vmcore capturing because we > > > have no way to disable/stop it while crashing happens. > > > > > > Add a module parameter stop_before_register so watchdog can be stopped > > > before register in driver loading path. Thus we can try to load the > > > watchdog driver as early as possible in kdump kernel to ensure vmcore > > > capturing. > > > > > > Don Zickus mentioned that there's the case that bios start the watchdog > > > and it is expected that the kernel keep the watchdog alive. To address > > > this case I added the module parameter which is false by default so > > > it will stop the watchdog only when user provice kernel cmdline > > > "watchdog.stop_before_register=1". > > > > > > Signed-off-by: Dave Young > > > --- > > > drivers/watchdog/watchdog_core.c |6 ++ > > > 1 file changed, 6 insertions(+) > > > > > > --- linux.orig/drivers/watchdog/watchdog_core.c > > > +++ linux/drivers/watchdog/watchdog_core.c > > > @@ -42,6 +42,7 @@ > > > > > > static DEFINE_IDA(watchdog_ida); > > > static struct class *watchdog_class; > > > +static bool stop_before_register; > > > > > > static void watchdog_check_min_max_timeout(struct watchdog_device *wdd) > > > { > > > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc > > > if (wdd->ops->start == NULL || wdd->ops->stop == NULL) > > > return -EINVAL; > > > > > > + if (stop_before_register) > > > + wdd->ops->stop(wdd); > > > + > > > watchdog_check_min_max_timeout(wdd); > > > > > > /* > > > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void) > > > subsys_initcall(watchdog_init); > > > module_exit(watchdog_exit); > > > > > > +module_param(stop_before_register, bool, 0644); > > > + > > > MODULE_AUTHOR("Alan Cox "); > > > MODULE_AUTHOR("Wim Van Sebroeck "); > > > MODULE_DESCRIPTION("WatchDog Timer Driver Core"); > > > > Hmm, need to look closer to this, but my first thought is: > > what about devices that cannot be stopped once started... > > They should be able to override this module_parameter... > > Hi, Wim > > Thanks for quick feedback! > > I'm not sure the meaning of "cannot be stopped", if it means that > the policy that it should not be stopped, I think since the watchdog_core > is always built-in so the param can only be provided via boot cmdline it would > be fine? Hmm, the wdt driver can be removed then insmod again. I think you are talking about the nowayout, for this case probably should add below: if (stop_before_register && !nowayout) stop it. But is there a general way for checkout nowayout, could you give some hints? > > For device which really *cannot* stop, the stop() will fail silently? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 09:26am, Wim Van Sebroeck wrote: > Hi Dave, > > > In kdump kernel watchdog could interrupt vmcore capturing because we > > have no way to disable/stop it while crashing happens. > > > > Add a module parameter stop_before_register so watchdog can be stopped > > before register in driver loading path. Thus we can try to load the > > watchdog driver as early as possible in kdump kernel to ensure vmcore > > capturing. > > > > Don Zickus mentioned that there's the case that bios start the watchdog > > and it is expected that the kernel keep the watchdog alive. To address > > this case I added the module parameter which is false by default so > > it will stop the watchdog only when user provice kernel cmdline > > "watchdog.stop_before_register=1". > > > > Signed-off-by: Dave Young > > --- > > drivers/watchdog/watchdog_core.c |6 ++ > > 1 file changed, 6 insertions(+) > > > > --- linux.orig/drivers/watchdog/watchdog_core.c > > +++ linux/drivers/watchdog/watchdog_core.c > > @@ -42,6 +42,7 @@ > > > > static DEFINE_IDA(watchdog_ida); > > static struct class *watchdog_class; > > +static bool stop_before_register; > > > > static void watchdog_check_min_max_timeout(struct watchdog_device *wdd) > > { > > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc > > if (wdd->ops->start == NULL || wdd->ops->stop == NULL) > > return -EINVAL; > > > > + if (stop_before_register) > > + wdd->ops->stop(wdd); > > + > > watchdog_check_min_max_timeout(wdd); > > > > /* > > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void) > > subsys_initcall(watchdog_init); > > module_exit(watchdog_exit); > > > > +module_param(stop_before_register, bool, 0644); > > + > > MODULE_AUTHOR("Alan Cox "); > > MODULE_AUTHOR("Wim Van Sebroeck "); > > MODULE_DESCRIPTION("WatchDog Timer Driver Core"); > > Hmm, need to look closer to this, but my first thought is: > what about devices that cannot be stopped once started... > They should be able to override this module_parameter... Hi, Wim Thanks for quick feedback! I'm not sure the meaning of "cannot be stopped", if it means that the policy that it should not be stopped, I think since the watchdog_core is always built-in so the param can only be provided via boot cmdline it would be fine? For device which really *cannot* stop, the stop() will fail silently? Thanks Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
Hi Dave, > In kdump kernel watchdog could interrupt vmcore capturing because we > have no way to disable/stop it while crashing happens. > > Add a module parameter stop_before_register so watchdog can be stopped > before register in driver loading path. Thus we can try to load the > watchdog driver as early as possible in kdump kernel to ensure vmcore > capturing. > > Don Zickus mentioned that there's the case that bios start the watchdog > and it is expected that the kernel keep the watchdog alive. To address > this case I added the module parameter which is false by default so > it will stop the watchdog only when user provice kernel cmdline > "watchdog.stop_before_register=1". > > Signed-off-by: Dave Young > --- > drivers/watchdog/watchdog_core.c |6 ++ > 1 file changed, 6 insertions(+) > > --- linux.orig/drivers/watchdog/watchdog_core.c > +++ linux/drivers/watchdog/watchdog_core.c > @@ -42,6 +42,7 @@ > > static DEFINE_IDA(watchdog_ida); > static struct class *watchdog_class; > +static bool stop_before_register; > > static void watchdog_check_min_max_timeout(struct watchdog_device *wdd) > { > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc > if (wdd->ops->start == NULL || wdd->ops->stop == NULL) > return -EINVAL; > > + if (stop_before_register) > + wdd->ops->stop(wdd); > + > watchdog_check_min_max_timeout(wdd); > > /* > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void) > subsys_initcall(watchdog_init); > module_exit(watchdog_exit); > > +module_param(stop_before_register, bool, 0644); > + > MODULE_AUTHOR("Alan Cox "); > MODULE_AUTHOR("Wim Van Sebroeck "); > MODULE_DESCRIPTION("WatchDog Timer Driver Core"); Hmm, need to look closer to this, but my first thought is: what about devices that cannot be stopped once started... They should be able to override this module_parameter... Kind regards, Wim. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
Hi Dave, In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. Don Zickus mentioned that there's the case that bios start the watchdog and it is expected that the kernel keep the watchdog alive. To address this case I added the module parameter which is false by default so it will stop the watchdog only when user provice kernel cmdline watchdog.stop_before_register=1. Signed-off-by: Dave Young dyo...@redhat.com --- drivers/watchdog/watchdog_core.c |6 ++ 1 file changed, 6 insertions(+) --- linux.orig/drivers/watchdog/watchdog_core.c +++ linux/drivers/watchdog/watchdog_core.c @@ -42,6 +42,7 @@ static DEFINE_IDA(watchdog_ida); static struct class *watchdog_class; +static bool stop_before_register; static void watchdog_check_min_max_timeout(struct watchdog_device *wdd) { @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc if (wdd-ops-start == NULL || wdd-ops-stop == NULL) return -EINVAL; + if (stop_before_register) + wdd-ops-stop(wdd); + watchdog_check_min_max_timeout(wdd); /* @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void) subsys_initcall(watchdog_init); module_exit(watchdog_exit); +module_param(stop_before_register, bool, 0644); + MODULE_AUTHOR(Alan Cox a...@lxorguk.ukuu.org.uk); MODULE_AUTHOR(Wim Van Sebroeck w...@iguana.be); MODULE_DESCRIPTION(WatchDog Timer Driver Core); Hmm, need to look closer to this, but my first thought is: what about devices that cannot be stopped once started... They should be able to override this module_parameter... Kind regards, Wim. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 09:26am, Wim Van Sebroeck wrote: Hi Dave, In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. Don Zickus mentioned that there's the case that bios start the watchdog and it is expected that the kernel keep the watchdog alive. To address this case I added the module parameter which is false by default so it will stop the watchdog only when user provice kernel cmdline watchdog.stop_before_register=1. Signed-off-by: Dave Young dyo...@redhat.com --- drivers/watchdog/watchdog_core.c |6 ++ 1 file changed, 6 insertions(+) --- linux.orig/drivers/watchdog/watchdog_core.c +++ linux/drivers/watchdog/watchdog_core.c @@ -42,6 +42,7 @@ static DEFINE_IDA(watchdog_ida); static struct class *watchdog_class; +static bool stop_before_register; static void watchdog_check_min_max_timeout(struct watchdog_device *wdd) { @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc if (wdd-ops-start == NULL || wdd-ops-stop == NULL) return -EINVAL; + if (stop_before_register) + wdd-ops-stop(wdd); + watchdog_check_min_max_timeout(wdd); /* @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void) subsys_initcall(watchdog_init); module_exit(watchdog_exit); +module_param(stop_before_register, bool, 0644); + MODULE_AUTHOR(Alan Cox a...@lxorguk.ukuu.org.uk); MODULE_AUTHOR(Wim Van Sebroeck w...@iguana.be); MODULE_DESCRIPTION(WatchDog Timer Driver Core); Hmm, need to look closer to this, but my first thought is: what about devices that cannot be stopped once started... They should be able to override this module_parameter... Hi, Wim Thanks for quick feedback! I'm not sure the meaning of cannot be stopped, if it means that the policy that it should not be stopped, I think since the watchdog_core is always built-in so the param can only be provided via boot cmdline it would be fine? For device which really *cannot* stop, the stop() will fail silently? Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 04:41pm, Dave Young wrote: On 01/14/14 at 09:26am, Wim Van Sebroeck wrote: Hi Dave, In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. Don Zickus mentioned that there's the case that bios start the watchdog and it is expected that the kernel keep the watchdog alive. To address this case I added the module parameter which is false by default so it will stop the watchdog only when user provice kernel cmdline watchdog.stop_before_register=1. Signed-off-by: Dave Young dyo...@redhat.com --- drivers/watchdog/watchdog_core.c |6 ++ 1 file changed, 6 insertions(+) --- linux.orig/drivers/watchdog/watchdog_core.c +++ linux/drivers/watchdog/watchdog_core.c @@ -42,6 +42,7 @@ static DEFINE_IDA(watchdog_ida); static struct class *watchdog_class; +static bool stop_before_register; static void watchdog_check_min_max_timeout(struct watchdog_device *wdd) { @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc if (wdd-ops-start == NULL || wdd-ops-stop == NULL) return -EINVAL; + if (stop_before_register) + wdd-ops-stop(wdd); + watchdog_check_min_max_timeout(wdd); /* @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void) subsys_initcall(watchdog_init); module_exit(watchdog_exit); +module_param(stop_before_register, bool, 0644); + MODULE_AUTHOR(Alan Cox a...@lxorguk.ukuu.org.uk); MODULE_AUTHOR(Wim Van Sebroeck w...@iguana.be); MODULE_DESCRIPTION(WatchDog Timer Driver Core); Hmm, need to look closer to this, but my first thought is: what about devices that cannot be stopped once started... They should be able to override this module_parameter... Hi, Wim Thanks for quick feedback! I'm not sure the meaning of cannot be stopped, if it means that the policy that it should not be stopped, I think since the watchdog_core is always built-in so the param can only be provided via boot cmdline it would be fine? Hmm, the wdt driver can be removed then insmod again. I think you are talking about the nowayout, for this case probably should add below: if (stop_before_register !nowayout) stop it. But is there a general way for checkout nowayout, could you give some hints? For device which really *cannot* stop, the stop() will fail silently? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Tue, 14 Jan 2014 16:23:23 +0800 Dave Young dyo...@redhat.com wrote: In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Lots of watchdogs cannot be stopped. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. If you want to kdump then don't start the watchdog. The goal of the watchdog is to make sure the system never gets stuck. Adding conditions and special cases simply increases the odds of something bad not triggering the watchdog. If you have a system that can stop the watchdog then providing no way out is not set you can open it and stop it. I don't see the need for any kernel change here - if it can't be stopped you lost - if nowayout is set then by design you lost - if it can be stopped, you can open and stop it Now whether in the !nowayout case the watchdog core should catch whatever hooks/notifiers are available and stop any watchdogs it can on a kexec/kdump is a more interesting question and probably needs to default to not doing so but with the option to force otherwise for debugging work. Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote: On Tue, 14 Jan 2014 16:23:23 +0800 Dave Young dyo...@redhat.com wrote: In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Lots of watchdogs cannot be stopped. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. If you want to kdump then don't start the watchdog. The goal of the watchdog is to make sure the system never gets stuck. Adding conditions and special cases simply increases the odds of something bad not triggering the watchdog. If you have a system that can stop the watchdog then providing no way out is not set you can open it and stop it. I don't see the need for any kernel change here - if it can't be stopped you lost - if nowayout is set then by design you lost - if it can be stopped, you can open and stop it Now whether in the !nowayout case the watchdog core should catch whatever hooks/notifiers are available and stop any watchdogs it can on a kexec/kdump is a more interesting question and probably needs to default to not doing so but with the option to force otherwise for debugging work. Hi All, I thought this problem was resolved (atleast conceptually) last time when Don Zickus brought it up. He mentioned that it was concluded that keep watchdog interval long enough, say 60 seconds and keep on kicking it fast enough, say every 10-20 seconds. That would ensure that after the crash, there is atleast 60 - 20 = 40 seconds left before watchdog expires. And in that duration we should try to boot into second kernel load watchdog driver early enough from initramfs which can start kicking watchdog again. I am wondering what happened to this idea. Dave, did we try to implement/ experiment with this? Thanks Vivek -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 12:16pm, One Thousand Gnomes wrote: On Tue, 14 Jan 2014 16:23:23 +0800 Dave Young dyo...@redhat.com wrote: In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Lots of watchdogs cannot be stopped. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. If you want to kdump then don't start the watchdog. The goal of the watchdog is to make sure the system never gets stuck. Adding conditions and special cases simply increases the odds of something bad not triggering the watchdog. watchdog and crash dump really conflicts to some degree, from the watchdog point of view it can reboot system whhen kdump kernel hangs. But from kdump point of view it want ensure saving the vmcore for later debugging. Maybe we can only select only one in this case. If you have a system that can stop the watchdog then providing no way out is not set you can open it and stop it. I don't see the need for any kernel change here - if it can't be stopped you lost - if nowayout is set then by design you lost - if it can be stopped, you can open and stop it For the last one since crashing happens we have no chance to open and stop. Now whether in the !nowayout case the watchdog core should catch whatever hooks/notifiers are available and stop any watchdogs it can on a kexec/kdump is a more interesting question and probably needs to default to not doing so but with the option to force otherwise for debugging work. Unfortunately in crash path there's no chance to do so. It's not good to add more logic in that path as well. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] watchdog: add a parameter for stop wdt before register
On 01/14/14 at 11:24am, Vivek Goyal wrote: On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote: On Tue, 14 Jan 2014 16:23:23 +0800 Dave Young dyo...@redhat.com wrote: In kdump kernel watchdog could interrupt vmcore capturing because we have no way to disable/stop it while crashing happens. Lots of watchdogs cannot be stopped. Add a module parameter stop_before_register so watchdog can be stopped before register in driver loading path. Thus we can try to load the watchdog driver as early as possible in kdump kernel to ensure vmcore capturing. If you want to kdump then don't start the watchdog. The goal of the watchdog is to make sure the system never gets stuck. Adding conditions and special cases simply increases the odds of something bad not triggering the watchdog. If you have a system that can stop the watchdog then providing no way out is not set you can open it and stop it. I don't see the need for any kernel change here - if it can't be stopped you lost - if nowayout is set then by design you lost - if it can be stopped, you can open and stop it Now whether in the !nowayout case the watchdog core should catch whatever hooks/notifiers are available and stop any watchdogs it can on a kexec/kdump is a more interesting question and probably needs to default to not doing so but with the option to force otherwise for debugging work. Hi All, I thought this problem was resolved (atleast conceptually) last time when Don Zickus brought it up. He mentioned that it was concluded that keep watchdog interval long enough, say 60 seconds and keep on kicking it fast enough, say every 10-20 seconds. That would ensure that after the crash, there is atleast 60 - 20 = 40 seconds left before watchdog expires. And in that duration we should try to boot into second kernel load watchdog driver early enough from initramfs which can start kicking watchdog again. Some drivers did stop the watchdog while module loading such as iTCO_wdt. so we can load them as early as possible and not necessary to kick them again. Thus I wrote this patch to add the stop to generic code so more drivers can be covered. But as Alan said I begin to feel that this is not a good design. the iTCO_wdt nowayout become useless because of this stop_before_register.. For wdts which can not be stopped we can still continue working on kicking them early in initramfs. I am wondering what happened to this idea. Dave, did we try to implement/ experiment with this? No, we are just begin with iTCO_wdt and is trying to add iTCO_wdt firstly. Other devices has not been investigated. Thanks Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/