Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Dave Young
On 01/15/14 at 11:55am, Vivek Goyal wrote:
> On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote:
> > > watchdog and crash dump really conflicts to some degree, from the watchdog
> > > point of view it can reboot system whhen kdump kernel hangs. But from 
> > > kdump
> > > point of view it want ensure saving the vmcore for later debugging.
> > > 
> > > Maybe we can only select only one in this case.
> > 
> > You want to be able to make a decision at runtime which to use.
> > 
> > > > - if it can be stopped, you can open and stop it
> > > 
> > > For the last one since crashing happens we have no chance to open and 
> > > stop.
> > 
> > When you decide you need to set up to catch a core rather than just
> > crash you can open and stop the watchdog (if supported), and you can then
> > set up for a kdump and then at some point later if it crashes capture the
> > dump.
> 
> Disabling watchdog if kdump serice starts will not make many happy. If
> kernel hangs, we don't have a functionality to reboot it.
> 
> What about other idea of keeping watchdog interval long enough that new
> kernel can boot, driver can load and then new driver/user space can
> continue to kick the watchdog. And if second kernel hangs, watchdog will
> reboot the system.

I would agree this way.

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Dave Young
On 01/15/14 at 11:46am, Vivek Goyal wrote:
> On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote:
> 
> [..]
> > > I thought this problem was resolved (atleast conceptually) last time
> > > when Don Zickus brought it up.
> > > 
> > > He mentioned that it was concluded that keep watchdog interval long
> > > enough, say 60 seconds and keep on kicking it fast enough, say every
> > > 10-20 seconds. That would ensure that after the crash, there is atleast
> > > 60 - 20 = 40 seconds left before watchdog expires. And in that duration
> > > we should try to boot into second kernel load watchdog driver early enough
> > > from initramfs which can start kicking watchdog again.
> > 
> > Some drivers did stop the watchdog while module loading such as iTCO_wdt.
> 
> Instead of stopping why not keep on kicking it till user space takes
> over this job. This will also make sure that if kdump kernel hangs,
> watchdog wil do the job it is supposed to do?

Yes, rethinking about this problem, kicking it is better than stopping it.
But we will also have more uncertern time between userspace kicking before and
after panic.

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Vivek Goyal
On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote:
> > watchdog and crash dump really conflicts to some degree, from the watchdog
> > point of view it can reboot system whhen kdump kernel hangs. But from kdump
> > point of view it want ensure saving the vmcore for later debugging.
> > 
> > Maybe we can only select only one in this case.
> 
> You want to be able to make a decision at runtime which to use.
> 
> > > - if it can be stopped, you can open and stop it
> > 
> > For the last one since crashing happens we have no chance to open and stop.
> 
> When you decide you need to set up to catch a core rather than just
> crash you can open and stop the watchdog (if supported), and you can then
> set up for a kdump and then at some point later if it crashes capture the
> dump.

Disabling watchdog if kdump serice starts will not make many happy. If
kernel hangs, we don't have a functionality to reboot it.

What about other idea of keeping watchdog interval long enough that new
kernel can boot, driver can load and then new driver/user space can
continue to kick the watchdog. And if second kernel hangs, watchdog will
reboot the system.

Thanks
Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Vivek Goyal
On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote:

[..]
> > I thought this problem was resolved (atleast conceptually) last time
> > when Don Zickus brought it up.
> > 
> > He mentioned that it was concluded that keep watchdog interval long
> > enough, say 60 seconds and keep on kicking it fast enough, say every
> > 10-20 seconds. That would ensure that after the crash, there is atleast
> > 60 - 20 = 40 seconds left before watchdog expires. And in that duration
> > we should try to boot into second kernel load watchdog driver early enough
> > from initramfs which can start kicking watchdog again.
> 
> Some drivers did stop the watchdog while module loading such as iTCO_wdt.

Instead of stopping why not keep on kicking it till user space takes
over this job. This will also make sure that if kdump kernel hangs,
watchdog wil do the job it is supposed to do?

Thanks
Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread One Thousand Gnomes
> watchdog and crash dump really conflicts to some degree, from the watchdog
> point of view it can reboot system whhen kdump kernel hangs. But from kdump
> point of view it want ensure saving the vmcore for later debugging.
> 
> Maybe we can only select only one in this case.

You want to be able to make a decision at runtime which to use.

> > - if it can be stopped, you can open and stop it
> 
> For the last one since crashing happens we have no chance to open and stop.

When you decide you need to set up to catch a core rather than just
crash you can open and stop the watchdog (if supported), and you can then
set up for a kdump and then at some point later if it crashes capture the
dump.

As you say the two are basically incompatible models of operating, but
that also means if you are about to take a dump you want ensure you will
not be disturbed. So as far as I can see if you might need to take a
dump, turn the watchdog off in advance. Worrying about it as you take a
dump is too late.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread One Thousand Gnomes
 watchdog and crash dump really conflicts to some degree, from the watchdog
 point of view it can reboot system whhen kdump kernel hangs. But from kdump
 point of view it want ensure saving the vmcore for later debugging.
 
 Maybe we can only select only one in this case.

You want to be able to make a decision at runtime which to use.

  - if it can be stopped, you can open and stop it
 
 For the last one since crashing happens we have no chance to open and stop.

When you decide you need to set up to catch a core rather than just
crash you can open and stop the watchdog (if supported), and you can then
set up for a kdump and then at some point later if it crashes capture the
dump.

As you say the two are basically incompatible models of operating, but
that also means if you are about to take a dump you want ensure you will
not be disturbed. So as far as I can see if you might need to take a
dump, turn the watchdog off in advance. Worrying about it as you take a
dump is too late.

Alan
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Vivek Goyal
On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote:

[..]
  I thought this problem was resolved (atleast conceptually) last time
  when Don Zickus brought it up.
  
  He mentioned that it was concluded that keep watchdog interval long
  enough, say 60 seconds and keep on kicking it fast enough, say every
  10-20 seconds. That would ensure that after the crash, there is atleast
  60 - 20 = 40 seconds left before watchdog expires. And in that duration
  we should try to boot into second kernel load watchdog driver early enough
  from initramfs which can start kicking watchdog again.
 
 Some drivers did stop the watchdog while module loading such as iTCO_wdt.

Instead of stopping why not keep on kicking it till user space takes
over this job. This will also make sure that if kdump kernel hangs,
watchdog wil do the job it is supposed to do?

Thanks
Vivek
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Vivek Goyal
On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote:
  watchdog and crash dump really conflicts to some degree, from the watchdog
  point of view it can reboot system whhen kdump kernel hangs. But from kdump
  point of view it want ensure saving the vmcore for later debugging.
  
  Maybe we can only select only one in this case.
 
 You want to be able to make a decision at runtime which to use.
 
   - if it can be stopped, you can open and stop it
  
  For the last one since crashing happens we have no chance to open and stop.
 
 When you decide you need to set up to catch a core rather than just
 crash you can open and stop the watchdog (if supported), and you can then
 set up for a kdump and then at some point later if it crashes capture the
 dump.

Disabling watchdog if kdump serice starts will not make many happy. If
kernel hangs, we don't have a functionality to reboot it.

What about other idea of keeping watchdog interval long enough that new
kernel can boot, driver can load and then new driver/user space can
continue to kick the watchdog. And if second kernel hangs, watchdog will
reboot the system.

Thanks
Vivek
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Dave Young
On 01/15/14 at 11:46am, Vivek Goyal wrote:
 On Wed, Jan 15, 2014 at 09:11:42AM +0800, Dave Young wrote:
 
 [..]
   I thought this problem was resolved (atleast conceptually) last time
   when Don Zickus brought it up.
   
   He mentioned that it was concluded that keep watchdog interval long
   enough, say 60 seconds and keep on kicking it fast enough, say every
   10-20 seconds. That would ensure that after the crash, there is atleast
   60 - 20 = 40 seconds left before watchdog expires. And in that duration
   we should try to boot into second kernel load watchdog driver early enough
   from initramfs which can start kicking watchdog again.
  
  Some drivers did stop the watchdog while module loading such as iTCO_wdt.
 
 Instead of stopping why not keep on kicking it till user space takes
 over this job. This will also make sure that if kdump kernel hangs,
 watchdog wil do the job it is supposed to do?

Yes, rethinking about this problem, kicking it is better than stopping it.
But we will also have more uncertern time between userspace kicking before and
after panic.

Thanks
Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-15 Thread Dave Young
On 01/15/14 at 11:55am, Vivek Goyal wrote:
 On Wed, Jan 15, 2014 at 12:15:56PM +, One Thousand Gnomes wrote:
   watchdog and crash dump really conflicts to some degree, from the watchdog
   point of view it can reboot system whhen kdump kernel hangs. But from 
   kdump
   point of view it want ensure saving the vmcore for later debugging.
   
   Maybe we can only select only one in this case.
  
  You want to be able to make a decision at runtime which to use.
  
- if it can be stopped, you can open and stop it
   
   For the last one since crashing happens we have no chance to open and 
   stop.
  
  When you decide you need to set up to catch a core rather than just
  crash you can open and stop the watchdog (if supported), and you can then
  set up for a kdump and then at some point later if it crashes capture the
  dump.
 
 Disabling watchdog if kdump serice starts will not make many happy. If
 kernel hangs, we don't have a functionality to reboot it.
 
 What about other idea of keeping watchdog interval long enough that new
 kernel can boot, driver can load and then new driver/user space can
 continue to kick the watchdog. And if second kernel hangs, watchdog will
 reboot the system.

I would agree this way.

Thanks
Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 11:24am, Vivek Goyal wrote:
> On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote:
> > On Tue, 14 Jan 2014 16:23:23 +0800
> > Dave Young  wrote:
> > 
> > > In kdump kernel watchdog could interrupt vmcore capturing because we
> > > have no way to disable/stop it while crashing happens.
> > 
> > Lots of watchdogs cannot be stopped.
> > 
> > > Add a module parameter stop_before_register so watchdog can be stopped
> > > before register in driver loading path. Thus we can try to load the
> > > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > > capturing.
> > 
> > If you want to kdump then don't start the watchdog. The goal of the
> > watchdog is to make sure the system never gets stuck. Adding conditions
> > and special cases simply increases the odds of something bad not
> > triggering the watchdog.
> > 
> > If you have a system that can stop the watchdog then providing no way out
> > is not set you can open it and stop it.
> > 
> > I don't see the need for any kernel change here
> > 
> > - if it can't be stopped you lost
> > - if "nowayout" is set then by design you lost
> > - if it can be stopped, you can open and stop it
> > 
> > Now whether in the !nowayout case the watchdog core should catch whatever
> > hooks/notifiers are available and stop any watchdogs it can on a
> > kexec/kdump is a more interesting question and probably needs to default
> > to not doing so but with the option to force otherwise for debugging work.
> 
> Hi All,
> 
> I thought this problem was resolved (atleast conceptually) last time
> when Don Zickus brought it up.
> 
> He mentioned that it was concluded that keep watchdog interval long
> enough, say 60 seconds and keep on kicking it fast enough, say every
> 10-20 seconds. That would ensure that after the crash, there is atleast
> 60 - 20 = 40 seconds left before watchdog expires. And in that duration
> we should try to boot into second kernel load watchdog driver early enough
> from initramfs which can start kicking watchdog again.

Some drivers did stop the watchdog while module loading such as iTCO_wdt.
so we can load them as early as possible and not necessary to kick them
again. Thus I wrote this patch to add the stop to generic code so more drivers
can be covered.

But as Alan said I begin to feel that this is not a good design. the iTCO_wdt
nowayout become useless because of this stop_before_register..

For wdts which can not be stopped we can still continue working on kicking
them early in initramfs.

> 
> I am wondering what happened to this idea. Dave, did we try to implement/
> experiment with this?

No, we are just begin with iTCO_wdt and is trying to add iTCO_wdt firstly.
Other devices has not been investigated.

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 12:16pm, One Thousand Gnomes wrote:
> On Tue, 14 Jan 2014 16:23:23 +0800
> Dave Young  wrote:
> 
> > In kdump kernel watchdog could interrupt vmcore capturing because we
> > have no way to disable/stop it while crashing happens.
> 
> Lots of watchdogs cannot be stopped.
> 
> > Add a module parameter stop_before_register so watchdog can be stopped
> > before register in driver loading path. Thus we can try to load the
> > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > capturing.
> 
> If you want to kdump then don't start the watchdog. The goal of the
> watchdog is to make sure the system never gets stuck. Adding conditions
> and special cases simply increases the odds of something bad not
> triggering the watchdog.

watchdog and crash dump really conflicts to some degree, from the watchdog
point of view it can reboot system whhen kdump kernel hangs. But from kdump
point of view it want ensure saving the vmcore for later debugging.

Maybe we can only select only one in this case.
> 
> If you have a system that can stop the watchdog then providing no way out
> is not set you can open it and stop it.
> 
> I don't see the need for any kernel change here
> 
> - if it can't be stopped you lost
> - if "nowayout" is set then by design you lost
> - if it can be stopped, you can open and stop it

For the last one since crashing happens we have no chance to open and stop.

> 
> Now whether in the !nowayout case the watchdog core should catch whatever
> hooks/notifiers are available and stop any watchdogs it can on a
> kexec/kdump is a more interesting question and probably needs to default
> to not doing so but with the option to force otherwise for debugging work.

Unfortunately in crash path there's no chance to do so. It's not good to
add more logic in that path as well.

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Vivek Goyal
On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote:
> On Tue, 14 Jan 2014 16:23:23 +0800
> Dave Young  wrote:
> 
> > In kdump kernel watchdog could interrupt vmcore capturing because we
> > have no way to disable/stop it while crashing happens.
> 
> Lots of watchdogs cannot be stopped.
> 
> > Add a module parameter stop_before_register so watchdog can be stopped
> > before register in driver loading path. Thus we can try to load the
> > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > capturing.
> 
> If you want to kdump then don't start the watchdog. The goal of the
> watchdog is to make sure the system never gets stuck. Adding conditions
> and special cases simply increases the odds of something bad not
> triggering the watchdog.
> 
> If you have a system that can stop the watchdog then providing no way out
> is not set you can open it and stop it.
> 
> I don't see the need for any kernel change here
> 
> - if it can't be stopped you lost
> - if "nowayout" is set then by design you lost
> - if it can be stopped, you can open and stop it
> 
> Now whether in the !nowayout case the watchdog core should catch whatever
> hooks/notifiers are available and stop any watchdogs it can on a
> kexec/kdump is a more interesting question and probably needs to default
> to not doing so but with the option to force otherwise for debugging work.

Hi All,

I thought this problem was resolved (atleast conceptually) last time
when Don Zickus brought it up.

He mentioned that it was concluded that keep watchdog interval long
enough, say 60 seconds and keep on kicking it fast enough, say every
10-20 seconds. That would ensure that after the crash, there is atleast
60 - 20 = 40 seconds left before watchdog expires. And in that duration
we should try to boot into second kernel load watchdog driver early enough
from initramfs which can start kicking watchdog again.

I am wondering what happened to this idea. Dave, did we try to implement/
experiment with this?

Thanks
Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread One Thousand Gnomes
On Tue, 14 Jan 2014 16:23:23 +0800
Dave Young  wrote:

> In kdump kernel watchdog could interrupt vmcore capturing because we
> have no way to disable/stop it while crashing happens.

Lots of watchdogs cannot be stopped.

> Add a module parameter stop_before_register so watchdog can be stopped
> before register in driver loading path. Thus we can try to load the
> watchdog driver as early as possible in kdump kernel to ensure vmcore
> capturing.

If you want to kdump then don't start the watchdog. The goal of the
watchdog is to make sure the system never gets stuck. Adding conditions
and special cases simply increases the odds of something bad not
triggering the watchdog.

If you have a system that can stop the watchdog then providing no way out
is not set you can open it and stop it.

I don't see the need for any kernel change here

- if it can't be stopped you lost
- if "nowayout" is set then by design you lost
- if it can be stopped, you can open and stop it

Now whether in the !nowayout case the watchdog core should catch whatever
hooks/notifiers are available and stop any watchdogs it can on a
kexec/kdump is a more interesting question and probably needs to default
to not doing so but with the option to force otherwise for debugging work.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 04:41pm, Dave Young wrote:
> On 01/14/14 at 09:26am, Wim Van Sebroeck wrote:
> > Hi Dave,
> > 
> > > In kdump kernel watchdog could interrupt vmcore capturing because we
> > > have no way to disable/stop it while crashing happens.
> > > 
> > > Add a module parameter stop_before_register so watchdog can be stopped
> > > before register in driver loading path. Thus we can try to load the
> > > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > > capturing.
> > > 
> > > Don Zickus mentioned that there's the case that bios start the watchdog
> > > and it is expected that the kernel keep the watchdog alive. To address
> > > this case I added the module parameter which is false by default so
> > > it will stop the watchdog only when user provice kernel cmdline
> > > "watchdog.stop_before_register=1".  
> > > 
> > > Signed-off-by: Dave Young 
> > > ---
> > >  drivers/watchdog/watchdog_core.c |6 ++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > --- linux.orig/drivers/watchdog/watchdog_core.c
> > > +++ linux/drivers/watchdog/watchdog_core.c
> > > @@ -42,6 +42,7 @@
> > >  
> > >  static DEFINE_IDA(watchdog_ida);
> > >  static struct class *watchdog_class;
> > > +static bool stop_before_register;
> > >  
> > >  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
> > >  {
> > > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
> > >   if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
> > >   return -EINVAL;
> > >  
> > > + if (stop_before_register)
> > > + wdd->ops->stop(wdd);
> > > +
> > >   watchdog_check_min_max_timeout(wdd);
> > >  
> > >   /*
> > > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
> > >  subsys_initcall(watchdog_init);
> > >  module_exit(watchdog_exit);
> > >  
> > > +module_param(stop_before_register, bool, 0644);
> > > +
> > >  MODULE_AUTHOR("Alan Cox ");
> > >  MODULE_AUTHOR("Wim Van Sebroeck ");
> > >  MODULE_DESCRIPTION("WatchDog Timer Driver Core");
> > 
> > Hmm, need to look closer to this, but my first thought is:
> > what about devices that cannot be stopped once started...
> > They should be able to override this module_parameter...
> 
> Hi, Wim
> 
> Thanks for quick feedback!
> 
> I'm not sure the meaning of "cannot be stopped", if it means that
> the policy that it should not be stopped, I think since the watchdog_core
> is always built-in so the param can only be provided via boot cmdline it would
> be fine?

Hmm, the wdt driver can be removed then insmod again.
I think you are talking about the nowayout, for this case probably should add 
below:

if (stop_before_register && !nowayout)
  stop it.

But is there a general way for checkout nowayout, could you give some
hints?

> 
> For device which really *cannot* stop, the stop() will fail silently?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 09:26am, Wim Van Sebroeck wrote:
> Hi Dave,
> 
> > In kdump kernel watchdog could interrupt vmcore capturing because we
> > have no way to disable/stop it while crashing happens.
> > 
> > Add a module parameter stop_before_register so watchdog can be stopped
> > before register in driver loading path. Thus we can try to load the
> > watchdog driver as early as possible in kdump kernel to ensure vmcore
> > capturing.
> > 
> > Don Zickus mentioned that there's the case that bios start the watchdog
> > and it is expected that the kernel keep the watchdog alive. To address
> > this case I added the module parameter which is false by default so
> > it will stop the watchdog only when user provice kernel cmdline
> > "watchdog.stop_before_register=1".  
> > 
> > Signed-off-by: Dave Young 
> > ---
> >  drivers/watchdog/watchdog_core.c |6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > --- linux.orig/drivers/watchdog/watchdog_core.c
> > +++ linux/drivers/watchdog/watchdog_core.c
> > @@ -42,6 +42,7 @@
> >  
> >  static DEFINE_IDA(watchdog_ida);
> >  static struct class *watchdog_class;
> > +static bool stop_before_register;
> >  
> >  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
> >  {
> > @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
> > if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
> > return -EINVAL;
> >  
> > +   if (stop_before_register)
> > +   wdd->ops->stop(wdd);
> > +
> > watchdog_check_min_max_timeout(wdd);
> >  
> > /*
> > @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
> >  subsys_initcall(watchdog_init);
> >  module_exit(watchdog_exit);
> >  
> > +module_param(stop_before_register, bool, 0644);
> > +
> >  MODULE_AUTHOR("Alan Cox ");
> >  MODULE_AUTHOR("Wim Van Sebroeck ");
> >  MODULE_DESCRIPTION("WatchDog Timer Driver Core");
> 
> Hmm, need to look closer to this, but my first thought is:
> what about devices that cannot be stopped once started...
> They should be able to override this module_parameter...

Hi, Wim

Thanks for quick feedback!

I'm not sure the meaning of "cannot be stopped", if it means that
the policy that it should not be stopped, I think since the watchdog_core
is always built-in so the param can only be provided via boot cmdline it would
be fine?

For device which really *cannot* stop, the stop() will fail silently?

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Wim Van Sebroeck
Hi Dave,

> In kdump kernel watchdog could interrupt vmcore capturing because we
> have no way to disable/stop it while crashing happens.
> 
> Add a module parameter stop_before_register so watchdog can be stopped
> before register in driver loading path. Thus we can try to load the
> watchdog driver as early as possible in kdump kernel to ensure vmcore
> capturing.
> 
> Don Zickus mentioned that there's the case that bios start the watchdog
> and it is expected that the kernel keep the watchdog alive. To address
> this case I added the module parameter which is false by default so
> it will stop the watchdog only when user provice kernel cmdline
> "watchdog.stop_before_register=1".  
> 
> Signed-off-by: Dave Young 
> ---
>  drivers/watchdog/watchdog_core.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> --- linux.orig/drivers/watchdog/watchdog_core.c
> +++ linux/drivers/watchdog/watchdog_core.c
> @@ -42,6 +42,7 @@
>  
>  static DEFINE_IDA(watchdog_ida);
>  static struct class *watchdog_class;
> +static bool stop_before_register;
>  
>  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
>  {
> @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
>   if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
>   return -EINVAL;
>  
> + if (stop_before_register)
> + wdd->ops->stop(wdd);
> +
>   watchdog_check_min_max_timeout(wdd);
>  
>   /*
> @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
>  subsys_initcall(watchdog_init);
>  module_exit(watchdog_exit);
>  
> +module_param(stop_before_register, bool, 0644);
> +
>  MODULE_AUTHOR("Alan Cox ");
>  MODULE_AUTHOR("Wim Van Sebroeck ");
>  MODULE_DESCRIPTION("WatchDog Timer Driver Core");

Hmm, need to look closer to this, but my first thought is:
what about devices that cannot be stopped once started...
They should be able to override this module_parameter...

Kind regards,
Wim.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Wim Van Sebroeck
Hi Dave,

 In kdump kernel watchdog could interrupt vmcore capturing because we
 have no way to disable/stop it while crashing happens.
 
 Add a module parameter stop_before_register so watchdog can be stopped
 before register in driver loading path. Thus we can try to load the
 watchdog driver as early as possible in kdump kernel to ensure vmcore
 capturing.
 
 Don Zickus mentioned that there's the case that bios start the watchdog
 and it is expected that the kernel keep the watchdog alive. To address
 this case I added the module parameter which is false by default so
 it will stop the watchdog only when user provice kernel cmdline
 watchdog.stop_before_register=1.  
 
 Signed-off-by: Dave Young dyo...@redhat.com
 ---
  drivers/watchdog/watchdog_core.c |6 ++
  1 file changed, 6 insertions(+)
 
 --- linux.orig/drivers/watchdog/watchdog_core.c
 +++ linux/drivers/watchdog/watchdog_core.c
 @@ -42,6 +42,7 @@
  
  static DEFINE_IDA(watchdog_ida);
  static struct class *watchdog_class;
 +static bool stop_before_register;
  
  static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
  {
 @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
   if (wdd-ops-start == NULL || wdd-ops-stop == NULL)
   return -EINVAL;
  
 + if (stop_before_register)
 + wdd-ops-stop(wdd);
 +
   watchdog_check_min_max_timeout(wdd);
  
   /*
 @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
  subsys_initcall(watchdog_init);
  module_exit(watchdog_exit);
  
 +module_param(stop_before_register, bool, 0644);
 +
  MODULE_AUTHOR(Alan Cox a...@lxorguk.ukuu.org.uk);
  MODULE_AUTHOR(Wim Van Sebroeck w...@iguana.be);
  MODULE_DESCRIPTION(WatchDog Timer Driver Core);

Hmm, need to look closer to this, but my first thought is:
what about devices that cannot be stopped once started...
They should be able to override this module_parameter...

Kind regards,
Wim.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 09:26am, Wim Van Sebroeck wrote:
 Hi Dave,
 
  In kdump kernel watchdog could interrupt vmcore capturing because we
  have no way to disable/stop it while crashing happens.
  
  Add a module parameter stop_before_register so watchdog can be stopped
  before register in driver loading path. Thus we can try to load the
  watchdog driver as early as possible in kdump kernel to ensure vmcore
  capturing.
  
  Don Zickus mentioned that there's the case that bios start the watchdog
  and it is expected that the kernel keep the watchdog alive. To address
  this case I added the module parameter which is false by default so
  it will stop the watchdog only when user provice kernel cmdline
  watchdog.stop_before_register=1.  
  
  Signed-off-by: Dave Young dyo...@redhat.com
  ---
   drivers/watchdog/watchdog_core.c |6 ++
   1 file changed, 6 insertions(+)
  
  --- linux.orig/drivers/watchdog/watchdog_core.c
  +++ linux/drivers/watchdog/watchdog_core.c
  @@ -42,6 +42,7 @@
   
   static DEFINE_IDA(watchdog_ida);
   static struct class *watchdog_class;
  +static bool stop_before_register;
   
   static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
   {
  @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
  if (wdd-ops-start == NULL || wdd-ops-stop == NULL)
  return -EINVAL;
   
  +   if (stop_before_register)
  +   wdd-ops-stop(wdd);
  +
  watchdog_check_min_max_timeout(wdd);
   
  /*
  @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
   subsys_initcall(watchdog_init);
   module_exit(watchdog_exit);
   
  +module_param(stop_before_register, bool, 0644);
  +
   MODULE_AUTHOR(Alan Cox a...@lxorguk.ukuu.org.uk);
   MODULE_AUTHOR(Wim Van Sebroeck w...@iguana.be);
   MODULE_DESCRIPTION(WatchDog Timer Driver Core);
 
 Hmm, need to look closer to this, but my first thought is:
 what about devices that cannot be stopped once started...
 They should be able to override this module_parameter...

Hi, Wim

Thanks for quick feedback!

I'm not sure the meaning of cannot be stopped, if it means that
the policy that it should not be stopped, I think since the watchdog_core
is always built-in so the param can only be provided via boot cmdline it would
be fine?

For device which really *cannot* stop, the stop() will fail silently?

Thanks
Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 04:41pm, Dave Young wrote:
 On 01/14/14 at 09:26am, Wim Van Sebroeck wrote:
  Hi Dave,
  
   In kdump kernel watchdog could interrupt vmcore capturing because we
   have no way to disable/stop it while crashing happens.
   
   Add a module parameter stop_before_register so watchdog can be stopped
   before register in driver loading path. Thus we can try to load the
   watchdog driver as early as possible in kdump kernel to ensure vmcore
   capturing.
   
   Don Zickus mentioned that there's the case that bios start the watchdog
   and it is expected that the kernel keep the watchdog alive. To address
   this case I added the module parameter which is false by default so
   it will stop the watchdog only when user provice kernel cmdline
   watchdog.stop_before_register=1.  
   
   Signed-off-by: Dave Young dyo...@redhat.com
   ---
drivers/watchdog/watchdog_core.c |6 ++
1 file changed, 6 insertions(+)
   
   --- linux.orig/drivers/watchdog/watchdog_core.c
   +++ linux/drivers/watchdog/watchdog_core.c
   @@ -42,6 +42,7 @@

static DEFINE_IDA(watchdog_ida);
static struct class *watchdog_class;
   +static bool stop_before_register;

static void watchdog_check_min_max_timeout(struct watchdog_device *wdd)
{
   @@ -119,6 +120,9 @@ int watchdog_register_device(struct watc
 if (wdd-ops-start == NULL || wdd-ops-stop == NULL)
 return -EINVAL;

   + if (stop_before_register)
   + wdd-ops-stop(wdd);
   +
 watchdog_check_min_max_timeout(wdd);

 /*
   @@ -220,6 +224,8 @@ static void __exit watchdog_exit(void)
subsys_initcall(watchdog_init);
module_exit(watchdog_exit);

   +module_param(stop_before_register, bool, 0644);
   +
MODULE_AUTHOR(Alan Cox a...@lxorguk.ukuu.org.uk);
MODULE_AUTHOR(Wim Van Sebroeck w...@iguana.be);
MODULE_DESCRIPTION(WatchDog Timer Driver Core);
  
  Hmm, need to look closer to this, but my first thought is:
  what about devices that cannot be stopped once started...
  They should be able to override this module_parameter...
 
 Hi, Wim
 
 Thanks for quick feedback!
 
 I'm not sure the meaning of cannot be stopped, if it means that
 the policy that it should not be stopped, I think since the watchdog_core
 is always built-in so the param can only be provided via boot cmdline it would
 be fine?

Hmm, the wdt driver can be removed then insmod again.
I think you are talking about the nowayout, for this case probably should add 
below:

if (stop_before_register  !nowayout)
  stop it.

But is there a general way for checkout nowayout, could you give some
hints?

 
 For device which really *cannot* stop, the stop() will fail silently?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread One Thousand Gnomes
On Tue, 14 Jan 2014 16:23:23 +0800
Dave Young dyo...@redhat.com wrote:

 In kdump kernel watchdog could interrupt vmcore capturing because we
 have no way to disable/stop it while crashing happens.

Lots of watchdogs cannot be stopped.

 Add a module parameter stop_before_register so watchdog can be stopped
 before register in driver loading path. Thus we can try to load the
 watchdog driver as early as possible in kdump kernel to ensure vmcore
 capturing.

If you want to kdump then don't start the watchdog. The goal of the
watchdog is to make sure the system never gets stuck. Adding conditions
and special cases simply increases the odds of something bad not
triggering the watchdog.

If you have a system that can stop the watchdog then providing no way out
is not set you can open it and stop it.

I don't see the need for any kernel change here

- if it can't be stopped you lost
- if nowayout is set then by design you lost
- if it can be stopped, you can open and stop it

Now whether in the !nowayout case the watchdog core should catch whatever
hooks/notifiers are available and stop any watchdogs it can on a
kexec/kdump is a more interesting question and probably needs to default
to not doing so but with the option to force otherwise for debugging work.

Alan
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Vivek Goyal
On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote:
 On Tue, 14 Jan 2014 16:23:23 +0800
 Dave Young dyo...@redhat.com wrote:
 
  In kdump kernel watchdog could interrupt vmcore capturing because we
  have no way to disable/stop it while crashing happens.
 
 Lots of watchdogs cannot be stopped.
 
  Add a module parameter stop_before_register so watchdog can be stopped
  before register in driver loading path. Thus we can try to load the
  watchdog driver as early as possible in kdump kernel to ensure vmcore
  capturing.
 
 If you want to kdump then don't start the watchdog. The goal of the
 watchdog is to make sure the system never gets stuck. Adding conditions
 and special cases simply increases the odds of something bad not
 triggering the watchdog.
 
 If you have a system that can stop the watchdog then providing no way out
 is not set you can open it and stop it.
 
 I don't see the need for any kernel change here
 
 - if it can't be stopped you lost
 - if nowayout is set then by design you lost
 - if it can be stopped, you can open and stop it
 
 Now whether in the !nowayout case the watchdog core should catch whatever
 hooks/notifiers are available and stop any watchdogs it can on a
 kexec/kdump is a more interesting question and probably needs to default
 to not doing so but with the option to force otherwise for debugging work.

Hi All,

I thought this problem was resolved (atleast conceptually) last time
when Don Zickus brought it up.

He mentioned that it was concluded that keep watchdog interval long
enough, say 60 seconds and keep on kicking it fast enough, say every
10-20 seconds. That would ensure that after the crash, there is atleast
60 - 20 = 40 seconds left before watchdog expires. And in that duration
we should try to boot into second kernel load watchdog driver early enough
from initramfs which can start kicking watchdog again.

I am wondering what happened to this idea. Dave, did we try to implement/
experiment with this?

Thanks
Vivek
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 12:16pm, One Thousand Gnomes wrote:
 On Tue, 14 Jan 2014 16:23:23 +0800
 Dave Young dyo...@redhat.com wrote:
 
  In kdump kernel watchdog could interrupt vmcore capturing because we
  have no way to disable/stop it while crashing happens.
 
 Lots of watchdogs cannot be stopped.
 
  Add a module parameter stop_before_register so watchdog can be stopped
  before register in driver loading path. Thus we can try to load the
  watchdog driver as early as possible in kdump kernel to ensure vmcore
  capturing.
 
 If you want to kdump then don't start the watchdog. The goal of the
 watchdog is to make sure the system never gets stuck. Adding conditions
 and special cases simply increases the odds of something bad not
 triggering the watchdog.

watchdog and crash dump really conflicts to some degree, from the watchdog
point of view it can reboot system whhen kdump kernel hangs. But from kdump
point of view it want ensure saving the vmcore for later debugging.

Maybe we can only select only one in this case.
 
 If you have a system that can stop the watchdog then providing no way out
 is not set you can open it and stop it.
 
 I don't see the need for any kernel change here
 
 - if it can't be stopped you lost
 - if nowayout is set then by design you lost
 - if it can be stopped, you can open and stop it

For the last one since crashing happens we have no chance to open and stop.

 
 Now whether in the !nowayout case the watchdog core should catch whatever
 hooks/notifiers are available and stop any watchdogs it can on a
 kexec/kdump is a more interesting question and probably needs to default
 to not doing so but with the option to force otherwise for debugging work.

Unfortunately in crash path there's no chance to do so. It's not good to
add more logic in that path as well.

Thanks
Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] watchdog: add a parameter for stop wdt before register

2014-01-14 Thread Dave Young
On 01/14/14 at 11:24am, Vivek Goyal wrote:
 On Tue, Jan 14, 2014 at 12:16:39PM +, One Thousand Gnomes wrote:
  On Tue, 14 Jan 2014 16:23:23 +0800
  Dave Young dyo...@redhat.com wrote:
  
   In kdump kernel watchdog could interrupt vmcore capturing because we
   have no way to disable/stop it while crashing happens.
  
  Lots of watchdogs cannot be stopped.
  
   Add a module parameter stop_before_register so watchdog can be stopped
   before register in driver loading path. Thus we can try to load the
   watchdog driver as early as possible in kdump kernel to ensure vmcore
   capturing.
  
  If you want to kdump then don't start the watchdog. The goal of the
  watchdog is to make sure the system never gets stuck. Adding conditions
  and special cases simply increases the odds of something bad not
  triggering the watchdog.
  
  If you have a system that can stop the watchdog then providing no way out
  is not set you can open it and stop it.
  
  I don't see the need for any kernel change here
  
  - if it can't be stopped you lost
  - if nowayout is set then by design you lost
  - if it can be stopped, you can open and stop it
  
  Now whether in the !nowayout case the watchdog core should catch whatever
  hooks/notifiers are available and stop any watchdogs it can on a
  kexec/kdump is a more interesting question and probably needs to default
  to not doing so but with the option to force otherwise for debugging work.
 
 Hi All,
 
 I thought this problem was resolved (atleast conceptually) last time
 when Don Zickus brought it up.
 
 He mentioned that it was concluded that keep watchdog interval long
 enough, say 60 seconds and keep on kicking it fast enough, say every
 10-20 seconds. That would ensure that after the crash, there is atleast
 60 - 20 = 40 seconds left before watchdog expires. And in that duration
 we should try to boot into second kernel load watchdog driver early enough
 from initramfs which can start kicking watchdog again.

Some drivers did stop the watchdog while module loading such as iTCO_wdt.
so we can load them as early as possible and not necessary to kick them
again. Thus I wrote this patch to add the stop to generic code so more drivers
can be covered.

But as Alan said I begin to feel that this is not a good design. the iTCO_wdt
nowayout become useless because of this stop_before_register..

For wdts which can not be stopped we can still continue working on kicking
them early in initramfs.

 
 I am wondering what happened to this idea. Dave, did we try to implement/
 experiment with this?

No, we are just begin with iTCO_wdt and is trying to add iTCO_wdt firstly.
Other devices has not been investigated.

Thanks
Dave
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/