Re: Fw: suspicious behaviour in pcwd driver.
Hi Dave, > drivers/char/watchdog/pcwd.c does this if it detects > a temperature out of range.. > > if (temp_panic) { > printk (KERN_INFO PFX "Temperature overheat trip!\n"); > machine_power_off(); > } > > Two problems here are.. > > 1. machine_power_off() isn't exported on ppc64. (patch below) > 2. that printk will never hit the logs, so the admin will just find > a powered off box with no idea what happened. > Should we at least sync block devices before doing the power off ? First you need to enable the "temp_panic" by setting the WDIOS_TEMPPANIC option flag. (I'm curious who actually uses this and when they use this, but that is something totally different). And then you need to read the card's status before this piece of code will be triggered. So in my opinion this isn't used a lott and is simply an option to protect your hardware. But your comment is valid: chances that the warning is written to the log files is very small. So I think that it might indeed be better to sync most devices if that can be done in a few seconds... Greetings, Wim. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Fw: suspicious behaviour in pcwd driver.
Hi Dave, drivers/char/watchdog/pcwd.c does this if it detects a temperature out of range.. if (temp_panic) { printk (KERN_INFO PFX Temperature overheat trip!\n); machine_power_off(); } Two problems here are.. 1. machine_power_off() isn't exported on ppc64. (patch below) 2. that printk will never hit the logs, so the admin will just find a powered off box with no idea what happened. Should we at least sync block devices before doing the power off ? First you need to enable the temp_panic by setting the WDIOS_TEMPPANIC option flag. (I'm curious who actually uses this and when they use this, but that is something totally different). And then you need to read the card's status before this piece of code will be triggered. So in my opinion this isn't used a lott and is simply an option to protect your hardware. But your comment is valid: chances that the warning is written to the log files is very small. So I think that it might indeed be better to sync most devices if that can be done in a few seconds... Greetings, Wim. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: suspicious behaviour in pcwd driver.
Hi! > > 2. that printk will never hit the logs, so the admin will just find > > a powered off box with no idea what happened. > > Should we at least sync block devices before doing the power off ? > > AFAICS, this is still a problem with kernel_power_off() though ? > Look at how acpi does this; we probably want to trigger clean shutdown. -- 64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: suspicious behaviour in pcwd driver.
Hi! 2. that printk will never hit the logs, so the admin will just find a powered off box with no idea what happened. Should we at least sync block devices before doing the power off ? AFAICS, this is still a problem with kernel_power_off() though ? Look at how acpi does this; we probably want to trigger clean shutdown. -- 64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: suspicious behaviour in pcwd driver.
> Export machine_power_off() on ppc64, as the pcwd watchdog driver needs it. > > Signed-off-by: Dave Jones <[EMAIL PROTECTED]> > > --- linux-2.6.12/arch/ppc64/kernel/setup.c~ 2005-08-09 17:37:36.0 > -0400 > +++ linux-2.6.12/arch/ppc64/kernel/setup.c2005-08-09 17:37:53.0 > -0400 > @@ -706,6 +706,7 @@ void machine_power_off(void) > local_irq_disable(); > while (1) ; > } > +EXPORT_SYMBOL(machine_power_off); > > void machine_halt(void) > { > In fact, we need that for the G5 thermal driver too. I wonder why/how this export got removed ... Some over-zealous janitors ? Hrm... /me plays with gitk Ahhh, ok, so that is this patch: << machine_restart, machine_halt and machine_power_off are machine specific hooks deep into the reboot logic, that modules have no business messing with. Usually code should be calling kernel_restart, kernel_halt, kernel_power_off, or emergency_restart. So don't export machine_restart, machine_halt, and machine_power_off so we can catch buggy users. >> Well, I think for now, it's safe for therm_pm72 to call machine_power_off() in case of critical overtemp. I'll have a look at kernel_* equivalents later. Can you still slip that patch into 2.6.13 ? Ben. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: suspicious behaviour in pcwd driver.
On Mon, Aug 22, 2005 at 02:30:06PM -0400, Dave Jones wrote: > drivers/char/watchdog/pcwd.c does this if it detects > a temperature out of range.. > > if (temp_panic) { > printk (KERN_INFO PFX "Temperature overheat trip!\n"); > machine_power_off(); > } > > Two problems here are.. > > 1. machine_power_off() isn't exported on ppc64. (patch below) I was looking at an old tree, and this is now kernel_power_off() so this isn't a problem for pcwd, however the export is still needed for drivers/macintosh/therm_pm72.c > 2. that printk will never hit the logs, so the admin will just find > a powered off box with no idea what happened. > Should we at least sync block devices before doing the power off ? AFAICS, this is still a problem with kernel_power_off() though ? Dave - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
suspicious behaviour in pcwd driver.
drivers/char/watchdog/pcwd.c does this if it detects a temperature out of range.. if (temp_panic) { printk (KERN_INFO PFX "Temperature overheat trip!\n"); machine_power_off(); } Two problems here are.. 1. machine_power_off() isn't exported on ppc64. (patch below) 2. that printk will never hit the logs, so the admin will just find a powered off box with no idea what happened. Should we at least sync block devices before doing the power off ? Dave Export machine_power_off() on ppc64, as the pcwd watchdog driver needs it. Signed-off-by: Dave Jones <[EMAIL PROTECTED]> --- linux-2.6.12/arch/ppc64/kernel/setup.c~ 2005-08-09 17:37:36.0 -0400 +++ linux-2.6.12/arch/ppc64/kernel/setup.c 2005-08-09 17:37:53.0 -0400 @@ -706,6 +706,7 @@ void machine_power_off(void) local_irq_disable(); while (1) ; } +EXPORT_SYMBOL(machine_power_off); void machine_halt(void) { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
suspicious behaviour in pcwd driver.
drivers/char/watchdog/pcwd.c does this if it detects a temperature out of range.. if (temp_panic) { printk (KERN_INFO PFX Temperature overheat trip!\n); machine_power_off(); } Two problems here are.. 1. machine_power_off() isn't exported on ppc64. (patch below) 2. that printk will never hit the logs, so the admin will just find a powered off box with no idea what happened. Should we at least sync block devices before doing the power off ? Dave Export machine_power_off() on ppc64, as the pcwd watchdog driver needs it. Signed-off-by: Dave Jones [EMAIL PROTECTED] --- linux-2.6.12/arch/ppc64/kernel/setup.c~ 2005-08-09 17:37:36.0 -0400 +++ linux-2.6.12/arch/ppc64/kernel/setup.c 2005-08-09 17:37:53.0 -0400 @@ -706,6 +706,7 @@ void machine_power_off(void) local_irq_disable(); while (1) ; } +EXPORT_SYMBOL(machine_power_off); void machine_halt(void) { - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: suspicious behaviour in pcwd driver.
On Mon, Aug 22, 2005 at 02:30:06PM -0400, Dave Jones wrote: drivers/char/watchdog/pcwd.c does this if it detects a temperature out of range.. if (temp_panic) { printk (KERN_INFO PFX Temperature overheat trip!\n); machine_power_off(); } Two problems here are.. 1. machine_power_off() isn't exported on ppc64. (patch below) I was looking at an old tree, and this is now kernel_power_off() so this isn't a problem for pcwd, however the export is still needed for drivers/macintosh/therm_pm72.c 2. that printk will never hit the logs, so the admin will just find a powered off box with no idea what happened. Should we at least sync block devices before doing the power off ? AFAICS, this is still a problem with kernel_power_off() though ? Dave - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: suspicious behaviour in pcwd driver.
Export machine_power_off() on ppc64, as the pcwd watchdog driver needs it. Signed-off-by: Dave Jones [EMAIL PROTECTED] --- linux-2.6.12/arch/ppc64/kernel/setup.c~ 2005-08-09 17:37:36.0 -0400 +++ linux-2.6.12/arch/ppc64/kernel/setup.c2005-08-09 17:37:53.0 -0400 @@ -706,6 +706,7 @@ void machine_power_off(void) local_irq_disable(); while (1) ; } +EXPORT_SYMBOL(machine_power_off); void machine_halt(void) { In fact, we need that for the G5 thermal driver too. I wonder why/how this export got removed ... Some over-zealous janitors ? Hrm... /me plays with gitk Ahhh, ok, so that is this patch: machine_restart, machine_halt and machine_power_off are machine specific hooks deep into the reboot logic, that modules have no business messing with. Usually code should be calling kernel_restart, kernel_halt, kernel_power_off, or emergency_restart. So don't export machine_restart, machine_halt, and machine_power_off so we can catch buggy users. Well, I think for now, it's safe for therm_pm72 to call machine_power_off() in case of critical overtemp. I'll have a look at kernel_* equivalents later. Can you still slip that patch into 2.6.13 ? Ben. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/