Re: 2.6.22 regression: thermal trip points

2007-08-13 Thread Pavel Machek
Hi!

> > > > For the
> > > > upstream kernel, I think it is more appropriate to expose and fix
> > > > the fundamental problems.  For distro kernels, I'm less concerned
> > > > if you hide bugs instead of fixing them.
> > > 
> > > This is okay as long as you are willing to work around the fundamental
> > > problems in kernel. You are unable to _fix_ them. They are broken
> > > BIOSes.
> > 
> > The thing Linux needs to figure out is why Windows doesn't
> > get confused by what Linux claims to be broken BIOS.
> 
> Why do you assume that Windows works? Yes, it probably will not have
> the 'machine runs at 50% speed' problem, but I'd be very surprised if
> critical shutdown worked properly on more than 90% of notebooks.
> 
> > So far I have one live sighting to be addressed by
> > the upstream kernel (from Knut).  I'm certainly looking
> > forward to the 2nd live sighting...
> 
> Ok, I guess I should steal that old xe3 I was talking about...

Done, xe3 was re-built from parts.

/proc/acpi/.../trip_points:
critical (S5):  100 C
passive:        83 C...
active[0]:      100 C...

(Hmm, active=critical? Interesting. Fortunately the fan seems to be
driven by the BIOS.)
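
One way to spot this kind of oddity (an active trip point at or above the critical one) is to parse the trip_points text. The sketch below assumes the /proc output format shown in this message; the function name check_active_vs_critical is made up for illustration.

```shell
# Sketch only: reads trip_points text on stdin and reports whether any
# active[N] trip point is at or above the critical one.
# Assumes lines shaped like "critical (S5):  100 C" and "active[0]:  100 C...".
check_active_vs_critical() {
    awk '
        /^critical/ { crit = $3 + 0 }       # third field is the temperature
        /^active\[/ { act[$1] = $2 + 0 }    # second field is the temperature
        END {
            bad = 0
            for (name in act)
                if (crit > 0 && act[name] >= crit) {
                    printf "%s %d C >= critical %d C\n", name, act[name], crit
                    bad = 1
                }
            if (!bad) print "ok"
        }'
}
```

It would be fed from the zone's file, e.g. `check_active_vs_critical < "$ZONE/trip_points"`, where ZONE stands for the thermal zone directory (elided above, so not guessed here).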

Temperature is ~63 C in "normal" use. Now let's simulate fan failure...
and let's load the CPU...

temperature slowly rises, 1min00 -- 72C, 1min15 -- 75C, 1min30 --
77C, 1min45 -- 80C, 1min00 -- 82C, 1min15 -- 83C, 1min45 -- sudden
powerdown, presumably because of hardware failsafe.

So we have two bugs here: the machine should have attempted passive
cooling sooner, so that the critical temperature would not be reached,
and it should have attempted a shutdown before the hardware failsafe
killed the power. I could do both in 2.6.21, by echoing new trip points
and enabling polling.
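
For reference, the 2.6.21 procedure could look roughly like the sketch below. It follows the echo commands Knut posted in this thread (a colon-separated string written to trip_points, a number of seconds written to polling_frequency). The field order critical:hot:passive:active0:active1 is an assumption about that interface, the zone directory is a parameter because it is elided above, and both function names are hypothetical.

```shell
# Sketch only: builds the colon-separated override string.
# Assumed field order: critical:hot:passive:active0:active1 (degrees C).
trip_string() {
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Prints (does not run) the commands that would apply the override.
# $1 = thermal zone directory, $2 = override string, $3 = polling interval (s).
print_override_commands() {
    printf 'echo "%s" > %s/trip_points\n' "$2" "$1"
    printf 'echo %s > %s/polling_frequency\n' "$3" "$1"
}
```

With Knut's values, `print_override_commands "$ZONE" "$(trip_string 100 0 65 70 0)" 2` prints the two echo lines for review before anything is written.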

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22 regression: thermal trip points

2007-08-07 Thread Pavel Machek
On Tue 2007-08-07 14:58:45, Len Brown wrote:
> On Monday 06 August 2007 05:55, Pavel Machek wrote:
> > > For the
> > > upstream kernel, I think it is more appropriate to expose and fix
> > > the fundamental problems.  For distro kernels, I'm less concerned
> > > if you hide bugs instead of fixing them.
> > 
> > This is okay as long as you are willing to work around the fundamental
> > problems in kernel. You are unable to _fix_ them. They are broken
> > BIOSes.
> 
> The thing Linux needs to figure out is why Windows doesn't
> get confused by what Linux claims to be broken BIOS.

Why do you assume that Windows works? Yes, it probably will not have
the 'machine runs at 50% speed' problem, but I'd be very surprised if
critical shutdown worked properly on more than 90% of notebooks.

> So far I have one live sighting to be addressed by
> the upstream kernel (from Knut).  I'm certainly looking
> forward to the 2nd live sighting...

Ok, I guess I should steal that old xe3 I was talking about...

Vojtech, could I have that machine from the table football room for a
few experiments? I keep using it as a counterexample.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-07 Thread Len Brown
On Monday 06 August 2007 05:55, Pavel Machek wrote:
> > For the
> > upstream kernel, I think it is more appropriate to expose and fix
> > the fundamental problems.  For distro kernels, I'm less concerned
> > if you hide bugs instead of fixing them.
> 
> This is okay as long as you are willing to work around the fundamental
> problems in kernel. You are unable to _fix_ them. They are broken
> BIOSes.

The thing Linux needs to figure out is why Windows doesn't
get confused by what Linux claims to be broken BIOS.

So far I have one live sighting to be addressed by
the upstream kernel (from Knut).  I'm certainly looking
forward to the 2nd live sighting...

-Len


Re: 2.6.22 regression: thermal trip points

2007-08-06 Thread Pavel Machek
Hi!

> > If we have something like this, we could still discuss a config option,
> > that also allows to increase trip points, marking it with "If you set
> > this you can destroy your machine, you have been warned...". While this
> > would not be an option for distributions to compile in, some people may
> > come around the biggest hammer -> overriding DSDT.
> > 
> > I cannot promise, but I try to get this for 2.6.24.
> 
> I think if you are enamored with overriding trip points at SuSE,
> that you should simply restore the original scheme as the "value add"
> for SuSE kernels.  Seriously, I'm totally fine with that.
> 
> You should be aware, however, that (one of) the fundamental flaws
> with that scheme, shared with what you describe above, is that the OS
> can not actually change the trip points in the thermal sensor.
> The sensor is going to trip at the temperature that _it_ thinks

Yep, you work around this one by enabling polling.

> This faking out the user, plus the fact that the BIOS does change
> trip-points at run-time, made the original scheme fundamentally
> unsound.  Further, I've not yet found a single system where use

Yes, this one is uglier. But maybe "enable polling automatically +
ignore any updates from BIOS" (+ maybe "only enable lowering") is
a better solution than "just remove the knob"? After all, "the knob" is
still useful for debugging at least.

> of this scheme wasn't papering over some other problem.  For the
> upstream kernel, I think it is more appropriate to expose and fix
> the fundamental problems.  For distro kernels, I'm less concerned
> if you hide bugs instead of fixing them.

This is okay as long as you are willing to work around the fundamental
problems in kernel. You are unable to _fix_ them. They are broken
BIOSes.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Len Brown
On Friday 03 August 2007 07:16, Thomas Renninger wrote:
> On Thu, 2007-08-02 at 20:38 +0200, Andi Kleen wrote:
> > On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> > > On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> > > > On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > > > > Set a taint flag, 
> > > > > > That's hardly any useful if the machine is dead afterwards.
> > > > > 
> > > > > It won't be the hardware will do a failsafe shutdown first.
> > > > 
> > > > Not necessarily. At SUSE we had at least one broken laptop
> > > > with wrong trip points. The machine ran very hot for some time
> > > > and afterwards the hard disk was dead.
> > > 
> > > Yes, but it was original BIOS trip points that were wrong. And yes,
> > > its failsafe shutdown was too late. At least lowering the trip points
> > > would allow me to run it safely.
> > 
> > I have no problem with lowering them (in fact I proposed this
> > to Thomas as a possible solution at some point). Just rising 
> > is a bad idea.
> 
> Ok.
> If nobody screams (especially Len who has to accept this in the end, I
> don't want to do work for nothing..), I'll try an implementation that:
>   - Allows lowering trip points
>   - If BIOS modifies trip points, the overridden ones might also
>     get lowered if they are even lower
>   - Allow the definition of a passive trip point (with some default
>     values for hysteresis), even if the thermal zone does not
>     provide one
> 
> If we have something like this, we could still discuss a config option,
> that also allows to increase trip points, marking it with "If you set
> this you can destroy your machine, you have been warned...". While this
> would not be an option for distributions to compile in, some people may
> come around the biggest hammer -> overriding DSDT.
> 
> I cannot promise, but I try to get this for 2.6.24.

I think if you are enamored with overriding trip points at SuSE,
that you should simply restore the original scheme as the "value add"
for SuSE kernels.  Seriously, I'm totally fine with that.

You should be aware, however, that (one of) the fundamental flaws
with that scheme, shared with what you describe above, is that the OS
can not actually change the trip points in the thermal sensor.
The sensor is going to trip at the temperature that _it_ thinks
the trip point is at -- not the trip point that you are letting
the user think it is at.  I.e., what is advertised as a trip-point
override actually defeats the entire concept of trip points,
and it is mandatory that you enable periodic polling of the
current temperature to compare with your new thresholds
to work around that.
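
The polling workaround amounts to software comparing each temperature sample against its own thresholds, since the sensor will not raise an event at the overridden value. A minimal sketch of that comparison, assuming integer degrees Celsius; the function name and the ok/passive/critical labels are hypothetical, and this is not the kernel's code.

```shell
# Sketch only: decide what a poller should do for one temperature sample.
# $1 = current temperature, $2 = passive threshold, $3 = critical threshold.
classify_temp() {
    if [ "$1" -ge "$3" ]; then
        echo critical      # poller would start an orderly shutdown
    elif [ "$1" -ge "$2" ]; then
        echo passive       # poller would throttle the CPU
    else
        echo ok
    fi
}
```

A poller would call this once per polling interval with a fresh reading. Anything that happens between two samples goes unnoticed, which is the cost of replacing hardware trip events with polling.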

This faking out the user, plus the fact that the BIOS does change
trip-points at run-time, made the original scheme fundamentally
unsound.  Further, I've not yet found a single system where use
of this scheme wasn't papering over some other problem.  For the
upstream kernel, I think it is more appropriate to expose and fix
the fundamental problems.  For distro kernels, I'm less concerned
if you hide bugs instead of fixing them.

We had quite a long discussion when I deleted the trip-point-override
scheme in -mm.  Then it rode through the entire 2.6.22 release cycle.
However, I have yet to see a single bug report filed that has shown
that Linux should be doing this, or something like it.  I'm hopeful
that Knut's or Adrian's will be the first -- but I'm still waiting.

-Len


Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Len Brown
On Friday 03 August 2007 07:43, Renato S. Yamane wrote:
> Len Brown wrote:
> > On Thursday 02 August 2007 04:40, Knut Petersen wrote:
> >> mainboard: AOpen i915GMm-hfs, AWARD BIOS
> >> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> >> openSuSE 10.2 with kernel 2.6.22.1
> >>
> >> The cpu fan cannot be controlled by the Linux kernel.
> >> The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
> >> The active and passive trip points both are set to 50 deg. Celsius.
> >> Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
> >> The BIOS never changes the trip points.
> >> Cpufreq does work perfectly.
> 
> On my Toshiba M45-S355 (Toshiba Bios, Pentium M 750 - 0.8 at 1.86GHz, 
> Debian Etch) I see the same using Kernel 2.6.21.6
> 
> >> Previously there was the possibility  to add something like
> >>
> >> echo  "100:0:65:70:0" > /proc/acpi/thermal_zone/THRM/trip_points
> >> echo  2 > /proc/acpi/thermal_zone/THRM/polling_frequency
> >> echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> 
> I never do that, but see below (Kernel 2.6.21.6):
> 
> cat /proc/acpi/thermal_zone/TZCL/trip_points
> critical (S5):   105 C
> 
> cat /proc/acpi/thermal_zone/TZCL/polling_frequency
> polling frequency:   2 seconds
> 
> cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> ondemand
> 

Renato,
I don't understand how your Toshiba is similar to Knut's AOpen.
You've got a single critical trip point at 105C, but no active or passive
trip points.

Are you reporting some kind of failure?

The only thing wrong with your system is that polling_frequency != 0 --
but that is probably a distro configuration issue rather than
a kernel issue.
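
That check can be scripted. The sketch below parses the polling_frequency text in the format Renato quoted ("polling frequency:   2 seconds"). Treating output without a number as "polling disabled" is an assumption about what the file prints when the value is 0; the function name is hypothetical.

```shell
# Sketch only: reads polling_frequency text on stdin and prints the interval
# in seconds, or 0 when no number is present (assumed to mean disabled).
polling_seconds() {
    awk '
        /polling frequency:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) { secs = $i; exit }
        }
        END { print secs + 0 }'
}
```

For Renato's zone this would be `polling_seconds < /proc/acpi/thermal_zone/TZCL/polling_frequency`; any result other than 0 means something has turned polling on.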

thanks,
-Len



Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Len Brown
On Friday 03 August 2007 08:53, Knut Petersen wrote:
> Len Brown:
> > Thanks for the sighting, Knut!
> > This regression is dramatic when put in the terms of 50% performance hit!
> > I guess the good news is that thermal throttling is doing the job
> > we are asking it to:-)
> >
> Thermal management by cpufreq is working really fine ;-)

Unfortunately, a lot of people don't understand the ";-)"
after this statement, and they really think that cpufreq is a
solution for thermal management.  It isn't.  Systems still
need to be thermally sane when they are fully utilized, and
cpufreq does not help there.

> My problems are definitely not related to a linux bug. All trip_points
> are fixed, hardcoded in the system BIOS at address 0x000FF810.
> 
> Yes, I could hack  and flash a custom BIOS.
> 
> After reading a lot I think I even could fix the DSDT.

No, you should never have to override your BIOS --
except for debugging.

If Windows works out-of-the-box on this system,
then Linux should too - even if we have to use a DMI-based
workaround for a BIOS bug.

I'm looking forward to seeing the bug report that you are
going to file.  Please include the dmidecode output in addition
to the acpidump output.
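
Collecting the two outputs could be scripted along these lines. dmidecode and acpidump are the tools named above, and both normally need root. The helper name collect and the output file names are made up for illustration.

```shell
# Sketch only: run a command and save its output, noting when the tool
# is not installed instead of failing silently.
# $1 = output file, remaining arguments = command line to run.
collect() {
    out=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$out" 2>&1
    else
        echo "$1: command not found" > "$out"
    fi
}

# Intended use (as root):
#   collect dmidecode.txt dmidecode
#   collect acpidump.txt acpidump
```

Both files can then be attached to the bug report as they are.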

thanks,
-Len


Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Knut Petersen
Len Brown:
> Thanks for the sighting, Knut!
> This regression is dramatic when put in the terms of 50% performance hit!
> I guess the good news is that thermal throttling is doing the job
> we are asking it to:-)
>
Thermal management by cpufreq is working really fine ;-)

My problems are definitely not related to a linux bug. All trip_points
are fixed, hardcoded in the system BIOS at address 0x000FF810.

Yes, I could hack  and flash a custom BIOS.

After reading a lot I think I even could fix the DSDT.

But all that would only be a solution for my system. The principal
question is whether the hook that allowed overriding unreasonable
trip point definitions is too dangerous to be part of the Linux kernel.

You and some others believed it should not be part of the kernel,
and so it was eliminated a while ago. Some people want it back,
either because
- they desperately need it to keep their machines running healthily,
- they need it to restore the performance of their machines, or
- they want a really quiet system.

Root should be allowed to smoke his system - ask him if he really
wants to do so, ask him to echo "Yes, it's me who is guilty" to
some file before allowing trip point changes, but do not eliminate
hooks useful for the management of buggy machines from our
kernel.

We do need writable trip points again. And, Thomas, some people
also need to raise the defaults.

cu,
 Knut


Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Renato S. Yamane

Len Brown wrote:
> On Thursday 02 August 2007 04:40, Knut Petersen wrote:
>> mainboard: AOpen i915GMm-hfs, AWARD BIOS
>> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
>> openSuSE 10.2 with kernel 2.6.22.1
>>
>> The cpu fan cannot be controlled by the Linux kernel.
>> The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
>> The active and passive trip points both are set to 50 deg. Celsius.
>> Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
>> The BIOS never changes the trip points.
>> Cpufreq does work perfectly.

On my Toshiba M45-S355 (Toshiba Bios, Pentium M 750 - 0.8 at 1.86GHz,
Debian Etch) I see the same using Kernel 2.6.21.6

>> Previously there was the possibility to add something like
>>
>> echo  "100:0:65:70:0" > /proc/acpi/thermal_zone/THRM/trip_points
>> echo  2 > /proc/acpi/thermal_zone/THRM/polling_frequency
>> echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor


I never do that, but see below (Kernel 2.6.21.6):

cat /proc/acpi/thermal_zone/TZCL/trip_points
critical (S5):   105 C

cat /proc/acpi/thermal_zone/TZCL/polling_frequency
polling frequency:   2 seconds

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand

Regards,
Renato S. Yamane


Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Thomas Renninger
On Thu, 2007-08-02 at 20:38 +0200, Andi Kleen wrote:
> On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> > On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> > > On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > > > Set a taint flag, 
> > > > > That's hardly any useful if the machine is dead afterwards.
> > > > 
> > > > It won't be the hardware will do a failsafe shutdown first.
> > > 
> > > Not necessarily. At SUSE we had at least one broken laptop
> > > with wrong trip points. The machine ran very hot for some time
> > > and afterwards the hard disk was dead.
> > 
> > Yes, but it was original BIOS trip points that were wrong. And yes,
> > its failsafe shutdown was too late. At least lowering the trip points
> > would allow me to run it safely.
> 
> I have no problem with lowering them (in fact I proposed this
> to Thomas as a possible solution at some point). Just rising 
> is a bad idea.

Ok.
If nobody screams (especially Len who has to accept this in the end, I
don't want to do work for nothing..), I'll try an implementation that:
  - Allows lowering trip points
  - If BIOS modifies trip points, the overridden ones might also
    get lowered if they are even lower
  - Allow the definition of a passive trip point (with some default
    values for hysteresis), even if the thermal zone does not
    provide one
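
The first two points can be read as a "lower only" rule: the effective trip point is the lower of the BIOS value and the user's override, re-evaluated whenever the BIOS changes its value. A minimal sketch of that reading, with a hypothetical function name and integer degrees Celsius; it illustrates the proposal and is not an existing kernel interface.

```shell
# Sketch only: effective trip point under a "lowering only" policy.
# $1 = trip point currently reported by the BIOS
# $2 = override requested by the user (empty = no override)
effective_trip() {
    if [ -n "$2" ] && [ "$2" -lt "$1" ]; then
        echo "$2"      # override is lower than the BIOS value: use it
    else
        echo "$1"      # override absent or not lower: keep the BIOS value
    fi
}
```

Under this rule a request to raise a trip point is ignored, and a BIOS update that drops below the override takes precedence on the next evaluation.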

If we have something like this, we could still discuss a config option
that also allows increasing trip points, marking it with "If you set
this you can destroy your machine, you have been warned...". While this
would not be an option for distributions to compile in, some people may
come around the biggest hammer -> overriding DSDT.

I cannot promise, but I will try to get this in for 2.6.24.

   Thomas



Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Len Brown
On Friday 03 August 2007 08:53, Knut Petersen wrote:
> Len Brown:
> > Thanks for the sighting, Knut!
> > This regression is dramatic when put in the terms of 50% performance hit!
> > I guess the good news is that thermal throttling is doing the job
> > we are asking it to:-)
>
> Thermal management by cpufreq is working really fine ;-)

Unfortunately, a lot of people don't understand the ;-)
after this statement, and they really think that cpufreq is a
solution for thermal management.  It isn't.  Systems still
need to be thermally sane when they are fully utilized, and
cpufreq does not help with that.

> My problems are definitely not related to a linux bug. All trip_points
> are fixed, hardcoded in the system BIOS at address 0x000FF810.
>
> Yes, I could hack and flash a custom BIOS.
>
> After reading a lot I think I even could fix the DSDT.

No, you should never have to override your BIOS --
except for debugging.

If Windows works out-of-the-box on this system,
then Linux should too - even if we have to use a DMI-based
workaround for a BIOS bug.

I'm looking forward to seeing the bug report that you are
going to file.  Please include the dmidecode output in addition
to the acpidump output.

thanks,
-Len
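
A DMI-based workaround of the kind Len mentions boils down to matching
identification strings (as reported by dmidecode) against a table. A rough
sketch of that lookup, assuming substring matching on each field; the field
names, the table entry and the quirk name are all hypothetical, and the
board strings are simply taken from Knut's report rather than from real
dmidecode output:

```python
def dmi_matches(system, required):
    """True if every required DMI field occurs as a substring of the system's value."""
    return all(value in system.get(field, "") for field, value in required.items())


# Hypothetical quirk table: the strings come from Knut's description of his
# board, and the quirk name is invented purely for illustration.
QUIRK_TABLE = [
    ({"board_vendor": "AOpen", "board_name": "i915GMm"}, "hypothetical_thermal_quirk"),
]


def find_quirks(system):
    """Return the names of all quirks whose DMI strings match this system."""
    return [name for required, name in QUIRK_TABLE if dmi_matches(system, required)]


# Assumed DMI data for the affected board, and for an unrelated machine.
print(find_quirks({"board_vendor": "AOpen Inc.", "board_name": "i915GMm-hfs"}))
print(find_quirks({"board_vendor": "TOSHIBA", "board_name": "M45"}))
```

This is why the dmidecode output is requested along with acpidump: without
the exact strings there is nothing reliable to match on.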


Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Len Brown
On Friday 03 August 2007 07:43, Renato S. Yamane wrote:
> Len Brown wrote:
> > On Thursday 02 August 2007 04:40, Knut Petersen wrote:
> > > mainboard: AOpen i915GMm-hfs, AWARD BIOS
> > > cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> > > openSuSE 10.2 with kernel 2.6.22.1
> > >
> > > The cpu fan can not be controlled by linux kernel.
> > > The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
> > > The active and passive trip points both are set to 50 deg. Celsius.
> > > Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
> > > The BIOS never changes the trip points.
> > > Cpufreq does work perfectly.
>
> On my Toshiba M45-S355 (Toshiba BIOS, Pentium M 750, 0.8 to 1.86 GHz,
> Debian Etch) I see the same using kernel 2.6.21.6.
>
> > > Previously there was the possibility  to add something like
> > >
> > > echo  "100:0:65:70:0" > /proc/acpi/thermal_zone/THRM/trip_points
> > > echo  2 > /proc/acpi/thermal_zone/THRM/polling_frequency
> > > echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>
> I have never done that, but see below (kernel 2.6.21.6):
>
> cat /proc/acpi/thermal_zone/TZCL/trip_points
> critical (S5):   105 C
>
> cat /proc/acpi/thermal_zone/TZCL/polling_frequency
> polling frequency:   2 seconds
>
> cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> ondemand

Renato,
I don't understand how your Toshiba is similar to Knut's AOpen.
You've got a single critical trip point at 105C, but no active or passive
trip points.

Are you reporting some kind of failure?

The only thing wrong with your system is that polling_frequency != 0 --
but that is probably a distro configuration issue rather than
a kernel issue.

thanks,
-Len



Re: 2.6.22 regression: thermal trip points

2007-08-03 Thread Len Brown
On Friday 03 August 2007 07:16, Thomas Renninger wrote:
> On Thu, 2007-08-02 at 20:38 +0200, Andi Kleen wrote:
> > On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> > > On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> > > > On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > > > > Set a taint flag,
> > > > > > That's hardly any useful if the machine is dead afterwards.
> > > > >
> > > > > It won't be the hardware will do a failsafe shutdown first.
> > > >
> > > > Not necessarily. At SUSE we had at least one broken laptop
> > > > with wrong trip points. The machine ran very hot for some time
> > > > and afterwards the hard disk was dead.
> > >
> > > Yes, but it was original BIOS trip points that were wrong. And yes,
> > > its failsafe shutdown was too late. At least lowering the trip points
> > > would allow me to run it safely.
> >
> > I have no problem with lowering them (in fact I proposed this
> > to Thomas as a possible solution at some point). Just rising
> > is a bad idea.
>
> Ok.
> If nobody screams (especially Len who has to accept this in the end, I
> don't want to do work for nothing..), I'll try an implementation that:
>   - Allows lowering trip points
>   - If BIOS modifies trip points, the overridden ones might also
>     get lowered if they are even lower
>   - Allow the definition of a passive trip point (with some default
>     values for hysteresis), even if the thermal zone does not
>     provide one
>
> If we have something like this, we could still discuss a config option,
> that also allows to increase trip points, marking it with "If you set
> this you can destroy your machine, you have been warned". While this
> would not be an option for distributions to compile in, some people may
> come around the biggest hammer - overriding DSDT.
>
> I cannot promise, but I try to get this for 2.6.24.

I think if you are enamored with overriding trip points at SuSE,
that you should simply restore the original scheme as the value add
for SuSE kernels.  Seriously, I'm totally fine with that.

You should be aware, however, that (one of) the fundamental flaws
with that scheme, shared with what you describe above, is that the OS
can not actually change the trip points in the thermal sensor.
The sensor is going to trip at the temperature that _it_ thinks
the trip point is at -- not the trip point that you are letting
the user think it is at.  Ie. what is advertised as a trip-point
override actually defeats the entire concept of trip-points,
and it is mandatory that you enable periodic polling of the
current temperature to compare with your new thresholds
to work-around that.

This faking out the user, plus the fact that the BIOS does change
trip-points at run-time, made the original scheme fundamentally
unsound.  Further, I've not yet found a single system where use
of this scheme wasn't papering over some other problem.  For the
upstream kernel, I think it is more appropriate to expose and fix
the fundamental problems.  For distro kernels, I'm less concerned
if you hide bugs instead of fixing them.

We had quite a long discussion when I deleted the trip-point-override
scheme in -mm.  Then it rode through the entire 2.6.22 release cycle.
However, I have yet to see a single bug report filed that has shown
that Linux should be doing this, or something like it.  I'm hopeful
that Knut's or Adrian's will be the first -- but I'm still waiting.

-Len
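
Len's point is that an overridden trip point exists only in software, so
nothing happens unless the OS polls the temperature and compares it with
the new thresholds itself. A minimal sketch of that comparison loop, under
stated assumptions: read_temp is a hypothetical callable standing in for
reading the zone temperature, the threshold values are the ones from Knut's
echo line, and the 2 second default mirrors his polling_frequency setting:

```python
import time


def check_thresholds(temp_c, thresholds):
    """Return the names of software thresholds that temp_c has reached, hottest first."""
    reached = [(limit, name) for name, limit in thresholds.items() if temp_c >= limit]
    return [name for limit, name in sorted(reached, reverse=True)]


def poll(read_temp, thresholds, samples, interval_s=2.0):
    """Poll read_temp() `samples` times, yielding (temperature, reached thresholds)."""
    for i in range(samples):
        if i:
            time.sleep(interval_s)  # wait between polls, not before the first one
        temp_c = read_temp()
        yield temp_c, check_thresholds(temp_c, thresholds)


# Simulated readings; a real caller would read the thermal zone instead.
readings = iter([63, 66, 72])
overrides = {"critical": 100, "passive": 65, "active": 70}
for temp_c, reached in poll(lambda: next(readings), overrides, 3, interval_s=0):
    print(temp_c, reached)
```

With polling disabled, none of these comparisons would ever run, and only
the sensor's own (unchanged) trip points would generate events.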


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Len Brown
On Thursday 02 August 2007 05:45, Adrian Schröter wrote:
> On Thursday 02 August 2007 11:42:27 wrote Thomas Renninger:
> > On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
> > > Hi everybody!
> > >
> > > Kernel 2.6.22 decreases performance by about 50% on my system.
> > > No, I do not like that. The reason is a broken BIOS, granted, but there
> > > was a perfect workaround in the kernel that has been dropped.
> > >
> > > mainboard: AOpen i915GMm-hfs, AWARD BIOS
> > > cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> > > openSuSE 10.2 with kernel 2.6.22.1
> >
> > Is this a DELL laptop that gets throttled by 75% to throttling state 6
> > if 60 degrees are exceeded?
> > Adrian has such a machine..., no idea what is going on with that one,
> > but only workaround to get any use out of this machine is to override at
> > least the passive trip point.
> 
> JFYI, there are plenty of these systems around, it was one out of four 
> standard Novell models. I am maybe just the first one who uses Factory on 
> it, but expect more bug reports when 10.3 gets released ...

That's very good news, Adrian.  In the past all we had to go on
was the memory of a machine that died several years ago.
But if you've got a live failure, that is really valuable.

Please go here
http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
and submit a new sighting vs. Power-Thermal
and attach the output from acpidump, cat /proc/acpi/thermal_zone/*/*
and assign it to [EMAIL PROTECTED]

thanks,
-Len


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Len Brown
On Thursday 02 August 2007 04:40, Knut Petersen wrote:

> Kernel 2.6.22 decreases performance by about 50% on my system.
> No, I do not like that. The reason is a broken BIOS, granted, but there
> was a perfect workaround in the kernel that has been dropped.
> 
> mainboard: AOpen i915GMm-hfs, AWARD BIOS
> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> openSuSE 10.2 with kernel 2.6.22.1
> 
> The cpu fan can not be controlled by linux kernel.
> The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
> The active and passive trip points both are set to 50 deg. Celsius.
> Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
> The BIOS never changes the trip points.
> Cpufreq does work perfectly.
> 
> Previously there was the possibility  to add something like
> 
> echo  "100:0:65:70:0" > /proc/acpi/thermal_zone/THRM/trip_points
> echo  2 > /proc/acpi/thermal_zone/THRM/polling_frequency
> echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> 
> to e.g. /etc/init.d/boot.local. With 2.6.22 that solution does not exist
> any longer. Now the code in thermal.c slows down the cpu under load
> to prevent "overheating". Kernel compile time increases from about 12
> to 18 minutes. No, I don't like that, nobody would.
 

Thanks for the sighting, Knut!
This regression is dramatic when put in the terms of 50% performance hit!
I guess the good news is that thermal throttling is doing the job
we are asking it to:-)

The statement above regarding the existence of active trip points
and the kernel not being able to control the fan are inconsistent
with each other.

Please open a sighting for this machine here:

http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
vs. Power-Thermal
and attach the output from acpidump, cat /proc/acpi/thermal_zone/*/*
and assign it to [EMAIL PROTECTED]

BTW. does the board boot and run properly with "acpi=off"?

thanks,
-Len
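
The "about 50%" figure can be read two ways. A small worked calculation,
using only the 12 and 18 minute compile times from Knut's report (the
function name is made up for illustration):

```python
def slowdown(before_min, after_min):
    """Return (extra time as a fraction of before, throughput lost as a fraction)."""
    extra_time = (after_min - before_min) / before_min
    throughput_loss = 1 - before_min / after_min
    return extra_time, throughput_loss


extra, loss = slowdown(12, 18)
print(f"{extra:.0%} longer, {loss:.0%} less work per minute")
```

So the build takes 50% longer, which corresponds to roughly a third less
work done per minute.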


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Krzysztof Halasa
Knut Petersen <[EMAIL PROTECTED]> writes:

> echo "I know what I am doing" >
> /proc/acpi/thermal_zone/THRM/enable_really_dangerous_options

There is a shorter version:
$ su
Password:
# 
-- 
Krzysztof Halasa


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 08:38:30PM +0200, Andi Kleen wrote:
> On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> > Yes, but it was original BIOS trip points that were wrong. And yes,
> > its failsafe shutdown was too late. At least lowering the trip points
> > would allow me to run it safely.
> 
> I have no problem with lowering them (in fact I proposed this
> to Thomas as a possible solution at some point). Just rising 
> is a bad idea.

Though for this to be reliable, you need to ignore any notifications 
that would raise the trip points while still paying attention to any 
that would lower them.

-- 
Matthew Garrett | [EMAIL PROTECTED]
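
One literal reading of the rule above is a ratchet: start from the user's
lowered value, follow any BIOS notification that goes lower still, and
ignore any that would go higher. A minimal sketch under that assumption
(the function name and the sample values are illustrative, not kernel
code):

```python
def effective_trip(user_c, bios_updates):
    """Yield the effective trip point after each BIOS notification.

    Notifications that would raise the trip point are ignored, while
    notifications that would lower it are honoured.
    """
    effective = user_c
    for bios_c in bios_updates:
        if bios_c < effective:
            effective = bios_c
        yield effective


# User lowered the trip point to 65 C; BIOS then announces 70, 60 and 68 C.
print(list(effective_trip(65, [70, 60, 68])))
```

Note that under this reading the value never recovers once the BIOS has
lowered it, which is safe but may throttle more than the firmware intended.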


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> > On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > > Set a taint flag, 
> > > > That's hardly any useful if the machine is dead afterwards.
> > > 
> > > It won't be the hardware will do a failsafe shutdown first.
> > 
> > Not necessarily. At SUSE we had at least one broken laptop
> > with wrong trip points. The machine ran very hot for some time
> > and afterwards the hard disk was dead.
> 
> Yes, but it was original BIOS trip points that were wrong. And yes,
> its failsafe shutdown was too late. At least lowering the trip points
> would allow me to run it safely.

I have no problem with lowering them (in fact I proposed this
to Thomas as a possible solution at some point). Just raising
them is a bad idea.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Pavel Machek
Hi!

> Well, it would not be the first time to eliminate a regression by
> reverting a patch after it was accepted previously.
> >> Sanity checks that trip points only can get lowered (compared to initial
> >> provided ones) needs to be added.
> >> Len, Rui: For short-term can some 
> But I _need_ to raise the unreasonably low passive trip point. We could
> decide to protect the innocent user by allowing write access to
> trip_points only after a previous

Actually, you should lower your active trip point, and keep cpu temp
below 50C.

> echo "I know what I am doing" >
> /proc/acpi/thermal_zone/THRM/enable_really_dangerous_options

No... but a patch that only permits lowering could be acceptable.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Pavel Machek
On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > Set a taint flag, 
> > > That's hardly any useful if the machine is dead afterwards.
> > 
> > It won't be the hardware will do a failsafe shutdown first.
> 
> Not necessarily. At SUSE we had at least one broken laptop
> with wrong trip points. The machine ran very hot for some time
> and afterwards the hard disk was dead.

Yes, but it was original BIOS trip points that were wrong. And yes,
its failsafe shutdown was too late. At least lowering the trip points
would allow me to run it safely.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Pavel Machek
Hi!

> > I didn't understand the arguments either, actually.
> 
> The issue is that you can actually kill hardware by setting this wrong.
> We've had such cases where trip point problems eventually lead
> to overheated laptops with hard disks dying etc. 

Actually, that was my machine. Omnibook xe3; BIOS provided trip points
*did* kill the disk. At least I was able to work around it with
writing to trip points.

Yes, ACPI mandates emergency shutdown when critical+delta point is
reached, *in hardware*. So this only endangers very broken machines,
and it also fixes a lot of them.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > Set a taint flag, 
> > That's hardly any useful if the machine is dead afterwards.
> 
> It won't be the hardware will do a failsafe shutdown first.

Not necessarily. At SUSE we had at least one broken laptop
with wrong trip points. The machine ran very hot for some time
and afterwards the hard disk was dead.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
> > Andi, would the above be mechanism sufficiently safe for your taste?
> 
> No.

I don't believe Andi's taste (or lack thereof) is relevant to this
discussion. He's not, for example, explained why it's better to force people
to disable all the ACPI power and thermal control on their system rather
than adjust trip points.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
> > Set a taint flag, 
> That's hardly any useful if the machine is dead afterwards.

It won't be; the hardware will do a failsafe shutdown first.

> You'll just end up with "Linux destroyed my laptop" headlines all 
> over the internet and rightfully very annoyed users.

You have to systematically sit down and tweak your machine.

> The philosophy didn't include physically destroying hardware
> as far as I know.

It most certainly did. With safety checks you could override.

> > As root you can erase the bios, 
> We don't ship the devbios driver for good reasons.

That's debatably a bad reason (the user space API is wrong, that's all), and
one that's totally inconsistent with some of the other drivers we do ship.

> > lock the hard disk with a random
> > password, reflash your video card  
> 
> That all requires significant effort and custom software. It's not that we 
> have a one liner echo destroy > /sys/.../flash-bios. 

Well you can do the hard disk one in one line of perl, the video card one
in a small bit of C. And this merely makes the argument that raising the
trip points should be harder.

Alan


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 02:42:19PM +0200, Thomas Renninger wrote:
> On Thu, 2007-08-02 at 12:56 +0100, Matthew Garrett wrote:
> > The policy has been to attempt to be bug-compatible with Windows 
> > whenever possible for some time now.
> *whenever possible*

But there's no evidence whatsoever that this is something we can't 
handle...

> > No, that's not the only reason for notifications. Alteration in hardware 
> > state may also force a recalculation of trip point (adding a battery to 
> > a bay rather than a DVD drive may require the platform to be kept at a 
> > lower temperature)
> "I've seen no evidence that this happens...", but I see the point.

It's explicitly mentioned as one of the use cases for trip point 
alteration in the spec.

> > Surely people want this functionality so that they can raise trip 
> > points?
> For Adrian it would be enough to be able to lower them.

Which suggests that we're probably doing something wrong at some more 
fundamental level...

> Also being able to define a passive trip point (even if not provided by
> BIOS) could help a lot machines.

I agree that being able to lower trip points is unlikely to result in 
hardware damage, but still think that it's likely to be papering over 
genuine bugs that we could fix properly.

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 02:35:18PM +0200, Thomas Renninger wrote:
> On Thu, 2007-08-02 at 13:15 +0100, Matthew Garrett wrote:
> > That machine has no active thermal trip points, so I'm not sure how it's 
> > relevant here.
> From above: "Windows as I understand it has vendor mechanisms to..."
> Maybe thermal trip points are not influenced here, it's at least about
> thermal management and another proof that we cannot just try to copy
> Windows behavior, but need to provide workarounds wherever possible.

There's absolutely no evidence in the bug log there that the user's 
problems are in any way due to Windows-specific code. The SetSilentMode 
stuff is an additional item of functionality that underclocks various 
bits of hardware, not one that's actually required for the platform to 
function correctly.
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 12:56 +0100, Matthew Garrett wrote:
> On Thu, Aug 02, 2007 at 01:45:00PM +0200, Thomas Renninger wrote:
> > On Thu, 2007-08-02 at 12:13 +0100, Matthew Garrett wrote:
> > > I strongly suspect that the vast majority[1] of hardware that "needs" 
> > > the trip points changing works perfectly well under Windows, so it's 
> > > likely to be papering over bugs in the kernel. It'd be nice if we fixed 
> > > those rather than encouraging people to poke stuff into /proc,
> > Some arguments against that:
> >   - You cannot tell a customer: Wait for the kernel in half a year.
> > This is the time it at least needs until a laptop got sold, the
> > problem is found, a patch is written and checked in and finally
> > hits the distribution.
> 
> We have to do so frequently. New hardware often exposes bugs in the 
> kernel.
And often we can provide a boot param or whatever, that makes it at
least useable.
> 
> >   - You can also not backport fixes as ACPI patches mostly have the
> > potential to break other machines/BIOSes
> >   - There also exist the policy to not fix up/workaround totally broken
> > AML BIOS implementations
> 
> The policy has been to attempt to be bug-compatible with Windows 
> whenever possible for some time now.
*whenever possible*
> 
> >   - We do not need to and never will be able to copy or do the same
> > Windows is doing
> 
> Given that many vendors still only test against Windows, that's exactly 
> what we need to do.
But we cannot (copy all windows (mis-)behavior).
> 
> > > especially when doing so is guaranteed to break in really confusing ways 
> > > with a lot of hardware. The firmware can reset the trip points at 
> > > essentially arbitrary times and is well within its rights to expect the 
> > > OS to actually pay attention to them.
> > What the hell is so wrong with:
> > 
> > Let the user override the trip points. If he does so, ignore
> > thermal trip point updates from BIOS. Don't care for hysteresis
> > BIOS implementations (these are the BIOS trip point updates).
> 
> No, that's not the only reason for notifications. Alteration in hardware 
> state may also force a recalculation of trip point (adding a battery to 
> a bay rather than a DVD drive may require the platform to be kept at a 
> lower temperature)
"I've seen no evidence that this happens...", but I see the point.
> > If user changes them, it's his fault, he doesn't need to...
> > Make sure that trip points can only be lowered, compared to the
> > initially fetched one from BIOS.
> 
> Surely people want this functionality so that they can raise trip 
> points?
For Adrian it would be enough to be able to lower them.
Also being able to define a passive trip point (even if not provided by
BIOS) could help a lot of machines.

What about at least:
  - Be able to override passive cooling trip point
  - If BIOS does not provide one, let user be able to define it
This should already make a lot of people happy.

Thomas
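
The lower-only variant discussed in this message can be written down as a
small validation step. A sketch under stated assumptions: the function name
is hypothetical, and the rule that a newly defined passive trip point must
sit below the critical one is an added assumption, not something proposed
in the thread:

```python
def choose_passive_trip(requested_c, bios_passive_c, bios_critical_c):
    """Return the passive trip point to use, or raise ValueError if the request is refused.

    bios_passive_c is None when the BIOS provides no passive trip point.
    """
    if bios_passive_c is None:
        # Assumption: a user-defined passive trip point must be below critical.
        if requested_c >= bios_critical_c:
            raise ValueError("a new passive trip point must be below the critical one")
        return requested_c
    if requested_c > bios_passive_c:
        # Only lowering relative to the value fetched from the BIOS is allowed.
        raise ValueError("trip points may only be lowered")
    return requested_c


print(choose_passive_trip(45, 50, 100))    # lowering a BIOS value is accepted
print(choose_passive_trip(80, None, 105))  # defining a missing passive trip point
```

Under this policy a request to raise a 50 C passive trip point to 65 C, as
in Knut's case, would be refused, which is exactly the point of
disagreement in the thread.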




Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 13:15 +0100, Matthew Garrett wrote:
> On Thu, Aug 02, 2007 at 02:06:26PM +0200, Thomas Renninger wrote:
> > On Thu, 2007-08-02 at 12:57 +0100, Matthew Garrett wrote:
> > > On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
> > > > Windows as I understand it has vendor mechanisms to allow the bits
> > > > shipped with the OS to override/ignore just about everything trip points
> > > > included. Lots of hardware that requires fixups in Linux and just works
> > > > in Windows is not Linux bugs but Windows magic .inf files and other
> > > > registry gunge done by the machine vendor. We see this in ATA, in power
> > > > management and elsewhere.
> > > 
> > > I've seen no evidence that this happens with thermal trip points.
> > 
> > WMI needed for fan control -- FSC Amilo M3438G
> > http://bugzilla.kernel.org/show_bug.cgi?id=5670
> 
> That machine has no active thermal trip points, so I'm not sure how it's 
> relevant here.
From above: "Windows as I understand it has vendor mechanisms to..."
Maybe thermal trip points are not influenced here, it's at least about
thermal management and another proof that we cannot just try to copy
Windows behavior, but need to provide workarounds wherever possible.

   Thomas

> By the sounds of the bug log, I suspect Linux just runs 
> slightly hotter on the machine than Windows does - especially since the 
> user isn't running the closed nvidia driver, so there's nothing to carry 
> out any power management on the GPU.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 02:06:26PM +0200, Thomas Renninger wrote:
> On Thu, 2007-08-02 at 12:57 +0100, Matthew Garrett wrote:
> > On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
> > > Windows as I understand it has vendor mechanisms to allow the bits
> > > shipped with the OS to override/ignore just about everything trip points
> > > included. Lots of hardware that requires fixups in Linux and just works
> > > in Windows is not Linux bugs but Windows magic .inf files and other
> > > registry gunge done by the machine vendor. We see this in ATA, in power
> > > management and elsewhere.
> > 
> > I've seen no evidence that this happens with thermal trip points.
> 
> WMI needed for fan control -- FSC Amilo M3438G
> http://bugzilla.kernel.org/show_bug.cgi?id=5670

That machine has no active thermal trip points, so I'm not sure how it's 
relevant here. By the sounds of the bug log, I suspect Linux just runs 
slightly hotter on the machine than Windows does - especially since the 
user isn't running the closed nvidia driver, so there's nothing to carry 
out any power management on the GPU.
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
> Andi Kleen wrote:
> 
>   > I don't think it's that unreasonable to require source code
> modifications
>   >  for anything that can kill hardware. At least that raises the barrier
>   >  a bit and hopefully ensures people think twice about it and then really
>   > only blame themselves if anything goes wrong.
> 
> Andi, would the above be mechanism sufficiently safe for your taste?

No.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 12:57 +0100, Matthew Garrett wrote:
> On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
> > > I strongly suspect that the vast majority[1] of hardware that "needs" 
> > > the trip points changing works perfectly well under Windows, so it's 
> > 
> > Windows as I understand it has vendor mechanisms to allow the bits
> > shipped with the OS to override/ignore just about everything trip points
> > included. Lots of hardware that requires fixups in Linux and just works
> > in Windows is not Linux bugs but Windows magic .inf files and other
> > registry gunge done by the machine vendor. We see this in ATA, in power
> > management and elsewhere.
> 
> I've seen no evidence that this happens with thermal trip points.

WMI needed for fan control -- FSC Amilo M3438G
http://bugzilla.kernel.org/show_bug.cgi?id=5670

   Thomas



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
> Set a taint flag, 

That's hardly useful if the machine is dead afterwards.

> print a loud message 

Neither.

You'll just end up with "Linux destroyed my laptop" headlines all 
over the internet and rightfully very annoyed users.

> Or have you forgotten the original Unix
> philosophy too ?

The philosophy didn't include physically destroying hardware
as far as I know.

> > > Here we had obviously-useful-to-you functionality which was taken away
> > > without, afaik, providing any alternative.
> > 
> > I don't think it's that unreasonable to require source code modifications
> > for anything that can kill hardware.
> 
> As root you can erase the bios, 

We don't ship the devbios driver for good reasons.

> lock the hard disk with a random
> password, reflash your video card  

That all requires significant effort and custom software. It's not as if we
have a one-liner like echo destroy > /sys/.../flash-bios.

-Andi



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
> > I strongly suspect that the vast majority[1] of hardware that "needs" 
> > the trip points changing works perfectly well under Windows, so it's 
> 
> Windows as I understand it has vendor mechanisms to allow the bits
> shipped with the OS to override/ignore just about everything trip points
> included. Lots of hardware that requires fixups in Linux and just works
> in Windows is not Linux bugs but Windows magic .inf files and other
> registry gunge done by the machine vendor. We see this in ATA, in power
> management and elsewhere.

I've seen no evidence that this happens with thermal trip points.
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 01:45:00PM +0200, Thomas Renninger wrote:
> On Thu, 2007-08-02 at 12:13 +0100, Matthew Garrett wrote:
> > I strongly suspect that the vast majority[1] of hardware that "needs" 
> > the trip points changing works perfectly well under Windows, so it's 
> > likely to be papering over bugs in the kernel. It'd be nice if we fixed 
> > those rather than encouraging people to poke stuff into /proc,
> Some arguments against that:
>   - You cannot tell a customer: Wait for the kernel in half a year.
> That is at least how long it takes until a laptop has been sold, the
> problem is found, a patch is written and checked in and finally
> hits the distribution.

We have to do so frequently. New hardware often exposes bugs in the 
kernel.

>   - You also cannot backport fixes, as ACPI patches mostly have the
> potential to break other machines/BIOSes
>   - There is also the policy of not fixing up/working around totally
> broken AML BIOS implementations

The policy has been to attempt to be bug-compatible with Windows 
whenever possible for some time now.

>   - We do not need to, and never will be able to, copy or do the same
> as Windows is doing

Given that many vendors still only test against Windows, that's exactly 
what we need to do.

> > especially when doing so is guaranteed to break in really confusing ways 
> > with a lot of hardware. The firmware can reset the trip points at 
> > essentially arbitrary times and is well within its rights to expect the 
> > OS to actually pay attention to them.
> What the hell is so wrong with:
> 
> Let the user override the trip points. If he does so, ignore
> thermal trip point updates from the BIOS. Don't care about BIOS
> hysteresis implementations (these are the BIOS trip point updates).

No, that's not the only reason for notifications. A change in hardware
state may also force a recalculation of the trip points (adding a battery
to a bay rather than a DVD drive may require the platform to be kept at a
lower temperature).

> If the user changes them, it's his fault, he doesn't need to...
> Make sure that trip points can only be lowered, compared to the
> ones initially fetched from the BIOS.

Surely people want this functionality so that they can raise trip 
points?

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
> I strongly suspect that the vast majority[1] of hardware that "needs" 
> the trip points changing works perfectly well under Windows, so it's 

Windows as I understand it has vendor mechanisms to allow the bits
shipped with the OS to override/ignore just about everything trip points
included. Lots of hardware that requires fixups in Linux and just works
in Windows is not Linux bugs but Windows magic .inf files and other
registry gunge done by the machine vendor. We see this in ATA, in power
management and elsewhere.

Alan


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 12:13 +0100, Matthew Garrett wrote:
> On Thu, Aug 02, 2007 at 12:02:21PM +0100, Alan Cox wrote:
> > > Anyway, only solution/workaround to use these machines with current
> > > kernels is to override trip points, maybe the patch should really just
> > > be reverted...
> > 
> > The question really is whether the vendors will all revert it and carry
> > it as a patch or whether the main tree will accept reality on this one.
> > 
> > Reverting it and adding a taint marker if you do it is much preferable I
> > suspect to having every vendor revert this bogus if well meaning
> > changeset.
> 
> I strongly suspect that the vast majority[1] of hardware that "needs" 
> the trip points changing works perfectly well under Windows, so it's 
> likely to be papering over bugs in the kernel. It'd be nice if we fixed 
> those rather than encouraging people to poke stuff into /proc,
Some arguments against that:
  - You cannot tell a customer: Wait for the kernel in half a year.
That is at least how long it takes until a laptop has been sold, the
problem is found, a patch is written and checked in and finally
hits the distribution.
  - You also cannot backport fixes, as ACPI patches mostly have the
potential to break other machines/BIOSes
  - There is also the policy of not fixing up/working around totally
broken AML BIOS implementations
  - We do not need to, and never will be able to, copy or do the same
as Windows is doing
  - ...

> especially when doing so is guaranteed to break in really confusing ways 
> with a lot of hardware. The firmware can reset the trip points at 
> essentially arbitrary times and is well within its rights to expect the 
> OS to actually pay attention to them.
What the hell is so wrong with:

Let the user override the trip points. If he does so, ignore
thermal trip point updates from the BIOS. Don't care about BIOS
hysteresis implementations (these are the BIOS trip point updates).
If the user changes them, it's his fault, he doesn't need to...
Make sure that trip points can only be lowered, compared to the
ones initially fetched from the BIOS.

This is neither confusing nor dangerous in any way (besides the fact
that the critical trip point might get dynamically lowered by the BIOS,
which is totally insane).

  Thomas

> 
> [1] Some hardware is simply broken. We don't carry phc just because some 
> vendors put the wrong voltage values in their tables, either
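
As a rough illustration of the "lower only" rule proposed in this message, the check could look like the shell sketch below. The function name check_trip_override and its two arguments are hypothetical and do not correspond to any kernel interface; temperatures are assumed to be whole degrees Celsius. Note that such a rule would not help users who need to raise a trip point, which is the case raised elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not a kernel interface): accept a requested trip
# point only if it does not exceed the value the BIOS reported at boot.
# Arguments: $1 = BIOS-provided value, $2 = requested value (deg. Celsius).
check_trip_override() {
    bios_value=$1
    requested=$2
    if [ "$requested" -le "$bios_value" ]; then
        echo "$requested"   # lowering (or keeping) the value is allowed
        return 0
    fi
    echo "$bios_value"      # raising is refused; the BIOS value stays in force
    return 1
}
```

With a BIOS passive trip point of 83, a request for 70 would be accepted, while a request for 90 would be refused and 83 kept.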




Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Knut Petersen
Thomas Renninger wrote:
>> mainboard: AOpen i915GMm-hfs, AWARD BIOS
>> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
>> openSuSE 10.2 with kernel 2.6.22.1
> Is this a DELL laptop that gets throttled by 75% to throttling state 6
> if 60 degrees are exceeded?
No, it is a Pentium M desktop board:
Chipset i915GM, FSB 533MHz, max 2GB DDR2 RAM, 2 PCI
and 1 16x PCI Express slots, serial, parallel, usb, firewire,
2x Marvell Gigabit Ethernet, Realtek ALC 880 sound, IDE,
Intel SATA and SiI SATA Raid, FDC, DVI and VGA video out etc.
Very low power consumption: ~40W to 65W for the whole system,
except monitor.
> As 2.6.22 was shipped without, I think reverting is not a real option.
Well, it would not be the first time a regression was eliminated by
reverting a patch that had previously been accepted.
>> Sanity checks that trip points only can get lowered (compared to initial
>> provided ones) needs to be added.
>> Len, Rui: For short-term can some 
But I _need_ to raise the unreasonably low passive trip point. We could
decide to protect the innocent user by allowing write access to
trip_points only after a previous

echo "I know what I am doing" > /proc/acpi/thermal_zone/THRM/enable_really_dangerous_options

if we believe that this is a good idea ...

Andi Kleen wrote:

> I don't think it's that unreasonable to require source code modifications
> for anything that can kill hardware. At least that raises the barrier
> a bit and hopefully ensures people think twice about it and then really
> only blame themselves if anything goes wrong.

Andi, would the above mechanism be sufficiently safe for your taste?

cu,
 Knut
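
The acknowledgment gate suggested in this message can be modelled in a few lines of shell. This is only a sketch of the idea: the function names enable_dangerous_options and write_trip_points are hypothetical, the state lives in a shell variable, and no real /proc file is touched.

```shell
#!/bin/sh
# Hypothetical model of the proposed gate; nothing here talks to the kernel.
ACK_PHRASE="I know what I am doing"
dangerous_enabled=0

# Unlock write access only when the exact phrase is supplied.
enable_dangerous_options() {
    if [ "$1" = "$ACK_PHRASE" ]; then
        dangerous_enabled=1
        return 0
    fi
    return 1
}

# Refuse trip point writes until the gate has been opened.
write_trip_points() {
    if [ "$dangerous_enabled" -ne 1 ]; then
        echo "trip_points is read-only until acknowledged" >&2
        return 1
    fi
    echo "accepted: $1"
}
```

Calling write_trip_points before enable_dangerous_options fails; after the phrase has been supplied, the same call succeeds.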



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 12:02:21PM +0100, Alan Cox wrote:
> > Anyway, only solution/workaround to use these machines with current
> > kernels is to override trip points, maybe the patch should really just
> > be reverted...
> 
> The question really is whether the vendors will all revert it and carry
> it as a patch or whether the main tree will accept reality on this one.
> 
> Reverting it and adding a taint marker if you do it is much preferable I
> suspect to having every vendor revert this bogus if well meaning
> changeset.

I strongly suspect that the vast majority[1] of hardware that "needs" 
the trip points changing works perfectly well under Windows, so it's 
likely to be papering over bugs in the kernel. It'd be nice if we fixed 
those rather than encouraging people to poke stuff into /proc, 
especially when doing so is guaranteed to break in really confusing ways 
with a lot of hardware. The firmware can reset the trip points at 
essentially arbitrary times and is well within its rights to expect the 
OS to actually pay attention to them.

[1] Some hardware is simply broken. We don't carry phc just because some 
vendors put the wrong voltage values in their tables, either
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
> Anyway, only solution/workaround to use these machines with current
> kernels is to override trip points, maybe the patch should really just
> be reverted...

The question really is whether the vendors will all revert it and carry
it as a patch or whether the main tree will accept reality on this one.

Reverting it and adding a taint marker if you do it is much preferable I
suspect to having every vendor revert this bogus if well meaning
changeset.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
> Also it runs the system out of spec and is similar to overclocking
> which we also do not support.

We do not systematically prevent overclocking. There are lots of cases
where altering the trip points is helpful, and if you look in vendor
bugzilla databases there are multiple moans from people whose laptops now
run slow, or in many cases are simply unusable as a result of Len's
change.

Given you can achieve some of the same result by not loading the relevant
ACPI code in the first place your argument makes no rational sense at all.

Set a taint flag, print a loud message but don't stop users actually
doing things they intend as root. Or have you forgotten the original Unix
philosophy too ?

> > Here we had obviously-useful-to-you functionality which was taken away
> > without, afaik, providing any alternative.
> 
> I don't think it's that unreasonable to require source code modifications
> for anything that can kill hardware.

As root you can erase the bios, lock the hard disk with a random
password, reflash your video card  

Sorry Andi, you simply do not know better than all end users.

Alan


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 11:45 +0200, Adrian Schröter wrote:
> On Thursday 02 August 2007 11:42:27 wrote Thomas Renninger:
> > On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
> > > Hi everybody!
> > >
> > > Kernel 2.6.22 decreases performance by about 50% on my system.
> > > No, I do not like that. The reason is a broken BIOS, granted, but there
> > > was a perfect workaround in the kernel that has been dropped.
> > >
> > > mainboard: AOpen i915GMm-hfs, AWARD BIOS
> > > cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> > > openSuSE 10.2 with kernel 2.6.22.1
> >
> > Is this a DELL laptop that gets throttled by 75% to throttling state 6
> > if 60 degrees are exceeded?
> > Adrian has such a machine..., no idea what is going on with that one,
> > but only workaround to get any use out of this machine is to override at
> > least the passive trip point.
> 
> JFYI, there are plenty of these systems around, it was one out of four
> standard Novell models. I am maybe just the first one who uses Factory on
> it, but expect more bug reports when 10.3 gets released ...

Oops. So this is not broken HW/BIOS, but definitely a kernel problem?
The only idea that comes to mind for tracking this down is to grep through
the DSDT and look for code that accesses the CPU throttling HW ports. Maybe
the ACPI subsystem gets something wrong while processing this code and
activates throttling by accident?

Anyway, the only solution/workaround for using these machines with current
kernels is to override the trip points; maybe the patch should really just
be reverted...

   Thomas 



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
Andrew Morton <[EMAIL PROTECTED]> writes:

> I didn't understand the arguments either, actually.

The issue is that you can actually kill hardware by setting this wrong.
We've had such cases where trip point problems eventually led
to overheated laptops with hard disks dying etc.

Also it runs the system out of spec and is similar to overclocking
which we also do not support.
 
> Here we had obviously-useful-to-you functionality which was taken away
> without, afaik, providing any alternative.

I don't think it's that unreasonable to require source code modifications
for anything that can kill hardware. At least that raises the barrier
a bit and hopefully ensures people think twice about it and then really
only blame themselves if anything goes wrong.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Adrian Schröter
On Thursday 02 August 2007 11:42:27 wrote Thomas Renninger:
> On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
> > Hi everybody!
> >
> > Kernel 2.6.22 decreases performance by about 50% on my system.
> > No, I do not like that. The reason is a broken BIOS, granted, but there
> > was a perfect workaround in the kernel that has been dropped.
> >
> > mainboard: AOpen i915GMm-hfs, AWARD BIOS
> > cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> > openSuSE 10.2 with kernel 2.6.22.1
>
> Is this a DELL laptop that gets throttled by 75% to throttling state 6
> if 60 degrees are exceeded?
> Adrian has such a machine..., no idea what is going on with that one,
> but only workaround to get any use out of this machine is to override at
> least the passive trip point.

JFYI, there are plenty of these systems around, it was one out of four
standard Novell models. I am maybe just the first one who uses Factory on
it, but expect more bug reports when 10.3 gets released ...

bye
adrian

-- 

Adrian Schroeter
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
email: [EMAIL PROTECTED]



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
> Hi everybody!
> 
> Kernel 2.6.22 decreases performance by about 50% on my system.
> No, I do not like that. The reason is a broken BIOS, granted, but there
> was a perfect workaround in the kernel that has been dropped.
> 
> mainboard: AOpen i915GMm-hfs, AWARD BIOS
> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> openSuSE 10.2 with kernel 2.6.22.1
Is this a DELL laptop that gets throttled by 75% to throttling state 6
if 60 degrees are exceeded?
Adrian has such a machine..., no idea what is going on with that one,
but the only workaround to get any use out of this machine is to override
at least the passive trip point.
> 
> The cpu fan cannot be controlled by the Linux kernel.
> The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
> The active and passive trip points both are set to 50 deg. Celsius.
> Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
> The BIOS never changes the trip points.
> Cpufreq does work perfectly.
> 
> Previously there was the possibility  to add something like
> 
> echo  "100:0:65:70:0" > /proc/acpi/thermal_zone/THRM/trip_points
> echo  2 > /proc/acpi/thermal_zone/THRM/polling_frequency
> echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> 
> to e.g. /etc/init.d/boot.local. With 2.6.22 that solution does not exist
> any longer. Now the code in thermal.c slows down the cpu under load
> to prevent "overheating". Kernel compile time increases from about 12
> to 18 minutes. No, I don't like that, nobody would.
> 
> Possible solutions:
> 
> 1. Get a better BIOS! --- There is none.
> 
> 2. Fix DSDT! --- Recompiling gives a number of errors ... I do not know
> how to fix it.
> 
> 3. Don't include thermal.c! --- That does help, but as this is a 24/7
> system, the cpu fan could break. At that time I do not want to rely on
> the BIOS to save my system (the next trip point is at 100 deg. Celsius).
> 
> 4. Revert Len Brown's commit 11ccc0f249cb01a129f54760b8ff087f242935d4
> 
> I would vote for option 4, but I do understand some of the arguments of
> Len in the 2.6.22-rc1-mm1 discussion in May. Yes, communicating trip
> points to thermal.c is a hack, it will fail on systems that change trip
> points dynamically and it might be dangerous for the machine if
> unreasonable trip points are chosen.
> But it does help to keep the machine quiet, and to work around a too low
> or too trip points defined by the BIOS.
> 
> If it should not be acceptable to revert the questionable commit
> without changes,
As 2.6.22 was shipped without, I think reverting is not a real option.

> would it be acceptable to make rw trip_points a kernel config option?
IMO something new should be added.
In the long term, maybe it's possible to marry ACPI thermal control with
the hwmon interface. AFAIK there are already efforts to do so, but I don't
know much about them. Still, overriding trip points is a problem because
the BIOS can change them at runtime... IMO it should just be possible, and
on machines changing them at runtime either:
  - the BIOS updates do change the user's overrides
  - or trip points are simply fixed after the user has overridden them
-> my favorite (Don't care about BIOS hysteresis implementations,
if the user changes them, it's his fault, he doesn't need to...)
Sanity checks ensuring that trip points can only be lowered (compared to
the initially provided ones) need to be added.
Len, Rui: For the short term, can something like that be added at least to
the new sysfs interface (I am willing to help if this is a "would be
nice to have, but no time, maybe later" issue)?

Especially passive trip point modification is IMO a powerful feature.
You can easily build a passively cooled system, running at the performance
level your cooling system allows (CPU frequency simply gets lowered
before fans kick in).
Architectures other than ACPI-based ones already make use of CPU frequency
scaling. Should an ACPI-independent passive cooling implementation
connecting thermal control (hwmon?) and the cpufreq interface be a goal
for the future? (It could get tricky because the ACPI spec has some
special needs for passive cooling.)

   Thomas
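
For readers unfamiliar with the override string quoted above ("100:0:65:70:0"), the following shell sketch splits such a string into named fields. The field order critical:hot:passive:active0:active1 is an assumption made for this illustration and is not stated in the thread; the function name parse_trip_points is likewise hypothetical. The function only parses text and does not write to /proc.

```shell
#!/bin/sh
# Split a colon-separated trip point string into named fields.
# ASSUMPTION: the order is critical:hot:passive:active0:active1.
parse_trip_points() {
    old_ifs=$IFS
    IFS=:
    set -- $1          # word-split the string on ':' into $1..$5
    IFS=$old_ifs
    echo "critical=$1 hot=$2 passive=$3 active0=$4 active1=$5"
}

parse_trip_points "100:0:65:70:0"
```

Under that assumed order, the quoted string reads as a critical trip point of 100, a passive trip point of 65 and a first active trip point of 70 degrees Celsius.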



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andrew Morton
On Thu, 02 Aug 2007 10:40:44 +0200 Knut Petersen <[EMAIL PROTECTED]> wrote:

> Hi everybody!
> 
> Kernel 2.6.22 decreases performance by about 50% on my system.
> No, I do not like that. The reason is a broken BIOS, granted, but there
> was a perfect workaround in the kernel that has been dropped.
> 
> mainboard: AOpen i915GMm-hfs, AWARD BIOS
> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> openSuSE 10.2 with kernel 2.6.22.1
> 
> The cpu fan cannot be controlled by the Linux kernel.
> The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
> The active and passive trip points both are set to 50 deg. Celsius.
> Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
> The BIOS never changes the trip points.
> Cpufreq does work perfectly.
> 
> Previously there was the possibility  to add something like
> 
> echo  "100:0:65:70:0" > /proc/acpi/thermal_zone/THRM/trip_points
> echo  2 > /proc/acpi/thermal_zone/THRM/polling_frequency
> echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> 
> to e.g. /etc/init.d/boot.local. With 2.6.22 that solution does not exist
> any longer. Now the code in thermal.c slows down the cpu under load
> to prevent "overheating". Kernel compile time increases from about 12
> to 18 minutes. No, I don't like that, nobody would.
> 
> Possible solutions:
> 
> 1. Get a better BIOS! --- There is none.
> 
> 2. Fix DSDT! --- Recompiling gives a number of errors ... I do not know
> how to fix it.
> 
> 3. Don't include thermal.c! --- That does help, but as this is a 24/7
> system, the cpu fan could break. At that time I do not want to rely on
> the BIOS to save my system (the next trip point is at 100 deg. Celsius).
> 
> 4. Revert Len Brown's commit 11ccc0f249cb01a129f54760b8ff087f242935d4
> 
> I would vote for option 4, but I do understand some of the arguments of
> Len in the 2.6.22-rc1-mm1 discussion in May. Yes, communicating trip
> points to thermal.c is a hack, it will fail on systems that change trip
> points dynamically and it might be dangerous for the machine if
> unreasonable trip points are chosen.
> But it does help to keep the machine quiet, and to work around a too low
> or too trip points defined by the BIOS.

I didn't understand the arguments either, actually.

Here we had obviously-useful-to-you functionality which was taken away
without, afaik, providing any alternative.

> If it should not be acceptable to revert the questionable commit
> without changes,
> would it be acceptable to make rw trip_points a kernel config option?

Well we obviously need to do _something_.  And reverting that commit until
we get a decent replacement in place sounds like a fine idea to me.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
 Hi everybody!
 
 Kernel 2.6.22 decreases performance by about 50% on my system.
 No, I do not like that. The reason is a broken BIOS, granted, but there
 was a perfect workaround in the kernel that has been dropped.
 
 mainboard: AOpen i915GMm-hfs, AWARD BIOS
 cpu: Pentium-M 750 (0.8 to 1.86 MHz)
 openSuSE 10.2 with kernel 2.6.22.1
Is this a DELL laptop that gets throttled by 75% to throttling state 6
if 60 degrees are exceeded?
Adrian has such a machine..., no idea what is going on with that one,
but only workaround to get any use out of this machine is to override at
least the passive trip point.
 
 The cpu fan can not be controled by linux kernel.
 The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
 The active and passive trip points both are set to 50 deg. Celsius.
 Temperature of the idle cpu at 800 Mhz: 34 to 42 deg. C.
 The BIOS never changes the trip points.
 Cpufreq does work perfectly.
 
 Previously there was the possibility  to add something like
 
 echo  100:0:65:70:0  /proc/acpi/thermal_zone/THRM/trip_points
 echo  2  /proc/acpi/thermal_zone/THRM/polling_frequency
 echo ondemand  /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
 
 to e.g. /etc/init.d/boot.local. With 2.6.22 that solution does not exist
 any longer. Now the code in thermal.c slows down the cpu under load
 to prevent overheating. Kernel compile time increases from about 12
 to 18 minutes. No, I don´t like that, nobody would.
 
 Possible solutions:
 
 1. Get a better BIOS! --- There is none.
 
 2. Fix DSDT!  --- Recompiling gives a number of errors ... I do not  know
 how to fix it.
 
 3. Don´t include thermal.c! --- That does help, but as this is a 24/7
 system, the
 cpu fan could break. At that time I do not want to rely on the BIOS to
 save my
 system (the next trip point is at 100 deg. Celsius).
 
 4. Revert Len Browns commit 11ccc0f249cb01a129f54760b8ff087f242935d4
 
 I would vote for option 4, but I do understand some of the arguments of
 Len in
 the 2.6.22-rc1-mm1 discussion in May. Yes, communicating trip points to
 thermal.c is a hack, it will fail on systems that change trip points
 dynamically
 and it might be dangerous for the machine if unreasonable trip points
 are chosen.
 But it does help to keep the machine quiet, and to work around a too low
 or too
 trip points defined by the BIOS.
 
 If it should be not acceptable to revert the questionable commit without
 changes,
As 2.6.22 was shipped without, I think reverting is not a real option.

 would it be acceptable to make rw trip_points a kernel config option?
IMO something new should be added.
In the long term, maybe it's possible to marry ACPI thermal control with
the hwmon interface. AFAIK there are already efforts to do so, but I don't
know much about them. Still, overriding trip points is a problem because
the BIOS can change them at runtime... IMO it should just be possible, and
on machines that change them at runtime either:
  - BIOS updates do change the user's overrides
  - or trip points are simply fixed after the user has overridden them
    - my favorite (don't care about hysteresis BIOS implementations;
    if the user changes them, it's his fault, he doesn't need to...)
Sanity checks ensuring that trip points can only be lowered (compared to
the initially provided ones) need to be added.
Len, Rui: For the short term, can something like that be added at least to
the new sysfs interface (I am willing to help if this is a "would be
nice to have, but no time, maybe later" issue)?

Especially passive trip point modification is IMO a powerful feature.
You can easily build a passively cooled system, running at the performance
level your cooling system allows (CPU frequency simply gets lowered
before fans kick in).
Architectures other than ACPI-powered ones already make use of CPU
frequency scaling. An ACPI-independent passive cooling implementation
connecting thermal control (hwmon?) and the cpufreq interface would be
desirable for the future? (could get tricky because the ACPI spec has some
special needs for passive cooling)
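
To illustrate what passive cooling computes, here is a small sketch of the performance-change formula from the ACPI specification's passive cooling description, dP[%] = _TC1 * (Tn - Tn-1) + _TC2 * (Tn - Tt), where Tt is the passive trip point. The function names and the sample constants are assumptions chosen for illustration, not values from any machine in this thread.

```python
def passive_delta_percent(tc1, tc2, temp_now, temp_prev, trip_passive):
    """Performance reduction in percent for one sampling period.

    tc1 and tc2 stand in for the _TC1 and _TC2 constants a BIOS provides;
    temperatures are in degrees Celsius.
    """
    return tc1 * (temp_now - temp_prev) + tc2 * (temp_now - trip_passive)


def next_performance(perf_percent, delta_percent):
    """Apply the reduction and clamp the result to the 0..100 % range."""
    return max(0, min(100, perf_percent - delta_percent))


# Hypothetical sample: passive trip at 65 C, temperature rose from 65 to 67 C.
delta = passive_delta_percent(tc1=1, tc2=2, temp_now=67, temp_prev=65,
                              trip_passive=65)
perf = next_performance(100, delta)
print(delta, perf)
```

A lower passive trip point makes the second term kick in earlier, which is why lowering it trades performance for temperature before any fan is involved.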

   Thomas

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Adrian Schröter
On Thursday 02 August 2007 11:42:27 wrote Thomas Renninger:
 On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
  Hi everybody!
 
  Kernel 2.6.22 decreases performance by about 50% on my system.
  No, I do not like that. The reason is a broken BIOS, granted, but there
  was a perfect workaround in the kernel that has been dropped.
 
  mainboard: AOpen i915GMm-hfs, AWARD BIOS
  cpu: Pentium-M 750 (0.8 to 1.86 GHz)
  openSuSE 10.2 with kernel 2.6.22.1

 Is this a DELL laptop that gets throttled by 75% to throttling state 6
 if 60 degrees are exceeded?
 Adrian has such a machine..., no idea what is going on with that one,
 but only workaround to get any use out of this machine is to override at
 least the passive trip point.

JFYI, there are plenty of these systems around, it was one out of four 
standard Novell models. I am maybe just the first one who uses Factory on 
it, but expect more bug reports when 10.3 gets released ...

bye
adrian

-- 

Adrian Schroeter
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
email: [EMAIL PROTECTED]



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
Andrew Morton [EMAIL PROTECTED] writes:

 I didn't understand the arguments either, actually.

The issue is that you can actually kill hardware by setting this wrong.
We've had such cases where trip point problems eventually led
to overheated laptops with hard disks dying etc.

Also it runs the system out of spec and is similar to overclocking
which we also do not support.
 
 Here we had obviously-useful-to-you functionality which was taken away
 without, afaik, providing any alternative.

I don't think it's that unreasonable to require source code modifications
for anything that can kill hardware. At least that raises the barrier
a bit and hopefully ensures people think twice about it and then really
only blame themselves if anything goes wrong.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
 Also it runs the system out of spec and is similar to overclocking
 which we also do not support.

We do not systematically prevent overclocking. There are lots of cases
where altering the trip points is helpful, and if you look in vendor
bugzilla databases there are multiple moans from people whose laptops now
run slow, or in many cases are simply unusable as a result of Len's
change.

Given you can achieve some of the same result by not loading the relevant
ACPI code in the first place your argument makes no rational sense at all.

Set a taint flag, print a loud message but don't stop users actually
doing things they intend as root. Or have you forgotten the original Unix
philosophy too ?
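
As a rough model of the "taint flag plus loud message" idea, the sketch below accepts any override but records that it happened and logs a warning. The class and attribute names are hypothetical; this is not kernel code and does not reflect any existing interface.

```python
import logging

log = logging.getLogger("thermal-sketch")


class Zone:
    """Toy thermal zone with one passive trip point and a taint marker."""

    def __init__(self, bios_passive_c):
        self.bios_passive_c = bios_passive_c  # value reported by the BIOS
        self.passive_c = bios_passive_c       # value currently in effect
        self.tainted = False

    def set_passive(self, new_c):
        """Accept the override, but mark the zone and warn loudly."""
        if new_c != self.bios_passive_c:
            self.tainted = True
            log.warning("passive trip point overridden: BIOS %d C, user %d C",
                        self.bios_passive_c, new_c)
        self.passive_c = new_c


zone = Zone(50)
zone.set_passive(65)
print(zone.passive_c, zone.tainted)
```

The marker does not prevent anything; it only makes a later bug report show that the BIOS values were no longer in use.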

  Here we had obviously-useful-to-you functionality which was taken away
  without, afaik, providing any alternative.
 
 I don't think it's that unreasonable to require source code modifications
 for anything that can kill hardware.

As root you can erase the bios, lock the hard disk with a random
password, reflash your video card  

Sorry Andi, you simply do not know better than all end users.

Alan


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
 Anyway, only solution/workaround to use these machines with current
 kernels is to override trip points, maybe the patch should really just
 be reverted...

The question really is whether the vendors will all revert it and carry
it as a patch or whether the main tree will accept reality on this one.

Reverting it and adding a taint marker if you do it is much preferable I
suspect to having every vendor revert this bogus if well meaning
changeset.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 12:02:21PM +0100, Alan Cox wrote:
  Anyway, only solution/workaround to use these machines with current
  kernels is to override trip points, maybe the patch should really just
  be reverted...
 
 The question really is whether the vendors will all revert it and carry
 it as a patch or whether the main tree will accept reality on this one.
 
 Reverting it and adding a taint marker if you do it is much preferable I
 suspect to having every vendor revert this bogus if well meaning
 changeset.

I strongly suspect that the vast majority[1] of hardware that needs 
the trip points changing works perfectly well under Windows, so it's 
likely to be papering over bugs in the kernel. It'd be nice if we fixed 
those rather than encouraging people to poke stuff into /proc, 
especially when doing so is guaranteed to break in really confusing ways 
with a lot of hardware. The firmware can reset the trip points at 
essentially arbitrary times and is well within its rights to expect the 
OS to actually pay attention to them.

[1] Some hardware is simply broken. We don't carry phc just because some 
vendors put the wrong voltage values in their tables, either
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Knut Petersen
Thomas Renninger wrote:
 mainboard: AOpen i915GMm-hfs, AWARD BIOS
 cpu: Pentium-M 750 (0.8 to 1.86 GHz)
 openSuSE 10.2 with kernel 2.6.22.1
 Is this a DELL laptop that gets throttled by 75% to throttling state 6
 if 60 degrees are exceeded?
No, it is a Pentium M desktop board:
Chipset i915GM, FSB 533MHz, max 2GB DDR2 RAM, 2 PCI
and 1 16x PCI Express slots, serial, parallel, usb, firewire,
2x Marvell Gigabit Ethernet, Realtek ALC 880 sound, IDE,
Intel SATA and SiI SATA Raid, FDC, DVI and VGA video out etc.
Very low power consumption: ~40W to 65W for the whole system,
except monitor.
 As 2.6.22 was shipped without, I think reverting is not a real option.
Well, it would not be the first time to eliminate a regression by
reverting a patch after it was accepted previously.
 Sanity checks that trip points only can get lowered (compared to initial
 provided ones) needs to be added.
 Len, Rui: For short-term can some 
But I _need_ to raise the unreasonably low passive trip point. We could
decide to protect the innocent user by allowing write access to trip_points
only after a previous

echo "I know what I am doing" > /proc/acpi/thermal_zone/THRM/enable_really_dangerous_options

if we believe that this is a good idea ...

Andi Kleen wrote:

 I don't think it's that unreasonable to require source code modifications
 for anything that can kill hardware. At least that raises the barrier
 a bit and hopefully ensures people think twice about it and then really
 only blame themselves if anything goes wrong.

Andi, would the above mechanism be sufficiently safe for your taste?

cu,
 Knut



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 12:13 +0100, Matthew Garrett wrote:
 On Thu, Aug 02, 2007 at 12:02:21PM +0100, Alan Cox wrote:
   Anyway, only solution/workaround to use these machines with current
   kernels is to override trip points, maybe the patch should really just
   be reverted...
  
  The question really is whether the vendors will all revert it and carry
  it as a patch or whether the main tree will accept reality on this one.
  
  Reverting it and adding a taint marker if you do it is much preferable I
  suspect to having every vendor revert this bogus if well meaning
  changeset.
 
 I strongly suspect that the vast majority[1] of hardware that needs 
 the trip points changing works perfectly well under Windows, so it's 
 likely to be papering over bugs in the kernel. It'd be nice if we fixed 
 those rather than encouraging people to poke stuff into /proc,
Some arguments against that:
  - You cannot tell a customer: Wait for the kernel in half a year.
    That is at least the time it takes from a laptop being sold until
    the problem is found, a patch is written and checked in, and it
    finally hits the distribution.
  - You can also not backport fixes, as ACPI patches mostly have the
    potential to break other machines/BIOSes
  - There also exists the policy to not fix up/work around totally broken
    AML BIOS implementations
  - We do not need to and never will be able to copy or do the same as
    Windows is doing
  - ...

 especially when doing so is guaranteed to break in really confusing ways 
 with a lot of hardware. The firmware can reset the trip points at 
 essentially arbitrary times and is well within its rights to expect the 
 OS to actually pay attention to them.
What the hell is so wrong with:

Let the user override the trip points. If he does so, ignore
thermal trip point updates from BIOS. Don't care for hysteresis
BIOS implementations (these are the BIOS trip point updates).
If user changes them, it's his fault, he doesn't need to...
Make sure that trip points can only be lowered, compared to the
initially fetched one from BIOS.

This is neither confusing, nor dangerous in any way (besides the fact
that the critical trip point might get dynamically lowered by BIOS,
which is totally insane).
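
The proposal above can be sketched as a small state machine: a user override may only lower the BIOS value, and once it is set, later BIOS updates are ignored. All names are hypothetical and the 83 C starting value is only an example; this is a sketch of the described policy, not an existing kernel interface.

```python
class OverridableTrip:
    """One trip point that follows the BIOS until the user overrides it."""

    def __init__(self, bios_value_c):
        self.initial_c = bios_value_c   # value first fetched from the BIOS
        self.current_c = bios_value_c   # value currently in effect
        self.user_override = False

    def user_set(self, value_c):
        """Allow the user to lower, never raise, the trip point."""
        if value_c > self.initial_c:
            raise ValueError("trip point may only be lowered")
        self.current_c = value_c
        self.user_override = True

    def bios_update(self, value_c):
        """Apply a BIOS notification unless the user has taken over."""
        if self.user_override:
            return False                # update ignored
        self.current_c = value_c
        return True


passive = OverridableTrip(83)
passive.user_set(75)
applied = passive.bios_update(80)
print(passive.current_c, applied)
```

Whether ignoring BIOS updates is acceptable is exactly the open question: a notification may carry a legitimately lower value, which this sketch would drop.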

  Thomas

 
 [1] Some hardware is simply broken. We don't carry phc just because some 
 vendors put the wrong voltage values in their tables, either




Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 01:45:00PM +0200, Thomas Renninger wrote:
 On Thu, 2007-08-02 at 12:13 +0100, Matthew Garrett wrote:
  I strongly suspect that the vast majority[1] of hardware that needs 
  the trip points changing works perfectly well under Windows, so it's 
  likely to be papering over bugs in the kernel. It'd be nice if we fixed 
  those rather than encouraging people to poke stuff into /proc,
 Some arguments against that:
   - You cannot tell a customer: Wait for the kernel in half a year.
 This is the time it at least needs until a laptop got sold, the
 problem is found, a patch is written and checked in and finally
 hits the distribution.

We have to do so frequently. New hardware often exposes bugs in the 
kernel.

   - You can also not backport fixes as ACPI patches mostly have the
 potential to break other machines/BIOSes
   - There also exist the policy to not fix up/workaround totally broken
 AML BIOS implementations

The policy has been to attempt to be bug-compatible with Windows 
whenever possible for some time now.

   - We do not need to and never will be able to copy or do the same
 Windows is doing

Given that many vendors still only test against Windows, that's exactly 
what we need to do.

  especially when doing so is guaranteed to break in really confusing ways 
  with a lot of hardware. The firmware can reset the trip points at 
  essentially arbitrary times and is well within its rights to expect the 
  OS to actually pay attention to them.
 What the hell is so wrong with:
 
 Let the user override the trip points. If he does so, ignore
 thermal trip point updates from BIOS. Don't care for hysteresis
 BIOS implementations (these are the BIOS trip point updates).

No, that's not the only reason for notifications. Alteration in hardware 
state may also force a recalculation of trip point (adding a battery to 
a bay rather than a DVD drive may require the platform to be kept at a 
lower temperature)

 If user changes them, it's his fault, he doesn't need to...
 Make sure that trip points can only be lowered, compared to the
 initially fetched one from BIOS.

Surely people want this functionality so that they can raise trip 
points?

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
 I strongly suspect that the vast majority[1] of hardware that needs 
 the trip points changing works perfectly well under Windows, so it's 

Windows as I understand it has vendor mechanisms to allow the bits
shipped with the OS to override/ignore just about everything trip points
included. Lots of hardware that requires fixups in Linux and just works
in Windows is not Linux bugs but Windows magic .inf files and other
registry gunge done by the machine vendor. We see this in ATA, in power
management and elsewhere.

Alan


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 02:06:26PM +0200, Thomas Renninger wrote:
 On Thu, 2007-08-02 at 12:57 +0100, Matthew Garrett wrote:
  On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
   Windows as I understand it has vendor mechanisms to allow the bits
   shipped with the OS to override/ignore just about everything trip points
   included. Lots of hardware that requires fixups in Linux and just works
   in Windows is not Linux bugs but Windows magic .inf files and other
   registry gunge done by the machine vendor. We see this in ATA, in power
   management and elsewhere.
  
  I've seen no evidence that this happens with thermal trip points.
 
 WMI needed for fan control -- FSC Amilo M3438G
 http://bugzilla.kernel.org/show_bug.cgi?id=5670

That machine has no active thermal trip points, so I'm not sure how it's 
relevant here. By the sounds of the bug log, I suspect Linux just runs 
slightly hotter on the machine than Windows does - especially since the 
user isn't running the closed nvidia driver, so there's nothing to carry 
out any power management on the GPU.
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
  I strongly suspect that the vast majority[1] of hardware that needs 
  the trip points changing works perfectly well under Windows, so it's 
 
 Windows as I understand it has vendor mechanisms to allow the bits
 shipped with the OS to override/ignore just about everything trip points
 included. Lots of hardware that requires fixups in Linux and just works
 in Windows is not Linux bugs but Windows magic .inf files and other
 registry gunge done by the machine vendor. We see this in ATA, in power
 management and elsewhere.

I've seen no evidence that this happens with thermal trip points.
-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
 Set a taint flag, 

That's hardly useful if the machine is dead afterwards.

 print a loud message 

Neither.

You'll just end up with "Linux destroyed my laptop" headlines all 
over the internet and rightfully very annoyed users.

 Or have you forgotten the original Unix
 philosophy too ?

The philosophy didn't include physically destroying hardware
as far as I know.

   Here we had obviously-useful-to-you functionality which was taken away
   without, afaik, providing any alternative.
  
  I don't think it's that unreasonable to require source code modifications
  for anything that can kill hardware.
 
 As root you can erase the bios, 

We don't ship the devbios driver for good reasons.

 lock the hard disk with a random
 password, reflash your video card  

That all requires significant effort and custom software. It's not that we 
have a one-liner echo destroy > /sys/.../flash-bios. 

-Andi



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
 Andi Kleen wrote:
 
I don't think it's that unreasonable to require source code
 modifications
 for anything that can kill hardware. At least that raises the barrier
 a bit and hopefully ensures people think twice about it and then really
only blame themselves if anything goes wrong.
 
 Andi, would the above be mechanism sufficiently safe for your taste?

No.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 12:57 +0100, Matthew Garrett wrote:
 On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
   I strongly suspect that the vast majority[1] of hardware that needs 
   the trip points changing works perfectly well under Windows, so it's 
  
  Windows as I understand it has vendor mechanisms to allow the bits
  shipped with the OS to override/ignore just about everything trip points
  included. Lots of hardware that requires fixups in Linux and just works
  in Windows is not Linux bugs but Windows magic .inf files and other
  registry gunge done by the machine vendor. We see this in ATA, in power
  management and elsewhere.
 
 I've seen no evidence that this happens with thermal trip points.

WMI needed for fan control -- FSC Amilo M3438G
http://bugzilla.kernel.org/show_bug.cgi?id=5670

   Thomas



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 12:56 +0100, Matthew Garrett wrote:
 On Thu, Aug 02, 2007 at 01:45:00PM +0200, Thomas Renninger wrote:
  On Thu, 2007-08-02 at 12:13 +0100, Matthew Garrett wrote:
   I strongly suspect that the vast majority[1] of hardware that needs 
   the trip points changing works perfectly well under Windows, so it's 
   likely to be papering over bugs in the kernel. It'd be nice if we fixed 
   those rather than encouraging people to poke stuff into /proc,
  Some arguments against that:
- You cannot tell a customer: Wait for the kernel in half a year.
  This is the time it at least needs until a laptop got sold, the
  problem is found, a patch is written and checked in and finally
  hits the distribution.
 
 We have to do so frequently. New hardware often exposes bugs in the 
 kernel.
And often we can provide a boot param or whatever, that makes it at
least useable.
 
- You can also not backport fixes as ACPI patches mostly have the
  potential to break other machines/BIOSes
- There also exist the policy to not fix up/workaround totally broken
  AML BIOS implementations
 
 The policy has been to attempt to be bug-compatible with Windows 
 whenever possible for some time now.
*whenever possible*
 
- We do not need to and never will be able to copy or do the same
  Windows is doing
 
 Given that many vendors still only test against Windows, that's exactly 
 what we need to do.
But we cannot (copy all windows (mis-)behavior).
 
   especially when doing so is guaranteed to break in really confusing ways 
   with a lot of hardware. The firmware can reset the trip points at 
   essentially arbitrary times and is well within its rights to expect the 
   OS to actually pay attention to them.
  What the hell is so wrong with:
  
  Let the user override the trip points. If he does so, ignore
  thermal trip point updates from BIOS. Don't care for hysteresis
  BIOS implementations (these are the BIOS trip point updates).
 
 No, that's not the only reason for notifications. Alteration in hardware 
 state may also force a recalculation of trip point (adding a battery to 
 a bay rather than a DVD drive may require the platform to be kept at a 
 lower temperature)
I've seen no evidence that this happens..., but I see the point.
  If user changes them, it's his fault, he doesn't need to...
  Make sure that trip points can only be lowered, compared to the
  initially fetched one from BIOS.
 
 Surely people want this functionality so that they can raise trip 
 points?
For Adrian it would be enough to be able to lower them.
Also being able to define a passive trip point (even if not provided by
BIOS) could help a lot of machines.

What about at least:
  - Be able to override passive cooling trip point
  - If BIOS does not provide one, let user be able to define it
This should already make a lot of people happy.

Thomas




Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 02:35:18PM +0200, Thomas Renninger wrote:
 On Thu, 2007-08-02 at 13:15 +0100, Matthew Garrett wrote:
  That machine has no active thermal trip points, so I'm not sure how it's 
  relevant here.
 From above: Windows as I understand it has vendor mechanisms to...
 Maybe thermal trip points are not influenced here, it's at least about
 thermal management and another prove that we cannot just try to copy
 Windows behavior, but need to provide workarounds wherever possible.

There's absolutely no evidence in the bug log there that the user's 
problems are in any way due to Windows-specific code. The SetSilentMode 
stuff is an additional item of functionality that underclocks various 
bits of hardware, not one that's actually required for the platform to 
function correctly.
-- 
Matthew Garrett | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 02:42:19PM +0200, Thomas Renninger wrote:
 On Thu, 2007-08-02 at 12:56 +0100, Matthew Garrett wrote:
  The policy has been to attempt to be bug-compatible with Windows 
  whenever possible for some time now.
 *whenever possible*

But there's no evidence whatsoever that this is something we can't 
handle...

  No, that's not the only reason for notifications. Alteration in hardware 
  state may also force a recalculation of trip point (adding a battery to 
  a bay rather than a DVD drive may require the platform to be kept at a 
  lower temperature)
 I've seen no evidence that this happens..., but I see the point.

It's explicitly mentioned as one of the use cases for trip point 
alteration in the spec.

  Surely people want this functionality so that they can raise trip 
  points?
 For Adrian it would be enough to be able to lower them.

Which suggests that we're probably doing something wrong at some more 
fundamental level...

 Also being able to define a passive trip point (even if not provided by
 BIOS) could help a lot machines.

I agree that being able to lower trip points is unlikely to result in 
hardware damage, but still think that it's likely to be papering over 
genuine bugs that we could fix properly.

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
  Set a taint flag, 
 That's hardly any useful if the machine is dead afterwards.

It won't be; the hardware will do a failsafe shutdown first.

 You'll just end up with Linux destroyed my laptop headlines all 
 over the internet and rightfully very annoyed users.

You have to systematically sit down and tweak your machine.

 The philosophy didn't include physically destroying hardware
 as far as I know.

It most certainly did. With safety checks you could override.

  As root you can erase the bios, 
 We don't ship the devbios driver for good reasons.

That's debatably a bad reason (the user space API is wrong, that's all), and
one that's totally inconsistent with some of the other drivers we do ship.

  lock the hard disk with a random
  password, reflash your video card  
 
 That all requires significant effort and custom software. It's not that we 
 have a one liner echo destroy  /sys/.../flash-bios. 

Well you can do the hard disk one in one line of perl, the video card one
in a small bit of C. And this merely makes the argument that raising the
trip points should be harder.

Alan


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Thomas Renninger
On Thu, 2007-08-02 at 13:15 +0100, Matthew Garrett wrote:
 On Thu, Aug 02, 2007 at 02:06:26PM +0200, Thomas Renninger wrote:
  On Thu, 2007-08-02 at 12:57 +0100, Matthew Garrett wrote:
   On Thu, Aug 02, 2007 at 12:59:47PM +0100, Alan Cox wrote:
Windows as I understand it has vendor mechanisms to allow the bits
shipped with the OS to override/ignore just about everything trip points
included. Lots of hardware that requires fixups in Linux and just works
in Windows is not Linux bugs but Windows magic .inf files and other
registry gunge done by the machine vendor. We see this in ATA, in power
management and elsewhere.
   
   I've seen no evidence that this happens with thermal trip points.
  
  WMI needed for fan control -- FSC Amilo M3438G
  http://bugzilla.kernel.org/show_bug.cgi?id=5670
 
 That machine has no active thermal trip points, so I'm not sure how it's 
 relevant here.
From above: Windows as I understand it has vendor mechanisms to...
Maybe thermal trip points are not influenced here, but it's at least about
thermal management and further proof that we cannot just try to copy
Windows behavior, but need to provide workarounds wherever possible.

   Thomas

 By the sounds of the bug log, I suspect Linux just runs 
 slightly hotter on the machine than Windows does - especially since the 
 user isn't running the closed nvidia driver, so there's nothing to carry 
 out any power management on the GPU.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
   Set a taint flag, 
  That's hardly any useful if the machine is dead afterwards.
 
 It won't be the hardware will do a failsafe shutdown first.

Not necessarily. At SUSE we had at least one broken laptop
with wrong trip points. The machine ran very hot for some time
and afterwards the hard disk was dead.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Alan Cox
  Andi, would the above be mechanism sufficiently safe for your taste?
 
 No.

I don't believe Andi's taste (or lack thereof) is relevant to this
discussion. He has not, for example, explained why it's better to force
people to disable all the ACPI power and thermal control on their system
rather than adjust trip points.



Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Pavel Machek
Hi!

  I didn't understand the arguments either, actually.
 
 The issue is that you can actually kill hardware by setting this wrong.
 We've had such cases where trip point problems eventually lead
 to overheated laptops with hard disks dying etc. 

Actually, that was my machine. Omnibook xe3; the BIOS-provided trip points
*did* kill the disk. At least I was able to work around it by
writing to the trip points.

Yes, ACPI mandates emergency shutdown when the critical+delta point is
reached, *in hardware*. So this only endangers very broken machines,
and it also fixes a lot of them.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Pavel Machek
On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > Set a taint flag,
> > > That's hardly useful if the machine is dead afterwards.
> >
> > It won't be; the hardware will do a failsafe shutdown first.
>
> Not necessarily. At SUSE we had at least one broken laptop
> with wrong trip points. The machine ran very hot for some time
> and afterwards the hard disk was dead.

Yes, but it was the original BIOS trip points that were wrong. And yes,
its failsafe shutdown came too late. At least lowering the trip points
would allow me to run it safely.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Pavel Machek
Hi!

> Well, it would not be the first time to eliminate a regression by
> reverting a patch after it was accepted previously.
> > Sanity checks that trip points only can get lowered (compared to initial
> > provided ones) needs to be added.
> > Len, Rui: For short-term can some
> But I _need_ to raise the unreasonably low passive trip point. We could
> decide to protect the innocent user by allowing write access to
> trip_points only after a previous

Actually, you should lower your active trip point, and keep cpu temp
below 50C.

> echo I know what I am doing > /proc/acpi/thermal_zone/THRM/enable_really_dangerous_options

No... but a patch that only permits lowering could be acceptable.
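
A minimal sketch of what such a lowering-only check could look like, written as plain userspace C rather than actual kernel code. The struct and function names are hypothetical and do not come from thermal.c; the only idea taken from the thread is that a requested value is accepted when it does not exceed what the BIOS originally reported.

```c
#include <stdbool.h>

/* Hypothetical state for one trip point; not the kernel's own structure. */
struct trip_point {
    int bios_celsius;      /* value the BIOS originally reported */
    int effective_celsius; /* value currently in effect */
};

/*
 * Apply a user-requested override, but only if it does not exceed the
 * BIOS value. Returns true if the override was accepted.
 */
static bool trip_point_lower(struct trip_point *tp, int requested_celsius)
{
    if (requested_celsius <= 0 || requested_celsius > tp->bios_celsius)
        return false; /* nonsensical, or an attempt to raise: reject */

    tp->effective_celsius = requested_celsius;
    return true;
}
```

With a BIOS passive trip point of 83 C, a request for 70 C would be accepted and a request for 95 C refused, leaving the previous value in place.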

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Andi Kleen
On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> On Thu 2007-08-02 15:16:22, Andi Kleen wrote:
> > On Thu, Aug 02, 2007 at 02:04:42PM +0100, Alan Cox wrote:
> > > > > Set a taint flag,
> > > > That's hardly useful if the machine is dead afterwards.
> > >
> > > It won't be; the hardware will do a failsafe shutdown first.
> >
> > Not necessarily. At SUSE we had at least one broken laptop
> > with wrong trip points. The machine ran very hot for some time
> > and afterwards the hard disk was dead.
>
> Yes, but it was the original BIOS trip points that were wrong. And yes,
> its failsafe shutdown came too late. At least lowering the trip points
> would allow me to run it safely.

I have no problem with lowering them (in fact I proposed this
to Thomas as a possible solution at some point). Only raising them
is a bad idea.

-Andi


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Matthew Garrett
On Thu, Aug 02, 2007 at 08:38:30PM +0200, Andi Kleen wrote:
> On Thu, Aug 02, 2007 at 03:57:54PM +, Pavel Machek wrote:
> > Yes, but it was the original BIOS trip points that were wrong. And yes,
> > its failsafe shutdown came too late. At least lowering the trip points
> > would allow me to run it safely.
>
> I have no problem with lowering them (in fact I proposed this
> to Thomas as a possible solution at some point). Only raising them
> is a bad idea.

Though for this to be reliable, you need to ignore any notifications 
that would raise the trip points while still paying attention to any 
that would lower them.
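
A rough sketch of that filtering, again in plain userspace C with a hypothetical function name rather than anything from the ACPI thermal driver: when the firmware announces a new trip point value, it is only taken over if it is lower than the value already in effect.

```c
#include <stdbool.h>

/*
 * Hypothetical handler for a firmware "trip points changed" notification.
 * *effective_celsius is the value currently in effect (possibly already
 * lowered by the user); notified_celsius is what the firmware now reports.
 * Returns true if the effective value changed.
 */
static bool trip_point_notify(int *effective_celsius, int notified_celsius)
{
    if (notified_celsius >= *effective_celsius)
        return false; /* would raise (or not change) the limit: ignore */

    *effective_celsius = notified_celsius;
    return true;
}
```

The effect is that a user-lowered trip point survives a later notification that tries to restore a higher BIOS value, while a notification that lowers the limit further still takes effect.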

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Krzysztof Halasa
Knut Petersen [EMAIL PROTECTED] writes:

> echo I know what I am doing > /proc/acpi/thermal_zone/THRM/enable_really_dangerous_options

There is a shorter version:
$ su
Password:
# 
-- 
Krzysztof Halasa


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Len Brown
On Thursday 02 August 2007 04:40, Knut Petersen wrote:

> Kernel 2.6.22 decreases performance by about 50% on my system.
> No, I do not like that. The reason is a broken BIOS, granted, but there
> was a perfect workaround in the kernel that has been dropped.
>
> mainboard: AOpen i915GMm-hfs, AWARD BIOS
> cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> openSuSE 10.2 with kernel 2.6.22.1
>
> The cpu fan cannot be controlled by the linux kernel.
> The BIOS will switch on the cpu fan a bit above 50 deg. Celsius.
> The active and passive trip points both are set to 50 deg. Celsius.
> Temperature of the idle cpu at 800 MHz: 34 to 42 deg. C.
> The BIOS never changes the trip points.
> Cpufreq does work perfectly.
>
> Previously there was the possibility to add something like
>
> echo 100:0:65:70:0 > /proc/acpi/thermal_zone/THRM/trip_points
> echo 2 > /proc/acpi/thermal_zone/THRM/polling_frequency
> echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>
> to e.g. /etc/init.d/boot.local. With 2.6.22 that solution does not exist
> any longer. Now the code in thermal.c slows down the cpu under load
> to prevent overheating. Kernel compile time increases from about 12
> to 18 minutes. No, I don't like that, nobody would.
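
For readers unfamiliar with the old trip_points write interface, here is a small userspace sketch of how a colon-separated string like the one above could be split into individual values. The field order critical:hot:passive:active0:active1 is an assumption made for illustration, and the struct and function names are hypothetical; this is not the code that was removed from thermal.c.

```c
#include <stdio.h>

/* Hypothetical container for the parsed values, in degrees Celsius. */
struct trip_override {
    int critical;
    int hot;
    int passive;
    int active[2];
    int n_active; /* how many active[] entries were supplied */
};

/*
 * Parse a string such as "100:0:65:70:0". The field order
 * critical:hot:passive:active0:active1 is an assumption.
 * Returns 0 on success, -1 if fewer than three fields were found.
 */
static int parse_trip_points(const char *s, struct trip_override *out)
{
    int n = sscanf(s, "%d:%d:%d:%d:%d",
                   &out->critical, &out->hot, &out->passive,
                   &out->active[0], &out->active[1]);

    if (n < 3)
        return -1;

    out->n_active = n - 3;
    return 0;
}
```

Under that assumed ordering, the string from the workaround would read as critical 100, hot 0, passive 65, and active trip points 70 and 0.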
 

Thanks for the sighting, Knut!
This regression is dramatic when put in terms of a 50% performance hit!
I guess the good news is that thermal throttling is doing the job
we are asking it to :-)

The statements above regarding the existence of active trip points
and the kernel not being able to control the fan are inconsistent
with each other.

Please open a sighting for this machine here:

http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
vs. Power-Thermal
and attach the output from acpidump, cat /proc/acpi/thermal_zone/*/*
and assign it to [EMAIL PROTECTED]

BTW, does the board boot and run properly with acpi=off?

thanks,
-Len


Re: 2.6.22 regression: thermal trip points

2007-08-02 Thread Len Brown
On Thursday 02 August 2007 05:45, Adrian Schröter wrote:
> On Thursday 02 August 2007 11:42:27 wrote Thomas Renninger:
> > On Thu, 2007-08-02 at 10:40 +0200, Knut Petersen wrote:
> > > Hi everybody!
> > >
> > > Kernel 2.6.22 decreases performance by about 50% on my system.
> > > No, I do not like that. The reason is a broken BIOS, granted, but there
> > > was a perfect workaround in the kernel that has been dropped.
> > >
> > > mainboard: AOpen i915GMm-hfs, AWARD BIOS
> > > cpu: Pentium-M 750 (0.8 to 1.86 GHz)
> > > openSuSE 10.2 with kernel 2.6.22.1
> >
> > Is this a DELL laptop that gets throttled by 75% to throttling state 6
> > if 60 degrees are exceeded?
> > Adrian has such a machine..., no idea what is going on with that one,
> > but only workaround to get any use out of this machine is to override at
> > least the passive trip point.
>
> JFYI, there are plenty of these systems around, it was one out of four
> standard Novell models. I am maybe just the first one who uses Factory on
> it, but expect more bug reports when 10.3 gets released ...

That's very good news, Adrian.  In the past all we had to go on
was the memory of a machine that died several years ago.
But if you've got a live failure, that is really valuable.

Please go here
http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
and submit a new sighting vs. Power-Thermal
and attach the output from acpidump, cat /proc/acpi/thermal_zone/*/*
and assign it to [EMAIL PROTECTED]

thanks,
-Len