[Kernel-packages] [Bug 1600599] Re: Thermald is totally broken, or its default configuration is

2019-08-15 Thread Colin Ian King
The bionic SRU test message occurred because I accidentally uploaded the
package with the entire old history.  This bug has already been fixed
and the verification for bionic can be ignored.

** No longer affects: thermald (Ubuntu Bionic)

** Tags removed: verification-needed verification-needed-bionic
** Tags added: verification-done

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to thermald in Ubuntu.
https://bugs.launchpad.net/bugs/1600599

Title:
  Thermald is totally broken, or its default configuration is

Status in thermald package in Ubuntu:
  Fix Released

Bug description:
  **WORKAROUND**: shut down the thermald process completely. If your
  computer has an actual physical cooling fan and it's fully functional,
  you don't need thermald at all.

  I have Ubuntu 15.10, up to date with automatic updates, and I never
  touched the thermald configuration. This is on a laptop, which has an
  actual physical cooling fan (like most laptops).

  EXPECTED BEHAVIOR:
  As the CPU temperature increases, the fan should spin faster to keep
  the temperature from getting too high. ONLY IF, even with the fan at
  its full capacity, or approaching it, the temperature keeps growing,
  THEN that's when powerclamp and things like that should trigger,
  throttling the CPU, so that it doesn't burn (or shut down abruptly).
  Also, these kinds of CPU throttling should come in gradually as
  needed. That is, if you inject idle processes, you should inject just
  the minimum amount that is needed. For example, if the fan at its
  maximum speed is *almost* enough to keep the temperature below the
  threshold, but not *quite* enough, injecting just a small amount of
  idle time into the CPU should be enough to do that extra bit of
  cooling that is needed. You would barely notice it. It would not slow
  your system down a lot, unless the heating is *way* higher than the
  fan alone can fight.
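The ordering described above (fan first, then only as much idle injection as needed) can be sketched roughly as follows. This is an illustration only, not thermald's actual algorithm: the function name `plan_cooling`, the `CoolingAction` type, and every threshold value are hypothetical, and it simplifies "the temperature keeps growing with the fan at full capacity" into fixed temperature thresholds.

```python
from dataclasses import dataclass


@dataclass
class CoolingAction:
    fan_percent: int   # requested fan duty cycle, 0-100
    idle_percent: int  # requested idle injection, 0-100


def plan_cooling(temp_c, fan_start_c=60.0, fan_full_c=85.0,
                 critical_c=100.0, max_idle_percent=50):
    """Hypothetical policy: ramp the fan first, throttle only once it is maxed out.

    All thresholds are made-up example values, not thermald defaults.
    """
    if temp_c <= fan_start_c:
        # Cool enough: no fan, no throttling.
        return CoolingAction(0, 0)
    if temp_c < fan_full_c:
        # Scale the fan linearly; the CPU is left untouched.
        fan = round(100 * (temp_c - fan_start_c) / (fan_full_c - fan_start_c))
        return CoolingAction(fan, 0)
    # Fan is saturated: add idle injection in proportion to the overshoot,
    # capped at max_idle_percent once the critical temperature is reached.
    over = min(temp_c, critical_c) - fan_full_c
    idle = round(max_idle_percent * over / (critical_c - fan_full_c))
    return CoolingAction(100, idle)


if __name__ == "__main__":
    for t in (50.0, 72.5, 85.0, 88.0, 100.0):
        print(t, plan_cooling(t))
```

With these example numbers, a temperature slightly above the point where the fan saturates yields only a small idle-injection percentage, which is the "barely notice it" case described above.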

  On a fully functional system (where the fan is enough to prevent the
  CPU from overheating and/or excessive CPU consumption does not occur
  to a huge degree for a long time), you shouldn't notice any difference
  after shutting down thermald completely. Only on a system where the
  fan is not fully functional and/or hugely excessive CPU usage goes on
  for too long (actually, if the latter alone is enough to make it
  happen, it means that the fan is undersized) would you notice the
  difference between having thermald (powerclamp and other CPU
  throttling mechanisms would kick in and prevent the temperature from
  becoming critical) and not having it (temperature would eventually go
  critical and something bad would happen, such as a sudden shutdown).

  OBSERVED BEHAVIOR:
  When CPU temperature becomes high due to relatively high (not huge)
  CPU consumption, intel_powerclamp starts to kick in, injecting idle
  processes and crippling the whole system. The observable result is
  that the system becomes unresponsive and unusable, yet the physical
  fan is spinning at roughly HALF of its maximum speed. So, you have a
  fast quad core machine, with a cooling fan that is perfectly capable
  of keeping the temperature down while using all the computing power
  that you require, BUT since powerclamp and things like that kick in
  too soon, you are limited to using a tiny fraction of the power your
  machine is capable of.
  To put it another way: you can't watch a f***ing youtube video in
  full screen because the whole system will become unresponsive.
  Even after removing and blacklisting the intel_powerclamp and
  intel_rapl kernel modules, the apparent behavior was practically the
  same, except that I wouldn't observe the "kidle_inject" processes by
  running "top". I guess there are other CPU-throttling mechanisms
  besides powerclamp and rapl.

  So now I have SHUT DOWN THERMALD completely, and my system behaves
  NORMALLY. The fan, of course, reaches higher speeds. Not even _much_
  higher, which means that it needed just a little bit more speed to
  keep up with the heating. Powerclamp and other cpu throttling
  mechanisms were kicking in WAY too soon.

  It took me quite a long time to figure out that this was the problem.
  I just assumed that some bug was causing excessive CPU consumption for
  trivial stuff such as playing video (which is actually true but is not
  the whole story) and that the CPU consumption actually was causing too
  much heat for the fan to dissipate, making it necessary for powerclamp
  to kick in. Also, I thought my fan was probably filled with dust and
  incapable of doing its job efficiently (which is also true but is not
  the whole story).

  Until I realised that when I was observing unresponsiveness, the fan
  was not even close to its maximum speed.

  CONCLUSION: either thermald does a ridiculously bad job, or its
  default configuration is ridiculously bad.

  NOTE: this issue is **CRITICAL**: this cripples the whole system making it 
unresponsive when doing moderately 

[Kernel-packages] [Bug 1600599] Re: Thermald is totally broken, or its default configuration is

2019-08-15 Thread Andy Whitcroft
Hello teo1978, or anyone else affected,

Accepted thermald into bionic-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/thermald/1.7.0-5ubuntu4 in a few
hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-bionic to verification-done-bionic. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-bionic. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: thermald (Ubuntu Bionic)
   Status: New => Fix Committed

** Tags added: verification-needed verification-needed-bionic

Status in thermald package in Ubuntu:
  Fix Released
Status in thermald source package in Bionic:
  Fix Committed

[Kernel-packages] [Bug 1600599] Re: Thermald is totally broken, or its default configuration is

2018-08-03 Thread dah bien-hwa
I just had episodes similar to what is described here.
My CPU apparently was close to overheating (though the maximum
temperature I observed afterwards was 95°C, with a specified high/max of
100°C). Ubuntu 18.04 on my laptop (Asus Zenbook UX301LA) consistently
became unresponsive after starting high-CPU-usage tasks (compiling
something using ninja-build); after stopping the thermald service, the
unresponsiveness was gone even when running the high-CPU-usage tasks for
a long time.

The unresponsiveness is quite severe: though I did manage one time to
switch to tty1 and log in there, I could never enter any more commands
there (waiting for approximately a minute or so...). I eventually always
had to resort to harsher methods of restarting the laptop.
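For anyone wanting to compare their own readings against figures like the 95°C / 100°C mentioned above, here is a minimal sketch that reads the kernel's thermal zones from sysfs, where each `thermal_zone*/temp` file holds the temperature in millidegrees Celsius. The helper names `read_zone_temps` and `headroom` are made up for this example, and the 100°C limit is just the value quoted above; the real limit and the relevant zone depend on the hardware.

```python
from pathlib import Path


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: temperature in °C} for every readable thermal zone."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            raw = (zone / "temp").read_text().strip()
            # sysfs reports millidegrees Celsius, e.g. "95000" for 95 °C.
            temps[zone.name] = int(raw) / 1000.0
        except (OSError, ValueError):
            # Skip zones that are unreadable or report a non-numeric value.
            continue
    return temps


def headroom(temps, limit_c=100.0):
    """Return how many degrees each zone is below the assumed limit."""
    return {name: limit_c - t for name, t in temps.items()}


if __name__ == "__main__":
    zones = read_zone_temps()
    for name, margin in headroom(zones).items():
        print(f"{name}: {zones[name]:.1f} °C ({margin:.1f} °C below the assumed limit)")
```

Running this in a loop while the heavy task is active shows whether the system becomes unresponsive well below the limit or only close to it.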
