Let's take this one point at a time:
* fan not running at full speed in disengaged mode in a thermal emergency
   - as mentioned earlier, the default fan mode on the machine is to run under 
firmware control, in which case it runs in engaged mode with a feedback loop 
controller, so it never exceeds a top speed of 3500 RPM.  This matches the 
original thermal design by the manufacturer.  So either they made a mistake and 
all machines like yours overheat (in which case we would see lots of owners of 
your model reporting this bug), or this issue is particular to your machine.

* CPU not throttling in a thermal emergency (unless the frequency readings are 
wrong)
  - that needs investigation as thermald should be doing that (but as I 
mentioned earlier, I will examine the thermald issues later)

* shutting down when supposed to suspend as a reaction to overheat, 
unnecessarily destroying session
  - when a critical thermal event occurs one has a very short time window to 
react. Potentially the silicon may be permanently damaged, so the kernel 
chooses to power down rather than try to suspend (since suspend can get stuck 
and exacerbate the issue).  Without the handling of this thermal event, the 
next step is for the hardware to physically shut itself down, which is outside 
any form of operating system control.  Either way, the machine is desperately 
trying to save itself from breaking.

* destroying session in a shut down/restart cycle (I heard rumours this may be 
fixed later in Snappy with containers)
  - again, in a rush to save your silicon from becoming irreparably damaged, 
shutdown is the fastest mechanism.  Snappy containers will not help.
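
To cross-check the frequency readings and see how close the machine is to its
critical trip point, a rough sketch like the following may help. It only reads
the standard cpufreq and thermal sysfs files; SYSFS_ROOT is a hypothetical
override I have added so the script can be pointed at a copy of the tree, and
the output format is my own, not something thermald or the kernel produces.

```shell
#!/bin/sh
# Sketch: print CPU frequencies, thermal zone temperatures and critical trips.
# SYSFS_ROOT is a hypothetical knob (not a kernel feature); defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Current frequency of each CPU, converted from kHz to MHz
show_freqs() {
    for f in "$SYSFS_ROOT"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        cpu=${f%/cpufreq/scaling_cur_freq}
        cpu=${cpu##*/}
        khz=$(cat "$f")
        echo "$cpu: $((khz / 1000)) MHz"
    done
}

# Temperature of each thermal zone, converted from millidegrees to degrees C
show_temps() {
    for z in "$SYSFS_ROOT"/class/thermal/thermal_zone[0-9]*; do
        [ -r "$z/temp" ] || continue
        mc=$(cat "$z/temp")
        type=$(cat "$z/type" 2>/dev/null)
        echo "${z##*/} ($type): $((mc / 1000)) C"
    done
}

# Trip points of type "critical", i.e. where the kernel powers the machine off
show_critical_trips() {
    for t in "$SYSFS_ROOT"/class/thermal/thermal_zone[0-9]*/trip_point_*_type; do
        [ -r "$t" ] || continue
        [ "$(cat "$t")" = "critical" ] || continue
        temp_file=${t%_type}_temp
        [ -r "$temp_file" ] || continue
        zone=${t%/trip_point_*}
        mc=$(cat "$temp_file")
        echo "${zone##*/} critical trip: $((mc / 1000)) C"
    done
}

show_freqs
show_temps
show_critical_trips
```

Running this under "watch -n 1" while the machine is loaded should show whether
the frequencies drop as the temperatures approach the critical trip; if they
stay at maximum, that would suggest throttling really is not happening.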

I'd recommend reading https://en.wikipedia.org/wiki/Thermal_design_power, there 
is a paragraph that states:
"Most modern processors will cause a therm-trip only upon a catastrophic 
cooling failure, such as a no longer operational fan or an incorrectly mounted 
heatsink."

So, the next step will be to see if we can see what thermald is doing.

1. Stop thermald so we can re-enable it with full debug on:

sudo systemctl stop thermald (if you are using systemd)

or

sudo service thermald stop (if you are using upstart)

2.  Run thermald for a while from the command line and capture debug
output:

sudo thermald --no-daemon --dbus-enable --loglevel=debug | tee thermald.log

Let this run for, say, 5-10 minutes while you use your machine, then stop it
with Ctrl-C and attach thermald.log to the bug report.

3. Restart thermald:

sudo systemctl start thermald (if you are using systemd)

or

sudo service thermald start (if you are using upstart)
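
If it is easier, steps 1-3 can be wrapped in a small script along these lines.
This is only a sketch: DRY_RUN, DURATION and LOG are names I made up, the
/run/systemd/system check is one common way to tell whether systemd is the
running init, and the 2>&1 is my addition in case some of the debug output goes
to stderr. By default it only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch of steps 1-3. DRY_RUN, DURATION and LOG are hypothetical knobs.
DRY_RUN="${DRY_RUN:-1}"        # 1 = print commands only, 0 = execute them
DURATION="${DURATION:-600}"    # seconds to leave thermald running (10 minutes)
LOG="${LOG:-thermald.log}"

# Print the command that stops or starts thermald for the running init system
service_cmd() {
    if [ -d /run/systemd/system ]; then
        echo "sudo systemctl $1 thermald"
    else
        echo "sudo service thermald $1"
    fi
}

# Echo a command line, and execute it unless DRY_RUN=1
run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || sh -c "$*"
}

# 1. stop the daemon, 2. run it in the foreground with debug, 3. start it again
run "$(service_cmd stop)"
run "sudo timeout $DURATION thermald --no-daemon --dbus-enable --loglevel=debug 2>&1 | tee $LOG"
run "$(service_cmd start)"
```

The timeout command just ends the foreground thermald after DURATION seconds so
the daemon is restarted even if you forget about it.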

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1491797

Title:
  Shuts down when supposed to suspend as a reaction to self-caused
  overheat, session lost

