After some further tests I think it could actually be independent of
the microcode and rather be kernel 5.3's fault.
First perhaps some notes on that notebook:
It's a Fujitsu LIFEBOOK U757 with:
model name : Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz
When I got the system around 2017, IIRC, I already had considerable
CPU overheating problems, often seeing messages like:
Nov 17 14:41:22 heisenberg kernel: [ 36.347425] mce: CPU2: Core temperature
above threshold, cpu clock throttled (total events = 1)
Nov 17 14:41:22 heisenberg kernel: [ 36.347426] mce: CPU0: Core temperature
above threshold, cpu clock throttled (total events = 1)
Nov 17 14:41:22 heisenberg kernel: [ 36.347427] mce: CPU1: Package
temperature above threshold, cpu clock throttled (total events = 1)
Nov 17 14:41:22 heisenberg kernel: [ 36.347427] mce: CPU3: Package
temperature above threshold, cpu clock throttled (total events = 1)
Nov 17 14:41:22 heisenberg kernel: [ 36.347429] mce: CPU0: Package
temperature above threshold, cpu clock throttled (total events = 1)
Nov 17 14:41:22 heisenberg kernel: [ 36.347531] mce: CPU2: Package
temperature above threshold, cpu clock throttled (total events = 1)
Nov 17 14:41:22 heisenberg kernel: [ 36.348423] mce: CPU2: Core
temperature/speed normal
Nov 17 14:41:22 heisenberg kernel: [ 36.348424] mce: CPU0: Core
temperature/speed normal
Nov 17 14:41:22 heisenberg kernel: [ 36.348461] mce: CPU1: Package
temperature/speed normal
Nov 17 14:41:22 heisenberg kernel: [ 36.348461] mce: CPU3: Package
temperature/speed normal
Nov 17 14:41:22 heisenberg kernel: [ 36.348498] mce: CPU2: Package
temperature/speed normal
Nov 17 14:41:22 heisenberg kernel: [ 36.348568] mce: CPU0: Package
temperature/speed normal
Only with many thousands of events, and with temperatures reaching
almost exactly 100°C.
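To put a number on "many thousands of events", the throttle messages
can be counted per CPU from the kernel log. The following is only a
minimal sketch: the function name count_throttle_events is hypothetical,
and the pattern is derived from the message wording quoted above, which
may differ on other kernel versions.

```python
import re
from collections import Counter

# Pattern taken from the mce messages quoted in this mail (assumption:
# other kernels may word them differently). \s+ tolerates the line
# wrapping that mail clients add.
THROTTLE_RE = re.compile(
    r"mce: CPU(\d+): (Core|Package)\s+temperature\s+above\s+threshold"
)


def count_throttle_events(log_text):
    """Count 'above threshold' messages, keyed by (cpu, scope)."""
    counts = Counter()
    for match in THROTTLE_RE.finditer(log_text):
        counts[(int(match.group(1)), match.group(2))] += 1
    return counts
```

The input would be kernel log text, for example saved from
journalctl -k; the "temperature/speed normal" lines are deliberately
not counted.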
Fujitsu support had no real solution, claiming it wouldn't happen under
Windows.
Eventually the solution was to disable the turbo:
/sys/devices/system/cpu/intel_pstate/no_turbo = 1
(and all my previous tests as well as the ones from this mail have that
set).
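For anyone who wants to check or apply the same setting from a script,
here is a minimal sketch. The helper names turbo_disabled and
disable_turbo are made up for illustration; the path is the one given
above and only exists when the intel_pstate driver is in use. Writing
to it needs root, and the value is not kept across reboots.

```python
from pathlib import Path

# Path as quoted in this mail; present only with the intel_pstate driver.
NO_TURBO = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")


def turbo_disabled(path=NO_TURBO):
    """Return True if the file contains 1 (turbo off), False otherwise."""
    return path.read_text().strip() == "1"


def disable_turbo(path=NO_TURBO):
    """Write 1 to the file to switch turbo off (needs root on real sysfs)."""
    path.write_text("1\n")
```

The path parameter is there only so that the helpers can be tried
against an ordinary file before touching the real sysfs entry.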
Since then I rarely see the temperature warnings from above, and if I
do, it's usually only exactly one event during boot.
I guess the cooling of that slim ultrabook is just not designed well
enough to transport enough heat away if the turbo is on.
One further constant thing is that playback of videos always leads to
considerable CPU utilisation (and higher temperatures), much worse than
the previous ~2012 lifebook the university bought me.
I've never found a real solution for that; video decoding
acceleration is enabled and seems to work, but still...
My suspicion was that it might be some issue in cinnamon, because the
cinnamon process also gets quite high CPU usage when I play back
videos.
But when I reported this ticket, things had gotten much worse (and
it had already been like that for a week before I took action or even
noticed it): especially when the system was basically idle, the temps
were much higher and the fan was running much louder/faster.
And as I've described before, there are situations when it gets really
hot (80° and more) and doesn't cool down much again, even if the
system becomes idle again.
Just now I did some more testing with different kernel/microcode
combinations:
**************************************************************
With 5.2.17 0xca-2019-09-26:
Event/s PID %CPU PR NI Task Init Function
59.82 1730 0.2 0 0 Xorg hrtimer_wakeup
55.83 1730 0.2 0 0 Xorg it_real_fn
14.96 3203 0.5 0 0 gnome-terminal- hrtimer_wakeup
13.96 3086 0.0 0 0 diodon hrtimer_wakeup
7.98 3065 0.2 0 0 cinnamon tick_sched_timer
3.99 3203 0.5 0 0 gnome-terminal- tick_sched_timer
2.99 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
1.99 509 0.0 0 0 [kworker/u8:4] intel_uncore_fw_release_timer
1.99 1730 0.2 0 0 Xorg intel_uncore_fw_release_timer
1.99 45 0.0 0 0 [kworker/2:1] intel_uncore_fw_release_timer
1.00 847 0.0 0 0 gmain hrtimer_wakeup
1.00 2861 0.0 0 0 gmain hrtimer_wakeup
1.00 236 0.0 0 0 [kworker/3:2] intel_uncore_fw_release_timer
1.00 751 0.0 0 0 haveged hrtimer_wakeup
1.00 2920 0.0 0 0 gmain hrtimer_wakeup
1.00 1730 0.2 0 0 Xorg tick_sched_timer
1.00 3065 0.3 0 0 cinnamon intel_uncore_fw_release_timer
173 Total events, 172.48 events/sec (kernel: 7.98, userspace: 164.51)
Event/s PID %CPU PR NI Task Init Function
68.00 1730 0.5 0 0 Xorg hrtimer_wakeup
65.00 1730 0.5 0 0 Xorg it_real_fn
20.00 3203 0.3 0 0 gnome-terminal- hrtimer_wakeup
14.00 3086 0.0 0 0 diodon hrtimer_wakeup
7.00 3065 0.4 0 0 cinnamon tick_sched_timer
5.00 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
4.00 1730 0.5 0 0 Xorg tick_sched_timer
4.00 1730 0.5 0 0 Xorg intel_uncore_fw_release_timer
2.00 751 0.0 0 0 haveged hrtimer_wakeup
1.00 2695 0.0 0 0 ssh-agent hrtimer_wakeup
1.00 3074 0.0 0 0 gdbus tick_sched_timer
1.00 3203 0.3 0 0 gnome-terminal- tick_sched_timer
1.00 145 0.0 0 0 [kworker/0:2] tick_sched_timer
1.00 3104 0.0 0 0 gdbus tick_sched_timer
194 Total events, 194.00 events/sec (kernel: 6.00, userspace: 188.00)
Event/s PID %CPU PR NI Task Init Function
63.00 1730 0.5 0 0 Xorg hrtimer_wakeup
61.00 1730 0.5 0 0 Xorg it_real_fn
14.00 3086 0.1 0 0 diodon hrtimer_wakeup
13.00 3203 0.3 0 0 gnome-terminal- hrtimer_wakeup
12.00 3065 0.5 0 0 cinnamon tick_sched_timer
7.00 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
5.00 1730 0.5 0 0 Xorg intel_uncore_fw_release_timer
3.00 1730 0.5 0 0 Xorg tick_sched_timer
2.00 173 0.0 0 0 [kworker/u8:3] intel_uncore_fw_release_timer
2.00 3203 0.3 0 0 gnome-terminal- tick_sched_timer
2.00 236 0.5 0 0 [kworker/3:2] intel_uncore_fw_release_timer
1.00 145 0.0 0 0 [kworker/0:2] intel_uncore_fw_release_timer
1.00 751 0.0 0 0 haveged hrtimer_wakeup
1.00 3086 0.0 0 0 diodon tick_sched_timer
187 Total events, 187.00 events/sec (kernel: 12.00, userspace: 175.00)
Event/s PID %CPU PR NI Task Init Function
60.00 1730 0.2 0 0 Xorg hrtimer_wakeup
55.00 1730 0.2 0 0 Xorg it_real_fn
16.00 3086 0.1 0 0 diodon hrtimer_wakeup
14.00 3203 0.2 0 0 gnome-terminal- hrtimer_wakeup
7.00 3065 0.2 0 0 cinnamon tick_sched_timer
5.00 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
2.00 1730 0.2 0 0 Xorg tick_sched_timer
1.00 3203 0.2 0 0 gnome-terminal- tick_sched_timer
1.00 751 0.0 0 0 haveged hrtimer_wakeup
1.00 3065 0.3 0 0 cinnamon hrtimer_wakeup
1.00 1730 0.2 0 0 Xorg intel_uncore_fw_release_timer
1.00 32 0.0 0 0 [kworker/1:1] tick_sched_timer
1.00 3065 0.4 0 0 cinnamon intel_uncore_fw_release_timer
165 Total events, 165.00 events/sec (kernel: 6.00, userspace: 159.00)
Event/s PID %CPU PR NI Task Init Function
65.00 1730 0.5 0 0 Xorg hrtimer_wakeup
61.00 1730 0.5 0 0 Xorg it_real_fn
34.00 3065 1.3 0 0 cinnamon tick_sched_timer
19.00 3086 0.1 0 0 diodon hrtimer_wakeup
14.00 3203 0.2 0 0 gnome-terminal- hrtimer_wakeup
11.00 842 0.0 0 0 NetworkManager tick_sched_timer
8.00 907 0.0 0 0 gdbus tick_sched_timer
8.00 841 0.0 0 0 dbus-daemon tick_sched_timer
5.00 3074 0.7 0 0 gdbus tick_sched_timer
5.00 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
4.00 3104 0.2 0 0 gdbus tick_sched_timer
2.00 3203 0.2 0 0 gnome-terminal- tick_sched_timer
2.00 236 0.0 0 0 [kworker/3:2] intel_uncore_fw_release_timer
2.00 841 0.0 0 0 <...> tick_sched_timer
2.00 3086 0.1 0 0 diodon tick_sched_timer
2.00 1730 0.5 0 0 Xorg tick_sched_timer
2.00 3088 0.0 0 0 nm-applet tick_sched_timer
2.00 888 0.0 0 0 wpa_supplicant hrtimer_wakeup
2.00 1730 0.5 0 0 Xorg intel_uncore_fw_release_timer
1.00 751 0.0 0 0 haveged hrtimer_wakeup
251 Total events, 251.00 events/sec (kernel: 7.00, userspace: 244.00)
Event/s PID %CPU PR NI Task Init Function
59.30 1730 0.5 0 0 Xorg hrtimer_wakeup
57.29 1730 0.5 0 0 Xorg it_real_fn
23.12 3203 0.0 0 0 gnome-terminal- hrtimer_wakeup
16.08 3086 0.0 0 0 diodon hrtimer_wakeup
7.04 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
7.04 3065 0.3 0 0 cinnamon tick_sched_timer
3.02 1730 0.5 0 0 Xorg tick_sched_timer
3.02 1730 0.5 0 0 Xorg intel_uncore_fw_release_timer
2.01 751 0.0 0 0 haveged hrtimer_wakeup
1.01 3203 0.0 0 0 gnome-terminal- tick_sched_timer
178 Total events, 178.89 events/sec (kernel: 7.04, userspace: 171.86)
^C Event/s PID %CPU PR NI Task Init Function
141.84 1730 0.0 0 0 Xorg hrtimer_wakeup
120.57 1730 0.0 0 0 Xorg it_real_fn
85.11 3203 1.8 0 0 gnome-terminal- hrtimer_wakeup
28.37 3203 1.8 0 0 gnome-terminal- tick_sched_timer
14.18 751 0.0 0 0 haveged hrtimer_wakeup
14.18 32 0.0 0 0 [kworker/1:1] intel_uncore_fw_release_timer
7.09 2718 0.0 0 0 <...> hrtimer_wakeup
7.09 7530 0.0 0 0 [kworker/u8:8] tick_sched_timer
7.09 3065 0.8 0 0 cinnamon hrtimer_wakeup
7.09 1730 0.0 0 0 Xorg intel_uncore_fw_release_timer
61 Total events, 432.62 events/sec (kernel: 21.28, userspace: 411.35)
=> I thought I had seen much higher numbers for hrtimer_wakeup when
running 5.3, but that didn't turn out to be the case.
On an idle system (DE/cinnamon running but no real load):
root@heisenberg:~# sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +54.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +52.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +52.0°C (high = +100.0°C, crit = +100.0°C)
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +33.0°C
update-initramfs -u -k all barely hits 70°C
**************************************************************
**************************************************************
With 5.3.9+0xCA-2019-09-26
eventstat didn't show considerably higher numbers for e.g.
hrtimer_wakeup, which I thought I had seen at first.
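To compare eventstat runs between kernels by more than eyeballing, the
"Total events" summary lines can be pulled out and averaged. This is a
rough sketch only: the names eventstat_rates and mean_rate are
hypothetical, and the pattern assumes the summary format shown in the
eventstat output quoted in this mail.

```python
import re
from statistics import mean

# Matches summary lines such as:
# "173 Total events, 172.48 events/sec (kernel: 7.98, userspace: 164.51)"
TOTAL_RE = re.compile(
    r"^\s*(\d+) Total events, ([\d.]+) events/sec "
    r"\(kernel: ([\d.]+), userspace: ([\d.]+)\)",
    re.MULTILINE,
)


def eventstat_rates(text):
    """Return (total, kernel, userspace) events/sec for each sample."""
    return [
        (float(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in TOTAL_RE.finditer(text)
    ]


def mean_rate(text):
    """Mean total events/sec over all samples found in text."""
    return mean(total for total, _, _ in eventstat_rates(text))
```

Note that a sample cut short by ^C covers less than a full interval and
reports an inflated rate, so it is probably best left out of the
average.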
But now the idle system (again with cinnamon running) seems to run much
hotter, barely getting below 60°:
# sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +66.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +61.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +61.0°C (high = +100.0°C, crit = +100.0°C)
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +33.0°C
CMB1-acpi-0
Adapter: ACPI interface
in0: 16.58 V
curr1: 0.00 A
Here I did some apt/aptitude stuff to get an older intel-microcode from
stable or oldstable.
After that (installation of packages and update-initramfs) it took (I'd
say) noticeably longer (not extremely much, but noticeable) till the CPU
cooled down to the (still higher base level of) idle temps from above
(~60-68°).
Running update-initramfs -k all -u lets the temps go easily above 70°,
up to 85°.
Interestingly sometimes it cools down again rather fast (but still only
to the 60° range).
Sometimes it doesn't.
Especially video playback seems to be a killer.
Playing a:
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p,
720x304 [SAR 152:151 DAR 360:151], 529 kb/s, SAR 181:180 DAR 181:76, 25
fps, 25 tbr, 25k tbn, 50 tbc (default)
in full screen lets the CPU heat up to:
# sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +93.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +83.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +81.0°C (high = +100.0°C, crit = +100.0°C)
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +33.0°C
CMB1-acpi-0
Adapter: ACPI interface
in0: 16.57 V
curr1: 0.00 A
and it took quite a while to cool down, even though I had already
stopped the video a minute or so earlier.
**************************************************************
**************************************************************
With 5.3.9+0xb-2019-04-01:
I.e. current kernel, but even older microcode (the last one where I
thought it was ok was 3.20191112.1 ... but that might be just a
coincidence, since on Nov 14 2019 I installed the kernel 5.3 packages,
and that is roughly around 3.20191113.1 (Fri, 15 Nov 2019), which is
when I started to slowly notice the CPU temperature issues).
Idle temp seems to be around:
# sensors
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +33.0°C
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +71.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +66.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +65.0°C (high = +100.0°C, crit = +100.0°C)
CMB1-acpi-0
Adapter: ACPI interface
in0: 16.57 V
curr1: 0.00 A
so here I concluded that maybe 5.3 is the offender... and not the
microcode!?
Installing the current microcode again and afterwards doing:
update-initramfs -k all -u
leads to temps around this:
$ sensors
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +33.0°C
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +79.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +74.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +79.0°C (high = +100.0°C, crit = +100.0°C)
CMB1-acpi-0
Adapter: ACPI interface
in0: 16.57 V
curr1: 0.00 A
staying long at around:
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +33.0°C
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +74.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +68.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +66.0°C (high = +100.0°C, crit = +100.0°C)
CMB1-acpi-0
Adapter: ACPI interface
in0: 16.57 V
curr1: 0.00 A
even though the initrd creation is already long over and top shows
nothing else.
**************************************************************
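Since all of the comparisons above rest on reading sensors output by
hand, here is a minimal sketch of how the coretemp readings could be
extracted and diffed between two configurations. The names
parse_coretemp and delta are hypothetical, and the pattern assumes the
plain-text layout of the sensors output pasted in this mail.

```python
import re

# Matches coretemp lines such as
# "Package id 0: +54.0°C (high = +100.0°C, crit = +100.0°C)".
# Other chips (iwlwifi, ACPI) use different labels and are skipped.
TEMP_RE = re.compile(
    r"^(Package id \d+|Core \d+):\s*([+-]?[\d.]+)°C", re.MULTILINE
)


def parse_coretemp(text):
    """Return {label: temperature in °C} for the coretemp lines in text."""
    return {m.group(1): float(m.group(2)) for m in TEMP_RE.finditer(text)}


def delta(before, after):
    """Per-sensor difference (after - before) for sensors in both dicts."""
    return {k: after[k] - before[k] for k in before if k in after}
```

Fed with the idle readings quoted above for 5.2.17 and for 5.3.9 with
the same microcode, delta would report the package running 12°C hotter
(54.0°C vs. 66.0°C).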
I'm now back at running 5.2.17-1 (2019-10-06) from the
linux-image-5.2.0-3-amd64-unsigned
package with the most recent intel-microcode package version.
Temperatures seem good (in the sense: as they were before I noticed the
issues).
So my conclusion would be 5.3 is the bad boy...
Shall we reassign it to src:linux?
Cheers,
Chris.