Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-04-02 Thread l0f4r0
Hi again,

2 Apr 2020 at 00:30, from l0f...@tuta.io:

> => From now on, it seems I have 2 workarounds: turbo boost deactivation and 
> Debian thermald.
> I guess I should do some A/B testing...
>
I've just uninstalled the thermald I had built manually from the 01.org sources 
and installed the official Debian package instead, as suggested.
I've left everything at its defaults (zero configuration).
Here is the resulting state:

systemctl status thermald.service
  thermald.service - Thermal Daemon Service
   Loaded: loaded (/lib/systemd/system/thermald.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-04-01 23:59:38 CEST; 15h ago
Main PID: 30158 (thermald)
    Tasks: 2 (limit: 4915)
   Memory: 2.3M
   CGroup: /system.slice/thermald.service 
   └─30158 /usr/sbin/thermald --no-daemon --dbus-enable

Apr 01 23:59:38 thermald[30158]: sensor id 10 : No temp sysfs for reading raw temp
Apr 01 23:59:38 thermald[30158]: sensor id 10 : No temp sysfs for reading raw temp
Apr 01 23:59:38 thermald[30158]: sensor id 10 : No temp sysfs for reading raw temp
Apr 01 23:59:38 thermald[30158]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 01 23:59:38 thermald[30158]: error: could not parse file /etc/thermald/thermal-conf.xml
Apr 01 23:59:38 thermald[30158]: sysfs open failed
Apr 01 23:59:38 thermald[30158]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 01 23:59:38 thermald[30158]: error: could not parse file /etc/thermald/thermal-conf.xml
Apr 01 23:59:38 thermald[30158]: I/O warning : failed to load external entity "/etc/thermald/thermal-conf.xml"
Apr 01 23:59:38 thermald[30158]: error: could not parse file /etc/thermald/thermal-conf.xml

NB: I indeed don't have /etc/thermald/thermal-conf.xml, but it seems that 
thermald can work without it 
(https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1811788).

grep -i pstate /boot/config-$(uname -r)
CONFIG_X86_INTEL_PSTATE=y

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate

cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 400 MHz - 4.60 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 400 MHz and 4.60 GHz.
  The governor "powersave" may decide which speed to use
  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 887 MHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes

The "CPUX: Package temperature above threshold, cpu clock throttled (total 
events = XXX)" messages still show up in journalctl :(
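
To see how often each CPU logs this message, the kernel log can be tallied per CPU. This is a minimal sketch, assuming the message format quoted above; the function name count_throttle_msgs is made up for illustration.

```shell
# Count "Package temperature above threshold" kernel messages per CPU.
# Reads log text on stdin and prints "<cpu> <count>" lines, sorted by CPU name.
count_throttle_msgs() {
    # Keep only the CPU label of matching lines, then tally identical labels.
    sed -n 's/.*\(CPU[0-9][0-9]*\): Package temperature above threshold.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

For example, `journalctl -k -b | count_throttle_msgs` tallies the messages of the current boot.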

Do you still think thermald can help me? If so, what would you try or configure, 
please?
Or maybe thermald simply requires something I haven't installed yet?

Thank you for your appreciated help :)
Best regards,
l0f4r0



Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-04-01 Thread l0f4r0
Hi,

8 Mar 2020 at 22:46, from:

> So far, deactivating turbo boost seems to resolve the issue. Do you think I'll 
> really lose performance?
> I think turbo can only be used in a single-threaded context. With HT activated, I 
> should not be in that situation very often, right?
>
I've checked some resources on the Internet. People seem to have very 
different opinions about HT/turbo boost. Some say they don't notice anything, 
others say it depends on the workload, and even benchmarks have trouble 
measuring the real impact because processors are very complicated 
components...

Anyway, I'll start from the principle that I'm not an expert and have no prior 
experience to use as a frame of reference for comparison (so I shouldn't be 
disappointed).
However, I do have temperature issues. So if deactivating turbo boost resolves 
them, so be it, as long as performance is not severely impacted...

>> BTW, have you checked for Lenovo-provided firmware/BIOS/EC updates?
>> I've seen temperature and fan profile-related fixes in a couple of them
>> (for other Lenovo models).
>>
> Good idea. It seems there is a new firmware indeed!
> If I'm right, I have N2JET83W (1.61) UEFI BIOS version and N2JHT32W (1.16) 
> Embedded Controller version.
> According to https://download.lenovo.com/pccbbs/mobiles/n2jul22w.txt, 1.62 
> (N2JET84W) and 1.17 (N2JHT33W) are available. Changelogs are as follows:
>
> [Important updates]
> - Addresses CVE-2019-0185 
> (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-0185)
> - Security fix addresses LEN-29406 ST Microelectronics TPM Firmware ECDSA
>
> [New functions or enhancements]
> - Updated the CPU microcode.
>    (Note) Above update will show "Self-Healing BIOS  backup progressing ... 
> xx %"
>   message on screen during BIOS update process.
> - Updated the Diagnostics module to version 04.11.000.
> - Supported for Battery Diagnostics.
> - Updated Charging LED to always On while AC adapter was connected.
>
> [Problem fixes]
> - Fixed an issue where system entered hibernation suddenly due to critical low
>   battery status detected incorrectly.
> - Fixed an issue where battery was not charged when AC adapter was connected
>   before computer was turned on.
> - Fixed an issue where system hang after disabled AMT setting in BIOS setup.
>
> I've never updated firmware on Linux. I know it can be a dangerous 
> operation...
> I've just installed fwupd-amd64-signed on my Debian 10 and it tells me:
>
> sudo fwupdmgr get-updates
> X390/T490s Thunderbolt Controller has firmware updates:
> GUID:    e773c51e-a20c-5b29-9f09-6bb0e0ef7560
> ID:  com.lenovo.ThinkPadN2JTF.firmware
> Update Version:  20.00
> Update Name: ThinkPad X390/ThinkPad T490s (Machine types: 
> 20Q0,20Q1,20SC,20SD,20NX,20NY) Thunderbolt Controller
> Update Summary:  Lenovo ThinkPad X390/ThinkPad T490s Thunderbolt 
> Firmware
> Update Remote ID:    lvfs
> Update Checksum: SHA1(6c0dce78ce4e91f6c8e79fdf3c8965077bd35219)
> Update Location: 
> https://fwupd.org/downloads/8eef957c95cb6f534448be1faa7bbfc8702d620f64b757d40ee5e0b6b7094c0e-Lenovo-ThinkPad-X390-SystemFirmware-01.cab
> Update Description:  Lenovo ThinkPad X390/ThinkPad T490s Thunderbolt 
> Firmware
> 
>   • DO NOT FORCE UPDATE Thunderbolt Controller. This 
> may damage the firmware.
> 
> No upgrades for 20Q0CTO1WW System Firmware, current is 0.1.61: 0.1.51=older
> UEFI Device Firmware has firmware updates:
> GUID:    ef5cdc85-9cf6-469d-9cb7-920b7dd6672b
> ID:  com.lenovo.ThinkPadN2JRN.firmware
> Update Version:  192.47.1524
> Update Name: ThinkPad T490s Consumer ME Update
> Update Summary:  Lenovo ThinkPad T490s Consumer ME Firmware
> Update Remote ID:    lvfs
> Update Checksum: SHA1(7689117b9f94b853d688c84051c18a989a76c7fb)
> Update Location: 
> https://fwupd.org/downloads/3e05da98267ebff8531c17f820b4dfaddc73c17f-Lenovo-ThinkPad-T490s-ConsumerMEFirmware-12.0.47.1524.cab
> Update Description:   • 0 Q2'19 Intel Platform Update (Hot Fix Release)
> 
>  Problem Fixes
> 
>  Security issues fixed:
> 
> No upgrades for UEFI Device Firmware, current is 0.1.16: 0.1.15=older, 
> 0.1.12=older
>
> It seems fwupd doesn't provide aforementioned 1.62 (N2JET84W), right? Maybe 
> it's just a matter of time?
> What would you do if you were me please? Manual update via dd on a USB 
> thumbdrive? Nothing?
>
Answering my own question after some reading:
* Not all updates are available through fwupd / LVFS, even if the vendor is 
already an identified partner.
* I updated both ECP (1.17) and UEFI (1.63) via fwupdmgr after downloading the 
firmware from the Lenovo website. The latter 

Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-03-08 Thread Ben Hutchings
On Sun, 2020-03-08 at 22:46 +0100, l0f...@tuta.io wrote:
[...]
> > So anyways, given that you have a new ultrabook with a powerful CPU, I
> > think thermald is probably the best solution to try.  You can read about
> > how it combines many other methods and aims to solve the problems
> > inherent to ultrabooks here (01.org is the Intel open source project):
> > 
> >   https://01.org/linux-thermal-daemon/documentation/introduction-thermal-daemon
> > 
> Many thanks for the link. I didn't know about it. So I've:
> * downloaded https://github.com/intel/thermal_daemon/archive/v1.9.1.zip
[...]

thermald is already packaged in Debian, so it is probably better to
install that package.

Ben.

-- 
Ben Hutchings
The generation of random numbers is too important to be left to chance.
   - Robert Coveyou






Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-03-08 Thread Ben Hutchings
On Mon, 2020-01-20 at 14:27 -0700, Nicholas D Steeves wrote:
> Hi L0f4r0,
> 
> Ben (or other team members), if you're reading this and are short on
> time, would you please skip to the question at the bottom and reply to
> it?
[...]
> So anyways, given that you have a new ultrabook with a powerful CPU, I
> think thermald is probably the best solution to try.  You can read about
> how it combines many other methods and aims to solve the problems
> inherent to ultrabooks here (01.org is the Intel open source project):
> 
>   https://01.org/linux-thermal-daemon/documentation/introduction-thermal-daemon
> 
> Ben (and team), is there any reason why thermald isn't part of
> task-laptop?  If it solves L0f4r0's issue it might be worth adding it,
> and if it's true that laptops...these new ultrabooks...are being
> designed to require it (eg: insufficient cooling solution) then it would
> make sense to enable it for the general case that it purportedly solves.

If thermald doesn't require manual configuration, and its default
behaviour is reasonable on all laptops, then I think it should be added
to task-laptop.  However, currently all tasks are arch-independent
packages while thermald is x86-only.

If thermald doesn't behave reasonably on some laptops, or if there's
some reason why tasksel should not build arch-dependent packages, we
would need to find some other way to get it installed by default on the
laptops where it's really needed.

Anyway, this should be discussed on the debian-boot list, which is the
maintainer address for tasksel.

Ben.

-- 
Ben Hutchings
The generation of random numbers is too important to be left to chance.
   - Robert Coveyou






Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-03-08 Thread l0f4r0
Hi Nicholas,

First of all, thank you for your continued help :)

20 Jan 2020 at 22:27, from nstee...@gmail.com:

>
>
>>> Have you tried disabling CPU freq boost?  When the ambient
>>> temperature is above 27°C my X220 and X230 need to have boost
>>> disabled to avoid overheating/throttling.
>>>
>> I've seen at least 2 ways to deactivate turbo boost:
>> 1) echo "1" to /sys/devices/system/cpu/intel_pstate/no_turbo
>> Apparently, as sysctl only works with /proc/sys (and not /sys), this
>> needs to be made permanent via a systemd service. What do you think
>> about this procedure:
>> https://blog.christophersmart.com/2017/02/08/manage-intel-turbo-boost-with-systemd/?
>> 2) modify MSR registers via wrmsr (https://askubuntu.com/a/619881). I don't 
>> know whether that persists across reboots...
>>
>> Do you use any of these? Something else?
>>
>
> I used the systemd method on my sister's old Macbook.  It seems to help
> with heat and fan noise, and everything is still consistently smooth, so
> we count it as a win.
>
So far, I've noticed that I don't have "temperature above threshold" alerts 
anymore if I deactivate turbo boost!
I've run some quick tests like:
stress-ng --cpu 4 --timeout 30s --metrics-brief
With turbo enabled, my laptop rises to 80°C/4580MHz. Without it, it stays 
around 55°C/1800MHz.
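
To follow the package temperature while such a test runs, the thermal zones in sysfs can be polled. This is a minimal sketch, assuming the kernel exposes a thermal zone of type x86_pkg_temp on this machine (not confirmed in this thread); pkg_temp_c is a hypothetical helper name.

```shell
# Print the CPU package temperature in whole degrees Celsius.
# $1: thermal class directory (defaults to /sys/class/thermal).
# Returns 1 if no zone of type x86_pkg_temp is found (assumed zone type).
pkg_temp_c() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/type" ] || continue
        if [ "$(cat "$zone/type")" = "x86_pkg_temp" ]; then
            # The kernel reports the value in millidegrees Celsius.
            echo $(( $(cat "$zone/temp") / 1000 ))
            return 0
        fi
    done
    return 1
}
```

Running `while sleep 1; do pkg_temp_c; done` in a second terminal during the stress-ng run prints one reading per second.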

What's weird is that those high temperatures under stress don't necessarily 
generate alerts. Even when the temperature reaches 80°C, I don't always get 
"temperature above threshold" alerts (??). Conversely, I can be doing nothing 
special and still get those alerts...
So I'm still confused and still don't know what really triggers them...
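
One way to look at this independently of journalctl: on x86 the kernel keeps per-CPU throttle counters in sysfs (core_throttle_count and package_throttle_count under thermal_throttle/). This is a minimal sketch that prints them, assuming those files are present on this kernel; throttle_counts is an illustrative name.

```shell
# Print per-CPU thermal throttle counters as "cpuN core=<n> package=<n>".
# $1: CPU sysfs directory (defaults to /sys/devices/system/cpu).
throttle_counts() {
    base=${1:-/sys/devices/system/cpu}
    for dir in "$base"/cpu[0-9]*/thermal_throttle; do
        [ -d "$dir" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        # A missing counter file is shown as "?" rather than aborting.
        core=$(cat "$dir/core_throttle_count" 2>/dev/null || echo '?')
        pkg=$(cat "$dir/package_throttle_count" 2>/dev/null || echo '?')
        echo "$cpu core=$core package=$pkg"
    done
}
```

Comparing the output before and after an idle period, or before and after a stress-ng run, shows whether the counters move when the alerts appear.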

So far, deactivating turbo boost seems to resolve the issue. Do you think I'll 
really lose performance?
I think turbo can only be used in a single-threaded context. With HT activated, I 
should not be in that situation very often, right?

>> PS: My CPU is Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz.
>>
> Oooh.  Earlier you wrote that this is an X390, right?  The "powerful CPU
> in a thin and tiny case with a lightweight cooling solution" problem may
> apply.
>
Lenovo ThinkPad X390 indeed ;)

> BTW, have you checked for Lenovo-provided firmware/BIOS/EC updates?
> I've seen temperature and fan profile-related fixes in a couple of them
> (for other Lenovo models).
>
Good idea. It seems there is a new firmware indeed!
If I'm right, I have N2JET83W (1.61) UEFI BIOS version and N2JHT32W (1.16) 
Embedded Controller version.
According to https://download.lenovo.com/pccbbs/mobiles/n2jul22w.txt, 1.62 
(N2JET84W) and 1.17 (N2JHT33W) are available. Changelogs are as follows:

[Important updates]
- Addresses CVE-2019-0185 
(https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-0185)
- Security fix addresses LEN-29406 ST Microelectronics TPM Firmware ECDSA

[New functions or enhancements]
- Updated the CPU microcode.
   (Note) Above update will show "Self-Healing BIOS  backup progressing ... xx 
%"
  message on screen during BIOS update process.
- Updated the Diagnostics module to version 04.11.000.
- Supported for Battery Diagnostics.
- Updated Charging LED to always On while AC adapter was connected.

[Problem fixes]
- Fixed an issue where system entered hibernation suddenly due to critical low
  battery status detected incorrectly.
- Fixed an issue where battery was not charged when AC adapter was connected
  before computer was turned on.
- Fixed an issue where system hang after disabled AMT setting in BIOS setup.

I've never updated firmware on Linux. I know it can be a dangerous operation...
I've just installed fwupd-amd64-signed on my Debian 10 and it tells me:

sudo fwupdmgr get-updates
X390/T490s Thunderbolt Controller has firmware updates:
GUID:    e773c51e-a20c-5b29-9f09-6bb0e0ef7560
ID:  com.lenovo.ThinkPadN2JTF.firmware
Update Version:  20.00
Update Name: ThinkPad X390/ThinkPad T490s (Machine types: 
20Q0,20Q1,20SC,20SD,20NX,20NY) Thunderbolt Controller
Update Summary:  Lenovo ThinkPad X390/ThinkPad T490s Thunderbolt 
Firmware
Update Remote ID:    lvfs
Update Checksum: SHA1(6c0dce78ce4e91f6c8e79fdf3c8965077bd35219)
Update Location: 
https://fwupd.org/downloads/8eef957c95cb6f534448be1faa7bbfc8702d620f64b757d40ee5e0b6b7094c0e-Lenovo-ThinkPad-X390-SystemFirmware-01.cab
Update Description:  Lenovo ThinkPad X390/ThinkPad T490s Thunderbolt 
Firmware

  • DO NOT FORCE UPDATE Thunderbolt Controller. This 
may damage the firmware.

No upgrades for 20Q0CTO1WW System Firmware, current is 0.1.61: 0.1.51=older
UEFI Device Firmware has firmware updates:
GUID:    ef5cdc85-9cf6-469d-9cb7-920b7dd6672b
ID:  com.lenovo.ThinkPadN2JRN.firmware
Update Version:  192.47.1524
Update Name: ThinkPad T490s Consumer ME Update
Update Summary:  Lenovo ThinkPad T490s Consumer ME Firmware
Update Remote ID:  

Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-01-20 Thread Nicholas D Steeves
Hi L0f4r0,

Ben (or other team members), if you're reading this and are short on
time, would you please skip to the question at the bottom and reply to
it?

 writes:

> At least, Internet resources indicate that it's safer this way with
> HT deactivated (regarding MDS attacks), but I don't know (and probably
> nobody does, as it really depends on the current CPU tasks) how my
> performance is impacted now...  In your case, it seems to be a
> benefit but surely that's not always the case...

You're absolutely right.  My approach is to sacrifice peak performance
for consistency, but other people may prefer (or require) peak
performance.

> Maybe I can just continue like this and restore HT if this is
> impossible to live with ;)
>

Anecdotally, I found HT makes interactivity under high load slightly
better.  Someone who knows about things like CPU context switches and
cache misses might be able to say why.  As you noted though, increased
risk of MDS attack.  Too bad disabling it didn't help with heat.

>> Have you tried disabling CPU freq boost?  When the ambient
>> temperature is above 27°C my X220 and X230 need to have boost
>> disabled to avoid overheating/throttling.
>>
> I've seen at least 2 ways to deactivate turbo boost:
> 1) echo "1" to /sys/devices/system/cpu/intel_pstate/no_turbo
> Apparently, as sysctl only works with /proc/sys (and not /sys), this
> needs to be made permanent via a systemd service. What do you think
> about this procedure:
> https://blog.christophersmart.com/2017/02/08/manage-intel-turbo-boost-with-systemd/?
> 2) modify MSR registers via wrmsr (https://askubuntu.com/a/619881). I don't 
> know whether that persists across reboots...
>
> Do you use any of these? Something else?
>

I used the systemd method on my sister's old Macbook.  It seems to help
with heat and fan noise, and everything is still consistently smooth, so
we count it as a win.
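
The systemd method boils down to a oneshot unit that writes to no_turbo at boot. This is a minimal sketch, not the exact unit from the blog post linked above; the unit name disable-turbo-boost.service and the helper write_no_turbo_unit are made up for illustration.

```shell
# Write a oneshot unit that disables turbo boost at boot into directory $1.
# The unit file name "disable-turbo-boost.service" is a made-up example.
write_no_turbo_unit() {
    dir=${1:?usage: write_no_turbo_unit DIR}
    # Quoted heredoc delimiter: the unit text is written verbatim.
    cat > "$dir/disable-turbo-boost.service" <<'EOF'
[Unit]
Description=Disable Intel Turbo Boost via intel_pstate

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo'
ExecStop=/bin/sh -c 'echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo'

[Install]
WantedBy=multi-user.target
EOF
}
```

As root, `write_no_turbo_unit /etc/systemd/system` followed by `systemctl daemon-reload` and `systemctl enable --now disable-turbo-boost.service` would activate it; stopping the unit turns turbo back on.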

> PS: My CPU is Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz.

Oooh.  Earlier you wrote that this is an X390, right?  The "powerful CPU
in a thin and tiny case with a lightweight cooling solution" problem may
apply.

BTW, have you checked for Lenovo-provided firmware/BIOS/EC updates?
I've seen temperature and fan profile-related fixes in a couple of them
(for other Lenovo models).

> Actually my main questions are: 
>
> i) Are those temperature warnings legitimate?  I mean maybe those
> warnings have been wrongly triggered because probes are not accurate?
> I'm not speaking about HW failure (my laptop is brand new) but maybe I
> just don't have the right driver somewhere...  Indeed, it would be a
> pity to sacrifice performance by disabling HT/turbo boost
> (security aside) if the journalctl warnings were wrong in the
> first place... ;)
>

Honestly, I'm not sure.  This seems plausible.  In your other email you wrote:

> I've just upgraded my kernel 4.19.0-6-amd64 to backports longterm
> 5.4.0-0.bpo.2-amd64 because I had issues with light-docker.  The
> aforementioned journalctl entries have been reassigned from crit/2 to
> warning/4!  Maybe it's not so serious? ^^

This severity change, plus what I've read in other reports of newer laptops
overheating with recent kernels, makes me wonder if there might be some
churn in the Intel P-State driver.  On reddit and stackexchange some
people have reported success disabling it and using the older
acpi-cpufreq driver instead.  From what I've read the P-State driver
bypasses the ACPI hints, so if I had to guess, the reported success of
this method would depend on not-buggy ACPI firmware that provides
hardware-specific hints that work better than the Intel solution for the
general case.  That said, the new solution is supposed to be better in
every way (read on).
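
Before and after trying such a switch, it helps to confirm which cpufreq driver is actually in use. This is a minimal sketch; scaling_drivers is an illustrative name. As far as I know, the usual way to fall back to acpi-cpufreq is the intel_pstate=disable kernel command-line parameter, which is not something tested in this thread.

```shell
# Print the distinct cpufreq scaling drivers in use, one per line.
# $1: CPU sysfs directory (defaults to /sys/devices/system/cpu).
scaling_drivers() {
    base=${1:-/sys/devices/system/cpu}
    # Each CPU reports its own driver; collapse duplicates.
    cat "$base"/cpu[0-9]*/cpufreq/scaling_driver 2>/dev/null | sort -u
}
```

A single line reading intel_pstate means every CPU is handled by the P-State driver; acpi-cpufreq would appear here instead after a successful switch.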

> PS: I have installed intel-microcode 3.20191115.2~deb10u1. Still the
> same issues.  I don't know if it's an issue but microcode module is
> blacklisted in /etc/modprobe.d/intel-microcode-blacklist.conf (it
> seems to be a precaution regarding unsafe updates).
>

OT: IIRC this is to prevent updates at an unsafe time, and to use the
newer early microcode loading method rather than the older method (later in
the boot process).
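
Whether the packaged microcode was actually applied can be checked from the revision the kernel reports for each CPU. This is a minimal sketch that reads the "microcode" field of /proc/cpuinfo; microcode_revs is an illustrative name.

```shell
# Print the distinct microcode revisions reported by the kernel.
# $1: cpuinfo file (defaults to /proc/cpuinfo).
microcode_revs() {
    # Lines look like "microcode<TAB>: 0xca"; print the value after ": ".
    awk -F': ' '$1 ~ /^microcode/ { print $2 }' "${1:-/proc/cpuinfo}" | sort -u
}
```

Comparing the value before and after installing or upgrading intel-microcode (and rebooting) shows whether the revision changed.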

> ii) Is it risky to do nothing about these temperature warnings?  I
> have no idea what EC means (Embedded Controller?) but you said the EC
> eventually shuts down the laptop if need be. I presume that's not really
> beneficial from the user's point of view, as the running tasks will be
> killed and some work/data might be lost in the process.
>

Yes, "EC" means embedded controller :-)  Intel hardware is excellent
about shutting itself down before damage occurs.

So anyways, given that you have a new ultrabook with a powerful CPU, I
think thermald is probably the best solution to try.  You can read about
how it combines many other methods and aims to solve the problems
inherent to ultrabooks here (01.org is the Intel open source project):

  https://01.org/linux-thermal-daemon/documentation/introduction-thermal-daemon

Ben (and team), is the

Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-01-19 Thread l0f4r0
Hi,

5 Jan 2020 at 18:59, from l0f...@tuta.io:

> Multiple times per hour (despite no notable CPU-consuming activity), I get 
> something like the following journalctl crit/2 entry:
>
> kernel: CPUX: Package temperature above threshold, cpu clock throttled (total 
> events = Y)
> where X=[0-7] (I have a quad-core i7 with hyperthreading activated) and Y is 
> generally a 4-digit number.
>
I've just upgraded my kernel 4.19.0-6-amd64 to backports longterm 
5.4.0-0.bpo.2-amd64 because I had issues with light-docker.
The aforementioned journalctl entries have been reassigned from crit/2 to 
warning/4!
Maybe it's not so serious? ^^

Best regards,
l0f4r0



Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-01-19 Thread l0f4r0
Hi Nicholas, all,

Thank you very much for your answer and sorry for my delay...

6 Jan 2020 at 05:33, from nstee...@gmail.com:

> While I'm very much a junior member of the team, I have noted similar
> behaviour on a laptop as old as an X220 and wonder if it might be due to
> Spectre mitigations.  I have not noticed such log entries after
> disabling hyperthreading.  FYI, as someone who works with realtime audio
> I've found that hyperthreading is detrimental to worst-case latency.
> eg: after many hours trying to work around troublesome unpredictable
> latency spikes the solution to disable hyperthreading emerged as the
> simplest solution.
>
I've just deactivated hyperthreading in my UEFI.
Still the same warnings in journalctl :(

At least, Internet resources indicate that it's safer this way with HT 
deactivated (regarding MDS attacks), but I don't know (and probably nobody does, 
as it really depends on the current CPU tasks) how my performance is impacted now...
In your case, it seems to be a benefit but surely that's not always the case...
Maybe I can just continue like this and restore HT if this is impossible to 
live with ;)

> Have you tried disabling CPU freq boost?  When the ambient temperature is
> above 27°C my X220 and X230 need to have boost disabled to avoid
> overheating/throttling.
>
I've seen at least 2 ways to deactivate turbo boost:
1) echo "1" to /sys/devices/system/cpu/intel_pstate/no_turbo
Apparently, as sysctl only works with /proc/sys (and not /sys), this needs to be 
made permanent via a systemd service. What do you think about this procedure: 
https://blog.christophersmart.com/2017/02/08/manage-intel-turbo-boost-with-systemd/?
2) modify MSR registers via wrmsr (https://askubuntu.com/a/619881). I don't 
know whether that persists across reboots...
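
Method 1 can be wrapped in a small helper for quick on/off comparisons. This is a minimal sketch, assuming the intel_pstate driver is active; set_no_turbo is a made-up name, and the write needs root. The value lives only in sysfs, so it is lost at reboot unless something re-applies it at boot.

```shell
# Enable (1) or disable (0) the intel_pstate "no_turbo" switch.
# $1: value to write, 0 or 1 (1 means turbo boost is turned off).
# $2: control file (defaults to the intel_pstate sysfs path; needs root).
set_no_turbo() {
    case $1 in
        0|1) ;;
        *) echo "usage: set_no_turbo 0|1 [file]" >&2; return 2 ;;
    esac
    file=${2:-/sys/devices/system/cpu/intel_pstate/no_turbo}
    printf '%s\n' "$1" > "$file"
}
```

For example, `set_no_turbo 1` in a root shell turns turbo boost off and `set_no_turbo 0` turns it back on.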

Do you use any of these? Something else?

PS: My CPU is Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz.
PPS: Even if it works, I don't know whether I'll lose even more 
performance...

Actually my main questions are: 

i) Are those temperature warnings legitimate?
I mean maybe those warnings have been wrongly triggered because probes are not 
accurate? I'm not speaking about HW failure (my laptop is brand new) but maybe 
I just don't have the right driver somewhere...
Indeed, it would be a pity to sacrifice performance by disabling HT/turbo boost 
(security aside) if the journalctl warnings were wrong in the first 
place... ;)

PS: I have installed intel-microcode 3.20191115.2~deb10u1. Still the same 
issues.
I don't know if it's an issue but microcode module is blacklisted in 
/etc/modprobe.d/intel-microcode-blacklist.conf (it seems to be a precaution 
regarding unsafe updates).

ii) Is it risky to do nothing about these temperature warnings?
I have no idea what EC means (Embedded Controller?) but you said the EC eventually 
shuts down the laptop if need be. I presume that's not really beneficial from the 
user's point of view, as the running tasks will be killed and some work/data 
might be lost in the process.

Thanks again & Best regards :)
l0f4r0



Re: Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-01-05 Thread Nicholas D Steeves
Hi l0f4r0,

Reply follows inline.

 writes:

> Hi team,
>
> Multiple times per hour (despite no notable CPU-consuming activity), I
> get something like the following journalctl crit/2 entry:
>
> kernel: CPUX: Package temperature above threshold, cpu clock throttled
> (total events = Y) where X=[0-7] (I have a quad-core i7 with
> hyperthreading activated) and Y is generally a 4-digit number.
>

While I'm very much a junior member of the team, I have noted similar
behaviour on a laptop as old as an X220 and wonder if it might be due to
Spectre mitigations.  I have not noticed such log entries after
disabling hyperthreading.  FYI, as someone who works with realtime audio
I've found that hyperthreading is detrimental to worst-case latency.
eg: after many hours trying to work around troublesome unpredictable
latency spikes the solution to disable hyperthreading emerged as the
simplest solution.

> There are different threads about it on the Internet. Some seem to
> indicate a kernel issue, others say it's a sensor or CPU paste fault
> and others don't explain anything and recommend changing the
> threshold or deactivating the alerts themselves (!).
>

Yeah, you're right, that seems ill-advised, even if one can depend on
the EC to shut down the laptop in case of thermal overload.  Have you
tried disabling CPU freq boost?  When the ambient temperature is above
27°C my X220 and X230 need to have boost disabled to avoid
overheating/throttling.

> Are you aware of anything Debian-kernel-related please (I didn't see
> any such bug at
> https://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=linux-image-4.19.0-6-amd64
> though) that could help me regarding that matter, considering the fact
> I'm on Debian 10 (kernel 4.19.67-2+deb10u2) with an amd64 Lenovo
> ThinkPad X390?  NB: My laptop is brand new, so I'm not really
> convinced of a hardware issue at this stage (it cannot be totally
> excluded though).
>

Sorry, I'm not aware of any, but someone else might be!

> Thank you in advance and Happy GNU Year! :)
> Best regards,
> l0f4r0

Thanks, you too! :-)


Best,
Nicholas




Recurrent alerts "Package temperature above threshold, cpu clock throttled"

2020-01-05 Thread l0f4r0
Hi team,

Multiple times per hour (despite no notable CPU-consuming activity), I get 
something like the following journalctl crit/2 entry:

kernel: CPUX: Package temperature above threshold, cpu clock throttled (total 
events = Y)
where X=[0-7] (I have a quad-core i7 with hyperthreading activated) and Y is 
generally a 4-digit number.

There are different threads about it on the Internet. Some seem to indicate a 
kernel issue, others say it's a sensor or CPU paste fault and others don't 
explain anything and recommend changing the threshold or deactivating the alerts 
themselves (!).

Are you aware of anything Debian-kernel-related please (I didn't see any such 
bug at 
https://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=linux-image-4.19.0-6-amd64 
though) that could help me regarding that matter, considering the fact I'm on 
Debian 10 (kernel 4.19.67-2+deb10u2) with an amd64 Lenovo ThinkPad X390?
NB: My laptop is brand new, so I'm not really convinced of a hardware issue at 
this stage (it cannot be totally excluded though).

Thank you in advance and Happy GNU Year! :)
Best regards,
l0f4r0