Re: inteldrm changes cause high temperature / fan speeds

2020-02-25 Thread Tero Koskinen

Hi,

Alex Karle wrote on 25.2.2020 6.51:

> Hi Tero,
>
> Apologies if this breaks the threading -- I wasn't subscribed to misc@
> at the time the original was sent.
>
> Have you (or any others) dug any deeper into this?


My problem was more or less "solved" when the power supply of
my Optiplex died. Instead of fixing the supply, I simply recycled
the device and bought a PC Engines APU2.

I still have another, slightly newer Optiplex (running Linux), but
I am not sure when I will have time to test the latest OpenBSD on it
(it might take many months).

Yours,
 Tero





Re: inteldrm changes cause high temperature / fan speeds

2020-02-24 Thread Alex Karle
Hi Tero,

Apologies if this breaks the threading -- I wasn't subscribed to misc@
at the time the original was sent.

Have you (or any others) dug any deeper into this? I've spent a good few
hours reading different related threads, but haven't found any solutions,
and you seem to have come closest to narrowing down the search space
for the root cause.

I am also experiencing a similar heat problem on my X220. When idling, I
am consistently seeing ~50 degrees Celsius for the CPU. I've seen this with a
fresh install of 6.6 and more recently on -current.

I downgraded to 6.5 and the issue disappeared (with idle CPU temperatures
in the mid-thirties or so).

I should note that these temperatures hold for just idling in the
console (no X11).

Based on my symptoms, and your description, I have a hunch I might be
seeing the same issue you described.  In particular, the other reports
of heat seem to have been solved by tweaking the BIOS settings [1], but
I tried all the mentioned settings and more without any luck.

If anyone has any pointers on where to look as next steps (what part of
the 250k line diff might be problematic, troubleshooting steps, etc),
that would be very welcome too!

Thanks all for your time and help,
Alex

[1]: https://marc.info/?t=15738333783

dmesg:

OpenBSD 6.6-current (GENERIC.MP) #653: Thu Feb 20 21:40:37 MST 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4156157952 (3963MB)
avail mem = 4017606656 (3831MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xdae9c000 (64 entries)
bios0: vendor LENOVO version "8DET76WW (1.46 )" date 06/21/2018
bios0: LENOVO 4286CTO
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC SSDT SSDT SSDT HPET APIC MCFG ECDT ASF! TCPA SSDT SSDT UEFI UEFI UEFI
acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP4(S4) EXP7(S4) EHC1(S3) EHC2(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz, 797.55 MHz, 06-2a-07
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz, 797.42 MHz, 06-2a-07
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpiec0 at acpi0
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 2 (EXP1)
acpiprt3 at acpi0: bus 3 (EXP2)
acpiprt4 at acpi0: bus -1 (EXP4)
acpiprt5 at acpi0: bus 13 (EXP5)
acpiprt6 at acpi0: bus 14 (EXP7)
acpicpu0 at acpi0: C3(200@109 io@0x416), C2(500@80 io@0x414), C1(1000@1 halt), PSS
acpicpu1 at acpi0: C3(200@109 io@0x416), C2(500@80 io@0x414), C1(1000@1 halt), PSS
acpipwrres0 at acpi0: PUBS, resource for EHC1, EHC2
acpitz0 at acpi0: critical temperature is 99 degC
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
tpm0 at acpi0: TPM_ addr 0xfed4/0x5000, device 0x104a rev 0x4e
acpibat0 at acpi0: BAT0 model "42T4940" serial  5067 type LION oem "SANYO"
acpiac0 at acpi0: AC unit offline
acpithinkpad0 at acpi0: version 1.0
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
acpidock0 at acpi0: GDCK not docked (0)
acpivideo0 at acpi0: VID_
acpivout0 at acpivideo0: LCD0
acpivideo1 at acpi0: VID_
cpu0: using VERW MDS workaround (except on vmm entry)
cpu0: Enhanced SpeedStep 797 MHz: speeds: 2701, 2700, 2400, 2200, 2000, 1800, 1600, 1400, 1200, 1000, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core 2G Host" rev 0x09
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 3000" rev 0x09
drm0 at inteldrm0
inteldrm0: msi
"Intel 6 Series MEI" rev 0x04 at pci0 dev 22 function 0 not configured
em0 at pci0 dev 25 function 0 "Intel 82579LM" rev 0x04: msi, address f0:de:f1:66:e4:d2
ehci0 at pci0 dev 26 function 0 "Intel 6 Series USB" rev 0

Re: inteldrm changes cause high temperature / fan speeds

2019-11-13 Thread Tero Koskinen

Ted Unangst wrote on 13.11.2019 8.52:

> Tero Koskinen wrote:
> > Eventually I pinned the problem down to April 14/15:
> >
> > FAULTY 091f8f6587f dlg  Mon Apr 15 02:59:41 2019 +  the myx_cmd
> > FAULTY 1bbcb699ab8 dlg  Mon Apr 15 00:28:29 2019 +  there's a bunch
> > PROBLEM! 7f4dd37977d jsg  Sun Apr 14 10:14:50 2019 +  Update shared drm code
> > OK 505701c75b3 visa Sun Apr 14 08:51:31 2019 +  Add lock
> >
> > I must admit that I don't have yet any idea how to fix
> > the problematic commit (or what is actually wrong there).
>
> This is not too surprising. It's still a bit of a mystery what's different
> between machines that behave fine and those that don't.
>
> I have the same machine, and it's never been problematic.
>
> I note I'm at the same old bios I had when I first purchased it.
> bios0: vendor LENOVO version "N23ET61W (1.36 )" date 01/17/2019


Note that my device is a desktop computer (Dell Optiplex 990)
with an ultra small form factor (USFF) case, not a ThinkPad or other laptop.

I don't mind if the fan or CPU is running at 100%, but
I am worried about the temperature. 70C at idle and 80+ C in
use will kill the device sooner or later (small case,
not so good ventilation).

Otto Moerbeek wrote on 13.11.2019 8.25:
>
> If you run top -S, do you see any process taking lots of CPU?

Nothing suspicious. I have some daemons, but they are mostly idle.

load averages:  0.06,  0.03,  0.00    gurb.koti 16:45:10
104 processes: 102 idle, 2 on processor    up  0:28
CPU0 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr,  100% idle
CPU1 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr,  100% idle
CPU2 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr,  100% idle
CPU3 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  0.0% intr,  100% idle

Memory: Real: 322M/1296M act/tot Free: 2542M Cache: 587M Swap: 0K/8189M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
24353 root     -22    0    0K   11M sleep/3   -        28:19  0.00% idle3
38595 root     -22    0    0K   11M sleep/2   -        28:15  0.00% idle2
28685 root     -22    0    0K   11M sleep/1   -        28:13  0.00% idle1
29534 root     -22    0    0K   11M sleep/0   -        28:12  0.00% idle0
43437 _gitea    10    0  129M   73M onproc/3  thrslee   0:04  0.00% gitea
20526 root      10    0    0K   11M sleep/0   bored     0:01  0.00% softnet
    0 root     -18    0    0K   11M sleep/0   schedul   0:01  0.00% swapper
93389 root     -22    0    0K   11M sleep/0   bored     0:01  0.00% softclock
66152 root      10    0    0K   11M sleep/2   bored     0:01  0.00% systqmp
    1 root      10    0  476K  444K idle      wait      0:01  0.00% init
83453 root      10    0    0K   11M idle      bored     0:01  0.00% drmwq
49657 root      10    0    0K   11M idle      bored     0:01  0.00% drmwq
61541 root      10    0    0K   11M idle      usbatsk   0:01  0.00% usbatsk
51529 root      10    0    0K   11M idle      bored     0:01  0.00% drmlwq
 9367 root      10    0    0K   11M idle      bored     0:01  0.00% drmubwq
64175 root     -18    0    0K   11M idle      bored     0:01  0.00% smr
 2195 root      10    0    0K   11M idle      bored     0:01  0.00% crynlk
    7 root      10    0    0K   11M idle      bored     0:01  0.00% drmtskl
 2429 root      10    0    0K   11M idle      bored     0:01  0.00% drmlwq
38115 root      10    0    0K   11M idle      bored     0:01  0.00% drmubwq
40710 root     -18    0    0K   11M sleep/1   reaper    0:01  0.00% reaper
58981 root      68   20    0K   11M idle      pgzero    0:01  0.00% zerothread
18061 www        2    0   22M   27M sleep/1   select    0:01  0.00% python2.7
29049 tkoskine  28    0 1504K 3624K onproc/0  -         0:00  0.00% top
77959 _unbound   2    0   33M   26M sleep/1   kqread    0:00  0.00% unbound
 2955 www        2    0   17M   21M sleep/0   select    0:00  0.00% python2.7
45267 www        2    0   13M   17M sleep/0   select    0:00  0.00% python2.7
 5164 tkoskine   2    0 2128K 3068K sleep/0   kqread    0:00  0.00% tmux
63466 root       2    0 1456K 4152K idle      poll      0:00  0.00% sshd
63357 _nsd       2    0   99M   83M idle      kqread    0:00  0.00% nsd
92852 root      18    0    0K   11M sleep/0   syncer    0:00  0.00% update
48008 root       2    0 1616K 2044K sleep/0   poll      0:00  0.00% smbd
25978 root       2    0  800K  600K idle      kqread    0:00  0.00% slaacd
95670 _nsd       2    0   32M   32M idle      poll      0:00  0.00% nsd
36212 root      10    0    0K   11M idle      bored     0:00  0.00% i915
75047 root     -22    0    0K   11M idle      schto     0:00  0.00% i915/signal:2
53280 root     -22    0    0K   11M idle      schto     0:00  0.00% i915/signal:1
28187 root      10    0    0K   11M idle      bored     0:00  0.00% i915-userptr-acq
78907 root     -22    0    0K   11M idle      schto     0:00  0.

Re: inteldrm changes cause high temperature / fan speeds (was: Downgrade 6.6 to 6.5)

2019-11-12 Thread Ted Unangst
Tero Koskinen wrote:
> Eventually I pinned the problem down to April 14/15:
> 
> FAULTY 091f8f6587f dlg  Mon Apr 15 02:59:41 2019 +  the myx_cmd
> FAULTY 1bbcb699ab8 dlg  Mon Apr 15 00:28:29 2019 +  there's a bunch
> PROBLEM! 7f4dd37977d jsg  Sun Apr 14 10:14:50 2019 +  Update shared drm code
> OK 505701c75b3 visa Sun Apr 14 08:51:31 2019 +  Add lock
> 
> I must admit that I don't have yet any idea how to fix
> the problematic commit (or what is actually wrong there).

This is not too surprising. It's still a bit of a mystery what's different
between machines that behave fine and those that don't.

I have the same machine, and it's never been problematic.

I note I'm at the same old bios I had when I first purchased it.
bios0: vendor LENOVO version "N23ET61W (1.36 )" date 01/17/2019

There are also some other BIOS options (BIOS/EFI boot mode, Thunderbolt,
suspend) that can be set one way or the other. I have everything turned down to
whatever the "oldest" settings are: CSM boot, etc.



Re: inteldrm changes cause high temperature / fan speeds (was: Downgrade 6.6 to 6.5)

2019-11-12 Thread Otto Moerbeek
On Wed, Nov 13, 2019 at 08:19:15AM +0200, Tero Koskinen wrote:

> Hi,
> 
> Sorry if someone gets this twice. My first version didn't go to the list.
> 
> cho...@jtan.com wrote on 6.11.2019 19.52:
> > Theo de Raadt writes:
> >> I have some sort of X1rev6 and I don't see the problem.
> >>
> >> The situation is you have the hardware, and you also have the sourcecode,
> >> and the repository to traverse to investigate the problem.
> >>
> >> That sounds hard, until you give it a try.
> > 
> > To be fair, it *is* hard. 
> 
> I have the same problem on my Dell Optiplex 990 (running in "headless" mode,
> no monitor attached!).
> 
> After upgrading from 6.5-stable to 6.6-current, the CPU temperature
> increased from 50C to 70-80C and the fans are running at full speed.
> 
> So, I went and cloned OpenBSD src tree from https://github.com/openbsd/src/
> 
> Then I started bisecting kernel commits from 6.5-release to 6.6-current:
>  > ls -1 kernels
> bsd.apr10.a72c25aac8e43fe
> bsd.apr14.505701c75b30a46033a8
> bsd.apr23.53d03815630664
> bsd.apr27.c1f77a6b17d5a799d322
> bsd.apr29.47170b90f4a74
> bsd.apr5.b2516e1f98d4a5f7757
> bsd.jul31.ddc1a6c2c17
> bsd.jun.b2a28ec4ea
> bsd.jun1.535cf6c2b
> bsd.may1.4a0e86bfb04cce9
> bsd.may19.01b2b04ad452620a32
>  >
> 
> (Note the list isn't complete. I noticed that the kernel became backwards
> incompatible at some point in April, so I had to create a temporary
> 6.5 installation on another disk.)
> 
> Eventually I pinned the problem down to April 14/15:
> 
> FAULTY 091f8f6587f dlg  Mon Apr 15 02:59:41 2019 +  the myx_cmd
> FAULTY 1bbcb699ab8 dlg  Mon Apr 15 00:28:29 2019 +  there's a bunch
> PROBLEM! 7f4dd37977d jsg  Sun Apr 14 10:14:50 2019 +  Update shared drm code
> OK 505701c75b3 visa Sun Apr 14 08:51:31 2019 +  Add lock
> 
> I must admit that I don't have yet any idea how to fix
> the problematic commit (or what is actually wrong there).
> 
> Reverting 7f4dd37977d didn't work for the latest 6.6-current as there
> have been too many changes after that.
> 
> I also tried to check the latest changes from linux-4.19.y,
> but didn't spot anything useful.
> 
> Yours,
>   Tero
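
For readers following along, the bisection described in the quoted message
can be sketched roughly as below. This is only an illustration in Python, not
what Tero actually ran: the first_bad helper and the known_bad set are
hypothetical names, and the set stands in for the manual step of building a
kernel at a commit, booting it, and watching the temperature. The commit
hashes are the four listed in the quoted table.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is known
    good, the last one is known bad, and the problem never goes away again
    once it has appeared (the usual bisection assumption).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # problem already present, look earlier
        else:
            lo = mid  # still fine, look later
    return commits[hi]


# The four commits from the quoted table, oldest first.
commits = ["505701c75b3", "7f4dd37977d", "1bbcb699ab8", "091f8f6587f"]

# Hypothetical stand-in for "build this kernel, boot it, check the heat".
known_bad = {"7f4dd37977d", "1bbcb699ab8", "091f8f6587f"}

print(first_bad(commits, lambda c: c in known_bad))  # prints 7f4dd37977d
```

Each round halves the remaining range, so even a few thousand commits need
only a dozen or so kernel builds, provided every intermediate kernel boots.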

If you run top -S, do you see any process taking lots of CPU?

-Otto
> 
> 
> dmesg:
> OpenBSD 6.6-current (GENERIC.MP) #452: Mon Nov 11 19:08:23 MST 2019
>  dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 4153880576 (3961MB)
> avail mem = 4015665152 (3829MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xf2650 (71 entries)
> bios0: vendor Dell Inc. version "A19" date 08/26/2015
> bios0: Dell Inc. OptiPlex 990
> acpi0 at bios0: ACPI 4.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP APIC TCPA SSDT MCFG HPET BOOT SSDT SSDT DMAR SLIC
> acpi0: wakeup devices EHC1(S3) EHC2(S3) HDEF(S4) GLAN(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) RP06(S4) PXSX(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i5-2400S CPU @ 2.50GHz, 2494.69 MHz, 06-2a-07
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Core(TM) i5-2400S CPU @ 2.50GHz, 2494.35 MHz, 06-2a-07
> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Core(TM) i5-2400S CPU @ 2.50GHz, 2494.35 MHz, 06-2a-07
> cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Core(TM) i5-2400S CPU @ 2.50GHz, 2494.35 MHz, 06-2a-07
> cpu3: 
> FPU,VME,