https://bugzilla.kernel.org/show_bug.cgi?id=36182

Martin Mokrejs <mmokr...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mmokr...@gmail.com

--- Comment #39 from Martin Mokrejs <mmokr...@gmail.com> ---
I landed on this thread because I got this message on 3.10.9. While reading
it, I realized the bug is probably about too many such lines in syslog, which
is NOT my case. Still, since it seems the code was almost dropped from the
kernel, I would like to add my comments.

Several reports here seem to be about laptops, specifically SandyBridge-based
laptops. I have yet another one, an i7-2640M. I noticed that if I disable my
hyper-threaded cores:

echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online

my single-threaded applications run faster (I run two instances just to fill
the physical cores). According to the i7z tool I was reaching temperatures of
95-98 °C, while with all 4 logical cores enabled it never reached those
temperatures and the processing speed/throughput was lower. Disabling the HT
cores also had one other effect: the physical cores could run at higher
boosted speeds, which of course heated the CPU up more. I would say that was
the right way to test my cooling.
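
Which logical CPUs are hyper-thread siblings depends on how the kernel
enumerates them, so cpu2 and cpu3 are not necessarily the HT threads on every
machine. A minimal Python sketch, assuming the sysfs file
topology/thread_siblings_list is present for each online CPU, that lists the
logical CPUs which are not the first thread of their core (the function names
are made up for illustration):

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0,2' or '0-1' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def secondary_threads(siblings_by_cpu):
    """Return CPUs that are not the lowest-numbered thread of their core."""
    extra = set()
    for text in siblings_by_cpu.values():
        extra.update(parse_cpu_list(text)[1:])
    return sorted(extra)


def read_siblings(root="/sys/devices/system/cpu"):
    """Map CPU number -> raw thread_siblings_list content (online CPUs only)."""
    result = {}
    for path in Path(root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(path.parent.parent.name[3:])  # "cpu3" -> 3
        result[cpu] = path.read_text()
    return result


if __name__ == "__main__":
    # Candidates for "echo 0 > /sys/devices/system/cpu/cpuN/online"
    print(secondary_threads(read_siblings()))
```

The script only prints candidates; taking a CPU offline is still done by hand
with the echo commands shown above.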

I was glad the kernel reported:

[ 1092.103952] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1092.103954] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 1092.103957] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1092.104931] CPU1: Core temperature/speed normal
[ 1092.104933] CPU0: Package temperature/speed normal
[ 1092.104936] CPU1: Package temperature/speed normal
[ 1201.614297] mce: [Hardware Error]: Machine check events logged
[ 1395.598163] CPU1: Core temperature above threshold, cpu clock throttled (total events = 21680)
[ 1395.598169] CPU1: Package temperature above threshold, cpu clock throttled (total events = 22191)
[ 1395.598190] CPU0: Package temperature above threshold, cpu clock throttled (total events = 22191)
[ 1395.599169] CPU1: Core temperature/speed normal
[ 1395.599171] CPU1: Package temperature/speed normal
[ 1395.599176] CPU0: Package temperature/speed normal
[ 1502.016525] mce: [Hardware Error]: Machine check events logged
[ 1698.841500] CPU1: Core temperature above threshold, cpu clock throttled (total events = 46139)
[ 1698.841503] CPU0: Package temperature above threshold, cpu clock throttled (total events = 47504)
[ 1698.841506] CPU1: Package temperature above threshold, cpu clock throttled (total events = 47504)
[ 1698.842526] CPU0: Package temperature/speed normal
[ 1698.842528] CPU1: Core temperature/speed normal
[ 1698.842529] CPU1: Package temperature/speed normal
[ 1952.545072] mce: [Hardware Error]: Machine check events logged
[ 1999.213823] CPU0: Package temperature/speed normal
[ 1999.213826] CPU1: Core temperature/speed normal
[ 1999.213829] CPU1: Package temperature/speed normal
[ 2102.731048] mce: [Hardware Error]: Machine check events logged

[15125.078769] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1015693)
[15125.078771] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1048803)
[15125.078776] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1048803)
[15125.079794] CPU1: Core temperature/speed normal
[15125.079796] CPU0: Package temperature/speed normal
[15125.079798] CPU1: Package temperature/speed normal
[15425.600586] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1041101)
[15425.600588] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1075009)
[15425.600593] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1075009)
[15425.601591] CPU1: Core temperature/speed normal
[15425.601593] CPU0: Package temperature/speed normal
[15425.601596] CPU1: Package temperature/speed normal
[15725.995979] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1064631)
[15725.995983] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1099299)
[15725.995986] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1099299)
[15725.996987] CPU1: Core temperature/speed normal
[15725.996989] CPU0: Package temperature/speed normal
[15725.996991] CPU1: Package temperature/speed normal
[16301.492089] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1066448)
[16301.492091] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1101154)
[16301.492096] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1101154)
[16301.493096] CPU1: Core temperature/speed normal
[16301.493098] CPU0: Package temperature/speed normal
[16301.493098] CPU1: Package temperature/speed normal
[16607.731994] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1069217)
[16607.731999] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1104055)
[16607.732006] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1104055)
[16607.732958] CPU1: Core temperature/speed normal
[16607.732959] CPU1: Package temperature/speed normal
[16607.732982] CPU0: Package temperature/speed normal
[21761.864712] r8169 0000:05:00.0 enp5s0: link down
[21763.550884] r8169 0000:05:00.0 enp5s0: link up
[25099.840780] conftest[4957]: segfault at 0 ip 0000000000400570 sp 00007fff7a800450 error 4 in conftest[400000+1000]
[25100.156282] conftest[4981]: segfault at 0 ip 00007f674705bef6 sp 00007fffbec472d8 error 4 in libc-2.17.so[7f6746f39000+1a2000]
[25187.711268] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1071206)
[25187.711270] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1106064)
[25187.711275] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1106064)
[25187.712288] CPU1: Core temperature/speed normal
[25187.712290] CPU0: Package temperature/speed normal
[25187.712293] CPU1: Package temperature/speed normal
[25208.899908] conftest[15966]: segfault at 0 ip 0000000000400570 sp 00007fff169ea510 error 4 in conftest[400000+1000]
[25209.230326] conftest[15990]: segfault at 0 ip 00007f22fc360ef6 sp 00007fff8918c5d8 error 4 in libc-2.17.so[7f22fc23e000+1a2000]
[25288.933545] conftest[27177]: segfault at 0 ip 0000000000400570 sp 00007fffcb660da0 error 4 in conftest[400000+1000]
[25289.180398] conftest[27209]: segfault at 0 ip 00007f213482aef6 sp 00007fffd657a1e8 error 4 in libc-2.17.so[7f2134708000+1a2000]
[25488.084595] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1087585)
[25488.084598] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1123071)
[25488.084602] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1123071)
[25488.097613] CPU1: Core temperature/speed normal
[25488.097615] CPU0: Package temperature/speed normal
[25488.097621] CPU1: Package temperature/speed normal
[25788.465891] CPU1: Core temperature/speed normal
[25788.465893] CPU0: Package temperature/speed normal
[25788.465898] CPU1: Package temperature/speed normal
[26088.838199] CPU1: Core temperature/speed normal
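
The "total events" counters in this log keep growing between messages, while
the messages themselves show up only about once every 300 seconds. A rough
Python sketch that extracts the counters and computes the average event rate
between consecutive messages; the function names are hypothetical and the
regular expression is an assumption based only on the message format quoted
above (it tolerates the line wrap before "(total events"):

```python
import re

# Matches "[ 1395.598163] CPU1: Core temperature above threshold,
# cpu clock throttled (total events = 21680)", wrapped or not.
THROTTLE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+CPU(\d+):\s+(Core|Package) temperature above"
    r"\s+threshold,\s+cpu clock throttled\s+\(total events = (\d+)\)"
)


def throttle_samples(log_text):
    """Return (timestamp, cpu, scope, total_events) tuples found in log_text."""
    return [
        (float(t), int(cpu), scope, int(n))
        for t, cpu, scope, n in THROTTLE_RE.findall(log_text)
    ]


def events_per_second(samples, cpu, scope):
    """Average event rate between consecutive messages of one counter."""
    picked = [(t, n) for t, c, s, n in samples if c == cpu and s == scope]
    return [
        (t1, (n1 - n0) / (t1 - t0))
        for (t0, n0), (t1, n1) in zip(picked, picked[1:])
    ]
```

For the first two CPU1 core messages above (1 event at about 1092 s, 21680
events at about 1395 s) this works out to roughly 71 throttle events per
second.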


mcelog said:
Hardware event. This is not a software error.
MCE 5
CPU 1 THERMAL EVENT TSC 1b06d3491f5a 
TIME 1375546009 Sat Aug  3 18:06:49 2013
Processor 1 below trip temperature. Throttling disabled
STATUS 8801028a MCGSTATUS 0
MCGCAP c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 42
Hardware event. This is not a software error.
MCE 6
CPU 1 THERMAL EVENT TSC 1c36a76eacb8 
TIME 1375546476 Sat Aug  3 18:14:36 2013
Processor 1 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 880003cb MCGSTATUS 0
MCGCAP c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 42
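
The STATUS values in the two mcelog records can be decoded by hand. The sketch
below assumes that for thermal events STATUS carries the raw IA32_THERM_STATUS
MSR value, and that the bit layout is the one from the Intel SDM (bit 0 =
thermal status, bit 1 = sticky thermal status log, bits 22:16 = digital
readout in degrees below TjMax, bit 31 = reading valid). Neither assumption is
confirmed by anything in this thread, and the function name is made up:

```python
def decode_therm_status(value):
    """Decode a few IA32_THERM_STATUS fields (assumed layout, see above)."""
    return {
        "throttling_active": bool(value & 0x1),        # bit 0
        "throttling_logged": bool(value & 0x2),        # bit 1, sticky
        "degrees_below_tjmax": (value >> 16) & 0x7F,   # bits 22:16
        "reading_valid": bool((value >> 31) & 0x1),    # bit 31
    }


if __name__ == "__main__":
    for raw in (0x8801028A, 0x880003CB):
        print(hex(raw), decode_therm_status(raw))
```

Under those assumptions 8801028a decodes as "not throttling, 1 degree below
TjMax" and 880003cb as "throttling, at TjMax", which agrees with the text
mcelog printed for the two records.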



My external LCD, connected via a DVI cable to the i7 processor with its
built-in graphics chip, was sometimes blinking 4-5x within a 1-minute window.
I believe that was caused by the CPU being throttled or stepped back up to
higher speeds. Thanks to these messages Dell replaced my CPU cooler and
motherboard. The technician also put more thermal glue onto the CPU.


Now, with only the 2 physical cores enabled, I reach temperatures of 64-76 °C,
have fewer segmentation faults, and see no external LCD blinking at all. The
messages about CPU throttling are gone, but (finally getting to the subject of
this thread) I now get:

[98101.254002] CPU1: Package power limit notification (total events = 2)
[98101.254004] CPU0: Package power limit notification (total events = 2)
[98101.255084] CPU1: Package power limit normal
[98101.255085] CPU0: Package power limit normal
[111450.779762] binaryurpReader[30277]: segfault at a0 ip 00007fc9d6aa20c7 sp 00007fc9efffe280 error 4 in libfwllo.so[7fc9d6a6b000+7e000]
[133993.045662] soffice.bin[7553]: segfault at 18 ip 00007f935f72641a sp 00007fff8d249ae0 error 4 in libvclplug_gtklo.so[7f935f6bb000+c1000]
[208645.410300] CPU1: Package power limit notification (total events = 7)
[208645.410302] CPU0: Package power limit notification (total events = 7)
[208645.421172] CPU0: Package power limit normal
[208645.421175] CPU1: Package power limit normal


You see, although my CPU is no longer overheating much (new cooler and more
thermal glue), I still have some issues. Or maybe it just tells me that I am
really squeezing the maximum power out of the CPU? So, a good sign in my case?
Or would you say that my CPU has bad silicon and is still heating up too much,
beyond the spec? Aren't all of us with SandyBridge laptops and CPUs at high
frequency having a cooling issue?

Either way, I am not disturbed by these messages, but I wonder why I never
previously saw the "Package power limit notification" with definitely faulty
cooling. The messages are helpful for studying what is going on under the hood.
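
The counters that appear as "total events" in these messages should also be
readable from sysfs, which makes it possible to watch them without grepping
dmesg. A small Python sketch, assuming the per-CPU thermal_throttle directory
and its *_count files exist on the running kernel; the exact file names vary
between kernel versions, so the code simply reads whatever *_count files it
finds, and the function name is made up:

```python
from pathlib import Path


def read_throttle_counters(root="/sys/devices/system/cpu"):
    """Return {"cpuN": {"<name>_count": int, ...}} for every CPU found."""
    counters = {}
    pattern = "cpu[0-9]*/thermal_throttle/*_count"
    for path in sorted(Path(root).glob(pattern)):
        cpu = path.parent.parent.name  # e.g. "cpu0"
        counters.setdefault(cpu, {})[path.name] = int(path.read_text())
    return counters


if __name__ == "__main__":
    for cpu, values in read_throttle_counters().items():
        for name, count in sorted(values.items()):
            print(f"{cpu} {name} = {count}")
```

Running it before and after a heavy job shows which counter actually moved,
for example whether only the power limit counter increased while the
temperature throttle counters stayed put.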

-- 
You are receiving this mail because:
You are on the CC list for the bug.

_______________________________________________
acpi-bugzilla mailing list
acpi-bugzilla@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/acpi-bugzilla
