Public bug reported:

I tried Intrepid's kernel on my Hardy Core 2 Duo system and ran into a serious 
problem: sometimes the kernel would stop receiving keyboard events.  I think 
this behaviour is related to CPU load and a faulty temperature reading.  For 
example, after pressing "+" in aptitude (triggering some dependency 
calculations), the key release event would go unnoticed, as if the key were 
stuck.  I was in X running gnome-terminal, so I could click on another tab 
(the mouse was unaffected), and the "+" characters would be visible.  (That's 
how I'm sure it wasn't some other sort of hang.)  
The kernel log shows messages like
INFO: task xfsdatad/0:7209 blocked for more than 120 seconds.
...stack backtrace
...
INFO: task java:11768 blocked for more than 120 seconds.
...
(full log attached, see below).
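
For anyone going through the attached log, here is a minimal sketch of pulling 
the hung-task warnings out of it.  The pattern only covers the message format 
quoted above; the function name and the sample text are just for illustration.

```python
import re

# Matches the hung-task warning in the form quoted in this report, e.g.
#   INFO: task java:11768 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task warning found."""
    return [
        (m["name"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]

# Sample input copied from the two warnings quoted above.
sample = """\
INFO: task xfsdatad/0:7209 blocked for more than 120 seconds.
INFO: task java:11768 blocked for more than 120 seconds.
"""

for name, pid, secs in find_hung_tasks(sample):
    print(name, pid, secs)
```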

 I've never seen anything like that before, in over 2 years of stable
operation on this hardware.


 2.6.27... ran fine overnight and through a day (without interactive use except 
a bit soon after boot) just running azureus (java bittorrent).  I only started 
to notice problems after 7-zipping something and making the CPU work.  After I 
noticed the pausing behaviour, I thought the machine might be overheating, and 
sensors (reading the coretemp module) showed my idle CPU temp at ~60C and my 
load CPU temp at ~80C, even when I turned up my CPU and case fans to full speed. 
 Normally I max out at 70C with fans at slow speed, running two instances of 
burnP6.  In case it matters, I hadn't updated module-init-tools to the Intrepid 
version yet.  I'm pretty sure I was able to reproduce the key sticking after 
doing that and booting again, but I didn't run long enough to see any 
backtraces in the kernel log.

 It turns out that Hardy's 2.6.24 reads 15C cooler than Intrepid's 2.6.27, and 
that the BIOS on my mobo agrees with the lower number: idle at ~45C.  There was 
a big update to the coretemp driver between Hardy and Intrepid, and I bet it's 
the cause of the change in behaviour.
(Changelog-2.6.25)
commit bfe38ccf8d0b541f387f65267f6f3794be59233a
Merge: 20f8d2a... 25e9c86...
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Thu Feb 21 16:37:42 2008 -0800

    Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6
    
    * 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6:
      hwmon: normal_i2c arrays should be const
      hwmon: New driver for Analog Devices ADT7473 sensor chip
      hwmon: (coretemp) Add Penryn CPU to coretemp
      hwmon: (coretemp) Add TjMax detection for mobile CPUs
      hwmon: (applesmc) sensors set for MacBook2
      hwmon: (thmc50) Storage class should be before const qualifier
      hwmon: (coretemp) fix section mismatch warning
      hwmon: (coretemp) Add maximum cooling temperature readout
      hwmon: (adm1026) Properly terminate sysfs groups
      hwmon: (vt8231) Update maintainer email address
      hwmon: (vt8231) Add individual alarm files
      hwmon: (via686a) Add individual alarm files
      hwmon: (smsc47m1) Add individual alarm files
      hwmon: (max1619) Add individual alarm and fault files
      hwmon: (lm92) Add individual alarm files

coretemp has to guess how to interpret the temp number it gets from the 
hardware, hence this message in the kernel log:
coretemp coretemp.0: Using relative temperature scale!
With 2.6.24, the output of sensors includes:
coretemp-isa-0000
Adapter: ISA adapter
Core 0:      +42.0°C  (crit = +85.0°C)
On 2.6.27, the output was something like:
Core 0:      +60.0°C  (high = +86°C)  (crit = +100.0°C)
and similar for Core 1.
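
To illustrate why a different guess would shift every reading by a fixed 
amount: the on-die sensor reports how far the core is below its maximum 
junction temperature (TjMax), and the driver subtracts that from whatever 
TjMax it assumes.  The sketch below assumes the "crit" limits printed above 
(85C and 100C) reflect the TjMax each kernel assumed; the raw offset of 43 is a 
made-up value picked to match the 2.6.24 reading, not something read from my 
hardware.

```python
def core_temp_c(tjmax_c, degrees_below_tjmax):
    """Convert a relative (below-TjMax) sensor reading to degrees C."""
    return tjmax_c - degrees_below_tjmax

# Hypothetical raw reading: the sensor says "43 degrees below TjMax".
raw_offset = 43

# ASSUMPTION: TjMax guesses taken from the "crit" values in the sensors output.
old_guess = core_temp_c(85, raw_offset)    # as 2.6.24 would report it
new_guess = core_temp_c(100, raw_offset)   # as 2.6.27 would report it

print(old_guess, new_guess, new_guess - old_guess)  # prints: 42 57 15
```

The same raw reading comes out 15C apart, which is the gap between the two 
"crit" limits.  The idle readings I saw (42 vs ~60) were taken at different 
times, so they don't have to differ by exactly that much.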
I don't know why a bogus temp reading would make the system go crazy and hang 
for a long time, unless this temp is somehow communicated to ACPI, or Linux 
does something to try to throttle.
My BIOS doesn't have ACPI thermal support: /proc/acpi/thermal_zone/ is empty, 
even though thermal.ko is loaded.

 My hardware is an Intel DG965WH motherboard (g965 chipset) with
BIOS Version: MQ96510J.86A.1751.2008.0811.0002        Release Date: 08/11/2008
E6600 CPU (C2D @ 2.4GHz, 4MB cache).  4GB of RAM.

 It's late and I might be forgetting something.  I'll look over this
later...

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
linux 2.6.27-2.3: coretemp reads 15C too hot, and keyboard is occasionally 
unresponsive
https://bugs.launchpad.net/bugs/264290
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs