Cpufreq/ACPI problem? (basically still is: Re: Problem with IBM Thinkpad T30 shutting down due to high temperatures)

2009-08-12 Thread Christian Walther
Hi,

thank you for all your feedback.
I won't answer all replies in detail, but will summarise what I did to
give you some sort of report.
Doug made me think about how this situation began. I can't tell
you for sure that the T30 ever worked flawlessly, because I took the
original install from another, older ThinkPad.
But I did change some BIOS settings, mainly interrupt settings, that
seemed to cause problems with my wireless NIC in the past. So I restored
the BIOS defaults. This seems to make the problem disappear, but to be
honest: I'm not sure whether I messed up the ACPI table at all, or whether
this is some sort of performance issue, because I now have all I/O-bound
devices on IRQ 11:

vgapci0: VGA-compatible display port 0x3000-0x30ff mem 0xe800-0xefff,0xd010-0xd010 irq 11 at device 0.0 on pci1
uhci0: Intel 82801CA/CAM (ICH3) USB controller USB-A port 0x1800-0x181f irq 11 at device 29.0 on pci0
uhci1: Intel 82801CA/CAM (ICH3) USB controller USB-B port 0x1820-0x183f irq 11 at device 29.1 on pci0
uhci2: Intel 82801CA/CAM (ICH3) USB controller USB-C port 0x1840-0x185f irq 11 at device 29.2 on pci0
cbb0: TI1520 PCI-CardBus Bridge mem 0x5000-0x5fff irq 11 at device 0.0 on pci2
cbb1: TI1520 PCI-CardBus Bridge mem 0x5100-0x51000fff irq 11 at device 0.1 on pci2
fxp0: Intel 82801CAM (ICH3) Pro/100 VE Ethernet port 0x8000-0x803f mem 0xd020-0xd0200fff irq 11 at device 8.0 on pci2
pcm0: Intel ICH3 (82801CA) port 0x1c00-0x1cff,0x18c0-0x18ff irq 11 at device 31.5 on pci0

This causes screen refresh problems (e.g. urxvt isn't able to draw new
lines as expected). Still, this didn't resolve the issue, so I took a
look at acpi_thermal.
Right now I have the following set in /etc/sysctl.conf

hw.acpi.thermal.user_override=1
hw.acpi.thermal.tz0._PSV=84.0C
hw.acpi.thermal.polling_rate=2

This basically gives me:

# sysctl -a | egrep '(temp|freq|acpi.therm|acpi_ibm.*fan)'
kern.acct_chkfreq: 15
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.TSC.frequency: 20
net.inet.sctp.sack_freq: 2
net.inet6.ip6.use_tempaddr: 0
net.inet6.ip6.temppltime: 86400
net.inet6.ip6.tempvltime: 604800
net.inet6.ip6.prefer_tempaddr: 0
debug.cpufreq.verbose: 0
debug.cpufreq.lowest: 0
hw.acpi.thermal.min_runtime: 0
hw.acpi.thermal.polling_rate: 2
hw.acpi.thermal.user_override: 1
hw.acpi.thermal.tz0.temperature: 62.0C
hw.acpi.thermal.tz0.active: 0
hw.acpi.thermal.tz0.passive_cooling: 1
hw.acpi.thermal.tz0.thermal_flags: 0
hw.acpi.thermal.tz0._PSV: 84.0C
hw.acpi.thermal.tz0._HOT: -1
hw.acpi.thermal.tz0._CRT: 92.0C
hw.acpi.thermal.tz0._ACx: -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
hw.acpi.thermal.tz0._TC1: 5
hw.acpi.thermal.tz0._TC2: 3
hw.acpi.thermal.tz0._TSP: 600
machdep.acpi_timer_freq: 3579545
machdep.tsc_freq: 20
machdep.i8254_freq: 1193182
dev.acpi_ibm.0.fan_speed: 4465
dev.acpi_ibm.0.fan_level: 0
dev.acpi_ibm.0.fan: 1
dev.cpu.0.freq: 2000
dev.cpu.0.freq_levels: 2000/0 1750/0 1500/0 1250/0 1200/0 1050/0 900/0 750/0 600/0 450/0 300/0
dev.acpi_perf.0.freq_settings: 2000/0 1200/0
dev.cpufreq.0.%driver: cpufreq
dev.cpufreq.0.%parent: cpu0
dev.p4tcc.0.freq_settings: 1/-1 8750/-1 7500/-1 6250/-1 5000/-1 3750/-1 2500/-1

Active cooling doesn't seem to be supported. There is a fan, of course,
and I can even set a fan level via dev.acpi_ibm.0.fan, but this is not
related to hw.acpi.thermal.tz0._HOT and hw.acpi.thermal.tz0._ACx
(which are read-only anyway).
According to dev.acpi_ibm.0.fan_speed, the fan runs at
somewhere between 4450 and 4780 rpm.

The interesting bit here is cpufreq and how it behaves. Let's have a
look at the output of the following loop (zsh subscript syntax):
# while true ; do temp=$( sysctl hw.acpi.thermal.tz0.temperature ) ;
    freq=$( sysctl dev.cpu.0.freq ) ;
    printf "%4s %4s\n" $freq[17,$#freq] $temp[34,$#temp] ; sleep 2 ; done
2000 84.0C
2000 85.0C
2000 85.0C
2000 85.0C
2000 86.0C
 300 86.0C
 300 86.0C
 300 86.0C
 300 85.0C
 300 84.0C
 300 82.0C
 300 81.0C

It appears that cpufreq needs at least eight seconds to reduce the
frequency. I see two issues here. Firstly,
hw.acpi.thermal.polling_rate is 2: either I get this one wrong, or
cpufreq doesn't react after every poll. I've seen this in the past,
but never as clearly as now. Secondly, cpufreq doesn't seem to use the
intermediate steps in dev.cpu.0.freq_levels at all, but drops straight
to the lowest frequency available.
And it does the same the other way round, too.
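
For anyone who wants to repeat the measurement without zsh, here is a
minimal POSIX sh sketch of the same sampling loop. It assumes a FreeBSD
system that exposes the two sysctls shown above; the function names
sample_line and watch_thermal are made up for this example.

```shell
#!/bin/sh
# Print one aligned sample line: CPU frequency (MHz) and temperature.
sample_line() {
    printf '%4s %s\n' "$1" "$2"
}

# Poll every $1 seconds (default 2) for $2 samples (default 10).
# Assumption: dev.cpu.0.freq and hw.acpi.thermal.tz0.temperature exist,
# as in the sysctl output quoted in this post.
watch_thermal() {
    interval=${1:-2}
    count=${2:-10}
    while [ "$count" -gt 0 ]; do
        freq=$(sysctl -n dev.cpu.0.freq)
        temp=$(sysctl -n hw.acpi.thermal.tz0.temperature)
        sample_line "$freq" "$temp"
        count=$((count - 1))
        sleep "$interval"
    done
}
```

Calling "watch_thermal 2 30" would take thirty samples two seconds
apart. Using sysctl -n avoids the character-offset slicing, since it
prints only the value.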

I was able to build a new userland and kernel yesterday, so I'll do
some more testing with a decent system after a clean reboot. The
kernel I want to use next time will be plain GENERIC. This does not
turn on support for active cooling in any way, something I was
thinking about because, according to acpi_ibm, fan levels from 0 to 7
are supported. And setting them manually works, so I guess this should
be possible with acpi_thermal, too. Or am I mistaken and acpi_thermal
and acpi_ibm don't interact with each other?
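
Until that question is answered, a userland workaround is conceivable.
The following is only a rough sketch: the temperature thresholds are
invented for illustration, the function names are hypothetical, and it
assumes that dev.acpi_ibm.0.fan=0 selects manual fan mode as described
in acpi_ibm(4). Running a fan in manual mode can overheat the machine,
so treat this with care.

```shell
#!/bin/sh
# Map a temperature in whole degrees Celsius to an acpi_ibm fan level (0-7).
# The thresholds are hypothetical and only illustrate the idea.
fan_level_for_temp() {
    t=$1
    if   [ "$t" -ge 85 ]; then echo 7
    elif [ "$t" -ge 75 ]; then echo 5
    elif [ "$t" -ge 65 ]; then echo 3
    else echo 1
    fi
}

# Read the zone temperature and keep only the leading digits (62.0C -> 62).
current_temp() {
    sysctl -n hw.acpi.thermal.tz0.temperature | sed -E 's/[^0-9].*$//'
}

# One control step: switch the fan to manual mode, then apply the level.
# Assumption: fan=0 means manual mode and fan_level accepts 0..7.
fan_step() {
    level=$(fan_level_for_temp "$(current_temp)")
    sysctl dev.acpi_ibm.0.fan=0 >/dev/null &&
        sysctl dev.acpi_ibm.0.fan_level="$level" >/dev/null
}
```

Calling fan_step from a loop with a sleep would emulate a crude form of
active cooling; setting dev.acpi_ibm.0.fan=1 should hand control back
to the firmware.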

The interesting bit here is cpufreq: Is the behaviour 

Re: 8.0-BETA2, console freezes

2009-08-12 Thread Eric Masson
Ed Schouten e...@80386.nl writes:

Hi Ed,

Back from vacation...

 I have also seen this on some of the systems I use myself, where
 switching VTs locks up the video for a second or two. It seems to be
 unrelated to any of my Syscons and TTY changes, because I have also
 experienced this before I worked on MPSAFE TTY.

Ok.

 Happily enough, this issue isn't present in my own console driver,
 because it doesn't reprogram the graphics hardware when switching
 virtual terminals, which I suspect it is related to.

I can't switch to another graphics card (it is integrated on the board)...
Is there a maintainer for graphics support at the moment? If so, I think
I'll bother him ;)

Regards

Éric

-- 
 MH: As for the Mac, say, Mr Jobs, talk to Sony about installing a
 PSX2 card in Macs.
 SP: Right. On the iMac's Mezzanine port, for example.
 -+- SP in Guide du Macounet Pervers: Making good use of the mezzanine -+-


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


What does mfi0: Copy out failed mean?

2009-08-12 Thread Václav Haisman
I am getting an "mfi0: Copy out failed" message in the logs, usually several
times a day. What does it mean? This is FreeBSD 7.2.

--
VH





Re: Cpufreq/ACPI problem? (basically still is: Re: Problem with IBM Thinkpad T30 shutting down due to high temperatures)

2009-08-12 Thread Roland Smith
On Wed, Aug 12, 2009 at 09:47:18AM +0200, Christian Walther wrote:
 Hi,
 
 thank you for all your feedback.
 I won't answer all replies in detail, but will summarise what I did to
 give you some sort of report.
 Doug made me think about how this situation began. I can't tell
 you for sure that the T30 ever worked flawlessly, because I took the
 original install from another, older ThinkPad.
 But I did change some BIOS settings, mainly interrupt settings, that
 seemed to cause problems with my wireless NIC in the past. So I restored
 the BIOS defaults. This seems to make the problem disappear, but to be
 honest: I'm not sure whether I messed up the ACPI table at all, or whether
 this is some sort of performance issue, because I now have all I/O-bound
 devices on IRQ 11:
 
 vgapci0: VGA-compatible display port 0x3000-0x30ff mem 0xe800-0xefff,0xd010-0xd010 irq 11 at device 0.0 on pci1
 uhci0: Intel 82801CA/CAM (ICH3) USB controller USB-A port 0x1800-0x181f irq 11 at device 29.0 on pci0
 uhci1: Intel 82801CA/CAM (ICH3) USB controller USB-B port 0x1820-0x183f irq 11 at device 29.1 on pci0
 uhci2: Intel 82801CA/CAM (ICH3) USB controller USB-C port 0x1840-0x185f irq 11 at device 29.2 on pci0
 cbb0: TI1520 PCI-CardBus Bridge mem 0x5000-0x5fff irq 11 at device 0.0 on pci2
 cbb1: TI1520 PCI-CardBus Bridge mem 0x5100-0x51000fff irq 11 at device 0.1 on pci2
 fxp0: Intel 82801CAM (ICH3) Pro/100 VE Ethernet port 0x8000-0x803f mem 0xd020-0xd0200fff irq 11 at device 8.0 on pci2
 pcm0: Intel ICH3 (82801CA) port 0x1c00-0x1cff,0x18c0-0x18ff irq 11 at device 31.5 on pci0
 
 This causes screen refresh problems (e.g. urxvt isn't able to draw new
 lines as expected). Still, this didn't resolve the issue, so I took a
 look at acpi_thermal.
 Right now I have the following set in /etc/sysctl.conf
 
 hw.acpi.thermal.user_override=1

According to acpi_thermal(4), you should not specify a trailing decimal when
setting the _PSV value below. So it should be 84C instead of 84.0C.

 hw.acpi.thermal.tz0._PSV=84.0C
 hw.acpi.thermal.polling_rate=2
 
 This basically gives me:
 
 # sysctl -a | egrep '(temp|freq|acpi.therm|acpi_ibm.*fan)'
 kern.acct_chkfreq: 15
 kern.timecounter.tc.i8254.frequency: 1193182
 kern.timecounter.tc.ACPI-fast.frequency: 3579545
 kern.timecounter.tc.TSC.frequency: 20
 net.inet.sctp.sack_freq: 2
 net.inet6.ip6.use_tempaddr: 0
 net.inet6.ip6.temppltime: 86400
 net.inet6.ip6.tempvltime: 604800
 net.inet6.ip6.prefer_tempaddr: 0
 debug.cpufreq.verbose: 0


 debug.cpufreq.lowest: 0

You should look at dev.cpu.N.freq_levels, where N is the number of the
core. See cpufreq(4) and below.

<snip>
 hw.acpi.thermal.polling_rate: 2

The polling_rate is just the number of seconds between readings of the
temperature. Nothing more.

 hw.acpi.thermal.user_override: 1
 hw.acpi.thermal.tz0.temperature: 62.0C
 hw.acpi.thermal.tz0.active: 0
 hw.acpi.thermal.tz0.passive_cooling: 1
 hw.acpi.thermal.tz0.thermal_flags: 0


 hw.acpi.thermal.tz0._PSV: 84.0C

The _PSV setting means that the system will only start throttling the CPU when
the temperature reaches 84°C! You might want to set that a little lower. The
system shuts down at 92°C. That seems to be a fine line to walk.
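
As a small illustration of the value format, the sketch below strips the
fractional part from a temperature string before it is handed to
sysctl(8). The helper name normalize_temp is made up, and the 80C
setpoint in the usage note is only an example, not a recommendation
for this particular machine.

```shell
#!/bin/sh
# Convert a value such as 84.0C into the form without a decimal: 84C.
# Values that already lack a fraction are passed through unchanged.
normalize_temp() {
    printf '%s\n' "$1" | sed -E 's/\.[0-9]+C$/C/'
}
```

It could be used as
sysctl hw.acpi.thermal.tz0._PSV="$(normalize_temp 80.0C)",
with hw.acpi.thermal.user_override=1 set first, as in the original post.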

 hw.acpi.thermal.tz0._HOT: -1
 hw.acpi.thermal.tz0._CRT: 92.0C
 hw.acpi.thermal.tz0._ACx: -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
 hw.acpi.thermal.tz0._TC1: 5
 hw.acpi.thermal.tz0._TC2: 3
 hw.acpi.thermal.tz0._TSP: 600
 machdep.acpi_timer_freq: 3579545
 machdep.tsc_freq: 20
 machdep.i8254_freq: 1193182
 dev.acpi_ibm.0.fan_speed: 4465
 dev.acpi_ibm.0.fan_level: 0
 dev.acpi_ibm.0.fan: 1
 dev.cpu.0.freq: 2000
 dev.cpu.0.freq_levels: 2000/0 1750/0 1500/0 1250/0 1200/0 1050/0 900/0 750/0 600/0 450/0 300/0
 dev.acpi_perf.0.freq_settings: 2000/0 1200/0
 dev.cpufreq.0.%driver: cpufreq
 dev.cpufreq.0.%parent: cpu0
 dev.p4tcc.0.freq_settings: 1/-1 8750/-1 7500/-1 6250/-1 5000/-1 3750/-1 2500/-1
 
 Active cooling doesn't seem to be supported. There is a fan, of course,
 and I can even set a fan level via dev.acpi_ibm.0.fan, but this is not
 related to hw.acpi.thermal.tz0._HOT and hw.acpi.thermal.tz0._ACx
 (which are read-only anyway).
 According to dev.acpi_ibm.0.fan_speed, the fan runs at
 somewhere between 4450 and 4780 rpm.
 
 The interesting bit here is cpufreq and how it behaves. Let's have a
 look at the output of the following loop (zsh subscript syntax):
 # while true ; do temp=$( sysctl hw.acpi.thermal.tz0.temperature ) ;
     freq=$( sysctl dev.cpu.0.freq ) ;
     printf "%4s %4s\n" $freq[17,$#freq] $temp[34,$#temp] ; sleep 2 ; done
 2000 84.0C
 2000 85.0C
 2000 85.0C
 2000 85.0C
 2000 86.0C
  300 86.0C
  300 86.0C
  300 86.0C
  300 85.0C
  300 84.0C
  300 82.0C
  300 81.0C
 
 It appears that cpufreq needs at least eight seconds to reduce the
 frequency. I see two issues here. Firstly,
 hw.acpi.thermal.polling_rate is 2: either I get this one wrong, or
 cpufreq doesn't react after every poll.

The latter, I think.

<snip>
 The interesting bit here is cpufreq: Is the behaviour normal and to be
 

Panic due to junk pointer in pf(4)

2009-08-12 Thread Peter Jeremy
My firewall (7.2p3/i386) recently panicked:
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x1065e
fault code  = supervisor read, page not present
...
I have a crashdump that shows:
#6  0xc06c9c1b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7  0xc044ecd0 in pf_state_tree_lan_ext_RB_REMOVE_COLOR (head=0xc2a256a8,
    parent=0xc442c6a0, elm=0xc40aa8e0) at /usr/src/sys/contrib/pf/net/pf.c:391
#8  0xc044ef79 in pf_state_tree_lan_ext_RB_REMOVE (head=0xc2a256a8,
    elm=0xc404a11c) at /usr/src/sys/contrib/pf/net/pf.c:391
#9  0xc045383e in pf_unlink_state (cur=0xc404a11c)
    at /usr/src/sys/contrib/pf/net/pf.c:1158
#10 0xc0456b6e in pf_purge_expired_states (maxcheck=119)
    at /usr/src/sys/contrib/pf/net/pf.c:1242
#11 0xc04570f9 in pf_purge_thread (v=0x0)
    at /usr/src/sys/contrib/pf/net/pf.c:998
#12 0xc0535781 in fork_exit (callout=0xc0456f50 <pf_purge_thread>, arg=0x0,
    frame=0xd2d4cd38) at /usr/src/sys/kern/kern_fork.c:810
#13 0xc06c9c90 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264

Working up, 'parent' in pf_state_tree_lan_ext_RB_REMOVE_COLOR() has
a garbage u.s.entry_lan_ext:
(kgdb) p parent->u
$3 = {s = {entry_lan_ext = {rbe_left = 0x10602, rbe_right = 0x5,
      rbe_parent = 0xc40aa8e0, rbe_color = -1002258432}, entry_ext_gwy = {
      rbe_left = 0xc3c42238, rbe_right = 0x1, rbe_parent = 0x0,
      rbe_color = 0}, entry_id = {rbe_left = 0xc3c54470, rbe_right = 0x0,
      rbe_parent = 0x0, rbe_color = 0}, entry_list = {tqe_next = 0xc41f9e6c,
      tqe_prev = 0x0}, kif = 0xc442c58c},
  ifname = "\002\006\001\000\000\000\005\000à¨\nÄ\000ÀBÄ"}

Does anyone have any suggestions on where to look next?
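
Two bits of arithmetic on the values above may help narrow things down.
The sketch below (helper names are made up) reinterprets the signed
rbe_color as an unsigned 32-bit value and computes the distance between
the fault address and the bogus rbe_left. A red-black tree colour
should be 0 or 1, so a value that looks like a kernel address suggests
the node was overwritten; the offset is consistent with rbe_left being
dereferenced as a structure pointer, though that is a guess and not
something the dump confirms.

```shell
#!/bin/sh
# Reinterpret a signed 32-bit integer as unsigned and print it in hex.
to_u32_hex() {
    printf '0x%x\n' $(( $1 & 0xffffffff ))
}

# Difference between two addresses, printed in hex.
hex_delta() {
    printf '0x%x\n' $(( $1 - $2 ))
}

to_u32_hex -1002258432      # rbe_color from the kgdb output
hex_delta 0x1065e 0x10602   # fault address minus rbe_left
```

This prints 0xc442c000, which lies in the same range as parent
(0xc442c6a0) and kif (0xc442c58c), followed by 0x5c.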

-- 
Peter Jeremy

