Re: ACPI? problem with release 8.0 | Perhaps solved?

2010-04-16 Thread Malcolm Kay
Although I've not been able to relate this directly to my
problem, from Googling it seems that there are some issues
with amdc1e under BSD, Linux and perhaps Windows. But all
the references to amd c1e seem to relate to systems in
64 bit mode, while I am running (or trying to run) i386, so
I wonder why I have:
  machdep.idle: amdc1e
   
Maybe my problem is not acpi as such but this idle mode.
 
  Could well be.  Someone on acpi@ will know about amdc1e, I
  don't, but any BIOS setting re C1E could be relevant to
  this.
 
My thought is to change this to
  machdep.idle: hlt
or even
  machdep.idle: acpi
 
  Maybe try setting it to acpi first (without any disabled
  parts) and try? Can't do any worse than crash the same?

 I think this should be my next task.
 I have on hand another machine (not mine) running release 8.0
 but using an Intel Core i7 processor. This shows
   machdep.idle: acpi
   machdep.idle_available: spin, mwait, mwait_hlt, hlt, acpi,

Any comments or ideas please!
   
Thank you for your attention.
   
Malcolm Kay
   
On Sat, 10 Apr 2010 05:22 pm, Malcolm Kay wrote:
 My machine had two SATA 300GB drives
 (WDC WD3200KS-00PFB0 21.00M21) one carrying FreeBSD
 RELEASE-6.3 and the other RELEASE-7.0 all of which
 worked OK.

 Recently added SATA 1TB (WDC WD10EADS-00P8B0 01.00A01)
 and installed RELEASE 8.0 thereon. When I boot to
 RELEASE 8.0 I find after some time, few minutes to
 rather more minutes the system just powers down without
 warning or any obvious cause. It seems to mostly happen
 when the system is relatively quiet.
 
  Adam's suggestion to check that esp. CPU temperature is
  within spec is worth checking; if you don't have any thermal
  zones in your ACPI I'd be surprised, and maybe concerned.  A
  finger on the heatsink is next best.

 See my response to Adam.

 Suspecting the ACPI I added:
  hint.acpi.0.disabled=1
 to loader.conf.
 I then found RELEASE 8.0 would not boot -- or at least
 it was unable to mount root. I get a mountroot
 prompt but this seemed not to accept anything I could
 think of, and ? to list available targets yielded
 nothing. Rebooting and overriding this with option 2
 (enable ACPI) in the boot menu took me back to a
 bootable but fragile system.

 Changing the loader.conf entry to:
  debug.acpi.disabled=all
 had the same effect as the hint.acpi.0.disabled=1.
 
  As it should.

 I guess so but wondered whether 'all' meant all the
 individually selectables but still leaving some essential
 parts of acpi active.

 I then thought to be somewhat selective with
 debug.acpi.disabled and intended to try:
  debug.acpi.disabled=acad button cpu lid thermal timer video
 only now as I write this I discover I actually entered:
  debug.acpi.disabled=acadbutton cpu lid thermal timer video

 Now the RELEASE-8.0 booted but remained fragile.

 I've repaired this last entry and will proceed to try
 it. Meanwhile I feel I am fumbling about in the dark
 without sufficient (or any real) knowledge of the range
 of tasks performed by ACPI.

 Is my guess that I have an interaction problem between
 ACPI and RELEASE-8.0 a reasonable one? Where can I go
 from here?

 The system uses a Gigabyte GA-M55SLI-S4 mother board
 and the processor is AMD Athlon(tm) 64 X2 Dual Core
 Processor 5600+
 
  The last para may hold the primary keys to the solution set
  ..
 
  cheers, Ian

 I'll report (for posterity) if changing machdep.idle: works.

 Thanks for your attention and thoughts,

 Malcolm
 ___
 freebsd-questions@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-questions
 To unsubscribe, send any mail to
 freebsd-questions-unsubscr...@freebsd.org


Re: ACPI? problem with release 8.0 | Perhaps solved?

2010-04-16 Thread Ian Smith
On Fri, 16 Apr 2010 17:13:48 +0930, Malcolm Kay wrote:

  My RELEASE-8.0 has now been up for about 2hr, not
  long enough to be sure the difficulty is circumvented,
  but long enough to look promising. Previously RELEASE-8.0
  has not stayed up more than about 4min. 

Sounds promising ..

  I tried setting machdep.idle to acpi and then to hlt without
  success. But I now have set machdep.idle=spin.

Wow, ok.  I only have a vague idea of how these work, but having to 
change this definitely indicates a bug somewhere; whether your BIOS 
settings or ACPI implementation or kernel or what else, I've no idea.

  Discovered there can be some problem in trying to set this
  too early -- in particular in loader.conf -- presumably because
  acpi.ko is not yet loaded. I ended up making sure everything was
  ready by putting:

Don't presume too easily .. acpi.ko gets loaded really early; it needs 
to be fired up even before scanning busses and initialising most 
devices.  A verbose dmesg.boot should give some indication to anyone 
familiar with what should be.  An acpidump may be useful too.

Can you put files up anywhere to fetch?  If not, you can mail me them, 
they're each too big to attach to -questions.  The usual deal on acpi@ 
is to put up URL(s) to such files; I'd be happy to host them here.
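
The files mentioned above (a verbose dmesg.boot, an acpidump, and the relevant sysctls) could be gathered in one go. The following is only a sketch: the output directory and the function names capture and collect_acpi_report are made up for illustration, and it assumes acpidump(8) and /var/run/dmesg.boot are present as on a stock FreeBSD install.

```sh
#!/bin/sh
# Sketch: gather the files usually requested on acpi@ into one directory.
# OUT and both function names are illustrative, not from the thread.
OUT=${OUT:-/tmp/acpi-report}

# capture NAME CMD [ARGS...]
# Runs CMD and stores its output in $OUT/NAME. If CMD is not installed,
# a note saying so is stored instead of failing.
capture() {
    name=$1; shift
    mkdir -p "$OUT" || return 1
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$OUT/$name" 2>&1
    else
        echo "skipped: $1 not found" > "$OUT/$name"
    fi
}

collect_acpi_report() {
    # Boot messages, ideally taken after a verbose boot.
    capture dmesg.boot cat /var/run/dmesg.boot
    # Decompiled ACPI tables.
    capture acpidump.asl acpidump -dt
    # The sysctls discussed in this thread.
    capture sysctl.txt sysctl hw.acpi machdep.idle machdep.idle_available
}

# Run as root on the affected machine:
#   collect_acpi_report; ls -l "$OUT"
```

Nothing runs until collect_acpi_report is called, so the file can be sourced and individual steps tried by hand first.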

But you really should take this afresh to acpi@ .. they don't bite, the 
worst that can happen is they'll ignore you :) and with a new message 
with the concise story to date, I'd expect someone to take an interest; 
maybe just to say 'turn this off|on' or 'that was fixed in -stable 
last month' or 'try this patch' or 'show us your [whatever]' ..

 #!/bin/sh
 echo setting machdep.idle=spin
 /sbin/sysctl machdep.idle=spin
  in /etc/rc.local

Ok.  dmesg.boot then will show what happens before that gets switched.
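
A slightly more cautious variant of that rc.local fragment might only switch when the kernel has actually picked amdc1e, and log what it found first. This is a sketch under that assumption; idle_switch_needed and apply_idle_workaround are hypothetical names, and whether spin is the right target remains the open question of this thread.

```sh
#!/bin/sh
# Sketch of a guarded /etc/rc.local fragment (function names are illustrative).

# idle_switch_needed CURRENT WANTED
# Succeeds only when the running idle method is amdc1e and differs from
# the wanted one, so machines using another method are left alone.
idle_switch_needed() {
    [ "$1" = "amdc1e" ] && [ "$1" != "$2" ]
}

apply_idle_workaround() {
    wanted=${1:-spin}
    current=$(/sbin/sysctl -n machdep.idle) || return 1
    if idle_switch_needed "$current" "$wanted"; then
        echo "machdep.idle: $current -> $wanted"
        /sbin/sysctl machdep.idle="$wanted"
    else
        echo "machdep.idle left at $current"
    fi
}

# In /etc/rc.local one would then call:
#   apply_idle_workaround spin
```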

If you enable console.log in syslog.conf that change will show up there
after boot messages, maybe other useful stuff, but at least show dmesg.

  To check what is happening I've created /usr/local/bin/sysctldump.sh as:
 #!/bin/sh
 [ -f /tmp/sysctl.dump.4 ] && mv -f /tmp/sysctl.dump.4 /tmp/sysctl.dump.5
 [ -f /tmp/sysctl.dump.3 ] && mv -f /tmp/sysctl.dump.3 /tmp/sysctl.dump.4
 [ -f /tmp/sysctl.dump.2 ] && mv -f /tmp/sysctl.dump.2 /tmp/sysctl.dump.3
 [ -f /tmp/sysctl.dump.1 ] && mv -f /tmp/sysctl.dump.1 /tmp/sysctl.dump.2
 [ -f /tmp/sysctl.dump ] && mv -f /tmp/sysctl.dump /tmp/sysctl.dump.1
 sysctl -ao > /tmp/sysctl.dump
  and adding:
 #sysctl dump
 1-59/2  *   *   *   *   root   /usr/local/bin/sysctldump.sh
  to /etc/crontab.

sysctl -ao is likely Way Too Much Information, though I suppose diffs 
between them might show something useful changing over time.  'sysctl hw 
dev acpi' is probably plenty to chew on.
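
For what it is worth, the five-deep rotation in sysctldump.sh can be written as a loop. The sketch below assumes the same /tmp/sysctl.dump naming; rotate_dumps is a hypothetical helper, and the narrower set of sysctl subtrees in the usage comment is just one possible choice.

```sh
#!/bin/sh
# Sketch: the same rotation as sysctldump.sh, written as a loop.

# rotate_dumps BASE [KEEP]
# Moves BASE.(n-1) to BASE.n for n = KEEP down to 2, then BASE to BASE.1,
# leaving BASE free for the next dump. Missing files are skipped.
rotate_dumps() {
    base=$1
    n=${2:-5}
    while [ "$n" -gt 1 ]; do
        prev=$((n - 1))
        [ -f "$base.$prev" ] && mv -f "$base.$prev" "$base.$n"
        n=$prev
    done
    [ -f "$base" ] && mv -f "$base" "$base.1"
    return 0
}

# From cron, dumping fewer subtrees than 'sysctl -ao' (choice is an assumption):
#   rotate_dumps /tmp/sysctl.dump 5 && sysctl hw dev machdep > /tmp/sysctl.dump
```

Two generations can then be compared with diff(1), for example diff /tmp/sysctl.dump.1 /tmp/sysctl.dump.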

  I feel somewhat concerned that this cronjob may be sufficiently frequent to
  prevent the system looking for the idle state and thus circumventing the 
  problem in some other way. So I'm not yet convinced that I have a real 
  solution.

We're not talking about idle in the sense top shows you - this is about 
the kernel having nothing to do for perhaps hundreds of microseconds so 
entering a microsleep state.  The old 386s just had the HLT instruction 
which had the CPU wait for an interrupt (to save power).  These days 
there are multiple C-states with varying levels of power reduction with 
different latencies, ie times to wake up, usually managed by ACPI.

I suspect 'spin' just loops awaiting an interrupt, staying busy?

C1E is one such newer state.  I know nothing about it, but that's what 
your system thought it should use the amdc1e cpufreq? driver for, so 
your problem definitely seems related to that.  This clearly is within 
the ambit of the acpi@ list, and most of those folks seem rarely to have 
the sort of spare time needed to follow -questions.

Also at least check the change log between your BIOS and the latest; if 
there's anything related to C states or similar, you should try it; they 
always say not to do it unless you need to - you might need to, and that
might be all you need to do.

  I'll try removing the cronjob.
  
  Thanks again for your attention,
  Regards,
  
  Malcolm Kay

Thanks for cc'ing me, I read -digests which can take half a day and make 
replying a bit tedious, not to mention breaking list threading.

cheers, Ian

[..]


Re: ACPI? problem with release 8.0

2010-04-12 Thread Malcolm Kay
I desperately need to make some progress on this issue.

Is it likely that the issue is real rather than hardware
or disk corruption? Earlier releases are operating OK on the same 
machine.

I have now confirmed that:
 debug.acpi.disabled=acad button cpu lid thermal timer video
still leaves the system crashing and powering down when idle for 
a while. And the more extensive:
 debug.acpi.disabled=acad bus children button cmbat cpu ec isa
 lid pci pci_link sysresource thermal timer video
does the same.
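
Since a run-together name such as acadbutton appears to be accepted silently, a small check before rebooting may save a cycle. This is a sketch only: the KNOWN list holds just the names that appear in this thread (plus all), acpi(4) is the authoritative list for a given release, and check_acpi_disabled is a hypothetical helper.

```sh
#!/bin/sh
# Sketch: flag unrecognised names in a debug.acpi.disabled value.
# KNOWN contains only the names used in this thread; consult acpi(4)
# for the full list on your release.
KNOWN="acad bus children button cmbat cpu ec isa lid pci pci_link sysresource thermal timer video all"

# check_acpi_disabled "NAME NAME ..."
# Prints each unrecognised word on its own line; returns 1 if any was found.
check_acpi_disabled() {
    bad=0
    for word in $1; do
        case " $KNOWN " in
            *" $word "*) ;;
            *) echo "$word"; bad=1 ;;
        esac
    done
    return $bad
}
```

Called as check_acpi_disabled "acadbutton cpu lid thermal timer video", it prints acadbutton and returns a non-zero status.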

I don't really need power management but with acpi disabled the
disks are not visible to the system.

Are there sysctl variables that can influence this behaviour?
Currently I believe we have:

hw.acpi.supported_sleep_state: S1 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: NONE
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
machdep.idle: amdc1e
machdep.idle_available: spin, amdc1e, hlt, acpi,

However on the earlier RELEASEs that work I note we do not have 
machdep.idle or machdep.idle_available. Instead I find:
machdep.cpu_idle_hlt: 1
machdep.hlt_cpus: 0

Although I've not been able to relate this directly to my problem, 
from Googling it seems that there are some issues with amdc1e under
BSD, Linux and perhaps Windows. But all the references to amd c1e 
seem to relate to systems in 64 bit mode, while I am running 
(or trying to run) i386, so I wonder why I have:
  machdep.idle: amdc1e

Maybe my problem is not acpi as such but this idle mode.

My thought is to change this to
  machdep.idle: hlt
or even
  machdep.idle: acpi

Any comments or ideas please!

Thank you for your attention.

Malcolm Kay
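
The change proposed above can be tried at runtime with sysctl(8). A minimal sketch, assuming machdep.idle_available has the comma-separated shape shown in this message; idle_method_available and set_idle_method are hypothetical helper names.

```sh
#!/bin/sh
# Sketch: only offer machdep.idle a value the kernel lists as available.

# idle_method_available LIST NAME
# Succeeds if NAME is one of the entries in LIST, where LIST looks like
# "spin, amdc1e, hlt, acpi," (commas and/or spaces as separators).
idle_method_available() {
    for m in $(printf '%s' "$1" | tr ',' ' '); do
        [ "$m" = "$2" ] && return 0
    done
    return 1
}

# set_idle_method NAME
# Reads the available list from the running kernel, then sets machdep.idle.
set_idle_method() {
    avail=$(sysctl -n machdep.idle_available) || return 1
    if idle_method_available "$avail" "$1"; then
        sysctl machdep.idle="$1"
    else
        echo "machdep.idle: '$1' not in: $avail" >&2
        return 1
    fi
}

# As root, for example:
#   set_idle_method hlt
```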





Re: ACPI? problem with release 8.0

2010-04-12 Thread Adam Vande More
On Mon, Apr 12, 2010 at 1:01 AM, Malcolm Kay
malcolm@internode.on.net wrote:

 I desperately need to make some progress on this issue.

 Is it likely that the issue is real rather than hardware
 or disk corruption? Earlier releases are operating OK on the same
 machine.

 I have now confirmed that:
  debug.acpi.disabled=acad button cpu lid thermal timer video
 still leaves the system crashing and powering down when idle for
 a while. And the more extensive:
  debug.acpi.disabled=acad bus children button cmbat cpu ec isa
  lid pci pci_link sysresource thermal timer video
 does the same.

 I don't really need power management but with acpi disabled the
 disks are not visible to the system.

 Are there sysctl variables that can influence this behaviour?
 Currently I believe we have:

 hw.acpi.supported_sleep_state: S1 S4 S5
 hw.acpi.power_button_state: S5
 hw.acpi.sleep_button_state: S1
 hw.acpi.lid_switch_state: NONE
 hw.acpi.standby_state: S1
 hw.acpi.suspend_state: NONE
 hw.acpi.sleep_delay: 1
 hw.acpi.s4bios: 0
 hw.acpi.verbose: 0
 hw.acpi.disable_on_reboot: 0
 hw.acpi.handle_reboot: 0
 hw.acpi.reset_video: 0
 hw.acpi.cpu.cx_lowest: C1
 machdep.idle: amdc1e
 machdep.idle_available: spin, amdc1e, hlt, acpi,

 However on the earlier RELEASEs that work I note we do not have
 machdep.idle or machdep.idle_available. Instead I find:
 machdep.cpu_idle_hlt: 1
 machdep.hlt_cpus: 0

 Although I've not been able to relate this directly to my problem,
 from Googling it seems that there are some issues with amdc1e under
 BSD, Linux and perhaps Windows. But all the references to amd c1e
 seem to relate to systems in 64 bit mode, while I am running
 (or trying to run) i386, so I wonder why I have:
  machdep.idle: amdc1e

 Maybe my problem is not acpi as such but this idle mode.

 My thought is to change this to
  machdep.idle: hlt
 or even
  machdep.idle: acpi

 Any comments or ideas please!

 Thank you for your attention.


Is there anything in /var/log/messages which indicates the cause?  Can you
monitor cpu temp?


-- 
Adam Vande More
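
One way to act on the temperature question, assuming the machine exposes an ACPI thermal zone at all (this one may not), is to poll hw.acpi.thermal.tz0.temperature and log high readings. The sketch below is hedged accordingly: the 70 degree default is an arbitrary placeholder rather than a spec value, and temp_over and watch_tz0 are hypothetical names.

```sh
#!/bin/sh
# Sketch: log a warning when an ACPI thermal zone reading passes a limit.

# temp_over READING LIMIT
# READING is a string such as "61.5C" (the form the tzN.temperature sysctl
# reports). Succeeds when its whole-degree part is greater than LIMIT.
temp_over() {
    whole=${1%%.*}      # "61.5C" -> "61"
    whole=${whole%C}    # "61C"   -> "61"
    [ "$whole" -gt "$2" ]
}

watch_tz0() {
    limit=${1:-70}      # placeholder threshold, not taken from any datasheet
    reading=$(sysctl -n hw.acpi.thermal.tz0.temperature) || {
        echo "no tz0 thermal zone reported" >&2
        return 1
    }
    temp_over "$reading" "$limit" && logger -t tempwatch "tz0 at $reading"
    return 0
}

# From cron or a loop, for example:
#   watch_tz0 65
```

Because the machine powers off without warning, a logged reading shortly before the event would at least help confirm or rule out heat.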


Re: ACPI? problem with release 8.0

2010-04-12 Thread Ian Smith
   Is my guess that I have an interaction problem between ACPI
   and RELEASE-8.0 a reasonable one? Where can I go from here?
  
   The system uses a Gigabyte GA-M55SLI-S4 mother board and the
   processor is AMD Athlon(tm) 64 X2 Dual Core Processor 5600+

The last para may hold the primary keys to the solution set ..

cheers, Ian


Re: ACPI? problem with release 8.0

2010-04-12 Thread Malcolm Kay
On Mon, 12 Apr 2010 04:40 pm, Adam Vande More wrote:
 On Mon, Apr 12, 2010 at 1:01 AM, Malcolm Kay
 malcolm@internode.on.net wrote:
  I desperately need to make some progress on this issue.
 
  Is it likely that the issue is real rather than hardware
  or disk corruption? Earlier releases are operating OK on the
  same machine.
 
  I have now confirmed that:
   debug.acpi.disabled=acad button cpu lid thermal timer video
  still leaves the system crashing and powering down when idle
  for a while. And the more extensive:
   debug.acpi.disabled=acad bus children button cmbat cpu ec
   isa lid pci pci_link sysresource thermal timer video
  does the same.
 
  I don't really need power management but with acpi disabled
  the disks are not visible to the system.
 
  Are there sysctl variables that can influence this
  behaviour? Currently I believe we have:
 
  hw.acpi.supported_sleep_state: S1 S4 S5
  hw.acpi.power_button_state: S5
  hw.acpi.sleep_button_state: S1
  hw.acpi.lid_switch_state: NONE
  hw.acpi.standby_state: S1
  hw.acpi.suspend_state: NONE
  hw.acpi.sleep_delay: 1
  hw.acpi.s4bios: 0
  hw.acpi.verbose: 0
  hw.acpi.disable_on_reboot: 0
  hw.acpi.handle_reboot: 0
  hw.acpi.reset_video: 0
  hw.acpi.cpu.cx_lowest: C1
  machdep.idle: amdc1e
  machdep.idle_available: spin, amdc1e, hlt, acpi,
 
  However on the earlier RELEASEs that work I note we do not
  have machdep.idle or machdep.idle_available. Instead I find:
  machdep.cpu_idle_hlt: 1
  machdep.hlt_cpus: 0
 
  Although I've not been able to relate this directly to my
  problem, from Googling it seems that there are some issues
  with amdc1e under BSD, Linux and perhaps Windows. But all the
  references to amd c1e seem to relate to systems in 64 bit
  mode, while I am running (or trying to run) i386, so I wonder
  why I have:
   machdep.idle: amdc1e
 
  Maybe my problem is not acpi as such but this idle mode.
 
  My thought is to change this to
   machdep.idle: hlt
  or even
   machdep.idle: acpi
 
  Any comments or ideas please!
 
  Thank you for your attention.

 Is there anything in /var/log/messages which indicates the
 cause?  Can you monitor cpu temp?

No clues in messages -- seems to just power down without any 
warning.

I don't seem to have any thermal monitoring readily available 
except in the BIOS screens -- which seem to indicate everything 
is fine. But I guess this is not really indicative of what is 
happening with a running system. But the same machine has run
earlier versions of FreeBSD staying up months at a time and only 
going down on power failures or on odd occasions I might want 
to look at BIOS settings or some such, so I feel fairly 
confident it is not a thermal issue.

Hmm, I think there might be a BIOS setting to switch on health 
reporting which I expect would show up under sysctl.

Thanks for the contribution.

The more I think about it the more I believe the issue is 
connected with machdep.idle: amdc1e
I am going to try changing this.

Thanks and regards,

Malcolm





Re: ACPI? problem with release 8.0

2010-04-12 Thread Malcolm Kay
 Adam's suggestion to check that esp. CPU temperature is within
 spec is worth checking; if you don't have any thermal zones in
 your ACPI I'd be surprised, and maybe concerned.  A finger on
 the heatsink is next best.

See my response to Adam.


Suspecting the ACPI I added:
 hint.acpi.0.disabled=1
to loader.conf.
I then found RELEASE 8.0 would not boot -- or at least
it was unable to mount root. I get a mountroot prompt
but this seemed not to accept anything I could think of,
and ? to list available targets yielded nothing.
Rebooting and overriding this with option 2 (enable ACPI)
in the boot menu took me back to a bootable but fragile
system.
   
Changing the loader.conf entry to:
 debug.acpi.disabled=all
had the same effect as the hint.acpi.0.disabled=1.

 As it should.

I guess so but wondered whether 'all' meant all the individually 
selectables but still leaving some essential parts of acpi 
active.

I then thought to be somewhat selective with
debug.acpi.disabled and intended to try:
 debug.acpi.disabled=acad button cpu lid thermal timer video
only now as I write this I discover I actually entered:
 debug.acpi.disabled=acadbutton cpu lid thermal timer video
   
Now the RELEASE-8.0 booted but remained fragile.
   
I've repaired this last entry and will proceed to try it.
Meanwhile I feel I am fumbling about in the dark without
sufficient (or any real) knowledge of the range of tasks
performed by ACPI.
   
Is my guess that I have an interaction problem between
ACPI and RELEASE-8.0 a reasonable one? Where can I go
from here?
   
The system uses a Gigabyte GA-M55SLI-S4 mother board and
the processor is AMD Athlon(tm) 64 X2 Dual Core Processor
5600+

 The last para may hold the primary keys to the solution set ..

 cheers, Ian

I'll report (for posterity) if changing machdep.idle: works.

Thanks for your attention and thoughts,

Malcolm


ACPI? problem with release 8.0

2010-04-10 Thread Malcolm Kay
My machine had two SATA 300GB drives 
(WDC WD3200KS-00PFB0 21.00M21) one carrying FreeBSD RELEASE-6.3
and the other RELEASE-7.0 all of which worked OK.

I recently added a 1TB SATA drive (WDC WD10EADS-00P8B0 01.00A01) and 
installed RELEASE 8.0 on it. When I boot RELEASE 8.0, after some time
(anywhere from a few minutes to rather longer) the system just powers
down without warning or any obvious cause. 
It seems to happen mostly when the system is relatively quiet.

Suspecting the ACPI I added:
 hint.acpi.0.disabled=1
to loader.conf.
I then found RELEASE 8.0 would not boot -- or at least
it was unable to mount root. I get a mountroot prompt
but this seemed not to accept anything I could think of,
and ? to list available targets yielded nothing. Rebooting and 
overriding this with option 2 (enable ACPI) in the boot menu 
took me back to a bootable but fragile system.

Changing the loader.conf entry to:
 debug.acpi.disabled=all
had the same effect as the hint.acpi.0.disabled=1.

I then thought to be somewhat selective with debug.acpi.disabled
and intended to try:
 debug.acpi.disabled=acad button cpu lid thermal timer video
only now as I write this I discover I actually entered:
 debug.acpi.disabled=acadbutton cpu lid thermal timer video

Now the RELEASE-8.0 booted but remained fragile.

I've repaired this last entry and will proceed to try it.
Meanwhile I feel I am fumbling about in the dark without
sufficient (or any real) knowledge of the range of tasks 
performed by ACPI.

Is my guess that I have an interaction problem between ACPI and
RELEASE-8.0 a reasonable one? Where can I go from here?

The system uses a Gigabyte GA-M55SLI-S4 mother board and the 
processor is AMD Athlon(tm) 64 X2 Dual Core Processor 5600+

Please offer suggestions or comments.

Malcolm Kay

