8.3-PRERELEASE and ATA_CAM

2012-04-06 Thread Daniel Braniss
with the latest svn, I can't compile kernel with  options ATA_CAM:

...
linking kernel.debug
ata-disk.o(.text+0x93): In function `ad_init':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:389: undefined reference to `ata_setmode'
ata-disk.o(.text+0xaa):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:397: undefined reference to `ata_wc'
ata-disk.o(.text+0xc5):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:398: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x113):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:400: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x133):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:393: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x16d):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:407: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x21a): In function `ad_shutdown':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:196: undefined reference to `ata_controlcmd'
ata-disk.o(.text+0x45c): In function `ad_detach':
/r+d/stable/8.3/sys/dev/ata/ata-disk.c:182: undefined reference to `ata_fail_requests'
...

danny


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: [stable-ish 9] Dell R815 ipmi(4) attach failure

2012-04-06 Thread Alexander Motin

On 04/04/12 21:47, John Baldwin wrote:

On Wednesday, April 04, 2012 12:24:33 pm Doug Ambrisko wrote:

John Baldwin writes:
| On Tuesday, April 03, 2012 12:37:50 pm Doug Ambrisko wrote:
|  John Baldwin writes:
|  | On Monday, April 02, 2012 7:27:13 pm Doug Ambrisko wrote:
|  |  Doug Ambrisko writes:
|  |  | John Baldwin writes:
|  |  | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote:
|  |  | |  Sean Bruno writes:
|  |  | |  | Noting a failure to attach to the onboard IPMI controller with this dell
|  |  | |  | R815.  Not sure what to start poking at and thought I'd throw this over
|  |  | |  | here for comment.
|  |  | |  |
|  |  | |  | -bash-4.2$ dmesg |grep ipmi
|  |  | |  | ipmi0: KCS mode found at io 0xca8 on acpi
|  |  | |  | ipmi1:IPMI System Interface  on isa0
|  |  | |  | device_attach: ipmi1 attach returned 16
|  |  | |  | ipmi1:IPMI System Interface  on isa0
|  |  | |  | device_attach: ipmi1 attach returned 16
|  |  | |  | ipmi0: Timed out waiting for GET_DEVICE_ID
|  |  | |
|  |  | |  I've run into this recently.  A quick hack to fix it is:
|  |  | |
|  |  | |  Index: ipmi.c
|  |  | |  ===
|  |  | |  RCS file: /cvs/src/sys/dev/ipmi/ipmi.c,v
|  |  | |  retrieving revision 1.14
|  |  | |  diff -u -p -r1.14 ipmi.c
|  |  | |  --- ipmi.c   14 Apr 2011 07:14:22 -  1.14
|  |  | |  +++ ipmi.c   31 Mar 2012 19:18:35 -
|  |  | |  @@ -695,7 +695,6 @@ ipmi_startup(void *arg)
|  |  | |   if (error == EWOULDBLOCK) {
|  |  | |   device_printf(dev, "Timed out waiting for GET_DEVICE_ID\n");
|  |  | |   ipmi_free_request(req);
|  |  | |  -return;
|  |  | |   } else if (error) {
|  |  | |   device_printf(dev, "Failed GET_DEVICE_ID: %d\n", error);
|  |  | |   ipmi_free_request(req);
|  |  | |
|  |  | |  The issue is that the wakeup doesn't actually wake up the msleep
|  |  | |  in ipmi_submit_driver_request.  The error being reported is that
|  |  | |  the msleep timed out.  This doesn't seem to be a critical problem
|  |  | |  since after this things seemed to work.  I saw this on 9.X.
|  |  | |  Haven't seen it on 8.2.  Not sure about -current.
|  |  | |
|  |  | |  It doesn't happen on all machines.
|  |  | |
|  |  | | Hmm, are you seeing the KCS thread manage the request but the wakeup() is
|  |  | | lost?
|  |  |
|  |  | It was a couple of weeks ago that I played with it.  I put printf's
|  |  | around the msleep and wakeup.  I saw the wakeup called but the sleep
|  |  | not get it.  I can try the test again later today.  Right now my main
|  |  | work machine is recovering from a power outage.  This was with 9.0
|  |  | when I first saw it.  This issue seems to only happen at boot time.
|  |  | If I kldload the module after the system is booted then it seems to work
|  |  | okay.  The KCS part was working fine and got the data okay from the
|  |  | request.  I haven't seen or heard any issues with 8.2.
|  |
|  |  With -current I patched ipmi.c with:
|  |  Index: ipmi.c
|  |  ===
|  |  --- ipmi.c  (revision 233806)
|  |  +++ ipmi.c  (working copy)
|  |  @@ -523,7 +523,11 @@
|  |   * waiter that we awaken.
|  |   */
|  |  if (req->ir_owner == NULL)
|  |  +{
|  |  +device_printf(sc->ipmi_dev, "DEBUG %s %d before wakeup %d\n",__FUNCTION__,__LINE__,ticks);
|  |  wakeup(req);
|  |  +device_printf(sc->ipmi_dev, "DEBUG %s %d after wakeup %d\n",__FUNCTION__,__LINE__,ticks);
|  |  +}
|  |  else {
|  |  dev = req->ir_owner;
|  |  TAILQ_INSERT_TAIL(&dev->ipmi_completed_requests, req, ir_link);
|  |  @@ -543,7 +547,11 @@
|  |  IPMI_LOCK(sc);
|  |  error = sc->ipmi_enqueue_request(sc, req);
|  |  if (error == 0)
|  |  +{
|  |  +device_printf(sc->ipmi_dev, "DEBUG %s %d before msleep %d\n",__FUNCTION__,__LINE__,ticks);
|  |  error = msleep(req, &sc->ipmi_lock, 0, "ipmireq", timo);
|  |  +device_printf(sc->ipmi_dev, "DEBUG %s %d after msleep %d\n",__FUNCTION__,__LINE__,ticks);
|  |  +}
|  |  if (error == 0)
|  |  error = req->ir_error;
|  |  IPMI_UNLOCK(sc);
|  |  @@ -695,8 +703,11 @@
|  |  error = ipmi_submit_driver_request(sc, req, MAX_TIMEOUT);
|  |  if (error == EWOULDBLOCK) {
|  |  device_printf(dev, "Timed out waiting for GET_DEVICE_ID\n");
|  |  +   printf("DJA\n");
|  |  +/*
|  |  ipmi_free_request(req);
|  |  return;
|  |  +*/
|  |  } else if (error) {
|  |  device_printf(dev, "Failed GET_DEVICE_ID: %d\n", error);
|  |  ipmi_free_request(req);
|  |
|  |  and get
|  |# dmesg | grep ipmi
|  |ipmi0: KCS mode found at io 0xca8 on acpi

Re: [stable-ish 9] Dell R815 ipmi(4) attach failure

2012-04-06 Thread Doug Ambrisko
Alexander Motin writes:
| On 04/04/12 21:47, John Baldwin wrote:
|  On Wednesday, April 04, 2012 12:24:33 pm Doug Ambrisko wrote:
|  John Baldwin writes:
|  | On Tuesday, April 03, 2012 12:37:50 pm Doug Ambrisko wrote:
|  |  John Baldwin writes:
|  |  | On Monday, April 02, 2012 7:27:13 pm Doug Ambrisko wrote:
|  |  |  Doug Ambrisko writes:
|  |  |  | John Baldwin writes:
|  |  |  | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote:
|  |  |  | |  Sean Bruno writes:
|  |  |  | |  | Noting a failure to attach to the onboard IPMI controller with this dell
|  |  |  | |  | R815.  Not sure what to start poking at and thought I'd throw this over
|  |  |  | |  | here for comment.
|  |  |  | |  |
|  |  |  | |  | -bash-4.2$ dmesg |grep ipmi
|  |  |  | |  | ipmi0: KCS mode found at io 0xca8 on acpi
|  |  |  | |  | ipmi1:IPMI System Interface  on isa0
|  |  |  | |  | device_attach: ipmi1 attach returned 16
|  |  |  | |  | ipmi1:IPMI System Interface  on isa0
|  |  |  | |  | device_attach: ipmi1 attach returned 16
|  |  |  | |  | ipmi0: Timed out waiting for GET_DEVICE_ID
|  |  |  | |
|  |  |  | |  I've run into this recently.  A quick hack to fix it is:
|  |  |  | |
|  |  |  | |  Index: ipmi.c
|  |  |  | |
[snip]
|  | If you use -ct then you get a file you can feed into schedgraph.
|  | However, just reading the log, it seems that IRQ 20 keeps preempting
|  | the KCS worker thread preventing it from getting anything done.  Also,
|  | there seem to be a lot of threads on CPU 0's runqueue waiting for a
|  | chance to run (load average of 12 or 13 the entire time).  You can try
|  | just bumping up the max timeout from 3 seconds to higher perhaps.  Not
|  | sure why IRQ 20 keeps firing though.  It might be related to USB, so
|  | you could try fiddling with USB options in the BIOS perhaps, or disabling
|  | the USB drivers to see if that fixes IPMI.
| 
|  Tried without USB in kernel:
| http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt
| 
|  Hmm, it's still just running constantly (note that the idle thread is
|  _never_ scheduled).  The lion's share of the time seems to be spent in
|  xpt_thrd.  Note that there are several places where nothing happens except
|  that xpt_thrd runs constantly (spinning) during 10's of statclock ticks.  I
|  would maybe start debugging that to see what in the world it is doing.  Maybe
|  it is polling some hardware down in xpt_action() (i.e., xpt_action() for a
|  single bus called down into a driver and it is just spinning using polling
|  instead of sleeping and waiting for an interrupt).
| 
| xpt_thrd is a bus scanner thread. It is scheduled by CAM for every bus
| on attach and by the controller driver on hot-plug events. For some
| controllers it may be quite CPU-hungry, for example legacy ATA
| controllers, where a bus reset may take many seconds of hardware polling
| while devices are just spinning up. For ahci(4) it was improved about a year
| ago to not use polling when possible, but it may still loop for some
| time if the controller is not responding on reset. What mfi(4), mentioned in
| the log, does during scanning, I am not sure.

I thought that mfi(4) could be an issue.  There are some ata controllers
with nothing attached.  I built a GENERIC with USB and mfi commented out
and then the timeout issue went away:
  ipmi0: KCS mode found at io 0xca8 on acpi
  ipmi1: IPMI System Interface on isa0
  device_attach: ipmi1 attach returned 16
  ipmi1: IPMI System Interface on isa0
  device_attach: ipmi1 attach returned 16
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 2211
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 2272
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 2332
  ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0

Without mfi but with USB, it had issues:
  ipmi0: KCS mode found at io 0xca8 on acpi
  ipmi1: IPMI System Interface on isa0
  device_attach: ipmi1 attach returned 16
  ipmi1: IPMI System Interface on isa0
  device_attach: ipmi1 attach returned 16
  ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2
  ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3137
  ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3199
  ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3259
  ipmi0: Timed out waiting for GET_DEVICE_ID
  ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0
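
The trailing number on each DEBUG line is the kernel `ticks` counter printed by the debugging patch, so the two runs can be compared against the 3-second timeout directly. The following is only a rough sketch of that arithmetic. It assumes hz = 1000 (one tick per millisecond), which is a common FreeBSD default but was not stated in this thread, and the helper names are made up for illustration.

```python
# Assumption: the kernel runs at hz = 1000, so one tick is one millisecond.
# Check kern.hz on the actual machine before trusting these numbers.
HZ = 1000


def ticks_to_seconds(start_tick, end_tick, hz=HZ):
    """Convert the span between two `ticks` readings into seconds."""
    return (end_tick - start_tick) / hz


def wakeup_in_time(msleep_tick, wakeup_tick, timeout_s, hz=HZ):
    """True if wakeup() was reached before the msleep() timeout expired."""
    return ticks_to_seconds(msleep_tick, wakeup_tick, hz) < timeout_s


# (tick "before msleep", tick "before wakeup"), copied from the DEBUG lines.
runs = {
    "USB and mfi commented out": (1, 2211),
    "USB enabled, mfi commented out": (2, 3137),
}

for name, (slept, woken) in runs.items():
    waited = ticks_to_seconds(slept, woken)
    for timeout_s in (3, 6):
        status = "ok" if wakeup_in_time(slept, woken, timeout_s) else "timed out"
        print(f"{name}: wakeup after {waited:.3f}s, {timeout_s}s timeout -> {status}")
```

Under that assumption the first run reaches wakeup() after about 2.2 seconds, inside the 3-second limit, while the second needs about 3.1 seconds, which misses a 3-second timeout but would fit in a 6-second one.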

I can post more ktrdump traces if needed.  A 1U Dell machine without
mfi also has this problem.  As John mentioned, it might be good to
bump up the timeout from 3s to 6s.  I did that with the kernel that has
USB but no mfi, and that passed:

  % dmesg | grep ipmi
  ipmi0: KCS mode found at io 0xca8 on acpi
  ipmi1: IPMI System Interface on isa0
  device_attach: ipmi1 attach returned 16
  ipmi1: IPMI System Interface on isa0
  device_attach: ipmi1 attach returned 16
  ipmi0: DEBUG 

Re: 8.3-PRERELEASE and ATA_CAM

2012-04-06 Thread Marius Strobl
On Fri, Apr 06, 2012 at 10:48:13AM +0300, Daniel Braniss wrote:
 with the latest svn, I can't compile kernel with  options ATA_CAM:
 
 ...
 linking kernel.debug
 ata-disk.o(.text+0x93): In function `ad_init':
 /r+d/stable/8.3/sys/dev/ata/ata-disk.c:389: undefined reference to 
 `ata_setmode'
 ata-disk.o(.text+0xaa):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:397: undefined 
 reference to `ata_wc'
 ata-disk.o(.text+0xc5):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:398: undefined 
 reference to `ata_controlcmd'
 ata-disk.o(.text+0x113):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:400: undefined 
 reference to `ata_controlcmd'
 ata-disk.o(.text+0x133):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:393: undefined 
 reference to `ata_controlcmd'
 ata-disk.o(.text+0x16d):/r+d/stable/8.3/sys/dev/ata/ata-disk.c:407: undefined 
 reference to `ata_controlcmd'
 ata-disk.o(.text+0x21a): In function `ad_shutdown':
 /r+d/stable/8.3/sys/dev/ata/ata-disk.c:196: undefined reference to 
 `ata_controlcmd'
 ata-disk.o(.text+0x45c): In function `ad_detach':
 /r+d/stable/8.3/sys/dev/ata/ata-disk.c:182: undefined reference to 
 `ata_fail_requests'
 ...
 

You seem to be using a mutually exclusive set of ata(4) options and
devices (previously, this combination erroneously did not produce an
error). When including options ATA_CAM you do _not_ want to also include
any of the following devices:
device  atapicam
device  atadisk
device  ataraid
device  atapicd
device  atapifd
device  atapist

Instead you need the corresponding drivers from the following set:
device  scbus
device  ch
device  da
device  sa
device  cd
device  pass

Marius
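
The rule above can be checked mechanically before starting a build. What follows is a minimal sketch, not an official tool: the function names and the sample configuration are hypothetical, and the parser only understands plain `options NAME` and `device NAME` lines with `#` comments.

```python
# Devices listed above as conflicting with "options ATA_CAM".
LEGACY_ATA_DEVICES = {
    "atapicam", "atadisk", "ataraid", "atapicd", "atapifd", "atapist",
}


def parse_config(text):
    """Return (options, devices) named in a kernel configuration text."""
    options, devices = set(), set()
    for line in text.splitlines():
        words = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(words) < 2:
            continue
        if words[0] == "options":
            options.add(words[1].split("=", 1)[0])  # ignore any "=value" part
        elif words[0] == "device":
            devices.add(words[1])
    return options, devices


def ata_cam_conflicts(text):
    """List legacy ata(4) devices that are enabled together with ATA_CAM."""
    options, devices = parse_config(text)
    if "ATA_CAM" not in options:
        return []
    return sorted(devices & LEGACY_ATA_DEVICES)


# Hypothetical configuration fragment, for illustration only.
sample = """
options   ATA_CAM
device    ata
device    atadisk     # legacy disk driver
device    atapicd
#device   atapicam
device    scbus
device    da
"""

print(ata_cam_conflicts(sample))  # ['atadisk', 'atapicd']
```

An empty list means none of the six legacy devices is enabled alongside ATA_CAM.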



Re: [stable-ish 9] Dell R815 ipmi(4) attach failure

2012-04-06 Thread Alexander Motin

On 04/06/12 20:12, Doug Ambrisko wrote:

Alexander Motin writes:
| On 04/04/12 21:47, John Baldwin wrote:
|  On Wednesday, April 04, 2012 12:24:33 pm Doug Ambrisko wrote:
|  John Baldwin writes:
|  | On Tuesday, April 03, 2012 12:37:50 pm Doug Ambrisko wrote:
|  |   John Baldwin writes:
|  |   | On Monday, April 02, 2012 7:27:13 pm Doug Ambrisko wrote:
|  |   |   Doug Ambrisko writes:
|  |   |   | John Baldwin writes:
|  |   |   | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote:
|  |   |   | |   Sean Bruno writes:
|  |   |   | |   | Noting a failure to attach to the onboard IPMI controller with this dell
|  |   |   | |   | R815.  Not sure what to start poking at and thought I'd throw this over
|  |   |   | |   | here for comment.
|  |   |   | |   |
|  |   |   | |   | -bash-4.2$ dmesg |grep ipmi
|  |   |   | |   | ipmi0: KCS mode found at io 0xca8 on acpi
|  |   |   | |   | ipmi1:IPMI System Interface   on isa0
|  |   |   | |   | device_attach: ipmi1 attach returned 16
|  |   |   | |   | ipmi1:IPMI System Interface   on isa0
|  |   |   | |   | device_attach: ipmi1 attach returned 16
|  |   |   | |   | ipmi0: Timed out waiting for GET_DEVICE_ID
|  |   |   | |
|  |   |   | |   I've run into this recently.  A quick hack to fix it is:
|  |   |   | |
|  |   |   | |   Index: ipmi.c
|  |   |   | |
[snip]
|  | If you use -ct then you get a file you can feed into schedgraph.
|  | However, just reading the log, it seems that IRQ 20 keeps preempting
|  | the KCS worker thread preventing it from getting anything done.  Also,
|  | there seem to be a lot of threads on CPU 0's runqueue waiting for a
|  | chance to run (load average of 12 or 13 the entire time).  You can try
|  | just bumping up the max timeout from 3 seconds to higher perhaps.  Not
|  | sure why IRQ 20 keeps firing though.  It might be related to USB, so
|  | you could try fiddling with USB options in the BIOS perhaps, or disabling
|  | the USB drivers to see if that fixes IPMI.
|
|  Tried without USB in kernel:
|   http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt
|
|  Hmm, it's still just running constantly (note that the idle thread is
|  _never_ scheduled).  The lion's share of the time seems to be spent in
|  xpt_thrd.  Note that there are several places where nothing happens except
|  that xpt_thrd runs constantly (spinning) during 10's of statclock ticks.  I
|  would maybe start debugging that to see what in the world it is doing.  Maybe
|  it is polling some hardware down in xpt_action() (i.e., xpt_action() for a
|  single bus called down into a driver and it is just spinning using polling
|  instead of sleeping and waiting for an interrupt).
|
| xpt_thrd is a bus scanner thread. It is scheduled by CAM for every bus
| on attach and by the controller driver on hot-plug events. For some
| controllers it may be quite CPU-hungry, for example legacy ATA
| controllers, where a bus reset may take many seconds of hardware polling
| while devices are just spinning up. For ahci(4) it was improved about a year
| ago to not use polling when possible, but it may still loop for some
| time if the controller is not responding on reset. What mfi(4), mentioned in
| the log, does during scanning, I am not sure.

I thought that mfi(4) could be an issue.  There are some ata controllers
with nothing attached.  I built a GENERIC with USB and mfi commented out
and then the timeout issue went away:
   ipmi0: KCS mode found at io 0xca8 on acpi
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1
   ipmi0: DEBUG ipmi_complete_request 527 before wakeup 2211
   ipmi0: DEBUG ipmi_complete_request 529 after wakeup 2272
   ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 2332
   ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0

Without mfi and with USB and it had issues:
   ipmi0: KCS mode found at io 0xca8 on acpi
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2
   ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3137
   ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3199
   ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3259
   ipmi0: Timed out waiting for GET_DEVICE_ID
   ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0

I can post more ktrdump traces if needed.  A 1U Dell machine without
mfi also has this problem.  As John mentioned it might be good to
bump up the timeout from 3s to 6s.  I did that with the USB no mfi
kernel and that passed:

   % dmesg | grep ipmi
   ipmi0: KCS mode found at io 0xca8 on acpi
   ipmi1:IPMI System Interface  on isa0
   device_attach: 

Re: [stable-ish 9] Dell R815 ipmi(4) attach failure

2012-04-06 Thread Doug Ambrisko
Alexander Motin writes:
| On 04/06/12 20:12, Doug Ambrisko wrote:
|  Alexander Motin writes:
|  | On 04/04/12 21:47, John Baldwin wrote:
|  |  On Wednesday, April 04, 2012 12:24:33 pm Doug Ambrisko wrote:
|  |  John Baldwin writes:
|  |  | On Tuesday, April 03, 2012 12:37:50 pm Doug Ambrisko wrote:
|  |  |   John Baldwin writes:
|  |  |   | On Monday, April 02, 2012 7:27:13 pm Doug Ambrisko wrote:
|  |  |   |   Doug Ambrisko writes:
|  |  |   |   | John Baldwin writes:
|  |  |   |   | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote:
|  |  |   |   | |   Sean Bruno writes:
|  |  |   |   | |   | Noting a failure to attach to the onboard IPMI controller with this dell
|  |  |   |   | |   | R815.  Not sure what to start poking at and thought I'd throw this over
|  |  |   |   | |   | here for comment.
|  |  |   |   | |   |
|  |  |   |   | |   | -bash-4.2$ dmesg |grep ipmi
|  |  |   |   | |   | ipmi0: KCS mode found at io 0xca8 on acpi
|  |  |   |   | |   | ipmi1:IPMI System Interface   on isa0
|  |  |   |   | |   | device_attach: ipmi1 attach returned 16
|  |  |   |   | |   | ipmi1:IPMI System Interface   on isa0
|  |  |   |   | |   | device_attach: ipmi1 attach returned 16
|  |  |   |   | |   | ipmi0: Timed out waiting for GET_DEVICE_ID
|  |  |   |   | |
|  |  |   |   | |   I've run into this recently.  A quick hack to fix it is:
|  |  |   |   | |
|  |  |   |   | |   Index: ipmi.c
|  |  |   |   | |
|  [snip]
|  |  | If you use -ct then you get a file you can feed into schedgraph.
|  |  | However, just reading the log, it seems that IRQ 20 keeps preempting
|  |  | the KCS worker thread preventing it from getting anything done.  Also,
|  |  | there seem to be a lot of threads on CPU 0's runqueue waiting for a
|  |  | chance to run (load average of 12 or 13 the entire time).  You can try
|  |  | just bumping up the max timeout from 3 seconds to higher perhaps.  Not
|  |  | sure why IRQ 20 keeps firing though.  It might be related to USB, so
|  |  | you could try fiddling with USB options in the BIOS perhaps, or disabling
|  |  | the USB drivers to see if that fixes IPMI.
|  |
|  |  Tried without USB in kernel:
|  | http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt
|  |
|  |  Hmm, it's still just running constantly (note that the idle thread is
|  |  _never_ scheduled).  The lion's share of the time seems to be spent in
|  |  xpt_thrd.  Note that there are several places where nothing happens except
|  |  that xpt_thrd runs constantly (spinning) during 10's of statclock ticks.  I
|  |  would maybe start debugging that to see what in the world it is doing.  Maybe
|  |  it is polling some hardware down in xpt_action() (i.e., xpt_action() for a
|  |  single bus called down into a driver and it is just spinning using polling
|  |  instead of sleeping and waiting for an interrupt).
|  |
|  | xpt_thrd is a bus scanner thread. It is scheduled by CAM for every bus
|  | on attach and by the controller driver on hot-plug events. For some
|  | controllers it may be quite CPU-hungry, for example legacy ATA
|  | controllers, where a bus reset may take many seconds of hardware polling
|  | while devices are just spinning up. For ahci(4) it was improved about a year
|  | ago to not use polling when possible, but it may still loop for some
|  | time if the controller is not responding on reset. What mfi(4), mentioned in
|  | the log, does during scanning, I am not sure.
| 
|  I thought that mfi(4) could be an issue.  There are some ata controllers
|  with nothing attached.  I built a GENERIC with USB and mfi commented out
|  and then the timeout issue went away:
| ipmi0: KCS mode found at io 0xca8 on acpi
| ipmi1:IPMI System Interface  on isa0
| device_attach: ipmi1 attach returned 16
| ipmi1:IPMI System Interface  on isa0
| device_attach: ipmi1 attach returned 16
| ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1
| ipmi0: DEBUG ipmi_complete_request 527 before wakeup 2211
| ipmi0: DEBUG ipmi_complete_request 529 after wakeup 2272
| ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 2332
| ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0
| 
|  Without mfi and with USB and it had issues:
| ipmi0: KCS mode found at io 0xca8 on acpi
| ipmi1:IPMI System Interface  on isa0
| device_attach: ipmi1 attach returned 16
| ipmi1:IPMI System Interface  on isa0
| device_attach: ipmi1 attach returned 16
| ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2
| ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3137
| ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3199
| ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3259
| ipmi0: Timed out waiting for GET_DEVICE_ID
| ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0
| 
|  I can post more ktrdump traces if needed.  A 1U Dell machine without
|  mfi also has this 

RE: 8.3-PRERELEASE and ATA_CAM

2012-04-06 Thread Dewayne Geraghty
Marius,
Perhaps this mutual exclusivity between ATA_CAM and atapicam and
friends should be mentioned in UPDATING, as I'm sure the same question will
recur.

Thank you for your guidance in resolving the same issue that I had in 9.0
Stable.
Regards, Dewayne.




Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-06 Thread Matt Thyer
On 5 April 2012 01:18, Freddie Cash fjwc...@gmail.com wrote:

 On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote:
  So it seems that both the old and new mps driver have a problem with the
  Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
  6G) controller (flashed with -IT firmware).

 I wouldn't say the driver has a problem with that specific drive.
 More that it might have a problem with a mixed SATA2/SATA3 setup.

 Sorry, that's what I meant to say but it now seems that the 157K
interrupts per second is probably not due to the SuperMicro AOC-USAS2-L8i.

Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no
longer having that disk evicted from the raidz2 pool with write errors and
I thought that the high interrupt rate issue had also been solved but it's
back again.

This is on 8-STABLE at revision 230921 (before the new driver hit 8-STABLE).

So now I need to go back to trying to determine what the cause is.

I'll stop posting in this thread as I don't think it's anything to do with
either the old or new version of this driver.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-06 Thread Matt Thyer
On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote:

 On 5 April 2012 01:18, Freddie Cash fjwc...@gmail.com wrote:

 On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote:
  So it seems that both the old and new mps driver have a problem with the
  Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
  6G) controller (flashed with -IT firmware).

 I wouldn't say the driver has a problem with that specific drive.
 More that it might have a problem with a mixed SATA2/SATA3 setup.

 Sorry, that's what I meant to say but it now seems that the 157K
 interrupts per second is probably not due to the SuperMicro AOC-USAS2-L8i.

 Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no
 longer having that disk evicted from the raidz2 pool with write errors and
 I thought that the high interrupt rate issue had also been solved but it's
 back again.

 This is on 8-STABLE at revision 230921 (before the new driver hit
 8-STABLE).

 So now I need to go back to trying to determine what the cause is.

 I'll stop posting in this thread as I don't think it's anything to do with
 either the old or new version of this driver.


Oops... wrong thread; I thought I was replying on -CURRENT.

So on to the root cause.

vmstat -i has shown that the issue was on irq 16.

Unfortunately there seem to be a lot of things on irq 16:

$ dmesg | grep "irq 16"
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3

Any idea how to isolate which bit of hardware could be triggering the
interrupts?

Unfortunately the only device I could remove would be the SuperMicro
AOC-USAS2-L8i (so yes I could eliminate that).

My biggest problem right now is not knowing how to trigger the issue.

At this stage I'm going to upgrade to 9-STABLE and see if it returns.