Re: ucom/uftdi high interrupt load

2011-06-16 Thread Hans Petter Selasky
On Thursday 16 June 2011 08:58:02 Charles Sprickman wrote:
> On Thu, 16 Jun 2011, Hans Petter Selasky wrote:
> > On Thursday 16 June 2011 02:01:03 Charles Sprickman wrote:
> >> On Tue, 14 Jun 2011, Hans Petter Selasky wrote:
> >>> On Tuesday 14 June 2011 02:58:44 Charles Sprickman wrote:
>  On Mon, 13 Jun 2011, Hans Petter Selasky wrote:
> > On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:
> >> On Sun, 12 Jun 2011, Hans Petter Selasky wrote:
> >>> On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
> > Ok, then those quirks won't help.
> > 
> > For OHCI, I guess you should check vmstat -i.
>  
>  Oddly enough, the box paniced today, but it appeared to be related to
>  fxp. However in the coredump summary, I have "vmstat -i" output, and
>  ohci seems fairly high in comparison to everything else:
>  
>  vmstat -i
>  
>  interrupt  total   rate
>  irq4: uart0  106  0
>  irq10: ohci0   142322001 968176
>  irq14: ata0 1178  8
>  irq20: fxp0  3008691  20467
>  irq21: fxp1  1733357  11791
>  irq28: sym1   30  0
>  irq29: sym0  2624749  17855
>  cpu0: timer7280631004952810
>  cpu1: timer7280446844952684
>  Total 1605797896   10923795
>  
>  Also, just a brief summary of the panic, since it mentions the
>  interrupt
> >>> 
>  process again:
> >>> Hi,
> >>> 
> >>> The OHCI IRQ rate is too high. It should never exceed 1000 IRQ/s. Maybe
> >>> you can build a kernel with "options USB_DEBUG", then run the following
> >>> command and post some of the resulting dmesg:
> >>> 
> >>> sysctl hw.usb.ohci.debug=16 ; sleep 1; sysctl hw.usb.ohci.debug=0
> >> 
> >> Thanks again...  I just booted a kernel with USB_DEBUG and turned the
> >> sysctl on for a bit.  Was quite hard to turn it off though, but it also
> >> looks like time went backward on the machine, so maybe "sleep" never
> >> caught up with itself.  The output is pretty long, so I posted it here:
> >> 
> >> http://pastebin.com/HdnBYk6k  (set to never expire)
> >> 
> >> Another interesting note.  On boot, conserver failed to start for no
> >> reason I could find.  When I initially ran "vmstat -i" before manually
> >> starting conserver, the interrupt rate for ohci was much lower, maybe
> >> 30/S or so.  Starting conserver brought it up to 200-300/S.  Conserver
> >> was running during the debug logging.
> >> 
> >> Also a full dmesg is here:
> >> 
> >> http://pastebin.com/4kEYYNse
> > 
> > The logs look OK. Are you using suspend/resume on this machine?
> 
> Nope.  It's an old 1U server that we just use for utility tasks such as
> the console server.
> 
> I can tell you that during the debug, the box was nearly locked up.
> 
> Could something unusual be happening just due to the sheer number of USB
> to serial adapters involved?  There's 16 on one box, 8 on another.  In
> total, I think 20 are actually in use.

Hi,

See if any warnings pop up when you use a kernel with options USB_DEBUG.

It can be that this is overflowing the OHCI. Are you sure there is enough 
bandwidth on the OHCI to handle 16 concurrent streams?

Have you tested loopback on all 16 ports at the same time?

> 
> Any other information I can provide?
> 

Not at the present moment.

--HPS
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-15 Thread Charles Sprickman

On Thu, 16 Jun 2011, Hans Petter Selasky wrote:


On Thursday 16 June 2011 02:01:03 Charles Sprickman wrote:

On Tue, 14 Jun 2011, Hans Petter Selasky wrote:

On Tuesday 14 June 2011 02:58:44 Charles Sprickman wrote:

On Mon, 13 Jun 2011, Hans Petter Selasky wrote:

On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:

On Sun, 12 Jun 2011, Hans Petter Selasky wrote:

On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:

Ok, then those quirks won't help.

For OHCI, I guess you should check vmstat -i.


Oddly enough, the box paniced today, but it appeared to be related to
fxp. However in the coredump summary, I have "vmstat -i" output, and
ohci seems fairly high in comparison to everything else:

vmstat -i

interrupt  total   rate
irq4: uart0  106  0
irq10: ohci0   142322001 968176
irq14: ata0 1178  8
irq20: fxp0  3008691  20467
irq21: fxp1  1733357  11791
irq28: sym1   30  0
irq29: sym0  2624749  17855
cpu0: timer7280631004952810
cpu1: timer7280446844952684
Total 1605797896   10923795

Also, just a brief summary of the panic, since it mentions the interrupt



process again:

Hi,

The OHCI IRQ rate is too high. It should never exceed 1000 IRQ/s. Maybe
you can build a kernel with "options USB_DEBUG", then run the following
command and post some of the resulting dmesg:

sysctl hw.usb.ohci.debug=16 ; sleep 1; sysctl hw.usb.ohci.debug=0


Thanks again...  I just booted a kernel with USB_DEBUG and turned the
sysctl on for a bit.  Was quite hard to turn it off though, but it also
looks like time went backward on the machine, so maybe "sleep" never
caught up with itself.  The output is pretty long, so I posted it here:

http://pastebin.com/HdnBYk6k  (set to never expire)

Another interesting note.  On boot, conserver failed to start for no
reason I could find.  When I initially ran "vmstat -i" before manually
starting conserver, the interrupt rate for ohci was much lower, maybe 30/S
or so.  Starting conserver brought it up to 200-300/S.  Conserver was
running during the debug logging.

Also a full dmesg is here:

http://pastebin.com/4kEYYNse


The logs look OK. Are you using suspend/resume on this machine?


Nope.  It's an old 1U server that we just use for utility tasks such as 
the console server.


I can tell you that during the debug, the box was nearly locked up.

Could something unusual be happening just due to the sheer number of USB 
to serial adapters involved?  There's 16 on one box, 8 on another.  In 
total, I think 20 are actually in use.


Any other information I can provide?

Thanks,

Charles


--HPS


___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-15 Thread Hans Petter Selasky
On Thursday 16 June 2011 02:01:03 Charles Sprickman wrote:
> On Tue, 14 Jun 2011, Hans Petter Selasky wrote:
> > On Tuesday 14 June 2011 02:58:44 Charles Sprickman wrote:
> >> On Mon, 13 Jun 2011, Hans Petter Selasky wrote:
> >>> On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:
>  On Sun, 12 Jun 2011, Hans Petter Selasky wrote:
> > On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
> >>> Ok, then those quirks won't help.
> >>> 
> >>> For OHCI, I guess you should check vmstat -i.
> >> 
> >> Oddly enough, the box paniced today, but it appeared to be related to
> >> fxp. However in the coredump summary, I have "vmstat -i" output, and
> >> ohci seems fairly high in comparison to everything else:
> >> 
> >> vmstat -i
> >> 
> >> interrupt  total   rate
> >> irq4: uart0  106  0
> >> irq10: ohci0   142322001 968176
> >> irq14: ata0 1178  8
> >> irq20: fxp0  3008691  20467
> >> irq21: fxp1  1733357  11791
> >> irq28: sym1   30  0
> >> irq29: sym0  2624749  17855
> >> cpu0: timer7280631004952810
> >> cpu1: timer7280446844952684
> >> Total 1605797896   10923795
> >> 
> >> Also, just a brief summary of the panic, since it mentions the interrupt
> > 
> >> process again:
> > Hi,
> > 
> > The OHCI IRQ rate is too high. It should never exceed 1000 IRQ/s. Maybe
> > you can build a kernel with "options USB_DEBUG", then run the following
> > command and post some of the resulting dmesg:
> > 
> > sysctl hw.usb.ohci.debug=16 ; sleep 1; sysctl hw.usb.ohci.debug=0
> 
> Thanks again...  I just booted a kernel with USB_DEBUG and turned the
> sysctl on for a bit.  Was quite hard to turn it off though, but it also
> looks like time went backward on the machine, so maybe "sleep" never
> caught up with itself.  The output is pretty long, so I posted it here:
> 
> http://pastebin.com/HdnBYk6k  (set to never expire)
> 
> Another interesting note.  On boot, conserver failed to start for no
> reason I could find.  When I initially ran "vmstat -i" before manually
> starting conserver, the interrupt rate for ohci was much lower, maybe 30/S
> or so.  Starting conserver brought it up to 200-300/S.  Conserver was
> running during the debug logging.
> 
> Also a full dmesg is here:
> 
> http://pastebin.com/4kEYYNse

The logs look OK. Are you using suspend/resume on this machine?

--HPS
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-15 Thread Charles Sprickman

On Tue, 14 Jun 2011, Hans Petter Selasky wrote:


On Tuesday 14 June 2011 02:58:44 Charles Sprickman wrote:

On Mon, 13 Jun 2011, Hans Petter Selasky wrote:

On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:

On Sun, 12 Jun 2011, Hans Petter Selasky wrote:

On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:


Ok, then those quirks won't help.

For OHCI, I guess you should check vmstat -i.


Oddly enough, the box paniced today, but it appeared to be related to fxp.
However in the coredump summary, I have "vmstat -i" output, and ohci seems
fairly high in comparison to everything else:

vmstat -i

interrupt  total   rate
irq4: uart0  106  0
irq10: ohci0   142322001 968176
irq14: ata0 1178  8
irq20: fxp0  3008691  20467
irq21: fxp1  1733357  11791
irq28: sym1   30  0
irq29: sym0  2624749  17855
cpu0: timer7280631004952810
cpu1: timer7280446844952684
Total 1605797896   10923795

Also, just a brief summary of the panic, since it mentions the interrupt
process again:


Hi,

The OHCI IRQ rate is too high. It should never exceed 1000 IRQ/s. Maybe you
can build a kernel with "options USB_DEBUG", then run the following command
and post some of the resulting dmesg:

sysctl hw.usb.ohci.debug=16 ; sleep 1; sysctl hw.usb.ohci.debug=0


Thanks again...  I just booted a kernel with USB_DEBUG and turned the 
sysctl on for a bit.  Was quite hard to turn it off though, but it also 
looks like time went backward on the machine, so maybe "sleep" never 
caught up with itself.  The output is pretty long, so I posted it here:


http://pastebin.com/HdnBYk6k  (set to never expire)

Another interesting note.  On boot, conserver failed to start for no 
reason I could find.  When I initially ran "vmstat -i" before manually 
starting conserver, the interrupt rate for ohci was much lower, maybe 30/S 
or so.  Starting conserver brought it up to 200-300/S.  Conserver was 
running during the debug logging.


Also a full dmesg is here:

http://pastebin.com/4kEYYNse

Thanks,

Charles


--HPS



#7  0x8059139b in fxp_new_rfabuf (sc=0x8564c000, rxp=0x8564c1c0)
 at /usr/src/sys/dev/fxp/if_fxp.c:2611
#8  0x8059285b in fxp_intr (xsc=0x8564c000)
 at /usr/src/sys/dev/fxp/if_fxp.c:1931
#9  0x8067b1db in intr_event_execute_handlers (p=0x8553d7f8,
ie=0x8557d080)
 at /usr/src/sys/kern/kern_intr.c:1220
#10 0x8067c8eb in ithread_loop (arg=0x856525d0)
 at /usr/src/sys/kern/kern_intr.c:1233
#11 0x80678f11 in fork_exit (callout=0x8067c880 ,
 arg=0x856525d0, frame=0xd80e7d38) at /usr/src/sys/kern/kern_fork.c:844
#12 0x80931de0 in fork_trampoline () at
/usr/src/sys/i386/i386/exception.s:270

And also unrelated to usb, but fairly bizarre "netstat -m" output:

18446744073709550887/1355/626/25600 mbuf clusters in use
(current/cache/total/max)
18014398509480560K/3497K/2073K bytes allocated to network
(current/cache/total)

Sorry for all the extra noise, but I'm not adept enough at determining
whether this panic was usb related or fxp related.

Thanks,

Charles


--HPS



___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-14 Thread Hans Petter Selasky
On Tuesday 14 June 2011 02:58:44 Charles Sprickman wrote:
> On Mon, 13 Jun 2011, Hans Petter Selasky wrote:
> > On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:
> >> On Sun, 12 Jun 2011, Hans Petter Selasky wrote:
> >>> On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
> > 
> > Ok, then those quirks won't help.
> > 
> > For OHCI, I guess you should check vmstat -i.
> 
> Oddly enough, the box paniced today, but it appeared to be related to fxp.
> However in the coredump summary, I have "vmstat -i" output, and ohci seems
> fairly high in comparison to everything else:
> 
> vmstat -i
> 
> interrupt  total   rate
> irq4: uart0  106  0
> irq10: ohci0   142322001 968176
> irq14: ata0 1178  8
> irq20: fxp0  3008691  20467
> irq21: fxp1  1733357  11791
> irq28: sym1   30  0
> irq29: sym0  2624749  17855
> cpu0: timer7280631004952810
> cpu1: timer7280446844952684
> Total 1605797896   10923795
> 
> Also, just a brief summary of the panic, since it mentions the interrupt
> process again:

Hi,

The OHCI IRQ rate is too high. It should never exceed 1000 IRQ/s. Maybe you 
can build a kernel with "options USB_DEBUG", then run the following command 
and post some of the resulting dmesg:

sysctl hw.usb.ohci.debug=16 ; sleep 1; sysctl hw.usb.ohci.debug=0

--HPS

> 
> #7  0x8059139b in fxp_new_rfabuf (sc=0x8564c000, rxp=0x8564c1c0)
>  at /usr/src/sys/dev/fxp/if_fxp.c:2611
> #8  0x8059285b in fxp_intr (xsc=0x8564c000)
>  at /usr/src/sys/dev/fxp/if_fxp.c:1931
> #9  0x8067b1db in intr_event_execute_handlers (p=0x8553d7f8,
> ie=0x8557d080)
>  at /usr/src/sys/kern/kern_intr.c:1220
> #10 0x8067c8eb in ithread_loop (arg=0x856525d0)
>  at /usr/src/sys/kern/kern_intr.c:1233
> #11 0x80678f11 in fork_exit (callout=0x8067c880 ,
>  arg=0x856525d0, frame=0xd80e7d38) at /usr/src/sys/kern/kern_fork.c:844
> #12 0x80931de0 in fork_trampoline () at
> /usr/src/sys/i386/i386/exception.s:270
> 
> And also unrelated to usb, but fairly bizarre "netstat -m" output:
> 
> 18446744073709550887/1355/626/25600 mbuf clusters in use
> (current/cache/total/max)
> 18014398509480560K/3497K/2073K bytes allocated to network
> (current/cache/total)
> 
> Sorry for all the extra noise, but I'm not adept enough at determining
> whether this panic was usb related or fxp related.
> 
> Thanks,
> 
> Charles
> 
> > --HPS
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-13 Thread Charles Sprickman

On Mon, 13 Jun 2011, Hans Petter Selasky wrote:


On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:

On Sun, 12 Jun 2011, Hans Petter Selasky wrote:

On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:

Hello,

We ran into an odd problem last week with our serial consoles after
moving the USB to serial adapters from an old 4.11 box to a box running
8.1.  We have two boxes that incorporate (I assume) hubs and a bunch of
FTDI serial interfaces.  One has 16 ports, the other 8.  Each is
plugged directly into a USB port on the rear of the mainboard.  We run
conserver[1] to handle access to the serial ports.  From what I've
observed, this application opens the ports when the daemon starts - it
logs any output (handy for panics, or anything else that might spit
interesting info to the console) and waits for clients to connect to
it.

Everything had been working fine for a few weeks.  The box was rebooted
recently to enable PostgreSQL to start normally (bumped SHM stuff in
loader.conf).  After six days, we found that the consoles were
unresponsive.  Restarting conserver brought us this each time we
connected to a console for full read/write access:

[Thu Jun  9 10:04:59 2011] conserver (50113): ERROR: [h22]
open(/dev/ttyU4): Interrupted system call: forcing down
[Thu Jun  9 10:04:59 2011] conserver (50112): ERROR: [h21]
open(/dev/ttyU11): Interrupted system call: forcing down

All devices still appeared in /dev.  Stopping conserver and confirming
it and all child processes were gone and then using picocom and cu
yielded no response on the serial ports.

We also found (after the fact) that around the time the consoles became
unresponsive, cpu usage went to nearly 90% and was mostly in the kernel
process "intr":

root   12 70.5  0.0 0   136  ??  WL   Fri12AM 120:01.47  [intr]

A graph showing cpu usage (red is "system"):
http://i.imgur.com/0yO5l.png

I should note that we know the cpu spike and devices becoming
unresponsive can be correlated because one of the serial ports runs a
temperature monitor which is tied into our monitoring.  When the data
goes stale, we get notified.

Issuing a "usbconfig -u 0 reset" caused all devices except for the root
hub to disappear and not come back.  CPU usage also dipped a bit after
that.  Rebooting was the only way to resolve the issue - perhaps
plugging and unplugging would have worked, but that's a bit too complex
for our remote hands.

I can supply full dmesg and more, but for now, here's a summary of the
usb info from dmesg:

FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010

ohci0:  mem 0xfe9fc000-0xfe9fcfff irq 10
at device 15.2 on pci0
ohci0: [ITHREAD]
...
usbus0:  on ohci0
usbus0: 12Mbps Full Speed USB v1.0
ugen0.1: <(0x1166)> at usbus0
...
uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on
usbus0
uhub0: 4 ports with 4 removable, self powered
ugen0.2:  at usbus0
uhub1:  on
usbus0 uhub1: 7 ports with 7 removable, self powered
ugen0.3:  at usbus0
uftdi0:  on usbus0
uftdi1:  on usbus0
ugen0.4:  at usbus0
uftdi2:  on usbus0
uftdi3:  on usbus0
ugen0.5:  at usbus0
uftdi4:  on usbus0
uftdi5:  on usbus0
ugen0.6:  at usbus0
uftdi6:  on usbus0
uftdi7:  on usbus0
ugen0.7:  at usbus0
uftdi8:  on usbus0
uftdi9:  on usbus0
ugen0.8:  at usbus0
uftdi10:  on usbus0
uftdi11:  on usbus0
ugen0.9:  at usbus0
uhub2:  on
usbus0 uhub2: 4 ports with 4 removable, self powered
ugen0.10:  at usbus0
uftdi12:  on usbus0
uftdi13:  on usbus0
ugen0.11:  at usbus0
... (mangling below is as it appears in dmesg)
da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14:  on usbus0
da1:  Fixed Direct Access SCSI-3 device
uftdi15:  on usbus0
...
Root mount waiting for: usbus0
ugen0.12:  at usbus0
uhub3: 
on usbus0
Root mount waiting for: usbus0
uhub3: 7 ports with 7 removable, self powered
ugen0.13:  at usbus0
uftdi16:  on usbus0
Root mount waiting for: usbus0
ugen0.14:  at usbus0
uftdi17:  on usbus0
Root mount waiting for: usbus0
ugen0.15:  at usbus0
uftdi18:  on usbus0

ugen0.16:  at usbus0Root mount waiting for:
  usbus0

uftdi19:  on usbus0
ugen0.17:  at usbus0
uftdi20:  on usbus0
Root mount waiting for: usbus0
ugen0.18:  at usbus0
uftdi21:  on usbus0
Root mount waiting for: usbus0
ugen0.19:  at usbus0
uhub4: 
on usbus0
uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus0
ugen0.20:  at usbus0
uftdi22:  on usbus0
Root mount waiting for: usbus0
ugen0.21:  at usbus0
uftdi23:  on usbus0
Trying to mount root from zfs:zroot

Thanks,

Charles


Hi,

Try to get output from vmstat -i.

Also try to set the:

hw.usb.ehci.iaadbug=1

and

hw.usb.ehci.lostintrbug=1

in /boot/loader.conf


Quick question - this host only has USB 1.1 - so I don't think I have
ehci, do I?  To be honest, I'm not totally clear on what's what as far as
uhci, ehci, and ohci.  This is what's seen in sysctl output related to
USB:

[spork@h12 ~]$ sysctl -a|grep usb|more
hw.pci.usb_early_takeover: 1
hw.usb.no_boot_wait: 0
hw.usb.debug: 0
hw.usb.usb_lang_mask: 255
hw.usb.usb_lang_id: 9
hw.usb.te

Re: ucom/uftdi high interrupt load [SEC=UNCLASSIFIED]

2011-06-13 Thread Hans Petter Selasky
On Monday 13 June 2011 05:18:04 Wilkinson, Alex wrote:
> 0n Sun, Jun 12, 2011 at 10:45:44PM +0200, Hans Petter Selasky wrote:
> >Also try to set the:
> >
> >hw.usb.ehci.iaadbug=1
> 
> erm, what is meant by:
> 
>#sysctl -d hw.usb.ehci.iaadbug
> hw.usb.ehci.iaadbug: Enable doorbell bug workaround
> 
> doorbell ?
> 
>-Alex

Doorbell is an attribute for a DMA schedule synchronization mechanism.

--HPS
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-13 Thread Hans Petter Selasky
On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:
> On Sun, 12 Jun 2011, Hans Petter Selasky wrote:
> > On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
> >> Hello,
> >> 
> >> We ran into an odd problem last week with our serial consoles after
> >> moving the USB to serial adapters from an old 4.11 box to a box running
> >> 8.1.  We have two boxes that incorporate (I assume) hubs and a bunch of
> >> FTDI serial interfaces.  One has 16 ports, the other 8.  Each is
> >> plugged directly into a USB port on the rear of the mainboard.  We run
> >> conserver[1] to handle access to the serial ports.  From what I've
> >> observed, this application opens the ports when the daemon starts - it
> >> logs any output (handy for panics, or anything else that might spit
> >> interesting info to the console) and waits for clients to connect to
> >> it.
> >> 
> >> Everything had been working fine for a few weeks.  The box was rebooted
> >> recently to enable PostgreSQL to start normally (bumped SHM stuff in
> >> loader.conf).  After six days, we found that the consoles were
> >> unresponsive.  Restarting conserver brought us this each time we
> >> connected to a console for full read/write access:
> >> 
> >> [Thu Jun  9 10:04:59 2011] conserver (50113): ERROR: [h22]
> >> open(/dev/ttyU4): Interrupted system call: forcing down
> >> [Thu Jun  9 10:04:59 2011] conserver (50112): ERROR: [h21]
> >> open(/dev/ttyU11): Interrupted system call: forcing down
> >> 
> >> All devices still appeared in /dev.  Stopping conserver and confirming
> >> it and all child processes were gone and then using picocom and cu
> >> yielded no response on the serial ports.
> >> 
> >> We also found (after the fact) that around the time the consoles became
> >> unresponsive, cpu usage went to nearly 90% and was mostly in the kernel
> >> process "intr":
> >> 
> >> root   12 70.5  0.0 0   136  ??  WL   Fri12AM 120:01.47  [intr]
> >> 
> >> A graph showing cpu usage (red is "system"):
> >> http://i.imgur.com/0yO5l.png
> >> 
> >> I should note that we know the cpu spike and devices becoming
> >> unresponsive can be correlated because one of the serial ports runs a
> >> temperature monitor which is tied into our monitoring.  When the data
> >> goes stale, we get notified.
> >> 
> >> Issuing a "usbconfig -u 0 reset" caused all devices except for the root
> >> hub to disappear and not come back.  CPU usage also dipped a bit after
> >> that.  Rebooting was the only way to resolve the issue - perhaps
> >> plugging and unplugging would have worked, but that's a bit too complex
> >> for our remote hands.
> >> 
> >> I can supply full dmesg and more, but for now, here's a summary of the
> >> usb info from dmesg:
> >> 
> >> FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010
> >> 
> >> ohci0:  mem 0xfe9fc000-0xfe9fcfff irq 10
> >> at device 15.2 on pci0
> >> ohci0: [ITHREAD]
> >> ...
> >> usbus0:  on ohci0
> >> usbus0: 12Mbps Full Speed USB v1.0
> >> ugen0.1: <(0x1166)> at usbus0
> >> ...
> >> uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on
> >> usbus0
> >> uhub0: 4 ports with 4 removable, self powered
> >> ugen0.2:  at usbus0
> >> uhub1:  on
> >> usbus0 uhub1: 7 ports with 7 removable, self powered
> >> ugen0.3:  at usbus0
> >> uftdi0:  on usbus0
> >> uftdi1:  on usbus0
> >> ugen0.4:  at usbus0
> >> uftdi2:  on usbus0
> >> uftdi3:  on usbus0
> >> ugen0.5:  at usbus0
> >> uftdi4:  on usbus0
> >> uftdi5:  on usbus0
> >> ugen0.6:  at usbus0
> >> uftdi6:  on usbus0
> >> uftdi7:  on usbus0
> >> ugen0.7:  at usbus0
> >> uftdi8:  on usbus0
> >> uftdi9:  on usbus0
> >> ugen0.8:  at usbus0
> >> uftdi10:  on usbus0
> >> uftdi11:  on usbus0
> >> ugen0.9:  at usbus0
> >> uhub2:  on
> >> usbus0 uhub2: 4 ports with 4 removable, self powered
> >> ugen0.10:  at usbus0
> >> uftdi12:  on usbus0
> >> uftdi13:  on usbus0
> >> ugen0.11:  at usbus0
> >> ... (mangling below is as it appears in dmesg)
> >> da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14:  >> ADAPTER> on usbus0
> >> da1:  Fixed Direct Access SCSI-3 device
> >> uftdi15:  on usbus0
> >> ...
> >> Root mount waiting for: usbus0
> >> ugen0.12:  at usbus0
> >> uhub3: 
> >> on usbus0
> >> Root mount waiting for: usbus0
> >> uhub3: 7 ports with 7 removable, self powered
> >> ugen0.13:  at usbus0
> >> uftdi16:  on usbus0
> >> Root mount waiting for: usbus0
> >> ugen0.14:  at usbus0
> >> uftdi17:  on usbus0
> >> Root mount waiting for: usbus0
> >> ugen0.15:  at usbus0
> >> uftdi18:  on usbus0
> >> 
> >> ugen0.16:  at usbus0Root mount waiting for:
> >>   usbus0
> >> 
> >> uftdi19:  on usbus0
> >> ugen0.17:  at usbus0
> >> uftdi20:  on usbus0
> >> Root mount waiting for: usbus0
> >> ugen0.18:  at usbus0
> >> uftdi21:  on usbus0
> >> Root mount waiting for: usbus0
> >> ugen0.19:  at usbus0
> >> uhub4: 
> >> on usbus0
> >> uhub4: 4 ports with 4 removable, self powered
> >> Root mount waiting for: usbus0
> >> ugen0.20:  at usbus0
> >> uftdi22:  on usbus0
> >> Root mount waiting for: usbus0
> >> uge

Re: ucom/uftdi high interrupt load [SEC=UNCLASSIFIED]

2011-06-12 Thread Wilkinson, Alex

0n Sun, Jun 12, 2011 at 10:45:44PM +0200, Hans Petter Selasky wrote: 

>Also try to set the:
>
>hw.usb.ehci.iaadbug=1

erm, what is meant by:

   #sysctl -d hw.usb.ehci.iaadbug
hw.usb.ehci.iaadbug: Enable doorbell bug workaround

doorbell ?

   -Alex


IMPORTANT: This email remains the property of the Department of Defence and is 
subject to the jurisdiction of section 70 of the Crimes Act 1914. If you have 
received this email in error, you are requested to contact the sender and 
delete the email.
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


Re: ucom/uftdi high interrupt load

2011-06-12 Thread Charles Sprickman

On Sun, 12 Jun 2011, Hans Petter Selasky wrote:


On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:

Hello,

We ran into an odd problem last week with our serial consoles after moving
the USB to serial adapters from an old 4.11 box to a box running 8.1.  We
have two boxes that incorporate (I assume) hubs and a bunch of FTDI serial
interfaces.  One has 16 ports, the other 8.  Each is plugged directly into
a USB port on the rear of the mainboard.  We run conserver[1] to handle
access to the serial ports.  From what I've observed, this application
opens the ports when the daemon starts - it logs any output (handy for
panics, or anything else that might spit interesting info to the console)
and waits for clients to connect to it.

Everything had been working fine for a few weeks.  The box was rebooted
recently to enable PostgreSQL to start normally (bumped SHM stuff in
loader.conf).  After six days, we found that the consoles were
unresponsive.  Restarting conserver brought us this each time we
connected to a console for full read/write access:

[Thu Jun  9 10:04:59 2011] conserver (50113): ERROR: [h22]
open(/dev/ttyU4): Interrupted system call: forcing down
[Thu Jun  9 10:04:59 2011] conserver (50112): ERROR: [h21]
open(/dev/ttyU11): Interrupted system call: forcing down

All devices still appeared in /dev.  Stopping conserver and confirming it
and all child processes were gone and then using picocom and cu yielded no
response on the serial ports.

We also found (after the fact) that around the time the consoles became
unresponsive, cpu usage went to nearly 90% and was mostly in the kernel
process "intr":

root   12 70.5  0.0 0   136  ??  WL   Fri12AM 120:01.47  [intr]

A graph showing cpu usage (red is "system"):
http://i.imgur.com/0yO5l.png

I should note that we know the cpu spike and devices becoming unresponsive
can be correlated because one of the serial ports runs a temperature
monitor which is tied into our monitoring.  When the data goes stale, we
get notified.

Issuing a "usbconfig -u 0 reset" caused all devices except for the root
hub to disappear and not come back.  CPU usage also dipped a bit after
that.  Rebooting was the only way to resolve the issue - perhaps plugging
and unplugging would have worked, but that's a bit too complex for our
remote hands.

I can supply full dmesg and more, but for now, here's a summary of the usb
info from dmesg:

FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010

ohci0:  mem 0xfe9fc000-0xfe9fcfff irq 10 at
device 15.2 on pci0
ohci0: [ITHREAD]
...
usbus0:  on ohci0
usbus0: 12Mbps Full Speed USB v1.0
ugen0.1: <(0x1166)> at usbus0
...
uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on
usbus0
uhub0: 4 ports with 4 removable, self powered
ugen0.2:  at usbus0
uhub1:  on usbus0
uhub1: 7 ports with 7 removable, self powered
ugen0.3:  at usbus0
uftdi0:  on usbus0
uftdi1:  on usbus0
ugen0.4:  at usbus0
uftdi2:  on usbus0
uftdi3:  on usbus0
ugen0.5:  at usbus0
uftdi4:  on usbus0
uftdi5:  on usbus0
ugen0.6:  at usbus0
uftdi6:  on usbus0
uftdi7:  on usbus0
ugen0.7:  at usbus0
uftdi8:  on usbus0
uftdi9:  on usbus0
ugen0.8:  at usbus0
uftdi10:  on usbus0
uftdi11:  on usbus0
ugen0.9:  at usbus0
uhub2:  on usbus0
uhub2: 4 ports with 4 removable, self powered
ugen0.10:  at usbus0
uftdi12:  on usbus0
uftdi13:  on usbus0
ugen0.11:  at usbus0
... (mangling below is as it appears in dmesg)
da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14: 
on usbus0
da1:  Fixed Direct Access SCSI-3 device uftdi15:
 on usbus0
...
Root mount waiting for: usbus0
ugen0.12:  at usbus0
uhub3: 
on usbus0
Root mount waiting for: usbus0
uhub3: 7 ports with 7 removable, self powered
ugen0.13:  at usbus0
uftdi16:  on usbus0
Root mount waiting for: usbus0
ugen0.14:  at usbus0
uftdi17:  on usbus0
Root mount waiting for: usbus0
ugen0.15:  at usbus0
uftdi18:  on usbus0
ugen0.16:  at usbus0Root mount waiting for:
  usbus0
uftdi19:  on usbus0
ugen0.17:  at usbus0
uftdi20:  on usbus0
Root mount waiting for: usbus0
ugen0.18:  at usbus0
uftdi21:  on usbus0
Root mount waiting for: usbus0
ugen0.19:  at usbus0
uhub4: 
on usbus0
uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus0
ugen0.20:  at usbus0
uftdi22:  on usbus0
Root mount waiting for: usbus0
ugen0.21:  at usbus0
uftdi23:  on usbus0
Trying to mount root from zfs:zroot

Thanks,

Charles


Hi,

Try to get output from vmstat -i.

Also try to set the:

hw.usb.ehci.iaadbug=1

and

hw.usb.ehci.lostintrbug=1

in /boot/loader.conf


Quick question - this host only has USB 1.1 - so I don't think I have 
ehci, do I?  To be honest, I'm not totally clear on what's what as far as 
uhci, ehci, and ohci.  This is what's seen in sysctl output related to 
USB:


[spork@h12 ~]$ sysctl -a|grep usb|more
hw.pci.usb_early_takeover: 1
hw.usb.no_boot_wait: 0
hw.usb.debug: 0
hw.usb.usb_lang_mask: 255
hw.usb.usb_lang_id: 9
hw.usb.template: 0
hw.usb.power_timeout: 30
hw.usb.ucom.cons_baud: 9600
hw.usb.ucom.cons_unit: -1
dev.usbus.0.%des

Re: ucom/uftdi high interrupt load

2011-06-12 Thread Hans Petter Selasky
On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
> Hello,
> 
> We ran into an odd problem last week with our serial consoles after moving
> the USB to serial adapters from an old 4.11 box to a box running 8.1.  We
> have two boxes that incorporate (I assume) hubs and a bunch of FTDI serial
> interfaces.  One has 16 ports, the other 8.  Each is plugged directly into
> a USB port on the rear of the mainboard.  We run conserver[1] to handle
> access to the serial ports.  From what I've observed, this application
> opens the ports when the daemon starts - it logs any output (handy for
> panics, or anything else that might spit interesting info to the console)
> and waits for clients to connect to it.
> 
> Everything had been working fine for a few weeks.  The box was rebooted
> recently to enable PostgreSQL to start normally (bumped SHM stuff in
> loader.conf).  After six days, we found that the consoles were
> unresponsive.  Restarting conserver brought us this each time we
> connected to a console for full read/write access:
> 
> [Thu Jun  9 10:04:59 2011] conserver (50113): ERROR: [h22]
> open(/dev/ttyU4): Interrupted system call: forcing down
> [Thu Jun  9 10:04:59 2011] conserver (50112): ERROR: [h21]
> open(/dev/ttyU11): Interrupted system call: forcing down
> 
> All devices still appeared in /dev.  Stopping conserver and confirming it
> and all child processes were gone and then using picocom and cu yielded no
> response on the serial ports.
> 
> We also found (after the fact) that around the time the consoles became
> unresponsive, cpu usage went to nearly 90% and was mostly in the kernel
> process "intr":
> 
> root   12 70.5  0.0 0   136  ??  WL   Fri12AM 120:01.47  [intr]
> 
> A graph showing cpu usage (red is "system"):
> http://i.imgur.com/0yO5l.png
> 
> I should note that we know the cpu spike and devices becoming unresponsive
> can be correlated because one of the serial ports runs a temperature
> monitor which is tied into our monitoring.  When the data goes stale, we
> get notified.
> 
> Issuing a "usbconfig -u 0 reset" caused all devices except for the root
> hub to disappear and not come back.  CPU usage also dipped a bit after
> that.  Rebooting was the only way to resolve the issue - perhaps plugging
> and unplugging would have worked, but that's a bit too complex for our
> remote hands.
> 
> I can supply full dmesg and more, but for now, here's a summary of the usb
> info from dmesg:
> 
> FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010
> 
> ohci0:  mem 0xfe9fc000-0xfe9fcfff irq 10 at
> device 15.2 on pci0
> ohci0: [ITHREAD]
> ...
> usbus0:  on ohci0
> usbus0: 12Mbps Full Speed USB v1.0
> ugen0.1: <(0x1166)> at usbus0
> ...
> uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on
> usbus0
> uhub0: 4 ports with 4 removable, self powered
> ugen0.2:  at usbus0
> uhub1:  on usbus0
> uhub1: 7 ports with 7 removable, self powered
> ugen0.3:  at usbus0
> uftdi0:  on usbus0
> uftdi1:  on usbus0
> ugen0.4:  at usbus0
> uftdi2:  on usbus0
> uftdi3:  on usbus0
> ugen0.5:  at usbus0
> uftdi4:  on usbus0
> uftdi5:  on usbus0
> ugen0.6:  at usbus0
> uftdi6:  on usbus0
> uftdi7:  on usbus0
> ugen0.7:  at usbus0
> uftdi8:  on usbus0
> uftdi9:  on usbus0
> ugen0.8:  at usbus0
> uftdi10:  on usbus0
> uftdi11:  on usbus0
> ugen0.9:  at usbus0
> uhub2:  on usbus0
> uhub2: 4 ports with 4 removable, self powered
> ugen0.10:  at usbus0
> uftdi12:  on usbus0
> uftdi13:  on usbus0
> ugen0.11:  at usbus0
> ... (mangling below is as it appears in dmesg)
> da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14: 
> on usbus0
> da1:  Fixed Direct Access SCSI-3 device uftdi15:
>  on usbus0
> ...
> Root mount waiting for: usbus0
> ugen0.12:  at usbus0
> uhub3: 
> on usbus0
> Root mount waiting for: usbus0
> uhub3: 7 ports with 7 removable, self powered
> ugen0.13:  at usbus0
> uftdi16:  on usbus0
> Root mount waiting for: usbus0
> ugen0.14:  at usbus0
> uftdi17:  on usbus0
> Root mount waiting for: usbus0
> ugen0.15:  at usbus0
> uftdi18:  on usbus0
> ugen0.16:  at usbus0Root mount waiting for:
>   usbus0
> uftdi19:  on usbus0
> ugen0.17:  at usbus0
> uftdi20:  on usbus0
> Root mount waiting for: usbus0
> ugen0.18:  at usbus0
> uftdi21:  on usbus0
> Root mount waiting for: usbus0
> ugen0.19:  at usbus0
> uhub4: 
> on usbus0
> uhub4: 4 ports with 4 removable, self powered
> Root mount waiting for: usbus0
> ugen0.20:  at usbus0
> uftdi22:  on usbus0
> Root mount waiting for: usbus0
> ugen0.21:  at usbus0
> uftdi23:  on usbus0
> Trying to mount root from zfs:zroot
> 
> Thanks,
> 
> Charles

Hi,

Try to get output from vmstat -i.

Also try to set the:

hw.usb.ehci.iaadbug=1

and

hw.usb.ehci.lostintrbug=1

in /boot/loader.conf

--HPS
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"


ucom/uftdi high interrupt load

2011-06-11 Thread Charles Sprickman

Hello,

We ran into an odd problem last week with our serial consoles after moving 
the USB to serial adapters from an old 4.11 box to a box running 8.1.  We 
have two boxes that incorporate (I assume) hubs and a bunch of FTDI serial 
interfaces.  One has 16 ports, the other 8.  Each is plugged directly into 
a USB port on the rear of the mainboard.  We run conserver[1] to handle 
access to the serial ports.  From what I've observed, this application 
opens the ports when the daemon starts - it logs any output (handy for 
panics, or anything else that might spit interesting info to the console) 
and waits for clients to connect to it.


Everything had been working fine for a few weeks.  The box was rebooted 
recently to enable PostgreSQL to start normally (bumped SHM stuff in 
loader.conf).  After six days, we found that the consoles were 
unresponsive.  Restarting conserver brought us this each time we 
connected to a console for full read/write access:


[Thu Jun  9 10:04:59 2011] conserver (50113): ERROR: [h22] 
open(/dev/ttyU4): Interrupted system call: forcing down
[Thu Jun  9 10:04:59 2011] conserver (50112): ERROR: [h21] 
open(/dev/ttyU11): Interrupted system call: forcing down


All devices still appeared in /dev.  Stopping conserver and confirming it 
and all child processes were gone and then using picocom and cu yielded no 
response on the serial ports.


We also found (after the fact) that around the time the consoles became 
unresponsive, cpu usage went to nearly 90% and was mostly in the kernel 
process "intr":


root   12 70.5  0.0 0   136  ??  WL   Fri12AM 120:01.47  [intr]

A graph showing cpu usage (red is "system"):
http://i.imgur.com/0yO5l.png

I should note that we know the cpu spike and devices becoming unresponsive 
can be correlated because one of the serial ports runs a temperature 
monitor which is tied into our monitoring.  When the data goes stale, we 
get notified.


Issuing a "usbconfig -u 0 reset" caused all devices except for the root 
hub to disappear and not come back.  CPU usage also dipped a bit after 
that.  Rebooting was the only way to resolve the issue - perhaps plugging 
and unplugging would have worked, but that's a bit too complex for our 
remote hands.


I can supply full dmesg and more, but for now, here's a summary of the usb 
info from dmesg:


FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010

ohci0:  mem 0xfe9fc000-0xfe9fcfff irq 10 at 
device 15.2 on pci0

ohci0: [ITHREAD]
...
usbus0:  on ohci0
usbus0: 12Mbps Full Speed USB v1.0
ugen0.1: <(0x1166)> at usbus0
...
uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on 
usbus0

uhub0: 4 ports with 4 removable, self powered
ugen0.2:  at usbus0
uhub1:  on usbus0
uhub1: 7 ports with 7 removable, self powered
ugen0.3:  at usbus0
uftdi0:  on usbus0
uftdi1:  on usbus0
ugen0.4:  at usbus0
uftdi2:  on usbus0
uftdi3:  on usbus0
ugen0.5:  at usbus0
uftdi4:  on usbus0
uftdi5:  on usbus0
ugen0.6:  at usbus0
uftdi6:  on usbus0
uftdi7:  on usbus0
ugen0.7:  at usbus0
uftdi8:  on usbus0
uftdi9:  on usbus0
ugen0.8:  at usbus0
uftdi10:  on usbus0
uftdi11:  on usbus0
ugen0.9:  at usbus0
uhub2:  on usbus0
uhub2: 4 ports with 4 removable, self powered
ugen0.10:  at usbus0
uftdi12:  on usbus0
uftdi13:  on usbus0
ugen0.11:  at usbus0
... (mangling below is as it appears in dmesg)
da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14:  
on usbus0
da1:  Fixed Direct Access SCSI-3 device uftdi15: 
 on usbus0 
...

Root mount waiting for: usbus0
ugen0.12:  at usbus0
uhub3:  
on usbus0

Root mount waiting for: usbus0
uhub3: 7 ports with 7 removable, self powered
ugen0.13:  at usbus0
uftdi16:  on usbus0
Root mount waiting for: usbus0
ugen0.14:  at usbus0
uftdi17:  on usbus0
Root mount waiting for: usbus0
ugen0.15:  at usbus0
uftdi18:  on usbus0
ugen0.16:  at usbus0Root mount waiting for:
 usbus0
uftdi19:  on usbus0
ugen0.17:  at usbus0
uftdi20:  on usbus0
Root mount waiting for: usbus0
ugen0.18:  at usbus0
uftdi21:  on usbus0
Root mount waiting for: usbus0
ugen0.19:  at usbus0
uhub4:  
on usbus0

uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus0
ugen0.20:  at usbus0
uftdi22:  on usbus0
Root mount waiting for: usbus0
ugen0.21:  at usbus0
uftdi23:  on usbus0
Trying to mount root from zfs:zroot

Thanks,

Charles

[1] - http://www.conserver.com/
___
freebsd-usb@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"