em0 lock up / hangs (WAS: em0: Watchdog timeout -- resetting)

2011-02-01 Thread Lev Serebryakov
Hello, Eugene.
You wrote on February 1, 2011, 15:38:33:

> Eugene wrote:
>> You could try the netisr parallelism of RELENG_8 instead of POLLING
>> (and tune interrupt throttling) if your box does not have lots of dynamic
>> interfaces, as it would when using mpd.

> Jack wrote:
>> I don't test POLLING, sounds like it's broken, I don't understand
>> why you think you need it?  This hardware supports
>> MSI, why not use it?

>   I am sending one answer to both messages, because the data is the same.

>   Here is a snapshot of "top -S" with "H" pressed while the server sends
> 1Gbit/s via SMB with polling (a Windows 7 client copies an 8GiB sparse file to a very
> fast local disk):
>   the same without polling, with net.isr settings:
> # sysctl net.isr
> net.isr.direct: 0
> net.isr.direct_force: 0
  After applying these settings the server lost its network connection.
 It still works locally and there is no panic, but "ping gateway" reports
 "No buffer space available", and any other network activity fails with
 the same message.

 Bringing the interface down and up again helps.

 I attached outputs of:

 vmstat -m
 netstat -m
 sysctl dev.em0

 BEFORE interface reset

  No polling, net.isr.direct=0, net.isr.direct_force=0

-- 
// Black Lion AKA Lev Serebryakov 

sysctl.dev.em0.log
Description: Binary data


vmstat-m.log
Description: Binary data


netstat-m.log
Description: Binary data
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Lev Serebryakov
Hello, Eugene.
You wrote on February 1, 2011, 16:52:57:

>> = INTR - ISR.DIRECT=1
>> Real speed (according to the Windows 7 report) ~101MiB/s.
>>   I re-created the file between tries to flush caches on both sides.

> netisr queues help to deal with lots of incoming traffic.
> If you only care about outgoing traffic, they won't help.
 This server is a mostly read-only storage server, so I care about
 outgoing traffic.

 And now, after switching polling off and running the experiments, the
 server has dropped off the network: about 30 minutes after the experiments
 it stopped answering pings and other network activity. I'll be near the
 local console only at night, so I can report a panic or anything else then.

-- 
// Black Lion AKA Lev Serebryakov 



Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Eugene Grosbein
On 01.02.2011 18:38, Lev Serebryakov wrote:

> = INTR - ISR.DIRECT=1
> Real speed (according to the Windows 7 report) ~101MiB/s.
> 
>   I re-created the file between tries to flush caches on both sides.
> 

netisr queues help to deal with lots of incoming traffic.
If you only care about outgoing traffic, they won't help.

Eugene Grosbein


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Lev Serebryakov
Hello, Eugene & Jack.
You wrote on February 1, 2011, 11:23:23:

Eugene wrote:
> You could try the netisr parallelism of RELENG_8 instead of POLLING
> (and tune interrupt throttling) if your box does not have lots of dynamic
> interfaces, as it would when using mpd.

Jack wrote:
> I don't test POLLING, sounds like it's broken, I don't understand
> why you think you need it?  This hardware supports
> MSI, why not use it?

  I am sending one answer to both messages, because the data is the same.

  Here is a snapshot of "top -S" with "H" pressed while the server sends
1Gbit/s via SMB with polling (a Windows 7 client copies an 8GiB sparse file to a very
fast local disk):


= POLLING
CPU:  0.5% user,  0.0% nice,  0.6% system,  1.3% interrupt, 98.1% idle
  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root     171 ki31     0K    32K CPU1    1  90.1H 100.00% {idle: cpu1}
   11 root     171 ki31     0K    32K RUN     0  82.1H 100.00% {idle: cpu0}
   12 root     -64    -     0K   304K WAIT    0  33:40   0.68% {irq18: uhci2 ehc}
   12 root     -44    -     0K   304K WAIT    1 225:22   0.00% {swi1: netisr 0}
   14 root     -68    -     0K   528K -       1  16:19   0.00% {usbus3}
   12 root     -40    -     0K   304K WAIT    0  14:25   0.00% {swi2: cambio}
   12 root     -64    -     0K   304K WAIT    1  12:50   0.00% {irq22: ahci0}
    4 root      -8    -     0K    16K -       0  12:26   0.00% g_down
= POLLING
NB: there is no "smbd" process at all in the first 8 positions.
Real speed (according to the Windows 7 report) ~75MiB/s.


  The same without polling, with these net.isr settings:
# sysctl net.isr
net.isr.numthreads: 1
net.isr.maxprot: 16
net.isr.defaultqlimit: 256
net.isr.maxqlimit: 10240
net.isr.bindthreads: 0
net.isr.maxthreads: 1
net.isr.direct: 0
net.isr.direct_force: 0

= INTR - ISR.DIRECT=0
CPU:  3.8% user,  0.0% nice, 26.5% system,  6.6% interrupt, 63.2% idle
  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root     171 ki31     0K    32K RUN     0  82.1H  83.59% {idle: cpu0}
   11 root     171 ki31     0K    32K RUN     1  90.1H  64.06% {idle: cpu1}
33873 root      72    0 28912K  5432K select  0   0:28  34.96% smbd
   12 root     -44    -     0K   304K WAIT    0 225:29   9.18% {swi1: netisr 0}
    0 root     -68    0     0K   128K -       1   0:02   6.30% {em0 taskq}
   12 root     -68    -     0K   304K WAIT    0   0:00   1.56% {irq20: em0 fwohc}
    7 root      44    -     0K    16K psleep  0   3:12   0.39% pagedaemon
   12 root     -64    -     0K   304K WAIT    1  33:41   0.20% {irq18: uhci2 ehc}
   14 root     -68    -     0K   528K -       0  16:19   0.00% {usbus3}
   12 root     -40    -     0K   304K WAIT    0  14:25   0.00% {swi2: cambio}
= INTR - ISR.DIRECT=0
Real speed (according to the Windows 7 report) ~85MiB/s.

  The same without polling, with these net.isr settings:
# sysctl net.isr
net.isr.numthreads: 1
net.isr.maxprot: 16
net.isr.defaultqlimit: 256
net.isr.maxqlimit: 10240
net.isr.bindthreads: 0
net.isr.maxthreads: 1
net.isr.direct: 1
net.isr.direct_force: 1

= INTR - ISR.DIRECT=1
CPU:  2.8% user,  0.0% nice, 30.1% system,  1.7% interrupt, 65.4% idle
  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root     171 ki31     0K    32K RUN     1  90.2H  89.36% {idle: cpu1}
   11 root     171 ki31     0K    32K RUN     0  82.2H  67.87% {idle: cpu0}
33873 root     103    0 28912K  5424K CPU0    0   0:51  33.98% smbd
    0 root     -68    0     0K   128K -       1   0:06  12.70% {em0 taskq}
   12 root     -68    -     0K   304K WAIT    0   0:01   1.66% {irq20: em0 fwohc}
    7 root      45    -     0K    16K psleep  0   3:12   0.78% pagedaemon
   12 root     -64    -     0K   304K WAIT    0  33:42   0.20% {irq18: uhci2 ehc}
   12 root     -44    -     0K   304K WAIT    1 225:33   0.00% {swi1: netisr 0}
   14 root     -68    -     0K   528K -       1  16:20   0.00% {usbus3}
   12 root     -40    -     0K   304K WAIT    0  14:25   0.00% {swi2: cambio}
= INTR - ISR.DIRECT=1
Real speed (according to the Windows 7 report) ~101MiB/s.

  I re-created the file between tries to flush caches on both sides.
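
  To put the three measurements on a common scale, here is a minimal
sketch in Python. It assumes the Windows 7 figures are payload MiB/s
(2^20 bytes per MiB) and ignores Ethernet/IP/TCP/SMB overhead entirely;
the function name is only illustrative.

```python
# Convert the reported copy speeds to Mbit/s and to a share of the
# nominal 1 Gbit/s line rate. Protocol overhead is deliberately ignored.
LINE_RATE_BITS = 1_000_000_000  # nominal gigabit Ethernet, bits per second


def mib_per_s_to_mbit(mib_per_s):
    """Return the rate in Mbit/s (10^6 bits) for a rate given in MiB/s."""
    return mib_per_s * 1024 * 1024 * 8 / 1_000_000


# Figures reported in this message.
measurements = {
    "POLLING": 75,
    "INTR, net.isr.direct=0": 85,
    "INTR, net.isr.direct=1": 101,
}

for mode, mib in measurements.items():
    mbit = mib_per_s_to_mbit(mib)
    share = mbit * 1_000_000 / LINE_RATE_BITS * 100
    print(f"{mode:24s} {mib:4d} MiB/s = {mbit:6.1f} Mbit/s ({share:4.1f}% of line rate)")
```

  Under these assumptions even the fastest run stays below the nominal
line rate, which is expected once protocol overhead is counted.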

-- 
// Black Lion AKA Lev Serebryakov 



Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Damien Fleuriot
We have tried POLLING here on Intel cards attached to the igb driver
(see my post entitled "High interrupt rate on a PF box + performance"
from 27/01/2011).

This broke carp *badly* and we switched back to interrupts.


You say a single thread eats up a full CPU core. Can you post a top
output showing the %interrupt and your smb process's usage?


On 2/1/11 10:28 AM, Jack Vogel wrote:
> I don't test POLLING, sounds like it's broken, I don't understand
> why you think you need it?  This hardware supports
> MSI, why not use it?
> 
> Jack
> 
> 
> 2011/1/31 Lev Serebryakov 
> 
>> Hello, Freebsd-stable.
>> You wrote on February 1, 2011, 10:24:16:
>>
>>>   And all connections are reset. Before the latest commits to the driver
>>> this system panicked in swi_clock. Now it works without panics, but
>>> it seems that the problem is not completely fixed.
>>   I forgot to give one last piece of information: POLLING is in use.
>> Without it, a single-threaded copy from this server via SMB eats one
>> CPU core completely.
>>
>> --
>> // Black Lion AKA Lev Serebryakov 
>>


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Jack Vogel
I don't test POLLING, sounds like it's broken, I don't understand
why you think you need it?  This hardware supports
MSI, why not use it?

Jack


2011/1/31 Lev Serebryakov 

> Hello, Freebsd-stable.
> You wrote on February 1, 2011, 10:24:16:
>
> >   And all connections are reset. Before the latest commits to the driver
> > this system panicked in swi_clock. Now it works without panics, but
> > it seems that the problem is not completely fixed.
>   I forgot to give one last piece of information: POLLING is in use.
> Without it, a single-threaded copy from this server via SMB eats one
> CPU core completely.
>
> --
> // Black Lion AKA Lev Serebryakov 
>


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Eugene Grosbein
On 01.02.2011 13:58, Lev Serebryakov wrote:
> Hello, Freebsd-stable.
> You wrote on February 1, 2011, 10:24:16:
> 
>>   And all connections are reset. Before the latest commits to the driver
>> this system panicked in swi_clock. Now it works without panics, but
>> it seems that the problem is not completely fixed.
>   I forgot to give one last piece of information: POLLING is in use.
> Without it, a single-threaded copy from this server via SMB eats one
> CPU core completely.
> 

You could try the netisr parallelism of RELENG_8 instead of POLLING
(and tune interrupt throttling) if your box does not have lots of dynamic
interfaces, as it would when using mpd.

In /etc/sysctl.conf:

net.isr.direct=0
net.isr.direct_force=0
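
Whether the settings took effect can be checked mechanically from the
`sysctl net.isr` output. A rough sketch in Python that parses
"name: value" lines as sysctl(8) prints them; `parse_sysctl` and
`dispatch_mode` are hypothetical helper names, the sample text is
illustrative, and the reading of net.isr.direct (0 = queued to the netisr
thread, 1 = direct dispatch) is an assumption about RELENG_8 behaviour.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines as printed by sysctl(8) into a dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            settings[name.strip()] = value.strip()
    return settings


def dispatch_mode(settings):
    """Describe netisr dispatch from net.isr.direct (assumed semantics)."""
    value = settings.get("net.isr.direct")
    if value is None:
        return "unknown"
    return "queued" if value == "0" else "direct"


# Illustrative sample in the format printed by `sysctl net.isr`.
sample = """\
net.isr.numthreads: 1
net.isr.maxthreads: 1
net.isr.direct: 0
net.isr.direct_force: 0
"""

settings = parse_sysctl(sample)
print(dispatch_mode(settings), "dispatch,",
      settings["net.isr.maxthreads"], "netisr thread(s)")
```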

Eugene Grosbein.


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Lev Serebryakov
Hello, Freebsd-stable.
You wrote on February 1, 2011, 10:24:16:

>   And all connections are reset. Before the latest commits to the driver
> this system panicked in swi_clock. Now it works without panics, but
> it seems that the problem is not completely fixed.
  I forgot to give one last piece of information: POLLING is in use.
Without it, a single-threaded copy from this server via SMB eats one
CPU core completely.

-- 
// Black Lion AKA Lev Serebryakov 



em0: Watchdog timeout -- resetting

2011-01-31 Thread Lev Serebryakov
Hello, Freebsd-stable.

  The system is 8-STABLE (8.2-PRERELEASE) with the very latest e1000 driver
(cvsupped on 27 Jan; the last commits to e1000 were made on 22 Jan).

  NIC is:

em0:  port 0xdc00-0xdc1f mem 0xfea4-0xfea5,0xfea79000-0xfea79fff irq 20 at device 25.0 on pci0
em0: No MSI/MSIX using a Legacy IRQ
em0: [FILTER]

em0@pci0:0:25:0: class=0x02 card=0x82681043 chip=0x10bd8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfea4, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xfea79000, size 4096, enabled
    bar   [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled

 It is the on-board LAN of an ASUS P5R-VM DO motherboard (Q35 chipset).

 I have these tunables in "/etc/loader.conf":

hw.em.rxd=4096
hw.em.txd=4096


 And these non-standard sysctls:

dev.em.0.rx_int_delay=200
dev.em.0.tx_int_delay=200
dev.em.0.rx_abs_int_delay=4000
dev.em.0.tx_abs_int_delay=4000
dev.em.0.rx_processing_limit=4096

 Several times a day I get messages like this:

em0: Watchdog timeout -- resetting
em0: Queue(0) tdh = 1302, hw tdt = 1265
em0: TX(0) desc avail = 31,Next TX to Clean = 1296

em0: Watchdog timeout -- resetting
em0: Queue(0) tdh = 3999, hw tdt = 3959
em0: TX(0) desc avail = 31,Next TX to Clean = 3990

em0: Watchdog timeout -- resetting
em0: Queue(0) tdh = 1431, hw tdt = 1394
em0: TX(0) desc avail = 31,Next TX to Clean = 1425

  And all connections are reset. Before the latest commits to the driver
this system panicked in swi_clock. Now it works without panics, but
it seems that the problem is not completely fixed.
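
  The numbers in these messages are at least self-consistent. A small
sketch in Python, assuming the TX ring has 4096 descriptors (the hw.em.txd
value above) and that "desc avail" is counted from the hardware tail (tdt)
forward to the next descriptor to clean, modulo the ring size. That
relation is inferred from the logged values, not taken from the driver
source, and the function name is illustrative only.

```python
RING_SIZE = 4096  # assumed: hw.em.txd from /etc/loader.conf


def ring_distance(start, end, size=RING_SIZE):
    """Number of slots from start forward to end on a circular ring."""
    return (end - start) % size


# (tdh, tdt, next_to_clean, logged "desc avail") from the three reports
reports = [
    (1302, 1265, 1296, 31),
    (3999, 3959, 3990, 31),
    (1431, 1394, 1425, 31),
]

for tdh, tdt, next_to_clean, logged_avail in reports:
    avail = ring_distance(tdt, next_to_clean)    # free slots for new packets
    in_use = RING_SIZE - avail                   # slots not yet reclaimed
    past_head = ring_distance(next_to_clean, tdh)  # behind tdh, not yet cleaned
    print(f"avail={avail} (logged {logged_avail}), in use={in_use}, "
          f"behind hardware head but not cleaned={past_head}")
```

  If that reading is right, the ring is almost completely full in all
three reports (31 free slots out of 4096).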

-- 
// Black Lion AKA Lev Serebryakov 



"em0: watchdog timeout -- resetting" revisited

2007-11-02 Thread Ted Strzalkowski
I am having some issues with an em(4) card.  I am getting the "em0:
watchdog timeout -- resetting" error.  The box has a DFI KT600-AL mobo
and an Intel Pro/1000 GT NIC.  I am running 6.2-RELEASE.  I have RMA'd
the card twice already and get the same error.  I've tried different
PCI slots, different switch ports, and different cables with the same
negative results.  I rebuilt the kernel without em support and used
the latest em(4) driver from Intel (v6.6.6) as a module.  I still get
the same error.  The card is connected via a Cat6 patch cable to a 3Com
OfficeConnect unmanaged GigE switch.  The switch shows a 1000bTX
full-duplex link via its LEDs.  Below is some info from the system:

Script started on Fri Nov  2 16:48:32 2007
freenas:~# uname -a

FreeBSD freenas.local 6.2-RELEASE-p8 FreeBSD 6.2-RELEASE-p8 #0: Thu
Nov  1 14:37:18 EDT 2007
[EMAIL PROTECTED]:/usr/obj/freenas/usr/src/sys/FREENAS-i386  i386
freenas:~# dmesg

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE-p8 #0: Thu Nov  1 14:37:18 EDT 2007
[EMAIL PROTECTED]:/usr/obj/freenas/usr/src/sys/FREENAS-i386
MPTable: 
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm)  (1999.79-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x6a0  Stepping = 0
  
Features=0x383fbff
  AMD Features=0xc0400800
real memory  = 536870912 (512 MB)
avail memory = 461340672 (439 MB)
ioapic0: Assuming intbase of 0
ioapic0  irqs 0-23 on motherboard
wlan: mac acl policy registered
kbd1 at kbdmux0
PadLock: No ACE support.
module_register_init: MOD_LOAD (padlock, 0xc0896bb0, 0) error 22
ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
rr174x: RocketRAID 174x controller driver v1.02 (Feb  1 2007 10:51:17)
ACPI-0159: *** Error: AcpiLoadTables: Could not get RSDP, AE_NO_ACPI_TABLES
ACPI-0213: *** Error: AcpiLoadTables: Could not load tables:
AE_NO_ACPI_TABLES
ACPI: table load failed: AE_NO_ACPI_TABLES
cpu0 on motherboard
pcib0:  pcibus 0 on motherboard
pci0:  on pcib0
agp0:  mem
0xd000-0xd7ff at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci1:  at device 0.0 (no driver attached)
em0:  port
0xd000-0xd03f mem 0xe31a-0xe31b,0xe318-0xe319 irq 16
at device 13.0 on pci0
em0: Ethernet address: 00:1b:21:0a:0d:5a
em0: [FAST]
atapci0:  port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xd400-0xd40f at device 15.0 on
pci0
ata0:  on atapci0
ata1:  on atapci0
uhci0:  port 0xd800-0xd81f irq 21 at device
16.0 on pci0
uhci0: [GIANT-LOCKED]
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1:  port 0xdc00-0xdc1f irq 21 at device
16.1 on pci0
uhci1: [GIANT-LOCKED]
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2:  port 0xe000-0xe01f irq 21 at device
16.2 on pci0
uhci2: [GIANT-LOCKED]
usb2:  on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3:  port 0xe400-0xe41f irq 21 at device
16.3 on pci0
uhci3: [GIANT-LOCKED]
usb3:  on uhci3
usb3: USB revision 1.0
uhub3: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0:  mem 0xe31c-0xe31c00ff irq
21 at device 16.4 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4:  on ehci0
usb4: USB revision 2.0
uhub4: VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
isab0:  at device 17.0 on pci0
isa0:  on isab0
rr174x0:  port 0xe800-0xe8ff mem 0xe300-0xe30f irq 19
at device 20.0 on pci0
rr174x: adapter at PCI 0:20:0, IRQ 19
pmtimer0 on isa0
orm0:  at iomem
0xc-0xcf7ff,0xd-0xd0fff,0xd1000-0xd8fff on isa0
atkbdc0:  at port 0x60,0x64 on isa0
atkbd0:  irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0:  at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0:  on ppc0
ppi0:  on ppbus0
sc0:  at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown:  can't assign resources (port)
speaker0:  at port 0x61 on isa0
unknown:  can't assign resources (memory)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
Timecounter "TSC" frequency 1999790331 Hz quality 800
Timecounters tick every 10.000 msec
rr174x: sta

Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-16 Thread Frode Nordahl

Hello,

Just wanted to send a me too on this issue. Whenever it happens I
can see our Cisco switch reporting the interface going down and up as
well (Line Protocol).



FreeBSD localhost.localdomain 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE
#1: Wed Sep 13 00:10:04 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PT  i386


em0: flags=8843 mtu 1500
options=b
media: Ethernet autoselect (1000baseTX )
status: active

[EMAIL PROTECTED]:11:0:  class=0x02 card=0x10048086 chip=0x10048086  
rev=0x02 hdr=0x00

vendor   = 'Intel Corporation'
device   = '82543GC Gigabit Ethernet Controller (Copper)'
class= network
subclass = ethernet

(This is an add-in 64-bit PCI card.)

I am stress-testing -STABLE on a spare server to help make 6.2 as
bug-free as possible.

It is set up as an NFS server with two Linux NFS clients connected
that are concurrently extracting 5 copies of /usr/src to it and
running a program that creates millions of files with random UIDs to
test for QUOTA issues.

On the server I repeatedly dump the exported filesystem with snapshot
and cache enabled. (dump -L -C 32 -af /dev/null ...)

I'm building today's -STABLE on a different server with SMP and two
onboard em NICs, and will start similar tests on it to see if I can
reproduce the watchdog timeouts there as well.


--
Frode Nordahl





Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Jack Vogel

On 9/15/06, Martin Nilsson <[EMAIL PROTECTED]> wrote:

I'm also seeing these on a Supermicro PDSMi board with a recent stable.
Please tell me what debugging info is needed to fix this.

/Martin


FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Sun Sep 10
17:43:15 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP  amd64

lspci -v output:

04:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet
Controller (Copper) (rev 03)
 Subsystem: Super Micro Computer Inc Unknown device 108c
 Flags: bus master, fast devsel, latency 0, IRQ 16
 Memory at ed20 (32-bit, non-prefetchable)
 I/O ports at 4000
 Capabilities: [c8] Power Management version 2
 Capabilities: [d0] Message Signalled Interrupts: 64bit+
Queue=0/0 Enable-
 Capabilities: [e0] Express Endpoint IRQ 0

05:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet
Controller
 Subsystem: Super Micro Computer Inc Unknown device 109a
 Flags: bus master, fast devsel, latency 0, IRQ 17
 Memory at ed30 (32-bit, non-prefetchable)
 I/O ports at 5000
 Capabilities: [c8] Power Management version 2
 Capabilities: [d0] Message Signalled Interrupts: 64bit+
Queue=0/0 Enable-
 Capabilities: [e0] Express Endpoint IRQ 0


Martin, do you see similar problems on both ports? I ask because this
system may be similar to one that Yahoo has, where there was a
problem with only one port and not the other. Can you check this, please?

Jack


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Jack Vogel

On 9/14/06, David C. Myers <[EMAIL PROTECTED]> wrote:


> Watchdogs mean that the transmit ring is not being cleaned, so the
> question is: what is your machine doing at 100% CPU? If it's that busy,
> the network watchdogs may just be a side effect and not the real
> problem.


I get them with a completely idle machine.  My home directory is mounted
via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from
earlier this week, the machine would just hang for 30 seconds to a
couple of minutes.  A slew of "watchdog timeout" messages would appear.
  Then I'd get a moment's responsiveness out of the machine, then
another long wait, then a moment's responsiveness, then a long wait...

The machine would never recover from this cycle (at least, so far as I
was patient enough to wait).

Going back to a kernel dated late July resolved everything.

Someone else asked me for the hardware version of my em0 board...


[EMAIL PROTECTED]:10:0:  class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82540EM Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet


Could you perhaps go back to the kernel you say was stable and then
drop in the latest em driver? Or if that has issues building do it the
other way around, take the em driver from the build that gave you no
problems and put it on this kernel you are running now?

It would be helpful to know if this is a driver problem or something
in the stack.

Cheers,

Jack


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Eugene Kazarinov

Something is really wrong with em0. I don't get timeouts, but I see a
different problem.

Before cvsup I had 6.0-PRERELEASE and had no problems.
Now I have "FreeBSD 6.2-PRERELEASE #8: Fri Sep 15 03:44:49 MSD 2006" and the
problem is as follows
(the machine has LARGE_NAT, em0, em1, em2):
On a freshly booted system, ping to www.ru from a client computer (which
reaches the Internet via NAT) takes 3-5 ms.
After a few hours (I see it at night, when traffic is lower), ping to
www.ru takes 11-12 ms.
Why?
After a reboot it is good again for a few hours.

FreeBSD/amd64
kernel with
options DEVICE_POLLING
options HZ=2500


With HZ=1000 and without DEVICE_POLLING nothing changes: ping still rises to
11-12 ms after a few hours.

PS Should I downgrade to 6.0-RELEASE or earlier,
or could tonight's cvsup updates resolve the problem (the files look TCP-related...):
Checkout src/sys/contrib/ipfilter/netinet/ip_nat.h
Edit src/sys/netinet/in_pcb.c
Edit src/sys/netinet/tcp_input.c
Edit src/sys/netinet/tcp_subr.c
Edit src/sys/netinet/tcp_timer.c
Edit src/sys/netinet/tcp_timer.h
Edit src/sys/netinet/tcp_var.h
Edit src/sys/sys/param.h
Edit src/usr.sbin/pkg_install/add/main.c

PPS I am rebuilding the kernel now and will see tomorrow night.


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Craig Boston
On Thu, Sep 14, 2006 at 02:27:29AM +0200, Ronald Klop wrote:
> The manual page em(4) mentions trying another cable when the watchdog
> timeout happens, so I tried that. But it didn't help.
> Is there anything I can test to (help) debug this?
> It happens a lot when my machine is under load. (100% CPU)
> Is it possible that it started happening when I upgraded the memory from
> 1 GB to 2 GB?

I don't think it's the cable.  I started getting these recently as well
(starting about a week ago).  Always when there's a lot of CPU and disk
I/O load.

Also sometimes my USB keyboard would become unresponsive at about the
same time (under high load).  Sometimes it would stutter and act like
the key was being held down for a second or two.

I built a new kernel (6.2-PRE now) on 9/12.  The keyboard problem seems
to be gone but I still get the em watchdog timeouts occasionally.

Craig


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Martin Nilsson

I'm also seeing these on a Supermicro PDSMi board with a recent stable.
Please tell me what debugging info is needed to fix this.

/Martin


FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Sun Sep 10 
17:43:15 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP  amd64


lspci -v output:

04:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet 
Controller (Copper) (rev 03)

Subsystem: Super Micro Computer Inc Unknown device 108c
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at ed20 (32-bit, non-prefetchable)
I/O ports at 4000
Capabilities: [c8] Power Management version 2
Capabilities: [d0] Message Signalled Interrupts: 64bit+ 
Queue=0/0 Enable-

Capabilities: [e0] Express Endpoint IRQ 0

05:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet 
Controller

Subsystem: Super Micro Computer Inc Unknown device 109a
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at ed30 (32-bit, non-prefetchable)
I/O ports at 5000
Capabilities: [c8] Power Management version 2
Capabilities: [d0] Message Signalled Interrupts: 64bit+ 
Queue=0/0 Enable-

Capabilities: [e0] Express Endpoint IRQ 0



Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-14 Thread Ronald Klop
On Fri, 15 Sep 2006 02:06:08 +0200, David C. Myers <[EMAIL PROTECTED]>  
wrote:





Watchdogs mean that the transmit ring is not being cleaned, so the
question is: what is your machine doing at 100% CPU? If it's that busy,
the network watchdogs may just be a side effect and not the real
problem.



I get them with a completely idle machine.  My home directory is mounted  
via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from  
earlier this week, the machine would just hang for 30 seconds to a  
couple of minutes.  A slew of "watchdog timeout" messages would appear.  
  Then I'd get a moment's responsiveness out of the machine, then  
another long wait, then a moment's responsiveness, then a long wait...


The machine would never recover from this cycle (at least, so far as I  
was patient enough to wait).


Going back to a kernel dated late July resolved everything.

Someone else asked me for the hardware version of my em0 board...


[EMAIL PROTECTED]:10:0:  class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82540EM Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet


-David.


This sounds similar to my problem. I solved it today by enabling polling.
I know it's a workaround.


--
 Ronald Klop
 Amsterdam, The Netherlands


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-14 Thread David C. Myers



Watchdogs mean that the transmit ring is not being cleaned, so the
question is: what is your machine doing at 100% CPU? If it's that busy,
the network watchdogs may just be a side effect and not the real
problem.



I get them with a completely idle machine.  My home directory is mounted 
via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from 
earlier this week, the machine would just hang for 30 seconds to a 
couple of minutes.  A slew of "watchdog timeout" messages would appear. 
 Then I'd get a moment's responsiveness out of the machine, then 
another long wait, then a moment's responsiveness, then a long wait...


The machine would never recover from this cycle (at least, so far as I 
was patient enough to wait).


Going back to a kernel dated late July resolved everything.

Someone else asked me for the hardware version of my em0 board...


[EMAIL PROTECTED]:10:0:  class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82540EM Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet


-David.


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-14 Thread Dan Olson


Jack Vogel wrote:

On 9/13/06, Ronald Klop <[EMAIL PROTECTED]> wrote:
...


The manual page em(4) mentions trying another cable when the watchdog
timeout happens, so I tried that. But it didn't help.
Is there anything I can test to (help) debug this?
It happens a lot when my machine is under load. (100% CPU)
Is it possible that it started happening when I upgraded the memory from
1 GB to 2 GB?


Watchdogs mean that the transmit ring is not being cleaned, so the
question is: what is your machine doing at 100% CPU? If it's that busy,
the network watchdogs may just be a side effect and not the real
problem.

Jack


I see these too when installing packages over nfs on my Laptop. If I run 
with a low level of network traffic, i.e. ssh compile, and peg out the 
cpu with a benchmark such as flops, I don't see these timeouts.


6.1-STABLE FreeBSD 6.1-STABLE #0: Sat Aug 26 14:45:40 CDT 2006

[EMAIL PROTECTED]:1:0:   class=0x02 card=0x05491014 chip=0x101e8086 rev=0x03 
hdr=0x00

vendor   = 'Intel Corporation'
device   = '82540EP Gigabit Ethernet Controller (Mobile)'
class= network

Any suggestions?

Thanks

Dan


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread Jack Vogel

On 9/13/06, Ronald Klop <[EMAIL PROTECTED]> wrote:
...


The manual page em(4) mentions trying another cable when the watchdog
timeout happens, so I tried that. But it didn't help.
Is there anything I can test to (help) debug this?
It happens a lot when my machine is under load. (100% CPU)
Is it possible that it started happening when I upgraded the memory from
1 GB to 2 GB?


Watchdogs mean that the transmit ring is not being cleaned, so the
question is: what is your machine doing at 100% CPU? If it's that busy,
the network watchdogs may just be a side effect and not the real
problem.

Jack
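
The idea that a watchdog fires when the transmit ring is not cleaned in
time can be illustrated with a toy model. This is only a sketch in POSIX
sh, not the em driver's code: simulate_watchdog, the tick granularity and
the "clean"/"stall" events are all assumptions made up for the example. A
countdown is re-armed whenever the ring is cleaned; if it reaches zero
first, the watchdog fires and the device is reset.

```shell
# simulate_watchdog TIMEOUT EVENT...
# Toy model only (not driver code). Each EVENT is one timer tick:
#   clean - the transmit ring was cleaned during this tick
#   stall - nothing was cleaned during this tick
simulate_watchdog() {
    timeout="$1"; shift
    timer="$timeout"
    tick=0
    for ev in "$@"; do
        tick=$((tick + 1))
        if [ "$ev" = clean ]; then
            timer="$timeout"            # progress was made: re-arm the countdown
        else
            timer=$((timer - 1))
            if [ "$timer" -eq 0 ]; then
                echo "tick $tick: watchdog timeout -- resetting"
                timer="$timeout"        # the reset re-arms the countdown too
            fi
        fi
    done
}
```

In this model a machine that is too busy to clean the ring for TIMEOUT
consecutive ticks triggers a reset even though the NIC itself is fine,
which is the side effect described above.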


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread Mike Tancsa

At 10:20 PM 9/13/2006, David Myers wrote:
>> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
>
> I got a bazillion of these, and a completely unusable machine, when
> I upgraded to 6.1-stable sources as of two days ago.  The machine
> would simply freeze for minutes at a time.  Going back to my
> previous kernel (dating from late July) made everything just fine.
>
> So something got seriously broken in the em driver in the last few weeks.

Which version of the NIC do you have? (pciconf -lv)

---Mike 




Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread David Myers



> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting

I got a bazillion of these, and a completely unusable machine, when I
upgraded to 6.1-stable sources as of two days ago.  The machine would
simply freeze for minutes at a time.  Going back to my previous kernel
(dating from late July) made everything just fine.

So something got seriously broken in the em driver in the last few
weeks.


-David.



Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread Ronald Klop
On Tue, 05 Sep 2006 23:52:05 +0200, Ronald Klop
<[EMAIL PROTECTED]> wrote:

> Hello,
>
> I get these errors a lot.
>
> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 11:55:12 ronald kernel: em0: link state changed to DOWN
> Sep  5 11:55:14 ronald kernel: em0: link state changed to UP
> Sep  5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 12:00:37 ronald kernel: em0: link state changed to DOWN
> Sep  5 12:00:39 ronald kernel: em0: link state changed to UP
>
> I tried turning off rxcsum/txcsum and set these sysctls.
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 0 (default 66)
> But the error is still there.
> Searching the internet and the list provides more of the same problems,
> but I didn't find an answer.
>
> My dmesg is attached.
>
> Is there any info I need to provide to debug this or can I try patches?

The manual page em(4) mentions trying another cable when the watchdog
timeout happens, so I tried that. But it didn't help.

Is there anything I can test to (help) debug this?
It happens a lot when my machine is under load. (100% CPU)
Is it possible that it happens since I upgraded the memory from 1GB to
2GB?


(dmesg was attached to my previous mail, but I can provide it again.)

Ronald.

--
 Ronald Klop
 Amsterdam, The Netherlands


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-05 Thread Kent Stewart
On Tuesday 05 September 2006 14:52, Ronald Klop wrote:
> Hello,
>
> I get these errors a lot.
>
> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 11:55:12 ronald kernel: em0: link state changed to DOWN
> Sep  5 11:55:14 ronald kernel: em0: link state changed to UP
> Sep  5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 12:00:37 ronald kernel: em0: link state changed to DOWN
> Sep  5 12:00:39 ronald kernel: em0: link state changed to UP

So do I. Especially when I transfer a GB or 2 from Windows XP to
6.1-stable. I use the FreeBSD machine as a backup for digital photos
and my ripped mp3 files. A photo session is usually in excess of 1 GB,
and the transfer can hang with the watchdog timeout.

Kent

>
> I tried turning off rxcsum/txcsum and set these sysctls.
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 0 (default 66)
> But the error is still there.
> Searching the internet and the list provides more of the same
> problems, but I didn't find an answer.
>
> My dmesg is attached.
>
> Is there any info I need to provide to debug this or can I try
> patches?
>
> Ronald.

-- 
Kent Stewart
Richland, WA

http://www.soyandina.com/ "I am Andean project".
http://users.owt.com/kstewart/index.html


em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-05 Thread Ronald Klop

Hello,

I get these errors a lot.

Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
Sep  5 11:55:12 ronald kernel: em0: link state changed to DOWN
Sep  5 11:55:14 ronald kernel: em0: link state changed to UP
Sep  5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
Sep  5 12:00:37 ronald kernel: em0: link state changed to DOWN
Sep  5 12:00:39 ronald kernel: em0: link state changed to UP
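
To put a number on "a lot", the resets in a log like the one above can be
counted. A minimal sketch in POSIX sh, assuming the syslog line format
shown here; count_watchdog_resets is a hypothetical helper name, and the
interface name defaults to em0.

```shell
# count_watchdog_resets [IFNAME]
# Read syslog text on stdin and print how many watchdog resets were logged
# for IFNAME (default: em0). Hypothetical helper; assumes the message
# format "kernel: em0: watchdog timeout -- resetting" seen in this thread.
count_watchdog_resets() {
    ifname="${1:-em0}"
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status clean when the count is zero.
    grep -c "kernel: ${ifname}: watchdog timeout -- resetting" || true
}
```

On FreeBSD the kernel messages normally end up in /var/log/messages, so a
typical call would be: count_watchdog_resets em0 < /var/log/messages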

I tried turning off rxcsum/txcsum and set these sysctls.
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 0 (default 66)
But the error is still there.
Searching the internet and the list provides more of the same problems,  
but I didn't find an answer.
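
For anyone wanting to repeat the same experiment, the settings described
above could be applied roughly as follows. This is a sketch that only
prints the commands instead of running them; em_tuning_commands is a
hypothetical helper, and it assumes an interface named emN so that the
dev.em.N sysctl nodes line up with it.

```shell
# em_tuning_commands IFNAME
# Print (do not run) the commands that disable checksum offload and zero
# the interrupt delays on an em(4) interface, as described in the message.
# Hypothetical helper; assumes IFNAME looks like "em0", "em1", ...
em_tuning_commands() {
    ifname="$1"
    unit="${ifname#em}"      # "em0" -> "0", used in the sysctl node name
    printf 'ifconfig %s -rxcsum -txcsum\n' "$ifname"
    printf 'sysctl dev.em.%s.rx_int_delay=0\n' "$unit"
    printf 'sysctl dev.em.%s.tx_int_delay=0\n' "$unit"
}
```

The printed lines can be reviewed and then run as root by hand. As noted
above, these settings did not make the timeouts go away on this machine.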


My dmesg is attached.

Is there any info I need to provide to debug this or can I try patches?

Ronald.

--
 Ronald Klop
 Amsterdam, The Netherlands

dmesg.boot
Description: Binary data