Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Lev Serebryakov
Hello, Freebsd-stable.
You wrote on 1 February 2011, 10:24:16:

>   And all connections are reset. Before the latest commits to the driver
> this system panicked in swi_clock. Now it works without panics, but
> it seems that the problem is not completely fixed.

  I forgot to give one last piece of information: POLLING is in use.
Without it, a single-threaded copy from this server via SMB eats one CPU
core completely.

-- 
// Black Lion AKA Lev Serebryakov l...@freebsd.org

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Eugene Grosbein
On 01.02.2011 13:58, Lev Serebryakov wrote:
> Hello, Freebsd-stable.
> You wrote on 1 February 2011, 10:24:16:
>
>>   And all connections are reset. Before the latest commits to the driver
>> this system panicked in swi_clock. Now it works without panics, but
>> it seems that the problem is not completely fixed.
>   I forgot to give one last piece of information: POLLING is in use.
> Without it, a single-threaded copy from this server via SMB eats one CPU
> core completely.

You could try the netisr parallelism of RELENG_8 instead of POLLING
(and tune interrupt throttling), provided your box does not have lots of
dynamic interfaces, as it would when running mpd.

In /etc/sysctl.conf:

net.isr.direct=0
net.isr.direct_force=0
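
One way to confirm that these two settings are actually in effect is to
compare them against the text printed by "sysctl net.isr". The sketch below
is only an illustration: the function names are hypothetical, the sample
text is made up to mirror the two lines above, and the code parses text
rather than calling sysctl itself.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl(8), into a dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a 'name: value' shape
            settings[name.strip()] = value.strip()
    return settings


def uses_deferred_dispatch(settings):
    """True when both knobs from the suggestion above are set to 0."""
    return (settings.get("net.isr.direct") == "0"
            and settings.get("net.isr.direct_force") == "0")


# Hypothetical sample; on a real box, paste the output of `sysctl net.isr`.
sample = """\
net.isr.direct: 0
net.isr.direct_force: 0
"""
print(uses_deferred_dispatch(parse_sysctl(sample)))
```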

Eugene Grosbein.


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Jack Vogel
I don't test POLLING; it sounds like it's broken. I don't understand
why you think you need it. This hardware supports MSI, so why not
use it?

Jack




Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Damien Fleuriot
We have tried POLLING here on Intel cards attached to the igb driver
(see my post entitled "High interrupt rate on a PF box + performance"
from 27/01/2011).

This broke carp *badly* and we switched back to interrupts.


You say a single thread eats up a full CPU core. Can you post top output
showing the %interrupt figure and your smb process's usage?




Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Lev Serebryakov
Hello, Eugene & Jack.
You wrote on 1 February 2011, 11:23:23:

Eugene wrote:
> You could try the netisr parallelism of RELENG_8 instead of POLLING
> (and tune interrupt throttling), provided your box does not have lots of
> dynamic interfaces, as it would when running mpd.

Jack wrote:
> I don't test POLLING; it sounds like it's broken. I don't understand
> why you think you need it. This hardware supports MSI, so why not
> use it?

  I am sending one answer to both messages, because the data is the same.

  Here is a snapshot of top -S with H pressed (per-thread view) while the
server sends about 1 Gbit/s via SMB with polling enabled (a Windows 7 client
copies an 8 GiB sparse file to a very fast local disk):


= POLLING
CPU:  0.5% user,  0.0% nice,  0.6% system,  1.3% interrupt, 98.1% idle
  PID USERNAME  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   11 root      171 ki31     0K    32K CPU1   1   90.1H 100.00% {idle: cpu1}
   11 root      171 ki31     0K    32K RUN    0   82.1H 100.00% {idle: cpu0}
   12 root      -64    -     0K   304K WAIT   0   33:40   0.68% {irq18: uhci2 ehc}
   12 root      -44    -     0K   304K WAIT   1  225:22   0.00% {swi1: netisr 0}
   14 root      -68    -     0K   528K -      1   16:19   0.00% {usbus3}
   12 root      -40    -     0K   304K WAIT   0   14:25   0.00% {swi2: cambio}
   12 root      -64    -     0K   304K WAIT   1   12:50   0.00% {irq22: ahci0}
    4 root       -8    -     0K    16K -      0   12:26   0.00% g_down
= POLLING
NB: no smbd process at all in the first 8 positions.
Real speed (according to the Windows 7 report): ~75 MiB/s.


  The same without polling, with these net.isr settings:
# sysctl net.isr
net.isr.numthreads: 1
net.isr.maxprot: 16
net.isr.defaultqlimit: 256
net.isr.maxqlimit: 10240
net.isr.bindthreads: 0
net.isr.maxthreads: 1
net.isr.direct: 0
net.isr.direct_force: 0

= INTR - ISR.DIRECT=0
CPU:  3.8% user,  0.0% nice, 26.5% system,  6.6% interrupt, 63.2% idle
  PID USERNAME  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   11 root      171 ki31     0K    32K RUN    0   82.1H  83.59% {idle: cpu0}
   11 root      171 ki31     0K    32K RUN    1   90.1H  64.06% {idle: cpu1}
33873 root       72    0 28912K  5432K select 0    0:28  34.96% smbd
   12 root      -44    -     0K   304K WAIT   0  225:29   9.18% {swi1: netisr 0}
    0 root      -68    0     0K   128K -      1    0:02   6.30% {em0 taskq}
   12 root      -68    -     0K   304K WAIT   0    0:00   1.56% {irq20: em0 fwohc}
    7 root       44    -     0K    16K psleep 0    3:12   0.39% pagedaemon
   12 root      -64    -     0K   304K WAIT   1   33:41   0.20% {irq18: uhci2 ehc}
   14 root      -68    -     0K   528K -      0   16:19   0.00% {usbus3}
   12 root      -40    -     0K   304K WAIT   0   14:25   0.00% {swi2: cambio}
= INTR - ISR.DIRECT=0
Real speed (according to the Windows 7 report): ~85 MiB/s.

  The same without polling, with these net.isr settings:
# sysctl net.isr
net.isr.numthreads: 1
net.isr.maxprot: 16
net.isr.defaultqlimit: 256
net.isr.maxqlimit: 10240
net.isr.bindthreads: 0
net.isr.maxthreads: 1
net.isr.direct: 1
net.isr.direct_force: 1

= INTR - ISR.DIRECT=1
CPU:  2.8% user,  0.0% nice, 30.1% system,  1.7% interrupt, 65.4% idle
  PID USERNAME  PRI NICE   SIZE    RES STATE  C    TIME    WCPU COMMAND
   11 root      171 ki31     0K    32K RUN    1   90.2H  89.36% {idle: cpu1}
   11 root      171 ki31     0K    32K RUN    0   82.2H  67.87% {idle: cpu0}
33873 root      103    0 28912K  5424K CPU0   0    0:51  33.98% smbd
    0 root      -68    0     0K   128K -      1    0:06  12.70% {em0 taskq}
   12 root      -68    -     0K   304K WAIT   0    0:01   1.66% {irq20: em0 fwohc}
    7 root       45    -     0K    16K psleep 0    3:12   0.78% pagedaemon
   12 root      -64    -     0K   304K WAIT   0   33:42   0.20% {irq18: uhci2 ehc}
   12 root      -44    -     0K   304K WAIT   1  225:33   0.00% {swi1: netisr 0}
   14 root      -68    -     0K   528K -      1   16:20   0.00% {usbus3}
   12 root      -40    -     0K   304K WAIT   0   14:25   0.00% {swi2: cambio}
= INTR - ISR.DIRECT=1
Real speed (according to the Windows 7 report): ~101 MiB/s.

  I re-created the file between tries to flush caches on both sides.
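
To put the three runs side by side, the reported speeds can be converted to
Mbit/s and compared with the polling run. This is a rough sketch, assuming
the Windows figures are payload rates in MiB/s and taking "busy" as
100 minus the idle percentage from the single top header line of each run;
the names are hypothetical.

```python
MIB = 1024 * 1024

# (label, reported MiB/s, overall idle % from the top header) as posted above
RUNS = [
    ("polling", 75, 98.1),
    ("interrupts, net.isr.direct=0", 85, 63.2),
    ("interrupts, net.isr.direct=1", 101, 65.4),
]


def mib_to_mbit(mib_per_s):
    """Convert MiB/s to decimal Mbit/s (10**6 bits per second)."""
    return mib_per_s * MIB * 8 / 1e6


def summarize(runs):
    """Return (label, Mbit/s, % gain over the first run, % CPU busy) rows."""
    base = runs[0][1]
    return [
        (label,
         round(mib_to_mbit(speed), 1),
         round(100 * (speed / base - 1), 1),
         round(100 - idle, 1))
        for label, speed, idle in runs
    ]


for label, mbit, gain, busy in summarize(RUNS):
    print(f"{label:30s} {mbit:6.1f} Mbit/s  {gain:+5.1f}%  busy {busy:4.1f}%")
```

These are payload rates only; protocol overhead on the wire is not included.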

-- 
// Black Lion AKA Lev Serebryakov l...@freebsd.org



Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Eugene Grosbein
On 01.02.2011 18:38, Lev Serebryakov wrote:

> = INTR - ISR.DIRECT=1
> Real speed (according to the Windows 7 report): ~101 MiB/s.
>
>   I re-created the file between tries to flush caches on both sides.

netisr queues help to deal with lots of incoming traffic.
If you only care about outgoing traffic, they won't help.

Eugene Grosbein


Re: em0: Watchdog timeout -- resetting

2011-02-01 Thread Lev Serebryakov
Hello, Eugene.
You wrote on 1 February 2011, 16:52:57:

>> = INTR - ISR.DIRECT=1
>> Real speed (according to the Windows 7 report): ~101 MiB/s.
>>   I re-created the file between tries to flush caches on both sides.

> netisr queues help to deal with lots of incoming traffic.
> If you only care about outgoing traffic, they won't help.

 This server is a mostly read-only storage server, so I care about
 outgoing traffic.

 And now, after switching polling off and running the experiments, the
 server is lost: about 30 minutes after the experiments it stopped
 answering pings and all other network activity. I'll be near the local
 console only tonight, so only then can I report a panic or anything else.

-- 
// Black Lion AKA Lev Serebryakov l...@freebsd.org



Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-16 Thread Frode Nordahl

Hello,

Just wanted to send a "me too" on this issue. Whenever it happens I
can see our Cisco switch reporting the interface going down and up as
well (Line Protocol).

FreeBSD localhost.localdomain 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Wed Sep 13 00:10:04 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PT  i386


em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=b<RXCSUM,TXCSUM,VLAN_MTU>
        media: Ethernet autoselect (1000baseTX full-duplex)
        status: active

[EMAIL PROTECTED]:11:0:  class=0x02 card=0x10048086 chip=0x10048086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82543GC Gigabit Ethernet Controller (Copper)'
    class    = network
    subclass = ethernet
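
When comparing the NICs reported in this thread, it helps to split the
chip= value from pciconf -lv into its two halves: the upper 16 bits are the
PCI device ID and the lower 16 bits are the vendor ID (0x8086 is Intel). A
small sketch, with a hypothetical helper name:

```python
def split_pci_id(value):
    """Split a 32-bit pciconf 'chip=' or 'card=' value into (device, vendor)."""
    n = int(value, 16)
    return (n >> 16) & 0xFFFF, n & 0xFFFF


# chip= values quoted in this thread
for chip in ("0x10048086", "0x100e8086", "0x101e8086"):
    device, vendor = split_pci_id(chip)
    print(f"chip={chip}: vendor=0x{vendor:04x} device=0x{device:04x}")
```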

(This is an add-in 64-bit PCI card.)

I am stress-testing -STABLE on a spare server to help make 6.2 as
bug-free as possible.

It is set up as an NFS server with two Linux NFS clients connected,
which are concurrently extracting 5 copies of /usr/src to it and
running a program that creates millions of files with random UIDs to
test for QUOTA issues.

On the server I repeatedly dump the exported filesystem with snapshot
and cache enabled. (dump -L -C 32 -af /dev/null ...)

I'm building today's -STABLE on a different server with SMP and two
onboard em NICs, and will start similar tests on it to see if I can
reproduce the watchdog timeouts there as well.


--
Frode Nordahl





Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Martin Nilsson

I'm also seeing these on a Supermicro PDSMi board with a recent stable.
Please tell me what debugging info is needed to fix this.

/Martin

FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Sun Sep 10 17:43:15 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP  amd64

lspci -v output:

04:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
        Subsystem: Super Micro Computer Inc Unknown device 108c
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at ed20 (32-bit, non-prefetchable)
        I/O ports at 4000
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint IRQ 0

05:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
        Subsystem: Super Micro Computer Inc Unknown device 109a
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at ed30 (32-bit, non-prefetchable)
        I/O ports at 5000
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint IRQ 0



Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Craig Boston
On Thu, Sep 14, 2006 at 02:27:29AM +0200, Ronald Klop wrote:
> The manual page em(4) mentions trying another cable when the watchdog
> timeout happens, so I tried that. But it didn't help.
> Is there anything I can test to (help) debug this?
> It happens a lot when my machine is under load. (100% CPU)
> Is it possible that it happens since I upgraded the memory from 1GB to
> 2GB?

I don't think it's the cable.  I started getting these recently as well
(starting about a week ago).  Always when there's a lot of CPU and disk
I/O load.

Also sometimes my USB keyboard would become unresponsive at about the
same time (under high load).  Sometimes it would stutter and act like
the key was being held down for a second or two.

I built a new kernel (6.2-PRE now) on 9/12.  The keyboard problem seems
to be gone but I still get the em watchdog timeouts occasionally.

Craig


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Eugene Kazarinov

Something is really wrong with em0. I don't get timeouts, but I see a
different problem.

Before the cvsup I had 6.0-PRERELEASE and didn't have a problem.
Now I have FreeBSD 6.2-PRERELEASE #8: Fri Sep 15 03:44:49 MSD 2006, and
the problem is as follows (the machine has LARGE_NAT, em0, em1, em2):
on a freshly booted system, ping to www.ru from a client computer (which
reaches the Internet via NAT) takes 3-5 ms;
after a few hours (I see it at night, when traffic is lower), ping to
www.ru takes 11-12 ms.
Why?
After a reboot it is good again for a few hours.

FreeBSD/amd64
kernel with
options DEVICE_POLLING
options HZ=2500


With HZ=1000 and without DEVICE_POLLING nothing changes: it still goes to
11-12 ms after a few hours.

PS: Should I downgrade to 6.0-RELEASE or earlier, or could tonight's cvsup
updates resolve the problem (the file names suggest TCP changes)?
Checkout src/sys/contrib/ipfilter/netinet/ip_nat.h
Edit src/sys/netinet/in_pcb.c
Edit src/sys/netinet/tcp_input.c
Edit src/sys/netinet/tcp_subr.c
Edit src/sys/netinet/tcp_timer.c
Edit src/sys/netinet/tcp_timer.h
Edit src/sys/netinet/tcp_var.h
Edit src/sys/sys/param.h
Edit src/usr.sbin/pkg_install/add/main.c

PPS: I am rebuilding the kernels now and will see tomorrow night.


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Jack Vogel

On 9/14/06, David C. Myers [EMAIL PROTECTED] wrote:
>> watchdogs mean that the transmit ring is not being cleaned, so the
>> question is what is your machine doing at 100% cpu, if it's that busy
>> the network watchdogs may just be a side effect and not the real
>> problem?
>
> I get them with a completely idle machine.  My home directory is mounted
> via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from
> earlier this week, the machine would just hang for 30 seconds to a
> couple of minutes.  A slew of watchdog timeout messages would appear.
> Then I'd get a moment's responsiveness out of the machine, then
> another long wait, then a moment's responsiveness, then a long wait...
>
> The machine would never recover from this cycle (at least, so far as I
> was patient enough to wait).
>
> Going back to a kernel dated late July resolved everything.
>
> Someone else asked me for the hardware version of my em0 board...
>
> [EMAIL PROTECTED]:10:0:  class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = '82540EM Gigabit Ethernet Controller'
>     class    = network
>     subclass = ethernet

Could you perhaps go back to the kernel you say was stable and then
drop in the latest em driver? Or, if that has issues building, do it the
other way around: take the em driver from the build that gave you no
problems and put it into the kernel you are running now.

It would be helpful to know whether this is a driver problem or something
in the stack.

Cheers,

Jack


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-15 Thread Jack Vogel

On 9/15/06, Martin Nilsson [EMAIL PROTECTED] wrote:
> I'm also seeing these on a Supermicro PDSMi board with a recent stable.
> Please tell me what debugging info is needed to fix this.
>
> /Martin
>
> FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Sun Sep 10 17:43:15 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP  amd64
>
> lspci -v output:
>
> 04:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
>         Subsystem: Super Micro Computer Inc Unknown device 108c
>         Flags: bus master, fast devsel, latency 0, IRQ 16
>         Memory at ed20 (32-bit, non-prefetchable)
>         I/O ports at 4000
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
>         Capabilities: [e0] Express Endpoint IRQ 0
>
> 05:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
>         Subsystem: Super Micro Computer Inc Unknown device 109a
>         Flags: bus master, fast devsel, latency 0, IRQ 17
>         Memory at ed30 (32-bit, non-prefetchable)
>         I/O ports at 5000
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
>         Capabilities: [e0] Express Endpoint IRQ 0

Martin, do you see similar problems on both ports? I ask because this
system may be similar to one that Yahoo has, where there was a problem
with only one port and not the other. Can you check this, please?

Jack


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-14 Thread Dan Olson


Jack Vogel wrote:
> On 9/13/06, Ronald Klop [EMAIL PROTECTED] wrote:
> ...
>
>> The manual page em(4) mentions trying another cable when the watchdog
>> timeout happens, so I tried that. But it didn't help.
>> Is there anything I can test to (help) debug this?
>> It happens a lot when my machine is under load. (100% CPU)
>> Is it possible that it happens since I upgraded the memory from 1GB to
>> 2GB?
>
> watchdogs mean that the transmit ring is not being cleaned, so the
> question is what is your machine doing at 100% cpu, if it's that busy
> the network watchdogs may just be a side effect and not the real
> problem?
>
> Jack

I see these too when installing packages over NFS on my laptop. If I run
with a low level of network traffic, i.e. an ssh compile, and peg the
CPU with a benchmark such as flops, I don't see these timeouts.

6.1-STABLE FreeBSD 6.1-STABLE #0: Sat Aug 26 14:45:40 CDT 2006

[EMAIL PROTECTED]:1:0:   class=0x02 card=0x05491014 chip=0x101e8086 rev=0x03 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82540EP Gigabit Ethernet Controller (Mobile)'
    class    = network

Any suggestions?

Thanks

Dan


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-14 Thread David C. Myers



> watchdogs mean that the transmit ring is not being cleaned, so the
> question is what is your machine doing at 100% cpu, if it's that busy
> the network watchdogs may just be a side effect and not the real
> problem?

I get them with a completely idle machine.  My home directory is mounted
via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from
earlier this week, the machine would just hang for 30 seconds to a
couple of minutes.  A slew of watchdog timeout messages would appear.
Then I'd get a moment's responsiveness out of the machine, then
another long wait, then a moment's responsiveness, then a long wait...

The machine would never recover from this cycle (at least, so far as I
was patient enough to wait).

Going back to a kernel dated late July resolved everything.

Someone else asked me for the hardware version of my em0 board...

[EMAIL PROTECTED]:10:0:  class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82540EM Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet


-David.


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-14 Thread Ronald Klop
On Fri, 15 Sep 2006 02:06:08 +0200, David C. Myers [EMAIL PROTECTED] wrote:

>> watchdogs mean that the transmit ring is not being cleaned, so the
>> question is what is your machine doing at 100% cpu, if it's that busy
>> the network watchdogs may just be a side effect and not the real
>> problem?
>
> I get them with a completely idle machine.  My home directory is mounted
> via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from
> earlier this week, the machine would just hang for 30 seconds to a
> couple of minutes.  A slew of watchdog timeout messages would appear.
> Then I'd get a moment's responsiveness out of the machine, then
> another long wait, then a moment's responsiveness, then a long wait...
>
> The machine would never recover from this cycle (at least, so far as I
> was patient enough to wait).
>
> Going back to a kernel dated late July resolved everything.
>
> Someone else asked me for the hardware version of my em0 board...
>
> [EMAIL PROTECTED]:10:0:  class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = '82540EM Gigabit Ethernet Controller'
>     class    = network
>     subclass = ethernet
>
> -David.

This sounds similar to my problem. I solved it today by enabling polling.
I know it's a workaround.


--
 Ronald Klop
 Amsterdam, The Netherlands


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread Ronald Klop
On Tue, 05 Sep 2006 23:52:05 +0200, Ronald Klop [EMAIL PROTECTED] wrote:

> Hello,
>
> I get these errors a lot.
>
> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 11:55:12 ronald kernel: em0: link state changed to DOWN
> Sep  5 11:55:14 ronald kernel: em0: link state changed to UP
> Sep  5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 12:00:37 ronald kernel: em0: link state changed to DOWN
> Sep  5 12:00:39 ronald kernel: em0: link state changed to UP
>
> I tried turning off rxcsum/txcsum and set these sysctls:
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 0 (default 66)
> But the error is still there.
> Searching the internet and the list turns up more reports of the same
> problem, but I didn't find an answer.
>
> My dmesg is attached.
>
> Is there any info I need to provide to debug this or can I try patches?
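
For anyone collecting data for a report like this, the syslog lines quoted
above can be summarized mechanically: how many watchdog resets occurred, how
long each link outage lasted, and how far apart the resets were. The
following is a minimal sketch, assuming the classic syslog line format shown
above; the year is not in the log and is passed in, and all names are
hypothetical.

```python
import re
from datetime import datetime

LOG = """\
Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
Sep  5 11:55:12 ronald kernel: em0: link state changed to DOWN
Sep  5 11:55:14 ronald kernel: em0: link state changed to UP
Sep  5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
Sep  5 12:00:37 ronald kernel: em0: link state changed to DOWN
Sep  5 12:00:39 ronald kernel: em0: link state changed to UP
"""

LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: (\w+): (.*)$")


def parse(log, year=2006):
    """Return (timestamp, interface, message) tuples for kernel log lines."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        stamp_text = " ".join(m.group(1).split())  # collapse padding in "Sep  5"
        stamp = datetime.strptime(f"{year} {stamp_text}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m.group(2), m.group(3)))
    return events


def summarize(events):
    """Count watchdog resets; return outage lengths and gaps between resets (s)."""
    resets = [t for t, _, msg in events if msg.startswith("watchdog timeout")]
    outages, down_at = [], None
    for stamp, _, msg in events:
        if msg == "link state changed to DOWN":
            down_at = stamp
        elif msg == "link state changed to UP" and down_at is not None:
            outages.append((stamp - down_at).total_seconds())
            down_at = None
    gaps = [(b - a).total_seconds() for a, b in zip(resets, resets[1:])]
    return len(resets), outages, gaps


count, outages, gaps = summarize(parse(LOG))
print(f"{count} resets, outages {outages} s, gaps between resets {gaps} s")
```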


The manual page em(4) mentions trying another cable when the watchdog
timeout happens, so I tried that, but it didn't help.

Is there anything I can test to help debug this?
It happens a lot when my machine is under load (100% CPU).
Is it possible that it started when I upgraded the memory from 1 GB to
2 GB?

(dmesg was attached to my previous mail, but I can provide it again.)

Ronald.

--
 Ronald Klop
 Amsterdam, The Netherlands


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread David Myers



> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting

I got a bazillion of these, and a completely unusable machine, when I
upgraded to 6.1-stable sources as of two days ago.  The machine would
simply freeze for minutes at a time.  Going back to my previous kernel
(dating from late July) made everything just fine.

So something got seriously broken in the em driver in the last few
weeks.


-David.



Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread Mike Tancsa

At 10:20 PM 9/13/2006, David Myers wrote:
>> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
>
> I got a bazillion of these, and a completely unusable machine, when
> I upgraded to 6.1-stable sources as of two days ago.  The machine
> would simply freeze for minutes at a time.  Going back to my
> previous kernel (dating from late July) made everything just fine.
>
> So something got seriously broken in the em driver in the last few weeks.

Which version of the NIC do you have? (pciconf -lv)

---Mike 




Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-13 Thread Jack Vogel

On 9/13/06, Ronald Klop [EMAIL PROTECTED] wrote:
> ...
>
> The manual page em(4) mentions trying another cable when the watchdog
> timeout happens, so I tried that. But it didn't help.
> Is there anything I can test to (help) debug this?
> It happens a lot when my machine is under load. (100% CPU)
> Is it possible that it happens since I upgraded the memory from 1GB to
> 2GB?

Watchdogs mean that the transmit ring is not being cleaned, so the
question is: what is your machine doing at 100% CPU? If it's that busy,
the network watchdogs may just be a side effect and not the real
problem.
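
The mechanism described here can be pictured with a toy model: packets are
queued on a ring, a timer is armed while work is pending, and a reset fires
if nothing cleans the ring before the timer runs out. The sketch below is a
simplified illustration only. It is not the em(4) driver code, and the
class, its names, and the tick counts are all assumptions made for the
example.

```python
class ToyTxRing:
    """Toy transmit ring guarded by a watchdog timer (not em(4) code)."""

    def __init__(self, size=4, watchdog_ticks=5):
        self.size = size
        self.watchdog_ticks = watchdog_ticks
        self.in_flight = 0   # descriptors queued but not yet cleaned
        self.timer = 0       # 0 means the watchdog is disarmed
        self.resets = 0

    def transmit(self):
        """Queue one packet; return False if the ring is full."""
        if self.in_flight == self.size:
            return False
        self.in_flight += 1
        self.timer = self.watchdog_ticks  # (re)arm while work is pending
        return True

    def clean(self):
        """Reclaim descriptors, as an interrupt or poll handler would."""
        self.in_flight = 0
        self.timer = 0

    def tick(self):
        """One timer tick; reset if the ring was not cleaned in time."""
        if self.timer == 0:
            return
        self.timer -= 1
        if self.timer == 0 and self.in_flight > 0:
            self.resets += 1
            print("toy0: watchdog timeout -- resetting")
            self.in_flight = 0  # the reset reinitializes the ring


ring = ToyTxRing()
ring.transmit()
ring.tick()
ring.clean()            # healthy case: cleaned before the timer expires
ring.transmit()
for _ in range(5):      # stalled case: nothing cleans the ring
    ring.tick()
print("resets:", ring.resets)
```

In this model a reset happens whenever cleaning is delayed for longer than
the timeout, whatever the cause of the delay.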

Jack


Re: em0: watchdog timeout -- resetting (6.1-STABLE)

2006-09-05 Thread Kent Stewart
On Tuesday 05 September 2006 14:52, Ronald Klop wrote:
> Hello,
>
> I get these errors a lot.
>
> Sep  5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 11:55:12 ronald kernel: em0: link state changed to DOWN
> Sep  5 11:55:14 ronald kernel: em0: link state changed to UP
> Sep  5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
> Sep  5 12:00:37 ronald kernel: em0: link state changed to DOWN
> Sep  5 12:00:39 ronald kernel: em0: link state changed to UP

So am I. Especially when I transfer a GB or 2 from Windows XP to
6.1-stable. I use the FreeBSD machine as a backup for digital photos
and my ripped mp3 files. A photo session is usually in excess of 1 GB
and can hang with the watchdog timeout.

Kent

> I tried turning off rxcsum/txcsum and set these sysctl's.
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 0 (default 66)
> But the error is still there.
> Searching the internet and the list provides more of the same
> problems, but I didn't find an answer.
>
> My dmesg is attached.
>
> Is there any info I need to provide to debug this or can I try
> patches?
>
> Ronald.

-- 
Kent Stewart
Richland, WA

http://www.soyandina.com/ I am Andean project.
http://users.owt.com/kstewart/index.html