Re: em(4) detailed errors
Hi,

On Thu, 18 Nov 2010 16:38:55 +0100 Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
| Is there a way to get detailed em(4) device errors without having to
| recompile kernel with EM_DEBUG ?
| I try to find in-errors reason(s) but netstat only gives errors as a sum of
| dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
| as I can see :-(

It took me some time to upgrade to 4.8 and modify the kernel to get detailed info on demand. I still have the problem on multiple servers (but with very similar hardware and software).

em4 at pci11 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic 9 int 15 (irq 15), address 00:25:90:05:53:3e

em4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:25:90:05:53:3e
        description: br1.th2
        priority: 0
        groups: pf-off
        media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
        status: active
        inet6 fe80::225:90ff:fe05:533e%em4 prefixlen 64 scopeid 0x5

#netstat -I em4 -d
Name  Mtu   Network      Address                 Ipkts   Ierrs      Opkts Oerrs Colls Drop
em4   1500  <Link>       00:25:90:05:53:3e  8936976317 4614835 5430820423     0     0    0
em4   1500  fe80::%em4/  fe80::225:90ff:fe  8936976317 4614835 5430820423     0     0    0

Detailed stats:
em4: Dropped PKTS = 0
em4: Excessive collisions = 0
em4: Symbol errors = 0
em4: Sequence errors = 0
em4: Defer count = 353
em4: Missed Packets = 4241586
em4: Receive No Buffers = 5297798
em4: Receive Length Errors = 0
em4: Receive errors = 0
em4: Crc errors = 0
em4: Alignment errors = 0
em4: Carrier extension errors = 0
em4: RX overruns = 372913
em4: watchdog timeouts = 0
em4: XON Rcvd = 3086
em4: XON Xmtd = 592675
em4: XOFF Rcvd = 164449
em4: XOFF Xmtd = 4833995
em4: Good Packets Rcvd = 8936940571
em4: Good Packets Xmtd = 5430798347

At this time, the interface carries around 56 Mbps inbound and 35 Mbps outbound. Server load is 0.14.

The em4 interface is connected to an interface on another server with nearly the same config (but I get the same kind of problem on interfaces connected to a switch, with both copper and fiber). Errors seem somewhat related to interface load, but not very closely.

Each server has 2 quad-port cards (82576) plus 2 ports on the motherboard (82576 too).

I was thinking of a problem with interrupt mitigation.

Any ideas, comments, things to test?

Thank you!

Manuel
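
The counters above can be set against the netstat Ierrs figure with a few lines of script. This is only a sketch: the helper name `parse_em_stats` is made up for illustration, it just parses the `em4: Name = value` lines as posted in this message, and it assumes (the message does not confirm this) that Missed Packets and RX overruns are among the terms the driver adds into its input-error sum.

```python
import re

# Non-zero counter lines copied from the message above.
STATS = """\
em4: Defer count = 353
em4: Missed Packets = 4241586
em4: Receive No Buffers = 5297798
em4: RX overruns = 372913
em4: XOFF Xmtd = 4833995
em4: Good Packets Rcvd = 8936940571
"""

IERRS = 4614835  # Ierrs column from "netstat -I em4 -d" in the same message


def parse_em_stats(text):
    """Hypothetical helper: turn 'emN: Name = value' lines into a dict."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*em\d+:\s*(.+?)\s*=\s*(\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats


stats = parse_em_stats(STATS)

# Assumption: these two counters are part of the elided "+..." in the sum.
missed_plus_overruns = stats["Missed Packets"] + stats["RX overruns"]

print("Missed Packets + RX overruns:", missed_plus_overruns)
print("netstat Ierrs:               ", IERRS)
print("difference:                  ", IERRS - missed_plus_overruns)
print("Ierrs per good packet rcvd:   %.4f%%"
      % (100.0 * IERRS / stats["Good Packets Rcvd"]))
```

With the numbers posted here, Missed Packets plus RX overruns comes to 4614499, within a few hundred of the 4614835 input errors (the two snapshots were probably not taken at the same instant). If the assumption holds, that would point at packets dropped for lack of receive buffers rather than at CRC or alignment faults, which are all zero.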
Re: em(4) detailed errors
On 2010-11-23, Toni Mueller openbsd-m...@oeko.net wrote:
> Hi,
>
> On Tue, 23.11.2010 at 11:07:40 -0500, Ted Unangst ted.unan...@gmail.com wrote:
>> On Tue, Nov 23, 2010 at 10:02 AM, Otto Moerbeek o...@drijf.net wrote:
>>> On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
>>>> # ifconfig em3
>>>> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>>>>         lladdr 00:30:48:94:0b:21
>>>>         priority: 0
>>>>         media: Ethernet autoselect (1000baseT full-duplex,master)
>>>>                                                           ^
>>>>         status: active
>>>
>>> I would rather investigate why the PROMISC and ALLMULTI flags are set
>>> on this interface.
>>
>> trunked?
>
> thanks for your input. No, the interface is configured in a very
> straightforward way without any bells and whistles. It has four IPv4
> addresses, plus one auto-generated IPv6 address (link layer local). I
> don't use bridging and didn't enable multicast in /etc/sysctl.conf,
> either.

carp will do this too (and it seems it doesn't clear the PROMISC,ALLMULTI even when the carp interface is destroyed).
Re: em(4) detailed errors
Hi,

On Thu, 18.11.2010 at 16:38:55 +0100, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
> Is there a way to get detailed em(4) device errors without having to
> recompile kernel with EM_DEBUG ?
> I try to find in-errors reason(s) but netstat only gives errors as a sum of
> dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
> as I can see :-(

I'm having a similar problem. On one 4x em(4) machine, I get a lot of input errors and, much more serious, intermittent packet loss, but only on one interface out of two with similar traffic levels (~1-4kpps per direction). After reading the latest em(4) threads, I also found this very strange thing, which must have been automatically configured:

# ifconfig em3
em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:30:48:94:0b:21
        priority: 0
        media: Ethernet autoselect (1000baseT full-duplex,master)
                                                          ^
        status: active

I'm unsure about how to remove this feature from this (physical) interface, and the machine uses none of carp, pfsync or sasync. The hardware for this interface is

em3 at pci5 dev 0 function 0 Intel PRO/1000MT (82573L) rev 0x00: apic 2 int 17 (irq 11), address 00:30:48:94:0b:21

as detected by OpenBSD 4.8-stable (i386).

The ability to selectively enable or disable debugging for individual devices at runtime would be a great feature, from a sysadmin's perspective.

--
Kind regards,
--Toni++
Re: em(4) detailed errors
On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
> Hi,
>
> On Thu, 18.11.2010 at 16:38:55 +0100, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
>> Is there a way to get detailed em(4) device errors without having to
>> recompile kernel with EM_DEBUG ?
>> I try to find in-errors reason(s) but netstat only gives errors as a sum of
>> dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
>> as I can see :-(
>
> I'm having a similar problem. On one 4x em(4) machine, I get a lot of
> input errors and, much more serious, intermittent packet loss, but only
> on one interface out of two with similar traffic levels (~1-4kpps per
> direction). After reading the latest em(4) threads, I also found this
> very strange thing, which must have been automatically configured:
>
> # ifconfig em3
> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:30:48:94:0b:21
>         priority: 0
>         media: Ethernet autoselect (1000baseT full-duplex,master)
>                                                           ^
>         status: active
>
> I'm unsure about how to remove this feature from this (physical)
> interface, and the machine uses none of carp, pfsync or sasync. The
> hardware for this interface is

If you wonder about the master in the media line, then be assured that all is fine. 1000baseT requires autoselection to always run, and every link needs one PHY running as master (normally the switch).

> em3 at pci5 dev 0 function 0 Intel PRO/1000MT (82573L) rev 0x00: apic 2 int 17 (irq 11), address 00:30:48:94:0b:21
>
> as detected by OpenBSD 4.8-stable (i386).
>
> The ability to selectively enable or disable debugging for individual
> devices at runtime would be a great feature, from a sysadmin's
> perspective.

--
:wq Claudio
Re: em(4) detailed errors
On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
> Hi,
>
> On Thu, 18.11.2010 at 16:38:55 +0100, Manuel Guesdon ml+openbsd.m...@oxymium.net wrote:
>> Is there a way to get detailed em(4) device errors without having to
>> recompile kernel with EM_DEBUG ?
>> I try to find in-errors reason(s) but netstat only gives errors as a sum of
>> dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
>> as I can see :-(
>
> I'm having a similar problem. On one 4x em(4) machine, I get a lot of
> input errors and, much more serious, intermittent packet loss, but only
> on one interface out of two with similar traffic levels (~1-4kpps per
> direction). After reading the latest em(4) threads, I also found this
> very strange thing, which must have been automatically configured:
>
> # ifconfig em3
> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:30:48:94:0b:21
>         priority: 0
>         media: Ethernet autoselect (1000baseT full-duplex,master)
>                                                           ^
>         status: active
>
> I'm unsure about how to remove this feature from this (physical)
> interface, and the machine uses none of carp, pfsync or sasync. The
> hardware for this interface is

I would rather investigate why the PROMISC and ALLMULTI flags are set on this interface.

-Otto

> em3 at pci5 dev 0 function 0 Intel PRO/1000MT (82573L) rev 0x00: apic 2 int 17 (irq 11), address 00:30:48:94:0b:21
>
> as detected by OpenBSD 4.8-stable (i386).
>
> The ability to selectively enable or disable debugging for individual
> devices at runtime would be a great feature, from a sysadmin's
> perspective.
>
> --
> Kind regards,
> --Toni++
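
For readers who want to see which bits the two flags words in this thread actually carry, here is a minimal sketch that decodes the hex value printed by ifconfig. The bit values are the traditional BSD ones from <net/if.h> as I recall them; the name of bit 0x0020 in particular differs between releases, so check the header on your own system. The function name `decode_flags` is invented for this example.

```python
# Interface flag bits, traditional BSD <net/if.h> values (an assumption to
# verify against your release; 0x0020 is named differently in some versions).
IFF_FLAGS = [
    (0x0001, "UP"),
    (0x0002, "BROADCAST"),
    (0x0004, "DEBUG"),
    (0x0008, "LOOPBACK"),
    (0x0010, "POINTOPOINT"),
    (0x0020, "NOTRAILERS"),
    (0x0040, "RUNNING"),
    (0x0080, "NOARP"),
    (0x0100, "PROMISC"),
    (0x0200, "ALLMULTI"),
    (0x0400, "OACTIVE"),
    (0x0800, "SIMPLEX"),
    (0x1000, "LINK0"),
    (0x2000, "LINK1"),
    (0x4000, "LINK2"),
    (0x8000, "MULTICAST"),
]


def decode_flags(value):
    """Return the names of all flag bits set in an ifconfig flags word."""
    return [name for bit, name in IFF_FLAGS if value & bit]


em3_flags = decode_flags(0x8B43)  # Toni's em3
em4_flags = decode_flags(0x8843)  # Manuel's em4, from earlier in the thread

# Flags present on em3 but not on em4.
extra = [name for name in em3_flags if name not in em4_flags]
print("em3:", ",".join(em3_flags))
print("em4:", ",".join(em4_flags))
print("only on em3:", ",".join(extra))
```

The only difference between the two words is 0x0300, that is PROMISC and ALLMULTI, which are the two flags being asked about here.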
Re: em(4) detailed errors
On Tue, Nov 23, 2010 at 10:02 AM, Otto Moerbeek o...@drijf.net wrote:
> On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
>> # ifconfig em3
>> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>>         lladdr 00:30:48:94:0b:21
>>         priority: 0
>>         media: Ethernet autoselect (1000baseT full-duplex,master)
>>                                                           ^
>>         status: active
>
> I would rather investigate why the PROMISC and ALLMULTI flags are set
> on this interface.

trunked?
Re: em(4) detailed errors
Hi,

On Tue, 23.11.2010 at 11:07:40 -0500, Ted Unangst ted.unan...@gmail.com wrote:
> On Tue, Nov 23, 2010 at 10:02 AM, Otto Moerbeek o...@drijf.net wrote:
>> On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
>>> # ifconfig em3
>>> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>>>         lladdr 00:30:48:94:0b:21
>>>         priority: 0
>>>         media: Ethernet autoselect (1000baseT full-duplex,master)
>>>                                                           ^
>>>         status: active
>>
>> I would rather investigate why the PROMISC and ALLMULTI flags are set
>> on this interface.
>
> trunked?

thanks for your input. No, the interface is configured in a very straightforward way without any bells and whistles. It has four IPv4 addresses, plus one auto-generated IPv6 address (link layer local). I don't use bridging and didn't enable multicast in /etc/sysctl.conf, either.

There are also no processes specifically using this interface (i.e., no tcpdump or similar). This is the whole process list:

$ ps ax
  PID TT   STAT      TIME COMMAND
    1 ??   Is     0:00.01 /sbin/init
 2399 ??   Is     0:00.00 ntpd: [priv] (ntpd)
19341 ??   I      0:00.09 ntpd: ntp engine (ntpd)
12690 ??   I      0:00.01 ntpd: dns engine (ntpd)
11247 ??   Is     0:00.02 /usr/sbin/sshd -u0
 2024 ??   Is     0:00.31 cron
32158 ??   Ss     0:01.19 sendmail: accepting connections (sendmail)
24559 ??   Ss     0:17.55 bgpd: parent (bgpd)
12368 ??   S      0:15.77 bgpd: session engine (bgpd)
18994 ??   S      1:05.98 bgpd: route decision engine (bgpd)
 8611 ??   Ss     0:02.39 ifstated -v
11105 ??   S      0:05.28 syslogd -n -a /var/www/dev/log -a /var/empty/dev/log
27237 ??   Is     0:00.03 syslogd: [priv] (syslogd)
27968 ??   S      0:00.51 pflogd: [running] -s 256 -i pflog0 -f /var/log/pflog (pflogd)
13936 ??   Is     0:00.05 pflogd: [priv] (pflogd)
31560 ??   Ss     0:00.39 sshd: u...@ttyp0 (sshd)
29917 ??   Ss     0:00.44 sshd: u...@ttyp1 (sshd)
29148 p0   Ss+    0:00.03 bash
16540 p1   Ss     0:00.04 bash
28953 p1   R+/1   0:00.00 ps -ax
17757 C0-  S      0:00.53 runsvdir -P /var/service log: ...
 9629 C0   Is+    0:00.00 /usr/libexec/getty std.9600 ttyC0
  397 C1   Is+    0:00.00 /usr/libexec/getty std.9600 ttyC1
25085 C2   Is+    0:00.00 /usr/libexec/getty std.9600 ttyC2
32349 C3   Is+    0:00.00 /usr/libexec/getty std.9600 ttyC3
12522 C5   Is+    0:00.00 /usr/libexec/getty std.9600 ttyC5
$

None of these suggests to me that ALLMULTI or PROMISC should be on, and ifconfig's man page doesn't suggest that I can easily turn them off.

If you have any suggestions about how to debug this, I'm all ears.

TIA!

Kind regards,
--Toni++
em(4) detailed errors
Hi,

Is there a way to get detailed em(4) device errors without having to recompile the kernel with EM_DEBUG?

I'm trying to find the reason(s) for input errors, but netstat only gives errors as a sum of dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far as I can see :-(

Manuel