Re: em(4) detailed errors

2011-01-28 Thread Manuel Guesdon
Hi,

On Thu, 18 Nov 2010 16:38:55 +0100
Manuel Guesdon <ml+openbsd.m...@oxymium.net> wrote:
| Is there a way to get detailed em(4) device errors without having to
| recompile kernel with EM_DEBUG ?
| I try to find in-errors reason(s) but netstat only gives errors as a sum of
| dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
| as I can see :-(

It took me some time to upgrade to version 4.8 and modify the kernel to get
detailed info on demand.
I still have the problem on multiple servers (all with very similar hardware
and software).


em4 at pci11 dev 0 function 0 Intel PRO/1000 (82576) rev 0x01: apic 9 int 15 (irq 15), address 00:25:90:05:53:3e


em4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:25:90:05:53:3e
        description: br1.th2
        priority: 0
        groups: pf-off
        media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
        status: active
        inet6 fe80::225:90ff:fe05:533e%em4 prefixlen 64 scopeid 0x5

# netstat -I em4 -d
Name  Mtu   Network      Address                 Ipkts   Ierrs      Opkts Oerrs Colls Drop
em4   1500  <Link>       00:25:90:05:53:3e  8936976317 4614835 5430820423     0     0    0
em4   1500  fe80::%em4/  fe80::225:90ff:fe  8936976317 4614835 5430820423     0     0    0

Detailed stats:
em4: Dropped PKTS = 0
em4: Excessive collisions = 0
em4: Symbol errors = 0
em4: Sequence errors = 0
em4: Defer count = 353
em4: Missed Packets = 4241586
em4: Receive No Buffers = 5297798
em4: Receive Length Errors = 0
em4: Receive errors = 0
em4: Crc errors = 0
em4: Alignment errors = 0
em4: Carrier extension errors = 0
em4: RX overruns = 372913
em4: watchdog timeouts = 0
em4: XON Rcvd = 3086
em4: XON Xmtd = 592675
em4: XOFF Rcvd = 164449
em4: XOFF Xmtd = 4833995
em4: Good Packets Rcvd = 8936940571
em4: Good Packets Xmtd = 5430798347
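
To see how much of the netstat Ierrs total the individual counters account for, the figures above can be added up. The following is only a sketch: the counter values are copied from the output above, but the choice of which counters the driver sums into Ierrs is an assumption based on the sum quoted at the top of this message (the name ASSUMED_IERR_COUNTERS is made up for this example), and "Receive No Buffers" is assumed not to be part of it. Check if_em.c before relying on it.

```python
# Sketch: compare the per-counter em(4) dump with netstat's aggregate Ierrs.
# Counter values are copied from the output above. WHICH counters feed
# Ierrs is an assumption drawn from the quoted sum; verify against if_em.c.

DUMP = """\
em4: Dropped PKTS = 0
em4: Missed Packets = 4241586
em4: Receive No Buffers = 5297798
em4: Receive errors = 0
em4: Crc errors = 0
em4: Alignment errors = 0
em4: Carrier extension errors = 0
em4: RX overruns = 372913
"""

NETSTAT_IPKTS = 8936976317
NETSTAT_IERRS = 4614835

# Hypothetical selection of counters assumed to be summed into Ierrs.
ASSUMED_IERR_COUNTERS = (
    "Dropped PKTS",
    "Missed Packets",
    "Receive errors",
    "Crc errors",
    "Alignment errors",
    "Carrier extension errors",
    "RX overruns",
)


def parse_dump(text):
    """Turn 'em4: Name = value' lines into a {name: int} dict."""
    counters = {}
    for line in text.splitlines():
        _, _, rest = line.partition(": ")
        name, sep, value = rest.partition(" = ")
        if sep:
            counters[name.strip()] = int(value)
    return counters


counters = parse_dump(DUMP)
explained = sum(counters[name] for name in ASSUMED_IERR_COUNTERS)
unexplained = NETSTAT_IERRS - explained
error_ratio = NETSTAT_IERRS / NETSTAT_IPKTS

# Print only the counters that actually contribute to the sum.
for name in ASSUMED_IERR_COUNTERS:
    if counters[name]:
        print(f"{name}: {counters[name]}")
print(f"explained by assumed counters: {explained} of {NETSTAT_IERRS}")
print(f"not explained: {unexplained}")
print(f"Ierrs per received packet: {error_ratio:.4%}")
```

Under these assumptions, "Missed Packets" and "RX overruns" together cover almost all of Ierrs. The small remainder could simply come from the two outputs being taken at slightly different moments.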

At this time, the interface carries around 56 Mbps inbound and 35 Mbps outbound.
Server load is 0.14.

The em4 interface is connected to an interface on another server with nearly the
same config (but I get the same kind of problem on interfaces connected to a
switch over copper and fiber).

Errors seem somewhat related to interface load, but not very closely.
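
One way to check the relation to load more precisely is to sample the netstat counters at intervals and compare the packet rate with the error ratio per interval. A minimal sketch follows, assuming two samples taken 60 seconds apart: the second sample uses the Ipkts and Ierrs values shown above, while the first sample and the interval are hypothetical values invented for illustration. Sample and interval_stats are names made up for this example.

```python
# Sketch: per-interval packet rate and input error ratio from two samples
# of the netstat counters. Only the second sample is real (taken from the
# netstat output above); the first sample and the 60 s gap are HYPOTHETICAL.

from dataclasses import dataclass


@dataclass
class Sample:
    t: float     # time of the sample, in seconds
    ipkts: int   # netstat Ipkts
    ierrs: int   # netstat Ierrs


def interval_stats(old, new):
    """Return (packets per second, errors per received packet)."""
    dt = new.t - old.t
    dpkts = new.ipkts - old.ipkts
    derrs = new.ierrs - old.ierrs
    if dt <= 0 or dpkts < 0 or derrs < 0:
        raise ValueError("samples out of order or counters were reset")
    pps = dpkts / dt
    err_ratio = derrs / dpkts if dpkts else 0.0
    return pps, err_ratio


earlier = Sample(t=0.0, ipkts=8936376317, ierrs=4614535)   # hypothetical
later = Sample(t=60.0, ipkts=8936976317, ierrs=4614835)    # from netstat above

pps, err_ratio = interval_stats(earlier, later)
print(f"{pps:.0f} pkt/s in, {err_ratio:.4%} input errors per packet")
```

Collecting such pairs over a day and plotting the error ratio against the packet rate would show whether the errors follow load or arrive in bursts independent of it.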

Each server has 2 quad-port cards (82576) + 2 ports on the motherboard (82576
too).

I was thinking of a problem with interrupt mitigation.

Any ideas, comments, things to test?

Thank you !

Manuel



Re: em(4) detailed errors

2010-11-24 Thread Stuart Henderson
On 2010-11-23, Toni Mueller <openbsd-m...@oeko.net> wrote:
> Hi,
>
> On Tue, 23.11.2010 at 11:07:40 -0500, Ted Unangst <ted.unan...@gmail.com> wrote:
>> On Tue, Nov 23, 2010 at 10:02 AM, Otto Moerbeek <o...@drijf.net> wrote:
>>> On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
>>>> # ifconfig em3
>>>> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>>>>         lladdr 00:30:48:94:0b:21
>>>>         priority: 0
>>>>         media: Ethernet autoselect (1000baseT full-duplex,master)
>>>>         ^
>>>>         status: active
>>>
>>> I would rather investigate why the PROMISC and ALLMULTI flags are set
>>> on this interface.
>>
>> trunked?
>
> thanks for your input. No, the interface is configured in a very
> straightforward way without any bells and whistles. It has four IPv4
> addresses, plus one auto-generated IPv6 address (link-local).
> I don't use bridging and didn't enable multicast in /etc/sysctl.conf,
> either.

carp will do this too (and it seems it doesn't clear the
PROMISC,ALLMULTI even when the carp interface is destroyed).
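
For reference, the hexadecimal word that ifconfig prints before the flag names can be decoded by hand. The sketch below uses the classic BSD IFF_* bit values from <net/if.h> and lists only a subset of them; decode_flags and flag_diff are names made up for this example. It compares the 8b43 seen on em3 with the 8843 seen on em4 elsewhere in this thread.

```python
# Sketch: decode the hexadecimal flags word printed by ifconfig.
# Bit values are the classic BSD IFF_* constants from <net/if.h>;
# this table is a subset, enough for the flag words in this thread.

IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0004: "DEBUG",
    0x0008: "LOOPBACK",
    0x0010: "POINTOPOINT",
    0x0040: "RUNNING",
    0x0080: "NOARP",
    0x0100: "PROMISC",
    0x0200: "ALLMULTI",
    0x0400: "OACTIVE",
    0x0800: "SIMPLEX",
    0x8000: "MULTICAST",
}


def decode_flags(word):
    """Return the names of all known flag bits set in word, lowest bit first."""
    return [name for bit, name in sorted(IFF_FLAGS.items()) if word & bit]


def flag_diff(a, b):
    """Return the names of the flags that differ between two flag words."""
    return decode_flags(a ^ b)


print("em3 8b43:", ",".join(decode_flags(0x8B43)))
print("em4 8843:", ",".join(decode_flags(0x8843)))
print("difference:", ",".join(flag_diff(0x8B43, 0x8843)))
```

The two words differ only in bits 0x0100 and 0x0200, which are PROMISC and ALLMULTI.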



Re: em(4) detailed errors

2010-11-23 Thread Toni Mueller
Hi,

On Thu, 18.11.2010 at 16:38:55 +0100, Manuel Guesdon <ml+openbsd.m...@oxymium.net> wrote:
> Is there a way to get detailed em(4) device errors without having to
> recompile kernel with EM_DEBUG ?
> I try to find in-errors reason(s) but netstat only gives errors as a sum of
> dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
> as I can see :-(

I'm having a similar problem. On one 4x em(4) machine, I get a lot of
input errors and, much more serious, intermittent packet loss, but only
on one interface out of two with similar traffic levels (~1-4kpps per
direction).

After reading the latest em(4) threads, I also found this very strange
thing, which must have been automatically configured:

# ifconfig em3
em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:30:48:94:0b:21
        priority: 0
        media: Ethernet autoselect (1000baseT full-duplex,master)
        ^
        status: active


I'm unsure about how to remove this feature from this (physical)
interface, and the machine uses none of carp, pfsync or sasyncd.
The hardware for this interface is

em3 at pci5 dev 0 function 0 Intel PRO/1000MT (82573L) rev 0x00: apic 2 int 17 (irq 11), address 00:30:48:94:0b:21

as detected by OpenBSD 4.8-stable (i386).

The ability to selectively enable or disable debugging for individual
devices at runtime would be a great feature, from a sysadmin's
perspective.


-- 
Kind regards,
--Toni++



Re: em(4) detailed errors

2010-11-23 Thread Claudio Jeker
On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
> Hi,
>
> On Thu, 18.11.2010 at 16:38:55 +0100, Manuel Guesdon <ml+openbsd.m...@oxymium.net> wrote:
>> Is there a way to get detailed em(4) device errors without having to
>> recompile kernel with EM_DEBUG ?
>> I try to find in-errors reason(s) but netstat only gives errors as a sum of
>> dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
>> as I can see :-(
>
> I'm having a similar problem. On one 4x em(4) machine, I get a lot of
> input errors and, much more serious, intermittent packet loss, but only
> on one interface out of two with similar traffic levels (~1-4kpps per
> direction).
>
> After reading the latest em(4) threads, I also found this very strange
> thing, which must have been automatically configured:
>
> # ifconfig em3
> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:30:48:94:0b:21
>         priority: 0
>         media: Ethernet autoselect (1000baseT full-duplex,master)
>         ^
>         status: active
>
>
> I'm unsure about how to remove this feature from this (physical)
> interface, and the machine uses none of carp, pfsync or sasyncd.
> The hardware for this interface is
>

If you wonder about the "master" in the media line, rest assured that
all is fine. 1000baseT requires autonegotiation to always run, and every link
needs one PHY running as master (normally the switch).

> em3 at pci5 dev 0 function 0 Intel PRO/1000MT (82573L) rev 0x00: apic 2 int 17 (irq 11), address 00:30:48:94:0b:21
>
> as detected by OpenBSD 4.8-stable (i386).
>
> The ability to selectively enable or disable debugging for individual
> devices at runtime would be a great feature, from a sysadmin's
> perspective.
 

-- 
:wq Claudio



Re: em(4) detailed errors

2010-11-23 Thread Otto Moerbeek
On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:

> Hi,
>
> On Thu, 18.11.2010 at 16:38:55 +0100, Manuel Guesdon <ml+openbsd.m...@oxymium.net> wrote:
>> Is there a way to get detailed em(4) device errors without having to
>> recompile kernel with EM_DEBUG ?
>> I try to find in-errors reason(s) but netstat only gives errors as a sum of
>> dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
>> as I can see :-(
>
> I'm having a similar problem. On one 4x em(4) machine, I get a lot of
> input errors and, much more serious, intermittent packet loss, but only
> on one interface out of two with similar traffic levels (~1-4kpps per
> direction).
>
> After reading the latest em(4) threads, I also found this very strange
> thing, which must have been automatically configured:
>
> # ifconfig em3
> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 00:30:48:94:0b:21
>         priority: 0
>         media: Ethernet autoselect (1000baseT full-duplex,master)
>         ^
>         status: active
>
>
> I'm unsure about how to remove this feature from this (physical)
> interface, and the machine uses none of carp, pfsync or sasyncd.
> The hardware for this interface is

I would rather investigate why the PROMISC and ALLMULTI flags are set
on this interface.

-Otto

 
> em3 at pci5 dev 0 function 0 Intel PRO/1000MT (82573L) rev 0x00: apic 2 int 17 (irq 11), address 00:30:48:94:0b:21
>
> as detected by OpenBSD 4.8-stable (i386).
>
> The ability to selectively enable or disable debugging for individual
> devices at runtime would be a great feature, from a sysadmin's
> perspective.
>
>
> --
> Kind regards,
> --Toni++



Re: em(4) detailed errors

2010-11-23 Thread Ted Unangst
On Tue, Nov 23, 2010 at 10:02 AM, Otto Moerbeek <o...@drijf.net> wrote:
> On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
>> # ifconfig em3
>> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>>         lladdr 00:30:48:94:0b:21
>>         priority: 0
>>         media: Ethernet autoselect (1000baseT full-duplex,master)
>>         ^
>>         status: active
>
> I would rather investigate why the PROMISC and ALLMULTI flags are set
> on this interface.

trunked?



Re: em(4) detailed errors

2010-11-23 Thread Toni Mueller
Hi,

On Tue, 23.11.2010 at 11:07:40 -0500, Ted Unangst <ted.unan...@gmail.com> wrote:
> On Tue, Nov 23, 2010 at 10:02 AM, Otto Moerbeek <o...@drijf.net> wrote:
>> On Tue, Nov 23, 2010 at 03:16:57PM +0100, Toni Mueller wrote:
>>> # ifconfig em3
>>> em3: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
>>>         lladdr 00:30:48:94:0b:21
>>>         priority: 0
>>>         media: Ethernet autoselect (1000baseT full-duplex,master)
>>>         ^
>>>         status: active
>>
>> I would rather investigate why the PROMISC and ALLMULTI flags are set
>> on this interface.
>
> trunked?

Thanks for your input. No, the interface is configured in a very
straightforward way without any bells and whistles. It has four IPv4
addresses, plus one auto-generated IPv6 address (link-local).
I don't use bridging and didn't enable multicast in /etc/sysctl.conf,
either.

There are also no processes specifically using this interface (i.e., no
tcpdump or similar). This is the whole process list:


$ ps ax
  PID TT  STAT      TIME COMMAND
    1 ??  Is     0:00.01 /sbin/init
 2399 ??  Is     0:00.00 ntpd: [priv] (ntpd)
19341 ??  I      0:00.09 ntpd: ntp engine (ntpd)
12690 ??  I      0:00.01 ntpd: dns engine (ntpd)
11247 ??  Is     0:00.02 /usr/sbin/sshd -u0
 2024 ??  Is     0:00.31 cron
32158 ??  Ss     0:01.19 sendmail: accepting connections (sendmail)
24559 ??  Ss     0:17.55 bgpd: parent (bgpd)
12368 ??  S      0:15.77 bgpd: session engine (bgpd)
18994 ??  S      1:05.98 bgpd: route decision engine (bgpd)
 8611 ??  Ss     0:02.39 ifstated -v
11105 ??  S      0:05.28 syslogd -n -a /var/www/dev/log -a /var/empty/dev/log
27237 ??  Is     0:00.03 syslogd: [priv] (syslogd)
27968 ??  S      0:00.51 pflogd: [running] -s 256 -i pflog0 -f /var/log/pflog (pflogd)
13936 ??  Is     0:00.05 pflogd: [priv] (pflogd)
31560 ??  Ss     0:00.39 sshd: u...@ttyp0 (sshd)
29917 ??  Ss     0:00.44 sshd: u...@ttyp1 (sshd)
29148 p0  Ss+    0:00.03 bash
16540 p1  Ss     0:00.04 bash
28953 p1  R+/1   0:00.00 ps -ax
17757 C0- S      0:00.53 runsvdir -P /var/service log: ...
 9629 C0  Is+    0:00.00 /usr/libexec/getty std.9600 ttyC0
  397 C1  Is+    0:00.00 /usr/libexec/getty std.9600 ttyC1
25085 C2  Is+    0:00.00 /usr/libexec/getty std.9600 ttyC2
32349 C3  Is+    0:00.00 /usr/libexec/getty std.9600 ttyC3
12522 C5  Is+    0:00.00 /usr/libexec/getty std.9600 ttyC5
$


None of these suggests to me that ALLMULTI or PROMISC should be on,
and ifconfig's man page doesn't suggest that I can easily turn
them off.

If you have any suggestions about how to debug this, I'm all ears.


TIA!



Kind regards,
--Toni++



em(4) detailed errors

2010-11-18 Thread Manuel Guesdon
Hi,

Is there a way to get detailed em(4) device errors without having to
recompile kernel with EM_DEBUG ?
I try to find in-errors reason(s) but netstat only gives errors as a sum of
dropped_pkts + stats.rxerrc + stats.crcerrs + sc->stats.algnerrc +... as far
as I can see :-(


Manuel