On Wed, Mar 02, 2011 at 08:34:02PM +0100, Manuel Guesdon wrote:
> On Wed, 2 Mar 2011 21:52:03 +0900
> Ryan McBride <mcbr...@openbsd.org> wrote:
>
> >| On Mon, Feb 28, 2011 at 12:49:01PM +0100, Manuel Guesdon wrote:
> >| > OK. Anyway NIC buffers restrict the number of buffered packets. But
> >| > the problem remains: why can't a (for example) dual Xeon
> >| > E5520@2.27GHz with Intel PRO/1000 (82576) route 150kpps without
> >| > Ierrs :-)
> >| > http://www.oxymium.net/tmp/core3-dmesg
> >|
> >| I've done some more comprehensive testing and talked to some other
> >| developers, and it seems that 150kpps is in the range of what is
> >| expected for such hardware with an unoptimized install.
>
> Thank you for the help!
Hmpf. My last tests were done with ix(4) and it performed way better. Not
sure if something got back into em(4) that makes the driver slow or if it
is something different.

> >| One thing that seems to have a big performance impact is
> >| net.inet.ip.ifq.maxlen. If and only if your network cards are all
> >| supported by MCLGETI (i.e. they show LWM/CWM/HWM values in 'systat
> >| mbufs'), you can try increasing ifq.maxlen until you don't see
> >| net.inet.ip.ifq.drops incrementing anymore under constant load.
>
> Yes, all my NIC interfaces have LWM/CWM/HWM values:
>
> IFACE   LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
> System               256 83771  5502
>                       2k   160  1252
> em0            37     2k     4     4   256     4
> em1           258     2k     4     4   256     4
> em2        372751     2k     7     4   256     7
> em3          8258     2k     4     4   256     4
> em4         25072     2k    63     4   256    63
> em5          3658     2k     8     4   256     8
> em6        501288     2k    24     4   256    24
> em7            22     2k     4     4   256     4
> em8         36551     2k    23     4   256    23
> em9         52053     2k     5     4   256     4

Woohoo. That is a lot of livelocks you hit. In other words, you are losing
ticks because something is spinning too long in the kernel. Interfaces with
a very low CWM but a high pps rate are the ones you need to investigate.

Additionally, I would like to see your netstat -m and vmstat -m output. If
I see it right, you have 83771 mbufs allocated in your system. This sounds
like a serious mbuf leak and could actually be the reason for your bad
performance. It is very well possible that most of your buffer allocations
fail, causing the tiny rings and suboptimal performance.

> I've already increased it to 2048 some time ago with good effect on
> ifq.drops, but even when ifq.drops doesn't increase, I still have
> Ierrs on interfaces (I've just verified this right now) :-)

Having some Ierrs is not a big issue; always put them in perspective with
the number of packets received, e.g.

em6  1500  <Link>  00:30:48:9c:3a:80  72007980648  143035  62166589667  0  0

This interface had 143035 Ierrs, but it also passed 72 billion packets, so
this is far less than 1% and not a problem.
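The "put Ierrs in perspective" arithmetic is easy to script. A sketch (the
column positions are assumptions based on the netstat -i line quoted above,
where Ipkts comes right before Ierrs); awk does the 64-bit math:

```shell
# Error rate for em6, using the Ipkts and Ierrs values quoted above.
ipkts=72007980648
ierrs=143035
echo "$ipkts $ierrs" | awk '{ printf "%.6f%% Ierrs\n", $2 / $1 * 100 }'
# prints: 0.000199% Ierrs
```

Anything that prints well under 1% is, per the reasoning above, noise; the
counters to actually chase are the livelocks and ifq.drops.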
> I've made some changes to em some time ago to dump card stats with a
> -debug option, and it gives me stuff like this:
> -----------------------
> em4: Dropped PKTS = 0
> em4: Excessive collisions = 0
> em4: Symbol errors = 0
> em4: Sequence errors = 0
> em4: Defer count = 3938
> em4: Missed Packets = 17728103
> em4: Receive No Buffers = 21687370
> em4: Receive Length Errors = 0
> em4: Receive errors = 0
> em4: Crc errors = 0
> em4: Alignment errors = 0
> em4: Carrier extension errors = 0
> em4: RX overruns = 1456725
> em4: watchdog timeouts = 0
> em4: XON Rcvd = 31813
> em4: XON Xmtd = 2304158
> em4: XOFF Rcvd = 935928
> em4: XOFF Xmtd = 20031226
> em4: Good Packets Rcvd = 33772245185
> em4: Good Packets Xmtd = 20662758161
> -----------------------
> em4: Dropped PKTS = 0
> em4: Excessive collisions = 0
> em4: Symbol errors = 0
> em4: Sequence errors = 0
> em4: Defer count = 3938
> em4: Missed Packets = 17728457
> em4: Receive No Buffers = 21687421
> em4: Receive Length Errors = 0
> em4: Receive errors = 0
> em4: Crc errors = 0
> em4: Alignment errors = 0
> em4: Carrier extension errors = 0
> em4: RX overruns = 1456730
> em4: watchdog timeouts = 0
> em4: XON Rcvd = 31813
> em4: XON Xmtd = 2304166
> em4: XOFF Rcvd = 935928
> em4: XOFF Xmtd = 20031588
> em4: Good Packets Rcvd = 33772265127
> em4: Good Packets Xmtd = 20662759039
>
> So if I understand this correctly, the card indicates that there are
> Missed Packets because the NIC sometimes does not have enough buffer
> space to store them, which seems strange with 8000 int/s and a 40K
> buffer (40K for Rx, 24K for Tx, as seen in if_em.c).

The FIFO on the card doesn't matter that much. The problem is the DMA ring
and the number of slots on the ring that are actually usable. This is the
CWM in the systat mbufs output. MCLGETI() reduces the buffers on the ring
to limit the work getting into the system over a specific network card.

> One of my questions is how to know that the system is heavily loaded.
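Two snapshots like the ones above are easiest to read as per-counter
deltas rather than absolute values. A sketch, assuming each dump has been
saved to its own file (snap1.txt/snap2.txt are hypothetical names; only
the counters that changed are shown for brevity):

```shell
# Two trimmed em4 stat dumps, taken a moment apart, as quoted above.
cat > snap1.txt <<'EOF'
em4: Missed Packets = 17728103
em4: Receive No Buffers = 21687370
em4: RX overruns = 1456725
EOF
cat > snap2.txt <<'EOF'
em4: Missed Packets = 17728457
em4: Receive No Buffers = 21687421
em4: RX overruns = 1456730
EOF

# Split on " = ", remember the first file's values, then print the deltas.
awk -F' = ' 'NR==FNR { a[$1]=$2; next } { printf "%s delta = %d\n", $1, $2 - a[$1] }' \
    snap1.txt snap2.txt
# prints:
# em4: Missed Packets delta = 354
# em4: Receive No Buffers delta = 51
# em4: RX overruns delta = 5
```

The deltas make it obvious which counters are still moving under load
(here Missed Packets and Receive No Buffers, the ring-exhaustion symptoms
discussed above) versus ones that only look scary because they are
lifetime totals.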
> systat -s 2 vmstat gives me this information:
>
> Proc:r  d  s  w    Csw   Trp   Sys   Int   Sof   Flt
>        14    149      2   509 20118    98    31
>
> 3.5%Int  0.5%Sys  0.0%Usr  0.0%Nic  96.0%Idle
> |    |    |    |    |    |    |    |    |    |    |
>
> which makes me think that the system is really not very loaded, but I
> may be missing a point...

So you have this 3.5% Int and 0.5% Sys load and are still hitting tons of
LIVELOCKS (i.e. the counters increase all the time)? It really looks like
there is a different problem (the mentioned mbuf leak) slowing you down.

--
:wq Claudio