Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-22 Thread Robert Watson

On Sun, 21 Nov 2004, Sean McNeil wrote:

 I have to disagree.  Packet loss is likely according to some of my
 tests.  With the re driver, no change except placing a 100BT setup with
 no packet loss to a gigE setup (both linksys switches) will cause
 serious packet loss at 20Mbps data rates.  I have discovered the only
 way to get good performance with no packet loss was to
 
 1) Remove interrupt moderation
 2) defrag each mbuf that comes in to the driver.

Sounds like you're bumping into a queue limit that is made worse by
interrupting less frequently, resulting in bursts of packets that are
relatively large, rather than a trickle of packets at a higher rate.
Perhaps a limit on the number of outstanding descriptors in the driver or
hardware and/or a limit in the netisr/ifqueue queue depth.  You might try
changing the default IFQ_MAXLEN from 50 to 128 to increase the size of the
ifnet and netisr queues.  You could also try setting net.isr.enable=1 to
enable direct dispatch, which in the in-bound direction would reduce the
number of context switches and queueing.  It sounds like the device driver
has a limit of 256 receive and transmit descriptors, which one supposes is
probably derived from the hardware limit, but I have no documentation on
hand so can't confirm that.
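
(For reference, a rough, illustrative sketch of the classic ifqueue handoff
being described -- not the actual stack code.  IF_QFULL, IF_DROP and
IF_ENQUEUE are the real macros of that era, IFQ_MAXLEN defaulted to 50, and
the function name is invented.  A burst deeper than ifq_maxlen is simply
freed at this point and only shows up in the ifq_drops counter:)

    #include <sys/param.h>
    #include <sys/errno.h>
    #include <sys/mbuf.h>
    #include <net/if.h>
    #include <net/if_var.h>

    /* Illustrative only. */
    static int
    example_ifq_handoff(struct ifqueue *ifq, struct mbuf *m)
    {
            if (IF_QFULL(ifq)) {            /* ifq_len >= ifq_maxlen? */
                    IF_DROP(ifq);           /* bump ifq_drops ... */
                    m_freem(m);             /* ... and silently drop */
                    return (ENOBUFS);
            }
            IF_ENQUEUE(ifq, m);             /* otherwise queue for the driver */
            return (0);
    }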

It would be interesting on the send and receive sides to inspect the
counters for drops at various points in the network stack; i.e., are we
dropping packets at the ifq handoff because we're overfilling the
descriptors in the driver, are packets dropped on the inbound path going
into the netisr due to over-filling before the netisr is scheduled, etc. 
And, it's probably interesting to look at stats on filling the socket
buffers for the same reason: if bursts of packets come up the stack, the
socket buffers could well be being over-filled before the user thread can
run.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Principal Research Scientist, McAfee Research





Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-22 Thread Sean McNeil
On Mon, 2004-11-22 at 11:34 +, Robert Watson wrote:
 On Sun, 21 Nov 2004, Sean McNeil wrote:
 
  I have to disagree.  Packet loss is likely according to some of my
  tests.  With the re driver, no change except placing a 100BT setup with
  no packet loss to a gigE setup (both linksys switches) will cause
  serious packet loss at 20Mbps data rates.  I have discovered the only
  way to get good performance with no packet loss was to
  
  1) Remove interrupt moderation
  2) defrag each mbuf that comes in to the driver.
 
 Sounds like you're bumping into a queue limit that is made worse by
 interrupting less frequently, resulting in bursts of packets that are
 relatively large, rather than a trickle of packets at a higher rate.
 Perhaps a limit on the number of outstanding descriptors in the driver or
 hardware and/or a limit in the netisr/ifqueue queue depth.  You might try
 changing the default IFQ_MAXLEN from 50 to 128 to increase the size of the
 ifnet and netisr queues.  You could also try setting net.isr.enable=1 to
 enable direct dispatch, which in the in-bound direction would reduce the
 number of context switches and queueing.  It sounds like the device driver
 has a limit of 256 receive and transmit descriptors, which one supposes is
 probably derived from the hardware limit, but I have no documentation on
 hand so can't confirm that.

I've tried bumping IFQ_MAXLEN and it made no difference.  I could rerun
this test to be 100% certain I suppose.  It was done a while back.  I
haven't tried net.isr.enable=1, but packet loss is in the transmission
direction.  The device driver has been modified to have 1024 transmit
and receive descriptors each as that is the hardware limitation.  That
didn't matter either.  With 1024 descriptors I still lost packets
without the m_defrag.

The most difficult thing for me to understand is:  if this is some sort
of resource limitation why will it work with a slower phy layer
perfectly and not with the gigE?  The only thing I could think of was
that the old driver was doing m_defrag calls when it filled the transmit
descriptor queues up to a certain point.  Understanding the effects of
m_defrag would be helpful in figuring this out I suppose.
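
(As a rough illustration of what m_defrag(9) does here: it copies a long
mbuf chain into as few mbufs/clusters as possible, so a packet scattered
across many small mbufs no longer needs one transmit descriptor per
fragment.  A minimal sketch of a driver transmit path calling it, assuming
the 5.x interface; the function and variable names are invented:)

    #include <sys/param.h>
    #include <sys/mbuf.h>

    /* Illustrative only: defragment an outgoing chain before DMA mapping. */
    static struct mbuf *
    example_prepare_tx(struct mbuf *m_head)
    {
            struct mbuf *m;

            /* M_DONTWAIT: we are in the transmit path, do not sleep. */
            m = m_defrag(m_head, M_DONTWAIT);
            if (m == NULL) {
                    /* Could not allocate; drop the packet here. */
                    m_freem(m_head);
                    return (NULL);
            }
            return (m);     /* ideally now a single contiguous mbuf/cluster */
    }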

 It would be interesting on the send and receive sides to inspect the
 counters for drops at various points in the network stack; i.e., are we
 dropping packets at the ifq handoff because we're overfilling the
 descriptors in the driver, are packets dropped on the inbound path going
 into the netisr due to over-filling before the netisr is scheduled, etc. 
 And, it's probably interesting to look at stats on filling the socket
 buffers for the same reason: if bursts of packets come up the stack, the
 socket buffers could well be being over-filled before the user thread can
 run.

Yes, this would be very interesting and should point out the problem.  I
would do such a thing if I had enough knowledge of the network pathways.
Alas, I am very green in this area.  The receive side has no issues,
though, so I would focus on transmit counters (with assistance).





Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-22 Thread John-Mark Gurney
Sean McNeil wrote this message on Mon, Nov 22, 2004 at 12:14 -0800:
 On Mon, 2004-11-22 at 11:34 +, Robert Watson wrote:
  On Sun, 21 Nov 2004, Sean McNeil wrote:
  
   I have to disagree.  Packet loss is likely according to some of my
   tests.  With the re driver, no change except placing a 100BT setup with
   no packet loss to a gigE setup (both linksys switches) will cause
   serious packet loss at 20Mbps data rates.  I have discovered the only
   way to get good performance with no packet loss was to
   
   1) Remove interrupt moderation
   2) defrag each mbuf that comes in to the driver.
  
  Sounds like you're bumping into a queue limit that is made worse by
  interrupting less frequently, resulting in bursts of packets that are
  relatively large, rather than a trickle of packets at a higher rate.
  Perhaps a limit on the number of outstanding descriptors in the driver or
  hardware and/or a limit in the netisr/ifqueue queue depth.  You might try
  changing the default IFQ_MAXLEN from 50 to 128 to increase the size of the
  ifnet and netisr queues.  You could also try setting net.isr.enable=1 to
  enable direct dispatch, which in the in-bound direction would reduce the
  number of context switches and queueing.  It sounds like the device driver
  has a limit of 256 receive and transmit descriptors, which one supposes is
  probably derived from the hardware limit, but I have no documentation on
  hand so can't confirm that.
 
 I've tried bumping IFQ_MAXLEN and it made no difference.  I could rerun

And the default for if_re is RL_IFQ_MAXLEN which is already 512...  As
is mentioned below, the card can do 64 segments (which usually means 32
packets since each packet usually has a header + payload in separate
mbufs)...

 this test to be 100% certain I suppose.  It was done a while back.  I
 haven't tried net.isr.enable=1, but packet loss is in the transmission
 direction.  The device driver has been modified to have 1024 transmit
 and receive descriptors each as that is the hardware limitation.  That
 didn't matter either.  With 1024 descriptors I still lost packets
 without the m_defrag.

hmmm...  you know, I wonder if this is a problem with the if_re not
pulling enough data from memory before starting the transmit...  Though
we currently have it set for unlimited... so, that doesn't seem like it
would be it..

 The most difficult thing for me to understand is:  if this is some sort
 of resource limitation why will it work with a slower phy layer
 perfectly and not with the gigE?  The only thing I could think of was
 that the old driver was doing m_defrag calls when it filled the transmit
 descriptor queues up to a certain point.  Understanding the effects of
 m_defrag would be helpful in figuring this out I suppose.

maybe the chip just can't keep the transmit fifo loaded at the higher
speeds...  is it possible vls is doing a writev for multisegmented UDP
packet?   I'll have to look at this again...
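
(For context, an illustrative guess at the pattern in question: on a
connected UDP socket, writev(2) sends the concatenated iovecs as a single
datagram, and that datagram typically reaches the driver as an mbuf chain
of several segments rather than one contiguous buffer.  The names and sizes
below are invented for the example; 7 x 188 bytes matches the 1316-byte
datagrams mentioned later in the thread.)

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <netinet/in.h>
    #include <unistd.h>

    /* Illustrative only: one 1316-byte datagram built from two buffers. */
    static ssize_t
    send_one_datagram(int s, const struct sockaddr_in *dst)
    {
            static char cell0[188], rest[6 * 188];  /* 7 MPEG-TS cells total */
            struct iovec iov[2];

            if (connect(s, (const struct sockaddr *)dst, sizeof(*dst)) == -1)
                    return (-1);

            iov[0].iov_base = cell0;
            iov[0].iov_len  = sizeof(cell0);
            iov[1].iov_base = rest;
            iov[1].iov_len  = sizeof(rest);

            return (writev(s, iov, 2));     /* two iovecs, one UDP packet */
    }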

  It would be interesting on the send and receive sides to inspect the
  counters for drops at various points in the network stack; i.e., are we
  dropping packets at the ifq handoff because we're overfilling the
  descriptors in the driver, are packets dropped on the inbound path going
  into the netisr due to over-filling before the netisr is scheduled, etc. 
  And, it's probably interesting to look at stats on filling the socket
  buffers for the same reason: if bursts of packets come up the stack, the
  socket buffers could well be being over-filled before the user thread can
  run.
 
 Yes, this would be very interesting and should point out the problem.  I
 would do such a thing if I had enough knowledge of the network pathways.
 Alas, I am very green in this area.  The receive side has no issues,
 though, so I would focus on transmit counters (with assistance).

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-22 Thread Sean McNeil
Hi John-Mark,

On Mon, 2004-11-22 at 13:31 -0800, John-Mark Gurney wrote:
 Sean McNeil wrote this message on Mon, Nov 22, 2004 at 12:14 -0800:
  On Mon, 2004-11-22 at 11:34 +, Robert Watson wrote:
   On Sun, 21 Nov 2004, Sean McNeil wrote:
   
I have to disagree.  Packet loss is likely according to some of my
tests.  With the re driver, no change except placing a 100BT setup with
no packet loss to a gigE setup (both linksys switches) will cause
serious packet loss at 20Mbps data rates.  I have discovered the only
way to get good performance with no packet loss was to

1) Remove interrupt moderation
2) defrag each mbuf that comes in to the driver.
   
   Sounds like you're bumping into a queue limit that is made worse by
   interrupting less frequently, resulting in bursts of packets that are
   relatively large, rather than a trickle of packets at a higher rate.
   Perhaps a limit on the number of outstanding descriptors in the driver or
   hardware and/or a limit in the netisr/ifqueue queue depth.  You might try
   changing the default IFQ_MAXLEN from 50 to 128 to increase the size of the
   ifnet and netisr queues.  You could also try setting net.isr.enable=1 to
   enable direct dispatch, which in the in-bound direction would reduce the
   number of context switches and queueing.  It sounds like the device driver
   has a limit of 256 receive and transmit descriptors, which one supposes is
   probably derived from the hardware limit, but I have no documentation on
   hand so can't confirm that.
  
  I've tried bumping IFQ_MAXLEN and it made no difference.  I could rerun
 
 And the default for if_re is RL_IFQ_MAXLEN which is already 512...  As
 is mentioned below, the card can do 64 segments (which usually means 32
 packets since each packet usually has a header + payload in separate
 mbufs)...

It sounds like you believe this is an if_re-only problem.  I had the
feeling that the if_em driver performance problems were related in some
way.  I noticed that if_em does not do anything with m_defrag and
thought it might be a little more than coincidence.

  this test to be 100% certain I suppose.  It was done a while back.  I
  haven't tried net.isr.enable=1, but packet loss is in the transmission
  direction.  The device driver has been modified to have 1024 transmit
  and receive descriptors each as that is the hardware limitation.  That
  didn't matter either.  With 1024 descriptors I still lost packets
  without the m_defrag.
 
 hmmm...  you know, I wonder if this is a problem with the if_re not
 pulling enough data from memory before starting the transmit...  Though
 we currently have it set for unlimited... so, that doesn't seem like it
 would be it..

Right.  Plus it now has 1024 descriptors on my machine and, like I said,
made little difference.

  The most difficult thing for me to understand is:  if this is some sort
  of resource limitation why will it work with a slower phy layer
  perfectly and not with the gigE?  The only thing I could think of was
  that the old driver was doing m_defrag calls when it filled the transmit
  descriptor queues up to a certain point.  Understanding the effects of
  m_defrag would be helpful in figuring this out I suppose.
 
 maybe the chip just can't keep the transmit fifo loaded at the higher
 speeds...  is it possible vls is doing a writev for multisegmented UDP
 packet?   I'll have to look at this again...

I suppose.  As I understand it, though, it should be sending out
1316-byte data packets at a metered pace.  Also, wouldn't it behave the
same for 100BT vs. gigE?  Shouldn't I see packet loss with 100BT if this
is the case?

   It would be interesting on the send and receive sides to inspect the
   counters for drops at various points in the network stack; i.e., are we
   dropping packets at the ifq handoff because we're overfilling the
   descriptors in the driver, are packets dropped on the inbound path going
   into the netisr due to over-filling before the netisr is scheduled, etc. 
   And, it's probably interesting to look at stats on filling the socket
   buffers for the same reason: if bursts of packets come up the stack, the
   socket buffers could well be being over-filled before the user thread can
   run.
  
  Yes, this would be very interesting and should point out the problem.  I
  would do such a thing if I had enough knowledge of the network pathways.
  Alas, I am very green in this area.  The receive side has no issues,
  though, so I would focus on transmit counters (with assistance).
 




Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-22 Thread Matthew Dillon
:Increasing the interrupt moderation frequency worked on the re driver,
:but it only made it marginally better.  Even without moderation,
:however, I could lose packets without m_defrag.  I suspect that there is
:something in the higher level layers that is causing the packet loss.  I
:have no explanation why m_defrag makes such a big difference for me, but
:it does.  I also have no idea why a 20Mbps UDP stream can lose data over
:gigE phy and not lose anything over 100BT... without the above mentioned
:changes that is.

It kinda sounds like the receiver's UDP buffer is not large enough to
handle the burst traffic.  100BT is a much slower transport and the
receiver (userland process) was likely able to drain its buffer before
new packets arrived.

Use netstat -s to observe the drop statistics for udp on both the
sender and receiver sides.  You may also be able to get some useful
information looking at the ip stats on both sides too.

Try bumping up net.inet.udp.recvspace and see if that helps.

In any case, you should be able to figure out where the drops are occurring
by observing netstat -s output.
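
(If anyone wants the same numbers programmatically, here is a small sketch
that reads the kernel's UDP statistics via the net.inet.udp.stats sysctl --
the same counters netstat -s -p udp prints.  The field names assume the
struct udpstat of that era; udps_fullsock is the "dropped because the
receiving socket buffer was full" counter that bumping
net.inet.udp.recvspace is meant to help.)

    #include <sys/types.h>
    #include <sys/queue.h>
    #include <sys/socket.h>
    #include <sys/sysctl.h>
    #include <netinet/in.h>
    #include <netinet/in_systm.h>
    #include <netinet/ip.h>
    #include <netinet/ip_var.h>
    #include <netinet/udp.h>
    #include <netinet/udp_var.h>
    #include <stdio.h>

    int
    main(void)
    {
            struct udpstat st;
            size_t len = sizeof(st);

            /* Same data netstat -s -p udp reads. */
            if (sysctlbyname("net.inet.udp.stats", &st, &len, NULL, 0) == -1) {
                    perror("sysctlbyname");
                    return (1);
            }
            printf("datagrams received:          %lu\n",
                (unsigned long)st.udps_ipackets);
            printf("dropped, socket buffer full: %lu\n",
                (unsigned long)st.udps_fullsock);
            return (0);
    }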

-Matt
Matthew Dillon 
[EMAIL PROTECTED]


Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-21 Thread Shunsuke SHINOMIYA

 Thank you, Matt.
 
 Very interesting, but the only reason you get lower results is simply
 because the TCP window is not big enough.  That's it.
 

 Yes, I knew that adjusting the TCP window size is important to make full
 use of a link.  However, I wanted to show that adjusting the parameters of
 Interrupt Moderation affects network performance.

 And I think packet loss occurred because Interrupt Moderation was enabled.
 The mechanism of the packet loss in this case is not clear, but I think an
 inappropriate TCP window size is not the only reason.

 I found that the TCP throughput improvement with Interrupt Moderation
 disabled is related to the congestion avoidance phase of TCP, because the
 standard deviations of the per-stream throughputs decrease when Interrupt
 Moderation is disabled.

 The following two results are outputs of `iperf -P 10', again without TCP
 window size adjustment.  I think the spread of the individual throughputs
 within the same measurement shows that congestion avoidance was at work.

o with default setting of Interrupt Moderation.
 [ ID] Interval   Transfer Bandwidth
 [ 13]  0.0-10.0 sec  80.1 MBytes  67.2 Mbits/sec
 [ 11]  0.0-10.0 sec   121 MBytes   102 Mbits/sec
 [ 12]  0.0-10.0 sec  98.9 MBytes  83.0 Mbits/sec
 [  4]  0.0-10.0 sec  91.8 MBytes  76.9 Mbits/sec
 [  7]  0.0-10.0 sec   127 MBytes   106 Mbits/sec
 [  5]  0.0-10.0 sec   106 MBytes  88.8 Mbits/sec
 [  6]  0.0-10.0 sec   113 MBytes  94.4 Mbits/sec
 [ 10]  0.0-10.0 sec   117 MBytes  98.2 Mbits/sec
 [  9]  0.0-10.0 sec   113 MBytes  95.0 Mbits/sec
 [  8]  0.0-10.0 sec  93.0 MBytes  78.0 Mbits/sec
 [SUM]  0.0-10.0 sec  1.04 GBytes   889 Mbits/sec

o with disabled Interrupt Moderation.
 [ ID] Interval   Transfer Bandwidth
 [  7]  0.0-10.0 sec   106 MBytes  88.9 Mbits/sec
 [ 10]  0.0-10.0 sec   107 MBytes  89.7 Mbits/sec
 [  8]  0.0-10.0 sec   107 MBytes  89.4 Mbits/sec
 [  9]  0.0-10.0 sec   107 MBytes  90.0 Mbits/sec
 [ 11]  0.0-10.0 sec   106 MBytes  89.2 Mbits/sec
 [ 12]  0.0-10.0 sec   104 MBytes  87.6 Mbits/sec
 [  4]  0.0-10.0 sec   106 MBytes  88.7 Mbits/sec
 [ 13]  0.0-10.0 sec   106 MBytes  88.9 Mbits/sec
 [  5]  0.0-10.0 sec   106 MBytes  88.9 Mbits/sec
 [  6]  0.0-10.0 sec   107 MBytes  89.9 Mbits/sec
 [SUM]  0.0-10.0 sec  1.04 GBytes   891 Mbits/sec


 But by decreasing the TCP window size, it could be avoided.
o with default setting of Interrupt Moderation and iperf -P 10 -w 28.3k
 [ ID] Interval   Transfer Bandwidth
 [ 12]  0.0-10.0 sec   111 MBytes  93.0 Mbits/sec
 [  4]  0.0-10.0 sec   106 MBytes  88.8 Mbits/sec
 [ 11]  0.0-10.0 sec   107 MBytes  89.9 Mbits/sec
 [  9]  0.0-10.0 sec   109 MBytes  91.6 Mbits/sec
 [  5]  0.0-10.0 sec   109 MBytes  91.5 Mbits/sec
 [ 13]  0.0-10.0 sec   108 MBytes  90.8 Mbits/sec
 [ 10]  0.0-10.0 sec   107 MBytes  89.7 Mbits/sec
 [  8]  0.0-10.0 sec   110 MBytes  92.3 Mbits/sec
 [  6]  0.0-10.0 sec   111 MBytes  93.2 Mbits/sec
 [  7]  0.0-10.0 sec   108 MBytes  90.6 Mbits/sec
 [SUM]  0.0-10.0 sec  1.06 GBytes   911 Mbits/sec


 Measuring TCP throughput was not an appropriate way to clearly show the
 effect of Interrupt Moderation.  It's my mistake.  TCP is too
 complicated. :)

-- 
Shunsuke SHINOMIYA [EMAIL PROTECTED]



Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-21 Thread Matthew Dillon

: Yes, I knew that adjusting the TCP window size is important to make full
: use of a link.  However, I wanted to show that adjusting the parameters of
: Interrupt Moderation affects network performance.
:
: And I think packet loss occurred because Interrupt Moderation was enabled.
: The mechanism of the packet loss in this case is not clear, but I think an
: inappropriate TCP window size is not the only reason.

Packet loss is not likely, at least not for the contrived tests we
are doing because GiGE links have hardware flow control (I'm fairly
sure).

One could calculate the worst case small-packet build-up in the receive
ring.  I'm not sure what the minimum pad for GiGE is, but let's say it's
64 bytes.  Then the packet rate would be around 1.9M pps or 244 packets
per interrupt at a moderation frequency of 8000 hz.  The ring is 256
packets.  But, don't forget the hardware flow control!  The switch
has some buffering too.
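
(Spelling out that estimate, assuming 64-byte frames and ignoring preamble
and inter-frame gap:

    10^9 bit/s / (64 * 8 bit/frame)  ~  1.95 * 10^6 packets/s
    1.95 * 10^6 pps / 8000 ints/s    ~  244 packets per interrupt

which is indeed just under the 256-entry ring.)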

hmm... me thinks I now understand why 8000 was chosen as the default :-)

I would say that this means packet loss due to the interrupt moderation
is highly unlikely, at least in theory, but if one were paranoid one
might want to use a higher moderation frequency, say 16000 hz, to be sure.

: I found that the TCP throughput improvement with Interrupt Moderation
: disabled is related to the congestion avoidance phase of TCP, because the
: standard deviations of the per-stream throughputs decrease when Interrupt
: Moderation is disabled.
:
: The following two results are outputs of `iperf -P 10', again without TCP
: window size adjustment.  I think the spread of the individual throughputs
: within the same measurement shows that congestion avoidance was at work.
:
:o with default setting of Interrupt Moderation.
: [ ID] Interval   Transfer Bandwidth
: [ 13]  0.0-10.0 sec  80.1 MBytes  67.2 Mbits/sec
: [ 11]  0.0-10.0 sec   121 MBytes   102 Mbits/sec
: [ 12]  0.0-10.0 sec  98.9 MBytes  83.0 Mbits/sec
: [  4]  0.0-10.0 sec  91.8 MBytes  76.9 Mbits/sec
: [  7]  0.0-10.0 sec   127 MBytes   106 Mbits/sec
: [  5]  0.0-10.0 sec   106 MBytes  88.8 Mbits/sec
: [  6]  0.0-10.0 sec   113 MBytes  94.4 Mbits/sec
: [ 10]  0.0-10.0 sec   117 MBytes  98.2 Mbits/sec
: [  9]  0.0-10.0 sec   113 MBytes  95.0 Mbits/sec
: [  8]  0.0-10.0 sec  93.0 MBytes  78.0 Mbits/sec
: [SUM]  0.0-10.0 sec  1.04 GBytes   889 Mbits/sec

Certainly overall send/response latency will be affected by up to 1/freq,
e.g. 1/8000 = 125 uS (x2 hosts == 250 uS worst case), which is readily
observable by running ping:

[intrate]
[set on both boxes]

max:    64 bytes from 216.240.41.62: icmp_seq=2 ttl=64 time=0.057 ms
10: 64 bytes from 216.240.41.62: icmp_seq=8 ttl=64 time=0.061 ms
3:  64 bytes from 216.240.41.62: icmp_seq=5 ttl=64 time=0.078 ms
8000:   64 bytes from 216.240.41.62: icmp_seq=3 ttl=64 time=0.176 ms
(large stddev too, e.g. 0.188, 0.166, etc).

But this is only relevant for applications that require that sort of
response time == not very many applications.  Note that a large packet
will turn the best case 57 uS round trip into a 140 uS round trip with
the EM card.

It might be interesting to see how interrupt moderation affects a
buildworld over NFS as that certainly results in a huge amount of
synchronous transactional traffic.

: Measuring TCP throughput was not an appropriate way to clearly show the
: effect of Interrupt Moderation.  It's my mistake.  TCP is too
: complicated. :)
:
:-- 
:Shunsuke SHINOMIYA [EMAIL PROTECTED]

It really just comes down to how sensitive a production system is to
round trip times within the range of effect of the moderation frequency.
Usually the answer is: not very.  That is, the benefit is not sufficient
to warrant the additional interrupt load that turning moderation off
would create.  And even if low latency is desired it is not actually
necessary to turn off moderation.  It could be set fairly high,
e.g. 2, to reap most of the benefit.

Processing overheads are also important.  If the network is loaded down
you will wind up eating a significant chunk of cpu with moderation turned
off.  This is readily observable by running vmstat during an iperf test.

An iperf test (iperf -w 63.5K on DragonFly) reported ~700 Mbits/sec for all
tested moderation frequencies.  I would be interested in knowing how
FreeBSD fares, though SMP might skew the reality too much to be
meaningful.

moderation  cpu
frequency   %idle

10          2% idle
3           7% idle
2           35% idle
1           60% idle
8000        66% idle

In other words, if you are doing more than just shoving bits around the
network, for example if you need to read or write the disk or do some
sort of computation or other activity that requires cpu, turning off
moderation could wind up being a very, very bad idea.

Re: Re[4]: serious networking (em) performance (ggate and NFS) problem

2004-11-21 Thread Sean McNeil
On Sun, 2004-11-21 at 20:42 -0800, Matthew Dillon wrote:
 : Yes, I knew that adjusting the TCP window size is important to make full
 : use of a link.  However, I wanted to show that adjusting the parameters of
 : Interrupt Moderation affects network performance.
 :
 : And I think packet loss occurred because Interrupt Moderation was enabled.
 : The mechanism of the packet loss in this case is not clear, but I think an
 : inappropriate TCP window size is not the only reason.
 
 Packet loss is not likely, at least not for the contrived tests we
 are doing because GiGE links have hardware flow control (I'm fairly
 sure).

I have to disagree.  Packet loss is likely according to some of my
tests.  With the re driver, no change except placing a 100BT setup with
no packet loss to a gigE setup (both linksys switches) will cause
serious packet loss at 20Mbps data rates.  I have discovered the only
way to get good performance with no packet loss was to

1) Remove interrupt moderation
2) defrag each mbuf that comes in to the driver.

Doing both of these, I get excellent performance without any packet
loss.  All my testing has been with UDP packets, however, and nothing
was checked for TCP.

 One could calculate the worst case small-packet build-up in the receive
 ring.  I'm not sure what the minimum pad for GiGE is, but let's say it's
 64 bytes.  Then the packet rate would be around 1.9M pps or 244 packets
 per interrupt at a moderation frequency of 8000 hz.  The ring is 256
 packets.  But, don't forget the hardware flow control!  The switch
 has some buffering too.
 
 hmm... me thinks I now understand why 8000 was chosen as the default :-)
 
 I would say that this means packet loss due to the interrupt moderation
 is highly unlikely, at least in theory, but if one were paranoid one
 might want to use a higher moderation frequency, say 16000 hz, to be sure.

Your calculations are based on the mbufs being a particular size, no?
What happens if they are seriously fragmented?  Is this what you mean
by small-packet?  Are you assuming the mbufs are as small as they get?
How small can they go?  1 byte? 1 MTU?
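
(For reference: the per-mbuf data capacity on FreeBSD of that era is
bounded by MLEN/MHLEN for a plain mbuf -- an MSIZE-byte mbuf, typically 256
bytes, minus its headers -- or by MCLBYTES (2048 bytes) when a cluster is
attached, although a single element of a chain can carry as little as one
byte.  A trivial program, assuming the userland-visible definitions in
<sys/mbuf.h>, prints the actual values for a given build:)

    #include <sys/param.h>
    #include <sys/types.h>
    #include <sys/mbuf.h>
    #include <stdio.h>

    int
    main(void)
    {
            printf("MSIZE    = %d\n", MSIZE);        /* raw mbuf size */
            printf("MLEN     = %d\n", (int)MLEN);    /* data bytes, no pkthdr */
            printf("MHLEN    = %d\n", (int)MHLEN);   /* data bytes, with pkthdr */
            printf("MCLBYTES = %d\n", MCLBYTES);     /* attached cluster size */
            return (0);
    }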

Increasing the interrupt moderation frequency worked on the re driver,
but it only made it marginally better.  Even without moderation,
however, I could lose packets without m_defrag.  I suspect that there is
something in the higher level layers that is causing the packet loss.  I
have no explanation why m_defrag makes such a big difference for me, but
it does.  I also have no idea why a 20Mbps UDP stream can lose data over
gigE phy and not lose anything over 100BT... without the above mentioned
changes that is.


