On Fri, Sep 02, 2016 at 10:33:19AM +0200, Andreas Bartelt wrote:
> On 09/02/16 10:24, Alexander Bluhm wrote:
> > I see a performance drop to 10 Mbit/sec on some old i386 machines
> > with em(4). Can you try this kernel diff to see whether it is the
> > same problem?
> >
>
> yes, >50 MB/s now. Thanks!

So I think this happens:

sosend() now uses m_getuio() to allocate a MAXMCLBYTES mbuf cluster,
that is 65536 bytes.  sbreserve() calculates the upper limit for mbuf
storage, sb_mbmax, as min(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE).
In our case that is
min(1024*16 * 2, (256*1024) + ((256*1024) / (1 << 11)) * 256)
== min(32768, 294912) == 32768.

So after allocating a single mbuf cluster, which at 65536 bytes is
already larger than the 32768 byte limit, the sending socket buffer
has no space left.  As tcp_output() keeps the mbuf cluster for
retransmits, it will be freed only after all ACKs have been received.
That kills performance totally.
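
To make the arithmetic above easy to check, here is a small userland
sketch.  It is not kernel code: old_mbmax() and mbuf_space_left() are
hypothetical helpers, the constants are the values quoted in this mail,
and the sketch assumes each queued cluster is charged MSIZE plus the
cluster size against sb_mbmax.

```c
#include <assert.h>

/* Values as quoted in the mail; they may differ on other platforms. */
#define MSIZE		256UL		/* size of one mbuf */
#define MCLBYTES	(1UL << 11)	/* regular cluster, 2048 bytes */
#define MAXMCLBYTES	(64UL * 1024)	/* largest cluster, 65536 bytes */

static unsigned long
ulmin(unsigned long a, unsigned long b)
{
	return (a < b ? a : b);
}

/* Hypothetical helper: the pre-diff sb_mbmax formula from sbreserve(). */
static unsigned long
old_mbmax(unsigned long cc, unsigned long sb_max)
{
	/* Same precondition that sbreserve() checks before the formula. */
	assert(cc != 0 && cc <= sb_max);
	return (ulmin(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE));
}

/*
 * Hypothetical helper: mbuf space left after queueing n maximum-size
 * clusters, assuming each one is charged MSIZE + MAXMCLBYTES.
 * A result <= 0 means the buffer counts as full.
 */
static long
mbuf_space_left(unsigned long mbmax, unsigned long n)
{
	return ((long)mbmax - (long)(n * (MSIZE + MAXMCLBYTES)));
}
```

With cc = 16 * 1024 and sb_max = 256 * 1024, old_mbmax() gives the
32768 from above, and mbuf_space_left() for that limit with one
cluster queued is negative.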

To allow cycling through the mbufs periodically, I think we need
space for at least 3 of them.  Note that this diff also raises the
mbuf limit on the receiving side, but I think it does not matter
much as the data size is also limited.
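
As a rough illustration of the "at least 3" reasoning, the following
userland sketch compares how many maximum-size clusters can be queued
under a given limit.  new_mbmax() and clusters_until_full() are
hypothetical helpers, not kernel functions.  The model assumes a writer
may append another cluster as long as some mbuf space remains, and that
each cluster is charged MSIZE plus MAXMCLBYTES.

```c
#include <assert.h>

/* Values as quoted in the mail; they may differ on other platforms. */
#define MSIZE		256UL		/* size of one mbuf */
#define MCLBYTES	(1UL << 11)	/* regular cluster, 2048 bytes */
#define MAXMCLBYTES	(64UL * 1024)	/* largest cluster, 65536 bytes */

static unsigned long
ulmin(unsigned long a, unsigned long b)
{
	return (a < b ? a : b);
}

static unsigned long
ulmax(unsigned long a, unsigned long b)
{
	return (a > b ? a : b);
}

/* Hypothetical helper: the sb_mbmax formula with the diff applied. */
static unsigned long
new_mbmax(unsigned long cc, unsigned long sb_max)
{
	/* Same precondition that sbreserve() checks before the formula. */
	assert(cc != 0 && cc <= sb_max);
	return (ulmax(3 * MAXMCLBYTES,
	    ulmin(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE)));
}

/*
 * Hypothetical helper: count how many maximum-size clusters get queued
 * when another one is appended whenever the charged total is still
 * below mbmax (assumed accounting, see above).
 */
static unsigned int
clusters_until_full(unsigned long mbmax)
{
	unsigned long mbcnt = 0;
	unsigned int n = 0;

	while (mbcnt < mbmax) {
		mbcnt += MSIZE + MAXMCLBYTES;
		n++;
	}
	return (n);
}
```

Under these assumptions the old limit of 32768 is exhausted by the
first cluster, while new_mbmax(16 * 1024, 256 * 1024) is 196608 and
leaves room to queue three.  For large cc the min() term still wins,
so big buffers keep their previous limit.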

Andreas, can you revert the diff I sent previously and try this one
instead?

ok?

bluhm

Index: kern/uipc_socket2.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.64
diff -u -p -r1.64 uipc_socket2.c
--- kern/uipc_socket2.c 28 Jun 2016 14:47:00 -0000 1.64
+++ kern/uipc_socket2.c 2 Sep 2016 10:29:10 -0000
@@ -397,7 +397,8 @@ sbreserve(struct sockbuf *sb, u_long cc)
 	if (cc == 0 || cc > sb_max)
 		return (1);
 	sb->sb_hiwat = cc;
-	sb->sb_mbmax = min(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE);
+	sb->sb_mbmax = max(3 * MAXMCLBYTES,
+	    min(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE));
 	if (sb->sb_lowat > sb->sb_hiwat)
 		sb->sb_lowat = sb->sb_hiwat;
 	return (0);