> 2015-06-26 9:20 GMT+03:00 Claudio Jeker <cje...@diehard.n-r-g.com>:
>> On Fri, Jun 26, 2015 at 04:59:32AM +0300, Sergey Ryazanov wrote:
>>> Hello,
>>>
>>> While building an L2TP tunnel with xl2tpd-1.3.1 I ran into very low
>>> upload performance. On download, the speed is 20 Mbit/s at nearly
>>> 100% CPU utilization. The CPU is a Pentium D 930 at 3 GHz. On upload,
>>> the speed is below 2 Mbit/s at nearly zero CPU utilization.
>>>
>>> First, I examined the xl2tpd code and did not find any potential
>>> issues. Then I compiled it with the -pg option and did a quick test
>>> with iperf(1): 4 TCP flows, directed toward the L2TP server, 2 min
>>> test. Then I ran gprof and got pretty strange output:
>>>
> [skip]
>>>
>>> During upload tests, it looks as if xl2tpd doesn't perform any work
>>> and gets stuck somewhere in an I/O operation.
>>>
>>> Maybe there are some options that could be tuned to speed up ppp(4)
>>> I/O performance, or did I miss something during my tests? I am in
>>> doubt. Any clues?
>>>
>>
>> Can you get a ktrace output to figure out what write is doing?
>> Could it be that it busy loops with EINTR or EAGAIN?
>> It sure smells like something is going on there.
>>
>
> I made the trace, which shows that write(2) works fine; there are no
> errors:
> # kdump -f ktrace.out-0-tx | grep 'RET write' | wc -l
> 23999
> # kdump -f ktrace.out-0-tx | grep 'RET write.*errno' | wc -l
> 0
>
> That was the bad news. Let's talk about something good. I finally found
> a way to speed up the upload. I got 91 Mbit/s, as reported by
> speedtest.net, over a 100 Mbit Ethernet link (at 100% CPU utilization,
> with the patched non-SMP kernel).
>
> It looks like the issue is caused by the pty output buffer being too
> small, and by watermarks that are too low; the watermarks control how
> the pty buffer is filled. When the pty driver requests the tty
> allocation, it passes 0 as the baud rate. For any rate less than or
> equal to 115200, the tty driver allocates a 1024-byte output buffer.
> Most likely, the hardcoded watermarks in the ppp line discipline code
> were chosen to match this buffer size. Maybe these values were
> reasonable for 56k modems, but not for a 100 Mbit uplink.
>
> A patch for testing is inlined below. All numbers are arbitrarily
> selected values. I just took the first reasonable values and got a
> positive result, without any further experiments.
>
> This patch is not suitable for merging, since it is just a quick and
> dirty fix. To solve the issue in a more generic way I see several
> approaches, each of which has pros and cons:
> (a) increase the default value (as in this patch);
> (b) provide some API (ioctl) to control the buffer size from pppd(8);
> (c) add a heuristic that detects ptys used for high-speed links and
>     increases their buffer.
>
> Any thoughts?
>
> P.S. If I can get 91 Mbit/s of upload rate, then why do I get only
> 20 Mbit/s of download rate on the same machine?
>
> Index: kern/tty_pty.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/tty_pty.c,v
> retrieving revision 1.70
> diff -u -p -r1.70 tty_pty.c
> --- kern/tty_pty.c	10 Feb 2015 21:56:10 -0000	1.70
> +++ kern/tty_pty.c	28 Jun 2015 14:18:16 -0000
> @@ -58,6 +58,7 @@
>  #include <sys/rwlock.h>
>
>  #define BUFSIZ 100		/* Chunk size iomoved to/from user */
> +#define PTY_DEF_BAUD 1000000
>
>  /*
>   * pts == /dev/tty[p-zP-T][0-9a-zA-Z]
> @@ -192,7 +193,7 @@ check_pty(int minor)
>  	if (!pt_softc[minor]) {
>  		pti = malloc(sizeof(struct pt_softc), M_DEVBUF,
>  		    M_WAITOK|M_ZERO);
> -		pti->pt_tty = ttymalloc(0);
> +		pti->pt_tty = ttymalloc(PTY_DEF_BAUD);
>  		ptydevname(minor, pti);
>  		pt_softc[minor] = pti;
>  	}
> @@ -235,7 +236,7 @@ ptsopen(dev_t dev, int flag, int devtype
>
>  	pti = pt_softc[minor(dev)];
>  	if (!pti->pt_tty) {
> -		tp = pti->pt_tty = ttymalloc(0);
> +		tp = pti->pt_tty = ttymalloc(PTY_DEF_BAUD);
>  	} else
>  		tp = pti->pt_tty;
>  	if ((tp->t_state & TS_ISOPEN) == 0) {
> @@ -413,7 +414,7 @@ ptcopen(dev_t dev, int flag, int devtype
>
>  	pti = pt_softc[minor(dev)];
>  	if (!pti->pt_tty) {
> -		tp = pti->pt_tty = ttymalloc(0);
> +		tp = pti->pt_tty = ttymalloc(PTY_DEF_BAUD);
>  	} else
>  		tp = pti->pt_tty;
>  	if (tp->t_oproc)
> Index: net/ppp_tty.c
> ===================================================================
> RCS file: /cvs/src/sys/net/ppp_tty.c,v
> retrieving revision 1.33
> diff -u -p -r1.33 ppp_tty.c
> --- net/ppp_tty.c	3 Jun 2015 00:50:09 -0000	1.33
> +++ net/ppp_tty.c	28 Jun 2015 14:18:16 -0000
> @@ -163,8 +163,8 @@ struct pool ppp_pkts;
>  /* This is a NetBSD-1.0 or later kernel. */
>  #define CCOUNT(q)	((q)->c_cc)
>
> -#define PPP_LOWAT	100	/* Process more output when < LOWAT on queue */
> -#define PPP_HIWAT	400	/* Don't start a new packet if HIWAT on queue */
> +#define PPP_LOWAT	1024	/* Process more output when < LOWAT on queue */
> +#define PPP_HIWAT	4096	/* Don't start a new packet if HIWAT on queue */
>
>  /*
>   * Line specific open routine for async tty devices.
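
To make the watermark argument above a bit more concrete, here is a toy
userland model (not kernel code, and not how ppp_tty.c is actually
structured). It only counts how often the producer has to be woken up to
push a given number of bytes through a queue governed by a low and a high
watermark. The function name refill_wakeups() and the 1500-byte packet
size are my own assumptions for illustration; the model also ignores the
fixed size of the tty output buffer itself.

```c
#include <assert.h>

/*
 * Toy model (hypothetical, for illustration only): count how many times
 * a producer must be woken to queue `total` bytes in `pkt`-sized packets.
 *
 * Rules, mirroring the comments on PPP_LOWAT/PPP_HIWAT:
 *  - once woken, the producer starts new packets while fewer than
 *    `hiwat` bytes are queued;
 *  - the producer is woken again only after the consumer has drained
 *    the queue below `lowat`.
 */
static unsigned long
refill_wakeups(unsigned long total, unsigned long pkt,
    unsigned long lowat, unsigned long hiwat)
{
	unsigned long queued = 0, sent = 0, wakeups = 0;

	assert(pkt > 0 && hiwat > 0);	/* otherwise no progress is possible */

	while (sent < total) {
		wakeups++;
		/* Producer: start packets while below the high watermark. */
		while (queued < hiwat && sent < total) {
			unsigned long n = total - sent < pkt ? total - sent : pkt;

			queued += n;
			sent += n;
		}
		/* Consumer: drain until just below the low watermark. */
		if (queued >= lowat)
			queued = lowat ? lowat - 1 : 0;
	}
	return wakeups;
}
```

In this model, pushing 1500000 bytes in 1500-byte packets takes 1000
wakeups with the old 100/400 watermarks (one packet per wakeup) and 334
with the 1024/4096 values from the patch (three packets per wakeup). It
says nothing about real throughput; it only shows why higher watermarks
mean fewer stalls per byte.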

Ping.

--
Sergey