On 6/6/15, Florian Obser <[email protected]> wrote:
> On Sat, Jun 06, 2015 at 07:05:46PM +0000, Florian Obser wrote:
>> On Fri, Jun 05, 2015 at 05:15:09PM -0500, Matthew Martin wrote:
>> > >Synopsis: Serving large files with httpd eats fscktons of memory.
>> > >Category: system
>> > >Environment:
>> >    System      : OpenBSD 5.7
>> >    Details     : OpenBSD 5.7 (GENERIC) #0: Thu Apr 30 22:01:01 CEST 2015
>> >                    
>> > [email protected]:/binpatchng/work-binpatch57-amd64/src/sys/arch/amd64/compile/GENERIC
>> >
>> >    Architecture: OpenBSD.amd64
>> >    Machine     : amd64
>> > >Description:
>> > A couple of people concurrently downloading iso sized files can drag
>> > the server into using all of swap (there's only a gig) in a
>> > frighteningly short time.
>> >
>> > Revision 1.30 of
>> > http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.sbin/httpd/server_file.c
>> > had message "Adjust the read/write watermarks according to the TCP
>> > send buffer. This fixes sending of large files. Previously, httpd was
>> > reading the input file too quickly and could run out of memory when
>> > filling the input buffer."
>> >
>> > That pretty well describes the situation I'm seeing now with 5.7
>> > (-RELEASE + m:tier binpatches). It also happens with -CURRENT (as of
>> > Jun 5) httpd compiled and slapped in its place.
>> >
>> > >How-To-Repeat:
>> > Something along the lines of
>> > hostA# dd if=/dev/zero of=/var/www/htdocs/test_512.dat bs=1M count=512
>> > hostB$ ftp http://hostA/test_512.dat
>> >
>> > >Fix:
>> > I've attempted various settings of 'tcp socket buffer' and shotgunning
>> > around the lines modified in 1.30 to no avail. Trying to limit its
>> > consumption via login.conf just makes it die when it hits the limit.
>>
>> the problem is that we have to bufferevents.  clt->clt_bev which is
>> handleing the accept(2)ed socket and clt->clt_srvbev which is
>> handleing the fd of the served file.
>>
>> it goes like this:
>> clt_bev: Here I am, brain the size of a planet, and they ask me to serve
>>          a file.
>> clt_srvbev: Whee, I can read clt_sndbufsiz bytes from the file. Hello
>>             clt_bev, here, have some data.
>> clt_bev: Ok, I can't send it right now, but I have a buffer, I'll put
>>          the data at the end.
>> [last message repeated n times]
>> clt_bev: Phew, I can send some data to the client, hopfully clt_bev
>>          stops to send me data soon.
>>
>> and so on... In the end the whole iso ends up in clt_bev.
>>
>> I'm not sure yet about the 32 and 256 magic numbers, they seem to it
>> some kind of sweet spot on *my* system. Make them to small and
>> everything slows down, make them to large and you waste memory.
>>
>> Please try this:
>>
>
> this gets rid of a compiler warning:
>
> diff --git server.c server.c
> index ca67a47..8725e2c 100644
> --- server.c
> +++ server.c
> @@ -704,7 +704,7 @@ server_input(struct client *clt)
>
>       /* Adjust write watermark to the socket buffer output size */
>       bufferevent_setwatermark(clt->clt_bev, EV_WRITE,
> -         clt->clt_sndbufsiz, 0);
> +         32 * clt->clt_sndbufsiz, 0);
>       /* Read at most amount of data that fits in one fcgi record. */
>       bufferevent_setwatermark(clt->clt_bev, EV_READ, 0, FCGI_CONTENT_SIZE);
>
> @@ -729,6 +729,10 @@ server_write(struct bufferevent *bev, void *arg)
>               goto done;
>
>       bufferevent_enable(bev, EV_READ);
> +
> +     if (clt->clt_srvbev && !(clt->clt_srvbev->enabled & EV_READ))
> +             bufferevent_enable(clt->clt_srvbev, EV_READ);
> +
>       return;
>   done:
>       (*bev->errorcb)(bev, EVBUFFER_WRITE|EVBUFFER_EOF, bev->cbarg);
> @@ -769,6 +773,11 @@ server_read(struct bufferevent *bev, void *arg)
>               goto fail;
>       if (clt->clt_done)
>               goto done;
> +
> +     if (EVBUFFER_LENGTH(EVBUFFER_OUTPUT(clt->clt_bev)) > (size_t) 256 *
> +         clt->clt_sndbufsiz)
> +             bufferevent_disable(bev, EV_READ);
> +
>       return;
>   done:
>       (*bev->errorcb)(bev, EVBUFFER_READ|EVBUFFER_EOF, bev->cbarg);
>
>
> --
> I'm not entirely sure you are real.
>


Works great.
$ sudo pkg_delete apache-httpd # :)

Here's some meaningless benchmarks. Disregard grotesque CPU usage:
It's a virtual machine under virtualbox running on a commodity craptop
with a slow disk trying to read from one VM and write to another.  And
everything else because who cares about benchmarks. But as there
doesn't seem to be anything else too horrific I'll throw it on the
live site and maybe get some of the other recent changes stressed a
bit too. If I hit anything else I'll double check on an entirely
-CURRENT system.

#####
-RELEASE + binpatch httpd:
#####

### After booting, before doing anything: ###
load averages:  0.57,  0.24,  0.10    test.vm.local 12:53:35
35 processes: 34 idle, 1 on processor
1   CPUs:  1.1% user,  0.0% nice,  1.5% system,  0.1% interrupt, 97.3% idle
Memory: Real: 23M/73M act/tot Free: 165M Cache: 20M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
    1 root      10    0  516K  524K sleep     wait      0:01  0.00% init
20593 admin      2    0 1168K 2140K sleep     kqread    0:00  0.00% tmux
21635 admin      2    0 3664K 2576K sleep     select    0:00  0.00% sshd
23875 root       2    0 3768K 3260K idle      poll      0:00  0.00% sshd
20006 _syslogd   2    0 1100K 1416K sleep     kqread    0:00  0.00% syslogd
28916 root      18    0  660K  776K sleep     pause     0:00  0.00% ksh
16269 www        2    0 1152K 3316K sleep     kqread    0:00  0.00% httpd
16187 _pflogd    4    0  684K  368K sleep     bpf       0:00  0.00% pflogd
 3341 root       2    0 1284K 2344K sleep     kqread    0:00  0.00% httpd
29154 admin      2    0  876K 1856K idle      kqread    0:00  0.00% tmux
 5628 _smtpq     2    0 1576K 2188K idle      kqread    0:00  0.00% smtpd
24627 www        2    0 1160K 3284K sleep     kqread    0:00  0.00% httpd
  339 www        2    0 1156K 3268K sleep     kqread    0:00  0.00% httpd
28963 admin     18    0  632K  724K idle      pause     0:00  0.00% ksh
22747 _smtpd     2    0 1628K 2192K idle      kqread    0:00  0.00% smtpd
18664 root       2    0 1576K 2140K idle      kqread    0:00  0.00% smtpd
23311 root       3    0  400K 1156K idle      ttyin     0:00  0.00% getty
20789 admin     18    0  628K  740K idle      pause     0:00  0.00% ksh



### About 50% through transferring test_512.dat ###
load averages:  1.76,  0.68,  0.28    test.vm.local 12:56:17
35 processes: 1 running, 33 idle, 1 on processor
1   CPUs:  6.6% user,  0.0% nice,  8.6% system,  4.9% interrupt, 79.8% idle
Memory: Real: 116M/212M act/tot Free: 26M Cache: 21M Swap: 137M/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
16269 www       61    0  257M  114M run       -         0:51 67.29% httpd



### After completion. ###
load averages:  1.62,  0.96,  0.42    test.vm.local 12:57:30
35 processes: 1 running, 33 idle, 1 on processor
1   CPUs:  5.4% user,  0.0% nice, 13.1% system,  7.5% interrupt, 73.9% idle
Memory: Real: 2976K/48M act/tot Free: 190M Cache: 24M Swap: 17M/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
16269 www        2    0 1204K  784K idle      kqread    1:20 15.97% httpd



#####
-RELEASE + binpatch + -CURRENT (Jun 6th) httpd WITHOUT this patch:
#####



### Before transfer. ###
load averages:  0.52,  0.16,  0.06    test.vm.local 13:06:32
31 processes: 30 idle, 1 on processor
1   CPUs:  1.8% user,  0.0% nice,  3.0% system,  0.1% interrupt, 95.1% idle
Memory: Real: 20M/67M act/tot Free: 171M Cache: 18M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
    1 root      10    0  508K  508K idle      wait      0:01  0.00% init
15488 root       2    0 3768K 3256K sleep     poll      0:00  0.00% sshd
13043 _syslogd   2    0 1136K 1456K sleep     kqread    0:00  0.00% syslogd
27239 root       2    0 1280K 2312K idle      kqread    0:00  0.00% httpd
19145 www        2    0 1156K 3184K idle      kqread    0:00  0.00% httpd
12066 www        2    0 1160K 3152K idle      kqread    0:00  0.00% httpd
 9620 www        2    0 1152K 3132K idle      kqread    0:00  0.00% httpd
 5805 admin      2    0 3648K 2524K sleep     select    0:00  0.00% sshd
15055 admin     18    0  628K  732K sleep     pause     0:00  0.00% ksh
31667 _smtpq     2    0 1580K 2220K idle      kqread    0:00  0.00% smtpd
   12 root       3    0  404K 1152K idle      ttyin     0:00  0.00% getty
22421 _pflogd    4    0  684K  360K sleep     bpf       0:00  0.00% pflogd
18060 root       2    0 1580K 2144K idle      kqread    0:00  0.00% smtpd
27368 _smtpd     2    0 1512K 2080K idle      kqread    0:00  0.00% smtpd
 2771 root       3    0  396K 1132K idle      ttyin     0:00  0.00% getty
18925 _smtpd     2    0 1432K 2004K idle      kqread    0:00  0.00% smtpd
 1105 _smtpd     2    0 1568K 2728K idle      kqread    0:00  0.00% smtpd
30262 root       3    0  400K 1140K idle      ttyin     0:00  0.00% getty



### During transfer. ###
load averages:  1.28,  0.40,  0.15    test.vm.local 13:08:06
31 processes: 1 running, 29 idle, 1 on processor
1   CPUs:  8.5% user,  0.0% nice,  6.6% system,  4.6% interrupt, 80.3% idle
Memory: Real: 157M/238M act/tot Free: 508K Cache: 12M Swap: 64M/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
19145 www       36    0  385M  156M run       -         0:25 60.84% httpd



### During transfer. ###
load averages:  1.65,  0.56,  0.22    test.vm.local 13:08:33
31 processes: 1 running, 29 idle, 1 on processor
1   CPUs:  7.2% user,  0.0% nice, 14.3% system,  8.4% interrupt, 70.1% idle
Memory: Real: 115M/211M act/tot Free: 27M Cache: 19M Swap: 139M/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
19145 www       64    0  257M  114M run       -         0:44 65.53% httpd



### During transfer. ###
load averages:  1.70,  0.63,  0.25    test.vm.local 13:08:54
31 processes: 1 running, 29 idle, 1 on processor
1   CPUs:  6.5% user,  0.0% nice, 18.7% system, 10.4% interrupt, 64.4% idle
Memory: Real: 140M/225M act/tot Free: 13M Cache: 24M Swap: 210M/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
19145 www       57    0  257M  139M run       -         0:58 67.48% httpd



### After transfer. ###
load averages:  0.93,  0.67,  0.29    test.vm.local 13:10:03
31 processes: 30 idle, 1 on processor
1   CPUs:  4.9% user,  0.0% nice, 18.2% system, 10.6% interrupt, 66.3% idle
Memory: Real: 2332K/48M act/tot Free: 190M Cache: 24M Swap: 16M/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
19145 www        2    0 1244K  832K idle      kqread    1:13  6.64% httpd



#####
-RELEASE + binpatch + -CURRENT (Jun 6th) httpd WITH this patch.
#####



### Before transfer. ###
load averages:  0.40,  0.10,  0.03    test.vm.local 13:16:22
31 processes: 30 idle, 1 on processor
1   CPUs:  4.9% user,  0.0% nice,  8.6% system,  0.5% interrupt, 86.1% idle
Memory: Real: 20M/67M act/tot Free: 171M Cache: 18M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
30213 root       2    0 3772K 3244K sleep     poll      0:00  0.15% sshd
    1 root      10    0  520K  524K sleep     wait      0:01  0.00% init
 1916 _syslogd   2    0 1096K 1440K sleep     kqread    0:00  0.00% syslogd
21647 root       2    0 1284K 2296K sleep     kqread    0:00  0.00% httpd
23629 www        2    0 1156K 3148K sleep     kqread    0:00  0.00% httpd
  717 www        2    0 1164K 3204K sleep     kqread    0:00  0.00% httpd
19611 www        2    0 1156K 3152K sleep     kqread    0:00  0.00% httpd
 7000 admin      2    0 3652K 2520K sleep     select    0:00  0.00% sshd
20000 _smtpq     2    0 1572K 2184K sleep     kqread    0:00  0.00% smtpd
21661 root       2    0 1572K 2124K sleep     kqread    0:00  0.00% smtpd
21614 admin     18    0  632K  740K sleep     pause     0:00  0.00% ksh
18344 root       3    0  404K 1152K sleep     ttyin     0:00  0.00% getty
23896 root       3    0  408K 1156K sleep     ttyin     0:00  0.00% getty
25101 root       3    0  404K 1148K sleep     ttyin     0:00  0.00% getty
 8370 root       3    0  400K 1144K sleep     ttyin     0:00  0.00% getty
17674 _smtpd     2    0 1624K 2212K sleep     kqread    0:00  0.00% smtpd
21802 root       2    0  632K  616K sleep     netio     0:00  0.00% pflogd
28181 root       3    0  396K 1144K sleep     ttyin     0:00  0.00% getty



### During transfer ###
load averages:  0.37,  0.17,  0.07    test.vm.local 13:17:51
31 processes: 1 running, 29 idle, 1 on processor
1   CPUs:  1.0% user,  0.0% nice,  3.2% system,  1.8% interrupt, 94.1% idle
Memory: Real: 29M/81M act/tot Free: 157M Cache: 24M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
  717 www       41    0 9444K   11M run       -         0:03 11.18% httpd



### During transfer ###
load averages:  0.44,  0.19,  0.08    test.vm.local 13:18:09
31 processes: 1 running, 29 idle, 1 on processor
1   CPUs:  0.9% user,  0.0% nice,  5.6% system,  5.3% interrupt, 88.3% idle
Memory: Real: 29M/81M act/tot Free: 157M Cache: 24M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
  717 www       46    0 9444K   11M run       -         0:11 28.03% httpd



### During transfer ###
load averages:  0.70,  0.43,  0.20    test.vm.local 13:21:47
31 processes: 30 idle, 1 on processor
1   CPUs:  0.4% user,  0.0% nice,  8.8% system, 10.3% interrupt, 80.4% idle
Memory: Real: 29M/82M act/tot Free: 156M Cache: 24M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
  717 www       -5    0 9440K   11M sleep     getblk    0:39 23.97% httpd



### During transfer ###
load averages:  0.73,  0.46,  0.22    test.vm.local 13:22:14
31 processes: 30 idle, 1 on processor
1   CPUs:  0.5% user,  0.0% nice,  9.6% system, 11.4% interrupt, 78.5% idle
Memory: Real: 29M/82M act/tot Free: 156M Cache: 24M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
  717 www        2    0 9440K   11M sleep     kqread    0:50 34.33% httpd



### After transfer ###
load averages:  0.25,  0.38,  0.21    test.vm.local 13:23:40
31 processes: 30 idle, 1 on processor
1   CPUs:  0.4% user,  0.0% nice,  8.2% system,  9.9% interrupt, 81.5% idle
Memory: Real: 21M/74M act/tot Free: 164M Cache: 24M Swap: 0K/1024M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
  717 www        2    0 1248K 3348K idle      kqread    0:55  0.44% httpd

Reply via email to