On Wednesday 15 July 2015 23:38:33 Jack Burton wrote: > In 5.7-stable & -current, httpd, when listening for TLS, does not close > the client socket when tls_accept_socket() returns any non-recoverable > error. The problem manifests most often when a client connects but does > not attempt TLS handshake. > > Steps to reproduce: > > * Configure httpd to listen for TLS requests > * From a remote host, open TCP session to httpd on port 443 > * Close the TCP connection without sending any data > * Wait at least 2 * MSL to avoid false positives > * fstat continues to show the open socket for as long as httpd runs > > This causes the number of sockets httpd has opne to keep growing until > it reaches (openfiles-cur - 7), at which point httpd stops responding > to HTTPS requests altogether. > > Real world occurrence: > > We've seen this fairly regularly on a host with only modest HTTPS load, > where httpd stops responding to HTTPS requests anywhere between 2 hours > and 4 days after httpd restart. > > See the thread "httpd stops accepting connections after a few hours on > current" on misc@ for further background and several independent reports > of the observed behaviour. > > Despite the title of that thread, the same behaviour is seen in > 5.7-stable. > > Sorry, I don't have any hosts running -current at the moment, but I've > written a trivial patch against 5.7-stable to treat that particular > failure mode in the same way as was already being done for EV_TIMEOUTs. > That fixes the issue for us here (been in place on one production host > with a modest [2req/sec avg] load for 4 hours with no obvious > regressions and no stale sockets -- previously we were getting at least > several stale sockets appearing every hour). The good folks on misc@ > suggested I should post my patch to tech@, so here it is: > > --- usr.sbin/httpd/server.c.orig Wed Jul 15 20:40:16 2015 > +++ usr.sbin/httpd/server.c Wed Jul 15 20:50:15 2015 > @@ -932,6 +932,7 @@ server_accept_tls(int fd, short event, void *arg) > struct client *clt = (struct client *)arg; > struct server *srv = (struct server *)clt->clt_srv; > int ret; > + char *errmsg; > > if (event == EV_TIMEOUT) { > server_close(clt, "TLS accept timeout"); > @@ -952,8 +953,13 @@ server_accept_tls(int fd, short event, void *arg) > server_accept_tls, &clt->clt_tv_start, > &srv->srv_conf.timeout, clt); > } else if (ret != 0) { > - log_warnx("%s: TLS accept failed - %s", __func__, > - tls_error(srv->srv_tls_ctx)); > + if (asprintf(&errmsg, "%s: TLS accept failed - %s", > + __func__, tls_error(srv->srv_tls_ctx)) < 0) { > + server_close(clt, "server_accept_tls: TLS accept failed"); > + } else { > + server_close(clt, errmsg); > + free(errmsg); > + } > return; > } > > This is the first time I've posted a patch for OpenBSD, so if I've erred > in style or method please correct me and I'll try to do better next > time.
Thanks for finding this and providing a diff. I've just committed a slightly different version, which simply adds the missing server_close() call - the version above will result in additional noise in the logs (both the function name and the full details from tls_error()), which we'd rather avoid.