I've run into a strangeness with my TLS-based web server. It seems
that, for every incoming request, three file descriptors are used
(they all seem to be sockets), and they aren't immediately cleaned up.

With keep-alive disabled, they CAN be disposed of, but it takes an
explicit gc() call. With keep-alive active, they aren't garbage, so
they stick around even with a gc() call. (Which is probably correct,
but there might need to be a limit on how many get retained.)


constant listen_addr = "::", listen_port = 9876;

void http_handler(Protocols.HTTP.Server.Request req) {
    int before = sizeof(get_dir("/proc/self/fd"));
    int garbo = gc(); //Disabling this check results in FDs accumulating.
    int after = sizeof(get_dir("/proc/self/fd"));
    werror("Garbage %O, closed %d files, now %d open\n", garbo, before - after, after);
    //The "Connection: close" header is vital to the sockets becoming garbage.
    //Without it, they are retained pending a followup request, which doesn't
    //change the fundamental issue but does mean that a call to gc() doesn't
    //clean them up.
    req->response_and_finish((["data": "OK", "extra_heads": (["Connection": "close"])]));
    //req->response_and_finish((["data": "OK"]));
}

int main() {
    //If you don't have a cert, the first request is slower because a
    //self-signed one gets generated.
    //This ONLY happens with SSL connections. There are *three* file
    //descriptors wasted for every request.
    string cert = Stdio.read_file("certificate_local.pem");
    string key = Stdio.read_file("privkey_local.pem");
    array certs = cert && Standards.PEM.Messages(cert)->get_certificates();
    string pk = key && Standards.PEM.simple_decode(key);
    Protocols.HTTP.Server.SSLPort(http_handler, listen_port, listen_addr, pk, certs);
    return -1;
}


What are the three FDs used for? Presumably one of them is the actual
incoming socket, but the other two are less obvious.

This does not seem to happen with non-SSL ports, possibly related to the
simpler shutdown sequence for unencrypted sockets.

This isn't usually a major problem, as the GC does get run eventually,
but under certain kinds of workload it's easy to outpace it and run out
of FDs. To trigger this, remove the gc() call from the handler; any of
the following loops will then run until the process FD limit is hit:

// Pike:
while (1) Protocols.HTTP.get_url_data("https://YOUR_SERVER.EXAMPLE:9876/");

# Python:
import requests
while 1: requests.get("https://YOUR_SERVER.EXAMPLE:9876/")

# Bash:
while wget -qO/dev/null https://YOUR_SERVER.EXAMPLE:9876/; do true; done

// JavaScript in a web browser, if not blocked:
while (1) await fetch("https://YOUR_SERVER.EXAMPLE:9876/");

So I think it's probably not purely an issue with a client library misbehaving.

Is there a better way to handle this than simply forcing garbage
collection every request?
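
For context, the direction I'm currently leaning is to move the gc()
out of the request path onto a timer. Untested sketch; the ten-second
interval is an arbitrary guess:

//Run the GC periodically instead of once per request.
void periodic_gc() {
    gc();
    //Reschedule; 10s is arbitrary and may be too slow under heavy load.
    call_out(periodic_gc, 10);
}

//In main(), before returning -1:
//call_out(periodic_gc, 10);

That at least amortizes the cost, but it still feels like papering over
whatever is keeping the three FDs alive in the first place.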

ChrisA
