So, I'm looking to add support to nginx for running behind a transparent reverse proxy via UNIX sockets, and am hoping for some advice on how to do it with minimal impact on the rest of the server (both performance and code complexity).
For clarity: a transparent reverse proxy takes an incoming (TCP) connection and passes its file descriptor to an 'upstream' backend over a UNIX socket, via the sendmsg system call. The proxy decides which upstream server should receive the descriptor by calling recv on the connection with the MSG_PEEK flag and inspecting either the HTTP Host header or the TLS ClientHello (SNI) for the intended hostname. The upstream backend catches the file descriptor via recvmsg, and then proceeds as though it were listening directly on the TCP interface. (Both sides of this handoff are sketched below.)

Transparent proxies have a number of performance and convenience advantages over conventional HTTP proxying. On the performance front, once the reverse proxy passes the file descriptor to the upstream backend, it can immediately close its own copy. This accomplishes several things. First, there is no need to tell the upstream host anything about the connection (such as the peer address or port), since upstream can use the usual socket functions to obtain that information, or to set any socket options it needs. Second, there is no need to maintain a connection between the proxy and upstream, so the number of open file descriptors per connection is reduced by one. Third, there is no need to copy or buffer data between the TCP connection and the upstream server, which means large files can be sent directly via the sendfile system call.

On the complexity front, beyond the single indirection step of obtaining the file descriptor, the upstream backend is no more complex than if it were running directly on the TCP port. There is no need to read the peer address from the proxy. Socket options can be set directly on the file descriptor, rather than relayed through the proxy. Calling shutdown on the file descriptor shuts down the connection to the client machine immediately: no delay, and no chance for the proxy to fail to flush pending data or to close the connection. The reverse proxy itself is also greatly simplified, at less than 100 LoC, most of which just parses the TLS headers.

Anyway, after reading the development guide for nginx and poking through the source code, I see a couple of possible ways to implement this.

The simplest way is to accept only a single file descriptor per incoming connection on the UNIX socket, shutting down that connection once the descriptor has been received. This can be done inside ngx_event_accept, by retrieving the file descriptor, shutting down the accepted connection (c->fd), and replacing c->fd with the new descriptor. Unfortunately, this requires the reverse proxy to be reliably fast at passing the incoming file descriptor; still, it is trivially simple for testing. I think a better version would be a new ls->handler, which runs when the socket is ready for reading, and then runs the existing ls->handler (ngx_http_init_connection) once the file descriptor has been fetched and the original socket closed.

The more intrusive, but technically superior, way to implement running behind a transparent reverse proxy is to reuse the socket connection from the reverse proxy for an unlimited number of file descriptors. This would involve a new handler for the ngx_connection_t, which adds the connection to the set of connections watched by poll, kqueue, or similar. When the connection is ready for reading, recvmsg is used to fetch the file descriptor(s), which are then initialized like normal HTTP connections.
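To make the handoff concrete, here is a rough sketch of the proxy side in plain C. choose_backend() is a hypothetical stand-in for the hostname lookup, and a real proxy would keep peeking until it had enough bytes to parse the Host header or ClientHello:

    #include <stddef.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <unistd.h>

    /* hypothetical: map the peeked bytes (Host header / SNI) to a
     * connected UNIX-socket fd for the chosen upstream */
    static int choose_backend(const char *buf, size_t len);

    static int
    pass_to_upstream(int client_fd)
    {
        char             peek[1024];
        char             byte = 'F';
        char             cbuf[CMSG_SPACE(sizeof(int))];
        struct iovec     iov = { .iov_base = &byte, .iov_len = 1 };
        struct msghdr    msg = { 0 };
        struct cmsghdr  *cm;
        ssize_t          n;
        int              upstream;

        /* peek without consuming, so upstream sees the raw stream */
        n = recv(client_fd, peek, sizeof(peek), MSG_PEEK);
        if (n <= 0) {
            return -1;
        }

        upstream = choose_backend(peek, (size_t) n);
        if (upstream < 0) {
            return -1;
        }

        /* attach client_fd as SCM_RIGHTS ancillary data; one dummy
         * payload byte is included, since some systems won't
         * transmit ancillary data without any normal data */
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = cbuf;
        msg.msg_controllen = sizeof(cbuf);

        cm = CMSG_FIRSTHDR(&msg);
        cm->cmsg_level = SOL_SOCKET;
        cm->cmsg_type = SCM_RIGHTS;
        cm->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cm), &client_fd, sizeof(int));

        if (sendmsg(upstream, &msg, 0) < 0) {
            return -1;
        }

        /* the proxy is now entirely out of the data path */
        close(client_fd);
        return 0;
    }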
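The matching receive side, written at the plain-socket level rather than as real nginx event-module code, is roughly what the new handler would do when the UNIX socket becomes readable. In the persistent-connection variant this would be called in a loop until recvmsg reports EAGAIN:

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* returns the passed-in descriptor, or -1 on EOF/error */
    static int
    fetch_passed_fd(int unix_fd)
    {
        char             byte;
        char             cbuf[CMSG_SPACE(sizeof(int))];
        struct iovec     iov = { .iov_base = &byte, .iov_len = 1 };
        struct msghdr    msg = { 0 };
        struct cmsghdr  *cm;
        int              fd = -1;

        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = cbuf;
        msg.msg_controllen = sizeof(cbuf);

        if (recvmsg(unix_fd, &msg, 0) <= 0) {
            return -1;    /* EAGAIN, peer disconnect, or error */
        }

        for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm)) {
            if (cm->cmsg_level == SOL_SOCKET && cm->cmsg_type == SCM_RIGHTS) {
                memcpy(&fd, CMSG_DATA(cm), sizeof(int));
            }
        }

        /* fd can now be set up exactly as if it had come from
         * accept(): in the first method, swapped into c->fd before
         * ngx_http_init_connection runs; in the second, wrapped in
         * a fresh ngx_connection_t */
        return fd;
    }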
With that second approach, the UNIX socket connection would persist until either the peer disconnects or nginx shuts down.

I have implemented the first method as a proof of concept, but I have several questions before trying to implement the second.

First, should this use a new nginx listener, rather than simply a setting on the existing listener? I suspect the answer is 'yes', since I think it needs its own handler, which runs either before or instead of ngx_http_init_connection.

Second, is there any reason the UNIX socket connections can't be put in the same pool as the TCP connections?

Third, should this be implemented in its own file, similarly to how http/2 is separated out?

Fourth, what should the config file syntax be? In my proof-of-concept version, I just added a flag after the 'listen unix:path', but I could see an advantage to defining the expected file descriptors separately from the UNIX socket itself; a made-up example of both shapes is below.
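On that last question, here is roughly what I mean, with both directive spellings made up purely for illustration:

    # (a) a flag on the existing listen directive, as in my proof
    #     of concept ('fd_pass' is a placeholder name):
    server {
        listen unix:/var/run/nginx-fds.sock fd_pass;
        # ...
    }

    # (b) declaring what the passed descriptors represent
    #     separately from the socket they arrive on:
    server {
        listen unix:/var/run/nginx-fds.sock;
        fd_expect ssl;    # placeholder: treat received fds as TLS connections
    }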
