Hello!

Over the last few months, we (a small team of developers including me and Jan Prachař, both from CDN77) developed a feature that we found missing in Nginx proxy caching. We are happy to share this feature with the community in the following patch series.

We serve a large number of files to an immense number of clients, and often multiple clients want the same file at the very same time, especially when it comes to streaming (when a file is generated on the upstream in real time and fetching it can take seconds).

Previously there were two options in Nginx when using proxy caching:

* pass all incoming requests to the origin
* use the proxy_cache_lock feature: pass only the first request (served in real time) and let the other requests wait until the first request completes

We didn't like either of these options (the first one effectively disables the CDN and the second one is unusable for streaming). We considered using Varnish, which solves this problem better, but we are very happy with the Nginx infrastructure we have.

Thus we came up with a third option. We developed the proxy_cache_tempfile mechanism, which acts similarly to proxy_cache_lock, but instead of blocking the other requests until the file is complete, we open the tempfile used by the primary request and periodically serve parts of it to the waiting requests.

Because there may be multiple tempfiles for the same file (for example when the file expires before it is fully downloaded), we use a per-cache shared memory zone with an `ngx_http_file_cache_tf_node_t` for each created tempfile to synchronize all workers. When a new request is passed to the origin, we record its tempfile number, and when another request is received, we try to open the tempfile with this number and serve from it. Once a secondary request has started reading from a tempfile, it sticks with that same tempfile until it is complete.

To accomplish this we rely on a POSIX filesystem feature: you can open a file and retain its file descriptor even when the file is moved to a new location (on the same filesystem). I'm afraid this would be hard to accomplish on Windows, so this feature will be non-Windows only.

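
To illustrate the POSIX behaviour this relies on, here is a minimal Python sketch (not code from the patch series). It simulates a primary request writing a tempfile, a secondary request reading from the same tempfile as it grows, and the tempfile being renamed to its final cache location while the secondary reader keeps its file descriptor. The function name `serve_from_tempfile` and both file names are hypothetical, and the polling delay is left out for brevity.

```python
import os
import shutil
import tempfile


def serve_from_tempfile(chunks):
    """Return the bytes a secondary reader sees while a primary writer
    fills a tempfile that is renamed before the reader has finished."""
    workdir = tempfile.mkdtemp()
    temp_path = os.path.join(workdir, "0000000001")    # hypothetical tempfile name
    final_path = os.path.join(workdir, "cache_entry")  # hypothetical cache file name
    received = bytearray()
    try:
        # Primary request: writes the upstream response into the tempfile.
        # Secondary request: opens the very same tempfile for reading.
        with open(temp_path, "wb", buffering=0) as writer, \
                open(temp_path, "rb", buffering=0) as reader:
            for chunk in chunks[:-1]:
                writer.write(chunk)
                # Poll: pick up whatever has been appended since the last read.
                received += reader.read()
            for chunk in chunks[-1:]:
                writer.write(chunk)
            # The finished tempfile is moved to its final place in the cache...
            os.rename(temp_path, final_path)
            # ...but the reader's descriptor still refers to the same file,
            # so the remaining data can be served from it (POSIX only).
            received += reader.read()
    finally:
        shutil.rmtree(workdir)
    return bytes(received)
```

Calling `serve_from_tempfile([b"part1-", b"part2-", b"part3"])` returns the concatenation of all three parts, even though the last part is read only after the rename. On Windows the rename of an open file would typically fail, which matches the limitation mentioned above.
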
We have tested this feature thoroughly over the last few months and already use it in part of our infrastructure without noticing any negative impact. We noticed only a very small increase in memory usage and a minimal increase in CPU and disk I/O usage (which corresponds with the increased throughput of the server).

We also ran some synthetic benchmarks comparing vanilla nginx and our patched version, with and without cache lock and with cache tempfiles. The benchmark results, charts, and the scripts we used are available on my GitHub: https://github.com/setnicka/nginx-tempfiles-benchmark

It should also work for the fastcgi, uwsgi, and scgi caches (as they use the same mechanism internally), but we didn't test these.

New config:

* proxy_cache_tempfile on; -- activate the whole tempfile logic
* proxy_cache_tempfile_timeout 5s; -- how long to wait for a tempfile before returning 504
* proxy_cache_tempfile_loop 50ms; -- loop interval for checking tempfiles

(and the same for fastcgi_cache, uwsgi_cache and scgi_cache)

New option for proxy_cache_path: tf_zone=name:size (defaults to the key zone name with a _tf suffix and 10M size). It creates a shared memory zone used to store the tempfile nodes.

We would be very grateful for any reviews and further testing.

Jiří Setnička
CDN77
_______________________________________________
nginx-devel mailing list
