Hello!

On Thu, Aug 19, 2021 at 12:28:59AM +1000, Robert Mueller wrote:

> > Could you please test if compiling with
> >
> > --with-cc-opt="-DNGX_HAVE_EPOLLEXCLUSIVE=0"
> >
> > improves things, notably on production systems?  In my limited
> > testing it seems to improve things, and if this is indeed the
> > case, we can consider removing use of EPOLLEXCLUSIVE.
>
> I can try this tomorrow, but did you see the link Jan posted to the
> cloudflare blog?
>
> https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/
>
> This explains the problem we're seeing exactly and why reuseport fixes it.

Yes, I've seen it.  It also suggests that EPOLLEXCLUSIVE might be
responsible for the balancing change you've observed with recent
kernels, something I've also suspected.

> > > As you can see, without the reuseport option, this causes severe
> > > scalability problems for us.
> >
> > I tend to think that reuseport is a bad option for load balancing
> > between worker processes, as it can be easily tricked by an outside
> > actor to select a particular worker process, and this opens an
> > obvious DoS attack vector.
>
> Really? Can you explain how this is possible?

Since reuseport uses a hash of the source address and port to
balance incoming connections between sockets, the client can choose
a source port so that the hash directs the connection to a
particular socket, that is, to a particular worker process.  This
in turn makes it possible to overload that worker process (which is
usually several times easier than overloading all worker
processes), degrading or completely denying service to clients who
happen to be balanced to the same worker process.

> Also given that cloudflare use this option, and I expect
> cloudflare are literally the largest users of nginx in the world
> and also have to deal with extreme adversarial environments
> given they run a service to protect against DDoS, I would expect
> they would be aware of any potential DoS vector in this regard,
> or if not aware, extremely interested in hearing about it!

I believe Cloudflare has enough resources and/or enough mitigations
in place not to care.

-- 
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel
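
For illustration only, the reuseport concern raised in this message can be
sketched with a toy model.  The hash below is a stand-in (SHA-256 over the
address/port tuple), not the kernel's actual function, and the names
pick_worker, ports_hitting and NUM_WORKERS, as well as the example
addresses, are hypothetical.  The sketch only shows the general idea: if
socket selection is a deterministic function of the source port, a client
that restricts itself to suitable source ports always lands on the same
worker.

```python
import hashlib
from collections import Counter

NUM_WORKERS = 4  # hypothetical number of worker processes / listening sockets


def pick_worker(src_ip, src_port, dst_ip="192.0.2.1", dst_port=80,
                workers=NUM_WORKERS):
    """Toy stand-in for hash-based socket selection (NOT the kernel's hash)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % workers


def ports_hitting(target, src_ip, port_range=range(1024, 65536)):
    """Source ports a single client could use to reach one chosen worker."""
    return [p for p in port_range if pick_worker(src_ip, p) == target]


# Ordinary client behaviour: source ports vary, so load spreads out.
normal = Counter(pick_worker("198.51.100.7", p) for p in range(40000, 41000))

# A client that only uses source ports which map to worker 0.
chosen = ports_hitting(0, "198.51.100.7")
skewed = Counter(pick_worker("198.51.100.7", p) for p in chosen[:1000])

print("normal:", dict(normal))
print("skewed:", dict(skewed))
```

In this model the "normal" counter is spread over all workers, while the
"skewed" counter contains a single worker.  How easily a real client can
find such ports against a real kernel is not something this sketch
demonstrates.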