Tom Preston-Werner <[email protected]> wrote:
> I'm doing some benchmarking on our new Rackspace frontend machines (8
> core, 16GB) and running into some problems with the Unix domain socket
> setup. At high request rates (on simple pages) I'm getting a lot of
> HTTP 502 errors from Nginx. Nothing shows up in the Unicorn error log,
> but Nginx has the following in its error log:

Hi Tom,

At what request rates were you running into this? Also, how large are
your responses? It could be the listen() backlog overflowing if Unicorn
isn't logging anything. Anything in the system/kernel logs (doubtful,
actually)?

Does increasing the listen :backlog parameter work? The default is 1024
(which is pretty high already); maybe try a higher number along with
the net.core.netdev_max_backlog sysctl.

Is there a large discrepancy between the times your benchmark client
logs, the request time nginx logs, and whatever Rails/Rack logs for
request times for any particular request?

If the Rails/Rack logging times all seem consistently low but your
nginx/benchmark has some weird spikes/outliers, then some requests are
stuck in the kernel listen backlog.

How much of the 8 cores are being used on those boxes when this starts
happening?

> 2009/09/17 19:36:52 [error] 28277#0: *524824 connect() to
> unix:/data/github/current/tmp/sockets/unicorn.sock failed (11:
> Resource temporarily unavailable) while connecting to upstream,
> client: 172.17.1.5, server: github.com, request: "GET /site/junk
> HTTP/1.1", upstream:
> "http://unix:/data/github/current/tmp/sockets/unicorn.sock:/site/junk",
> host: "github.com"

Raising proxy_connect_timeout in nginx may be a workaround; what is it
set to now? On the other hand, keeping it (and :backlog in Unicorn) low
would give better indications for failover to other hosts.

> This problem does not exist with the nginx -> haproxy -> unicorn
> setup. Thinking this might be a file descriptor problem, I upped the
> fd limit to 32768 with no luck. Then I tried upping net.core.somaxconn
> to 262144 which also had no effect. I thought I'd ask about the
> problem here to see if anyone knows a simple solution that I'm
> missing. Perhaps there is an Nginx configuration directive I need?
> Thanks. Unicorn rocks!

Definitely not a file descriptor problem (at least not inside Unicorn).

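To make the backlog theory concrete, here is a minimal Ruby sketch. It
is not Unicorn or nginx code, and the helper name probe_backlog is
made up for illustration. It listens on a Unix domain socket with a
deliberately small backlog, never calls accept, and counts how many
non-blocking connects get through before the kernel refuses one.
Assuming Linux, the refusal is typically Errno::EAGAIN, which is errno
11, "Resource temporarily unavailable", the same error nginx logged
above. Other kernels may report a different error or none at all.

```ruby
require "socket"
require "tmpdir"

# Hypothetical helper (not part of Unicorn or nginx): listen on a Unix
# socket with a small backlog, never accept, and count how many
# non-blocking connects succeed before the kernel refuses one.
def probe_backlog(backlog, max_attempts = 64)
  Dir.mktmpdir("backlog") do |dir|
    addr = Socket.sockaddr_un(File.join(dir, "probe.sock"))
    server = Socket.new(:UNIX, :STREAM)
    server.bind(addr)
    server.listen(backlog)

    clients = []
    error = nil
    begin
      max_attempts.times do
        client = Socket.new(:UNIX, :STREAM)
        clients << client
        # Raises (EAGAIN on Linux) once the listen queue is full.
        client.connect_nonblock(addr)
      end
    rescue SystemCallError => e
      error = e
    ensure
      clients.each(&:close)
      server.close
    end

    # The last client in the list is the one that was refused, if any.
    connected = error ? clients.size - 1 : clients.size
    { connected: connected, error: error }
  end
end

result = probe_backlog(4)
puts "connects before refusal: #{result[:connected]}"
puts "refusal: #{result[:error].class}" if result[:error]
```

If the count tracks the backlog you pass in, that matches the idea that
connections are being turned away at the listen queue rather than
anywhere inside Unicorn.
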
Also, I'm not sure there's a reason to keep haproxy between nginx and
Unicorn... Maybe haproxy in front of the entire cluster of servers.

Are you already hitting higher request rates (and more consistent times
logged by client/nginx) with:

  nginx -> unicorn/unix vs nginx -> unicorn/tcp(localhost) ?

Under extremely high loads, 502s may actually be wanted since it allows
failover to a less loaded box if there's uneven balancing; but we really
need to have numbers on the request rates.

-- 
Eric Wong
_______________________________________________
mongrel-unicorn mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
