On Fri, Mar 6, 2009 at 8:43 AM, Willy Tarreau wrote:
> Hi Michael,
>
> On Thu, Mar 05, 2009 at 01:04:06PM -0800, Michael Fortson wrote:
>> I'm trying to understand why our proxied requests have a much greater
>> chance of significant delay than non-proxied requests.
>>
>> The server is an 8-core (dual quad) Intel machine. Making requests
>> directly to the nginx backend is just far more reliable. Here's a
>> shell script output that just continuously requests a blank 0k image
>> file from nginx directly on its own port, and spits out a timestamp if
>> the delay isn't 0 or 1 seconds:
>>
>> Thu Mar 5 12:36:17 PST 2009
>> beginning continuous test of nginx port 8080
>> Thu Mar 5 12:38:06 PST 2009
>> Nginx Time is 2 seconds
>>
>>
>>
>> Here's the same test running through haproxy, simultaneously:
>>
>> Thu Mar 5 12:36:27 PST 2009
>> beginning continuous test of haproxy port 80
>> Thu Mar 5 12:39:39 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:39:48 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:39:55 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:40:03 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:40:45 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:40:48 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:40:55 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:40:58 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:41:55 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:42:01 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:42:08 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:42:29 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:42:38 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:43:05 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:43:15 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:08 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:25 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:30 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:33 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:39 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:46 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:44:54 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:45:07 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:45:16 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:45:45 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:45:54 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:45:58 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:46:05 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:46:08 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:46:32 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:46:48 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:46:53 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:46:58 PST 2009
>> Nginx Time is 3 seconds
>> Thu Mar 5 12:47:40 PST 2009
>> Nginx Time is 3 seconds
>
> 3 seconds is typically a TCP retransmit. You have network losses somewhere
> from/to your haproxy. Would you happen to be running on a gigabit port
> connected to a 100 Mbps switch? What type of NIC is this? I've seen
> many problems with Broadcom NetXtreme II (bnx2) caused by buggy firmware,
> but it seems to work fine for other people after a firmware upgrade.
>
>> My sanitized haproxy config is here (mongrel backend was omitted for
>> brevity):
>> http://pastie.org/408729
>>
>> Are the ACLs just too expensive?
>
> Not at all. Especially in your case. To reach 3 seconds of latency, you would
> need hundreds of thousands of ACLs, so this is clearly unrelated to your 
> config.
>
>> Nginx is running with 4 processes, and the box shows mostly idle.
>
> ... which indicates that you aren't burning CPU cycles processing ACLs ;-)
>
> It is also possible that some TCP settings are too low for your load, but
> I don't know what your load is. Above a few hundred to a few thousand
> sessions per second, you will need to do some tuning, otherwise you can
> end up with similar situations.
>
> Regards,
> Willy
>
>

Hmm. I think it is a gigabit port connected to a 100 Mbps switch (all
Dell rack-mount servers and switches). The nginx backend runs on the
same machine as haproxy and is referenced via 127.0.0.1 -- does that
still involve a real network port? Should I run the whole test on
localhost to isolate it from any network retransmits?
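
For a localhost-only run, a probe along the following lines could stand
in for the shell script. It is only a rough sketch: the names
timed_fetch and run_probe are made up for illustration, and the
2-second threshold approximates the original "not 0 or 1 seconds"
check, which used whole-second timestamps.

```python
import time
import urllib.request
from datetime import datetime


def timed_fetch(url, timeout=10.0):
    """Fetch url once and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.monotonic() - start


def run_probe(url, count, threshold=2.0, fetch=timed_fetch):
    """Request url `count` times; print a timestamp for each slow request.

    Returns the list of elapsed times that reached the threshold.
    `fetch` is injectable so the loop can be exercised without a server.
    """
    slow = []
    for _ in range(count):
        elapsed = fetch(url)
        if elapsed >= threshold:
            # Same idea as the shell script: timestamp plus the delay seen.
            print(f"{datetime.now():%c}  {url} took {elapsed:.1f} seconds")
            slow.append(elapsed)
    return slow
```

Pointing it at both listeners over loopback, for example
run_probe("http://127.0.0.1:8080/blank.gif", 1000) and the same path on
port 80 (the file name here is a placeholder), would show whether the
3-second delays still appear when the external switch is out of the
path.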

Here's a peek at the stats page after about a day of running (this
should help show our current load):
http://pastie.org/409632
