Hi Chris, On Thu, Jun 07, 2012 at 04:39:14PM -0500, Chris LeBlanc wrote: > Hey Willy, > > Thanks for taking the time to respond. > > It jumps from about 300-500ms to 3000-4000ms. There's a few other sites that > have the same variance on the same server. Typically 10x jumps in latency. > There's no rhyme or reason I can find. Here's a fancy pants graph. > > http://screencast.com/t/mcZ4wRQaBTt
Jumps of 3s usually mean that there is some packet loss on the server path. Here it looks like most of the time you have the normal response time with 3 extra seconds. For instance, this can happen if the request is lost after the SYN-SYN/ACK-ACK sequence. But it is also possible that this delay happens behind the server if it relies on a database or any other component. > Here's also an excerpt from my log from today, I filtered only request > > 1000ms. These are requests from the same graph above. I don't even see the > highest response times that are listed on that graph (one peaking above > 4000ms). > > Jun 7 01:07:28 localhost.localdomain haproxy[15517]: 50.58.2.137:50248 > [07/Jun/2012:01:07:27.446] http sld-web-3/sld-web-3 0/0/1/1170/1171 200 3961 > - - ---- 0/0/0/0/0 0/0 "GET / HTTP/1.1" > Jun 7 02:07:28 localhost.localdomain haproxy[15517]: 50.58.2.137:51699 > [07/Jun/2012:02:07:27.887] http sld-web-3/sld-web-3 0/0/1/1085/1086 200 4554 > - - ---- 1/1/1/1/0 0/0 "GET / HTTP/1.1" > Jun 7 07:07:32 localhost.localdomain haproxy[15517]: 50.58.2.137:58952 > [07/Jun/2012:07:07:31.296] http sld-web-3/sld-web-3 0/0/1/1035/1036 200 4554 > - - ---- 2/2/1/1/0 0/0 "GET / HTTP/1.1" > Jun 7 13:42:47 localhost.localdomain haproxy[15517]: 50.58.2.137:52084 > [07/Jun/2012:13:42:46.380] http sld-web-3/sld-web-3 43/0/2/1126/1171 200 3540 > - - ---- 9/9/1/1/0 0/0 "GET / HTTP/1.1" All I can see here is that the connection is always fast (1-2 ms) and the response is around 1s. I don't konw from this log if the response time is normalor not for the service. One point which looks strange is the 4th request above : the request took 43ms to come in. It could very well be a request from an ADSL client, but the 3 other requests from the same client were immediate (0ms). So this leads me to think that while it can entirely be the client's fault, there may also be some loss on the client->haproxy path with fast retransmit from the client. Or you may have a virtual machine somewhere in the path. I think that you could play with halog on your logs. Use "halog -pct < log", it will report percentile outputs for request, connect, response and data times. You can then graph them. If you see discontiguous values causing steps, it generally means you have something causing real trouble (typically if you notice that 10% or the responses are above 3s, or if you see that connect times are jumping to 3s, even just a few ones). Regards, Willy

