Hi Chris,

On Thu, Jun 07, 2012 at 04:39:14PM -0500, Chris LeBlanc wrote:
> Hey Willy,
> 
> Thanks for taking the time to respond.
> 
> It jumps from about 300-500ms to 3000-4000ms. There's a few other sites that
> have the same variance on the same server. Typically 10x jumps in latency.
> There's no rhyme or reason I can find. Here's a fancy pants graph.
> 
> http://screencast.com/t/mcZ4wRQaBTt

Jumps of 3s usually mean that there is some packet loss on the server path.
Here it looks like most of the time you have the normal response time with
3 extra seconds.

For instance, this can happen if the request is lost after the SYN-SYN/ACK-ACK
sequence. But it is also possible that this delay happens behind the server if
it relies on a database or any other component.

> Here's also an excerpt from my log from today, I filtered only request >
> 1000ms. These are requests from the same graph above. I don't even see the
> highest response times that are listed on that graph (one peaking above
> 4000ms).
> 
> Jun  7 01:07:28 localhost.localdomain haproxy[15517]: 50.58.2.137:50248 
> [07/Jun/2012:01:07:27.446] http sld-web-3/sld-web-3 0/0/1/1170/1171 200 3961 
> - - ---- 0/0/0/0/0 0/0 "GET / HTTP/1.1"
> Jun  7 02:07:28 localhost.localdomain haproxy[15517]: 50.58.2.137:51699 
> [07/Jun/2012:02:07:27.887] http sld-web-3/sld-web-3 0/0/1/1085/1086 200 4554 
> - - ---- 1/1/1/1/0 0/0 "GET / HTTP/1.1"
> Jun  7 07:07:32 localhost.localdomain haproxy[15517]: 50.58.2.137:58952 
> [07/Jun/2012:07:07:31.296] http sld-web-3/sld-web-3 0/0/1/1035/1036 200 4554 
> - - ---- 2/2/1/1/0 0/0 "GET / HTTP/1.1"
> Jun  7 13:42:47 localhost.localdomain haproxy[15517]: 50.58.2.137:52084 
> [07/Jun/2012:13:42:46.380] http sld-web-3/sld-web-3 43/0/2/1126/1171 200 3540 
> - - ---- 9/9/1/1/0 0/0 "GET / HTTP/1.1"

All I can see here is that the connection is always fast (1-2 ms) and the
response is around 1s. I don't konw from this log if the response time is
normalor not for the service.

One point which looks strange is the 4th request above : the request took
43ms to come in. It could very well be a request from an ADSL client, but
the 3 other requests from the same client were immediate (0ms). So this
leads me to think that while it can entirely be the client's fault, there
may also be some loss on the client->haproxy path with fast retransmit from
the client. Or you may have a virtual machine somewhere in the path.

I think that you could play with halog on your logs. Use "halog -pct < log",
it will report percentile outputs for request, connect, response and data
times. You can then graph them. If you see discontiguous values causing
steps, it generally means you have something causing real trouble (typically
if you notice that 10% or the responses are above 3s, or if you see that
connect times are jumping to 3s, even just a few ones).

Regards,
Willy


Reply via email to