Re: http-reuse always, work quite well
Hi Brendan,

On Sat, Oct 22, 2016 at 11:39:51AM -0400, Brendan Kearney wrote:
> On 10/22/2016 02:08 AM, Willy Tarreau wrote:
> > You're welcome. Please note that the reuse mechanism is not perfect
> > and can still be improved. So do not hesitate to report any issue you
> > find, we definitely need real-world feedback like this. I cannot
> > promise that every issue will be fixed, but at least we need to
> > consider them and see what can be done.
> >
> > Cheers,
> > Willy
>
> I have http interception in place, using iptables/DNAT to redirect
> traffic to haproxy and load balance to 2 squid instances. I was using
> aggressive mode http-reuse and it seemed to provide a better streaming
> experience for roku/sling.

In fact it can be better for all use cases, since you save one round trip
on most requests by reusing an already established connection.

> After a period of time, the performance degraded and the experience was
> worse than the original state. Buffering, lag and pixelation are the
> symptoms. I did not try the always mode, and turned http-reuse off for
> the interception I am doing. The issue has cleared since.

That's a useful report. I don't see an obvious reason here. Hmmm,
thinking about it, there could be one in fact. Maybe you ended up with
too many connections on your squids ? But even in that situation,
aggressive would still ensure that some of the requests are properly
delivered. Another possibility is that some TIME_WAIT connections have
been accumulating on the haproxy machine in the direction of the squid
servers, because aggressive will still result in closing some connections
when they cannot be stolen. What OS are you using ? On Linux it's known
to close cleanly (SO_LINGER works well); I don't know about other OSes.
Or do you happen to have a firewall between the linux box and squid which
would sometimes catch the reset without squid ever getting it ? That's an
issue which sometimes affects http-server-close as well.
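The TIME_WAIT theory above can be checked on the haproxy machine with `ss`. A minimal sketch that parses a captured sample rather than live output; the addresses and the squid port 3128 are invented for illustration:

```shell
# Hypothetical capture of `ss -tan '( dport = :3128 )'`; in practice, pipe
# the live command output instead of this sample (3128 = assumed squid port).
cat > ss.out <<'EOF'
State      Recv-Q Send-Q Local Address:Port   Peer Address:Port
TIME-WAIT  0      0      192.0.2.1:40001      192.0.2.10:3128
TIME-WAIT  0      0      192.0.2.1:40002      192.0.2.10:3128
ESTAB      0      0      192.0.2.1:40003      192.0.2.10:3128
EOF

# Count sockets stuck in TIME-WAIT towards the squid servers; a steadily
# growing count supports the accumulation theory.
tw=$(awk '$1 == "TIME-WAIT"' ss.out | wc -l)
echo "$tw"
```

Run against live output, an accumulation shows up as this count climbing over time towards the ephemeral-port limit.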
> while interception and transparent proxying seem to be problematic,
> explicit proxying and internal http have both seen a marked improvement
> in performance.

I'm now thinking that transparent mode could indeed be an issue in this
case, because you can sometimes be forced to close an existing connection
from an ip:port source, and suddenly when the client speaks it has to
immediately reopen it. I'd say that in terms of logging it's a bit ugly
to reuse someone else's connection, because the requests that come to
your squid can pretend to be from client A while in fact client A was
only the first one to cause the connection to be established, and client
B is the one sending the next request.

Do you use "usesrc client" or "usesrc clientip" ? The former will force
the same ip:port to be reused, and it can indeed cause some trouble due
to the other side not always getting the close in time. With the latter,
the source port is dynamically allocated, so the ports should rotate.

> no scientific collection of data has been done, but page load times
> have noticeably improved. I may move from aggressive to always for
> these backends.

In my experience, Squid is quite good at keeping connections open, so
yes, it might work well without causing trouble to users.

Willy
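The two transparent-source variants Willy asks about are selected with the `source` keyword in the backend. A minimal sketch of Brendan's kind of setup, assuming TPROXY support in the kernel; server names, addresses and the squid port are invented:

```haproxy
backend squid_farm
    mode http
    http-reuse aggressive

    # "usesrc client" spoofs the client's full ip:port; connection reuse
    # or stealing can then collide with the peer's view of that 4-tuple.
    source 0.0.0.0 usesrc client

    # "usesrc clientip" keeps the client's IP but lets the system pick
    # the source port, so ports rotate and the collision risk goes away.
    # source 0.0.0.0 usesrc clientip

    server squid1 192.0.2.11:3128
    server squid2 192.0.2.12:3128
```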
Re: http-reuse always, work quite well
On 22/10/2016 08:08 AM, Willy Tarreau wrote:
> Hi Pavlos,
>
> On Fri, Oct 21, 2016 at 03:01:52PM +0200, Pavlos Parissis wrote:
>>> I'm not surprised that always works better, but my point is that if
>>> it's much better it can be useful to stay with it, but if it's only
>>> 1% better it's not worth it.
>>
>> It is way better :-), see Marcin's response.
>
> Ah sorry, I missed it. Indeed it looks much better, but we don't have
> the reference (no reuse) on this graph.

I will try to rerun the test tomorrow; it runs on production servers with
real user traffic :-)

> If no reuse shows 10 times higher average times, then it means "safe"
> reuse brings a 10-fold improvement and "always" a 20-fold one, so it's
> a matter of choice. However if safe does approximately the same as no
> reuse, for sure "always" is almost needed.
>
>>>>> while "always" is optimal, strictly speaking it's not very clean if
>>>>> the clients are not always willing to retry a failed first request,
>>>>> and browsers typically fall into that category. A real-world case
>>>>> can be a request dequeued to a connection that just closes.
>>>>
>>>> What is the response of HAProxy to clients in this case? HTTP 50N?
>>>
>>> No, the client-side connection will simply be aborted so that the
>>> client can decide whether to retry or not.
>>
>> Connection will be aborted by haproxy sending a TCP RST?
>
> As much as possible, yes. The principle is to let the client retry the
> request (since it is the only one knowing whether it's safe or not).
>
>>> I'd suggest a rule of thumb (maybe this should be added to the doc):
>>> watch your logs over a long period. If you don't see queue timeouts
>>> nor request timeouts, it's probably safe enough to use "always".
>>
>> Which field in the log do we need to watch? Tq?
>
> Tw (time spent waiting in the queue), Tc (time spent getting a
> connection), and of course the termination flags; everything with a C
> or Q in the second char needs to be analysed.
I looked at the logs over a period of 11 hours and found zero occurrences
of C or Q. I also didn't notice any change in the Tw and Tc timers. I
will keep an eye on it.

>>> Each time you notice one of them, there is a small risk of impacting
>>> another client. It's not rocket science but the risks depend on the
>>> same parameters.
>>
>> Thanks a lot for yet another reply rich in information.
>
> You're welcome. Please note that the reuse mechanism is not perfect and
> can still be improved. So do not hesitate to report any issue you find,
> we definitely need real-world feedback like this. I cannot promise that
> every issue will be fixed, but at least we need to consider them and
> see what can be done.

Acked, I will report any issues we may find.

Thanks,
Pavlos
Re: http-reuse always, work quite well
On 10/22/2016 02:08 AM, Willy Tarreau wrote:
> You're welcome. Please note that the reuse mechanism is not perfect and
> can still be improved. So do not hesitate to report any issue you find,
> we definitely need real-world feedback like this. I cannot promise that
> every issue will be fixed, but at least we need to consider them and
> see what can be done.
>
> Cheers,
> Willy

I have http interception in place, using iptables/DNAT to redirect
traffic to haproxy and load balance to 2 squid instances. I was using
aggressive mode http-reuse and it seemed to provide a better streaming
experience for roku/sling. After a period of time, the performance
degraded and the experience was worse than the original state.
Buffering, lag and pixelation are the symptoms. I did not try the always
mode, and turned http-reuse off for the interception I am doing. The
issue has cleared since.

While interception and transparent proxying seem to be problematic,
explicit proxying and internal http have both seen a marked improvement
in performance. No scientific collection of data has been done, but page
load times have noticeably improved. I may move from aggressive to
always for these backends.

Keep up the good work, and thanks for some really great software,

Brendan
Re: http-reuse always, work quite well
Hi Pavlos,

On Fri, Oct 21, 2016 at 03:01:52PM +0200, Pavlos Parissis wrote:
> > I'm not surprised that always works better, but my point is that if
> > it's much better it can be useful to stay with it, but if it's only
> > 1% better it's not worth it.
>
> It is way better :-), see Marcin's response.

Ah sorry, I missed it. Indeed it looks much better, but we don't have
the reference (no reuse) on this graph. If no reuse shows 10 times
higher average times, then it means "safe" reuse brings a 10-fold
improvement and "always" a 20-fold one, so it's a matter of choice.
However if safe does approximately the same as no reuse, for sure
"always" is almost needed.

> >>> while "always" is optimal, strictly speaking it's not very clean if
> >>> the clients are not always willing to retry a failed first request,
> >>> and browsers typically fall into that category. A real-world case
> >>> can be a request dequeued to a connection that just closes.
> >>
> >> What is the response of HAProxy to clients in this case? HTTP 50N?
> >
> > No, the client-side connection will simply be aborted so that the
> > client can decide whether to retry or not.
>
> Connection will be aborted by haproxy sending a TCP RST?

As much as possible, yes. The principle is to let the client retry the
request (since it is the only one knowing whether it's safe or not).

> > I'd suggest a rule of thumb (maybe this should be added to the doc):
> > watch your logs over a long period. If you don't see queue timeouts
> > nor request timeouts, it's probably safe enough to use "always".
>
> Which field in the log do we need to watch? Tq?

Tw (time spent waiting in the queue), Tc (time spent getting a
connection), and of course the termination flags; everything with a C or
Q in the second char needs to be analysed.

> > Each time you notice one of them, there is a small risk of impacting
> > another client. It's not rocket science but the risks depend on the
> > same parameters.
> Thanks a lot for yet another reply rich in information.

You're welcome. Please note that the reuse mechanism is not perfect and
can still be improved. So do not hesitate to report any issue you find,
we definitely need real-world feedback like this. I cannot promise that
every issue will be fixed, but at least we need to consider them and see
what can be done.

Cheers,
Willy
Re: http-reuse always, work quite well
On 21/10/2016 08:14 AM, Willy Tarreau wrote:
> Hi Pavlos,
>
> On Wed, Oct 19, 2016 at 08:28:34AM +0200, Pavlos Parissis wrote:
>>> That's really great, thanks for the feedback. Have you tried the
>>> other http-reuse options ?
>>
>> A workmate did the experimentation on http-reuse and I only know that
>> 'always' worked better for us. I will ask him to provide some details.
>
> I'm not surprised that always works better, but my point is that if
> it's much better it can be useful to stay with it, but if it's only 1%
> better it's not worth it.

It is way better :-), see Marcin's response.

>>> while "always" is optimal, strictly speaking it's not very clean if
>>> the clients are not always willing to retry a failed first request,
>>> and browsers typically fall into that category. A real-world case can
>>> be a request dequeued to a connection that just closes.
>>
>> What is the response of HAProxy to clients in this case? HTTP 50N?
>
> No, the client-side connection will simply be aborted so that the
> client can decide whether to retry or not.

Connection will be aborted by haproxy sending a TCP RST?

> Sometimes even the first request of the connection will benefit from a
> retry, but normally only subsequent requests are supposed to be
> retried.
>
>>> In theory in your case, "aggressive" should do the same as "always",
>>> though since you know your applications it will not improve anything.
>>> However if "safe" works well enough (even if it causes a few more
>>> connections), you should instead use it. If the gains are minimal,
>>> then you'll have to compare :-)
>>>
>>> Oh, last thing, "always" is generally fine when connecting to a
>>> static server or a cache because you don't break the end-user
>>> browsing session in the rare case an error happens.
>>
>> Fortunately, our applications don't suffer from this. Applications
>> don't store any session information locally.
>> If a browser sends 10 HTTP requests over 4 TCP connections, those HTTP
>> requests will always be delivered to different servers. Furthermore,
>> we use uWSGI, which also load balances requests across a set of
>> processes that do not share any information.
>
> OK, but what I meant is that if a request fails on the application
> side, it generally has some impact on the user's browsing session. A
> POST which returns an error, some automatic filling of a list not being
> performed, etc. In browsers nowadays it's hard to force a failed action
> to be retried, and reloading the page is not always an option. For a
> static server, if an image fails to load, that's just a minor issue and
> the user can always right-click on it, select "view image" and reload
> it.
>
> I'd suggest a rule of thumb (maybe this should be added to the doc):
> watch your logs over a long period. If you don't see queue timeouts nor
> request timeouts, it's probably safe enough to use "always".

Which field in the log do we need to watch? Tq?

> Each time you notice one of them, there is a small risk of impacting
> another client. It's not rocket science but the risks depend on the
> same parameters.

Thanks a lot for yet another reply rich in information.

Pavlos
Re: http-reuse always, work quite well
Hi Pavlos,

On Wed, Oct 19, 2016 at 08:28:34AM +0200, Pavlos Parissis wrote:
> > That's really great, thanks for the feedback. Have you tried the
> > other http-reuse options ?
>
> A workmate did the experimentation on http-reuse and I only know that
> 'always' worked better for us. I will ask him to provide some details.

I'm not surprised that always works better, but my point is that if it's
much better it can be useful to stay with it, but if it's only 1% better
it's not worth it.

> > while "always" is optimal, strictly speaking it's not very clean if
> > the clients are not always willing to retry a failed first request,
> > and browsers typically fall into that category. A real-world case can
> > be a request dequeued to a connection that just closes.
>
> What is the response of HAProxy to clients in this case? HTTP 50N?

No, the client-side connection will simply be aborted so that the client
can decide whether to retry or not. Sometimes even the first request of
the connection will benefit from a retry, but normally only subsequent
requests are supposed to be retried.

> > In theory in your case, "aggressive" should do the same as "always",
> > though since you know your applications it will not improve anything.
> > However if "safe" works well enough (even if it causes a few more
> > connections), you should instead use it. If the gains are minimal,
> > then you'll have to compare :-)
> >
> > Oh, last thing, "always" is generally fine when connecting to a
> > static server or a cache because you don't break the end-user
> > browsing session in the rare case an error happens.
>
> Fortunately, our applications don't suffer from this. Applications
> don't store any session information locally. If a browser sends 10 HTTP
> requests over 4 TCP connections, those HTTP requests will always be
> delivered to different servers. Furthermore, we use uWSGI, which also
> load balances requests across a set of processes that do not share any
> information.
OK, but what I meant is that if a request fails on the application side,
it generally has some impact on the user's browsing session. A POST
which returns an error, some automatic filling of a list not being
performed, etc. In browsers nowadays it's hard to force a failed action
to be retried, and reloading the page is not always an option. For a
static server, if an image fails to load, that's just a minor issue and
the user can always right-click on it, select "view image" and reload it.

I'd suggest a rule of thumb (maybe this should be added to the doc):
watch your logs over a long period. If you don't see queue timeouts nor
request timeouts, it's probably safe enough to use "always". Each time
you notice one of them, there is a small risk of impacting another
client. It's not rocket science but the risks depend on the same
parameters.

Cheers,
Willy
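This rule of thumb is easy to script. A sketch assuming the standard HTTP log format, where the termination state is the four-character field such as `sQ--`; the sample log lines below are fabricated:

```shell
# Fabricated haproxy HTTP log excerpts; point the script at the real log.
# The 4-char termination state ("----", "sQ--", "SC--", ...) encodes what
# ended the session; C or Q in the *second* character is what matters here.
cat > sample.log <<'EOF'
haproxy[123]: 192.0.2.7:1000 [21/Oct/2016:15:01:52.123] ft bk/s1 0/0/1/2/3 200 500 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
haproxy[123]: 192.0.2.7:1001 [21/Oct/2016:15:01:53.456] ft bk/s1 0/5000/-1/-1/5000 503 212 - - sQ-- 1/1/0/0/0 0/1 "GET / HTTP/1.1"
haproxy[123]: 192.0.2.7:1002 [21/Oct/2016:15:01:54.789] ft bk/s1 0/0/-1/-1/10 502 204 - - SC-- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
EOF

# Count lines whose termination-state field has C (server-side abort) or
# Q (session died while queued) as the second character.
hits=$(awk '{ for (i = 1; i <= NF; i++)
                if ($i ~ /^[A-Za-z-][CQ][A-Za-z-][A-Za-z-]$/) { print; next } }' \
          sample.log | wc -l)
echo "$hits"
```

A count that stays at zero over a long observation window is the "probably safe enough for always" signal described above.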
Re: http-reuse always, work quite well
On 15/10/2016 09:31 AM, Willy Tarreau wrote:
> Hi Pavlos,
>
> On Fri, Oct 14, 2016 at 04:33:20PM +0200, Pavlos Parissis wrote:
>> Hi,
>>
>> I just want to drop a note and mention that http-reuse works very well
>> for us:
>>
>> % ss -t state established '( sport = :http )' | wc -l
>> 2113
>>
>> % ss -t state established '( dport = :http )' | wc -l
>> 408
>>
>> As you can see, connections established to backend servers are far
>> fewer than the connections from clients. In the attached graph you can
>> see the 24-hour pattern doesn't influence the number of connections to
>> the backend that much.
>
> That's really great, thanks for the feedback. Have you tried the other
> http-reuse options ?

A workmate did the experimentation on http-reuse and I only know that
'always' worked better for us. I will ask him to provide some details.

> while "always" is optimal, strictly speaking it's not very clean if the
> clients are not always willing to retry a failed first request, and
> browsers typically fall into that category. A real-world case can be a
> request dequeued to a connection that just closes.

What is the response of HAProxy to clients in this case? HTTP 50N?

> Of course this very rarely happens, but "very rarely" is not something
> your end users will accept when they face it :-)

True, very much true.

> In theory in your case, "aggressive" should do the same as "always",
> though since you know your applications it will not improve anything.
> However if "safe" works well enough (even if it causes a few more
> connections), you should instead use it. If the gains are minimal, then
> you'll have to compare :-)
>
> Oh, last thing, "always" is generally fine when connecting to a static
> server or a cache because you don't break the end-user browsing session
> in the rare case an error happens.

Fortunately, our applications don't suffer from this. Applications don't
store any session information locally.
If a browser sends 10 HTTP requests over 4 TCP connections, those HTTP
requests will always be delivered to different servers. Furthermore, we
use uWSGI, which also load balances requests across a set of processes
that do not share any information.

Cheers,
Pavlos
Re: http-reuse always, work quite well
Hi Pavlos,

On Fri, Oct 14, 2016 at 04:33:20PM +0200, Pavlos Parissis wrote:
> Hi,
>
> I just want to drop a note and mention that http-reuse works very well
> for us:
>
> % ss -t state established '( sport = :http )' | wc -l
> 2113
>
> % ss -t state established '( dport = :http )' | wc -l
> 408
>
> As you can see, connections established to backend servers are far
> fewer than the connections from clients. In the attached graph you can
> see the 24-hour pattern doesn't influence the number of connections to
> the backend that much.

That's really great, thanks for the feedback. Have you tried the other
http-reuse options ? While "always" is optimal, strictly speaking it's
not very clean if the clients are not always willing to retry a failed
first request, and browsers typically fall into that category. A
real-world case can be a request dequeued to a connection that just
closes. Of course this very rarely happens, but "very rarely" is not
something your end users will accept when they face it :-)

In theory in your case, "aggressive" should do the same as "always",
though since you know your applications it will not improve anything.
However if "safe" works well enough (even if it causes a few more
connections), you should instead use it. If the gains are minimal, then
you'll have to compare :-)

Oh, last thing, "always" is generally fine when connecting to a static
server or a cache because you don't break the end-user browsing session
in the rare case an error happens.

Thanks for the feedback!
Willy
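For reference, the strategies compared in this thread are selected with a single backend keyword. A minimal sketch (server names and addresses are invented; the comments paraphrase the semantics discussed here, so check the http-reuse section of the configuration manual for the authoritative wording):

```haproxy
backend app
    mode http
    # Pick exactly one of the four strategies:
    # http-reuse never       # a dedicated server connection per client
    # http-reuse safe        # the first request of a connection gets a
    #                        # fresh server connection; only requests that
    #                        # are safe to retry may reuse an idle one
    # http-reuse aggressive  # also reuse connections that have already
    #                        # successfully carried more than one request
    http-reuse always        # reuse whenever possible; optimal, but a
                             # dequeued request can land on a connection
                             # that just closed, forcing a client retry
    server app1 192.0.2.21:8080 check
    server app2 192.0.2.22:8080 check
```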
Re: http-reuse always, work quite well
On 14/10/2016 08:49 PM, Aleksandar Lazic wrote:
> Hi,
>
> On 14-10-2016 16:33, Pavlos Parissis wrote:
>> Hi,
>>
>> I just want to drop a note and mention that http-reuse works very well
>> for us:
>>
>> % ss -t state established '( sport = :http )' | wc -l
>> 2113
>>
>> % ss -t state established '( dport = :http )' | wc -l
>> 408
>>
>> As you can see, connections established to backend servers are far
>> fewer than the connections from clients. In the attached graph you can
>> see the 24-hour pattern doesn't influence the number of connections to
>> the backend that much.
>
> Great ;-)
>
> Just out of curiosity, which great working version do you have in
> place?

We use HAPEE 1.6, which is HAProxy version 1.6.9 plus some patches from
1.7. You should expect the same behavior from HAProxy.

Cheers,
Pavlos
Re: http-reuse always, work quite well
Hi,

On 14-10-2016 16:33, Pavlos Parissis wrote:
> Hi,
>
> I just want to drop a note and mention that http-reuse works very well
> for us:
>
> % ss -t state established '( sport = :http )' | wc -l
> 2113
>
> % ss -t state established '( dport = :http )' | wc -l
> 408
>
> As you can see, connections established to backend servers are far
> fewer than the connections from clients. In the attached graph you can
> see the 24-hour pattern doesn't influence the number of connections to
> the backend that much.

Great ;-)

Just out of curiosity, which great working version do you have in place?

Thank you very much for your answer.

Best regards,
Aleks
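As a side note, the two counts Pavlos posted translate into a concrete reuse ratio; a quick sketch using his numbers:

```shell
# Established connections reported by Pavlos's `ss` commands:
clients=2113   # towards the frontend (sport = :http)
servers=408    # towards the backends (dport = :http)

# Roughly five client connections share each server connection thanks to
# http-reuse; computed times ten with integer arithmetic to avoid bc.
ratio_x10=$(( clients * 10 / servers ))
echo "$ratio_x10"   # value in tenths, i.e. 51 means a ~5.1:1 ratio
```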