Re: Re: Fwd: HAProxy hangs for one minute on config reload
Hi Willy,

Thanks, we will try this ASAP.

Regards
Terje

-----Original message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Tuesday, April 23, 2013 11:26
To: Borgen, Terje
Cc: 'haproxy@formilux.org'
Subject: Re: Re: Fwd: HAProxy hangs for one minute on config reload

Hi Terje,

On Tue, Apr 23, 2013 at 11:17:13AM +0200, Borgen, Terje wrote:
> Hi Willy.
> I am sorry it took so long. We got another problem that got my full
> attention for months. Firewall issues with database connections. This
> is now solved and today this HAProxy reload problem occurred again.
> We have checked the logs and it's not the first time after we added "option
> nolinger".
>
> You mentioned that using port 8080 could be part of the problem. Our
> other setup which runs on port 80 have never had this problem.
> Do You think changing the listen port to eg 85 might solve the problem?

Yes definitely, please try this.

Regards,
Willy
Re: Fwd: HAProxy hangs for one minute on config reload
Hi Willy.

I am sorry it took so long. Another problem, firewall issues with database connections, took my full attention for months. That is now solved, and today this HAProxy reload problem occurred again. We have checked the logs, and it is not the first time it has happened since we added "option nolinger".

You mentioned that using port 8080 could be part of the problem. Our other setup, which runs on port 80, has never had this problem. Do you think changing the listen port to e.g. 85 might solve the problem?

Best regards
Terje
Re: Fwd: HAProxy hangs for one minute on config reload
Hi Willy,

Nice to know that a fix is on its way. Looking forward to that. We are in the process of migrating from Windows/WebSphere and have another twenty-five Jetty apps that will run in this environment. With health checks from all these applications, the problem might become bigger than it is today.

I have put "option nolinger" in all the backends with backend checks in our test environment. This change will be merged into production on Monday, but it might take some time before we know for sure whether it has improved the situation. There is only one week left to make changes before Christmas, so I am not sure how many reloads there will be before next year.

Thanks for the great help so far. I will update you as soon as we get five or more successful reloads (or, worst case, a reload that hangs for one minute again).

Regards
Terje

-----Original message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: December 5, 2012 22:43
To: Borgen, Terje
Cc: haproxy@formilux.org
Subject: Re: Fwd: HAProxy hangs for one minute on config reload

Hi Terje,

On Wed, Dec 05, 2012 at 09:33:19AM +0100, Borgen, Terje wrote:
> Hi Willy,
> Thanks for Your quick response.
> I think You might be onto something here. We have a similar setup with
> haproxy using port 80 and have never experienced this problem in that
> environment.

OK.

> /proc/sys/net/ipv4/ip_local_port_range says 32768-61000, so nothing
> special here. We have another similar problem when restarting the
> Jetty-servers on the same server. We always get an error saying that
> the port is in use and we have to wait one minute before it can start
> again. The Jetty ports (as You can see in the config) are also outside
> the ip_local_port_range. But this might be another problem since it happens
> every restart.

Yes, typically a listening port bound without SO_REUSEADDR. Very common in fact.

> Some additional info:
> - We have two identical servers running apache http server, haproxy
> and jetty servers. Most of the traffic hits the main server, and the
> reload problem have never happened on the failover server. So this
> problem might be "traffic-related".
> - For one week we changed the inter-parameter on the clusters from
> default 2000 to 6 leaving rise/fall as default. In that period the
> problem never occurred.

OK, I see. The health checks are causing too many time-wait sockets. This issue was very recently fixed (in 1.5-dev14) as haproxy now closes health check sockets with a TCP reset, thus avoiding the TIME_WAIT. I'm pretty sure they're the ones causing the issue, as I experienced a similar one recently (which is why I fixed it :-)). I have not backported this yet as I wanted to keep an observation period.

However, you can try something: put "option nolinger" in your BACKENDS, not your frontends, otherwise some clients will experience truncated responses!!! All backend connections (including checks) will be closed by a reset and you should see far fewer TIME_WAIT sockets between haproxy and the servers.

Regards,
Willy
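
Editor's note: the "closed by a reset" behaviour described above corresponds, at the socket level, to closing a connection with SO_LINGER enabled and a zero timeout. The side that closes this way sends a TCP RST instead of a FIN, so it never enters TIME_WAIT. The following is not HAProxy code. It is a minimal Python sketch, assuming Linux (the two-int `struct linger` layout) and a working loopback interface; the function names `close_with_reset` and `demo_reset_on_loopback` are made up for this illustration.

```python
import socket
import struct


def close_with_reset(sock: socket.socket) -> None:
    """Close a TCP socket abortively (RST instead of FIN).

    Assumption: Linux layout of struct linger (two C ints).
    l_onoff=1 with l_linger=0 makes close() discard unsent data and
    reset the connection, so no TIME_WAIT state is left on this side.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo_reset_on_loopback() -> str:
    """Open a loopback connection, reset it, report what the peer observes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(2.0)  # never block forever in this demo
    close_with_reset(client)
    try:
        data = server_side.recv(1)
        # An empty read would mean an orderly FIN close, not a reset.
        return "orderly close" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"
    finally:
        server_side.close()
        listener.close()
```

On Linux, calling `demo_reset_on_loopback()` should report "reset". This also illustrates the warning about frontends: an abortive close throws away any data still queued for sending, which is how a client could end up with a truncated response.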
Re: Fwd: HAProxy hangs for one minute on config reload
Hi Willy,

Thanks for your quick response. I think you might be onto something here. We have a similar setup with haproxy using port 80 and have never experienced this problem in that environment.

/proc/sys/net/ipv4/ip_local_port_range says 32768-61000, so nothing special here. We have another, similar problem when restarting the Jetty servers on the same server: we always get an error saying that the port is in use, and we have to wait one minute before Jetty can start again. The Jetty ports (as you can see in the config) are also outside the ip_local_port_range. But this might be a different problem, since it happens on every restart.

Some additional info:
- We have two identical servers running Apache HTTP Server, haproxy and Jetty servers. Most of the traffic hits the main server, and the reload problem has never happened on the failover server. So this problem might be "traffic-related".
- For one week we changed the inter parameter on the clusters from the default 2000 to 6, leaving rise/fall as default. In that period the problem never occurred.

Regards,
Terje

-----Original message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: December 5, 2012 08:22
To: Borgen, Terje
Cc: haproxy@formilux.org
Subject: Re: Fwd: HAProxy hangs for one minute on config reload

Hi Terje,

On Tue, Dec 04, 2012 at 12:58:17PM +0100, Borgen, Terje wrote:
> Hi,
> We are using haproxy in an main/failover scenario in front of Jetty-servers.
> With approximately every fifth config-reload all the requests gets 503 error
> in the next minute.
> It seems very much like the issue described here:
> http://marc.info/?t=12114602629&r=1&w=2
>
> Some info:
> RHEL 5.3
> Haproxy 1.4.22
> The traffic is approximately 30 request/second.
> There is a Apache HTTP Server in front of Haproxy
>
> Attached is the configuration and haproxy-log for normal reload and reload
> with hang.
> Can You help me indentify what's wrong?

I'm having an idea right now. If you look below, the pause happens between the old and the new process:

Nov 30 11:00:13 alp-stb-004 haproxy[29722]: 127.0.0.1:48843 [30/Nov/2012:10:59:44.396] client skade_p/skade_p_main 0/0/0/29383/29387 302 408 - - --VN 0/0/0/0/0 0/0 "POST /skade/public/storage/submitRetrieveShoppingCart.action HTTP/1.1"
Nov 30 11:01:01 alp-stb-004 haproxy[32609]: 127.0.0.1:51371 [30/Nov/2012:11:01:01.365] client skade_p/skade_p_main 0/0/0/47/47 200 12828 - - --VN 0/0/0/0/0 0/0 "GET /skade/partner HTTP/1.1"

One thing that the new process does is to try to bind the listeners, and if any fails, it asks the old process to release the port so that it can try again. Since you're running on a non-privileged port (8088), what could be possible is that during the reload, when the port is temporarily released, it's used as a source port by the old process, preventing any of them from binding to it.

Could you please check /proc/sys/net/ipv4/ip_local_port_range? If it spans further than the default 32768-61000, please ensure the range does not cover 8088 and try again. If it solves the issue, maybe we should document that in the doc.

Regards,
Willy
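
Editor's note: the check requested above can be scripted. The following is a minimal Python sketch, assuming a Linux host that exposes the /proc file named in the message. The function names are hypothetical, and 8088 is simply the listen port mentioned in this thread. It only covers source ports the kernel picks automatically; a program that explicitly binds its outgoing socket can still use a port outside this range.

```python
from pathlib import Path
from typing import Tuple

# Path taken from the thread; Linux only.
PORT_RANGE_FILE = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the file content, e.g. "32768\t61000\n", into (low, high)."""
    low, high = text.split()
    return int(low), int(high)


def read_port_range(path: str = PORT_RANGE_FILE) -> Tuple[int, int]:
    """Read the current ephemeral (automatic source) port range."""
    return parse_port_range(Path(path).read_text())


def may_be_used_as_source_port(port: int, port_range: Tuple[int, int]) -> bool:
    """True if the kernel could auto-select `port` for an outgoing connection."""
    low, high = port_range
    return low <= port <= high
```

On the affected host one would call `may_be_used_as_source_port(8088, read_port_range())`. With the range 32768-61000 reported in this thread the result is False, which matches Terje's finding that the range was "nothing special".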
HAProxy hangs for one minute on config reload
Hi,

We are using haproxy in a main/failover scenario in front of Jetty servers. On approximately every fifth config reload, all requests get a 503 error for the next minute. It seems very much like the issue described here:
http://marc.info/?t=12114602629&r=1&w=2

Some info:
RHEL 5.3
Haproxy 1.4.22
The traffic is approximately 30 requests/second.
There is an Apache HTTP Server in front of Haproxy.

Attached are the configuration and the haproxy log for a normal reload and for a reload with a hang. Can you help me identify what's wrong?

Best regards
Terje Borgen