Re: Re: Fwd: HAProxy hangs for one minute on config reload

2013-04-23 Thread Borgen, Terje
Hi Willy,
Thanks, we will try this ASAP.

Regards
Terje

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Tuesday, April 23, 2013 11:26
To: Borgen, Terje
Cc: 'haproxy@formilux.org'
Subject: Re: Re: Fwd: HAProxy hangs for one minute on config reload

Hi Terje,

On Tue, Apr 23, 2013 at 11:17:13AM +0200, Borgen, Terje wrote:
> Hi Willy.
> I am sorry it took so long. Another problem, firewall issues with
> database connections, had my full attention for months. That is now
> solved, and today this HAProxy reload problem occurred again.
> We have checked the logs, and it is not the first time it has happened
> since we added "option nolinger".
> 
> You mentioned that using port 8080 could be part of the problem. Our 
> other setup, which runs on port 80, has never had this problem.
> Do you think changing the listen port to e.g. 85 might solve the problem?

Yes definitely, please try this.

Regards,
Willy




Re: Fwd: HAProxy hangs for one minute on config reload

2013-04-23 Thread Borgen, Terje
Hi Willy.
I am sorry it took so long. Another problem, firewall issues with database 
connections, had my full attention for months. That is now solved, and today 
this HAProxy reload problem occurred again.
We have checked the logs, and it is not the first time it has happened since we 
added "option nolinger".

You mentioned that using port 8080 could be part of the problem. Our other 
setup, which runs on port 80, has never had this problem.
Do you think changing the listen port to e.g. 85 might solve the problem?

Best regards
Terje




Re: Fwd: HAProxy hangs for one minute on config reload

2012-12-06 Thread Borgen, Terje
Hi Willy,
Nice to know that a fix is on its way. Looking forward to that. We are in the 
process of migrating from Windows/WebSphere and have another twenty-five 
Jetty apps that will run in this environment. With health checks for all these 
applications, the problem might become bigger than it is today. 

I have put "option nolinger" in all the backends that have health checks in our 
test environment. This change will be merged into production on Monday, but it 
might take some time before we know for sure whether it has improved the 
situation. There is only one week left to make changes before Christmas, so I 
am not sure how many reloads there will be before next year.

Thanks for the great help so far. I will update you as soon as we get five or 
more successful reloads (or, worst case, a reload that hangs for one minute 
again).

Regards
Terje

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: December 5, 2012 22:43
To: Borgen, Terje
Cc: haproxy@formilux.org
Subject: Re: Fwd: HAProxy hangs for one minute on config reload

Hi Terje,

On Wed, Dec 05, 2012 at 09:33:19AM +0100, Borgen, Terje wrote:
> Hi Willy,
> Thanks for your quick response.
> I think you might be onto something here. We have a similar setup with 
> haproxy using port 80 and have never experienced this problem in that 
> environment.

OK.

> /proc/sys/net/ipv4/ip_local_port_range says 32768-61000, so nothing 
> special here. We have another, similar problem when restarting the 
> Jetty servers on the same server. We always get an error saying that 
> the port is in use, and we have to wait one minute before Jetty can start 
> again. The Jetty ports (as you can see in the config) are also outside 
> the ip_local_port_range. But this might be a different problem, since it 
> happens on every restart.

Yes, typically a listening port bound without SO_REUSEADDR. Very common in fact.
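
As an illustration of the socket option mentioned above, here is a minimal sketch in Python (not Jetty's Java) of a listener that sets SO_REUSEADDR before binding. The helper name make_listener and the use of port 0 (let the kernel pick a free port) are assumptions made for the example and do not come from the setup discussed in this thread.

```python
import socket

def make_listener(host: str, port: int, reuse_addr: bool = True) -> socket.socket:
    """Create a listening TCP socket, optionally setting SO_REUSEADDR first."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_addr:
        # Must be set before bind(): it allows the bind to succeed even while
        # old connections on the same port are still in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock

if __name__ == "__main__":
    listener = make_listener("127.0.0.1", 0)  # port 0: kernel picks a free port
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

Whether a given Jetty version sets this option, or exposes a setting for it, would need to be checked in its own documentation.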

> Some additional info:
> - We have two identical servers running Apache HTTP Server, haproxy 
> and Jetty servers. Most of the traffic hits the main server, and the 
> reload problem has never happened on the failover server. So this 
> problem might be "traffic-related".
> - For one week we changed the inter parameter on the clusters from the 
> default 2000 to 6, leaving rise/fall at their defaults. In that period 
> the problem never occurred.

OK, I see. The health checks are causing too many TIME_WAIT sockets.
This issue was fixed very recently (in 1.5-dev14): haproxy now closes health 
check sockets with a TCP reset, thus avoiding TIME_WAIT. I'm pretty sure 
they're the ones causing the issue, as I experienced a similar one recently 
(which is why I fixed it :-)).
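
To make "closing with a TCP reset" concrete, the sketch below shows the general socket-level technique in Python: enabling SO_LINGER with a zero timeout so that close() sends an RST instead of a FIN, which means the closing side never enters TIME_WAIT. This is a generic illustration on Linux; the function name close_with_reset is hypothetical, and the sketch makes no claim about how haproxy implements it internally.

```python
import socket
import struct

def close_with_reset(sock: socket.socket) -> None:
    """Close a TCP socket abortively: the peer receives an RST, not a FIN."""
    # struct linger { l_onoff = 1, l_linger = 0 }: close() drops any unsent
    # data and resets the connection, so this end skips TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()

if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    conn, _ = server.accept()
    conn.settimeout(5)
    close_with_reset(client)
    try:
        conn.recv(1)
        print("peer closed normally")
    except ConnectionResetError:
        print("peer reset the connection")
    conn.close()
    server.close()
```

The trade-off is that a reset discards any data still queued for sending, which is why it is safe for finished health checks but risky on client-facing connections.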

I have not backported this yet as I wanted to keep an observation period.

However, you can try something: put "option nolinger" in your BACKENDS, not 
your frontends, otherwise some clients will experience truncated responses! 
All backend connections (including checks) will then be closed by a reset, and 
you should see far fewer TIME_WAIT sockets between haproxy and the servers.
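
One way to check whether the number of TIME_WAIT sockets actually drops is to count them per remote port before and after the change. The sketch below parses the Linux /proc/net/tcp format, where the fourth column is the socket state in hex and 06 means TIME_WAIT. The function name is hypothetical, and only IPv4 sockets are covered (IPv6 sockets are listed in /proc/net/tcp6).

```python
from collections import Counter

TCP_TIME_WAIT = "06"  # state code for TIME_WAIT in /proc/net/tcp on Linux

def count_time_wait_by_remote_port(lines):
    """Count TIME_WAIT sockets per remote port from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skips the header row, malformed rows, and sockets in other states.
        if len(fields) < 4 or fields[3] != TCP_TIME_WAIT:
            continue
        # rem_address looks like "0100007F:1F90" (hex IP, hex port).
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        counts[remote_port] += 1
    return counts

if __name__ == "__main__":
    with open("/proc/net/tcp") as f:
        for port, n in count_time_wait_by_remote_port(f).most_common(10):
            print(f"remote port {port}: {n} sockets in TIME_WAIT")
```

Run on the haproxy machine, the remote ports with the highest counts would be expected to match the backend server ports if health checks are the source.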

Regards,
Willy




Re: Fwd: HAProxy hangs for one minute on config reload

2012-12-05 Thread Borgen, Terje
Hi Willy,
Thanks for your quick response.
I think you might be onto something here. We have a similar setup with haproxy 
using port 80 and have never experienced this problem in that environment.
/proc/sys/net/ipv4/ip_local_port_range says 32768-61000, so nothing special 
here. We have another, similar problem when restarting the Jetty servers on the 
same server. We always get an error saying that the port is in use, and we have 
to wait one minute before Jetty can start again. The Jetty ports (as you can 
see in the config) are also outside the ip_local_port_range. But this might be 
a different problem, since it happens on every restart.

Some additional info:
- We have two identical servers running Apache HTTP Server, haproxy and Jetty 
servers. Most of the traffic hits the main server, and the reload problem has 
never happened on the failover server. So this problem might be 
"traffic-related".
- For one week we changed the inter parameter on the clusters from the default 
2000 to 6, leaving rise/fall at their defaults. In that period the problem 
never occurred. 

Regards,
Terje 

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: December 5, 2012 08:22
To: Borgen, Terje
Cc: haproxy@formilux.org
Subject: Re: Fwd: HAProxy hangs for one minute on config reload

Hi Terje,

On Tue, Dec 04, 2012 at 12:58:17PM +0100, Borgen, Terje wrote:
> Hi,
> We are using haproxy in a main/failover scenario in front of Jetty servers. 
> On approximately every fifth config reload, all requests get 503 errors 
> for the next minute.
> It seems very much like the issue described here: 
> http://marc.info/?t=12114602629&r=1&w=2
> 
> Some info:
> RHEL 5.3
> Haproxy 1.4.22
> The traffic is approximately 30 requests/second.
> There is an Apache HTTP Server in front of Haproxy
> 
> Attached are the configuration and the haproxy logs for a normal reload 
> and a reload with a hang.
> Can you help me identify what's wrong?

I have an idea. If you look below, the pause happens between the old and the 
new process:

Nov 30 11:00:13 alp-stb-004 haproxy[29722]: 127.0.0.1:48843 [30/Nov/2012:10:59:44.396] client skade_p/skade_p_main 0/0/0/29383/29387 302 408 - - --VN 0/0/0/0/0 0/0 "POST /skade/public/storage/submitRetrieveShoppingCart.action HTTP/1.1"
Nov 30 11:01:01 alp-stb-004 haproxy[32609]: 127.0.0.1:51371 [30/Nov/2012:11:01:01.365] client skade_p/skade_p_main 0/0/0/47/47 200 12828 - - --VN 0/0/0/0/0 0/0 "GET /skade/partner HTTP/1.1"
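
The size of that pause can be read off the two syslog timestamps. As a rough sketch, the Python below finds the first point in a log where the haproxy PID changes and reports the time between the two entries. The regular expression and the function name are assumptions made for this example; the syslog prefix carries no year, so it is passed in separately. Note that the syslog time is when the line was emitted, so this measures the gap in logging, not the exact outage.

```python
import re
from datetime import datetime

# Syslog prefix such as: "Nov 30 11:00:13 alp-stb-004 haproxy[29722]:"
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ haproxy\[(\d+)\]:")

def reload_gap_seconds(lines, year):
    """Seconds between the last entry of one haproxy PID and the first of the next."""
    previous = None  # (timestamp, pid) of the last haproxy line seen
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        pid = int(m.group(2))
        if previous is not None and pid != previous[1]:
            return (stamp - previous[0]).total_seconds()
        previous = (stamp, pid)
    return None  # no PID change found

if __name__ == "__main__":
    log = [
        'Nov 30 11:00:13 alp-stb-004 haproxy[29722]: 127.0.0.1:48843 [30/Nov/2012:10:59:44.396] client skade_p/skade_p_main 0/0/0/29383/29387 302 408 - - --VN 0/0/0/0/0 0/0 "POST /skade/public/storage/submitRetrieveShoppingCart.action HTTP/1.1"',
        'Nov 30 11:01:01 alp-stb-004 haproxy[32609]: 127.0.0.1:51371 [30/Nov/2012:11:01:01.365] client skade_p/skade_p_main 0/0/0/47/47 200 12828 - - --VN 0/0/0/0/0 0/0 "GET /skade/partner HTTP/1.1"',
    ]
    print(reload_gap_seconds(log, 2012))  # prints 48.0
```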

One thing that the new process does is to try to bind the listeners, and if any 
fails, it asks the old process to release the port so that it can try again.

Since you're running on a non-privileged port (8088), it is possible that 
during the reload, when the port is temporarily released, it gets used as a 
source port by the old process, preventing either process from binding to it.

Could you please check /proc/sys/net/ipv4/ip_local_port_range ?

If it spans further than the default 32768-61000, please ensure the range does 
not cover 8088 and try again. If that solves the issue, we should probably 
mention it in the documentation.

Regards,
Willy




HAProxy hangs for one minute on config reload

2012-12-04 Thread Borgen, Terje
Hi,
We are using haproxy in a main/failover scenario in front of Jetty servers. 
On approximately every fifth config reload, all requests get 503 errors for 
the next minute.
It seems very much like the issue described here: 
http://marc.info/?t=12114602629&r=1&w=2

Some info:
RHEL 5.3
Haproxy 1.4.22
The traffic is approximately 30 requests/second.
There is an Apache HTTP Server in front of Haproxy

Attached are the configuration and the haproxy logs for a normal reload and a 
reload with a hang.
Can you help me identify what's wrong?

Best regards
Terje Borgen