Re: Problem with health checking

2016-10-12 Thread Cyril Bonté

Hi John,

On 12/10/2016 at 22:07, John Lanigan wrote:
> Hi Cyril,
>
> It’s only taken me 2 months to upgrade!
>
> Once I figured out the process I dove straight in at the deep end and went up
> to 1.6.9, it's been live a few days now without issues so I have enabled health
> checking tonight on the nodes that were mis-reporting being down and they are
> showing green!

Great, thanks for the feedback.

> Thank you for your help, and excellent work on the new HTML documentation, the
> format is much more readable than the txt version for occasional users like me.

Thanks, it reminds me I have a todo list which is still growing. I
really need to find some time to fix some issues and re-enable the
automatic generation of the documentation: since I included all the
documentation files (for versions providing intro.txt, management.txt,
...), it's stuck on a release version until I run the process and
manually commit...





--
Cyril Bonté



RE: Problem with health checking

2016-10-12 Thread John Lanigan
Hi Cyril,

It’s only taken me 2 months to upgrade!

Once I figured out the process I dove straight in at the deep end and went up 
to 1.6.9, it's been live a few days now without issues so I have enabled health 
checking tonight on the nodes that were mis-reporting being down and they are 
showing green!

Thank you for your help, and excellent work on the new HTML documentation, the 
format is much more readable than the txt version for occasional users like me.

Kind Regards,
 
John Lanigan



RE: Problem with health checking

2016-08-07 Thread John Lanigan
Thanks Cyril,

I'll try that, and also have the app server team compare their settings across 
the servers.

Kind Regards,
 
John Lanigan
CORE | +353 (0)25 41400 | john.lani...@coresoftware.ie | www.coresoftware.ie



Re: Problem with health checking

2016-08-07 Thread Cyril Bonté

Hi John,


On 07/08/2016 at 21:17, John Lanigan wrote:
> [...]
> http-check expect status 200
> http-check disable-on-404
>
> server hostingapp2_5041 172.17.2.40:5041  check
> server hostingapp3_5041 172.17.2.50:5041  check
>
> [...]
>
> If I access all four of those URLs with lynx to show the headers from
> the command line on the haproxy server I get the same result from each
> server, instantaneous response and HTTP 200 status.
>
> [...]
>
> But the haproxy stats page reports L7 timeout if we add in a check on
> the new app setup.
>
> Any ideas what I need to check next?
>
> [...]
>
> I’ve been running  haproxy 1.4.24 on centos 6.5 for over 2 years, load
> balancing a pair of Oracle 11g app servers on Windows 2008r2.

Did you upgrade to the latest 1.4 to see if the issue remains?
After reviewing the changelog with the configuration you have provided,
I think it comes from your http-check rules, due to a bug that was
fixed in 1.4.26 (about HTTP keep-alive connections with http-check):

http://www.haproxy.org/git?p=haproxy-1.4.git;a=commit;h=98739cba6281adeaf1db26be188970e4423b51fc

As a quick test, maybe you can disable keep-alive on your servers to
verify this.
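
One way to see whether a backend leaves the connection open after answering is to send a bare request over a raw socket and watch whether the server closes first. The following is only a rough sketch, not what HAProxy does internally: the function names (build_check_request, parse_status, probe) are made up for illustration, and the HTTP/1.0 request line is an assumption about what a plain "option httpchk GET <uri>" probe looks like.

```python
import socket
import time


def build_check_request(path):
    """Build a bare HTTP/1.0 request (assumed to resemble an httpchk probe)."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_status(response):
    """Extract the numeric status code from the first line of a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def probe(host, port, path, timeout=2.0):
    """Send one request and return (status, server_closed, elapsed_seconds)."""
    start = time.monotonic()
    data = b""
    server_closed = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_check_request(path))
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    # Empty read: the server closed the connection itself.
                    server_closed = True
                    break
                data += chunk
        except socket.timeout:
            # The server kept the connection open until our own timeout.
            pass
    return parse_status(data), server_closed, time.monotonic() - start


# Example (addresses taken from this thread; run from the HAProxy host):
# print(probe("172.17.2.60", 5051, "/forms/frmservlet?config=newapp"))
```

If probe returns quickly with server_closed set to True on the old servers but only after the timeout with server_closed set to False on the new ones, that would point at the same keep-alive behaviour as the suspected bug.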



--
Cyril Bonté



RE: Problem with health checking

2016-08-07 Thread John Lanigan
Hi,

Some further info.

The two configs are below, old and new, the differences are minor:

listen Load-Balancer-oldapp
    bind *:5041
    mode http
    balance roundrobin

    stats enable
    stats hide-version
    stats refresh 20s
    stats show-node
    stats admin if TRUE

    stick-table type ip size 1m expire 1h
    stick on src

    option httpclose
    option forwardfor

    option httpchk GET /forms/frmservlet?config=oldapp

    http-check expect status 200
    http-check disable-on-404

    server hostingapp2_5041 172.17.2.40:5041 check
    server hostingapp3_5041 172.17.2.50:5041 check
    option redispatch



listen Load-Balancer-newapp
    bind *:5051
    mode http
    balance roundrobin

    stats enable
    stats hide-version
    stats refresh 20s
    stats show-node
    stats admin if TRUE

    stick-table type ip size 1m expire 1h
    stick on src

    option httpclose
    option forwardfor

    option httpchk GET /forms/frmservlet?config=newapp

    http-check expect status 200
    http-check disable-on-404

    server sh1hostingapp4_5051 172.17.2.60:5051
    server sh1hostingapp5_5051 172.17.2.70:5051

    option redispatch


When I take the IP, port and path from those configs and combine them for
testing I end up with four URLs:

http://172.17.2.40:5041/forms/frmservlet?config=oldapp
http://172.17.2.50:5041/forms/frmservlet?config=oldapp
http://172.17.2.60:5051/forms/frmservlet?config=newapp
http://172.17.2.70:5051/forms/frmservlet?config=newapp
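
The same URLs can be derived mechanically from a listen section, which avoids copy-and-paste mistakes when comparing setups. This is a minimal sketch with a hypothetical helper name (check_urls); it assumes the simple "option httpchk <method> <uri>" and "server <name> <address>" line shapes used in the configs above and does not attempt to parse HAProxy configuration in general.

```python
def check_urls(config_text):
    """Combine each server address with the httpchk URI from one listen section."""
    path = None
    addresses = []
    for raw in config_text.splitlines():
        words = raw.split()
        if words[:2] == ["option", "httpchk"] and len(words) >= 4:
            # option httpchk <method> <uri>
            path = words[3]
        elif words[:1] == ["server"] and len(words) >= 3:
            # server <name> <address:port> [params...]
            addresses.append(words[2])
    if path is None:
        raise ValueError("no 'option httpchk <method> <uri>' line found")
    return [f"http://{addr}{path}" for addr in addresses]


NEWAPP = """
listen Load-Balancer-newapp
    bind *:5051
    option httpchk GET /forms/frmservlet?config=newapp
    server sh1hostingapp4_5051 172.17.2.60:5051
    server sh1hostingapp5_5051 172.17.2.70:5051
"""

for url in check_urls(NEWAPP):
    print(url)
```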

If I access all four of those URLs with lynx to show the headers from the 
command line on the haproxy server I get the same result from each server, 
instantaneous response and HTTP 200 status.


[root@HAProxy1 ~]# lynx -head -dump http://172.17.2.40:5041/forms/frmservlet?config=oldapp
HTTP/1.1 200 OK
Date: Sun, 07 Aug 2016 19:01:32 GMT
Server: Oracle-Application-Server-11g
Cache-Control: no-cache,no-store
Content-Length: 5496
X-ORACLE-DMS-ECID: 00ia1Oq3joKFw00Fzzw0w1Co001zZk
X-Powered-By: Servlet/2.5 JSP/2.1
Connection: close
Content-Type: text/html
Content-Language: en

[root@HAProxy1 ~]# lynx -head -dump http://172.17.2.70:5051/forms/frmservlet?config=newapp
HTTP/1.1 200 OK
Date: Sun, 07 Aug 2016 18:59:29 GMT
Server: Oracle-Application-Server-11g
Cache-Control: no-cache,no-store
Content-Length: 5505
X-ORACLE-DMS-ECID: 00ia1OihrRjDsXH6yvnZ6G0001mS004H^5
X-Powered-By: Servlet/2.5 JSP/2.1
Connection: close
Content-Type: text/html
Content-Language: en



But the haproxy stats page reports L7 timeout if we add in a check on the new 
app setup.
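
The check result shown on the stats page is also available in the CSV export of the same page, which is easier to compare across servers than the HTML view. The sketch below pulls the status and check_status columns out of such an export. The helper name (check_states) is hypothetical, and the SAMPLE text is invented and heavily abbreviated for illustration; it is not output captured from the setup in this thread.

```python
import csv
import io


def check_states(stats_csv):
    """Map (proxy, server) to (status, check_status) from a stats CSV export."""
    text = stats_csv.lstrip()
    if text.startswith("#"):
        # The header line of the export starts with "# ".
        text = text[1:].lstrip()
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue  # keep only the per-server rows
        result[(row["pxname"], row["svname"])] = (row["status"], row["check_status"])
    return result


# Hypothetical, abbreviated sample: a real export has many more columns.
SAMPLE = """# pxname,svname,status,check_status,check_duration,
Load-Balancer-newapp,FRONTEND,OPEN,,,
Load-Balancer-newapp,sh1hostingapp4_5051,DOWN,L7TOUT,2001,
Load-Balancer-newapp,BACKEND,DOWN,,,
"""

for key, value in check_states(SAMPLE).items():
    print(key, value)
```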

Any ideas what I need to check next?


Kind Regards,

John Lanigan
CORE | +353 (0)25 41400 | john.lani...@coresoftware.ie | www.coresoftware.ie

From: John Lanigan [mailto:john.lani...@coresoftware.ie]
Sent: Thursday 4 August 2016 22:55
To: haproxy@formilux.org
Subject: Problem with health checking

Hi,

I've been running  haproxy 1.4.24 on centos 6.5 for over 2 years, load 
balancing a pair of Oracle 11g app servers on Windows 2008r2.

It's worked perfectly the entire time.  We have recently built two new app 
servers, same version of Oracle 11g on Windows 2012R2.

For health checking we check for http status 200 on the homepage of the 
application.

For the new servers the health checking is failing.  We are getting a layer 7 
timeout.  However when I try and access the page using either curl or lynx it 
downloads instantly with status 200.

Any tips as to where I go from here in troubleshooting would be great.  I've 
checked and double checked the config and the only difference for this load 
balancer over the existing ones is the IP and port numbers.

Right now we are load balancing without health checking, but obviously that's 
not a solution.

Any suggestions would be great, thanks in advance

John