RE: [chkfail] what went wrong?

Michel Hoogervorst Fri, 13 Jun 2014 08:06:06 -0700

Hi,

In syslog I see all kinds of events, for example when a server goes down 
for maintenance or when I send a SIGHUP to the haproxy process. So why is a 
failed check not being logged?
It would really help to at least know when a chkfail occurred, so we can 
try to correlate it with messages from the webserver logs or our 
monitoring system.



Michel


-----Original message-----
From: Michel Hoogervorst 
Sent: Wednesday, 11 June 2014 14:21
To: '[email protected]'
Subject: RE: [chkfail] what went wrong?

Hi,

I have done some more searching, but cannot put my finger on what the chkfails 
are about. Since they are counted, I'd say they must mean "something".
I'd really like to know what they mean. Is there some special 
log level I could set up to get more information on these failures?


Michel

-----Original message-----
From: Michel Hoogervorst
Sent: Tuesday, 3 June 2014 12:02
To: 'Baptiste'
CC: [email protected]
Subject: RE: [chkfail] what went wrong?

Hi,

Thanks for your reply.

This is what our config looks like (I stripped some irrelevant comments and 
names):
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     20000
    user        haproxy
    group       haproxy
    daemon
    tune.maxrewrite 2048
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  dontlog-normal
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          30m
    timeout server          30m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 20000

listen webcluster_prod 127.0.0.1:8081
  mode http
  balance hdr(X-Forwarded-For)

  option contstats
  option httpchk HEAD /check.txt HTTP/1.0
  option httpclose

  server xxx 10.20.30.1:80 check
  server yyy 10.20.30.2:80 check

When a server goes down I do get a message in the log, so logging seems to work 
fine. I would just like to know what the chkfail is about, since I only see it 
in the stats and nowhere else.


Kind regards,

Michel Hoogervorst

-----Original message-----
From: Baptiste [mailto:[email protected]]
Sent: Tuesday, 3 June 2014 11:41
To: Michel Hoogervorst
CC: [email protected]
Subject: Re: [chkfail] what went wrong?

On Tue, Jun 3, 2014 at 11:28 AM, Michel Hoogervorst 
<[email protected]> wrote:
> Hi,
>
>
>
> I have checked the documentation but cannot find much information on 
> chkfail’s in there.
>
>
>
> Is there any way to check what went wrong when a “chkfail” has occurred?
>
> We use a monitoring-plugin that has thresholds set on chkfail’s which 
> shows me that some machines have multiple chkfail’s every day.
> However, no problem is logged on either the HAProxy side or the 
> webserver side, and “show errors” on the stats socket shows 0 errors.
>
> I’d like to know what happened, and if there has been any impact from 
> this failure. Is a chkfail always because the server didn’t respond 
> within 10 seconds or can it be something else?
>
> Kind regards,
>
> Michel Hoogervorst


Hi Michel,

Maybe you could start by sharing your configuration, then it will be easier to 
help you.

HAProxy logs at severity "notice" when a server moves to status DOWN because 
of a failed health check response.
If you don't find this information in your log, it's either because your 
HAProxy or your syslog server is not configured properly (or both at the same 
time).
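
As a possible next step (a sketch only, not something confirmed in this 
thread): HAProxy has an "option log-health-checks" directive that logs changes 
in health check status, not just UP/DOWN transitions. Assuming it is available 
in the version in use, adding it to the listen section posted earlier in this 
thread would look roughly like this:

listen webcluster_prod 127.0.0.1:8081
  mode http
  balance hdr(X-Forwarded-For)

  option httpchk HEAD /check.txt HTTP/1.0
  # assumption: log every health check status change, so an occasional
  # failed check shows up in syslog even while the server stays UP
  option log-health-checks

  server xxx 10.20.30.1:80 check
  server yyy 10.20.30.2:80 check

Whether this accounts for every increment of the chkfail counter is not 
established here; it should at least give timestamps to compare against the 
webserver logs.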

Baptiste
