On Friday 12 March 2010 20:17:14, Willy Tarreau wrote:
> Hi Cyril,
> 
> On Fri, Mar 12, 2010 at 05:03:15PM +0100, Cyril Bonté wrote:
> > Hi Willy,
> > 
> > Our monitoring scripts use the unix socket to get haproxy's status. 
> > Sometimes they detect haproxy DOWN when it's not really the case.
> 
> Do you know how much time it takes to observe it ? I'm currently

It's random, but I generally don't have to wait more than 10 seconds (fewer
than 1500 loops at home).

> running it on 1.4.0-3 here. It's been running for the last 10 minutes
> with 2, then 3 and now 10 concurrent scripts. For the record in case
> that matters, it's running on socat 1.6.0.0 :
>   # /root/bin/socat -V
>   socat by Gerhard Rieger - see www.dest-unreach.org
>   socat version 1.6.0.0 on Oct 28 2007 21:29:34

At work it should be version 1.6.0.1 (Debian lenny package).
Tonight, my tests were done with version 1.7.1.2.

>      running on Linux version #1 Sun Jan 31 00:55:16 CET 2010, release 2.4.37-wt3-fw, machine i686

I don't think it makes a big difference, but my tests were done with
2.6.{18,24,31,33} kernels.

> Not much as most of the rework happened between 1.3 and 1.4. In fact,
> some part of the work also happened between 1.3.15 and 1.3.16 but it
> was the low-level I/O which is now common with TCP/HTTP. It would be
> nice to try with "strace socat" instead of "socat" alone. I wonder
> if it's just a scheduling issue sometimes causing socat to close its
> output channel after sending the request and before receiving the
> response (as we commonly have with netcat).

This might be the case, but then it's strange that it doesn't happen with
non-concurrent accesses.
Working trace:
...
stat("/tmp/haproxy.socket", {st_mode=S_IFSOCK|0755, st_size=0, ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
connect(3, {sa_family=AF_FILE, path="/tmp/haproxy.socket"}, 21) = 0
getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff6290e590) = -1 EINVAL (Invalid argument)
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff6290e590) = -1 EINVAL (Invalid argument)
select(4, [0 3], [1 3], [], NULL)       = 3 (in [0], out [1 3])
read(0, "show info\n", 8192)            = 10
write(3, "show info\n", 10)             = 10
select(4, [0 3], [3], [], NULL)         = 3 (in [0 3], out [3])
read(3, "Name: HAProxy\nVersion: 1.3.16\nRe"..., 8192) = 255
write(1, "Name: HAProxy\nVersion: 1.3.16\nRe"..., 255) = 255
read(0, "", 8192)                       = 0
shutdown(3, 1 /* send */)               = 0
select(4, [3], [1], [], {0, 500000})    = 2 (in [3], out [1], left {0, 499998})
read(3, "", 8192)                       = 0
shutdown(3, 1 /* send */)               = 0
shutdown(3, 2 /* send and receive */)   = 0
exit_group(0)                           = ?

Non-working trace:
stat("/tmp/haproxy.socket", {st_mode=S_IFSOCK|0755, st_size=0, ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
connect(3, {sa_family=AF_FILE, path="/tmp/haproxy.socket"}, 21) = 0
getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff5716e590) = -1 EINVAL (Invalid argument)
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff5716e590) = -1 EINVAL (Invalid argument)
select(4, [0 3], [1 3], [], NULL)       = 3 (in [0], out [1 3])
read(0, "show info\n", 8192)            = 10
write(3, "show info\n", 10)             = 10
select(4, [0 3], [3], [], NULL)         = 2 (in [0], out [3])
read(0, "", 8192)                       = 0
shutdown(3, 1 /* send */)               = 0
select(4, [3], [], [], {0, 500000})     = 1 (in [3], left {0, 499998})
read(3, "", 8192)                       = 0
shutdown(3, 1 /* send */)               = 0
shutdown(3, 2 /* send and receive */)   = 0
exit_group(0)                           = ?
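
The notable difference between the two traces is the order of events after
the request is written: in the failing run, select() reports EOF on stdin
before the socket becomes readable, so socat shuts down the write side first
and the following read(3) returns 0 bytes. To take that ordering out of the
picture in a monitoring script, here is a minimal Python sketch (not what our
scripts currently do) that sends the command and reads until the peer closes,
without ever half-closing the socket. The function name query_stats_socket
and the 2-second timeout are hypothetical choices, and the sketch assumes the
server closes the connection once it has answered.

```python
import socket


def query_stats_socket(path, command, timeout=2.0):
    """Send one command to a Unix stream socket and return the full reply.

    The write side is deliberately left open until the reply has been read,
    so the result does not depend on how the peer reacts to a half-close.
    """
    chunks = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)  # hypothetical value, avoids hanging forever
        sock.connect(path)
        sock.sendall(command.encode("ascii") + b"\n")
        try:
            while True:
                data = sock.recv(8192)
                if not data:  # peer closed the connection: reply is complete
                    break
                chunks.append(data)
        except socket.timeout:
            # Assumption: if the peer stays silent, return what we have so far.
            pass
    return b"".join(chunks).decode("ascii", errors="replace")
```

With the socket path from the traces, a call would look like
query_stats_socket("/tmp/haproxy.socket", "show info"). An empty return value
would then point at the server side rather than at the client's shutdown
timing.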

> But from my experience,
> socat does not seem to abort any transfer after a unidirectional
> close. So that would indicate that haproxy's stats output stops
> if the input channel closes, which I don't think is the case at
> all.
> 
> Regards,
> Willy
> 
> 

-- 
Cyril Bonté
