On Mon, Jul 18, 2011 at 06:33:33PM -0400, Jonathan Simms wrote:
> Willy,
>
> I looked at the previous bug report here
> http://comments.gmane.org/gmane.comp.web.haproxy/5439
> based on 2.6.38 and checked the ubuntu 2.6.32 kernel for the offending patch
> <http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c191a836a908d1dd6b40c503741f91b914de3348>
> and I didn't see it applied to the kernel I'm using.
OK then that's already a good thing, but we have to find out what
could cause a similar issue on a specific distro !
> Is there any other explanation, or some information I can find for you?
Do all the listeners have the same issue or only a few ? And did the
config change between the working one and the reloaded one ? What could
cause the same issue to happen is a copy-paste of a "bind" line in the
same file, which would cause a conflict when trying to bind the second
one.
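A quick way to rule out the copy-paste case is to scan the config for repeated "bind" addresses. The sketch below is only an illustration: find_duplicate_binds is a hypothetical helper, not part of haproxy. It assumes addresses on a bind line are comma-separated without spaces, and it does not expand port ranges or detect a wildcard address overlapping a specific one.

```python
import re
from collections import defaultdict


def find_duplicate_binds(config_text):
    """Return {address: [line numbers]} for bind addresses listed more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"bind\s+(\S+)", stripped)
        if not match:
            continue
        # Assumption: several addresses may be given, separated by commas.
        for addr in match.group(1).split(","):
            host, _, port = addr.rpartition(":")
            # Treat ":80", "*:80" and "0.0.0.0:80" as the same wildcard listener.
            if host in ("", "*"):
                host = "0.0.0.0"
            seen[f"{host}:{port}"].append(lineno)
    return {addr: lines for addr, lines in seen.items() if len(lines) > 1}
```

Calling it on the text of the config file returns an empty dict when no address is repeated; otherwise each repeated address maps to the line numbers where it appears.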
Also, please check that you don't have more than one process running
when the issue appears. It could be that another old process still holds
the ports open and does not get the signal to release them. But this would
be surprising considering that your config only allows one process.
In your trace below, you only have the expected part :
17492 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
17492 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
17492 setsockopt(5, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
17492 setsockopt(5, SOL_SOCKET, 0xf /* SO_??? */, [1], 4) = -1 ENOPROTOOPT (Protocol not available)
17492 bind(5, {sa_family=AF_INET, sin_port=htons(6379), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
17492 close(5) = 0
17492 kill(13512, SIGTTOU) = 0
The first bind() fails, then the new process sends a SIGTTOU signal to the
old one asking it to release the ports, then haproxy tries to bind again for
a certain time, and only complains if it fails for too long. Ideally, a full
strace of the issue could help, but please take it with "strace -tt" so that
we get the timestamps.
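For reference, the sequence described above (bind fails, ask the old process to release the port, retry for a while, then give up) can be sketched roughly as follows. This is not haproxy's code: bind_with_retry, its callback argument, and the attempts/delay values are all made up for illustration.

```python
import errno
import socket
import time


def bind_with_retry(host, port, ask_old_to_release, attempts=20, delay=0.05):
    """Bind a listening socket; if the port is busy, ask the holder once and retry."""
    asked = False
    for _ in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen(128)
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # any other error is not a port conflict
        if not asked:
            # Hypothetical hook: a real caller would signal the old process here.
            ask_old_to_release()
            asked = True
        time.sleep(delay)
    raise OSError(errno.EADDRINUSE, "port still in use after retries")
```

A caller that knows the old PID could pass something like `lambda: os.kill(old_pid, signal.SIGTTOU)` as the callback, which mirrors the kill() seen in the trace. If the old process never releases the port, the function raises after the last attempt, which corresponds to the point where the error is finally reported.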
Regards,
Willy