Hello,

On 2014-07-08, 15:26:32, Willy Tarreau wrote:
> Not exactly, I know what's happening, you have a frontend which had both
> a unix socket and an abstract socket. When resuming, the abstract socket
> failed and the proxy was marked in error so polling was not re-enabled on
> its listeners. I still have to see how far we can go to change that. It's
> very tricky as we don't want to leave a process in a bad state which will
> never stop for example. Initially when the soft restart was implemented,
> we were not supposed to have multiple processes listening :-)
>
> The pause/resume operations for unix sockets are different than those
> for other protocols because a file system access is needed, so they're
> performed by the new process.
>
> > Perhaps this could be solved by delaying the rename(tempname, path) and
> > unlink(backname) after all else is done? Something like .bind_finish()
> > and .bind_rollback() in struct protocol, where .bind_finish() would be
> > for "all is okay" and .bind_rollback() for "something else failed,
> > return the socket to the old haproxy instance"? Those functions could be
> > called after we are reasonably sure nothing else can fail.
>
> All that is properly done. Check your config to ensure you're not in
> the case above, or alternately, comment out the "fail = 1" statement
> at line 841 in proxy.c and you will see this annoying behaviour go
> by itself.

I think we are talking about different problems. The one I mentioned
doesn't even need abstract sockets at all. It just needs something else
to fail after we have already done the link(), bind(), rename(),
unlink() sequence.

Let's take these two config files:

conf1:
---------------------------
global
    pidfile /tmp/proxy/pid
defaults
    mode tcp
listen test1
    bind unix@/tmp/test1.sock
    server test1 127.0.0.1:22
---------------------------

conf2:
---------------------------
global
    pidfile /tmp/proxy/pid
defaults
    mode tcp
listen test1
    bind unix@/tmp/test1.sock
    server test1 127.0.0.1:22
listen test2
    bind [email protected]:22
    server test2 127.0.0.1:23
---------------------------

First start haproxy with the first one (daemon mode is necessary for the
pid file):

    ./haproxy -f conf1 -D

"socat stdio unix-connect:/tmp/test1.sock" now works and connects to the
local SSH.

Now we try to reload haproxy with the second config:

    ./haproxy -f conf2 -p pid -D -sf `cat pid`

The whole new haproxy instance will fail, as port 22 is occupied by SSH
and cannot be bound. By that point the new instance has already
unlink()ed the original /tmp/test1.sock, so the old instance, although
still running, is now effectively useless.

I tried with and without the "fail = 1" statement present. Did I miss
something?

I realize it is not entirely fair to change the config this way :). It is
not a problem for me. I just wanted to point out this can happen.

Thanks,
--
hodor
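
P.S. To show the failure mode in isolation, here is a minimal Python sketch of the sequence as I understand it. It is not haproxy's code: the function name, the ".tmp"/".bak" suffixes and the temporary directory are made up for the illustration, and "tempname"/"backname" just follow the names used in the quoted proposal. Both "instances" are plain sockets in one process, and the "something else failed" step is simulated by simply closing the new socket.

```python
import os
import socket
import tempfile


def demo_socket_takeover():
    """Return (reachable_before, reachable_after, old_still_open)."""
    workdir = tempfile.mkdtemp(dir="/tmp")       # short path, fits sun_path
    path = os.path.join(workdir, "test1.sock")
    tempname = path + ".tmp"                     # hypothetical suffixes
    backname = path + ".bak"

    def can_connect():
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(path)
            return True
        except OSError:
            return False
        finally:
            client.close()

    # "Old instance": already listening on the final path.
    old = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    old.bind(path)
    old.listen(1)
    reachable_before = can_connect()

    # "New instance": link(), bind(), rename(), unlink().
    os.link(path, backname)        # keep a second name for the old socket
    new = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    new.bind(tempname)
    new.listen(1)
    os.rename(tempname, path)      # path now names the NEW socket
    os.unlink(backname)            # the old socket no longer has any name

    # Something unrelated fails afterwards (e.g. another bind), so the
    # new instance gives up and goes away.
    new.close()

    reachable_after = can_connect()        # path points at a dead socket
    old_still_open = old.fileno() != -1    # old listener is alive, but unreachable

    old.close()
    os.unlink(path)
    os.rmdir(workdir)
    return reachable_before, reachable_after, old_still_open


if __name__ == "__main__":
    print(demo_socket_takeover())
```

The point of the sketch is only that once backname is unlinked there is no way left to hand the path back to the old listener, which is why I suggested postponing the rename()/unlink() pair until nothing else can fail.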

