Re: Abstract namespace sockets handling

hodor Tue, 08 Jul 2014 13:37:32 -0700

Hello,

On 2014-07-08, 15:26:32, Willy Tarreau wrote:
> Not exactly, I know what's happening, you have a frontend which had both
> a unix socket and an abstract socket. When resuming, the abstract socket
> failed and the proxy was marked in error so polling was not re-enabled on
> its listeners. I still have to see how far we can go to change that. It's
> very tricky as we don't want to leave a process in a bad state which will
> never stop for example. Initially when the soft restart was implemented,
> we were not supposed to have multiple processes listening :-)
> 
> The pause/resume operations for unix sockets are different than those
> for other protocols because a file system access is needed, so they're
> performed by the new process.
> 
> > Perhaps this could be solved by delaying the rename(tempname, path) and
> > unlink(backname) after all else is done? Something like .bind_finish()
> > and .bind_rollback() in struct protocol, where .bind_finish() would be
> > for "all is okay" and .bind_rollback() for "something else failed,
> > return the socket to the old haproxy instance"? Those functions could be
> > called after we are reasonably sure nothing else can fail.
> 
> All that is properly done. Check your config to ensure you're not in
> the case above, or alternately, comment out the "fail = 1" statement
> at line 841 in proxy.c and you will see this annoying behaviour go
> by itself.


I think we are talking about different problems. The one I mentioned
doesn't need abstract sockets at all. It only needs something else to
fail after we have already gone through the link(), bind(), rename(),
unlink() sequence for the unix socket.

Let's have these two config files:

conf1:

---------------------------
global
  pidfile /tmp/proxy/pid

defaults
  mode tcp

listen test1
  bind unix@/tmp/test1.sock
  server test1 127.0.0.1:22
---------------------------


conf2:

---------------------------
global
  pidfile /tmp/proxy/pid

defaults
  mode tcp

listen test1
  bind unix@/tmp/test1.sock
  server test1 127.0.0.1:22

listen test2
  bind [email protected]:22
  server test2 127.0.0.1:23
---------------------------

First start the first one (daemon mode is necessary for the pid file):

./haproxy -f conf1 -D

"socat stdio unix-connect:/tmp/test1.sock" now works and connects to the
local SSH.

Now we try to reload haproxy with the second config:

./haproxy -f conf2 -p pid -D -sf `cat pid`

The whole new haproxy instance will fail as port 22 is occupied by SSH
and cannot be bound. The new instance unlink()ed the original
/tmp/test1.sock, so the old instance, although running, is now
effectively useless.

I tried with and without the "fail = 1" statement present. Did I miss
something?

I realize it is not entirely fair to change the config this way :). It
is not a problem for me. I just wanted to point out this can happen.
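
To make the deferred-commit idea quoted above more concrete, here is a
rough C sketch of a two-phase bind for a unix socket. It is not HAProxy
code: struct ux_listener, ux_bind_prepare(), ux_bind_finish() and
ux_bind_rollback() are hypothetical names, the temp-name format is an
assumption, and the backup link (backname) handling is left out to keep
it short. The point is only that the live path is not touched until
everything else is known to have succeeded.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Longest path a sockaddr_un can hold on this platform. */
#define UX_PATH_LEN sizeof(((struct sockaddr_un *)0)->sun_path)

/* Hypothetical listener state; none of these names come from HAProxy. */
struct ux_listener {
    int  fd;                     /* listening socket, -1 when unused */
    char path[UX_PATH_LEN];      /* final, live socket path */
    char tempname[UX_PATH_LEN];  /* temporary name used until commit */
};

/* Phase 1: bind and listen on a temporary name; the live path is untouched. */
int ux_bind_prepare(struct ux_listener *l, const char *path)
{
    struct sockaddr_un addr;
    int n;

    l->fd = -1;
    n = snprintf(l->path, sizeof(l->path), "%s", path);
    if (n < 0 || (size_t)n >= sizeof(l->path))
        return -1;
    n = snprintf(l->tempname, sizeof(l->tempname), "%s.%d.tmp",
                 path, (int)getpid());
    if (n < 0 || (size_t)n >= sizeof(l->tempname))
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path, l->tempname, (size_t)n + 1);

    unlink(l->tempname);         /* drop a stale temp socket, if any */
    l->fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (l->fd < 0)
        return -1;
    if (bind(l->fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(l->fd, 16) < 0) {
        close(l->fd);
        l->fd = -1;
        unlink(l->tempname);
        return -1;
    }
    return 0;
}

/* Phase 2a: all other listeners are bound; atomically take over the path. */
int ux_bind_finish(struct ux_listener *l)
{
    return rename(l->tempname, l->path);
}

/* Phase 2b: something else failed; remove only the temporary socket, so
 * whatever the old process has at l->path stays reachable. */
void ux_bind_rollback(struct ux_listener *l)
{
    if (l->fd >= 0) {
        close(l->fd);
        l->fd = -1;
    }
    unlink(l->tempname);
}
```

With something like this, the startup code would call ux_bind_prepare()
for every unix listener, bind all remaining listeners, and only then
call ux_bind_finish() on each one. In the conf2 case the bind on port 22
would fail first, ux_bind_rollback() would run, and /tmp/test1.sock
would still belong to the old process.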


Thanks,

-- 
hodor
