On Sun, Dec 24, 2017 at 07:36:54PM +0100, Willy Tarreau wrote:
> > The bug has been introduced in commit baf6ea4b (" BUG/MINOR: mworker:
> > detach from tty when in daemon mode") and affects 1.8.1 and newer
> > stable releases.
> > The discourse-OP also states that the socket is likely closed by the
> > triple flcose() introduced in that bisected commit.
> 
> Ah bad :-( But interestingly, we were speaking about using CLOEXEC on
> listeners, I think this will make all of this much easier to deal with.

OK I think I see what's happening. The fclose() is performed on FILE*
and makes sense for this process as any further references to these
files will know they are closed. But upon execve(), these FDs are
unassigned, and a new set of FILE* will be assigned to existing FDs
0,1,2 which now map to listening sockets. By doing this fclose() after
the reload, we in fact close the listening socket we've just created.

We could very well imagine closing 0,1,2 immediately upon startup in
the new process after we detect a reload (I think we can detect this).
Another option I'd rather not consider would be not to close these files
again upon reload, but that would mean that any warning/alert/debugging
message risks to be sent to whatever FD lies there (possibly including
the pidfile). We could also imagine opening /dev/null on 0,1,2 instead
of just closing them but I suspect it would just hide the real problem.

By the way, during my strace, I noticed that we do not check for the
pidfd's validity prior to writing to it, leading to this when no pidfile
is specified :

  10:09:39.475829 write(4294967295, "3205\n", 5) = -1 EBADF (Bad file 
descriptor)

So we still have some hardening work to do in this area. I guess we
first want to clearly describe how we want each process to behave in
each case (mw+debug, mw+quiet, mw+daemon, debug, quiet, daemon). This
way we'll avoid pushing fixes which break other use cases ;-)

Regards,
Willy

Reply via email to