On Mon, Jan 22, 2018 at 05:47:55PM +0100, Willy Tarreau wrote:
> > strace: Process 12166 attached
> > [pid 12166] set_robust_list(0x7ff9bc9aa9e0, 24 <unfinished ...>
> > [pid 12166] <... set_robust_list resumed> ) = 0
> > [pid 12166] gettimeofday({1516289044, 684014}, NULL) = 0
> > [pid 12166] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
> > [pid 12166] <... mmap resumed> )        = 0x7ff9ac000000
> > [pid 12166] munmap(0x7ff9b0000000, 67108864) = 0
> > [pid 12166] mprotect(0x7ff9ac000000, 135168, PROT_READ|PROT_WRITE <unfinished ...>
> > [pid 12166] <... mprotect resumed> )    = 0
> > [pid 12166] mmap(NULL, 8003584, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > [pid 12166] <... mmap resumed> )        = 0x7ff9baa65000
> > [pid 12166] close(16 <unfinished ...>
> > [pid 12166] <... close resumed> )       = 0
> > [pid 12166] fcntl(15, F_SETFL, O_RDONLY|O_NONBLOCK <unfinished ...>
> > [pid 12166] <... fcntl resumed> )       = 0
> 
> Here it's getting obvious that it was a shared file descriptor :-(

So I have a suspect here :

   - run_thread_poll_loop() runs in each thread, after the threads are created
   - one of the first things it does is to close the master-worker pipe FD :

        (...)
        if (global.mode & MODE_MWORKER)
                mworker_pipe_register(mworker_pipe);
        (...)

     void mworker_pipe_register(int pipefd[2])
     {
        close(mworker_pipe[1]); /* close the write end of the master pipe in the children */
        fcntl(mworker_pipe[0], F_SETFL, O_NONBLOCK);
        (...)
     }

     Doesn't this look a lot like the close(16) / fcntl(15, F_SETFL, ...)
     pair in the trace above ?

So I guess your config works in master-worker mode, am I right ?
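To illustrate why a repeated close() is dangerous, here is a minimal sketch, independent of haproxy. It is single-threaded on purpose so that it is deterministic, and demo_double_close() is a hypothetical name used only for this example. It relies on the POSIX rule that new descriptors take the lowest free numbers, so the number freed by the first close() is reused immediately and the second close() hits an unrelated descriptor:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo, not haproxy code.
 * Returns 1 when a repeated close() on a stale fd number ends up closing
 * an unrelated descriptor that reused that number, 0 otherwise, -1 on error.
 */
int demo_double_close(void)
{
        int first[2], second[2];
        int stale, hit;

        if (pipe(first) < 0)
                return -1;

        stale = first[1];
        close(first[1]);          /* first caller closes the write end */

        if (pipe(second) < 0) {   /* new fds take the lowest free numbers */
                close(first[0]);
                return -1;
        }

        /* did the second pipe pick up the number we just released ? */
        hit = (second[0] == stale || second[1] == stale);

        close(stale);             /* second caller repeats the same close() */

        /* if the number was reused, that end of the second pipe is now gone */
        hit = hit && fcntl(stale, F_GETFD) < 0 && errno == EBADF;

        /* clean up whatever is still open */
        close(first[0]);
        if (second[0] != stale)
                close(second[0]);
        if (second[1] != stale)
                close(second[1]);
        return hit;
}
```

With several threads the same thing happens, except that which descriptor gets closed depends on what the other threads opened in the meantime.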

Note that I'm also bothered by the call to protocol_enable_all() in
this function, since every thread calls it and it will thus start the
proxies multiple times, possibly in an unsafe way. That may suddenly
explain a lot of things!

I think the attached patch works around it, but I'd like your
confirmation before cleaning it up.

Thanks,
Willy

diff --git a/src/haproxy.c b/src/haproxy.c
index 20b18f8..66639fc 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -2339,7 +2339,11 @@ void mworker_pipe_handler(int fd)
 
 void mworker_pipe_register(int pipefd[2])
 {
+       if (mworker_pipe[1] < 0)
+               return;
+
        close(mworker_pipe[1]); /* close the write end of the master pipe in the children */
+       mworker_pipe[1] = -1;
 
        fcntl(mworker_pipe[0], F_SETFL, O_NONBLOCK);
        fdtab[mworker_pipe[0]].owner = mworker_pipe;
@@ -2408,6 +2412,7 @@ static void *run_thread_poll_loop(void *data)
 {
        struct per_thread_init_fct   *ptif;
        struct per_thread_deinit_fct *ptdf;
+       static __maybe_unused HA_SPINLOCK_T start_lock;
 
        tid     = *((unsigned int *)data);
        tid_bit = (1UL << tid);
@@ -2420,10 +2425,12 @@ static void *run_thread_poll_loop(void *data)
                }
        }
 
+       HA_SPIN_LOCK(LISTENER_LOCK, &start_lock);
        if (global.mode & MODE_MWORKER)
                mworker_pipe_register(mworker_pipe);
 
        protocol_enable_all();
+       HA_SPIN_UNLOCK(LISTENER_LOCK, &start_lock);
        THREAD_SYNC_ENABLE();
        run_poll_loop();
 
