Hey,

Sure thing! I've just put a nokqueue in the config and it's running again. Let's see :)

Kind regards,

John

Willy Tarreau wrote:
Hi John,

On Thu, May 08, 2014 at 09:15:20AM +0200, John-Paul Bader wrote:
Hey,

so I have downloaded the haproxy-ss-Latest from the website and applied
your patches. I have compiled it with:

make TARGET=freebsd USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1

It ran very well for 2 hours, but then 6 out of 12 processes dumped core,
this time, however, in the haproxy code realm and apparently session-related:

Great, so good news and bad news at the same time. Good news being that
the shared context was definitely causing the trouble, the bad news being
that the slightly-tested kqueue is still having some trouble.

Maybe the full backtrace is more helpful:

(gdb) bt full
#0  kill_mini_session (s=0x804269c00) at src/session.c:299
        level = 6
        conn = (struct connection *) 0x0
        err_msg = <value optimized out>
#1  0x0000000000463928 in conn_session_complete (conn=0x8039f2a80) at src/session.c:355
        s = (struct session *) 0x804269c00
#2  0x0000000000432769 in conn_fd_handler (fd=<value optimized out>) at src/connection.c:88
        conn = <value optimized out>
        flags = 41997063
#3  0x00000000004127db in fd_process_polled_events (fd=<value optimized out>) at src/fd.c:271
        new_updt = <value optimized out>
        old_updt = 1
#4  0x000000000046ed85 in _do_poll (p=<value optimized out>, exp=<value optimized out>) at src/ev_kqueue.c:141
        status = 1
        count = 0
        fd = <value optimized out>
        delta_ms = <value optimized out>
        timeout = {tv_sec = 0, tv_nsec = 27000000}
        updt_idx = <value optimized out>
        en = <value optimized out>
        eo = <value optimized out>
        changes = <value optimized out>

(...)

OK this trace tends to show that we were called for an event on
an FD in a strange state, half-open, half-closed :-/

If the fd were closed, in conn_fd_handler() it would have simply
returned because .owner == NULL. Since it managed to go to
conn_session_complete(), it means that fd.owner was correct but
the connection was not attached to this session, quite strange.
It seems like something was deinitialized, but I can't find a
code sequence which could produce this.
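
The reasoning above can be illustrated with a minimal C sketch. This is not the actual HAProxy code: every type and function name below (session_sketch, conn_sketch, fd_entry_sketch, classify_fd_sketch, run_sketch_demo) is hypothetical and only models the links being discussed. It assumes one reading of the backtrace, namely that the connection still pointed at the session while the session no longer pointed back at the connection (conn = 0x0 in frame #0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; NOT the real HAProxy structures. */
struct conn_sketch;

struct session_sketch {
    struct conn_sketch *conn;     /* connection the session believes it owns */
};

struct conn_sketch {
    struct session_sketch *owner; /* session the connection is attached to */
};

struct fd_entry_sketch {
    struct conn_sketch *owner;    /* NULL once the fd has been closed */
};

enum fd_state_sketch {
    FD_IGNORED,  /* fd closed: owner == NULL, the handler would just return */
    LINK_BROKEN, /* fd has an owner, but session and connection disagree */
    LINK_OK      /* fd owner, connection and session all point at each other */
};

/* Classify the state an event handler would find for a given fd. */
enum fd_state_sketch classify_fd_sketch(const struct fd_entry_sketch *tab, int fd)
{
    const struct conn_sketch *conn = tab[fd].owner;

    if (conn == NULL)
        return FD_IGNORED;
    if (conn->owner == NULL || conn->owner->conn != conn)
        return LINK_BROKEN;
    return LINK_OK;
}

/* Walk through the three cases; returns 0 if all of them behave as described. */
int run_sketch_demo(void)
{
    struct session_sketch sess = { NULL };   /* session lost its connection */
    struct conn_sketch conn = { &sess };     /* connection still names the session */
    struct fd_entry_sketch tab[2] = { { NULL }, { &conn } };

    assert(classify_fd_sketch(tab, 0) == FD_IGNORED);   /* closed fd */
    assert(classify_fd_sketch(tab, 1) == LINK_BROKEN);  /* the "strange state" */
    sess.conn = &conn;                                  /* restore the back link */
    assert(classify_fd_sketch(tab, 1) == LINK_OK);
    return 0;
}
```

In this model the crash corresponds to the LINK_BROKEN case: the early return for a closed fd does not trigger because the fd entry still has an owner, yet the session side of the link is already gone.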

Would you please retry with poll? I'm not really convinced that
a bug in kqueue could cause this :-/

Thanks
Willy


--
John-Paul Bader | Software Development

www.wooga.com
wooga GmbH | Saarbruecker Str. 38 | D-10405 Berlin
Sitz der Gesellschaft: Berlin; HRB 117846 B
Registergericht Berlin-Charlottenburg
Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
