Re: segfault in haproxy=1.8.4

William Dauchy Mon, 19 Mar 2018 13:17:26 -0700

On Mon, Mar 19, 2018 at 08:41:16PM +0100, Willy Tarreau wrote:
> For me, "experimental" simply means "we did our best to ensure it works
> but we're realistic and know that bug-free doesn't exist, so a risk remains
> that a bug will be hard enough to fix so as to force you to disable the
> feature for the time it takes to fix it". This issue between threads and
> queue is one such example. Some of the bugs faced on H2 requiring some
> heavy changes were other examples. But overall we know these features
> are in high demand and we are committed to making them work fine :-)


You are right, we probably magnified in our heads the various issues we
had related to this.

> I'm still interested in knowing if you find crazy last percentile values.
> We've had that a very long time ago (version 1.3 or so) when some pending
> conns were accidentally skipped, so I know how queues can amplify small
> issues. The only real risk here in my opinion is that the sync point was
> only used for health checks till now so it was running at low loads and
> if it had any issue, it would likely have remained unnoticed. But the code
> is small enough to be audited, and after re-reading it this afternoon I
> found it fine.

Will do. Migrating some low-latency applications is more of a mid/long-term
goal, but we will see how the first results look during the preparation tests.

> If you want to run a quick test with epoll, just apply this dirty hack :
>
> diff --git a/src/ev_epoll.c b/src/ev_epoll.c
> index b98ca8c..7bafd16 100644
> --- a/src/ev_epoll.c
> +++ b/src/ev_epoll.c
> @@ -116,7 +116,9 @@ REGPRM2 static void _do_poll(struct poller *p, int exp)
>         fd_nbupdt = 0;
>
>         /* compute the epoll_wait() timeout */
> -       if (!exp)
> +       if (1)
> +               wait_time = 0;
> +       else if (!exp)
>                 wait_time = MAX_DELAY_MS;
>         else if (tick_is_expired(exp, now_ms)) {
>                 activity[tid].poll_exp++;
>
> Please note that as is, it's suboptimal because it will increase the
> contention in other places, causing the performance to be a bit lower in
> certain situations. I do have some experimental code to loop on epoll
> instead but it's not completely stable yet. We can exchange on this later
> if you want. But feel free to apply this to your latency tests.

Thanks a lot, will give it a try!
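
For anyone following along, here is a minimal standalone sketch of what the
hack boils down to, as I understand it: with wait_time forced to 0,
epoll_wait() returns immediately whether or not an fd is ready, so the
poller spins instead of sleeping. This is not haproxy code; busy_poll(),
demo() and the spin limit of 1000 are hypothetical names and values chosen
only for illustration, and it is Linux-only since it uses epoll.

```c
#include <unistd.h>
#include <sys/epoll.h>

/* Hypothetical helper (not from haproxy): call epoll_wait() with a zero
 * timeout until an event shows up or max_spins calls have been made.
 * Returns the number of ready events (0 if none), or -1 on error.
 * *spins receives the number of epoll_wait() calls performed. */
int busy_poll(int epfd, struct epoll_event *ev, int max_spins, int *spins)
{
    int n = 0;

    *spins = 0;
    while (*spins < max_spins) {
        (*spins)++;
        n = epoll_wait(epfd, ev, 1, 0);  /* timeout 0: never sleeps */
        if (n != 0)
            break;                       /* event ready, or error */
    }
    return n;
}

/* Hypothetical demo: watch the read side of a pipe, spin while it is
 * idle, then write one byte and poll again. Returns 0 on success. */
int demo(int *idle_spins, int *ready_spins)
{
    int fds[2], epfd, ret = -1;
    struct epoll_event ev = { .events = EPOLLIN }, out;

    if (pipe(fds) < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd < 0)
        goto close_pipe;
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0)
        goto close_epoll;

    /* Nothing to read yet: every call returns 0 at once, so we burn
     * all the allowed spins without ever blocking. */
    if (busy_poll(epfd, &out, 1000, idle_spins) != 0)
        goto close_epoll;

    /* Once a byte is pending, the very first call reports it. */
    if (write(fds[1], "x", 1) != 1)
        goto close_epoll;
    if (busy_poll(epfd, &out, 1000, ready_spins) != 1)
        goto close_epoll;
    ret = 0;

close_epoll:
    close(epfd);
close_pipe:
    close(fds[0]);
    close(fds[1]);
    return ret;
}
```

Calling demo() from a main() should report 1000 spins while the pipe is
idle and a single spin once data is pending. That is the trade-off Willy
mentions: no wake-up delay, but the CPU stays busy when idle.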

-- 
William
