Hi! It's been almost 2 weeks since I installed the patch, and there have been no segfaults since then. It seems that the problem is fixed now. Thank you!

2018-03-19 23:16 GMT+03:00 William Dauchy <w.dau...@criteo.com>:

> On Mon, Mar 19, 2018 at 08:41:16PM +0100, Willy Tarreau wrote:
> > For me, "experimental" simply means "we did our best to ensure it works
> > but we're realistic and know that bug-free doesn't exist, so a risk remains
> > that a bug will be hard enough to fix so as to force you to disable the
> > feature for the time it takes to fix it". This issue between threads and
> > queue is one such example. Some of the bugs faced on H2 requiring some
> > heavy changes were other examples. But overall we know these features
> > are highly demanded and are committed to make them work fine :-)
>
> you are right, we probably magnified in our head the different issues we
> had related to this.
>
> > I'm still interested in knowing if you find crazy last percentile values.
> > We've had that a very long time ago (version 1.3 or so) when some pending
> > conns were accidentally skipped, so I know how queues can amplify small
> > issues. The only real risk here in my opinion is that the sync point was
> > only used for health checks till now so it was running at low loads and
> > if it had any issue, it would likely have remained unnoticed. But the code
> > is small enough to be audited, and after re-reading it this afternoon I
> > found it fine.
>
> will do, migrating some low latency applications is more mid/longterm but
> will see how the first results look during the preparation tests.
>
> > If you want to run a quick test with epoll, just apply this dirty hack :
> >
> > diff --git a/src/ev_epoll.c b/src/ev_epoll.c
> > index b98ca8c..7bafd16 100644
> > --- a/src/ev_epoll.c
> > +++ b/src/ev_epoll.c
> > @@ -116,7 +116,9 @@ REGPRM2 static void _do_poll(struct poller *p, int exp)
> >         fd_nbupdt = 0;
> >
> >         /* compute the epoll_wait() timeout */
> > -       if (!exp)
> > +       if (1)
> > +               wait_time = 0;
> > +       else if (!exp)
> >                 wait_time = MAX_DELAY_MS;
> >         else if (tick_is_expired(exp, now_ms)) {
> >                 activity[tid].poll_exp++;
> >
> > Please note that as is, it's suboptimal because it will increase the
> > contention in other places, causing the performance to be a bit lower in
> > certain situations. I do have some experimental code to loop on epoll
> > instead but it's not completely stable yet. We can exchange on this later
> > if you want. But feel free to apply this to your latency tests.
>
> thanks a lot, will give a try!
>
> --
> William
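
For anyone reading the archive later: the hack quoted above forces the epoll_wait() timeout to 0, so the poller returns immediately instead of sleeping until an event or a timer. Below is a minimal, Linux-only sketch of that behaviour. It is not HAProxy code: `compute_wait_time()`, `poll_pipe_once()` and `SKETCH_MAX_DELAY_MS` are hypothetical names made up for illustration, and the timeout logic only roughly mirrors the branch structure visible in the diff.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical cap for this sketch; not HAProxy's actual MAX_DELAY_MS value. */
#define SKETCH_MAX_DELAY_MS 1000

/*
 * Hypothetical helper: choose the epoll_wait() timeout in milliseconds.
 * ms_to_next_timer < 0 means "no timer armed".
 * With busy_poll set, the result is always 0, like the "if (1)" hack.
 */
static int compute_wait_time(int busy_poll, int ms_to_next_timer)
{
    if (busy_poll)
        return 0;
    if (ms_to_next_timer < 0 || ms_to_next_timer > SKETCH_MAX_DELAY_MS)
        return SKETCH_MAX_DELAY_MS;
    return ms_to_next_timer;
}

/*
 * Create a pipe, optionally make its read end readable, and call
 * epoll_wait() once. Returns the number of ready events, or -1 on error.
 */
static int poll_pipe_once(int make_readable, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event out;
    int fds[2];
    int epfd;
    int ready = -1;

    assert(timeout_ms >= 0); /* the sketch never blocks forever */

    if (pipe(fds) < 0)
        return -1;

    epfd = epoll_create1(0);
    if (epfd < 0)
        goto out_pipe;

    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0)
        goto out_epoll;

    if (make_readable && write(fds[1], "x", 1) != 1)
        goto out_epoll;

    /* timeout 0: report what is ready right now and return immediately */
    ready = epoll_wait(epfd, &out, 1, timeout_ms);

out_epoll:
    close(epfd);
out_pipe:
    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

With a timeout of 0, `poll_pipe_once(0, 0)` comes back at once with no events, so a loop around it would spin instead of sleeping. That is presumably the trade-off behind the remark that the hack is suboptimal.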