On Fri, Jan 10, 2014 at 10:47:51AM +0100, Sander Klein wrote:
> Heyz,
> 
> On 10.01.2014 09:14, Willy Tarreau wrote:
> >Hi Sander,
> >
> >On Fri, Jan 10, 2014 at 08:57:18AM +0100, Sander Klein wrote:
> >>Hi,
> >>
> >>I'm sorry you haven't heard from me yet. But I didn't have time to look
> >>into this issue. Hope to do it this weekend.
> >
> >Don't rush on it, Baptiste has reported to me a reproducible issue on his
> >lab which seems to match your problem, and which is caused by the way the
> >polling works right now (which is the reason why I want to address this
> >before the release). I'm currently working on it. The fix is far from being
> >trivial, but necessary.
> 
> Do you still want me to bisect? Or should I wait? If you think the 
> problem is the same I'll just test the fix :-)

Don't waste your time bisecting. I'll ask you to test the patch
instead. The problem I've seen is always the same: the SSL layer may
have read all pending data from a socket without delivering everything
to the buffer, for example for lack of space. Once the buffer is
flushed and we want to read what follows, we re-enable polling. But
there's no more data pending on the socket, so poll() does not wake up
the reader anymore. It happened to work thanks to the speculative
polling we've been using a lot, but that polling was sometimes
unintentional and caused some syscalls to be attempted for no reason
(resulting in many EAGAIN in recvfrom). Fixing this also breaks what
made SSL happen to work :-(
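
To illustrate the stall described above, here is a minimal sketch in
Python. It is not haproxy code: BufferedLayer is a hypothetical stand-in
for an SSL layer that reads ahead from the socket, and the 40-byte "room"
is an arbitrary assumption standing in for a nearly full buffer. It only
shows that once the layer has drained the socket, polling the FD reports
nothing even though data is still waiting above it.

```python
import select
import socket


class BufferedLayer:
    """Hypothetical stand-in for an SSL layer that reads ahead from a socket."""

    def __init__(self, sock):
        self.sock = sock
        self.pending = b""  # data already read from the socket, not yet delivered

    def recv_into_room(self, room):
        # Drain the socket only when nothing is held back, then deliver
        # at most `room` bytes (the free space in the caller's buffer).
        if not self.pending:
            self.pending = self.sock.recv(65536)
        out, self.pending = self.pending[:room], self.pending[room:]
        return out


def socket_readable(sock):
    # Zero-timeout poll of the FD: does the kernel still hold data for us?
    readable, _, _ = select.select([sock], [], [], 0)
    return bool(readable)


def demo():
    a, b = socket.socketpair()
    a.sendall(b"x" * 100)

    layer = BufferedLayer(b)
    first = layer.recv_into_room(40)  # buffer had room for only 40 bytes

    # The socket is now empty but the layer still holds data, so a reader
    # that waits on poll alone would never be woken up.
    stalled = (not socket_readable(b)) and len(layer.pending) > 0

    # Checking the layer's own pending data before polling avoids the stall.
    rest = layer.recv_into_room(1000)

    a.close()
    b.close()
    return len(first), stalled, len(rest)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that readiness has to be tracked above
the FD as well, since the kernel no longer knows about the bytes held in
the layer.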

I've been wanting for about a year to change the polling so that it
includes the status of the FD (EAGAIN or not). But since it's complex
and there were not many reasons for doing it, I preferred to delay.
Now is a good opportunity given that I broke it several times in the
last few weeks.

Hoping this clarifies things a little bit.

Willy

