On Fri, Oct 5, 2012 at 4:53 PM, Willy Tarreau <[email protected]> wrote:
> Hi Jesper,
>
> On Fri, Oct 05, 2012 at 02:14:01PM -0700, Jesper Noehr wrote:
>> Willy,
>>
>> Thanks for your analysis and reply. Greatly appreciated.
>>
>> I adjusted the "mss" in our bind, attempting values between your
>> suggested 1380 and 1460 (got that from elsewhere). Unfortunately, the
>> problem persists.
>
> OK
>
>> Most of the failures we've seen so far, have not been from browsers,
>> but from Git and Mercurial (this is all for bitbucket.org). However,
>> I've noticed that even browsers fail:
>>
>> Oct 5 16:05:30 bb10 haproxy[29642]: 108.235.116.212:51558
>> [05/Oct/2012:16:05:28.542] ssl servers-ssl/bb12 1746/0/0/-1/+2414 -1
>> +0 - - CH-- 398/263/9/0/0 0/0 "POST /redacted/sf/issue-attachment/196/
>> HTTP/1.1"
>>
>> We are binding to 127.0.0.1, as we are sitting behind stud, an SSL/TLS
>> terminator.
>
> Ah that was very important information then because it means that
> haproxy is not the client's peer and that setting MSS is useless.
> Also, are you running on a recent Stud ? I remind that one of my
> coworkers (Emeric) found some bugs looking exactly like this some
> time ago, IIRC it was sometimes possible to have the SSL handshake
> fail if there was some activity on the other end of the socket,
> though I don't remember precisely.

From what I can tell in stud's repository on github
(https://github.com/bumptech/stud), there have been no modifications
to anything meaningful for at least 3 months. Our version is newer
than that.

>> I realize 1.5-dev12 has SSL support, but this is quite
>> recent, so we're using the stud->haproxy setup still.
>
> I understand :-) There are some brave users anyway who helped us spot
> a number of issues, but we're not finding that many bugs anymore.

We'd love to have SSL termination inside haproxy for all our needs;
fewer moving parts make these things a lot easier!

>> I'd be more than happy to provide as much information as I can on this issue.
>> Any other ideas, or indications of what might be wrong?
>
> It's not easy because if it's a production system, I suppose you can't
> easily capture both sides of stud for a long time in order to analyse
> the SSL exchange for a faulty connection. At least ensure that your
> version is sane. Stud is an excellent product, but it's also very
> recent and needs to frequently follow updates.

No, it's very difficult. I've been at this for well over a week now,
having been all over the board in terms of ideas and theories as to
why this fails.

I can reproduce this on a fairly consistent basis on a Windows laptop
we have sitting around. It fails less often on linux/OSX, if at all.
We haven't been able to reproduce it on those systems, and we haven't
had any reports from customers either, although they could've just
never reported it.

Is there anything else you can think of? I'm almost willing to try
anything at this point.

Jesper

