Looks like just attempting to cleanly shut down worked fine on an idle system (~1000 servers, not all Java). I didn't notice much of an increase in new sessions after the initial connections, and I didn't notice any TIME_WAIT sockets lingering either.
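For anyone following along, here is a rough plain-socket sketch of the ordering the patch aims for: drain, shut down the write side, then close. It is not HAProxy code. `close_after_shutw` is a made-up helper name, and the sketch assumes that in the TLS case the write-side shutdown is the step that gives the stack a chance to send its close_notify alert before the socket goes away.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not HAProxy code: mirrors the
 * drain -> shutw -> close ordering from the patch on a plain
 * stream socket. Returns 0 if the write-side shutdown succeeded,
 * -1 otherwise. The descriptor is closed in both cases. */
static int close_after_shutw(int fd)
{
    char buf[512];
    int ret;

    assert(fd >= 0);

    /* Discard whatever the peer already sent, without blocking. */
    while (recv(fd, buf, sizeof(buf), MSG_DONTWAIT) > 0)
        ;

    /* Announce an orderly end of our output before tearing down. */
    ret = shutdown(fd, SHUT_WR);

    close(fd);
    return ret == 0 ? 0 : -1;
}
```

On a connected pair of sockets, the peer should see end-of-stream on its next read after this returns. As Willy notes in the quoted message, this kind of change likely only raises the odds of a clean close rather than guaranteeing one.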
On Tue, Mar 7, 2017 at 10:24 PM, Willy Tarreau <w...@1wt.eu> wrote:

> On Tue, Mar 07, 2017 at 12:27:38PM -0800, Steven Davidovitz wrote:
> > I tried on 1.8dev and was still able to reproduce the problem.
> > It appears Java is strictly interpreting the TLS spec with regards to
> > "Closure Alerts".
> > The section specifies that:
> >
> >    Note that as of TLS 1.1, failure to properly close a connection no longer
> >    requires that a session not be resumed. This is a change from TLS 1.0
> >    to conform with widespread implementation practice.
> > (https://tools.ietf.org/html/rfc5246#section-7.2.1)
> >
> > Java does not allow session resumption in that case.
>
> And the fact that it's the removal of a requirement doesn't mean one
> has to abide by this new rule, so Java is right to proceed like this
> eventhough it clearly is suboptimal.
>
> > I haven't been able to reproduce the issue
> > by running requests through the actual proxy (i.e. not check-related), but
> > it certainly might be possible
> > to trigger this case. I have a simple test case here:
> > https://github.com/steved/haproxy-java-ssl-check
>
> I think that production traffic takes more time to be processed and
> leaves more time for the TLS response to be sent before the close is
> performed.
>
> > I thought earlier that changing shutw_hard to shutw works,
>
> It's a matter of time race as usual :-/
>
> > but now it appears I need this patch:
> >
> > diff --git a/src/checks.c b/src/checks.c
> > index 49bd886b..75aa222b 100644
> > --- a/src/checks.c
> > +++ b/src/checks.c
> > @@ -2168,6 +2168,7 @@ static struct task *process_chk_conn(struct task *t)
> >                  * server state to be suddenly changed.
> >                  */
> >                 conn_sock_drain(conn);
> > +               conn_data_shutw(conn);
> >                 conn_force_close(conn);
> >         }
>
> I suspect that you'll accumulate TIME_WAIT sockets here on haproxy but
> I could be wrong. I'd like you to check on an idle system. If there's
> no such thing and if you find that this reliably solves your problem
> I'm fine with taking this one, though I think that it only increases
> the likeliness to let the connection close cleanly. But if you switch
> from 5% of clean close to 95% it's already 20 times less key
> computations :-)
>
> Thanks,
> Willy
>
0001-BUG-MINOR-attempt-clean-shutw-for-check.patch
Description: Binary data