On 19/04/18 16:50, Peter@Kreuser-Online wrote:

<snip/>

> Do you mind to share more about the root cause? I’ve followed this mail 
> communication from the start and am  curious. 

Sure.

Tomcat was configured to require CLIENT-CERT auth and the main client
was configured to use this. Occasionally, the main client would see
connection problems when using 8.5.x or later with NIO and the OpenSSL
TLS implementation.

There was a second client that performed health checks on the server.
This client did not use a certificate. The requests it made always
failed but did so with a predictable error message that the health check
looked for.

OpenSSL error states are stored per thread.

Each Java thread is mapped 1-to-1 to an OS thread.

The sequence of events that caused the problem was as follows:

- Health check ran
- TLS connection failed because no client certificate was provided
- OpenSSL set an error state that - depending on the timing of the
  socket closure - was not always cleaned up
- Standard request was received and was handled by the thread that
  previously experienced the error
- Because the error had not been cleaned up, this new connection thought
  the error was meant for it and closed the connection
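
To make the sequence concrete, here is a rough simulation in plain Java.
It is only a sketch: it does not use the real Tomcat or OpenSSL code, and
the names (StaleErrorDemo, ERRORS, handshake, serve) are made up for the
example. A ThreadLocal stands in for the per-thread OpenSSL error state
and a single-thread executor stands in for a reused worker thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StaleErrorDemo {
    // Hypothetical stand-in for OpenSSL's per-thread error state.
    static final ThreadLocal<Deque<String>> ERRORS =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Simulated handshake: a missing client certificate queues an error.
    // The cleanup is deliberately omitted to model the unlucky timing.
    static void handshake(boolean hasClientCert) {
        if (!hasClientCert) {
            ERRORS.get().push("no client certificate");
        }
    }

    // "Do something, then check the error state", with no clearing first.
    static boolean serve(boolean hasClientCert) {
        handshake(hasClientCert);
        return ERRORS.get().isEmpty();
    }

    // Runs the health check and then the main client on the same thread.
    static boolean[] run() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> healthCheck = () -> serve(false);
            Callable<Boolean> mainClient = () -> serve(true);
            boolean first = worker.submit(healthCheck).get();
            boolean second = worker.submit(mainClient).get();
            return new boolean[] {first, second};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] ok = run();
        System.out.println("health check ok: " + ok[0]);
        System.out.println("main client ok:  " + ok[1]);
    }
}
```

In this simulation both requests are reported as failed. The second one
presented a certificate and fails only because it ran on the thread that
still held the health check's error.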

The fix was to find every place where the Tomcat code made a call to
OpenSSL that looked like this:
- Do something via the OpenSSL API
- Check the OpenSSL error state

and change it so it looked like this:
- Clear the OpenSSL error state
- Do something via the OpenSSL API
- Check the OpenSSL error state
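
The before and after patterns can be sketched in plain Java as follows.
Again this is an illustration and not the actual Tomcat change: the
class, the ThreadLocal standing in for the OpenSSL error state, and the
method names are all assumptions made for the example. In the OpenSSL C
API the clearing step corresponds to ERR_clear_error().

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ClearBeforeCall {
    // Hypothetical stand-in for OpenSSL's per-thread error state.
    static final ThreadLocal<Deque<String>> ERRORS =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Returns the most recent error on this thread, or null if none.
    static String lastError() {
        return ERRORS.get().peek();
    }

    // Old pattern: do something, then check the error state.
    // An error left behind by an earlier connection is reported here.
    static String callUnchecked(Runnable operation) {
        operation.run();
        return lastError();
    }

    // New pattern: clear the error state, do something, then check.
    // Only errors raised by this operation can be reported.
    static String callCleared(Runnable operation) {
        ERRORS.get().clear();
        operation.run();
        return lastError();
    }

    public static void main(String[] args) {
        Runnable succeeds = () -> { };
        Runnable fails = () -> ERRORS.get().push("real failure");

        ERRORS.get().push("stale error from earlier connection");
        System.out.println("old pattern: " + callUnchecked(succeeds));
        System.out.println("new pattern: " + callCleared(succeeds));
        System.out.println("new pattern, failing call: " + callCleared(fails));
    }
}
```

With the old pattern the successful operation is blamed for the stale
error. With the new pattern it reports no error, while an operation that
really fails is still detected.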

I also added a TODO for the arguably more complete fix, which is to check
the OpenSSL error state after every call to the OpenSSL API.

> Let me tell you that your endurance on all the tricky issues here is 
> admirable! 
> 
> Thank you for that!

Thank you.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org