On 17/04/18 10:14, Richard Tearle wrote:
> On 16 April 2018 at 22:04, Mark Thomas <ma...@apache.org> wrote:

<snip/>

>> I've started to look at them. I don't have any firm conclusions yet. I
>> have noticed that the problem occurs after a connection is made to the
>> service from localhost rather than the remote IP that is making the
>> other requests. The localhost client does not present a certificate.
>>
>> My working theory (so chances are it is completely wrong) is that the
>> missing certificate in the request from localhost puts the OpenSSL
>> engine into an error state that is not correctly handled by Tomcat
>> causing the subsequent request to fail.
>>
>> I've also noticed that the debug level log messages consistently report 0
>> bytes being read which looks wrong but is probably a separate (minor) issue.

The above messages are correct but misleading. I'm planning to add
additional debug logging which should clarify things.

<snip/>

> Ah that rings a bell.
> 
> Our containers have a simple health check, which simply runs
> 
> curl --connect-timeout 5 --max-time 20 -k -s -S --stderr -\
>     https://localhost:${TOMCATS_PORT}/ |\
>     grep -q "NSS: client certificate not found" || exit 1
> 
> just to make sure the ESB is responding, with something we expect.
> These are set to run at an interval of every 2m30s. The full parameters
> in the docker-compose[1] file are:
> 
>     healthcheck:
>       test: ["CMD", "/usr/local/bin/healthcheck.sh"]
>       interval: 2m30s
>       timeout: 10s
>       retries: 3
>       start_period: 20s
> 
> I've also disabled the health check on the ESB container, and my tests
> ran through for an hour without a connection closed error.

That is good news. That is a strong indicator that we are on the right
track. It also explains why I could not reproduce the problem with your
test case. And finally, it is another example of the debug logging added
to the I/O layer proving worthwhile.
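
For anyone following the thread, here is a rough sketch of what the health
check amounts to. It assumes healthcheck.sh does little more than wrap the
curl command quoted above; the function names are hypothetical and are not
taken from the actual script.

```shell
#!/bin/sh
# Sketch only. The function names are hypothetical; TOMCATS_PORT is the
# variable used in the quoted curl command.

# Succeeds (status 0) when the captured curl output contains the error
# that a certificate-less request is expected to produce.
is_expected_response() {
    printf '%s\n' "$1" | grep -q "NSS: client certificate not found"
}

# Makes one request from localhost WITHOUT a client certificate and
# checks the result. curl's stderr is redirected to stdout (--stderr -)
# so that the error text can be inspected.
healthcheck() {
    output=$(curl --connect-timeout 5 --max-time 20 -k -s -S --stderr - \
        "https://localhost:${TOMCATS_PORT}/")
    is_expected_response "$output"
}
```

A wrapper would then call `healthcheck || exit 1`. The point relevant to
this thread is that every run is a TLS connection from localhost that
presents no client certificate.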

Now all we need to do is figure out how to fix this. With the
understanding of what is (probably) going wrong, the problem can be
reproduced with a clean build and the certs we use for unit tests which
makes things a lot easier. I hope to make progress on this today.

Mark


> 
> [1] https://docs.docker.com/compose/compose-file/#healthcheck
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
