[ 
https://issues.apache.org/jira/browse/SOLR-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17001078#comment-17001078
 ] 

Chris M. Hostetter commented on SOLR-13778:
-------------------------------------------

{quote}Going back to failures in Solr tests: I think the reason is that we 
shutdown jetty in the middle of the test but then reuse the same client that 
was previously connected to an existing instance. If it's an SSL connection 
then there may be SSL comms flying around in addition to user messages and if 
they're issued on a closed socket connection they trigger this enigmatic recv 
failed error.

I think the client should be reinstantiated (or at least any existing 
connections dropped) for the tests to work reliably. ...
{quote}
Interesting ... but taking a step back, this isn't just about these tests and 
the "test clients" talking to the "test solr nodes", so we shouldn't just 
re-instantiate all "test clients" right after any call to {{jetty.stop()}} ... 
I believe we also see these exceptions when "test solr nodeA" talks to "test 
solr nodeB" (although I suspect you are correct that this is also only after 
"nodeB" has been stopped/started) ... and IIUC "real users" could see these 
errors on Windows as well (because this seems like something that could happen 
to any SolrJ users running (Cloud|Http)SolrClient on a Windows box, if it's 
talking to a remote Solr node using SSL that gets restarted).
----
Which seems to raise the question: (How) Can we reliably ensure that 
SolrClients get re-instantiated (or have existing connections dropped) if the 
"remote" server is restarted?

Could/Should we make SolrHttpRequestRetryHandler close & re-open any existing 
connections (prior to retry) if there was a Socket/SSL Exception?
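
To make that second question concrete, here is a rough sketch of the "drop connections, then retry" idea in plain Java. It is not the SolrHttpRequestRetryHandler code and uses no Solr or HttpClient APIs: the class name {{ResetOnSocketErrorRetry}}, the {{Request}} interface, and the {{dropConnections}} hook are all hypothetical names made up for illustration. In a real client the hook would have to evict or close whatever pooled connections the underlying HTTP client holds.

```java
import java.io.IOException;
import java.net.SocketException;
import javax.net.ssl.SSLException;

/** Hypothetical helper for illustration only; not part of Solr or SolrJ. */
final class ResetOnSocketErrorRetry {

    /** A request that may fail with an I/O error (hypothetical interface). */
    interface Request<T> {
        T execute() throws IOException;
    }

    private ResetOnSocketErrorRetry() {}

    /** True if a SocketException or SSLException appears anywhere in the cause chain. */
    static boolean isConnectionLevelFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketException || c instanceof SSLException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the request. On a Socket/SSL failure, calls dropConnections
     * (assumed to discard stale connections) and retries, up to maxRetries
     * times. Any other IOException is rethrown without a retry.
     */
    static <T> T execute(Request<T> request, Runnable dropConnections, int maxRetries)
            throws IOException {
        for (int attempt = 0; ; attempt++) {
            try {
                return request.execute();
            } catch (IOException e) {
                if (attempt >= maxRetries || !isConnectionLevelFailure(e)) {
                    throw e;
                }
                // Discard existing connections so the retry opens a fresh one.
                dropConnections.run();
            }
        }
    }
}
```

A caller would wrap each request in {{ResetOnSocketErrorRetry.execute(...)}} and pass a hook that resets its connection pool. Note that a blind retry like this is only safe for requests that are idempotent, so a real handler would need to take that into account as well.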

> Windows JDK SSL Test Failure trend: SSLException: Software caused connection 
> abort: recv failed
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13778
>                 URL: https://issues.apache.org/jira/browse/SOLR-13778
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Priority: Major
>         Attachments: dumps-LegacyCloud.zip, logs-2019-12-12-1.zip, 
> recv-multiple-2019-12-18.zip
>
>
> Now that Uwe's jenkins build has been correctly reporting its build results 
> for my [automated 
> reports|http://fucit.org/solr-jenkins-reports/failure-report.html] to pick 
> up, I've noticed a pattern of failures that indicate a definite problem with 
> using SSL on Windows (even with java 11.0.4).
> The symptomatic stack traces all contain...
> {noformat}
> ...
>    [junit4]    > Caused by: javax.net.ssl.SSLException: Software caused connection abort: recv failed
>    [junit4]    >        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
> ...
>    [junit4]    > Caused by: java.net.SocketException: Software caused connection abort: recv failed
>    [junit4]    >        at java.base/java.net.SocketInputStream.socketRead0(Native Method)
> ...
> {noformat}
> I suspect this may be related to 
> [https://bugs.openjdk.java.net/browse/JDK-8209333] but I have no concrete 
> evidence to back this up.
> I'll post some details of my analysis in comments...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)