[ 
https://issues.apache.org/jira/browse/SOLR-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-12990:
----------------------------
    Description: 
Ever since the policeman's Jenkins instance started running tests on Java11, 
we've seen an abnormally high number of test failures that seem to be related 
to randomized SSL.

I've been investigating these logs and trying to reproduce the failures, and 
have made the following observations:

* In all the policeman Jenkins logs I looked at, these SSL-related failures 
only occur when the RandomizeSSL annotation picks {{ssl=true clientAuth=false}}
** NOTE: this doesn't mean that every test using {{ssl=true clientAuth=false}} 
failed -- since our build system only prints test output when tests fail, it's 
possible/probable (based on how often the value should be picked) that many 
tests randomly use {{ssl=true clientAuth=false}} and pass
* the failures usually showed an exception that was {{Caused by: 
javax.net.ssl.SSLException: Received fatal alert: internal_error}} in the logs.
* when I attempted to reproduce some of these failing seeds on my own machine 
using Java11, I could not _reliably_ reproduce these failures with the same 
seeds
** beasting could _occasionally_ reproduce the failures, in roughly 1 of 10 runs
** suggesting that system load/timing contributed to these SSL-related failures
* picking one particularly trivial test (DistributedDebugComponentTest)
** with {{javax.net.debug=all}} enabled, I was able to see more details...
*** notably: {{Fatal (INTERNAL_ERROR): Session has no PSK}}
** when I patched the test to force {{ssl=true clientAuth=true}} I was unable 
to trigger any failures with the same seed.
* on the jira/http2 branch I was unable to reproduce these failures at all, 
without any patching
** similar to SOLR-12988, this may be because of bug fixes in the upgraded 
Jetty.
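
To put the "roughly 1 in 10" beasting rate in perspective, here is a rough 
sketch of the arithmetic behind "how many beast iterations are enough". It 
assumes each run fails independently with the same probability, which is only 
an approximation if system load/timing really is a factor. The function names 
are illustrative and not part of any Solr tooling.

```python
import math


def runs_needed(p_fail, confidence=0.95):
    """Smallest n such that at least one failure shows up in n runs
    with probability >= confidence, i.e. 1 - (1 - p_fail)**n >= confidence.

    Assumes independent runs with a constant failure probability.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


def chance_all_pass(p_fail, n):
    """Probability that n independent runs all pass by luck alone."""
    return (1.0 - p_fail) ** n
```

Under those assumptions, {{runs_needed(0.1)}} gives the iteration count needed 
for a 95% chance of seeing at least one failure, and {{chance_all_pass(0.1, n)}} 
indicates how much weight a clean beast of n runs (for example with 
{{clientAuth=true}} forced, or on the jira/http2 branch) can carry.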

----

Filing this issue largely for tracking purposes, although we may also want to 
use it for discussions/considerations of other backports/fixes to 7x
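
For tracking purposes, the two signatures quoted above could be tallied across 
saved Jenkins or {{javax.net.debug}} output with something like the sketch 
below. The helper name and the idea of feeding it whole log files are 
assumptions for illustration; only the two search strings come from the 
observations in this issue.

```python
# Signature strings quoted in this issue's description.
SIGNATURES = {
    "internal_error": (
        "javax.net.ssl.SSLException: Received fatal alert: internal_error"
    ),
    "no_psk": "Session has no PSK",
}


def count_signatures(log_text):
    """Count the lines of log_text that contain each known signature.

    Returns a dict keyed by the short names in SIGNATURES; a signature
    that never appears is reported with a count of 0.
    """
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[name] += 1
    return counts
```

Reading a log file into a string and passing it to {{count_signatures}} gives a 
quick indication of whether a given failure matches the pattern described here 
or is an unrelated failure.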

  was:
Ever since the policeman's Jenkins instance started running tests on Java11, 
we've seen an abnormally high number of test failures that seem to be related 
to randomized SSL.

I've been investigating these logs and trying to reproduce the failures, and 
have made the following observations:

* In all the policeman Jenkins logs I looked at, these SSL-related failures 
only occur when the RandomizeSSL annotation picks {{ssl=true clientAuth=false}}
** NOTE: this doesn't mean that every test using {{ssl=true clientAuth=false}} 
failed -- since our build system only prints test output when tests fail, it's 
possible/probable (based on how often the value should be picked) that many 
tests randomly use {{ssl=true clientAuth=false}} and pass
* the failures usually showed an exception that was {{Caused by: 
javax.net.ssl.SSLException: Received fatal alert: internal_error}} in the logs.
* when I attempted to reproduce some of these failing seeds on my own machine 
using Java11, I could not _reliably_ reproduce these failures with the same 
seeds
** beasting could _occasionally_ reproduce the failures, in roughly 1 of 10 runs
** suggesting that system load/timing contributed to these SSL-related failures
* picking one particularly trivial test (DistributedDebugComponentTest)
** with {{javax.net.debug=all}} enabled, I was able to see more details...
*** notably: {{Fatal (INTERNAL_ERROR): Session has no PSK}}
** when I patched the test to force {{ssl=true clientAuth=true}} I was unable 
to trigger any failures with the same seed.
* on the jira/http2 branch I was unable to reproduce these failures at all, 
without any patching

----

Filing this issue largely for tracking purposes, although we may also want to 
use it for discussions/considerations of other backports/fixes to 7x


> High test failure rate on Java11/12 when (randomized) ssl=true 
> clientAuth=false
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-12990
>                 URL: https://issues.apache.org/jira/browse/SOLR-12990
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Priority: Major
>              Labels: Java11, Java12
>         Attachments: DistributedDebugComponentTest.ssl.debug.log.txt, 
> enable.ssl.debug.patch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
