[ 
https://issues.apache.org/jira/browse/SOLR-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-9068:
---------------------------
    Attachment: SOLR-9068.patch


bq. If this works I see no problem with the patch, because it is used during 
tests only. Right?

Correct, this is only a question of which SecureRandom source we use during 
tests (the idea being to prevent low-entropy Jenkins machines from blocking 
when randomizing SSL testing).

bq. ... and for now disable the tests with assumeFalse(Constants.SUN_OS).

While this one test in particular seems to always trigger some padding-related 
problem in the SSLEngine, the underlying problem is something that *could* 
affect any SSL test (note that even with this test, the Jenkins failures have 
*different* padding-related exceptions between master and 6x, presumably because 
some small amount of information in the Solr request/response payload is 
slightly different between branches?) ... so if we do ultimately need 
special-case logic when {{Constants.SUN_OS}} is true, it shouldn't be specific 
to this test class/method; it should be part of the {{SSLTestConfig}} so we 
don't get confusing failures from any other test that might randomize SSL.

I've uploaded a new quick & dirty patch that uses a {{java.util.Random}} inside 
our {{NullSecureRandom}}.
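
For readers without the attachment, here is a minimal sketch of the general idea, not the patch itself. It assumes a custom {{SecureRandomSpi}} that delegates to a seeded {{java.util.Random}} instead of returning null bytes. The class name {{PseudoSecureRandom}} and the seed parameter are hypothetical and exist only for illustration. Something like this is only appropriate in tests, since the output is entirely predictable.

```java
import java.security.SecureRandom;
import java.security.SecureRandomSpi;
import java.util.Random;

/** Hypothetical test-only SecureRandom that never blocks waiting for entropy. */
class PseudoSecureRandom extends SecureRandom {

    /** SPI that serves every request from a plain java.util.Random. */
    private static final class Spi extends SecureRandomSpi {
        private final Random random;

        Spi(long seed) {
            this.random = new Random(seed);
        }

        @Override
        protected void engineSetSeed(byte[] seed) {
            // Ignore external seeds so the byte stream stays reproducible.
        }

        @Override
        protected void engineNextBytes(byte[] bytes) {
            // Non-null, pseudo-random bytes; no system entropy is consumed.
            random.nextBytes(bytes);
        }

        @Override
        protected byte[] engineGenerateSeed(int numBytes) {
            byte[] seed = new byte[numBytes];
            random.nextBytes(seed);
            return seed;
        }
    }

    PseudoSecureRandom(long seed) {
        // Protected SecureRandom constructor; no Provider is registered.
        super(new Spi(seed), null);
    }
}
```

Two instances built with the same seed produce the same byte sequence, which keeps a randomized test run reproducible.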

[~thetaphi]: can you please try this new patch out?

* If this patch solves the problem I can come up with a better final fix that 
includes two different "mock" SecureRandom instances and picks which one we use 
in SSLTestConfig depending on {{Constants.SUN_OS}}.
* If this patch doesn't solve the problem then there is something more 
fundamentally odd going on on Solaris (maybe our custom SecureRandomSpi is 
tickling some assumption in the JVM?) and I'll give up and just change 
SSLTestConfig to simply use the platform default SecureRandom on that OS.
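
Either outcome boils down to a per-OS choice of SecureRandom in one central place. A rough sketch of that selection follows, assuming the Solaris check amounts to the {{os.name}} system property starting with "SunOS". The {{TestSecureRandomChooser}} class and its method names are hypothetical and are not part of SSLTestConfig.

```java
import java.security.SecureRandom;
import java.util.function.Supplier;

/** Hypothetical helper that centralizes the per-OS choice of SecureRandom. */
final class TestSecureRandomChooser {
    private TestSecureRandomChooser() {}

    /** Assumed stand-in for a SUN_OS-style flag, derived from the os.name value. */
    static boolean isSunOs(String osName) {
        return osName != null && osName.startsWith("SunOS");
    }

    /**
     * Returns the Solaris-specific instance on SunOS and the regular mock
     * everywhere else. Suppliers are used so only the chosen one is created.
     */
    static SecureRandom choose(String osName,
                               Supplier<SecureRandom> regularMock,
                               Supplier<SecureRandom> solarisChoice) {
        return isSunOs(osName) ? solarisChoice.get() : regularMock.get();
    }
}
```

A caller would pass {{System.getProperty("os.name")}} as the first argument. For the second bullet, the Solaris supplier could simply be {{SecureRandom::new}}, i.e. the platform default.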

bq. If you like I can give you an account on the Solaris machine to try 
yourself (keep in mind, it has neither GIT nor ANT installed, totally blank - 
all is provided by Jenkins).

No thank you -- that sounds terrible.  This is/should-be the last patch I'll 
ask you to manually try on Solaris.

bq. Maybe we should open a bug report at Oracle ...

Probably, but from what I've seen you have to deal with in the past, I don't 
have the time or patience to try and deal with their process.  If you want to 
file one by all means go ahead -- but you might want to wait until we figure 
out if using {{java.util.Random}} under the covers works as a workaround, or if 
there is just some fundamental bug when using custom SecureRandom instances.




> Solaris SSL test failures when using NullSecureRandom?
> ------------------------------------------------------
>
>                 Key: SOLR-9068
>                 URL: https://issues.apache.org/jira/browse/SOLR-9068
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>             Fix For: 4.9, master
>
>         Attachments: SOLR-9068.Lucene-Solr-6.x-Solaris_110.log, 
> SOLR-9068.Lucene-Solr-master-Solaris_558.log, SOLR-9068.patch, SOLR-9068.patch
>
>
> In parent issue SOLR-5776, NullSecureRandom was introduced and SSLTestConfig 
> was refactored so that both client & server would use it to prevent blocked 
> threads waiting for entropy.
> Since those commits to master & branch_6x, both Solaris jenkins builds have 
> seen failures at the same spots in 
> TestMiniSolrCloudClusterSSL.testSslAndNoClientAuth - and looking at the logs 
> the root cause appears to be intranode communication failures due to 
> "javax.crypto.BadPaddingException"
> Perhaps the Solaris SSL impl has bugs in its padding code that are tickled 
> when the SecureRandom instance returns long strings of null bytes?



