[
https://issues.apache.org/jira/browse/SOLR-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-9068:
---------------------------
Attachment: SOLR-9068.patch
bq. If this works I see no problem with the patch, because it is used during
tests only. Right?
Correct, this is only a question of what SecureRandom source we use during
tests (the idea being to prevent low-entropy jenkins machines from blocking
when randomizing SSL testing).
bq. ... and for now disable the tests with assumeFalse(Constants.SUN_OS).
While this one test in particular seems to always trigger some Padding related
problem in the SSLEngine, the underlying problem is something that *could*
affect any SSL test (note that even with this test, the jenkins failures have
*different* Padding related Exceptions between master and 6x, presumably because
some small amount of information in the Solr request/response payload is
slightly different between branches?) ... so if we do ultimately need to have
special case logic when {{Constants.SUN_OS}} is true, it shouldn't be specific
to this test class/method, it should be part of the {{SSLTestConfig}} so we
don't get confusing failures from any other test that might randomize SSL.
I've uploaded a new quick & dirty patch that uses a {{java.util.Random}} inside
our {{NullSecureRandom}}.
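For readers without the patch handy, here is a minimal sketch of the general idea, not the patch itself: a test-only SecureRandom whose SPI delegates to a {{java.util.Random}}, so it never waits for entropy and never returns long runs of null bytes. The class name {{PseudoRandomSecureRandom}} and the seed parameter are hypothetical and chosen only for illustration. This is not cryptographically secure and should never be used outside tests.

```java
import java.security.SecureRandom;
import java.security.SecureRandomSpi;
import java.util.Random;

/** Hypothetical test-only SecureRandom: never blocks, NOT cryptographically secure. */
final class PseudoRandomSecureRandom extends SecureRandom {

  /** SPI that delegates to a plain java.util.Random instead of an entropy source. */
  private static final class Spi extends SecureRandomSpi {
    private final Random random;

    Spi(long seed) {
      this.random = new Random(seed);
    }

    @Override
    protected void engineSetSeed(byte[] seed) {
      // Ignore external seeds; the fixed test seed keeps runs reproducible.
    }

    @Override
    protected void engineNextBytes(byte[] bytes) {
      // Fill with pseudo-random (non-null) bytes without touching system entropy.
      random.nextBytes(bytes);
    }

    @Override
    protected byte[] engineGenerateSeed(int numBytes) {
      byte[] seed = new byte[numBytes];
      random.nextBytes(seed);
      return seed;
    }
  }

  PseudoRandomSecureRandom(long seed) {
    // The protected SecureRandom(SecureRandomSpi, Provider) constructor is used
    // with a null provider, since this SPI is not registered with any Provider.
    super(new Spi(seed), null);
  }
}
```

Two instances built with the same seed produce the same byte stream, which keeps a failing test run reproducible.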
[~thetaphi]: can you please try this new patch out?
* If this patch solves the problem I can come up with a better final fix that
includes 2 different "mock" SecureRandom instances and picks which one we use in
SSLTestConfig depending on the {{Constants.SUN_OS}}.
* If this patch doesn't solve the problem then there is something more
fundamentally odd going on on Solaris (maybe our custom SecureRandomSpi is
tickling some assumption in the JVM?) and I'll give up and just change
SSLTestConfig to simply use the platform default SecureRandom on that OS.
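A rough sketch of what the second fallback could look like, assuming the OS check mirrors the {{os.name}} prefix test behind {{Constants.SUN_OS}}. The class {{TestSecureRandomChooser}} and its methods are hypothetical names and are not part of SSLTestConfig.

```java
import java.security.SecureRandom;

/** Hypothetical helper: picks the SecureRandom a test SSL config should use. */
final class TestSecureRandomChooser {

  private TestSecureRandomChooser() {}

  /** Assumption: Solaris is detected by an "os.name" value starting with "SunOS". */
  static boolean isSunOs(String osName) {
    return osName != null && osName.startsWith("SunOS");
  }

  /**
   * Returns the platform default SecureRandom on Solaris and the supplied
   * non-blocking mock everywhere else.
   */
  static SecureRandom choose(String osName, SecureRandom mock) {
    return isSunOs(osName) ? new SecureRandom() : mock;
  }

  /** Convenience overload that reads the OS name of the running JVM. */
  static SecureRandom choose(SecureRandom mock) {
    return choose(System.getProperty("os.name"), mock);
  }
}
```

Keeping the decision in one place like this is the point made above: any test that randomizes SSL gets the same behavior, instead of one test class carrying its own Solaris special case.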
bq. If you like I can give you an account on the Solaris machine to try
yourself (keep in mind, it has neither GIT nor ANT installed, totally blank -
all is provided by Jenkins).
No thank you -- that sounds terrible. This is/should-be the last patch I'll
ask you to manually try on Solaris.
bq. Maybe we should open a bug report at Oracle ...
Probably, but from what I've seen you have to deal with in the past, I don't
have the time or patience to try and deal with their process. If you want to
file one by all means go ahead -- but you might want to wait until we figure
out if using {{java.util.Random}} under the covers works as a workaround, or
if there is just some fundamental bug when using custom SecureRandom instances.
> Solaris SSL test failures when using NullSecureRandom?
> ------------------------------------------------------
>
> Key: SOLR-9068
> URL: https://issues.apache.org/jira/browse/SOLR-9068
> Project: Solr
> Issue Type: Sub-task
> Reporter: Hoss Man
> Fix For: 4.9, master
>
> Attachments: SOLR-9068.Lucene-Solr-6.x-Solaris_110.log,
> SOLR-9068.Lucene-Solr-master-Solaris_558.log, SOLR-9068.patch, SOLR-9068.patch
>
>
> In parent issue SOLR-5776, NullSecureRandom was introduced and SSLTestConfig
> was refactored so that both client & server would use it to prevent blocked
> threads waiting for entropy.
> Since those commits to master & branch_6x, both Solaris jenkins builds have
> seen failures at the same spots in
> TestMiniSolrCloudClusterSSL.testSslAndNoClientAuth - and looking at the logs
> the root cause appears to be intranode communication failures due to
> "javax.crypto.BadPaddingException"
> Perhaps the Solaris SSL impl has bugs in its padding code that are tickled
> when the SecureRandom instance returns long strings of null bytes?