[ https://issues.apache.org/jira/browse/SOLR-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-9068:
---------------------------
    Attachment: SOLR-9068.patch

bq. If this works I see no problem with the patch, because it is used during tests only. Right?

Correct, this is only a question of which SecureRandom source we use during tests (the idea being to prevent low-entropy jenkins machines from blocking when randomizing SSL testing).

bq. ... and for now disable the tests with assumeFalse(Constants.SUN_OS).

While this one test in particular seems to always trigger some Padding related problem in the SSLEngine, the underlying problem is something that *could* affect any SSL test (note that even with this test, the jenkins failures have *diff* Padding related Exceptions between master and 6x, presumably because some small amount of information in the Solr request/response payload is slightly diff between branches?) ... so if we do ultimately need special case logic when {{Constants.SUN_OS}} is true, it shouldn't be specific to this test class/method; it should be part of the {{SSLTestConfig}} so we don't get confusing failures from any other test that might randomize SSL.

I've uploaded a new quick & dirty patch that uses a {{java.util.Random}} inside our {{NullSecureRandom}}.

[~thetaphi]: can you please try this new patch out?

* If this patch solves the problem, I can come up with a better final fix that includes 2 diff "mock" SecureRandom instances and picks which one we use in SSLTestConfig depending on {{Constants.SUN_OS}}.
* If this patch doesn't solve the problem, then there is something more fundamentally odd going on on Solaris (maybe our custom SecureRandomSpi is tickling some assumption in the JVM?) and I'll give up and just change SSLTestConfig to simply use the platform default SecureRandom on that OS.

bq. If you like I can give you an account on the Solaris machine to try yourself (keep in mind, it has neither GIT nor ANT installed, totally blank - all is provided by Jenkins).

No thank you -- that sounds terrible.
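
For illustration only, here is a minimal sketch of the general idea of backing a test-only SecureRandom with a {{java.util.Random}} so that it never blocks waiting for entropy. This is not the code in the attached patch: the names {{PseudoSecureRandom}} and {{PseudoSpi}} are hypothetical, and the seeding strategy is an assumption.

```java
import java.security.SecureRandom;
import java.security.SecureRandomSpi;
import java.util.Random;

/**
 * Hypothetical test-only SecureRandom (NOT cryptographically secure).
 * All randomness comes from a seeded java.util.Random, so it never
 * blocks on the OS entropy pool and is repeatable for a given seed.
 */
class PseudoSecureRandom extends SecureRandom {

  /** SPI that delegates every request to a plain java.util.Random. */
  private static final class PseudoSpi extends SecureRandomSpi {
    private final Random delegate;

    PseudoSpi(long seed) {
      this.delegate = new Random(seed);
    }

    @Override
    protected void engineSetSeed(byte[] seed) {
      // Deliberately ignored: the delegate keeps its original seed.
    }

    @Override
    protected void engineNextBytes(byte[] bytes) {
      delegate.nextBytes(bytes);
    }

    @Override
    protected byte[] engineGenerateSeed(int numBytes) {
      byte[] seed = new byte[numBytes];
      delegate.nextBytes(seed);
      return seed;
    }
  }

  PseudoSecureRandom(long seed) {
    // No Provider is associated with this instance (assumption: tests do not need one).
    super(new PseudoSpi(seed), null);
  }
}
```

An instance like this could then be handed to anything that accepts a {{SecureRandom}}, for example as the third argument of {{SSLContext.init(KeyManager[], TrustManager[], SecureRandom)}}. Unlike a source that returns only null bytes, it produces varied output, which is the property being tested as a workaround here.
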
This is/should-be the last patch I'll ask you to manually try on Solaris.

bq. Maybe we should open a bug report at Oracle ...

Probably, but from what I've seen you have to deal with in the past, I don't have the time or patience to try and deal with their process. If you want to file one by all means go ahead -- but you might want to wait until we figure out if using {{java.util.Random}} under the covers works as a workaround, or if there is just some fundamental bug when using custom SecureRandom instances.

> Solaris SSL test failures when using NullSecureRandom?
> ------------------------------------------------------
>
>                 Key: SOLR-9068
>                 URL: https://issues.apache.org/jira/browse/SOLR-9068
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>             Fix For: 4.9, master
>
>         Attachments: SOLR-9068.Lucene-Solr-6.x-Solaris_110.log, SOLR-9068.Lucene-Solr-master-Solaris_558.log, SOLR-9068.patch, SOLR-9068.patch
>
>
> In parent issue SOLR-5776, NullSecureRandom was introduced and SSLTestConfig
> was refactored so that both client & server would use it to prevent blocked
> threads waiting for entropy.
> Since those commits to master & branch_6x, both Solaris jenkins builds have
> seen failures at the same spots in
> TestMiniSolrCloudClusterSSL.testSslAndNoClientAuth - and looking at the logs
> the root cause appears to be intranode communication failures due to
> "javax.crypto.BadPaddingException"
> Perhaps the Solaris SSL impl has bugs in its padding code that are tickled
> when the SecureRandom instance returns long strings of null bytes?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)