[
https://issues.apache.org/jira/browse/SOLR-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-8052:
---------------------------
Attachment: SOLR-8052.patch
I've been digging into some of the Java9 related SOLR jiras -- starting with
the kerberos based test problems -- to try and figure out if these really are
test only bugs and/or if there is anything we can do about making things work
better.
Based on my initial reading/experimenting, I think we should replace MiniKdc
(from hadoop's test infrastructure) with SimpleKdcServer (from the apache kerby
project)...
* SimpleKdcServer does not appear to have reflection related bugs that cause
problems under jigsaw like MiniKdc does
* SimpleKdcServer does not suffer from the same "can't use multiple nodes"
problem (HADOOP-9893) that has required {{TestMiniSolrCloudClusterKerberos}} to
be {{@Ignored}} since it was created.
** I was able to add multiple solr nodes to {{TestSolrCloudWithKerberosAlt}}
w/o problems after switching
** With a few other modifications, I was able to get
{{TestMiniSolrCloudClusterKerberos}} to work as well (details below)
* In hadoop's master branch, MiniKdc has been refactored to use SimpleKdcServer
internally anyway
Doing this isn't a silver bullet for the java9/jigsaw related failures (I'll
file a new bug about that), but it should help us move forward -- and in
general seems like an improvement.
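As a rough illustration of the reflection problem mentioned in the first bullet above, here is a minimal sketch (not MiniKdc's actual code) that reflectively calls a public static method on {{sun.security.krb5.Config}} and reports the outcome. The class name comes from the stack trace in the issue description; the method name {{refresh}} and the {{InternalKrb5ReflectionProbe}} wrapper are assumptions made for this example. The result depends on the JDK: older JVMs may allow the call, while module-enforcing JVMs are expected to reject it with an {{IllegalAccessException}}.

```java
import java.lang.reflect.Method;

final class InternalKrb5ReflectionProbe {

    /**
     * Tries to call a public static method on the JDK-internal class
     * sun.security.krb5.Config via reflection and reports what happened.
     * The method name "refresh" is an assumption made for this sketch.
     */
    static String probe() {
        try {
            Class<?> configClass = Class.forName("sun.security.krb5.Config");
            Method refresh = configClass.getMethod("refresh");
            refresh.invoke(null); // static method, so no receiver object
            return "reflective call succeeded";
        } catch (IllegalAccessException e) {
            // Expected on module-enforcing JDKs: java.security.jgss does not
            // export sun.security.krb5 to code running from the classpath.
            return "blocked by access checks: " + e.getMessage();
        } catch (ReflectiveOperationException e) {
            // Class or method not found, or the invoked method itself threw.
            return "reflective call failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

Code that avoids this kind of reflection on JDK internals, as SimpleKdcServer appears to, does not hit this particular access check.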
----
The attached patch is a starting point for this change.
One thing I'm not particularly happy with here is that in order to get it to
pass, I _had_ to modify {{TestMiniSolrCloudClusterKerberos}} to create a single
{{KerberosTestServices}} instance in the {{@BeforeClass}} method, instead of in
a regular {{@Before}} method.
*In and of itself, this change isn't necessarily bad -- it just means we only
start one Kerberos server instead of one per method.*
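To make the lifecycle difference concrete, here is a plain-Java simulation (no JUnit, no real KDC) of the two setups. {{FakeKdc}} is a hypothetical stand-in for {{KerberosTestServices}} that only counts how often it is started; the two methods mimic what a {{@Before}} versus a {{@BeforeClass}} setup would do. It says nothing about why the per-method variant fails in the real tests.

```java
final class KdcLifecycleSketch {

    /** Hypothetical stand-in for KerberosTestServices; it only counts starts. */
    static final class FakeKdc {
        static int started = 0;

        void start() {
            started++;
        }

        void stop() {
            // a real KDC would release its port and working directory here
        }
    }

    /** Mimics setup in a @Before method: every test method gets its own KDC. */
    static int runWithPerMethodKdc(int testMethods) {
        FakeKdc.started = 0;
        for (int i = 0; i < testMethods; i++) {
            FakeKdc kdc = new FakeKdc();
            kdc.start();
            // the test method body would run here
            kdc.stop();
        }
        return FakeKdc.started;
    }

    /** Mimics setup in a @BeforeClass method: all test methods share one KDC. */
    static int runWithPerClassKdc(int testMethods) {
        FakeKdc.started = 0;
        FakeKdc kdc = new FakeKdc();
        kdc.start();
        for (int i = 0; i < testMethods; i++) {
            // the test method body would run here, reusing the same kdc
        }
        kdc.stop();
        return FakeKdc.started;
    }

    public static void main(String[] args) {
        System.out.println("per-method starts: " + runWithPerMethodKdc(3));
        System.out.println("per-class starts:  " + runWithPerClassKdc(3));
    }
}
```

With three test methods, the per-method variant starts three KDCs and the per-class variant starts one.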
What concerns me is that w/o this change, only the first test method would ever
pass, and subsequent test methods would log/throw errors from ZK -- and running
any single test method with {{-Dtests.method}} would (seemingly) always pass.
My initial suspicion was that something in {{SimpleKdcServer}}, or in our
{{KerberosTestServices}} wrapper, wasn't "resetting" the JVM security settings
correctly when we shut it down -- but if that were the case I would expect
something like {{ant test -Dtests.jvms=1 -Dtests.class=\*Kerber\*}} to fail
(even with {{KerberosTestServices}} only ever being instantiated once per test
class) when the (sole) Test JVM got to the second test class and instantiated a
second {{KerberosTestServices}} instance.
However that doesn't seem to be the case. For some reason, using only one
{{KerberosTestServices}} in a test class is fine, regardless of how many test
classes using kerberos run in that JVM, but using multiple
{{KerberosTestServices}} in a single test class causes kerberos failures.
For the purposes of demonstrating this (in contrast with the changes made in
{{TestMiniSolrCloudClusterKerberos}} which seem like a good idea either way)
I've added a {{TestHossSanity}} which works just like
{{TestMiniSolrCloudClusterKerberos}} except initializes
{{KerberosTestServices}} on a per test-method basis.
Examples of the types of Kerberos errors it logs (after the first test method
succeeds)...
{noformat}
...
[junit4] <JUnit4> says 你好! Master seed: 6BEDD90DB0D4DC38
[junit4] Executing 1 suite with 1 JVM.
[junit4]
[junit4] Started J0 PID(11220@tray).
[junit4] Suite: org.apache.solr.cloud.TestHossSanity
[junit4] 2> 0 INFO
(TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [ ]
o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in
/home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-002
[junit4] 2> 11 INFO
(TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [ ]
o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
...first test (testStopAllStartAll) proceeds and runs fine...
[junit4] OK 31.2s | TestHossSanity.testStopAllStartAll
[junit4] 2> 30989 INFO
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
[ ] o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in
/home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-004
[junit4] 2> 30989 INFO
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
[ ] o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
...
[junit4] 2> 30989 INFO (Thread-100) [ ] o.a.s.c.ZkTestServer client
port:0.0.0.0/0.0.0.0:0
[junit4] 2> 30990 INFO (Thread-100) [ ] o.a.s.c.ZkTestServer Starting
server
[junit4] 2> 30995 INFO (pool-11-thread-1) [ ] o.a.k.k.k.s.r.KdcRequest
Client entry is empty.
[junit4] 2> 30995 INFO (pool-11-thread-1) [ ] o.a.k.k.k.s.r.KdcRequest
The preauth data is empty.
[junit4] 2> 30995 INFO (pool-11-thread-1) [ ] o.a.k.k.k.s.KdcHandler
KRB error occurred while processing request:Additional pre-authentication
required
[junit4] 2> 31001 INFO (pool-11-thread-2) [ ] o.a.k.k.k.s.r.KdcRequest
Client entry is empty.
[junit4] 2> 31090 INFO
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
[ ] o.a.s.c.ZkTestServer start zk server on port:34029
[junit4] 2> 31098 WARN (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [ ]
o.a.z.s.ZooKeeperServer Client failed to SASL authenticate:
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
[junit4] 2> 31099 WARN (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [ ]
o.a.z.s.ZooKeeperServer Closing client connection due to SASL authentication
failure.
[junit4] 2> 31099 ERROR (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [ ]
o.a.z.s.NIOServerCnxn Unexpected Exception:
[junit4] 2> java.nio.channels.CancelledKeyException
[junit4] 2> at
sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
[junit4] 2> at
sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
[junit4] 2> at
org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
[junit4] 2> at
org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
[junit4] 2> at
org.apache.zookeeper.server.ZooKeeperServer.processPacket(ZooKeeperServer.java:936)
[junit4] 2> at
org.apache.zookeeper.server.NIOServerCnxn.readRequest(NIOServerCnxn.java:373)
[junit4] 2> at
org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:200)
[junit4] 2> at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244)
[junit4] 2> at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
[junit4] 2> at java.lang.Thread.run(Thread.java:745)
[junit4] 2> 31099 WARN (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [ ]
o.a.z.s.NIOServerCnxn Exception causing close of session 0x15a6da2d5a90000 due
to java.nio.channels.CancelledKeyException
...etc...
[junit4] Completed [1/1 (1!)] in 205.89s, 3 tests, 4 errors <<< FAILURES!
[junit4]
[junit4]
[junit4] Tests with failures [seed: 6BEDD90DB0D4DC38]:
[junit4] -
org.apache.solr.cloud.TestHossSanity.testCollectionCreateWithoutCoresThenDelete
[junit4] -
org.apache.solr.cloud.TestHossSanity.testCollectionCreateSearchDelete
[junit4] - org.apache.solr.cloud.TestHossSanity (suite)
[junit4]
[junit4]
[junit4] JVM J0: 0.43 .. 207.37 = 206.94s
[junit4] Execution time total: 3 minutes 27 seconds
[junit4] Tests summary: 1 suite, 3 tests, 2 suite-level errors, 2 errors
{noformat}
If anyone with more Kerberos knowledge than myself (pretty much anybody!) could
look this over and share your thoughts, I'd appreciate it.
> Tests using MiniKDC do not work with Java 9 Jigsaw
> --------------------------------------------------
>
> Key: SOLR-8052
> URL: https://issues.apache.org/jira/browse/SOLR-8052
> Project: Solr
> Issue Type: Bug
> Components: Authentication
> Affects Versions: 5.3
> Reporter: Uwe Schindler
> Labels: Java9
> Attachments: SOLR-8052.patch
>
>
> As described in my status update yesterday, there are some problems in
> dependencies shipped with Solr that don't work with Java 9 Jigsaw builds.
> org.apache.solr.cloud.SaslZkACLProviderTest.testSaslZkACLProvider
> {noformat}
> [junit4] > Throwable #1: java.lang.RuntimeException:
> java.lang.IllegalAccessException: Class org.apache.hadoop.minikdc.MiniKdc can
> not access a member of class sun.security.krb5.Config (module
> java.security.jgss) with modifiers "public static", module java.security.jgss
> does not export sun.security.krb5 to <unnamed module @6d2a209c>
> [junit4] > at
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:211)
> [junit4] > at
> org.apache.solr.cloud.SaslZkACLProviderTest.setUp(SaslZkACLProviderTest.java:81)
> [junit4] > at java.lang.Thread.run([email protected]/Thread.java:746)
> [junit4] > Caused by: java.lang.IllegalAccessException: Class
> org.apache.hadoop.minikdc.MiniKdc can not access a member of class
> sun.security.krb5.Config (module java.security.jgss) with modifiers "public
> static", module java.security.jgss does not export sun.security.krb5 to
> <unnamed module @6d2a209c>
> [junit4] > at
> java.lang.reflect.AccessibleObject.slowCheckMemberAccess([email protected]/AccessibleObject.java:384)
> [junit4] > at
> java.lang.reflect.AccessibleObject.checkAccess([email protected]/AccessibleObject.java:376)
> [junit4] > at
> org.apache.hadoop.minikdc.MiniKdc.initKDCServer(MiniKdc.java:478)
> [junit4] > at
> org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:320)
> [junit4] > at
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:204)
> [junit4] > ... 38 moreThrowable #2:
> java.lang.NullPointerException
> [junit4] > at
> org.apache.solr.cloud.ZkTestServer$ZKServerMain.shutdown(ZkTestServer.java:334)
> [junit4] > at
> org.apache.solr.cloud.ZkTestServer.shutdown(ZkTestServer.java:526)
> [junit4] > at
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.shutdown(SaslZkACLProviderTest.java:218)
> [junit4] > at
> org.apache.solr.cloud.SaslZkACLProviderTest.tearDown(SaslZkACLProviderTest.java:116)
> [junit4] > at java.lang.Thread.run([email protected]/Thread.java:746)
> {noformat}
> This is really bad, bad, bad! All security related stuff should never ever be
> reflected on!
> So we have to open an issue in the MiniKdc project so they remove the "hacks".
> Elasticsearch had similar problems with Amazon's AWS API. They worked around
> it with a funny hack in their SecurityPolicy
> (https://github.com/elastic/elasticsearch/pull/13538). But as Solr does not
> run with SecurityManager in production, there is no way to do that.
> We should report an issue on the MiniKdc project, so they fix their code and
> remove the really bad reflection on Java's internal classes.
> FYI, my
> [conclusion|http://mail-archives.apache.org/mod_mbox/lucene-dev/201509.mbox/%3C014801d0ee23%245c8f5df0%2415ae19d0%24%40thetaphi.de%3E]
> from yesterday.