[ 
https://issues.apache.org/jira/browse/SOLR-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-8052:
---------------------------
    Attachment: SOLR-8052.patch


I've been digging into some of the Java9 related SOLR jiras -- starting with 
the kerberos based test problems -- to try and figure out if these really are 
test only bugs and/or if there is anything we can do about making things work 
better.

Based on my initial reading/experimenting, I think we should replace MiniKdc 
(from hadoop's test infrastructure) with SimpleKdcServer (from the apache kerby 
project)...

* SimpleKdcServer does not appear to have reflection related bugs that cause 
problems under jigsaw like MiniKdc does
* SimpleKdcServer does not suffer from the same "can't use multiple nodes" 
problem (HADOOP-9893) that has required {{TestMiniSolrCloudClusterKerberos}} to 
be {{@Ignored}} since it was created.
** I was able to add multiple solr nodes to {{TestSolrCloudWithKerberosAlt}} 
w/o problems after switching
** With a few other modifications, I was able to get 
{{TestMiniSolrCloudClusterKerberos}} to work as well (details below)
* In hadoop's master branch, MiniKdc has been refactored to use SimpleKdcServer 
internally anyway
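
For anyone who wants to check how their own JVM treats this kind of reflection, here is a minimal sketch (not MiniKdc's actual code). It assumes the internal call in question is the public static {{refresh}} method on {{sun.security.krb5.Config}}; the class name comes from the stack trace in the issue description, the method name is my assumption, and {{Krb5ReflectionProbe}} is a made-up name. The result depends on the JDK, so no output is claimed for the Kerberos probe.

```java
import java.lang.reflect.Method;

public class Krb5ReflectionProbe {

    /**
     * Tries a reflective call to a public static no-arg method and returns
     * "ok" on success, or the simple name of the exception that stopped it.
     */
    static String probe(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            Method m = cls.getMethod(methodName);
            m.invoke(null); // static method, so no receiver object
            return "ok";
        } catch (ReflectiveOperationException | RuntimeException e) {
            // On a modular JDK, a non-exported package is expected to
            // surface here as IllegalAccessException.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Assumption: "refresh" is the internal method being reflected on.
        System.out.println("sun.security.krb5.Config.refresh -> "
            + probe("sun.security.krb5.Config", "refresh"));
        // Control case: a public, exported JDK method.
        System.out.println("java.lang.System.lineSeparator -> "
            + probe("java.lang.System", "lineSeparator"));
    }
}
```

The control case is there only to show that the probe itself works; a KDC that never touches {{sun.security.krb5}} via reflection avoids the question entirely.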

Doing this isn't a silver bullet for the java9/jigsaw related failures (I'll 
file a new bug about that), but it should help us move forward -- and in 
general seems like an improvement.

----

The attached patch is a starting point for this change.

One thing I'm not particularly happy with here is that in order to get it to 
pass, I _had_ to modify {{TestMiniSolrCloudClusterKerberos}} to create a single 
{{KerberosTestServices}} instance in the {{@BeforeClass}} method, instead of in 
a regular {{@Before}} method.

*In and of itself, this change isn't necessarily bad -- it just means we only 
start one Kerberos server instead of one per method.*
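
To make the difference between the two lifecycles concrete, here is a rough plain-Java sketch. It does not use JUnit or the real {{KerberosTestServices}}; {{FakeKdc}}, {{runPerMethod}} and {{runPerClass}} are hypothetical names, and the fake server only counts how often it gets started.

```java
import java.util.Arrays;
import java.util.List;

public class KdcLifecycleSketch {

    /** Hypothetical stand-in for a KDC wrapper: it only records start calls. */
    static final class FakeKdc {
        private final int[] startCounter;
        FakeKdc(int[] startCounter) { this.startCounter = startCounter; }
        void start() { startCounter[0]++; }
        void stop() { /* nothing to release in this sketch */ }
    }

    /** Per-method lifecycle (the shape of @Before/@After): one KDC per test. */
    static int runPerMethod(List<Runnable> tests) {
        int[] starts = {0};
        for (Runnable test : tests) {
            FakeKdc kdc = new FakeKdc(starts);
            kdc.start();
            try {
                test.run();
            } finally {
                kdc.stop();
            }
        }
        return starts[0];
    }

    /** Per-class lifecycle (the shape of @BeforeClass/@AfterClass): one shared KDC. */
    static int runPerClass(List<Runnable> tests) {
        int[] starts = {0};
        FakeKdc kdc = new FakeKdc(starts);
        kdc.start();
        try {
            for (Runnable test : tests) {
                test.run();
            }
        } finally {
            kdc.stop();
        }
        return starts[0];
    }

    public static void main(String[] args) {
        Runnable noop = () -> { };
        List<Runnable> tests = Arrays.asList(noop, noop, noop);
        System.out.println("per-method KDC starts: " + runPerMethod(tests)); // 3
        System.out.println("per-class KDC starts:  " + runPerClass(tests));  // 1
    }
}
```

With three test methods the per-method shape starts three servers in the same JVM and the per-class shape starts one.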

What concerns me is that w/o this change, only the first test method would ever 
pass, and subsequent test methods would log/throw errors from ZK -- and running 
any single test method with {{-Dtests.method}} would (seemingly) always pass.

My initial suspicion was that something in {{SimpleKdcServer}}, or in our 
{{KerberosTestServices}} wrapper, wasn't "resetting" the JVM security settings 
correctly when we shut it down -- but if that were the case I would expect 
something like {{ant test -Dtests.jvms=1 -Dtests.class=\*Kerber\*}} to fail 
(even with {{KerberosTestServices}} only ever being instantiated once per test 
class) when the (sole) Test JVM got to the second test class and instantiated a 
second {{KerberosTestServices}} instance.

However that doesn't seem to be the case.  For some reason, using only one 
{{KerberosTestServices}} in a test class is fine, regardless of how many test 
classes using kerberos run in that JVM, but using multiple 
{{KerberosTestServices}} in a single test class causes kerberos failures.

For the purposes of demonstrating this (in contrast with the changes made in 
{{TestMiniSolrCloudClusterKerberos}} which seem like a good idea either way) 
I've added a {{TestHossSanity}} which works just like 
{{TestMiniSolrCloudClusterKerberos}} except initializes 
{{KerberosTestServices}} on a per test-method basis.

Examples of the types of Kerberos errors it logs (after the first test method 
succeeds)...

{noformat}
...
   [junit4] <JUnit4> says 你好! Master seed: 6BEDD90DB0D4DC38
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(11220@tray).
   [junit4] Suite: org.apache.solr.cloud.TestHossSanity
   [junit4]   2> 0    INFO  
(TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [    ] 
o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in 
/home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-002
   [junit4]   2> 11   INFO  
(TEST-TestHossSanity.testStopAllStartAll-seed#[6BEDD90DB0D4DC38]) [    ] 
o.a.s.c.ZkTestServer STARTING ZK TEST SERVER


...first test (testStopAllStartAll) proceeds and runs fine...


   [junit4] OK      31.2s | TestHossSanity.testStopAllStartAll
   [junit4]   2> 30989 INFO  
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
 [    ] o.a.s.c.MiniSolrCloudCluster Starting cluster of 5 servers in 
/home/hossman/lucene/dev/solr/build/solr-core/test/J0/temp/solr.cloud.TestHossSanity_6BEDD90DB0D4DC38-001/tempDir-004
   [junit4]   2> 30989 INFO  
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
 [    ] o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
...
   [junit4]   2> 30989 INFO  (Thread-100) [    ] o.a.s.c.ZkTestServer client 
port:0.0.0.0/0.0.0.0:0
   [junit4]   2> 30990 INFO  (Thread-100) [    ] o.a.s.c.ZkTestServer Starting 
server
   [junit4]   2> 30995 INFO  (pool-11-thread-1) [    ] o.a.k.k.k.s.r.KdcRequest 
Client entry is empty.
   [junit4]   2> 30995 INFO  (pool-11-thread-1) [    ] o.a.k.k.k.s.r.KdcRequest 
The preauth data is empty.
   [junit4]   2> 30995 INFO  (pool-11-thread-1) [    ] o.a.k.k.k.s.KdcHandler 
KRB error occurred while processing request:Additional pre-authentication 
required
   [junit4]   2> 31001 INFO  (pool-11-thread-2) [    ] o.a.k.k.k.s.r.KdcRequest 
Client entry is empty.
   [junit4]   2> 31090 INFO  
(TEST-TestHossSanity.testCollectionCreateWithoutCoresThenDelete-seed#[6BEDD90DB0D4DC38])
 [    ] o.a.s.c.ZkTestServer start zk server on port:34029
   [junit4]   2> 31098 WARN  (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [    ] 
o.a.z.s.ZooKeeperServer Client failed to SASL authenticate: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
   [junit4]   2> 31099 WARN  (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [    ] 
o.a.z.s.ZooKeeperServer Closing client connection due to SASL authentication 
failure.
   [junit4]   2> 31099 ERROR (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [    ] 
o.a.z.s.NIOServerCnxn Unexpected Exception: 
   [junit4]   2> java.nio.channels.CancelledKeyException
   [junit4]   2>        at 
sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
   [junit4]   2>        at 
sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
   [junit4]   2>        at 
org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:151)
   [junit4]   2>        at 
org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1081)
   [junit4]   2>        at 
org.apache.zookeeper.server.ZooKeeperServer.processPacket(ZooKeeperServer.java:936)
   [junit4]   2>        at 
org.apache.zookeeper.server.NIOServerCnxn.readRequest(NIOServerCnxn.java:373)
   [junit4]   2>        at 
org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:200)
   [junit4]   2>        at 
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244)
   [junit4]   2>        at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> 31099 WARN  (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:0) [    ] 
o.a.z.s.NIOServerCnxn Exception causing close of session 0x15a6da2d5a90000 due 
to java.nio.channels.CancelledKeyException

...etc...

   [junit4] Completed [1/1 (1!)] in 205.89s, 3 tests, 4 errors <<< FAILURES!
   [junit4] 
   [junit4] 
   [junit4] Tests with failures [seed: 6BEDD90DB0D4DC38]:
   [junit4]   - 
org.apache.solr.cloud.TestHossSanity.testCollectionCreateWithoutCoresThenDelete
   [junit4]   - 
org.apache.solr.cloud.TestHossSanity.testCollectionCreateSearchDelete
   [junit4]   - org.apache.solr.cloud.TestHossSanity (suite)
   [junit4] 
   [junit4] 
   [junit4] JVM J0:     0.43 ..   207.37 =   206.94s
   [junit4] Execution time total: 3 minutes 27 seconds
   [junit4] Tests summary: 1 suite, 3 tests, 2 suite-level errors, 2 errors
{noformat}

If anyone with more Kerberos knowledge than myself (pretty much anybody!) could 
look this over and share your thoughts, I'd appreciate it.


> Tests using MiniKDC do not work with Java 9 Jigsaw
> --------------------------------------------------
>
>                 Key: SOLR-8052
>                 URL: https://issues.apache.org/jira/browse/SOLR-8052
>             Project: Solr
>          Issue Type: Bug
>          Components: Authentication
>    Affects Versions: 5.3
>            Reporter: Uwe Schindler
>              Labels: Java9
>         Attachments: SOLR-8052.patch
>
>
> As described in my status update yesterday, there are some problems in 
> dependencies shipped with Solr that don't work with Java 9 Jigsaw builds.
> org.apache.solr.cloud.SaslZkACLProviderTest.testSaslZkACLProvider
> {noformat}
>    [junit4]    > Throwable #1: java.lang.RuntimeException: 
> java.lang.IllegalAccessException: Class org.apache.hadoop.minikdc.MiniKdc can 
> not access a member of class sun.security.krb5.Config (module 
> java.security.jgss) with modifiers "public static", module java.security.jgss 
> does not export sun.security.krb5 to <unnamed module @6d2a209c>
>    [junit4]    >        at 
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:211)
>    [junit4]    >        at 
> org.apache.solr.cloud.SaslZkACLProviderTest.setUp(SaslZkACLProviderTest.java:81)
>    [junit4]    >        at java.lang.Thread.run([email protected]/Thread.java:746)
>    [junit4]    > Caused by: java.lang.IllegalAccessException: Class 
> org.apache.hadoop.minikdc.MiniKdc can not access a member of class 
> sun.security.krb5.Config (module java.security.jgss) with modifiers "public 
> static", module java.security.jgss does not export sun.security.krb5 to 
> <unnamed module @6d2a209c>
>    [junit4]    >        at 
> java.lang.reflect.AccessibleObject.slowCheckMemberAccess([email protected]/AccessibleObject.java:384)
>    [junit4]    >        at 
> java.lang.reflect.AccessibleObject.checkAccess([email protected]/AccessibleObject.java:376)
>    [junit4]    >        at 
> org.apache.hadoop.minikdc.MiniKdc.initKDCServer(MiniKdc.java:478)
>    [junit4]    >        at 
> org.apache.hadoop.minikdc.MiniKdc.start(MiniKdc.java:320)
>    [junit4]    >        at 
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:204)
>    [junit4]    >        ... 38 moreThrowable #2: 
> java.lang.NullPointerException
>    [junit4]    >        at 
> org.apache.solr.cloud.ZkTestServer$ZKServerMain.shutdown(ZkTestServer.java:334)
>    [junit4]    >        at 
> org.apache.solr.cloud.ZkTestServer.shutdown(ZkTestServer.java:526)
>    [junit4]    >        at 
> org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.shutdown(SaslZkACLProviderTest.java:218)
>    [junit4]    >        at 
> org.apache.solr.cloud.SaslZkACLProviderTest.tearDown(SaslZkACLProviderTest.java:116)
>    [junit4]    >        at java.lang.Thread.run([email protected]/Thread.java:746)
> {noformat}
> This is really bad, bad, bad! All security related stuff should never ever be 
> reflected on!
> So we have to open an issue in the MiniKdc project so they remove the "hacks". 
> Elasticsearch had
> similar problems with Amazon's AWS API. They worked around it with a funny hack 
> in their SecurityPolicy
> (https://github.com/elastic/elasticsearch/pull/13538). But as Solr does not 
> run with SecurityManager
> in production, there is no way to do that. 
> We should report an issue on the MiniKdc project, so they fix their code and 
> remove the really bad reflection on Java's internal classes.
> FYI, my 
> [conclusion|http://mail-archives.apache.org/mod_mbox/lucene-dev/201509.mbox/%3C014801d0ee23%245c8f5df0%2415ae19d0%24%40thetaphi.de%3E]
>  from yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
