Ah, I found the problem. In the Hadoop chain of events, this eventually
gets called, because clientConfigString is not null (it contains
instance.name and instance.zookeeper.host):

https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L382

Unfortunately, the deserialize function doesn't load the default
configuration, so the SASL setting from ~/.accumulo/config is left out:

https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java#L235

Would it be reasonable for deserialize to load the default settings?
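Concretely, something like this sketch is what I have in mind - roughly the
existing deserialize body from the link above, plus a fallback to
loadDefault(). (Whether the varargs constructor gives the serialized
properties precedence over the defaults, CompositeConfiguration-style, is
my assumption.)

    public static ClientConfiguration deserialize(String serializedConfig) {
      PropertiesConfiguration propConfig = new PropertiesConfiguration();
      propConfig.setDelimiterParsingDisabled(true);
      try {
        propConfig.load(new StringReader(serializedConfig));
      } catch (ConfigurationException e) {
        throw new IllegalArgumentException(
            "Error deserializing client configuration: " + serializedConfig, e);
      }
      // Proposed change: overlay the serialized properties on the default
      // search path (which includes ~/.accumulo/config), so properties like
      // instance.rpc.sasl.enabled survive the round trip. The serialized
      // properties come first, so they would still win on conflicts.
      return new ClientConfiguration(propConfig, ClientConfiguration.loadDefault());
    }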
-Simon

On Fri, Jun 12, 2015 at 11:34 AM, Josh Elser <josh.el...@gmail.com> wrote:
> Just be careful with the mapreduce classes. I wouldn't be surprised if we
> try to avoid any locally installed client.conf in MapReduce (using only
> the ClientConfiguration stored inside the Job).
>
> Will wait to hear back from you :)
>
> Xu (Simon) Chen wrote:
>>
>> Emm.. I have ~/.accumulo/config with "instance.rpc.sasl.enabled=true".
>> That property is indeed populated into the ClientConfiguration the first
>> time - that's why I said the token worked initially.
>>
>> Apparently, that property is not set in the Hadoop portion; I confirmed
>> this by adding some debug messages to the ZooKeeperInstance class. I
>> think that's likely the issue.
>>
>> So the ZooKeeperInstance is created in the following sequence:
>>
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L341
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/InputConfigurator.java#L671
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L361
>>
>> The getClientConfiguration function eventually calls
>> getDefaultSearchPath(), so my ~/.accumulo/config should be searched. I
>> think we are close to the root cause... Will update when I find out more.
>>
>> Thanks!
>> -Simon
>>
>> On Thu, Jun 11, 2015 at 11:28 PM, Josh Elser <josh.el...@gmail.com> wrote:
>> > Are you sure that the spark tasks have the proper ClientConfiguration?
>> > They need to have instance.rpc.sasl.enabled. I believe you should be
>> > able to set this via the AccumuloInputFormat.
>> >
>> > You can turn up logging (org.apache.accumulo.core.client=TRACE) and/or
>> > set the system property -Dsun.security.krb5.debug=true to get some
>> > more information as to why the authentication is failing.
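(For reference, a minimal sketch of the two suggestions above, assuming the
1.7 mapreduce API; the instance name and ZooKeeper hosts are made-up
placeholders:)

    import org.apache.accumulo.core.client.ClientConfiguration;
    import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
    import org.apache.hadoop.mapreduce.Job;

    public class SaslJobSetup {
      public static void main(String[] args) throws Exception {
        // Turn up client-side logging: trace the Accumulo client, and
        // enable the JDK's Kerberos debugging (equivalent to passing
        // -Dsun.security.krb5.debug=true on the JVM command line).
        org.apache.log4j.Logger.getLogger("org.apache.accumulo.core.client")
            .setLevel(org.apache.log4j.Level.TRACE);
        System.setProperty("sun.security.krb5.debug", "true");

        // Set instance.rpc.sasl.enabled in the ClientConfiguration that is
        // serialized into the Job, rather than relying on the tasks
        // re-reading a local ~/.accumulo/config.
        Job job = Job.getInstance();
        ClientConfiguration clientConf = ClientConfiguration.loadDefault()
            .withInstance("myInstance")        // instance.name
            .withZkHosts("zk1:2181,zk2:2181")  // instance.zookeeper.host
            .withSasl(true);                   // instance.rpc.sasl.enabled
        AccumuloInputFormat.setZooKeeperInstance(job, clientConf);
      }
    }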
>> >
>> > Xu (Simon) Chen wrote:
>> >>
>> >> Josh,
>> >>
>> >> I am using this function:
>> >>
>> >> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L106
>> >>
>> >> If I pass in a KerberosToken, it's stuck at line 111; if I pass in a
>> >> delegation token, the setConnectorInfo function finishes fine.
>> >>
>> >> But when I do something like queryRDD.count, spark eventually calls
>> >> HadoopRDD.getPartitions, which calls the following and gets stuck in
>> >> the last authenticate() call:
>> >>
>> >> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L621
>> >> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L348
>> >> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ZooKeeperInstance.java#L248
>> >> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java#L70
>> >>
>> >> That is essentially the same place where it gets stuck with a
>> >> KerberosToken.
>> >>
>> >> -Simon
>> >>
>> >> On Thu, Jun 11, 2015 at 9:41 PM, Josh Elser <josh.el...@gmail.com> wrote:
>> >>>
>> >>> What are the Accumulo methods that you are calling, and what is the
>> >>> error you are seeing?
>> >>>
>> >>> A KerberosToken cannot be used in a MapReduce job, which is why a
>> >>> DelegationToken is automatically retrieved. You should still be able
>> >>> to provide your own DelegationToken -- if that doesn't work, that's
>> >>> a bug.
>> >>>
>> >>> Xu (Simon) Chen wrote:
>> >>>>
>> >>>> I actually added a flag such that I can pass either a KerberosToken
>> >>>> or a DelegationTokenImpl to accumulo.
>> >>>>
>> >>>> When a KerberosToken is passed in, accumulo converts it to a
>> >>>> DelegationToken - the conversion is where I am having trouble. I
>> >>>> tried passing in a delegation token directly to bypass the
>> >>>> conversion, but a similar problem happens: I am stuck at
>> >>>> authenticate() on the client side, and the server side produces
>> >>>> the same output...
>> >>>>
>> >>>> On Thursday, June 11, 2015, Josh Elser <josh.el...@gmail.com> wrote:
>> >>>>>
>> >>>>> Keep in mind that the authentication paths for DelegationToken
>> >>>>> (mapreduce) and KerberosToken are completely different.
>> >>>>>
>> >>>>> Since most mapreduce jobs have multiple mappers (or reducers), I
>> >>>>> expect we would have run into the case where the same
>> >>>>> DelegationToken was used multiple times. It would still be good
>> >>>>> to narrow down the scope of the problem.
>> >>>>>
>> >>>>> Xu (Simon) Chen wrote:
>> >>>>>>
>> >>>>>> Thanks Josh...
>> >>>>>>
>> >>>>>> I tested this in the scala REPL and called
>> >>>>>> DataStoreFinder.getDataStore() multiple times; each time it seems
>> >>>>>> to reuse the same KerberosToken object, and it works fine.
>> >>>>>>
>> >>>>>> So my problem only happens when the token is used in accumulo's
>> >>>>>> mapred package. Weird..
>> >>>>>>
>> >>>>>> -Simon
>> >>>>>>
>> >>>>>> On Thu, Jun 11, 2015 at 5:29 PM, Josh Elser <josh.el...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Simon,
>> >>>>>>>
>> >>>>>>> Can you reproduce this in plain-jane Java code? I don't know
>> >>>>>>> enough about spark/scala, much less what Geomesa is actually
>> >>>>>>> doing, to know what the issue is.
>> >>>>>>>
>> >>>>>>> Also, which token are you referring to: a KerberosToken or a
>> >>>>>>> DelegationToken? Either of them should be usable as many times
>> >>>>>>> as you'd like (given that the underlying credentials are still
>> >>>>>>> available for the KT, or the DT hasn't yet expired).
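(A stripped-down, plain-Java sketch of the scenario being discussed,
assuming the 1.7 client API; the principal, instance name, and ZooKeeper
hosts are placeholders:)

    import org.apache.accumulo.core.client.ClientConfiguration;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.admin.DelegationTokenConfig;
    import org.apache.accumulo.core.client.security.tokens.DelegationToken;
    import org.apache.accumulo.core.client.security.tokens.KerberosToken;

    public class TokenReuseRepro {
      public static void main(String[] args) throws Exception {
        String principal = "user@EXAMPLE.COM";
        ClientConfiguration conf = ClientConfiguration.loadDefault()
            .withInstance("myInstance").withZkHosts("zk1:2181").withSasl(true);
        ZooKeeperInstance inst = new ZooKeeperInstance(conf);

        // Use the same KerberosToken twice; both should authenticate.
        KerberosToken kt = new KerberosToken(principal);
        Connector c1 = inst.getConnector(principal, kt);
        Connector c2 = inst.getConnector(principal, kt);

        // Obtain a DelegationToken explicitly (what setConnectorInfo does
        // under the covers when handed a KerberosToken) and use it too.
        DelegationToken dt = c1.securityOperations()
            .getDelegationToken(new DelegationTokenConfig());
        Connector c3 = inst.getConnector(principal, dt);
        System.out.println(c2.whoami() + " / " + c3.whoami());
      }
    }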
>> >>>>>>>
>> >>>>>>> Xu (Simon) Chen wrote:
>> >>>>>>>>
>> >>>>>>>> Folks,
>> >>>>>>>>
>> >>>>>>>> I am working on geomesa+accumulo+spark integration. For some
>> >>>>>>>> reason, I found that the same token cannot be used to
>> >>>>>>>> authenticate twice.
>> >>>>>>>>
>> >>>>>>>> The workflow is that geomesa tries to create a hadoop rdd,
>> >>>>>>>> during which it tries to create an AccumuloDataStore:
>> >>>>>>>>
>> >>>>>>>> https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81
>> >>>>>>>>
>> >>>>>>>> During this process, a ZooKeeperInstance is created:
>> >>>>>>>>
>> >>>>>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
>> >>>>>>>>
>> >>>>>>>> I modified geomesa such that it would use kerberos to
>> >>>>>>>> authenticate here. This step works fine.
>> >>>>>>>>
>> >>>>>>>> But next, geomesa calls ConfiguratorBase.setConnectorInfo:
>> >>>>>>>>
>> >>>>>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69
>> >>>>>>>>
>> >>>>>>>> This uses the same token and the same zookeeper URI, but for
>> >>>>>>>> some reason it hangs in spark-shell, and the following is
>> >>>>>>>> output on the tserver side:
>> >>>>>>>>
>> >>>>>>>> 2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
>> >>>>>>>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>> >>>>>>>>   at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>> >>>>>>>>   at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>> >>>>>>>>   at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>> >>>>>>>>   at java.security.AccessController.doPrivileged(Native Method)
>> >>>>>>>>   at javax.security.auth.Subject.doAs(Subject.java:356)
>> >>>>>>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
>> >>>>>>>>   at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>> >>>>>>>>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>> >>>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >>>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >>>>>>>>   at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>> >>>>>>>>   at java.lang.Thread.run(Thread.java:745)
>> >>>>>>>> Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>> >>>>>>>>   at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>> >>>>>>>>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>> >>>>>>>>   at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
>> >>>>>>>>   at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>> >>>>>>>>   at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>> >>>>>>>>   at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>> >>>>>>>>   at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>> >>>>>>>>   ... 11 more
>> >>>>>>>> Caused by: java.net.SocketTimeoutException: Read timed out
>> >>>>>>>>   at java.net.SocketInputStream.socketRead0(Native Method)
>> >>>>>>>>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>> >>>>>>>>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>> >>>>>>>>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>> >>>>>>>>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>> >>>>>>>>   at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>> >>>>>>>>   ... 17 more
>> >>>>>>>>
>> >>>>>>>> Any idea why?
>> >>>>>>>>
>> >>>>>>>> Thanks.
>> >>>>>>>> -Simon