Hmm... I have ~/.accumulo/config with "instance.rpc.sasl.enabled=true". That property is indeed picked up by the ClientConfiguration the first time - that's why I said the token worked initially.
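For reference, here is roughly how I am checking what the default search path picks up (a minimal, untested sketch against the 1.7 client API):

  import org.apache.accumulo.core.client.ClientConfiguration;
  import org.apache.accumulo.core.client.ClientConfiguration.ClientProperty;

  public class CheckClientConf {
    public static void main(String[] args) {
      // loadDefault() walks the default search path, which includes
      // ~/.accumulo/config, so this should reflect what that file sets.
      ClientConfiguration conf = ClientConfiguration.loadDefault();
      System.out.println("sasl enabled: "
          + conf.get(ClientProperty.INSTANCE_RPC_SASL_ENABLED));
    }
  }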
Apparently that property is not set in the Hadoop portion, as I found by adding some debug messages to the ZooKeeperInstance class. I think that's likely the issue. The ZooKeeper instance is created in the following sequence:

https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L341
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/InputConfigurator.java#L671
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L361

The getClientConfiguration function eventually calls getDefaultSearchPath(), so my ~/.accumulo/config should be searched. I think we are close to the root cause... Will update when I find out more.

Thanks!
-Simon

On Thu, Jun 11, 2015 at 11:28 PM, Josh Elser <josh.el...@gmail.com> wrote:
> Are you sure that the Spark tasks have the proper ClientConfiguration?
> They need to have instance.rpc.sasl.enabled. I believe you should be able
> to set this via the AccumuloInputFormat.
>
> You can turn up logging (org.apache.accumulo.core.client=TRACE) and/or set
> the system property -Dsun.security.krb5.debug=true to get some more
> information as to why the authentication is failing.
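Per your suggestion, this is what I plan to try: forcing the SASL flag onto the ClientConfiguration that gets serialized into the job, instead of relying on each node finding ~/.accumulo/config. An untested sketch - the instance name and ZooKeeper quorum below are placeholders:

  import org.apache.accumulo.core.client.ClientConfiguration;
  import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
  import org.apache.hadoop.mapred.JobConf;

  public class ForceSaslOnJob {
    public static void main(String[] args) {
      JobConf job = new JobConf();
      // withSasl(true) sets instance.rpc.sasl.enabled in the serialized
      // client configuration, so the remote Spark/MapReduce tasks see it
      // even when they have no ~/.accumulo/config of their own.
      ClientConfiguration clientConf = ClientConfiguration.loadDefault()
          .withInstance("myInstance")   // placeholder
          .withZkHosts("zk1:2181")      // placeholder
          .withSasl(true);
      AccumuloInputFormat.setZooKeeperInstance(job, clientConf);
    }
  }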
> Xu (Simon) Chen wrote:
>>
>> Josh,
>>
>> I am using this function:
>>
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L106
>>
>> If I pass in a KerberosToken, it's stuck at line 111; if I pass in a
>> delegation token, the setConnectorInfo function finishes fine.
>>
>> But when I do something like queryRDD.count, Spark eventually calls
>> HadoopRDD.getPartitions, which calls the following and gets stuck in
>> the last authenticate() function:
>>
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L621
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L348
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ZooKeeperInstance.java#L248
>> https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java#L70
>>
>> That is essentially the same place where it would be stuck with a
>> KerberosToken.
>>
>> -Simon
>>
>> On Thu, Jun 11, 2015 at 9:41 PM, Josh Elser <josh.el...@gmail.com> wrote:
>>>
>>> What are the Accumulo methods that you are calling, and what is the
>>> error you are seeing?
>>>
>>> A KerberosToken cannot be used in a MapReduce job, which is why a
>>> DelegationToken is automatically retrieved. You should still be able
>>> to provide your own DelegationToken -- if that doesn't work, that's
>>> a bug.
>>>
>>> Xu (Simon) Chen wrote:
>>>>
>>>> I actually added a flag such that I can pass in either a KerberosToken
>>>> or a DelegationTokenImpl to Accumulo.
>>>>
>>>> Actually, when a KerberosToken is passed in, Accumulo converts it to a
>>>> DelegationToken - the conversion is where I am having trouble. I tried
>>>> passing in a delegation token directly to bypass the conversion, but a
>>>> similar problem happens: I am stuck at authenticate on the client side,
>>>> and the server side produces the same output...
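For context, the call in question looks roughly like this on my side (a sketch; the principal is a placeholder). With a KerberosToken, setConnectorInfo obtains a DelegationToken on my behalf, which is the conversion that hangs; with my flag I hand it a DelegationToken directly, and the call returns but later hangs at authenticate() anyway:

  import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
  import org.apache.accumulo.core.client.security.tokens.KerberosToken;
  import org.apache.hadoop.mapred.JobConf;

  public class ConnectorInfoExample {
    public static void main(String[] args) throws Exception {
      JobConf job = new JobConf();
      // With a KerberosToken, setConnectorInfo converts it to a
      // DelegationToken under the covers; a DelegationToken passed in
      // directly is stored in the job configuration as-is.
      AccumuloInputFormat.setConnectorInfo(job, "user@EXAMPLE.COM",
          new KerberosToken("user@EXAMPLE.COM"));
    }
  }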
>>>> On Thursday, June 11, 2015, Josh Elser <josh.el...@gmail.com> wrote:
>>>>
>>>> Keep in mind that the authentication path for DelegationToken
>>>> (mapreduce) and KerberosToken are completely different.
>>>>
>>>> Since most mapreduce jobs have multiple mappers (or reducers), I
>>>> expect we would have run into the case that the same DelegationToken
>>>> was used multiple times. It would still be good to narrow down the
>>>> scope of the problem.
>>>>
>>>> Xu (Simon) Chen wrote:
>>>>
>>>> Thanks Josh...
>>>>
>>>> I tested this in the Scala REPL and called DataStoreFinder.getDataStore()
>>>> multiple times; each time it seems to reuse the same KerberosToken
>>>> object, and it works fine every time.
>>>>
>>>> So my problem only happens when the token is used in Accumulo's mapred
>>>> package. Weird...
>>>>
>>>> -Simon
>>>>
>>>> On Thu, Jun 11, 2015 at 5:29 PM, Josh Elser <josh.el...@gmail.com> wrote:
>>>>
>>>> Simon,
>>>>
>>>> Can you reproduce this in plain-jane Java code? I don't know enough
>>>> about Spark/Scala, much less what Geomesa is actually doing, to know
>>>> what the issue is.
>>>>
>>>> Also, which token are you referring to: a KerberosToken or a
>>>> DelegationToken? Either of them should be usable as many times as
>>>> you'd like (given the underlying credentials are still available for
>>>> the KT, or the DT hasn't yet expired).
>>>>
>>>> Xu (Simon) Chen wrote:
>>>>
>>>> Folks,
>>>>
>>>> I am working on geomesa+accumulo+spark integration. For some reason,
>>>> I found that the same token cannot be used to authenticate twice.
>>>>
>>>> The workflow is that geomesa would try to create a Hadoop RDD, during
>>>> which it tries to create an AccumuloDataStore:
>>>>
>>>> https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81
>>>>
>>>> During this process, a ZooKeeperInstance is created:
>>>>
>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
>>>>
>>>> I modified geomesa such that it would use Kerberos to authenticate
>>>> here. This step works fine.
>>>>
>>>> But next, geomesa calls ConfiguratorBase.setConnectorInfo:
>>>>
>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69
>>>>
>>>> This uses the same token and the same ZooKeeper URI, but for some
>>>> reason it gets stuck in spark-shell, and the following is output on
>>>> the tserver side:
>>>>
>>>> 2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error
>>>> occurred during processing of message.
>>>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>>>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>>>>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at javax.security.auth.Subject.doAs(Subject.java:356)
>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
>>>>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>>>>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>     at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
>>>>     at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>>>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>>>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>>>     ... 11 more
>>>> Caused by: java.net.SocketTimeoutException: Read timed out
>>>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>>     ... 17 more
>>>>
>>>> Any idea why?
>>>>
>>>> Thanks.
>>>> -Simon
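P.S. Josh: for the "plain-jane Java" reproduction you asked about, this is the shape I would reduce it to (an untested sketch; the instance name, quorum, and principal are placeholders). If using the same token twice really were the problem, I would expect the second getConnector() call to hang the same way:

  import org.apache.accumulo.core.client.ClientConfiguration;
  import org.apache.accumulo.core.client.Connector;
  import org.apache.accumulo.core.client.ZooKeeperInstance;
  import org.apache.accumulo.core.client.security.tokens.KerberosToken;

  public class DoubleAuthRepro {
    public static void main(String[] args) throws Exception {
      ClientConfiguration conf = ClientConfiguration.loadDefault()
          .withInstance("myInstance")   // placeholder
          .withZkHosts("zk1:2181")      // placeholder
          .withSasl(true);
      ZooKeeperInstance instance = new ZooKeeperInstance(conf);
      KerberosToken token = new KerberosToken("user@EXAMPLE.COM"); // placeholder
      // Authenticate twice with the very same token object.
      Connector c1 = instance.getConnector("user@EXAMPLE.COM", token);
      Connector c2 = instance.getConnector("user@EXAMPLE.COM", token);
      System.out.println(c1.whoami() + " / " + c2.whoami());
    }
  }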