Josh, I am using this function:

https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L106
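(For context, that is the static connector-info setter on the mapred
AbstractInputFormat; if I am reading master correctly, its signature is:

    public static void setConnectorInfo(JobConf job, String principal,
        AuthenticationToken token) throws AccumuloSecurityException

and line 111 falls inside the KerberosToken-to-DelegationToken conversion
discussed below.)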
If I pass in a KerberosToken, it gets stuck at line 111; if I pass in a
delegation token, the setConnectorInfo function finishes fine. But when I do
something like queryRDD.count, Spark eventually calls HadoopRDD.getPartitions,
which calls the following and gets stuck in the final authenticate() call:

https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L621
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L348
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ZooKeeperInstance.java#L248
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java#L70

which is essentially the same place where it would get stuck with a
KerberosToken.
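For reference, here is roughly how I am wiring this up (a minimal sketch, not
my exact code; the class name, principal, instance name, and zookeeper string
below are all placeholders):

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.admin.DelegationTokenConfig;
    import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
    import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
    import org.apache.accumulo.core.client.security.tokens.KerberosToken;
    import org.apache.hadoop.mapred.JobConf;

    public class TokenRepro {
      public static void main(String[] args) throws Exception {
        // Interactive Kerberos auth works: getConnector() succeeds.
        KerberosToken kt = new KerberosToken("user@EXAMPLE.COM");
        Connector conn = new ZooKeeperInstance("accumulo", "zk1:2181")
            .getConnector("user@EXAMPLE.COM", kt);

        // Option 1: pass kt to setConnectorInfo; it converts the token to
        // a DelegationToken internally and hangs at line 111.
        // Option 2: fetch the delegation token myself and pass that in;
        // setConnectorInfo then finishes fine...
        AuthenticationToken dt = conn.securityOperations()
            .getDelegationToken(new DelegationTokenConfig());

        JobConf job = new JobConf();
        AccumuloInputFormat.setConnectorInfo(job, "user@EXAMPLE.COM", dt);

        // ...but it still hangs later in authenticate(), once
        // HadoopRDD.getPartitions re-creates a Connector from this config.
      }
    }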
-Simon

On Thu, Jun 11, 2015 at 9:41 PM, Josh Elser <josh.el...@gmail.com> wrote:
> What are the Accumulo methods that you are calling, and what is the error
> you are seeing?
>
> A KerberosToken cannot be used in a MapReduce job, which is why a
> DelegationToken is automatically retrieved. You should still be able to
> provide your own DelegationToken -- if that doesn't work, that's a bug.
>
> Xu (Simon) Chen wrote:
>>
>> I actually added a flag such that I can pass in either a KerberosToken
>> or a DelegationTokenImpl to Accumulo.
>>
>> Actually, when a KerberosToken is passed in, Accumulo converts it to a
>> DelegationToken -- the conversion is where I am having trouble. I tried
>> passing in a delegation token directly to bypass the conversion, but a
>> similar problem happens: I get stuck at authenticate() on the client
>> side, and the server side prints the same output...
>>
>> On Thursday, June 11, 2015, Josh Elser <josh.el...@gmail.com> wrote:
>>
>> Keep in mind that the authentication paths for DelegationToken
>> (mapreduce) and KerberosToken are completely different.
>>
>> Since most mapreduce jobs have multiple mappers (or reducers), I
>> expect we would have run into the case that the same DelegationToken
>> was used multiple times. It would still be good to narrow down the
>> scope of the problem.
>>
>> Xu (Simon) Chen wrote:
>>
>> Thanks Josh...
>>
>> I tested this in the Scala REPL and called DataStoreFinder.getDataStore()
>> multiple times; each time it seems to reuse the same KerberosToken
>> object, and it works fine every time.
>>
>> So my problem only happens when the token is used in Accumulo's mapred
>> package. Weird...
>>
>> -Simon
>>
>> On Thu, Jun 11, 2015 at 5:29 PM, Josh Elser <josh.el...@gmail.com> wrote:
>>
>> Simon,
>>
>> Can you reproduce this in plain-jane Java code? I don't know enough
>> about Spark/Scala, much less what Geomesa is actually doing, to know
>> what the issue is.
>>
>> Also, which token are you referring to: a KerberosToken or a
>> DelegationToken? Either of them should be usable as many times as
>> you'd like (given that the underlying credentials are still available
>> for the KerberosToken, or that the DelegationToken hasn't yet expired).
>>
>> Xu (Simon) Chen wrote:
>>
>> Folks,
>>
>> I am working on geomesa+accumulo+spark integration. For some reason,
>> I found that the same token cannot be used to authenticate twice.
>>
>> The workflow is that geomesa would try to create a hadoop RDD, during
>> which it tries to create an AccumuloDataStore:
>>
>> https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81
>>
>> During this process, a ZooKeeperInstance is created:
>>
>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
>>
>> I modified geomesa such that it would use kerberos to authenticate
>> here. This step works fine.
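>>
>> (For illustration, my modification is roughly along these lines, in
>> Java terms -- a sketch, not my actual diff; the principal and keytab
>> path are hypothetical, instanceName/zookeepers stand for the data store
>> parameters, and the imports mirror the earlier snippet plus
>> org.apache.hadoop.security.UserGroupInformation:)
>>
>>     // Log the process in from a keytab, then hand Accumulo a
>>     // KerberosToken instead of a PasswordToken.
>>     UserGroupInformation.loginUserFromKeytab(
>>         "user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab");
>>     KerberosToken token = new KerberosToken("user@EXAMPLE.COM");
>>     Connector conn = new ZooKeeperInstance(instanceName, zookeepers)
>>         .getConnector("user@EXAMPLE.COM", token);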
>>
>> But next, geomesa calls ConfigurationBase.setConnectorInfo:
>>
>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69
>>
>> This uses the same token and the same zookeeper URI, but for some
>> reason it gets stuck in spark-shell, and the following is output on the
>> tserver side:
>>
>> 2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error
>> occurred during processing of message.
>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException:
>> java.net.SocketTimeoutException: Read timed out
>>         at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>         at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>>         at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:356)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
>>         at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.thrift.transport.TTransportException:
>> java.net.SocketTimeoutException: Read timed out
>>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>         at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
>>         at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>         at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>         at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>         at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>         ... 11 more
>> Caused by: java.net.SocketTimeoutException: Read timed out
>>         at java.net.SocketInputStream.socketRead0(Native Method)
>>         at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>         ... 17 more
>>
>> Any idea why?
>>
>> Thanks.
>> -Simon