Jim,

The commit you mentioned is not in the branch you created for me (rc7_a1.7_h2.5)...
-Simon

On Thu, Jun 11, 2015 at 8:09 PM, James Hughes <jn...@virginia.edu> wrote:
> Simon,
>
> Hmm... it may be worth reverting the changes to the GeoMesaSpark source from
> this commit:
> https://github.com/locationtech/geomesa/commit/1d0178e399ce5f66f0bcd94b071bf0634fe31e8d
>
> That would give us a chance to make sure the problem is with the AccumuloInput
> rather than something GeoMesa is doing with the input format. As a note,
> GeoMesa has some code to help gather splits, similar to what Eugene proposed
> recently. I don't know whether that made it into 1.7.0 or not.
>
> Cheers,
>
> Jim
>
> On Thu, Jun 11, 2015 at 7:51 PM, Xu (Simon) Chen <xche...@gmail.com> wrote:
>> Thanks, Josh...
>>
>> I tested this in the Scala REPL and called DataStoreFinder.getDataStore()
>> multiple times; each time it seems to reuse the same KerberosToken object,
>> and it works fine every time.
>>
>> So my problem only happens when the token is used in Accumulo's mapred
>> package. Weird...
>>
>> -Simon
>>
>> On Thu, Jun 11, 2015 at 5:29 PM, Josh Elser <josh.el...@gmail.com> wrote:
>>> Simon,
>>>
>>> Can you reproduce this in plain-Jane Java code? I don't know enough about
>>> Spark/Scala, much less what GeoMesa is actually doing, to know what the
>>> issue is.
>>>
>>> Also, which token are you referring to: a KerberosToken or a
>>> DelegationToken? Either of them should be usable as many times as you'd
>>> like (given that the underlying credentials are still available for a
>>> KerberosToken, or that the DelegationToken hasn't yet expired).
>>>
>>> Xu (Simon) Chen wrote:
>>>> Folks,
>>>>
>>>> I am working on GeoMesa + Accumulo + Spark integration. For some reason,
>>>> I found that the same token cannot be used to authenticate twice.
>>>> The workflow is that GeoMesa tries to create a Hadoop RDD, during which
>>>> it creates an AccumuloDataStore:
>>>>
>>>> https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81
>>>>
>>>> During this process, a ZooKeeperInstance is created:
>>>>
>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
>>>>
>>>> I modified GeoMesa so that it uses Kerberos to authenticate here. This
>>>> step works fine.
>>>>
>>>> But next, GeoMesa calls ConfiguratorBase.setConnectorInfo:
>>>>
>>>> https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69
>>>>
>>>> This uses the same token and the same ZooKeeper URI, but for some reason
>>>> it gets stuck in spark-shell, and the following is output on the tserver
>>>> side:
>>>>
>>>> 2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error
>>>> occurred during processing of message.
>>>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>>>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
>>>>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at javax.security.auth.Subject.doAs(Subject.java:356)
>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
>>>>     at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
>>>>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>     at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
>>>>     at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>>>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>>>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>>>     ... 11 more
>>>> Caused by: java.net.SocketTimeoutException: Read timed out
>>>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>>>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>>     ... 17 more
>>>>
>>>> Any idea why?
>>>>
>>>> Thanks.
>>>> -Simon
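For readers trying to reproduce this outside of GeoMesa, the two authentication steps described in the thread can be sketched roughly as follows. This is a minimal sketch, not GeoMesa's actual code: it assumes Accumulo 1.7 client jars on the classpath, and the instance name, ZooKeeper hosts, and principal are placeholders.

```scala
import org.apache.accumulo.core.client.{ClientConfiguration, ZooKeeperInstance}
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat
import org.apache.accumulo.core.client.security.tokens.KerberosToken
import org.apache.hadoop.mapreduce.Job

// Placeholder connection details -- substitute your own cluster's values.
val instanceName = "accumulo"
val zookeepers   = "zk1:2181,zk2:2181"
val principal    = "user@EXAMPLE.COM"

// A KerberosToken built from the current Kerberos login (e.g. after kinit).
val token = new KerberosToken()

// Step 1: a direct connection, roughly what AccumuloDataStoreFactory does
// when GeoMesa builds the AccumuloDataStore. This is the step that works.
val clientConf = ClientConfiguration.loadDefault()
  .withInstance(instanceName)
  .withZkHosts(zookeepers)
  .withSasl(true) // needed for Kerberos-enabled clusters
val instance  = new ZooKeeperInstance(clientConf)
val connector = instance.getConnector(principal, token)

// Step 2: hand the same token to the input-format configuration, roughly
// what GeoMesaSpark does via setConnectorInfo. This is where the hang and
// the SASL read timeout on the tserver show up.
val job = Job.getInstance()
AccumuloInputFormat.setConnectorInfo(job, principal, token)
AccumuloInputFormat.setZooKeeperInstance(job, clientConf)
```

One possibly relevant difference between the two steps: with SASL enabled, Accumulo 1.7's setConnectorInfo obtains a DelegationToken on the caller's behalf when handed a KerberosToken (if I recall the 1.7 behavior correctly), so the two code paths do not authenticate identically even though they receive the same token object.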