Are you sure that the Spark tasks have the proper ClientConfiguration? They need to have instance.rpc.sasl.enabled set. I believe you should be able to set this via the AccumuloInputFormat.

You can turn up logging (org.apache.accumulo.core.client=TRACE) and/or set the system property -Dsun.security.krb5.debug=true to get more information as to why the authentication is failing.
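
Something like this should get the ClientConfiguration into the job (an untested sketch against the 1.7 client API; the instance name and ZooKeeper quorum below are placeholders):

```java
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
import org.apache.hadoop.mapred.JobConf;

public class SaslClientConfigSketch {
  public static void main(String[] args) throws Exception {
    // Load the client config defaults, then force SASL on so the task-side
    // code knows to authenticate over SASL when talking to the tservers.
    ClientConfiguration clientConf = ClientConfiguration.loadDefault()
        .withInstance("myInstance")   // placeholder instance name
        .withZkHosts("zk1:2181")      // placeholder ZK quorum
        .withSasl(true);              // same as instance.rpc.sasl.enabled=true

    JobConf job = new JobConf();
    // The ClientConfiguration (including the SASL flag) is serialized into
    // the job configuration and read back by the Spark/MapReduce tasks.
    AccumuloInputFormat.setZooKeeperInstance(job, clientConf);
  }
}
```

withSasl(true) is just the programmatic equivalent of putting instance.rpc.sasl.enabled=true in the client configuration file.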

Xu (Simon) Chen wrote:
Josh,

I am using this function:

https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L106

If I pass in a KerberosToken, it's stuck at line 111; if I pass in a
delegation token, the setConnectorInfo function finishes fine.

But when I do something like queryRDD.count, spark eventually calls
HadoopRDD.getPartitions, which calls the following and gets stuck in
the last authenticate() call:
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L621
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/mapred/AbstractInputFormat.java#L348
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/ZooKeeperInstance.java#L248
https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/ConnectorImpl.java#L70

Which is essentially the same place where it gets stuck with a KerberosToken.

-Simon

On Thu, Jun 11, 2015 at 9:41 PM, Josh Elser<[email protected]>  wrote:
What are the Accumulo methods that you are calling and what is the error you
are seeing?

A KerberosToken cannot be used in a MapReduce job which is why a
DelegationToken is automatically retrieved. You should still be able to
provide your own DelegationToken -- if that doesn't work, that's a bug.
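
For reference, fetching the DelegationToken yourself would look roughly like this (an untested sketch against the 1.7 API; the principal is a placeholder, and it assumes you already hold a valid Kerberos ticket):

```java
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.admin.DelegationTokenConfig;
import org.apache.accumulo.core.client.mapred.AccumuloInputFormat;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;
import org.apache.hadoop.mapred.JobConf;

public class ExplicitDelegationTokenSketch {
  public static void main(String[] args) throws Exception {
    String principal = "user@EXAMPLE.COM";  // placeholder principal

    ClientConfiguration clientConf = ClientConfiguration.loadDefault().withSasl(true);
    ZooKeeperInstance instance = new ZooKeeperInstance(clientConf);

    // Authenticate once with the live Kerberos credentials...
    Connector conn = instance.getConnector(principal, new KerberosToken(principal));

    // ...then obtain the DelegationToken explicitly, instead of letting
    // setConnectorInfo perform the KerberosToken -> DelegationToken
    // conversion internally.
    AuthenticationToken delegationToken =
        conn.securityOperations().getDelegationToken(new DelegationTokenConfig());

    JobConf job = new JobConf();
    AccumuloInputFormat.setZooKeeperInstance(job, clientConf);
    AccumuloInputFormat.setConnectorInfo(job, principal, delegationToken);
  }
}
```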

Xu (Simon) Chen wrote:
I actually added a flag such that I can pass in either a KerberosToken
or a DelegationTokenImpl to accumulo.

Actually, when a KerberosToken is passed in, Accumulo converts it to a
DelegationToken - the conversion is where I am having trouble. I tried
passing in a DelegationToken directly to bypass the conversion, but a
similar problem happens: I am stuck at authenticate() on the client
side, and the server side produces the same output...

On Thursday, June 11, 2015, Josh Elser<[email protected]>  wrote:

     Keep in mind that the authentication path for DelegationToken
     (mapreduce) and KerberosToken are completely different.

     Since most mapreduce jobs have multiple mappers (or reducers), I
     expect we would have run into the case that the same DelegationToken
     was used multiple times. It would still be good to narrow down the
     scope of the problem.

     Xu (Simon) Chen wrote:

         Thanks Josh...

         I tested this in scala REPL, and called
         DataStoreFinder.getDataStore()
         multiple times, each time it seems to be reusing the same
         KerberosToken object, and it works fine each time.

         So my problem only happens when the token is used in accumulo's
         mapred
         package. Weird..

         -Simon


         On Thu, Jun 11, 2015 at 5:29 PM, Josh
         Elser<[email protected]>   wrote:

             Simon,

              Can you reproduce this in plain-jane Java code? I don't know
              enough about spark/scala, much less what Geomesa is actually
              doing, to know what the issue is.

              Also, which token are you referring to: a KerberosToken or a
              DelegationToken? Either of them should be usable as many
              times as you'd like (given that the underlying credentials
              are still available for the KT, or that the DT hasn't yet
              expired).


             Xu (Simon) Chen wrote:

                 Folks,

                 I am working on geomesa+accumulo+spark integration. For
                 some reason, I
                 found that the same token cannot be used to authenticate
                 twice.

                 The workflow is that geomesa would try to create a
                 hadoop rdd, during
                 which it tries to create an AccumuloDataStore:


https://github.com/locationtech/geomesa/blob/master/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L81

                 During this process, a ZooKeeperInstance is created:


https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-core/src/main/scala/org/locationtech/geomesa/core/data/AccumuloDataStoreFactory.scala#L177
                 I modified geomesa such that it would use kerberos to
                 authenticate
                 here. This step works fine.

                  But next, geomesa calls ConfiguratorBase.setConnectorInfo:


https://github.com/locationtech/geomesa/blob/rc7_a1.7_h2.5/geomesa-compute/src/main/scala/org/locationtech/geomesa/compute/spark/GeoMesaSpark.scala#L69

                  This is using the same token and the same ZooKeeper URI;
                  for some reason it gets stuck in spark-shell, and the
                  following is output on the tserver side:

                  2015-06-06 18:58:19,616 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
                  java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
                          at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
                          at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
                          at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
                          at java.security.AccessController.doPrivileged(Native Method)
                          at javax.security.auth.Subject.doAs(Subject.java:356)
                          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1622)
                          at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
                          at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
                          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                          at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
                          at java.lang.Thread.run(Thread.java:745)
                  Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
                          at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
                          at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
                          at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
                          at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
                          at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
                          at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
                          at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
                          ... 11 more
                  Caused by: java.net.SocketTimeoutException: Read timed out
                          at java.net.SocketInputStream.socketRead0(Native Method)
                          at java.net.SocketInputStream.read(SocketInputStream.java:152)
                          at java.net.SocketInputStream.read(SocketInputStream.java:122)
                          at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
                          at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
                          at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
                          ... 17 more

                 Any idea why?

                 Thanks.
                 -Simon
