James Srinivasan wrote:
Delegation tokens are serialized into the Job's "credentials" section and
distributed securely that way.
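For reference, a quick way to see what has actually made it into a Job's credentials (just a sketch against the Hadoop 2.x mapreduce API; the helper name is made up):

import org.apache.hadoop.mapreduce.Job
import scala.collection.JavaConverters._

// Illustrative helper: dump whatever tokens are riding along with the Job,
// as opposed to anything stored in the plain-text Configuration.
def listJobTokens(job: Job): Unit =
  job.getCredentials.getAllTokens.asScala.foreach { t =>
    println(s"token kind=${t.getKind} service=${t.getService}")
  }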
Ah, that's my problem. Will probably have to update the GeoMesa code
to work with Jobs rather than Configurations, so that the Credentials
aren't lost.
Hmm, not so easy, it seems. The call stack that triggers the exception
when the credentials are missing from the Job is this:
java.lang.NullPointerException
  at org.apache.accumulo.core.client.mapreduce.lib.impl.ConfiguratorBase.unwrapAuthenticationToken(ConfiguratorBase.java:493)
  at org.apache.accumulo.core.client.mapreduce.AbstractInputFormat.validateOptions(AbstractInputFormat.java:390)
  at org.apache.accumulo.core.client.mapreduce.AbstractInputFormat.getSplits(AbstractInputFormat.java:668)
  at org.locationtech.geomesa.jobs.mapreduce.GeoMesaAccumuloInputFormat.getSplits(GeoMesaAccumuloInputFormat.scala:174)
  at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:121)
  ...
Now org.apache.spark.rdd.NewHadoopRDD.getPartitions does this:
val jobContext = new JobContextImpl(_conf, jobId)
So it doesn't seem to support tokens (i.e. Jobs) being supplied, only Configurations.
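To illustrate the mismatch, here's a standalone sketch (assuming the Hadoop 2.x mapreduce API; this isn't Spark's actual code, and the token alias is a placeholder). Spark is handed only a Configuration and ships a serialized copy of it, and Credentials are not part of a Configuration's key/value properties, so they don't survive:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.{Job, JobID}
import org.apache.hadoop.mapreduce.task.JobContextImpl
import org.apache.hadoop.security.token.{Token, TokenIdentifier}

val job = Job.getInstance(new Configuration())
// Whatever Accumulo's MR helpers add ends up attached to the Job, not the conf:
job.getCredentials.addToken(new Text("example"), new Token[TokenIdentifier]())

// Spark effectively works from a copy of the Configuration alone...
val confOnly = new Configuration(job.getConfiguration)
val jobContext = new JobContextImpl(confOnly, new JobID())

// ...so the rebuilt JobContext has empty Credentials, which is why
// ConfiguratorBase.unwrapAuthenticationToken finds nothing and NPEs.
println(jobContext.getCredentials.getAllTokens.size()) // prints 0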
I can't call AccumuloInputFormat.setConnectorInfo again since it has
already been called, and I presume adding the serialised token to the
Configuration would be insecure?
Yeah, the configuration can't protect sensitive information.
MapReduce/YARN has special handling to make sure those tokens serialized
in the Job's credentials are only readable by you (the job submitter).
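For example (the key name here is hypothetical), anything set on the conf is just another plain-text property, readable by whoever can see the job configuration, e.g. in job.xml, the history server or the web UI:

import org.apache.hadoop.conf.Configuration
import scala.collection.JavaConverters._

val conf = new Configuration(false)
// Hypothetical key: a serialized token stored this way gets no protection.
conf.set("accumulo.mapreduce.delegation.token", "<base64-serialized-token>")

// Anyone (or any code) holding the conf can read it straight back out:
conf.iterator().asScala.foreach(e => println(s"${e.getKey} = ${e.getValue}"))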
The thing I don't entirely follow is how you've gotten into this
situation to begin with. The adding of the delegation tokens to the
Job's credentials should be done by Accumulo's MR code on your behalf
(just like it's obtaining the delegation token, it would automatically
add it to the job for ya).
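Concretely, the intended flow is roughly this (a sketch assuming Accumulo 1.7+'s mapreduce API and an existing Kerberos login; the instance name and ZooKeeper hosts are placeholders):

import org.apache.accumulo.core.client.ClientConfiguration
import org.apache.accumulo.core.client.mapreduce.AbstractInputFormat
import org.apache.accumulo.core.client.security.tokens.KerberosToken
import org.apache.hadoop.mapreduce.Job

val job = Job.getInstance()
AbstractInputFormat.setZooKeeperInstance(job,
  ClientConfiguration.loadDefault().withInstance("myInstance").withZkHosts("zoo1:2181"))

// Given a KerberosToken, setConnectorInfo fetches a DelegationToken and adds it
// to the Job's Credentials for you -- the very thing unwrapAuthenticationToken
// later looks for on the JobContext.
AbstractInputFormat.setConnectorInfo(job, "user@EXAMPLE.COM", new KerberosToken())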
Any chance you can provide an end-to-end example? I am also pretty
Spark-ignorant -- so maybe I just don't understand what is possible and
what isn't.
Yours in puzzlement,
James