On 29 Jun 2015, at 14:18, Dave Ariens 
<dari...@blackberry.com> wrote:

I'd like to toss out another idea that doesn't involve a complete end-to-end 
Kerberos implementation.  Essentially, have the driver authenticate to  
Kerberos, instantiate a Hadoop file system, and serialize/cache it for the 
executors to use instead of them having to instantiate their own.

- Driver authenticates to Kerberos via 
UserGroupInformation.loginUserFromKeytab(principal, keytab)
- Driver instantiates a Hadoop configuration via hdfs-site.xml and core-site.xml
- Driver instantiates the Hadoop file system from a path based on the Hadoop 
root URI (hdfs://hadoop-cluster.site.org/) and hadoop config
- Driver makes this file system available to all future executors
- Executors first check for an existing/cached file system object before 
instantiating their own


Hadoop automatically caches filesystems loaded with FileSystem.get(), unless 
you disable the cache (fs.NAME.impl.disable.cache=true, where NAME is the 
filesystem scheme), so all follow-up FileSystem.get() calls for the same 
filesystem URI and user get the same instance automatically.
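
To make the caching behaviour concrete, here is a toy model in plain Java. It 
is not Hadoop's code: ToyFileSystem and ToyFsCache are made-up names, and the 
real cache key also includes the current user, which this sketch leaves out. 
It only shows the idea of a cache that lives inside one JVM.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a file system client; NOT Hadoop's FileSystem class.
final class ToyFileSystem {
    final URI uri;
    ToyFileSystem(URI uri) { this.uri = uri; }
}

// Toy model of a per-JVM cache keyed by scheme and authority (hypothetical).
final class ToyFsCache {
    // A static map: it exists once per JVM and is never shared between JVMs.
    private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    // Returns the cached instance unless caching is disabled for this call.
    static synchronized ToyFileSystem get(URI uri, boolean disableCache) {
        if (disableCache) {
            return new ToyFileSystem(uri); // fresh instance every time
        }
        String key = uri.getScheme() + "://" + uri.getAuthority();
        return CACHE.computeIfAbsent(key, k -> new ToyFileSystem(uri));
    }

    public static void main(String[] args) {
        URI root = URI.create("hdfs://hadoop-cluster.site.org/");
        System.out.println(get(root, false) == get(root, false)); // true
        System.out.println(get(root, true) == get(root, true));   // false
    }
}
```

Because the map is a static field, an executor in another JVM starts with an 
empty cache of its own, whatever the driver has already loaded.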

...But you can't share that information across JVMs or machines, at least in 
my experience. The non-keytab login happens in the depths of the JVM; the 
keytab login goes through the Hadoop codebase and some JVM-brittle 
introspection into Kerberos implementation classes, code which doesn't 
directly offer shareability.

Delegation tokens are essentially the workaround: the driver creates those 
tokens and hands them off. That's what YARN client apps are expected to do, 
and there's nothing to stop the Mesos code doing the same thing; it's 
just a matter of implementation and (worse) testing.
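
A rough sketch of why tokens can be handed off where a login can't: a token is 
just bytes, so it survives a trip between JVMs. The ToyToken class below is a 
made-up model, not Hadoop's Token class, and its wire layout and the service 
string are placeholders. In Hadoop itself the corresponding calls are, as far 
as I know, FileSystem.addDelegationTokens() on the driver and the Credentials 
read/write methods for shipping them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Toy token: plain bytes plus a service name. NOT Hadoop's Token class.
final class ToyToken {
    final byte[] identifier;
    final byte[] password;
    final String service;

    ToyToken(byte[] identifier, byte[] password, String service) {
        this.identifier = identifier;
        this.password = password;
        this.service = service;
    }

    // Driver side: flatten the token to bytes that can be sent to executors.
    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(identifier.length);
            out.write(identifier);
            out.writeInt(password.length);
            out.write(password);
            out.writeUTF(service);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Executor side: rebuild the token from the bytes it received.
    static ToyToken fromBytes(byte[] data) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] id = new byte[in.readInt()];
            in.readFully(id);
            byte[] pw = new byte[in.readInt()];
            in.readFully(pw);
            return new ToyToken(id, pw, in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ToyToken issued = new ToyToken(new byte[] {1, 2, 3},
                new byte[] {9, 8}, "hadoop-cluster.site.org");
        ToyToken received = fromBytes(issued.toBytes());
        System.out.println(Arrays.equals(issued.identifier, received.identifier));
    }
}
```

The executor never touches a keytab here; it only needs the bytes the driver 
sent, which is the property the Kerberos login state doesn't have.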


-Steve
