Hi,
I'm a committer at the Apache Flink project.
One of our users asked us to add support for reading from a secured HDFS
cluster.
Flink has a master-worker model. Since it's not really feasible for users to
log in with their Kerberos credentials on all workers, I wanted to acquire
the security tokens on the master and send them to all workers.
For that, I wrote the following code to get the tokens into a byte array:
// Point UGI at the Hadoop configuration of the secured cluster.
UserGroupInformation.setConfiguration(hdConf);
Credentials credentials = new Credentials();
// Copy every token held by the current user into the Credentials object.
UserGroupInformation currUsr = UserGroupInformation.getCurrentUser();
Collection<Token<? extends TokenIdentifier>> usrTok = currUsr.getTokens();
for (Token<? extends TokenIdentifier> token : usrTok) {
    final Text id = new Text(token.getIdentifier());
    credentials.addToken(id, token);
}
// Serialize the collected tokens into an in-memory buffer.
DataOutputBuffer dob = new DataOutputBuffer();
credentials.writeTokenStorageToStream(dob);
dob.flush();
However, the collection returned by currUsr.getTokens() is empty, so the
output buffer contains hardly any data.
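For context, this is roughly how I would expect the workers to restore the tokens once the bytes arrive. It is only an untested sketch: the class and method names (WorkerTokenRestore, deserialize, restore) are made up for illustration, and I am assuming Hadoop 2.x, where Credentials.readTokenStorageStream and UserGroupInformation.addCredentials are available:

```java
import java.io.IOException;

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical worker-side helper, not part of Flink or Hadoop.
public class WorkerTokenRestore {

    // Rebuilds a Credentials object from the bytes written on the master
    // with Credentials.writeTokenStorageToStream(...).
    public static Credentials deserialize(byte[] serialized, int length) throws IOException {
        DataInputBuffer in = new DataInputBuffer();
        // Only the first 'length' bytes of the array are valid data.
        in.reset(serialized, length);

        Credentials credentials = new Credentials();
        credentials.readTokenStorageStream(in);
        return credentials;
    }

    // Deserializes the tokens and attaches them to the current user, so that
    // later HDFS calls made as this user can pick them up.
    public static Credentials restore(byte[] serialized, int length) throws IOException {
        Credentials credentials = deserialize(serialized, length);
        UserGroupInformation.getCurrentUser().addCredentials(credentials);
        return credentials;
    }
}
```

The master would pass dob.getData() together with dob.getLength(), since the backing array of a DataOutputBuffer can be longer than the data actually written.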
I suspect that I haven't fully understood the Hadoop security concepts yet.
It would be great if somebody from the list could clarify how to
properly acquire the tokens.
Also, I was wondering whether there is any document describing how the
UserGroupInformation class works (when does it load the
credentials, does it only work with Kerberos, ...).
Best,
Robert