[
https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207237#comment-13207237
]
Daryn Sharp commented on MAPREDUCE-3825:
----------------------------------------
Per an offline request by Sanjay, here's a summary of the proposed changes that
henceforth shall be referred to as "solution 3".
Required {{FileSystem}} APIs:
* {{getFileSystems()}} - proposed new API
** returns the leaf filesystems; by default, returns this filesystem
* {{getCanonicalServiceName()}} - existing API, no change; should consider
reducing visibility
** returns the service of this filesystem's token, or null if it has no
intrinsic token
* {{getDelegationToken(renewer)}} - existing API, no change; should consider
reducing visibility
** returns this filesystem's token; {{token.getService()}} must match
{{getCanonicalServiceName()}}
* {{getDelegationTokens(renewer, credentials)}} - existing API, new in 0.23;
should be the public API for acquiring tokens
** returns tokens not already acquired for this filesystem
** propose adding the new tokens to the supplied creds
* {{getDelegationTokens(renewer)}} - existing API, new in 0.23; proposed
convenience method
** returns all tokens for the filesystem
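To make the leaf contract above concrete, here is a minimal sketch that does not use Hadoop at all. {{SketchFs}}, {{LeafFs}}, {{MountFs}} and {{LeafSketch}} are hypothetical stand-ins invented for illustration, and the service strings are made up. It only shows the intended shape: a plain filesystem is its own single leaf, while a composite has no intrinsic service and reports the de-duplicated leaves of its targets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for FileSystem; this is NOT the Hadoop class.
abstract class SketchFs {
    // Service name of this filesystem's own token, or null if it has none.
    abstract String getCanonicalServiceName();

    // Default behaviour: a plain filesystem is its own single leaf.
    List<SketchFs> getFileSystems() {
        List<SketchFs> list = new ArrayList<>(1);
        list.add(this);
        return list;
    }
}

// A leaf filesystem that owns exactly one token service.
class LeafFs extends SketchFs {
    private final String service;

    LeafFs(String service) { this.service = service; }

    @Override
    String getCanonicalServiceName() { return service; }
}

// A composite (viewfs-like) filesystem: no intrinsic token, only targets.
class MountFs extends SketchFs {
    private final List<SketchFs> targets;

    MountFs(List<SketchFs> targets) { this.targets = targets; }

    @Override
    String getCanonicalServiceName() { return null; }

    @Override
    List<SketchFs> getFileSystems() {
        // A set drops leaves that are reachable through more than one mount.
        Set<SketchFs> leaves = new LinkedHashSet<>();
        for (SketchFs target : targets) {
            leaves.addAll(target.getFileSystems());
        }
        return new ArrayList<>(leaves);
    }
}

class LeafSketch {
    // Collects the service names of all leaves under the given filesystem.
    static List<String> leafServices(SketchFs fs) {
        List<String> services = new ArrayList<>();
        for (SketchFs leaf : fs.getFileSystems()) {
            services.add(leaf.getCanonicalServiceName());
        }
        return services;
    }

    public static void main(String[] args) {
        SketchFs nn1 = new LeafFs("nn1:8020");
        SketchFs nn2 = new LeafFs("nn2:8020");
        SketchFs inner = new MountFs(Arrays.asList(nn1, nn2));
        SketchFs outer = new MountFs(Arrays.asList(inner, nn1));
        System.out.println(leafServices(outer));
    }
}
```

Note that nesting works without special cases: the outer mount asks each target for its leaves, so a mount of mounts still flattens to plain leaf filesystems.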
Changes to:
# {{FilterFileSystem}}
#* Add:{code}
  @Override
  public String getCanonicalServiceName() {
    return null;
  }
  @Override
  public List<FileSystem> getFileSystems() {
    return fs.getFileSystems();
  }{code}
# {{DistributedFileSystem}}
#* Delete {{getDelegationTokens(renewer)}}
# {{ViewFileSystem}}
#* Delete {{getDelegationTokens(renewer)}} and {{getDelegationTokens(renewer,
creds)}}
#* Add:{code}
  @Override
  public String getCanonicalServiceName() {
    return null;
  }
  @Override
  public List<FileSystem> getFileSystems() {
    List<InodeTree.MountPoint<FileSystem>> mountPoints =
        fsState.getMountPoints();
    // use a set so a filesystem mounted more than once is only listed once
    Set<FileSystem> fsSet = new HashSet<FileSystem>();
    for (InodeTree.MountPoint<FileSystem> mountPoint : mountPoints) {
      FileSystem targetFs = mountPoint.target.targetFileSystem;
      fsSet.addAll(targetFs.getFileSystems());
    }
    return new ArrayList<FileSystem>(fsSet);
  }{code}
# {{FileSystem}}
#* Add:{code}
  public List<FileSystem> getFileSystems() {
    List<FileSystem> list = new ArrayList<FileSystem>(1);
    list.add(this);
    return list;
  }{code}
#* Change:{code}
  public final List<Token<?>> getDelegationTokens(String renewer,
      Credentials credentials) throws IOException {
    List<Token<?>> newTokens = new ArrayList<Token<?>>();
    // there shouldn't be dups, but use a set just to be safe
    Set<FileSystem> fsLeafs = new HashSet<FileSystem>(getFileSystems());
    for (FileSystem fs : fsLeafs) {
      String serviceString = fs.getCanonicalServiceName();
      if (serviceString != null) { // null service = no tokens
        Text service = new Text(serviceString);
        Token<?> token = credentials.getToken(service);
        if (token == null) { // we don't have the token, so get it
          token = fs.getDelegationToken(renewer);
          if (token != null) { // add to the return list and to the creds
            newTokens.add(token);
            credentials.addToken(service, token);
          }
        }
      }
    }
    return newTokens;
  }
  // just a convenience method, it's not strictly required.
  public final List<Token<?>> getDelegationTokens(String renewer)
      throws IOException {
    return getDelegationTokens(renewer, new Credentials());
  }{code}
# {{TokenCache}}
#* Change: (note this is a big simplification){code}
  static void obtainTokensForNamenodesInternal(FileSystem fs,
      Credentials credentials, Configuration conf) throws IOException {
    String delegTokenRenewer = Master.getMasterPrincipal(conf);
    if (delegTokenRenewer == null || delegTokenRenewer.length() == 0) {
      throw new IOException(
          "Can't get Master Kerberos principal for use as renewer");
    }
    mergeBinaryTokens(credentials, conf);
    List<Token<?>> tokens = fs.getDelegationTokens(delegTokenRenewer,
        credentials);
    if (tokens != null) {
      for (Token<?> token : tokens) {
        LOG.info("Got dt for " + fs.getUri() +
            ";t.service=" + token.getService());
      }
    }
  }{code}
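The interaction between the credentials and the leaf loop can be sketched without Hadoop as well. Everything below is a hypothetical stand-in: {{DemoFs}} replaces a leaf filesystem, a plain {{Map}} keyed by service replaces {{Credentials}}, and tokens are plain strings. Under those assumptions, it shows that a service already present in the creds (for example, one merged in from a binary token file) is never requested again, and that a second call returns nothing new.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a leaf filesystem; counts token requests.
class DemoFs {
    final String service;   // null means "no intrinsic token"
    int tokenRequests = 0;  // how often a token was actually fetched

    DemoFs(String service) { this.service = service; }

    // Stand-in for getDelegationToken(renewer); the token is just a string.
    String getDelegationToken(String renewer) {
        tokenRequests++;
        return "token(" + service + "," + renewer + ")";
    }
}

class TokenDedupSketch {
    // Follows the proposed loop: fetch only tokens whose service is missing
    // from creds, record them in creds, and return just the new ones.
    static List<String> getDelegationTokens(List<DemoFs> leaves,
            String renewer, Map<String, String> creds) {
        List<String> newTokens = new ArrayList<>();
        for (DemoFs fs : leaves) {
            if (fs.service == null) {
                continue; // null service = no tokens
            }
            if (creds.containsKey(fs.service)) {
                continue; // already acquired, do not ask again
            }
            String token = fs.getDelegationToken(renewer);
            if (token != null) {
                newTokens.add(token);
                creds.put(fs.service, token);
            }
        }
        return newTokens;
    }

    public static void main(String[] args) {
        DemoFs nn1 = new DemoFs("nn1:8020");
        DemoFs nn2 = new DemoFs("nn2:8020");
        DemoFs local = new DemoFs(null);
        List<DemoFs> leaves = Arrays.asList(nn1, nn2, local);

        // Pretend the nn1 token was supplied up front.
        Map<String, String> creds = new HashMap<>();
        creds.put("nn1:8020", "preloaded");

        System.out.println(getDelegationTokens(leaves, "renewer", creds));
        System.out.println(getDelegationTokens(leaves, "renewer", creds));
    }
}
```

Because the creds are both consulted and updated inside the loop, callers such as the simplified {{TokenCache}} method above need no service-key guessing of their own; they only pass the same creds on every call.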
> MR should not be getting duplicate tokens for a MR Job.
> -------------------------------------------------------
>
> Key: MAPREDUCE-3825
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: security
> Affects Versions: 0.23.1, 0.24.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: MAPREDUCE-3825.patch, TokenCache.pdf
>
>
> This is the counterpart to HADOOP-7967.
> MR gets tokens for all input, output and the default filesystem when an MR job
> is submitted.
> The APIs in FileSystem make it challenging to avoid duplicate tokens when
> there are filesystems that have embedded filesystems.
> Here is the original description that Daryn wrote:
> The token cache currently tries to assume a filesystem's token service key.
> The assumption generally worked while there was a one-to-one mapping of
> filesystem to token. With the advent of multi-token filesystems like viewfs,
> the token cache will try to use a service key (i.e., for viewfs) that will
> never exist (because it really gets the mounted fs tokens).
> The descriop