[ https://issues.apache.org/jira/browse/MAPREDUCE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870396#comment-13870396 ]

Siddharth Seth commented on MAPREDUCE-5663:
-------------------------------------------

That's two sets of tokens that are obtained: one for the working directory, and one for any additional HDFS servers which the user may have configured. In addition to this, tokens may be obtained by the Input/OutputFormats.

From FileInputFormat:
{code}
Path[] dirs = getInputPaths(job);
if (dirs.length == 0) {
  throw new IOException("No input paths specified in job");
}
// get tokens for all the required FileSystems..
TokenCache.obtainTokensForNamenodes(job.getCredentials(), dirs, job.getConfiguration());
{code}
getInputPaths reads the property "mapreduce.input.fileinputformat.inputdir", which is specific to FIF. If the input paths reside on a different Namenode than the one that holds the staging directory, I don't think users need to set MRJobConfig.JOB_NAMENODES. The tokens would just be picked up as part of client-side split generation.

In terms of Oozie, from what I understand, the JobSubmitter does not get invoked on a box with Kerberos credentials (not for the main job anyway, maybe for the launcher), so this code to obtain tokens doesn't kick in. If that's the case, my guess is that Oozie has additional configuration, and explicitly goes out and fetches tokens before submitting the launcher.

> Add an interface to Input/Ouput Formats to obtain delegation tokens
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5663
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5663
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Michael Weng
>         Attachments: MAPREDUCE-5663.4.txt, MAPREDUCE-5663.5.txt,
> MAPREDUCE-5663.6.txt, MAPREDUCE-5663.patch.txt, MAPREDUCE-5663.patch.txt2,
> MAPREDUCE-5663.patch.txt3
>
>
> Currently, delegation tokens are obtained as part of the getSplits /
> checkOutputSpecs calls to the InputFormat / OutputFormat respectively.
> This works as long as the splits are generated on a node with kerberos
> credentials. For split generation elsewhere (AM for example), an explicit
> interface is required.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
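
To illustrate the point in the comment that input paths on a different Namenode are covered without extra configuration, here is a rough sketch of the idea that one token is needed per distinct file system among the input paths. It is a simplified model written against the JDK only, not Hadoop's implementation: the class TokenTargets, the method distinctFileSystems, and the example host names are all hypothetical, and the real TokenCache.obtainTokensForNamenodes resolves FileSystem instances rather than comparing URI strings.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: models "one token per distinct file system".
class TokenTargets {

    // Returns the distinct scheme://authority prefixes of the given input paths,
    // in first-seen order. Paths without a scheme or authority are skipped here,
    // on the assumption that they resolve against the default file system.
    static Set<String> distinctFileSystems(List<String> inputPaths) {
        Set<String> targets = new LinkedHashSet<>();
        for (String p : inputPaths) {
            URI uri = URI.create(p);
            if (uri.getScheme() != null && uri.getAuthority() != null) {
                targets.add(uri.getScheme() + "://" + uri.getAuthority());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Example input directories spread over two (made-up) Namenodes.
        List<String> dirs = Arrays.asList(
            "hdfs://nn1.example.com:8020/data/a",
            "hdfs://nn1.example.com:8020/data/b",
            "hdfs://nn2.example.com:8020/logs");
        System.out.println(distinctFileSystems(dirs));
    }
}
```

In this model the two paths on nn1 collapse into a single entry, so only two file systems would need tokens. That mirrors why the input directories alone are enough to find the remote Namenode during client-side split generation.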