[
https://issues.apache.org/jira/browse/PIG-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113672#comment-14113672
]
liyunzhang commented on PIG-3507:
---------------------------------
The problem that Pig cannot work in local mode with Kerberos only exists in
Hadoop 2.

As the root cause in the bug description says: when running Pig in local mode,
jobConf in HExecutionEngine is initialized with core-default.xml
(hadoop.security.authentication = simple), mapred-default.xml, and
yarn-default.xml. However, these settings are not passed to
UserGroupInformation, so UserGroupInformation.isSecurityEnabled() still
returns true. That is why obtainTokensForNamenodesInternal() is called from
obtainTokensForNamenodes(), which causes the exception.
{{org.apache.hadoop.mapreduce.security.TokenCache#obtainTokensForNamenodes}}
{code:java}
public static void obtainTokensForNamenodes(Credentials credentials, Path[] ps,
    Configuration conf) throws IOException {
  if (!UserGroupInformation.isSecurityEnabled()) {
    return;
  }
  obtainTokensForNamenodesInternal(credentials, ps, conf);
}
{code}
In Hadoop 1.2.1: in obtainTokensForNamenodesInternal(), if a path is a local
path, its fsName (the canonical service name of the file system) is null and
the loop continues, so the Kerberos credential checks are never executed for
local paths.
{{org.apache.hadoop.mapreduce.security.TokenCache#obtainTokensForNamenodesInternal}}
{code:java}
static void obtainTokensForNamenodesInternal(Credentials credentials,
                                             Path[] ps,
                                             Configuration conf
                                             ) throws IOException {
  // get jobtracker principal id (for the renewer)
  KerberosName jtKrbName =
      new KerberosName(conf.get(JobTracker.JT_USER_NAME, ""));
  String delegTokenRenewer = jtKrbName.getShortName();
  boolean readFile = true;
  for (Path p : ps) {
    FileSystem fs = FileSystem.get(p.toUri(), conf);
    String fsName = fs.getCanonicalServiceName();
    if (fsName == null) {
      continue;
    }
    if (TokenCache.getDelegationToken(credentials, fsName) == null) {
      //TODO: Need to come up with a better place to put
      //this block of code to do with reading the file
      if (readFile) {
        readFile = false;
        String binaryTokenFilename =
            conf.get(MAPREDUCE_JOB_CREDENTIALS_BINARY);
        if (binaryTokenFilename != null) {
          Credentials binary;
          try {
            binary = Credentials.readTokenStorageFile(
                new Path("file:///" + binaryTokenFilename), conf);
          } catch (IOException e) {
            throw new RuntimeException(e);
          }
          credentials.addAll(binary);
        }
        if (TokenCache.getDelegationToken(credentials, fsName) != null) {
          LOG.debug("DT for " + fsName + " is already present");
          continue;
        }
      }
      Token<?> token = fs.getDelegationToken(delegTokenRenewer);
      if (token != null) {
        Text fsNameText = new Text(fsName);
        credentials.addToken(fsNameText, token);
        LOG.info("Got dt for " + p + ";uri=" + fsName +
            ";t.service=" + token.getService());
      }
    }
  }
}
{code}
In Hadoop 2.3: in obtainTokensForNamenodesInternal(), the per-filesystem
overload is always executed, whether or not the path is local, and the
Kerberos principal check fails in local mode before the file system's service
name is ever inspected.
{{org.apache.hadoop.mapreduce.security.TokenCache#obtainTokensForNamenodesInternal}}
{code:java}
static void obtainTokensForNamenodesInternal(Credentials credentials,
    Path[] ps, Configuration conf) throws IOException {
  Set<FileSystem> fsSet = new HashSet<FileSystem>();
  for (Path p : ps) {
    fsSet.add(p.getFileSystem(conf));
  }
  for (FileSystem fs : fsSet) {
    obtainTokensForNamenodesInternal(fs, credentials, conf);
  }
}

static void obtainTokensForNamenodesInternal(FileSystem fs,
    Credentials credentials, Configuration conf) throws IOException {
  String delegTokenRenewer = Master.getMasterPrincipal(conf);
  if (delegTokenRenewer == null || delegTokenRenewer.length() == 0) {
    throw new IOException(
        "Can't get Master Kerberos principal for use as renewer");
  }
  mergeBinaryTokens(credentials, conf);
  final Token<?>[] tokens =
      fs.addDelegationTokens(delegTokenRenewer, credentials);
  if (tokens != null) {
    for (Token<?> token : tokens) {
      LOG.info("Got dt for " + fs.getUri() + "; " + token);
    }
  }
}
{code}
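To make the behavioral difference concrete, here is a minimal, self-contained
Java sketch that simulates the two control flows. The helpers
canonicalServiceName() and masterPrincipal() are hypothetical stand-ins for
fs.getCanonicalServiceName() and Master.getMasterPrincipal(conf), not the real
Hadoop APIs; only the ordering of the null-service skip versus the renewer
lookup is taken from the code above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class TokenCacheSketch {

    // Hypothetical stand-in for fs.getCanonicalServiceName():
    // the local file system has no token service, so it yields null.
    static String canonicalServiceName(String scheme) {
        return "file".equals(scheme) ? null : scheme + "://namenode:8020";
    }

    // Hypothetical stand-in for Master.getMasterPrincipal(conf): with only
    // the *-default.xml files loaded, no principal is configured.
    static String masterPrincipal(Map<String, String> conf) {
        return conf.get("yarn.resourcemanager.principal");
    }

    // Hadoop 1-style flow: a null service name is skipped BEFORE any
    // Kerberos-related lookup, so local paths never trigger the check.
    static List<String> obtainTokensHadoop1(List<String> schemes) {
        List<String> services = new ArrayList<String>();
        for (String scheme : schemes) {
            String fsName = canonicalServiceName(scheme);
            if (fsName == null) {
                continue; // local path: silently skipped
            }
            services.add(fsName);
        }
        return services;
    }

    // Hadoop 2-style flow: the renewer principal is resolved first, so even
    // a purely local job fails when no principal is configured.
    static List<String> obtainTokensHadoop2(List<String> schemes,
            Map<String, String> conf) throws IOException {
        List<String> services = new ArrayList<String>();
        for (String scheme : schemes) {
            String renewer = masterPrincipal(conf);
            if (renewer == null || renewer.isEmpty()) {
                throw new IOException(
                        "Can't get Master Kerberos principal for use as renewer");
            }
            services.add(canonicalServiceName(scheme));
        }
        return services;
    }

    public static void main(String[] args) {
        List<String> localOnly = Arrays.asList("file");
        // Hadoop 1 path: local-only job succeeds with no tokens collected.
        System.out.println("hadoop1: " + obtainTokensHadoop1(localOnly));
        // Hadoop 2 path: the same job fails up front.
        try {
            obtainTokensHadoop2(localOnly,
                    java.util.Collections.<String, String>emptyMap());
        } catch (IOException e) {
            System.out.println("hadoop2: " + e.getMessage());
        }
    }
}
```

The sketch reproduces the symptom reported below: the Hadoop 1 flow returns an
empty token list for local paths, while the Hadoop 2 flow throws "Can't get
Master Kerberos principal for use as renewer" before ever looking at the path.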
As [~rohini] said:
{quote}We should fix the bin/pig script to not load HADOOP_CONF_DIR in local mode.{quote}
That is what the patch does: it skips loading the configurations in
HADOOP_CONF_DIR when running in local mode.
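The actual change is in the attached patch; purely as an illustration, the
guard in a launcher script could look roughly like the following shell sketch
(the build_classpath function and its arguments are hypothetical, not the
patch's actual code):

```shell
#!/bin/sh
# Hypothetical sketch, not the attached PIG-3507.patch: only put
# HADOOP_CONF_DIR on the classpath when NOT running in local mode.

build_classpath() {
    exectype="$1"    # e.g. "local" or "mapreduce", as passed to pig -x
    base_cp="$2"     # classpath built so far
    conf_dir="$3"    # value of HADOOP_CONF_DIR, possibly empty
    if [ "$exectype" = "local" ] || [ -z "$conf_dir" ]; then
        # local mode: skip the cluster configuration entirely, so the
        # kerberized core-site.xml never reaches UserGroupInformation
        echo "$base_cp"
    else
        echo "$base_cp:$conf_dir"
    fi
}

build_classpath local pig.jar /etc/hadoop/conf      # prints pig.jar
build_classpath mapreduce pig.jar /etc/hadoop/conf  # prints pig.jar:/etc/hadoop/conf
```

Keeping the cluster configuration out of the local-mode classpath means
UserGroupInformation is initialized with simple authentication, so
obtainTokensForNamenodes() returns early and the renewer check is never hit.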
> It fails to run pig in local mode on a Kerberos enabled Hadoop cluster
> ----------------------------------------------------------------------
>
> Key: PIG-3507
> URL: https://issues.apache.org/jira/browse/PIG-3507
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.10.0, 0.11
> Reporter: chiyang
> Assignee: chiyang
> Fix For: 0.14.0
>
> Attachments: PIG-3507.patch
>
>
> It fails to run pig in local mode on a Kerberos enabled Hadoop cluster
> *Command*
> pig -x local <pig script>
>
> *Pig script*
> A = load '/etc/passwd';
> dump A;
>
> *Root cause*
> When running pig in local mode, jobConf in HExecutionEngine is initialized
> with core-default.xml (hadoop.security.authentication = simple),
> mapred-default.xml, and yarn-default.xml. However, the settings are not
> passed to UserGroupInformation. That is why obtainTokensForNamenodesInternal()
> is called from obtainTokensForNamenodes(), which causes the exception.
> {noformat}
> public static void obtainTokensForNamenodes(Credentials credentials,
>     Path[] ps, Configuration conf) throws IOException {
>   if (!UserGroupInformation.isSecurityEnabled()) {
>     return;
>   }
>   obtainTokensForNamenodesInternal(credentials, ps, conf);
> }
> {noformat}
> *Error*
> Pig Stack Trace
> ---------------
> ERROR 6000: Output Location Validation Failed for: 'file:/tmp/temp-308998488/tmp-2025176494 More info to follow:
> Can't get JT Kerberos principal for use as renewer
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
>     at org.apache.pig.PigServer.openIterator(PigServer.java:841)
>     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>     at org.apache.pig.Main.run(Main.java:604)
>     at org.apache.pig.Main.main(Main.java:157)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias A
>     at org.apache.pig.PigServer.storeEx(PigServer.java:940)
>     at org.apache.pig.PigServer.store(PigServer.java:903)
>     at org.apache.pig.PigServer.openIterator(PigServer.java:816)
>     ... 12 more
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000: Output Location Validation Failed for: 'file:/tmp/temp-308998488/tmp-2025176494 More info to follow:
> Can't get JT Kerberos principal for use as renewer
>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:95)
>     at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>     at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:288)
>     at org.apache.pig.PigServer.compilePp(PigServer.java:1327)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1252)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:936)
>     ... 14 more
> Caused by: java.io.IOException: Can't get JT Kerberos principal for use as renewer
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:129)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85)
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127)
>     at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
>     ... 24 more
> ================================================================================
--
This message was sent by Atlassian JIRA
(v6.2#6252)