[
https://issues.apache.org/jira/browse/PIG-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136898#comment-14136898
]
liyunzhang_intel commented on PIG-3507:
---------------------------------------
Hi [~rohini]
Thanks for your comment. Kerberos security is widely used in Hadoop, and this
bug was also hit by other end users (see
http://www.ghostar.org/2014/05/pig-local-mode-fails-kerberos-auth-enabled/).
Detailed steps to reproduce the bug:
1. build a kerberos env
2. build a hadoop1 env
{noformat}
$ which hadoop
/home/zly/prj/oss/hadoop-1.2.1/bin/hadoop
$ grep -C2 kerberos /home/zly/prj/oss/hadoop-1.2.1/conf/core-site.xml
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
{noformat}
3. build pig and run it in local mode
{noformat}
$ cd $PIG_HOME/bin
$ ./pig -x local id.pig
$ ps -ef | grep pig
root 12126 10072 0 14:42 pts/4 00:00:01 /usr/java/jdk1.7.0_51//bin/java
-Dproc_jar -Xmx1000m -Xmx1000m -Dpig.log.dir=/home/zly/prj/oss/pig/logs
-Dpig.log.file=pig.log -Dpig.home.dir=/home/zly/prj/oss/pig -Xmx1000m
-Dpig.log.dir=/home/zly/prj/oss/pig/logs -Dpig.log.file=pig.log
-Dpig.home.dir=/home/zly/prj/oss/pig
-Dhadoop.log.dir=/home/zly/prj/oss/hadoop-1.2.1/libexec/../logs
-Dhadoop.log.file=hadoop.log
-Dhadoop.home.dir=/home/zly/prj/oss/hadoop-1.2.1/libexec/.. -Dhadoop.id.str=
-Dhadoop.root.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender
-Djava.library.path=/home/zly/prj/oss/hadoop-1.2.1/libexec/../lib/native/Linux-amd64-64
-Dhadoop.policy.file=hadoop-policy.xml
-Xrunjdwp:transport=dt_socket,server=y,address=9999 -classpath
/home/zly/prj/oss/hadoop-1.2.1/conf:/home/zly/prj/oss/pig/conf:/usr/java/jdk1.7.0_51/lib/tools.jar:/home/zly/prj/oss/pig/build/ivy/lib/Pig/*:/home/zly/prj/oss/hadoop-1.2.1/conf:/home/zly/prj/oss/hadoop-1.2.1/conf:/home/zly/prj/oss/pig/lib/ST4-4.0.4.jar:/home/zly/prj/oss/pig/lib/accumulo-core-1.5.0.jar:......
{noformat}
*/home/zly/prj/oss/hadoop-1.2.1/conf is in the classpath*
*Why does UserGroupInformation set "hadoop.security.authentication" to "kerberos"?*
{code}
// UserGroupInformation#ensureInitialized
private static synchronized void ensureInitialized() {
  if (!isInitialized) {
    initialize(new Configuration());
  }
}
{code}
{code}
// UserGroupInformation#initialize (excerpt)
private static synchronized void initialize(Configuration conf) {
  String value = conf.get(HADOOP_SECURITY_AUTHENTICATION);
  if (value == null || "simple".equals(value)) {
    useKerberos = false;
  } else if ("kerberos".equals(value)) {
    useKerberos = true;
  } else {
    throw new IllegalArgumentException("Invalid attribute value for " +
                                       HADOOP_SECURITY_AUTHENTICATION +
                                       " of " + value);
  }
  // ...
}
{code}
The value of conf.get(HADOOP_SECURITY_AUTHENTICATION) decides between "simple"
and "kerberos".
Referred:
https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/conf/Configuration.html
{quote}
Configurations are specified by resources. A resource contains a set of
name/value pairs as XML data. Each resource is named by either a String or by a
Path. If named by a String, then the classpath is examined for a file with that
name. If named by a Path, then the local filesystem is examined directly,
without referring to the classpath.

Unless explicitly turned off, Hadoop by default specifies two resources, loaded
in-order from the classpath:
* core-default.xml: Read-only defaults for hadoop.
* core-site.xml: Site-specific configuration for a given hadoop installation.
{quote}
*Both core-default.xml (inside the hadoop jar) and core-site.xml (from
$HADOOP_HOME/conf/ on the classpath) are loaded, and core-site.xml overrides
the default.* So the value of "hadoop.security.authentication" is "kerberos".
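The String-named resource lookup can be demonstrated with plain JDK calls (a sketch; `ClasspathConfDemo` and `locate` are hypothetical names of mine, approximating what Configuration does for each String-named resource):

```java
import java.net.URL;

public class ClasspathConfDemo {
    // Approximates how Hadoop's Configuration resolves a String-named
    // resource: scan the classpath and return the first match, or null.
    static URL locate(String name) {
        return Thread.currentThread().getContextClassLoader().getResource(name);
    }

    public static void main(String[] args) {
        // With $HADOOP_HOME/conf on the classpath this prints that directory's
        // core-site.xml; without it, Configuration falls back to the
        // core-default.xml bundled in the hadoop jar ("simple" auth).
        System.out.println(locate("core-site.xml"));
    }
}
```

This is why merely having $HADOOP_HOME/conf on pig's classpath is enough for a "local mode" run to pick up the cluster's Kerberos setting.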
By contrast, if you do not build a hadoop env and only deploy Kerberos:
1. build a kerberos env
2. build pig and run it in local mode:
{noformat}
$ cd $PIG_HOME/bin; ./pig -x local id.pig
{noformat}
These error messages are found:
{noformat}
exec /usr/java/jdk1.7.0_51/bin/java -Xmx1000m
-Dpig.log.dir=/home/zly/prj/oss/pig/bin/../logs -Dpig.log.file=pig.log
-Dpig.home.dir=/home/zly/prj/oss/pig/bin/..
-Xrunjdwp:transport=dt_socket,server=y,address=9999 -classpath
/home/zly/prj/oss/pig/bin/../conf:/usr/java/jdk1.7.0_51/lib/tools.jar:/home/zly/prj/oss/pig/bin/../lib/ST4-4.0.4.jar:/home/zly/prj/oss/pig/bin/../lib/accumulo-core-1.5.0.jar:/home/zly/prj/oss/pig/bin/../lib/accumulo-fate-1.5.0.jar:/home/zly/prj/oss/pig/bin/../lib/accumulo-server-1.5.0.jar:/home/zly/prj/oss/pig/bin/../lib/accumulo-start-1.5.0.jar:/home/zly/prj/oss/pig/bin/../lib/accumulo-trace-1.5.0.jar:/home/zly/prj/oss/pig/bin/../lib/antlr-runtime-3.4.jar:/home/zly/prj/oss/pig/bin/../lib/asm-4.0.jar:/home/zly/prj/oss/pig/bin/../lib/asm-commons-4.0.jar:/home/zly/prj/oss/pig/bin/../lib/asm-tree-4.0.jar:/home/zly/prj/oss/pig/bin/../lib/automaton-1.11-8.jar:/home/zly/prj/oss/pig/bin/../lib/avro-1.7.5.jar:/home/zly/prj/oss/pig/bin/../lib/avro-tools-1.7.5-nodeps.jar:/home/zly/prj/oss/pig/bin/../lib/groovy-all-1.8.6.jar:/home/zly/prj/oss/pig/bin/../lib/guava-14.0.1.jar:/home/zly/prj/oss/pig/bin/../lib/hive-common-0.14.0-SNAPSHOT.jar:/home/zly/prj/oss/pig/bin/../lib/hive-exec-0.14.0-SNAPSHOT-core.jar:/home/zly/prj/oss/pig/bin/../lib/hive-serde-0.14.0-SNAPSHOT.jar:/home/zly/prj/oss/pig/bin/../lib/hive-shims-common-0.14.0-SNAPSHOT.jar:/home/zly/prj/oss/pig/bin/../lib/hive-shims-common-secure-0.14.0-SNAPSHOT.jar:/home/zly/prj/oss/pig/bin/../lib/jackson-core-asl-1.8.8.jar:/home/zly/prj/oss/pig/bin/../lib/jackson-mapper-asl-1.8.8.jar:/home/zly/prj/oss/pig/bin/../lib/jansi-1.9.jar:/home/zly/prj/oss/pig/bin/../lib/jline-1.0.jar:/home/zly/prj/oss/pig/bin/../lib/joda-time-2.1.jar:/home/zly/prj/oss/pig/bin/../lib/jruby-complete-1.6.7.jar:/home/zly/prj/oss/pig/bin/../lib/js-1.7R2.jar:/home/zly/prj/oss/pig/bin/../lib/json-simple-1.1.jar:/home/zly/prj/oss/pig/bin/../lib/jython-standalone-2.5.3.jar:/home/zly/prj/oss/pig/bin/../lib/protobuf-java-2.4.1-shaded.jar:/home/zly/prj/oss/pig/bin/../lib/protobuf-java-2.4.1.jar:/home/zly/prj/oss/pig/bin/../lib/protobuf-java-2.5.0.jar:/home/zly/prj/oss/pig/bin/../lib/snappy-java-1.0.5.jar:/home/zly/prj/oss/pig/bin/../lib/trevni-avro-1.7.5.jar:/home/zly/prj/oss/pig/bin/../lib/trevni-core-1.7.5.jar:/home/zly/prj/oss/pig/bin/../lib/zookeeper-3.4.5.jar:/home/zly/prj/oss/pig/bin/../pig-0.14.0-SNAPSHOT-core-h1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/avro-mapred-1.7.5.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hbase-client-0.96.0-hadoop1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hbase-common-0.96.0-hadoop1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hbase-hadoop-compat-0.96.0-hadoop1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hbase-hadoop1-compat-0.96.0-hadoop1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hbase-protocol-0.96.0-hadoop1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hbase-server-0.96.0-hadoop1.jar:/home/zly/prj/oss/pig/bin/../lib/h1/hive-shims-0.20S-0.14.0-SNAPSHOT.jar
org.apache.pig.Main -x local id.pig
Listening for transport dt_socket at address: 9999
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.apache.pig.Main.<clinit>(Main.java:100)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 1 more
Exception in thread "Thread-0" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/LocalFileSystem
    at org.apache.pig.Main$1.run(Main.java:95)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.LocalFileSystem
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 1 more
{noformat}
> It fails to run pig in local mode on a Kerberos enabled Hadoop cluster
> ----------------------------------------------------------------------
>
> Key: PIG-3507
> URL: https://issues.apache.org/jira/browse/PIG-3507
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.10.0, 0.11
> Reporter: chiyang
> Assignee: liyunzhang_intel
> Fix For: 0.14.0
>
> Attachments: PIG-3507.patch, PIG_3507_1.patch
>
>
> It fails to run pig in local mode on a Kerberos enabled Hadoop cluster
> *Command*
> pig -x local <pig script>
>
> *Pig script*
> A = load '/etc/passwd';
> dump A;
>
> *Root cause*
> When running pig in local mode, jobConf in HExecutionEngine is initiated with
> core-default.xml (hadoop.security.authentication = simple),
> mapred-default.xml, and yarn-default.xml. However, these settings are not
> passed to UserGroupInformation, so UserGroupInformation.isSecurityEnabled()
> still returns true; obtainTokensForNamenodesInternal() is therefore called
> from obtainTokensForNamenodes(), which causes the exception below.
> {noformat}
> public static void obtainTokensForNamenodes(Credentials credentials, Path[]
> ps, Configuration conf) throws IOException {
> if (!UserGroupInformation.isSecurityEnabled()) {
> return;
> }
> obtainTokensForNamenodesInternal(credentials, ps, conf);
> }
> {noformat}
> *Error*
> Pig Stack Trace
> ---------------
> ERROR 6000: Output Location Validation Failed for:
> 'file:/tmp/temp-308998488/tmp-2025176494 More info to follow:
> Can't get JT Kerberos principal for use as renewer
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
> open iterator for alias A
> at org.apache.pig.PigServer.openIterator(PigServer.java:841)
> at
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
> at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
> at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
> at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:604)
> at org.apache.pig.Main.main(Main.java:157)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias A
> at org.apache.pig.PigServer.storeEx(PigServer.java:940)
> at org.apache.pig.PigServer.store(PigServer.java:903)
> at org.apache.pig.PigServer.openIterator(PigServer.java:816)
> ... 12 more
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000: Output
> Location Validation Failed for: 'file:/tmp/temp-308998488/tmp-2025176494 More
> info to follow:
> Can't get JT Kerberos principal for use as renewer
> at
> org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:95)
> at
> org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
> at
> org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
> at
> org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
> at
> org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
> at
> org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
> at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:288)
> at org.apache.pig.PigServer.compilePp(PigServer.java:1327)
> at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1252)
> at org.apache.pig.PigServer.storeEx(PigServer.java:936)
> ... 14 more
> Caused by: java.io.IOException: Can't get JT Kerberos principal for use as
> renewer
> at
> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:129)
> at
> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:111)
> at
> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:85)
> at
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127)
> at
> org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:80)
> ... 24 more
> ================================================================================