hcatalog.jar is independent of libraries like metastore and thrift when it's
running on the slaves side of a cluster
--------------------------------------------------------------------------------------------------------------------
Key: HCATALOG-137
URL: https://issues.apache.org/jira/browse/HCATALOG-137
Project: HCatalog
Issue Type: Improvement
Components: pig
Affects Versions: 0.2
Reporter: Min Zhou
Priority: Critical
At present, if we run a Pig script like the one below without registering
hive-metastore.jar or libthrift.jar:
{noformat}
A = LOAD 'orders' USING org.apache.hcatalog.pig.HCatLoader();
B = FOREACH A GENERATE o_custkey;
C = LIMIT B 10;
DUMP C;
{noformat}
each mapper throws an exception like the one below:
{noformat}
java.lang.RuntimeException: could not instantiate 'org.apache.hcatalog.pig.HCatLoader' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:504)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:154)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:594)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
    at org.apache.hadoop.mapred.Child.main(Child.java:156)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/NoSuchObjectException
    at org.apache.hcatalog.pig.HCatLoader.(HCatLoader.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at java.lang.Class.newInstance0(Class.java:355)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:474)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.NoSuchObjectException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 13 more
{noformat}
In theory, the Hive metastore and Thrift libraries are needed by
HCatLoader/HCatStorer only when they run on the client side; they are not
actually needed on the slave side. Scripts should not have to register those
jars, and the jars should not have to be distributed to every node that MR
tasks run on.
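For reference, the workaround that scripts currently need looks roughly like
the sketch below. The two jar paths are hypothetical placeholders, not taken
from any real installation, and would have to be adjusted to wherever the Hive
metastore and Thrift jars live locally.
{noformat}
-- Workaround sketch: register the client-side dependencies explicitly so that
-- Pig ships them to the task nodes. Both paths are hypothetical placeholders.
REGISTER /path/to/hive-metastore.jar;
REGISTER /path/to/libthrift.jar;

A = LOAD 'orders' USING org.apache.hcatalog.pig.HCatLoader();
B = FOREACH A GENERATE o_custkey;
C = LIMIT B 10;
DUMP C;
{noformat}
With the improvement requested here, the two REGISTER lines would no longer be
required.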