[
https://issues.apache.org/jira/browse/HCATALOG-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Min Zhou updated HCATALOG-137:
------------------------------
Description:
At present, suppose we run a Pig script like the one below without registering
hive-metastore.jar or libthrift.jar:
{noformat}
A = LOAD 'orders' USING org.apache.hcatalog.pig.HCatLoader();
B = FOREACH A GENERATE o_custkey;
C = LIMIT B 10;
DUMP C;
{noformat}
Each mapper then throws an exception like the one below:
{noformat}
java.lang.RuntimeException: could not instantiate 'org.apache.hcatalog.pig.HCatLoader' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:504)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:154)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:594)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
    at org.apache.hadoop.mapred.Child.main(Child.java:156)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/NoSuchObjectException
    at org.apache.hcatalog.pig.HCatLoader.(HCatLoader.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at java.lang.Class.newInstance0(Class.java:355)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:474)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.NoSuchObjectException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 13 more
{noformat}
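The innermost cause in the trace above is a ClassNotFoundException for a Hive metastore class, which suggests the task JVM's classpath simply does not contain the metastore jar. As a rough diagnostic sketch, one could probe whether a given class name resolves on the current classpath. The class name ClasspathProbe and the method isLoadable are hypothetical and are not part of HCatalog, Pig, or Hadoop; only the two probed class names come from the trace.

```java
public class ClasspathProbe {

    /**
     * Returns true if the named class can be located by the class loader
     * that loaded ClasspathProbe. Hypothetical helper for illustration only.
     */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, raised when the class is
            // found but one of its own dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in this issue.
        String[] names = {
            "org.apache.hadoop.hive.metastore.api.NoSuchObjectException",
            "org.apache.hcatalog.pig.HCatLoader"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Run with the same classpath as a map task, a probe like this would be expected to report the metastore class as missing whenever hive-metastore.jar has not been registered, matching the failure reported here.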
In theory, the Hive metastore and Thrift libraries are needed by
HCatLoader/HCatStorer only when they run on the client side; they are not
actually needed on the slave side. Registering those jars in scripts should
therefore be unnecessary, and the jars shouldn't have to be distributed to the
nodes where MR tasks run.
was:
At present, suppose we run a Pig script like the one below without registering
hive-metastore.jar or libthrift.jar:
{noformat}
A = LOAD 'orders' USING org.apache.hcatalog.pig.HCatLoader();
B = FOREACH A GENERATE o_custkey;
C = LIMIT B 10;
DUMP C;
{noformat}
Each mapper then throws an exception like the one below:
{noformat}
java.lang.RuntimeException: could not instantiate 'org.apache.hcatalog.pig.HCatLoader' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:504)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:154)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:594)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
    at org.apache.hadoop.mapred.Child.main(Child.java:156)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/NoSuchObjectException
    at org.apache.hcatalog.pig.HCatLoader.(HCatLoader.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at java.lang.Class.newInstance0(Class.java:355)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:474)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.NoSuchObjectException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 13 more
{noformat}
In theory, the Hive metastore and Thrift libraries are needed by
HCatLoader/HCatStorer only when they run on the client side; they are not
actually needed on the slave side. Registering those jars in scripts should
therefore be unnecessary, and the jars shouldn't have to be distributed to
every node on which MR tasks run.
> hcatalog.jar is independent of libraries like metastore and thrift when it's
> running on the slaves side of a cluster
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HCATALOG-137
> URL: https://issues.apache.org/jira/browse/HCATALOG-137
> Project: HCatalog
> Issue Type: Improvement
> Components: pig
> Affects Versions: 0.2
> Reporter: Min Zhou
> Priority: Critical