[ 
https://issues.apache.org/jira/browse/HCATALOG-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Min Zhou updated HCATALOG-137:
------------------------------

    Description: 
At present, if we run a Pig script like the one below without registering 
hive-metastore.jar or libthrift.jar, the job fails:
{noformat}
A = LOAD 'orders' USING org.apache.hcatalog.pig.HCatLoader(); 
B = FOREACH A GENERATE o_custkey;
C = LIMIT B 10;
DUMP C; 
{noformat}

Each mapper throws an exception like the following:
{noformat}
java.lang.RuntimeException: could not instantiate 'org.apache.hcatalog.pig.HCatLoader' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:504)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:154)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:594)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
    at org.apache.hadoop.mapred.Child.main(Child.java:156)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/NoSuchObjectException
    at org.apache.hcatalog.pig.HCatLoader.<init>(HCatLoader.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at java.lang.Class.newInstance0(Class.java:355)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:474)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.NoSuchObjectException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 13 more
{noformat}

In theory, the Hive metastore and Thrift libraries are needed by 
HCatLoader/HCatStorer when they run on the client side. However, they have no 
use on the slave side. Scripts should not have to register those jars, and the 
jars should not be distributed to the nodes where MR tasks run.
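
To see which side of a job is missing a dependency, a small probe like the 
one below could be run on the client and on a task node. This is only a 
sketch: ClasspathProbe and isLoadable are hypothetical names, not part of 
HCatalog or Pig. It asks the class loader for a class by name without 
initializing it, which is the lookup that ends in the 
ClassNotFoundException shown in the trace above.

```java
// Hypothetical diagnostic helper; not part of HCatalog, Pig, or Hive.
final class ClasspathProbe {

    /**
     * Returns true if the named class can be located by this class's
     * class loader. The class is loaded but not initialized.
     */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError also covers NoClassDefFoundError raised while
            // resolving a class that the requested class depends on.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in this issue.
        String[] names = {
            "org.apache.hadoop.hive.metastore.api.NoSuchObjectException",
            "org.apache.hcatalog.pig.HCatLoader"
        };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

If the metastore class is loadable on the client but not on a task node, 
that matches the failure described here: the loader is instantiated on the 
slave side, where the jar was never shipped.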



  was:
At present, if we run a Pig script like the one below without registering 
hive-metastore.jar or libthrift.jar, the job fails:
{noformat}
A = LOAD 'orders' USING org.apache.hcatalog.pig.HCatLoader(); 
B = FOREACH A GENERATE o_custkey;
C = LIMIT B 10;
DUMP C; 
{noformat}

Each mapper throws an exception like the following:
{noformat}
java.lang.RuntimeException: could not instantiate 'org.apache.hcatalog.pig.HCatLoader' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:504)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:154)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:594)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
    at org.apache.hadoop.mapred.Child.main(Child.java:156)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/NoSuchObjectException
    at org.apache.hcatalog.pig.HCatLoader.<init>(HCatLoader.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at java.lang.Class.newInstance0(Class.java:355)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:474)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.NoSuchObjectException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 13 more
{noformat}

In theory, the Hive metastore and Thrift libraries are needed by 
HCatLoader/HCatStorer when they run on the client side. However, they actually 
won't be needed on the slave side. Scripts should not have to register those 
jars, and the jars should not be distributed to the nodes where MR tasks run.



    
> hcatalog.jar is independent of libraries like metastore and thrift when it's 
> running on the slaves side of a cluster
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HCATALOG-137
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-137
>             Project: HCatalog
>          Issue Type: Improvement
>          Components: pig
>    Affects Versions: 0.2
>            Reporter: Min Zhou
>            Priority: Critical


        
