[ 
https://issues.apache.org/jira/browse/PIG-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645772#comment-13645772
 ] 

Rohini Palaniswamy commented on PIG-3285:
-----------------------------------------

bq. I am not sure if we double ship those jars if we are doing this. Actually I 
would prefer a TableMapReduceUtil.addDependencyJars version which only adds 
hbase.jar/guava.jar/protobuf.jar and additional dependencies when hbase evolves 
(but no hadoop.jar/pig.jar).

Nick, looking at the code, I am sure we will end up double-shipping jars, which 
is very inefficient. Instead of calling TableMapReduceUtil.addDependencyJars(job) 
directly, it would be good to write a separate function that takes the list of 
classes used in TableMapReduceUtil.addDependencyJars(job), filters out the pig 
and hadoop jars (classes starting with org.apache.pig and org.apache.hadoop) 
and the jars already in pigContext.extraJars, and then sets the remaining jars 
on tmpjars. You can reuse JarManager.findContainingJar to find the jar for a 
class file.
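
To make the suggestion concrete, here is a rough sketch of the filtering step. 
It is not the actual patch: DependencyJarFilter, buildTmpJars and the jarFinder 
argument are hypothetical names, and jarFinder only stands in for 
JarManager.findContainingJar. The sketch also assumes that HBase's own classes, 
which live under org.apache.hadoop.hbase, must be exempted from the 
org.apache.hadoop prefix filter, otherwise hbase.jar itself would be dropped.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Pig or HBase.
final class DependencyJarFilter {

    // True when the class comes from Pig or Hadoop, whose jars are shipped anyway.
    // Assumption: org.apache.hadoop.hbase.* is exempt so that hbase.jar is kept.
    static boolean isAlreadyShipped(String className) {
        if (className.startsWith("org.apache.hadoop.hbase.")) {
            return false;
        }
        return className.startsWith("org.apache.pig.")
                || className.startsWith("org.apache.hadoop.");
    }

    // Builds the comma-separated value that would be set on "tmpjars".
    // jarFinder maps a class name to the path of its jar (or null if unknown);
    // in Pig this role would be played by JarManager.findContainingJar.
    static String buildTmpJars(List<String> classNames,
                               Set<String> extraJars,
                               Function<String, String> jarFinder) {
        Set<String> jars = new LinkedHashSet<>(); // dedupe, keep first-seen order
        for (String name : classNames) {
            if (isAlreadyShipped(name)) {
                continue;
            }
            String jar = jarFinder.apply(name);
            // Skip classes with no known jar and jars Pig already ships as extraJars.
            if (jar == null || extraJars.contains(jar)) {
                continue;
            }
            jars.add(jar);
        }
        return String.join(",", jars);
    }
}
```

The result would be merged with any existing tmpjars value rather than 
overwriting it, so that jars registered by the user are not lost.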
                
> Jobs using HBaseStorage fail to ship dependency jars
> ----------------------------------------------------
>
>                 Key: PIG-3285
>                 URL: https://issues.apache.org/jira/browse/PIG-3285
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.11.1
>
>         Attachments: 0001-PIG-3285-Add-HBase-dependency-jars.patch, 
> 0001-PIG-3285-Add-HBase-dependency-jars.patch, 1.pig, 1.txt, 2.pig
>
>
> Launching a job consuming {{HBaseStorage}} fails out of the box. The user 
> must specify {{-Dpig.additional.jars}} for HBase and all of its dependencies. 
> Exceptions look something like this:
> {noformat}
> 2013-04-19 18:58:39,360 FATAL org.apache.hadoop.mapred.Child: Error running 
> child : java.lang.NoClassDefFoundError: com/google/protobuf/Message
>       at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.<clinit>(HbaseObjectWritable.java:266)
>       at org.apache.hadoop.hbase.ipc.Invocation.write(Invocation.java:139)
>       at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:612)
>       at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:975)
>       at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:84)
>       at $Proxy7.getProtocolVersion(Unknown Source)
>       at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:136)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> {noformat}
