[ https://issues.apache.org/jira/browse/PIG-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645934#comment-13645934 ]
Rohini Palaniswamy commented on PIG-3285:
-----------------------------------------

Nick,
   You can raise a JIRA in Hadoop to handle duplicates, but since we support all older versions, we can't rely on it. It also would not help with the current problem.

The problem here is that the HBase code sets some jars in tmpjars. When JobClient.submitJob() runs, those jars are copied to HDFS under /user/[username]/.staging, and the resulting HDFS files are added through DistributedCache.addArchiveToClassPath. Pig already ships pig.jar as job.jar; it copies the other registered jars to a tmp location in HDFS (/tmp/...) and then calls DistributedCache.addFileToClassPath before submitting the job.

In this case all three settings are different, and since Pig does not use tmpfiles or tmpjars and does the work itself, the HDFS path is also different. So duplicates have to be resolved at the Pig level.

> Jobs using HBaseStorage fail to ship dependency jars
> ----------------------------------------------------
>
>                 Key: PIG-3285
>                 URL: https://issues.apache.org/jira/browse/PIG-3285
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.11.1
>
>         Attachments: 0001-PIG-3285-Add-HBase-dependency-jars.patch, 0001-PIG-3285-Add-HBase-dependency-jars.patch, 1.pig, 1.txt, 2.pig
>
>
> Launching a job consuming {{HBaseStorage}} fails out of the box. The user
> must specify {{-Dpig.additional.jars}} for HBase and all of its dependencies.
> Exceptions look something like this:
> {noformat}
> 2013-04-19 18:58:39,360 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.NoClassDefFoundError: com/google/protobuf/Message
> 	at org.apache.hadoop.hbase.io.HbaseObjectWritable.<clinit>(HbaseObjectWritable.java:266)
> 	at org.apache.hadoop.hbase.ipc.Invocation.write(Invocation.java:139)
> 	at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:612)
> 	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:975)
> 	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:84)
> 	at $Proxy7.getProtocolVersion(Unknown Source)
> 	at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:136)
> 	at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> {noformat}
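
The comment above notes that the same jar can reach the job classpath through different mechanisms and under different HDFS paths, so comparing full paths will not catch duplicates. A minimal sketch of one way to resolve them, assuming that matching on the jar file name is acceptable, might look like the following. This is not Pig's actual implementation: the class `JarDedup`, its methods, and the sample paths are all hypothetical, and it uses plain strings instead of Hadoop's path or DistributedCache types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Pig, HBase, or Hadoop.
public class JarDedup {

    /** Returns the last path segment, e.g. "a.jar" for "/tmp/x/a.jar". */
    public static String baseName(String path) {
        int slash = path.lastIndexOf('/');
        return slash >= 0 ? path.substring(slash + 1) : path;
    }

    /** Keeps the first path seen for each jar file name, preserving input order. */
    public static List<String> dedupeByName(List<String> paths) {
        Map<String, String> firstByName = new LinkedHashMap<>();
        for (String p : paths) {
            // putIfAbsent ignores later paths that share an already-seen file name.
            firstByName.putIfAbsent(baseName(p), p);
        }
        return new ArrayList<>(firstByName.values());
    }

    public static void main(String[] args) {
        // Made-up paths: one jar staged from tmpjars, the same jar shipped to /tmp,
        // and an unrelated jar.
        List<String> candidates = Arrays.asList(
                "/user/alice/.staging/job_0001/libjars/protobuf-java.jar",
                "/tmp/pig-ship-0001/protobuf-java.jar",
                "/tmp/pig-ship-0001/zookeeper.jar");
        for (String kept : dedupeByName(candidates)) {
            System.out.println(kept);
        }
    }
}
```

Matching on file name alone is a simplification: it treats two different builds that share a name as the same jar, and it does not recognize the same library shipped under two versioned names. A real fix would need to decide how to handle both cases.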