[ 
https://issues.apache.org/jira/browse/PIG-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645934#comment-13645934
 ] 

Rohini Palaniswamy commented on PIG-3285:
-----------------------------------------

Nick,
 You can file a JIRA against Hadoop to handle duplicates, but since we support 
all older versions, we can't rely on it. It also would not help with the 
current problem anyway. 

  The problem here is that the HBase code sets some jars in tmpjars. When 
JobClient.submitJob() runs, those jars are copied to HDFS under 
/user/[username]/.staging and each HDFS file is added through 
DistributedCache.addArchiveToClassPath. Pig already ships pig.jar as job.jar, 
and it ships the other registered jars to a tmp location in HDFS (/tmp/...) 
and then calls DistributedCache.addFileToClassPath before submitting the job. 
In this case, all three settings are different, and since Pig does not use 
tmpfiles or tmpjars and does the work itself, the HDFS path is also different. 
So duplicates have to be resolved at the Pig level. 
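
  To illustrate why the paths alone cannot be compared, here is a minimal 
sketch of de-duplicating shipped jars by file name rather than by full HDFS 
path. It is not the actual patch and does not use any Pig or Hadoop API; the 
class JarDedup, its methods, and the example paths are all hypothetical, and 
matching on file name alone is an assumption (it ignores differing versions 
of the same jar).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Pig or Hadoop.
public class JarDedup {

    // Returns the last path segment, e.g. "hbase.jar" for "/tmp/x/hbase.jar".
    public static String baseName(String path) {
        int slash = path.lastIndexOf('/');
        return slash >= 0 ? path.substring(slash + 1) : path;
    }

    // Keeps the first path seen for each jar file name, preserving order.
    // Assumption: two jars with the same file name are the same jar.
    public static List<String> dedupeByName(List<String> paths) {
        Map<String, String> firstByName = new LinkedHashMap<>();
        for (String p : paths) {
            firstByName.putIfAbsent(baseName(p), p);
        }
        return new ArrayList<>(firstByName.values());
    }

    public static void main(String[] args) {
        // Made-up example paths: the same jar shipped once through the
        // staging directory and once through a Pig-managed tmp directory.
        List<String> shipped = Arrays.asList(
            "/user/alice/.staging/example/hbase.jar",
            "/tmp/example/hbase.jar",
            "/tmp/example/protobuf-java.jar");
        for (String p : dedupeByName(shipped)) {
            System.out.println(p);
        }
    }
}
```

  Comparing full paths would treat the two hbase.jar entries as distinct, 
which is the situation described above; keying on the file name collapses 
them into one entry.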
                
> Jobs using HBaseStorage fail to ship dependency jars
> ----------------------------------------------------
>
>                 Key: PIG-3285
>                 URL: https://issues.apache.org/jira/browse/PIG-3285
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.11.1
>
>         Attachments: 0001-PIG-3285-Add-HBase-dependency-jars.patch, 
> 0001-PIG-3285-Add-HBase-dependency-jars.patch, 1.pig, 1.txt, 2.pig
>
>
> Launching a job consuming {{HBaseStorage}} fails out of the box. The user 
> must specify {{-Dpig.additional.jars}} for HBase and all of its dependencies. 
> Exceptions look something like this:
> {noformat}
> 2013-04-19 18:58:39,360 FATAL org.apache.hadoop.mapred.Child: Error running 
> child : java.lang.NoClassDefFoundError: com/google/protobuf/Message
>       at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.<clinit>(HbaseObjectWritable.java:266)
>       at org.apache.hadoop.hbase.ipc.Invocation.write(Invocation.java:139)
>       at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:612)
>       at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:975)
>       at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:84)
>       at $Proxy7.getProtocolVersion(Unknown Source)
>       at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:136)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
