[ 
https://issues.apache.org/jira/browse/PIG-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113208#comment-13113208
 ] 

Daniel Dai commented on PIG-2262:
---------------------------------

There are a couple of issues with this approach. Most of them are not 
specific to AvroStorage; they come from how we deal with the jars that UDFs 
depend on:

1. Pig doesn't automatically ship all classes in pig-withouthadoop.jar
   We also need to make a code change in JarManager.java to denote the package 
to ship. Merely putting a jar into pig-withouthadoop.jar is equivalent to 
putting that jar on the classpath. This mechanism is confusing, and we should 
stop putting more jars into pig-withouthadoop.jar.

2. Conflict with Hadoop-bundled jars
   Hadoop 0.20.204 bundles jackson-1.0.1, which is too old for AvroLoader. In 
the frontend, we can force Hadoop to pick up our jackson-1.7.3 by setting the 
flag HADOOP_USER_CLASSPATH_FIRST=true. In the backend, however, Hadoop seems to 
always pick the bundled jackson-1.0.1, which results in a job failure.

3. Do we need to bundle piggybank's dependent jars?
   We don't even bundle hbase.jar, though HbaseLoader is a builtin. Further, 
these jars are not even in the Pig distribution. They are ivy dependencies and 
are only retrieved during compilation. My thinking is that we need to bundle 
some popular jars (hbase.jar, avro.jar, etc.) in lib so users know where to 
find them when needed. But we don't want to ship all of those jars to the 
backend. Ideally, Pig should be smart enough to ship jars only when they are 
needed (as we do for jython.jar).
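
To see which jackson jar actually wins on a given classpath (item 2), a small 
diagnostic along these lines might help. This is only a sketch: JarLocator is a 
hypothetical helper, not part of Pig or Hadoop, and the default class name 
org.codehaus.jackson.JsonFactory is just an assumed example of a Jackson 1.x 
class.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Pig or Hadoop.
final class JarLocator {

    // Returns the URL a class was loaded from, or a marker string when the
    // JVM reports no code source (e.g. classes from the bootstrap loader).
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: the class of interest is passed as the first argument;
        // the default below is only an illustrative Jackson 1.x class name.
        String name = args.length > 0 ? args[0] : "org.codehaus.jackson.JsonFactory";
        System.out.println(name + " -> " + locate(Class.forName(name)));
    }
}
```

Running it with the same classpath as the frontend, with and without 
HADOOP_USER_CLASSPATH_FIRST=true, should show whether the printed location 
changes. Calling locate() from inside a UDF could give the same information for 
the backend.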

> AvroStorage dependencies are missing from the release tarball
> -------------------------------------------------------------
>
>                 Key: PIG-2262
>                 URL: https://issues.apache.org/jira/browse/PIG-2262
>             Project: Pig
>          Issue Type: Bug
>          Components: build, piggybank
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: PIG-2262.patch
>
>
> This makes AvroStorage hard to use, since users have to download the 
> dependencies manually, or build Pig themselves.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
