[ https://issues.apache.org/jira/browse/PIG-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745568#action_12745568 ]

Santhosh Srinivasan commented on PIG-924:
-----------------------------------------

Hadoop has promised "APIs in stone" forever and has not delivered on that 
promise yet. Higher layers in the stack have to learn how to cope with an 
ever-changing lower layer. How this change is managed is a matter of 
convenience for the owners of the higher layer. I really like the Shims 
approach, which avoids the cost of branching out Pig every time we make a 
compatible release. The cost of creating a branch for each version of Hadoop 
seems too high compared to the cost of the Shims approach.
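
To illustrate, a shim layer of the kind being discussed would look roughly 
like the sketch below. This is only a sketch; the interface, class, and 
package names are hypothetical and not taken from the attached patches. The 
idea is that one small interface wraps the Hadoop calls whose signatures 
changed between 0.18, 0.19, and 0.20, and the right implementation is picked 
at runtime from the Hadoop version string:

    // Hypothetical sketch, not the actual pig_924 patch.
    public interface HadoopShim {
        // One method per call site that differs across Hadoop releases.
        org.apache.hadoop.mapred.RunningJob submitJob(org.apache.hadoop.mapred.JobConf conf)
                throws java.io.IOException;
    }

    public final class HadoopShims {
        // Select the implementation at runtime so a single pig.jar can run
        // against any supported Hadoop release.
        public static HadoopShim getShim() {
            String v = org.apache.hadoop.util.VersionInfo.getVersion();
            String impl = v.startsWith("0.20") ? "org.apache.pig.shims.Hadoop20Shim"
                        : v.startsWith("0.19") ? "org.apache.pig.shims.Hadoop19Shim"
                        : "org.apache.pig.shims.Hadoop18Shim";
            try {
                return (HadoopShim) Class.forName(impl).newInstance();
            } catch (Exception e) {
                throw new RuntimeException("No shim available for Hadoop " + v, e);
            }
        }
    }

With something like this, only the version-specific implementation classes 
need to track Hadoop changes, which is the cost comparison being made above.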

Of course, there are pros and cons to each approach. The question here is when 
Hadoop will set its APIs in stone and how many more releases we will have 
before that happens. If the answer is 12 months and 2 more releases, then we 
should go with the Shims approach. If the answer is 3-6 months and one more 
release, then we should stick with our current approach and pay the small 
penalty of patches supplied to work with the specific release of Hadoop.

Summary: Use the shims patch if APIs are not set in stone within a quarter or 
two and if there is more than one release of Hadoop.

> Make Pig work with multiple versions of Hadoop
> ----------------------------------------------
>
>                 Key: PIG-924
>                 URL: https://issues.apache.org/jira/browse/PIG-924
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>         Attachments: pig_924.2.patch, pig_924.3.patch, pig_924.patch
>
>
> The current Pig build scripts package Hadoop and other dependencies into the 
> pig.jar file.
> This means that if users upgrade Hadoop, they also need to upgrade Pig.
> Pig has relatively few dependencies on Hadoop interfaces that changed between 
> 18, 19, and 20.  It is possible to write a dynamic shim that allows Pig to 
> use the correct calls for any of the above versions of Hadoop. Unfortunately, 
> the build process prevents us from doing this at runtime, and 
> forces an unnecessary Pig rebuild even if dynamic shims are created.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
