On Aug 18, 2009, at 10:05 AM, Dmitriy Ryaboy wrote:

I am about to submit a cleaned up patch for 924.
It works fine as a static patch (in fact I can attach it to 660 as
well) -- compiling with -Dhadoop.version=XX works as proposed for the
static shims. It does the necessary prep for the code to be able to
switch based on what's in its classpath, but it does not require
unbundling to work statically.

Ok, we'll take a look.


The hadoop20 jar attached to the zebra ticket is built in a different
way than 18 and 19; it does not report its version (18 and 19 do).
Right now I get around it by hard-coding a special case ("Unknown" =>
20), but that's obviously suboptimal. Could someone rebuild
hadoop20.jar the way Pig wants it, and with the proper version
identification?  If that happens, 924/660 can go in together with
hadoop20.jar and users will at least be able to build against a static
version of hadoop without requiring a patch.
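
For illustration only (this is not the code from the PIG-924 patch): a minimal Java sketch of the workaround described above, assuming the jar reports its version as a plain string such as "0.18.3" and that a jar built without version information reports "Unknown". The class name, method name, and fallback constant are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps a reported Hadoop version string to the
// number of the shim to use (18, 19, 20, ...).
final class ShimSelector {
    // Matches "0.<minor>" optionally followed by ".<anything>", e.g. "0.19.2".
    private static final Pattern VERSION = Pattern.compile("^0\\.(\\d+)(\\..*)?$");

    // Assumption: a jar that does not report a version is treated as 0.20,
    // mirroring the hard-coded special case mentioned in the message.
    static final int FALLBACK_FOR_UNKNOWN = 20;

    static int shimFor(String reported) {
        if (reported == null || reported.trim().equalsIgnoreCase("Unknown")) {
            return FALLBACK_FOR_UNKNOWN;
        }
        Matcher m = VERSION.matcher(reported.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized Hadoop version: " + reported);
        }
        // The minor component selects the shim: "0.18.3" selects shim 18.
        return Integer.parseInt(m.group(1));
    }
}
```

With a jar that identifies itself properly, the "Unknown" branch would never be taken and could be removed.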

The hadoop 0.20 jar submitted with Zebra is not a standard jar. It has extra tfile functionality that was not in 0.20, but will be in 0.20.1. It isn't something we should publish. If we put a hadoop20.jar into pig's lib, it should be from 0.20 (or when available, 0.20.1).

Alan.


-Dmitriy

On Tue, Aug 18, 2009 at 9:56 AM, Alan Gates<ga...@yahoo-inc.com> wrote:
Non-committers certainly get a vote, it just isn't binding.

I agree on PIG-925 as a blocker. I don't see PIG-859 as a blocker since
there is a simple work around.

If we want to release 0.4.0 within a week or so, dynamic shims won't be an option because we won't be able to solve the bundled hadoop lib problem in that amount of time. I agree that we are not making life easy enough for users who want to build with hadoop 0.20. Based on comments on the JIRA, I'm not sure the patch for the static shims is ready. What if instead we checked in a version of hadoop20.jar that will work for users who want to build with 0.20? This way users can still build against 0.20 if they want, and our release isn't blocked on the patch.

Alan.


On Aug 17, 2009, at 12:03 PM, Dmitriy Ryaboy wrote:

Olga,

Do non-committers get a vote?

Zebra is in trunk, but relies on 0.20, which is somewhat inconsistent
even if it's in contrib/

Would love to see dynamic (or at least static) shims incorporated into
the 0.4 release (see PIG-660, PIG-924)

There are a couple of bugs still outstanding that I think would need
to get fixed before a release:

https://issues.apache.org/jira/browse/PIG-859
https://issues.apache.org/jira/browse/PIG-925

I think all of these can be solved within a week; assuming we are
talking about a release after these go into trunk, +1.

-D


On Mon, Aug 17, 2009 at 11:46 AM, Olga Natkovich<ol...@yahoo-inc.com> wrote:

Pig Developers,



We have made several significant performance and other improvements over
the last couple of months:



(1)     Added an optimizer with several rules
(2)     Introduced skew and merge joins
(3)     Cleaned COUNT and AVG semantics



I think it is time for another release to make this functionality
available to users.



I propose that Pig 0.4.0 be released against Hadoop 18, since most users are still using this version. Once Hadoop 20.1 is released, we will roll
Pig 0.5.0 based on Hadoop 20.



Please vote on the proposal by Thursday.



Olga




