Owen O'Malley wrote:
On Wed, Jul 1, 2009 at 6:45 PM, Todd Lipcon <tlip...@gmail.com> wrote:
Agree with Phillip here. Requiring a new jar to be checked in anywhere after
every common commit seems unscalable and nonperformant. For git users this
will make the repository size balloon like crazy (the jar is 400KB and we
have around 5300 commits so far = 2GB!).

This is silly. Obviously, just like the source, the jars compress very
well across versions.

I think it would be reasonable to require that developers check out a
structure like:

working-dir/
 hadoop-common/
 hadoop-mapred/
 hadoop-hdfs/

-1 They are separate subprojects. In the medium term, mapreduce and
hdfs should compile and run against the released version of common.
Checking in the jars is a temporary step while the interfaces in
common stabilize. Furthermore, I expect the commit volume in common to
be much lower than in mapreduce or hdfs.


There are various use cases here:

-people working in hdfs who don't need mapred (though they should have it for regression testing their work) but do need a stable common
-people working in mapred who need a working common and hdfs
-someone trying to work across all three (or in common, which is effectively the same thing from a regression testing viewpoint)
-someone who just wants all the code for debugging/using mapreduce or other bits of Hadoop

For anyone who is playing at the source level, where they are picking up changing libraries, having the separate projects in subdirectories with common build targets is invaluable; Ivy can do the glue. But at the same time, should everyone working on mapred be required to pull down and build common and hdfs?
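
To make the "ivy can do the glue" point concrete, something like the following in the hdfs project's ivy.xml would pull common in as a resolved artifact instead of a checked-in jar. The organisation/module names and the rev value here are only illustrative guesses, not what the actual build files use:

  <ivy-module version="2.0">
    <!-- illustrative module coordinates, not the real build files -->
    <info organisation="org.apache.hadoop" module="hadoop-hdfs"/>
    <dependencies>
      <!-- resolve the latest published common jar at build time,
           rather than committing the binary into the source tree -->
      <dependency org="org.apache.hadoop" name="hadoop-core"
                  rev="latest.integration"/>
    </dependencies>
  </ivy-module>

With that kind of setup, someone tracking common at HEAD could publish a snapshot into their local Ivy cache and let the resolver pick it up, while everyone else builds against the last released version.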
