On Wed, Jul 1, 2009 at 10:11 PM, Dhruba Borthakur <dhr...@gmail.com> wrote:
> Hi Todd,
>
> Another option (one that is used by Hive) is to have an ant macro that can be overridden from the ant command line. This macro points to the location of the common.jar. By default, it is set to the same value as it is now. If a developer has a common jar that is built in his/her directory, he/she can set this macro from the command line while compiling hdfs.
>
> For example,
>   ant test
> does the same as it does now, but
>   ant -Dhadoop.common.jar=/home/dhruba/common/hadoop-common.jar test
> will pick up the common jar from my home directory.
>
> is this feasible?

That's feasible, but it will still require having a built jar in one repository or another after every new commit (yuck!). I imagine in Hive's case it's reasonably rare that you have to import a new Hadoop dev jar, since you mostly target existing stable releases. This is going to be happening all the time in MR/HDFS, at least for the foreseeable future imho.

-Todd

On Wed, Jul 1, 2009 at 6:45 PM, Todd Lipcon <tlip...@gmail.com> wrote:

> On Wed, Jul 1, 2009 at 2:10 PM, Philip Zeyliger <phi...@cloudera.com> wrote:
>
> > -1 to checking in jars. It's quite a bit of bloat in the repository (which admittedly affects the git.apache folks more than the svn folks), but it's also cumbersome to develop.
> >
> > It'd be nice to have a one-liner that builds the equivalent of the tarball built by "ant binary" in the old world. When you're working on something that affects both common and hdfs, it'll be pretty painful to make the jars in common, move them over to hdfs, and then compile hdfs.
> >
> > Could the build.xml in hdfs call into common's build.xml and build common as part of building hdfs? Or perhaps have a separate "top-level" build file that builds everything?
>
> Agree with Philip here. Requiring a new jar to be checked in anywhere after every common commit seems unscalable and nonperformant. For git users this will make the repository size balloon like crazy (the jar is 400KB and we have around 5300 commits so far = ~2GB!). For svn users it will still mean that every "svn update" requires a download of a new jar. Using svn externals to manage them also complicates things when trying to work on a cross-component patch with two dirty directories - you really need a symlink between your working directories rather than going through the SVN tree.
>
> I think it would be reasonable to require that developers check out a structure like:
>
>   working-dir/
>     hadoop-common/
>     hadoop-mapred/
>     hadoop-hdfs/
>
> We can then use relative paths for the mapred->common and hdfs->common dependencies. Those who only work on HDFS or only work on mapred will not have to check out the other, but everyone will check out common.
>
> Whether there exists a fourth repository (eg hadoop-build) that has a build.xml that ties together the other build.xmls is another open question IMO.
>
> -Todd
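
For reference, a minimal sketch of how the overridable-property approach could be wired into the hdfs build.xml, combined with the side-by-side checkout layout Todd describes. The property name (hadoop.common.jar), the relative ../hadoop-common path, and the target names are illustrative assumptions, not the actual build file:

    <!-- Sketch only: names and paths are assumptions, not the real hdfs build.xml.
         Ant properties are immutable, so a -D flag on the command line takes
         precedence over this default. -->
    <property name="hadoop.common.jar"
              value="${basedir}/../hadoop-common/build/hadoop-common.jar"/>

    <path id="common.classpath">
      <pathelement location="${hadoop.common.jar}"/>
    </path>

    <target name="compile">
      <javac srcdir="src/java" destdir="build/classes"
             classpathref="common.classpath"/>
    </target>

With that in place, "ant test" picks up the relative default, while "ant -Dhadoop.common.jar=/home/dhruba/common/hadoop-common.jar test" points the build at a locally built jar, as in Dhruba's example.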
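Philip's "top-level" build file could likewise be a thin wrapper that delegates to the per-project build.xmls. A rough sketch, assuming a hypothetical build.xml placed in working-dir/ and a "jar" target in each subproject (both assumptions for illustration):

    <!-- Sketch only: a hypothetical top-level build.xml in working-dir/,
         delegating to the side-by-side hadoop-common/hdfs/mapred checkouts. -->
    <project name="hadoop-all" default="build" basedir=".">
      <target name="build">
        <ant dir="hadoop-common" target="jar" inheritAll="false"/>
        <ant dir="hadoop-hdfs"   target="jar" inheritAll="false"/>
        <ant dir="hadoop-mapred" target="jar" inheritAll="false"/>
      </target>
    </project>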