Re: Thinking about Mahout layout, builds, etc.

Ted Dunning Tue, 29 Jan 2008 14:44:01 -0800


I think that having multiple source roots is a pain.  That is what packages
are for.


I would recommend instead:

- at the top level, there should be trunk, tags, and releases, as is
typical in an SVN-based project.

- below trunk and any tag or release there should be:

   docs
   lib
   src/org/apache/mahout

Below the source directory, there should be packages common, algorithmA,
and algorithmB, and all tests should be located near the associated source.

If it is really desirable to separate tests from normal source (I have done
it both ways and find having the tests nearby beneficial), then there can be
a parallel tree next to src called "test".
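
To make the two options concrete, here is a rough sketch that lists the
directories under trunk for each one. The package names are just the
placeholders used in this thread, and I am assuming the optional "test" tree
would mirror the org/apache/mahout package path; nothing here is an agreed
layout.

```python
from pathlib import PurePosixPath

# Placeholder package names from this thread, not real Mahout packages.
PACKAGES = ["common", "algorithmA", "algorithmB"]

# Package path under the single source root, as in src/org/apache/mahout.
BASE = PurePosixPath("org/apache/mahout")


def layout(separate_tests=False):
    """Return the directories, relative to trunk, for the proposed layout."""
    paths = [PurePosixPath("docs"), PurePosixPath("lib")]
    for pkg in PACKAGES:
        # Default option: tests live in the same package directory as the source.
        paths.append(PurePosixPath("src") / BASE / pkg)
        if separate_tests:
            # Assumption: the parallel "test" tree mirrors the package path.
            paths.append(PurePosixPath("test") / BASE / pkg)
    return [str(p) for p in paths]


if __name__ == "__main__":
    for path in layout(separate_tests=True):
        print(path)
```

Calling layout() with no arguments gives the single-tree version, where
tests sit in the same package directory as the code they cover.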

The target of compilation should be a single jar file.


On 1/29/08 2:26 PM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:

> I am thinking a structure like the following would be useful for
> getting started:
> mahout/trunk/
>    docs
>    common/
>         src/
>             main/
>             test/
>         docs/
>         lib/
>    algorithmA/
>         Similar to common, but for this algorithm
>    algB
>         ...
>     ...
> 
> Where algorithmA, B, etc. are the various libraries we intend to
> implement.  We can hold off on creating them until we have some code,
> but was thinking it would be good to have the general layout in mind.
> 
> Of course, this is expandable and changeable.  What do others think?
> 
> On a related note, one of the things we discussed pre-Apache, was the
> general sense that we shouldn't feel the need to create an all
> encompassing framework.  The basic gist of this being that any given
> library could be completely independent of the others (with maybe the
> exception that they share a common library).  My gut says this is the
> way to get started, but that it may evolve over time once we have some
> running time together and can start to recognize synergies, such that
> maybe by the time we get to 1.0 of Mahout there may be more common
> code than we originally thought.  The "common" area above can serve as
> the area for utilities, classes, common Hadoop extensions, etc. that
> are shared between the various algorithms, but I would also say let's
> not try to prematurely optimize across the algorithms just yet.
> 
> Anyone else have any preference on this?
> 
> -Grant
>
