I'm with Ted on this one. +1 for tags, trunk, branches, and different packages.
Where I differ is with the output. I can see some scenarios where it makes sense to have ant dist-alg1 and ant dist-alg2; this would reduce the footprint in applications that only need one vs. the other. Having multiple projects is just unnecessary overhead.

-Yousef

On 1/29/08, Steve Rowe <[EMAIL PROTECTED]> wrote:
>
> On 01/29/2008 at 6:44 PM, Lukas Vlcek wrote:
> > I would prefer to have an option not to work with whole library but
> > select only specific algorithms and optionally their particular
> > modifications.
>
> +1
>
> > > Thinking about these alternatives from an Eclipse user's point of view,
> > > the original proposal would seem to encourage multiple projects (one
> > > per algorithm + a common project) while the second would encourage a
> > > single project containing multiple packages. Depending upon the amount
> > > of code that would reside in each algorithm, one or the other might be
> > > preferable.
> > >
> > > Would a given developer typically be working on the entire library
> > > (single project favoring) or just on one or two algorithms (multiple
> > > project favoring)?
> > >
> > > Jeff
> > >
> > > -----Original Message-----
> > > From: Ted Dunning [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, January 29, 2008 2:43 PM
> > > To: [email protected]
> > > Subject: Re: Thinking about Mahout layout, builds, etc.
> > >
> > > I think that having multiple source roots is a pain. That is what
> > > packages are for.
> > >
> > > I would recommend instead:
> > >
> > > - at the top level, there should be trunk, tags, releases as is typical
> > > in an SVN based project.
> > >
> > > - below trunk and any tag or release there should be:
> > >
> > > docs
> > > lib
> > > src/org/apache/mahout
> > >
> > > Below the source directory, there should be packages common,
> > > algorithmA, algorithmB and all tests should be located near the
> > > associated source.
> > >
> > > If it is really desirable to separate tests from normal source (I have
> > > done it both ways and find having the tests nearby beneficial), then
> > > there can be a parallel tree next to src called "test".
> > >
> > > The target of compilation should be a single jar file.
> > >
> > >
> > > On 1/29/08 2:26 PM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:
> > >
> > > > I am thinking a structure like the following would be useful for
> > > > getting started:
> > > > mahout/trunk/
> > > >   docs
> > > >   common/
> > > >     src/
> > > >       main/
> > > >       test/
> > > >     docs/
> > > >     lib/
> > > >   algorithmA/
> > > >     Similar to common, but for this algorithm
> > > >   algB ...
> > > >   ...
> > > >
> > > > Where algorithmA, B, etc. are the various libraries we intend to
> > > > implement. We can hold off on creating them until we have some code,
> > > > but was thinking it would be good to have the general layout in mind.
> > > >
> > > > Of course, this is expandable and changeable. What do others think?
> > > >
> > > > On a related note, one of the things we discussed pre-Apache, was the
> > > > general sense that we shouldn't feel the need to create an all
> > > > encompassing framework. The basic gist of this being that any given
> > > > library could be completely independent of the others (with maybe the
> > > > exception that they share a common library). My gut says this is the
> > > > way to get started, but that it may evolve over time once we have some
> > > > running time together and can start to recognize synergies, such that
> > > > maybe by the time we get to 1.0 of Mahout there may be more common
> > > > code than we originally thought. The "common" area above can serve as
> > > > the area for utilities, classes, common Hadoop extensions, etc. that
> > > > are shared between the various algorithms, but I would also say let's
> > > > not try to prematurely optimize across the algorithms just yet.
> > > >
> > > > Anyone else have any preference on this?
> > > >
> > > > -Grant
> > > >
