On 01/29/2008 at 6:44 PM, Lukas Vlcek wrote:
> I would prefer to have an option not to work with whole library but
> select only specific algorithms and optionally their particular
> modifications.

+1

> > Thinking about these alternatives from an Eclipse user's point of view,
> > the original proposal would seem to encourage multiple projects (one
> > per algorithm + a common project) while the second would encourage a
> > single project containing multiple packages. Depending upon the amount
> > of code that would reside in each algorithm, one or the other might be
> > preferable.
> >
> > Would a given developer typically be working on the entire library
> > (single project favoring) or just on one or two algorithms (multiple
> > project favoring)?
> >
> > Jeff
> >
> > -----Original Message-----
> > From: Ted Dunning [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, January 29, 2008 2:43 PM
> > To: [email protected]
> > Subject: Re: Thinking about Mahout layout, builds, etc.
> >
> > I think that having multiple source roots is a pain. That is what
> > packages are for.
> >
> > I would recommend instead:
> >
> > - at the top level, there should be trunk, tags, releases as is typical
> >   in an SVN based project.
> >
> > - below trunk and any tag or release there should be:
> >
> >     docs
> >     lib
> >     src/org/apache/mahout
> >
> > Below the source directory, there should be packages common,
> > algorithmA, algorithmB and all tests should be located near the
> > associated source.
> >
> > If it is really desirable to separate tests from normal source (I have
> > done it both ways and find having the tests nearby beneficial), then
> > there can be a parallel tree next to src called "test".
> >
> > The target of compilation should be a single jar file.
> >
> >
> > On 1/29/08 2:26 PM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:
> >
> > > I am thinking a structure like the following would be useful for
> > > getting started:
> > > mahout/trunk/
> > >     docs
> > >     common/
> > >         src/
> > >             main/
> > >             test/
> > >         docs/
> > >         lib/
> > >     algorithmA/
> > >         Similar to common, but for this algorithm
> > >     algB ...
> > >     ...
> > >
> > > Where algorithmA, B, etc. are the various libraries we intend to
> > > implement. We can hold off on creating them until we have some code,
> > > but was thinking it would be good to have the general layout in mind.
> > >
> > > Of course, this is expandable and changeable. What do others think?
> > >
> > > On a related note, one of the things we discussed pre-Apache, was the
> > > general sense that we shouldn't feel the need to create an all
> > > encompassing framework. The basic gist of this being that any given
> > > library could be completely independent of the others (with maybe the
> > > exception that they share a common library). My gut says this is the
> > > way to get started, but that it may evolve over time once we have some
> > > running time together and can start to recognize synergies, such that
> > > maybe by the time we get to 1.0 of Mahout there may be more common
> > > code than we originally thought. The "common" area above can serve as
> > > the area for utilities, classes, common Hadoop extensions, etc. that
> > > are shared between the various algorithms, but I would also say let's
> > > not try to prematurely optimize across the algorithms just yet.
> > >
> > > Anyone else have any preference on this?
> > >
> > > -Grant
>
