On Jan 29, 2008, at 16:13, Yousef Ourabi wrote:
I am with Yousef. I would prefer a single-rooted source tree, but
would leave open the option of building multiple jars. In fact, we
could build one jar per algorithm, plus a special jumbo jar containing
everything.
Sincerely,
Vadim
I'm with Ted on this one.
+1 for tags, trunk, branches and different packages.
Where I differ is with the output. I can see some scenarios where it
makes sense to have ant dist-alg1, ant dist-alg2 -- this would reduce
the footprint in applications that only need one vs. the other.
Having multiple projects is just unnecessary overhead.
-Yousef
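A minimal sketch of how the per-algorithm targets mentioned above could
look in an Ant build file, assuming a single source root under src and
one shared compile step. The package names alg1 and alg2, the jar file
names, the build and dist directories, and the choice to bundle the
common package into each per-algorithm jar are all illustrative
assumptions, not anything the thread has agreed on.

```xml
<project name="mahout" default="dist" basedir=".">
  <!-- Hypothetical output locations; adjust to the real layout. -->
  <property name="build.classes" location="build/classes"/>
  <property name="dist.dir" location="dist"/>

  <!-- Compile the whole single-rooted tree once.
       A classpath for jars under lib/ is omitted in this sketch. -->
  <target name="compile">
    <mkdir dir="${build.classes}"/>
    <javac srcdir="src" destdir="${build.classes}"/>
  </target>

  <!-- One small jar per algorithm: shared code plus that algorithm only. -->
  <target name="dist-alg1" depends="compile">
    <mkdir dir="${dist.dir}"/>
    <jar destfile="${dist.dir}/mahout-alg1.jar" basedir="${build.classes}"
         includes="org/apache/mahout/common/**,org/apache/mahout/alg1/**"/>
  </target>

  <target name="dist-alg2" depends="compile">
    <mkdir dir="${dist.dir}"/>
    <jar destfile="${dist.dir}/mahout-alg2.jar" basedir="${build.classes}"
         includes="org/apache/mahout/common/**,org/apache/mahout/alg2/**"/>
  </target>

  <!-- The "jumbo" jar: every compiled class in one archive. -->
  <target name="dist" depends="compile">
    <mkdir dir="${dist.dir}"/>
    <jar destfile="${dist.dir}/mahout.jar" basedir="${build.classes}"/>
  </target>
</project>
```

With a file like this, "ant dist-alg1" would produce only the smaller
jar, while plain "ant" would produce the single jar containing
everything. Both come from the same source tree, so no extra projects
are needed.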
On 1/29/08, Steve Rowe <[EMAIL PROTECTED]> wrote:
On 01/29/2008 at 6:44 PM, Lukas Vlcek wrote:
I would prefer to have the option of not working with the whole
library, but selecting only specific algorithms and, optionally, their
particular modifications.
+1
Thinking about these alternatives from an Eclipse user's point of
view, the original proposal would seem to encourage multiple projects
(one per algorithm + a common project) while the second would
encourage a single project containing multiple packages. Depending
upon the amount of code that would reside in each algorithm, one or
the other might be preferable.
Would a given developer typically be working on the entire library
(single project favoring) or just on one or two algorithms (multiple
project favoring)?
Jeff
-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 29, 2008 2:43 PM
To: [EMAIL PROTECTED]
Subject: Re: Thinking about Mahout layout, builds, etc.
I think that having multiple source roots is a pain. That is what
packages are for.
I would recommend instead:
- at the top level, there should be trunk, tags, releases as is
  typical in an SVN based project.
- below trunk and any tag or release there should be:
    docs
    lib
    src/org/apache/mahout
Below the source directory, there should be packages common,
algorithmA, algorithmB, and all tests should be located near the
associated source.
If it is really desirable to separate tests from normal source (I have
done it both ways and find having the tests nearby beneficial), then
there can be a parallel tree next to src called "test".
The target of compilation should be a single jar file.
On 1/29/08 2:26 PM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:
I am thinking a structure like the following would be useful for
getting started:
mahout/trunk/
    docs
    common/
        src/
            main/
            test/
        docs/
        lib/
    algorithmA/
        Similar to common, but for this algorithm
    algB ...
    ...
Where algorithmA, B, etc. are the various libraries we intend to
implement. We can hold off on creating them until we have some code,
but I was thinking it would be good to have the general layout in
mind. Of course, this is expandable and changeable. What do others
think?
On a related note, one of the things we discussed pre-Apache was the
general sense that we shouldn't feel the need to create an
all-encompassing framework. The basic gist of this being that any
given library could be completely independent of the others (with
maybe the exception that they share a common library). My gut says
this is the way to get started, but that it may evolve over time once
we have some running time together and can start to recognize
synergies, such that maybe by the time we get to 1.0 of Mahout there
may be more common code than we originally thought. The "common" area
above can serve as the area for utilities, classes, common Hadoop
extensions, etc. that are shared between the various algorithms, but I
would also say let's not try to prematurely optimize across the
algorithms just yet.
Anyone else have any preference on this?
-Grant