For the past... 5 years?  I've been using Spring as a DI container
at every job I've had.  At LinkedIn, in fact, we have extended
Spring extensively
(see here: http://www.springsource.com/files/SpringAtLinkedIn.pdf
for some details).  It's incredibly powerful, and while the config files
can be quite verbose, they're easy enough to read if you just use them
for creating Java objects and wiring them together (and having some
of them run initialization methods on startup, and various shutdown
methods at close).

This is not to say I'm anywhere near even a +0 on cramming it into
Mahout.  I'd have to see a pretty strong use case for it, because
while it does essentially nothing to force programmers to change their
APIs, getting into the mode of thinking about setting up apps by
changing a bunch of XML files is a big shift.  Of course, using Spring
doesn't typically *require* you to do everything in Spring (the code
will almost never have direct Spring dependencies / imports), but
if it becomes the de facto standard way to configure the system,
"require" is irrelevant.

The other issue that comes with IoC containers (certainly with
Spring, not sure with Guice) is that without strong IDE support,
you don't really know a lot about your system at compile-time.
Wiring errors are only visible at runtime, and debugging your system
by firing up the entire ApplicationContext (and then debugging
where in the stack trace your wiring blew up...) can be a total
nightmare.

To deal with this, you essentially build a suite of configuration
tests, wired together like your primary apps, whose only job is to
test the wiring.  But this gets ugly, and coverage tools like Clover
don't help, since they don't measure wiring coverage.
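
To make the runtime-only failure mode concrete, here is a minimal
sketch in plain Java, with no Spring involved.  Everything in it is
hypothetical: WiringCheck stands in for a container, and the map
stands in for a config file that names implementation classes as
strings.  A typo in a class name compiles fine and only shows up
when something actually tries to load the class, which is exactly
what a "wiring test" has to do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a container; not a Spring or Mahout class. */
final class WiringCheck {

    /**
     * Tries to instantiate every configured class via its no-arg constructor.
     * Returns bean name -> error description for each bean that fails.
     */
    static Map<String, String> findBrokenBeans(Map<String, String> config) {
        Map<String, String> broken = new LinkedHashMap<>();
        for (Map.Entry<String, String> bean : config.entrySet()) {
            try {
                Class.forName(bean.getValue()).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                // Missing class, missing constructor, etc.: only visible at runtime.
                broken.put(bean.getKey(), e.toString());
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        // Stand-in for a config file: bean name -> implementation class name.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("buffer", "java.lang.StringBuilder");
        config.put("rng", "java.util.Randm"); // typo: compiles, fails when loaded
        System.out.println(findBrokenBeans(config).keySet());
    }
}
```

If nobody runs a check like this regularly, the "rng" entry above
stays broken for as long as nobody starts that particular app.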

When you have actual QA, this can be ok - things are run
all the time, so you notice wiring errors quickly.  Will this be
true for Mahout?  Or will nobody be running, e.g., Dirichlet
Clustering for months on end, while the wiring drifts out of
sync with the code, but everything still compiles, so nobody
notices...

So... I guess my $0.02 on this is that while I highly advocate
things like Spring in big production environments, bringing it
into what is essentially a *library* with one webapp and a
handful of command line utilities... is a bad idea.  I'm ready
to be convinced, but I'd need to see some good examples
of where it would be used.  I'm down with IoC, it's a great
way to program to interfaces and abstract away your deep
coupling, but open-source libraries I think aren't the best
place for it.

  -jake

p.s. two other open source projects I work on - bobo-browse
for faceted search, and zoie for realtime search - both
*optionally* couple to Spring, in the sense that the "example"
apps that live in their source trees use it, but it's just for
*apps* built on top of the libraries, not for the wiring of
anything done inside.

On Mon, Jan 18, 2010 at 11:47 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> I am going to address IoC issues only on this thread.  The other
> repeatability issues should be addressed, but on the other thread.
>
> On Mon, Jan 18, 2010 at 7:10 AM, Sean Owen <sro...@gmail.com> wrote:
>
> > > I am not especially in favor of my own Random patch. If people are
> > > willing to run in 'fork-once' mode to get the clock time down, and
> > > prefer to stick with the uncommons RNG, then it's not useful.
> > >
> >
> > Since anything has benefits and drawbacks, my question is which has
> > the highest benefit-to-drawback ratio.
> >
>
> +1 at the high level.
>
> >
> > Formal dependency injection is heavy, even with this Bus or Guice:
> > - Requires artificial change to API
> >
>
> Not too large.  For Guice, it requires a constructor with the injectables.
> This is easy and probably good anyway.
>
>
> > - Config files
> >
>
> The Guice equivalent is a Module which provides the bindings between
> interfaces and implementations and a little bit of annotation.  This is a
> small cost (but a definite cost).
>
>
> > - More library code or dependency to implement
> >
>
> I don't see this.  Could be true.
>
>
> > - Less 'readable': harder to see what's actually being used
> >
>
> My experience with Spring is that it definitely is less readable because
> you have to read reams of XML which wasn't designed to be read.  There are
> hacks upon hacks designed to improve this, but it is still a major problem.
>
> Guice seems vastly more readable given that it just uses java.  The
> readability loss that I see is that you don't know what implementation
> would be used for any injectable class, but I haven't found that to be a
> problem since I can look at the implementations very quickly.
>
> Summary on costs from my point of view, 2 x +0.5, 1 x +0, 1 x -0.5, net
> weak agreement but I seem to think the costs are less than Sean thinks.
> Interestingly, he is the one who has actually used Guice in production as
> opposed to my experience of using Spring.
>
>
> > Benefits are:
> > - Flexibility to swap out dependencies across the board at runtime
> > - Doing it one way reduces errors as opposed to reimplementing
> > injection by hand 10 times
> >
>
> +2
>
> The major impact of the first point is that we can do better testing.  In
> fact, my feeling is that testing is the biggest benefit overall of most IoC
> frameworks.
>
>
>
> > If Random is the only candidate, I don't think it's worth it, and it
> > ends up being overengineering: who's realistically going to need to
> > change the RNG, and we have only 1 instance, and no obvious candidates
> > for injection besides at the moment.
>
>
> +1 -1
>
> The other obvious candidates would be the various clustering
> frameworks/drivers, the DP implementation models, the new SGD code for
> injection of loss functions.  Plausibly Taste could use IoC for injecting
> various kinds of distances and such.  In general, I don't advocate
> wholesale rewrites of working code, but I think that the real reason that
> we don't have any obvious uses of IoC is that we haven't had IoC to use.
>
> > Compare the overhead with,
> > literally, 10 lines of simple code.
> >
>
> For the random case, I think that hand-written constructor level injection
> of a generator is no more effort than using a static block.  I also count
> the use of any static as nearly as evil as using a global variable.
>
> --
> Ted Dunning, CTO
> DeepDyve
>
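
For reference, the hand-written constructor-level injection of a
generator that Ted mentions at the end of his message could look
roughly like the sketch below.  The class names (Sampler,
SamplerDemo) and the pickIndex method are made up for illustration
and are not existing Mahout code; the only real API used is
java.util.Random.  No container is involved: the caller decides
which generator to hand in.

```java
import java.util.Random;

/** Hypothetical class that needs randomness; not an actual Mahout class. */
final class Sampler {
    private final Random random;

    /** The generator is passed in, rather than created in a static block. */
    Sampler(Random random) {
        this.random = random;
    }

    /** Returns a uniformly chosen index in [0, size). */
    int pickIndex(int size) {
        return random.nextInt(size);
    }
}

final class SamplerDemo {
    public static void main(String[] args) {
        // A test can inject a fixed-seed generator to get repeatable results.
        Sampler a = new Sampler(new Random(42L));
        Sampler b = new Sampler(new Random(42L));
        System.out.println(a.pickIndex(100) == b.pickIndex(100)); // prints "true"

        // Production code can inject an unseeded (time-based) generator.
        Sampler live = new Sampler(new Random());
        System.out.println(live.pickIndex(100));
    }
}
```

This is the testing benefit from the quoted discussion in its
smallest form: two instances given identically seeded generators
produce the same sequence, with no static state to reset.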
