For the past... 5 years? I've been using Spring as a DI container at every job I've had. At LinkedIn, in fact, we have extended Spring extensively (see http://www.springsource.com/files/SpringAtLinkedIn.pdf for some details). It's incredibly powerful, and while the config files can be pretty verbose, they're pretty easy to read if you just use them for creating Java objects and wiring them together (and having some of them run initialization methods on startup, and various shutdown methods at close).

This is not to say I'm anywhere near even a +0 on cramming it into Mahout. I'd have to see a pretty strong use case for it, because while it does nothing to force programmers to change their APIs really at all, getting into the mode of thinking about setting up apps by changing a bunch of XML files is a big shift. Of course, using Spring doesn't typically *require* you to do everything in Spring (the code will almost never have direct Spring dependencies / imports), but if it becomes the de facto standard way to configure the system, "require" is irrelevant.

The other issue that comes with IoC containers (certainly with Spring, not sure with Guice) is that without strong IDE support, you don't really know a lot about your system at compile-time. Wiring errors are only visible at runtime, and debugging your system by firing up the entire ApplicationContext (and then debugging where in the stack trace your wiring blew up...) can be a total nightmare. To deal with this, you essentially write a suite of configuration tests that are wired together like your primary apps and exist only to test the wiring. But this gets ugly, and you also lose the help of tools like Clover, which doesn't measure wiring coverage.

When you have actual QA, this can be ok - things are run all the time, and you notice wiring errors quickly. Will this be true for Mahout? Or will nobody be running, e.g., Dirichlet Clustering for months on end, while the wiring drifts out of sync with the code, but everything still compiles, so nobody notices...

So... I guess my $0.02 on this is that while I highly advocate things like Spring in big production environments, bringing it into what is essentially a *library* with one webapp and a handful of command line utilities... is a bad idea. I'm ready to be convinced, but I'd need to see some good examples of where it would be used.
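
To make the "everything still compiles, so nobody notices" failure mode concrete, here is a minimal sketch in plain Java. It is not Spring: TinyContainer, Clusterer, KMeansClusterer and the class name org.example.OldDirichletClusterer are all made up for illustration, and the container is only a toy stand-in for XML-driven wiring, where class names live in strings the compiler never checks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface and implementation, invented for this sketch.
interface Clusterer {
    String name();
}

class KMeansClusterer implements Clusterer {
    public String name() { return "kmeans"; }
}

// Toy stand-in for an XML-driven container (not Spring): bean ids map to
// class names held as plain strings, which the compiler never checks.
class TinyContainer {
    private final Map<String, String> beanClasses = new LinkedHashMap<>();

    void register(String beanId, String className) {
        beanClasses.put(beanId, className);
    }

    // Builds the bean reflectively; a stale class name fails only here, at runtime.
    <T> T getBean(String beanId, Class<T> type) {
        String className = beanClasses.get(beanId);
        if (className == null) {
            throw new IllegalStateException("No bean named '" + beanId + "'");
        }
        try {
            Object bean = Class.forName(className).getDeclaredConstructor().newInstance();
            return type.cast(bean);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Wiring error for bean '" + beanId + "'", e);
        }
    }

    // The "configuration test": try to build every bean, report the ids that fail.
    List<String> findBrokenBeans() {
        List<String> broken = new ArrayList<>();
        for (String beanId : beanClasses.keySet()) {
            try {
                getBean(beanId, Object.class);
            } catch (IllegalStateException e) {
                broken.add(beanId);
            }
        }
        return broken;
    }
}

class WiringDemo {
    public static void main(String[] args) {
        TinyContainer container = new TinyContainer();
        // In real XML this would be a literal string; getName() only keeps
        // the sketch independent of the package it is compiled in.
        container.register("kmeans", KMeansClusterer.class.getName());
        // Simulates a class that was renamed in code while the config was left alone.
        container.register("dirichlet", "org.example.OldDirichletClusterer");

        System.out.println(container.getBean("kmeans", Clusterer.class).name());
        System.out.println("Broken beans: " + container.findBrokenBeans());
    }
}
```

The stale "dirichlet" entry compiles without complaint and only surfaces when something tries to build the bean. A wiring test along the lines of findBrokenBeans() is the kind of configuration test described above, and it only helps if somebody actually runs it.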

I'm down with IoC: it's a great way to program to interfaces and abstract away your deep coupling, but I think open-source libraries aren't the best place for it.

  -jake

p.s. Two other open source projects I work on - bobo-browse for faceted search, and zoie for realtime search - both *optionally* couple to Spring, in the sense that the "example" apps that live in their source trees use it, but it's just for *apps* built on top of the libraries, not for the wiring of anything done inside.

On Mon, Jan 18, 2010 at 11:47 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> I am going to address IoC issues only on this thread. The other
> repeatability issues should be addressed, but on the other thread.
>
> On Mon, Jan 18, 2010 at 7:10 AM, Sean Owen <sro...@gmail.com> wrote:
>
> > > I am not especially in favor of my own Random patch. If people are
> > > willing to run in 'fork-once' mode to get the clock time down, and
> > > prefer to stick with the uncommons RNG, then it's not useful.
> >
> > Since anything has benefits and drawbacks, my question is which has
> > the highest benefit-to-drawback ratio.
>
> +1 at the high level.
>
> > Formal dependency injection is heavy, even with this Bus or Guice:
> > - Requires artificial change to API
>
> Not too large. For Guice, it requires a constructor with the injectables.
> This is easy and probably good anyway.
>
> > - Config files
>
> The Guice equivalent is a Module which provides the bindings between
> interfaces and implementations and a little bit of annotation. This is a
> small cost (but a definite cost).
>
> > - More library code or dependency to implement
>
> I don't see this. Could be true.
>
> > - Less 'readable': harder to see what's actually being used
>
> My experience with Spring is that it definitely is less readable because
> you have to read reams of XML which wasn't designed to be read. There are
> hacks upon hacks designed to improve this, but it is still a major problem.
>
> Guice seems vastly more readable given that it just uses Java. The
> readability loss that I see is that you don't know what implementation
> would be used for any injectable class, but I haven't found that a problem
> since I can look at the implementations very quickly.
>
> Summary on costs from my point of view, 2 x +0.5, 1 x +0, 1 x -0.5, net
> weak agreement but I seem to think the costs are less than Sean thinks.
> Interesting, he is the one who has actually used Guice in production as
> opposed to my experience of using Spring.
>
> > Benefits are:
> > - Flexibility to swap out dependencies across the board at runtime
> > - Doing it one way reduces errors as opposed to reimplementing
> >   injection by hand 10 times
>
> +2
>
> The major impact of the first point is that we can do better testing. In
> fact, my feeling is that testing is the biggest benefit overall of most IoC
> frameworks.
>
> > If Random is the only candidate, I don't think it's worth it, and it
> > ends up being overengineering: who's realistically going to need to
> > change the RNG, and we have only 1 instance, and no obvious candidates
> > for injection besides at the moment.
>
> +1 -1
>
> The other obvious candidates would be the various clustering
> frameworks/drivers, the DP implementation models, the new SGD code for
> injection of loss functions. Plausibly Taste could use IoC for injecting
> various kinds of distances and such. In general, I don't advocate wholesale
> rewrites of working code, but I think that the real reason that we don't
> have any obvious uses of IoC is that we haven't had IoC to use.
>
> > Compare the overhead with,
> > literally, 10 lines of simple code.
>
> For the random case, I think that hand-written constructor level injection
> of a generator is no more effort than using a static block. I also count
> the use of any static as nearly as evil as using a global variable.
>
> --
> Ted Dunning, CTO
> DeepDyve
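
For reference, the hand-written constructor-level injection of a generator mentioned in the quoted message might look roughly like the sketch below. GaussianSampler and SamplerDemo are hypothetical names chosen for illustration and are not taken from the Mahout code base; only java.util.Random and java.security.SecureRandom are real JDK classes.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Random;

// Hypothetical class for illustration; not taken from the Mahout code base.
class GaussianSampler {
    private final Random random;

    // Constructor-level injection: the caller decides which generator is used,
    // instead of the class creating one in a static block.
    GaussianSampler(Random random) {
        this.random = random;
    }

    double[] sample(int n) {
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextGaussian();
        }
        return values;
    }
}

class SamplerDemo {
    public static void main(String[] args) {
        // A test can inject a fixed seed to get repeatable draws...
        GaussianSampler a = new GaussianSampler(new Random(42L));
        GaussianSampler b = new GaussianSampler(new Random(42L));
        System.out.println(Arrays.equals(a.sample(5), b.sample(5)));

        // ...while other callers can pass in any other Random subclass.
        GaussianSampler secure = new GaussianSampler(new SecureRandom());
        System.out.println(secure.sample(3).length);
    }
}
```

The only cost compared with a static generator is one constructor parameter, and in exchange a test can pin the seed without touching any global state.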