On Wed, Mar 13, 2013 at 12:06 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Also, I still have the impression, as I mentioned, that an adaptive
> version of the algorithm is not available, and specifying lambda for
> ALS-WR is left to the operator's intuition? This is probably a bigger
> issue even than the

(What's the adaptive version? I don't know of an implementation that
dynamically chooses lambda, but you can always choose it with
cross-validation. And that could be done in-line with iterations, I guess.)

> Suppose for a moment that Mahout was a commercial project with a lot of
> things in the roadmap, and we had to make strategic decisions about
> something we don't yet know. Even if we could demonstrate that some of
> the vital pieces we know about could, with some sweat and tears, be
> solved with a more constraining technology B as well as more naturally
> with its superset technology A, what is the merit of making such a
> choice in favor of B, debatable maturity issues of either choice aside?

I'm speaking for myself, but the huge reason is that technology B is widely
used and mature, and rightly or wrongly in demand, and customers are trying
to make use of idle resources exposed via B. If using A is only easier for
the product developer, that's great (and going to lead to better results
long-term) but not something the customer is interested in. I say
"customer" but this goes for consumers of open source code.

> And finally, on the side of pragmatic project management, why even
> artificially favor either choice if we only rely on non-commercial
> contributions? Why do we even want to oppose any diversification
> attempts on any ground, as long as we manage it incubator-style along
> with established safe graduation policies to ensure chaos control?
> Viable things will find their use and adoption. (Well, maybe I am a
> little bit optimistic here. Nonviable tech seems to be thriving for
> years as well, just on the pitch alone.) If they don't find their way
> into Mahout, they will eventually flourish elsewhere (assuming their
> viability).
I think this leads to a jumble of half-baked code. A playground of bits of
code is fine, but why push it together into a project that implies it's
going to be coherent and supported? Just collaborate on GitHub. I think
this project is interpreted by users as a product with a lifecycle and some
guarantees. I think that's the expectation of an Apache project too, and
it's not being met. Such is life, but I think this code base is already too
uneven, hacked, and mutually incompatible to support any meaningful
evolution.

Sebastian's flurry of effort yesterday is awesome, but it highlights how
almost nobody cares to work on Mahout now. Any effort is just tacking on
more bits and pieces that are ever less related to the other bits. This is
excellent -- on GitHub. Why not stick a fork in it? Is it just more
convenient to pile on here since there's a mailing list and SVN is already
set up with everybody's pet code? I'd think there would be a desire to
break off and start fresh projects based on totally different languages or
paradigms.

My only concern is about lessons learned. I look at this as a cautionary
tale of how not to operate a sustainable and healthy project, though a
great picture of how collaborative experimentation works. I am much more
interested in working on a product now.
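
To make the cross-validation suggestion above concrete: here is a minimal
sketch (the toy data, function names, and grid of lambda values are mine,
not Mahout's API) of picking lambda for ALS-WR by holding out some observed
ratings and keeping the value with the lowest held-out RMSE. The
`n_observed * lambda` term in the normal equations is the weighted-lambda
regularization that gives ALS-WR its name.

```python
import numpy as np

def als_wr(R, mask, k=2, lam=0.1, iters=10, seed=0):
    """ALS with weighted-lambda regularization (sketch, not Mahout's code).

    R: ratings matrix; mask: 1.0 where a rating is observed, 0.0 elsewhere.
    Returns factor matrices U (users x k) and V (items x k).
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(iters):
        for i in range(m):  # solve the ridge system for each user factor
            idx = mask[i] > 0
            if idx.any():
                A = V[idx].T @ V[idx] + lam * idx.sum() * np.eye(k)
                U[i] = np.linalg.solve(A, V[idx].T @ R[i, idx])
        for j in range(n):  # solve the ridge system for each item factor
            idx = mask[:, j] > 0
            if idx.any():
                A = U[idx].T @ U[idx] + lam * idx.sum() * np.eye(k)
                V[j] = np.linalg.solve(A, U[idx].T @ R[idx, j])
    return U, V

def rmse(R, mask, U, V):
    err = (R - U @ V.T)[mask > 0]
    return np.sqrt(np.mean(err ** 2))

# Tiny synthetic example: hide a quarter of the observed ratings as a
# validation set, then grid-search lambda on held-out RMSE.
rng = np.random.default_rng(42)
R = rng.normal(size=(8, 2)) @ rng.normal(size=(6, 2)).T
observed = rng.random(R.shape) < 0.8
holdout = observed & (rng.random(R.shape) < 0.25)
train_mask = (observed & ~holdout).astype(float)

best_lam, best_rmse = None, float("inf")
for lam in [0.001, 0.01, 0.1, 1.0, 10.0]:
    U, V = als_wr(R, train_mask, lam=lam)
    score = rmse(R, holdout.astype(float), U, V)
    if score < best_rmse:
        best_lam, best_rmse = lam, score
print("best lambda:", best_lam, "holdout RMSE:", round(best_rmse, 4))
```

Doing this "in-line with iterations" would amount to re-scoring the holdout
set after each sweep and adjusting lambda, rather than restarting the
factorization from scratch for each candidate.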
