On Wed, Mar 13, 2013 at 12:06 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Also, I still have the impression, as I mentioned, that an adaptive
> version of the algorithm is not available, and specifying lambda for
> ALS-WR is left to the operator's intuition? This is probably a bigger
> issue even than the

(What's the adaptive version? I don't know of an implementation that
dynamically chooses lambda, but you can always choose it with
cross-validation. And that could be done in-line with iterations, I guess.)

> Suppose for a moment that Mahout was a commercial project with a lot of
> things in the roadmap, and we had to make strategic decisions about
> something we don't yet know. Even if we could demonstrate that some of
> the vital pieces we know about could, with some sweat and tears, be
> solved with a more constraining technology B as well as more naturally
> with its superset technology A, what is the merit of making such a
> choice in favor of B, debatable maturity issues of either choice aside?

I'm speaking for myself, but the huge reason is that technology B is widely
used and mature, and rightly or wrongly in demand, and customers are trying
to make use of idle resources exposed via B. If using A is only easier for
the product developer, that's great (and going to lead to better results
long-term) but not something the customer is interested in. I say
"customer" but this goes for consumers of open source code.

> And finally, on the side of pragmatic project management, why even
> artificially favor either choice if we only rely on non-commercial
> contributions? Why do we even want to oppose any diversification
> attempts on any ground, as long as we manage it incubator-style along
> with established safe graduation policies to ensure chaos control?
> Viable things will find their use and adoption. (Well, maybe I am a
> little bit optimistic here. Nonviable tech seems to be thriving for
> years as well, just on the pitch alone.) If they don't find their way
> into Mahout, they will eventually flourish elsewhere (assuming their
> viability).
I think this leads to a jumble of half-baked code. A playground of bits of
code is fine, but why push it together into a project that implies it's
going to be coherent and supported? Just collaborate on GitHub. I think
this project is interpreted by users as a product with a lifecycle and some
guarantees. I think that's the expectation of an Apache project too, and
it's not being met. Such is life, but I think this code base is already too
uneven, hacked, and mutually incompatible to support any meaningful
evolution.

Sebastian's flurry of effort yesterday is awesome, but it highlights how
almost nobody cares to work on Mahout now. Any effort is just tacking on
more bits and pieces that are ever less related to the other bits. This is
excellent -- on GitHub. Why not stick a fork in it? Is it just more
convenient to pile on here since there's a mailing list and SVN is already
set up with everybody's pet code? I'd think there would be a desire to
break off and start fresh projects based on totally different languages or
paradigms.

My only concern is about lessons learned. I look at this as a cautionary
tale of how not to operate a sustainable and healthy project, though a
great picture of how collaborative experimentation works. I am much more
interested in working on a product now.
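
To make the cross-validation suggestion above concrete: here is a minimal
sketch (the toy data, function names, and grid of lambda values are mine,
not Mahout's API) of picking lambda for ALS-WR by holding out some observed
ratings and keeping the value with the lowest held-out RMSE. The
`n_observed * lambda` term in the normal equations is the weighted-lambda
regularization that gives ALS-WR its name.

```python
import numpy as np

def als_wr(R, mask, k=2, lam=0.1, iters=10, seed=0):
    """ALS with weighted-lambda regularization (sketch, not Mahout's code).

    R: ratings matrix; mask: 1.0 where a rating is observed, 0.0 elsewhere.
    Returns factor matrices U (users x k) and V (items x k).
    """
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(iters):
        for i in range(m):  # solve the ridge system for each user factor
            idx = mask[i] > 0
            if idx.any():
                A = V[idx].T @ V[idx] + lam * idx.sum() * np.eye(k)
                U[i] = np.linalg.solve(A, V[idx].T @ R[i, idx])
        for j in range(n):  # solve the ridge system for each item factor
            idx = mask[:, j] > 0
            if idx.any():
                A = U[idx].T @ U[idx] + lam * idx.sum() * np.eye(k)
                V[j] = np.linalg.solve(A, U[idx].T @ R[idx, j])
    return U, V

def rmse(R, mask, U, V):
    err = (R - U @ V.T)[mask > 0]
    return np.sqrt(np.mean(err ** 2))

# Tiny synthetic example: hide a quarter of the observed ratings as a
# validation set, then grid-search lambda on held-out RMSE.
rng = np.random.default_rng(42)
R = rng.normal(size=(8, 2)) @ rng.normal(size=(6, 2)).T
observed = rng.random(R.shape) < 0.8
holdout = observed & (rng.random(R.shape) < 0.25)
train_mask = (observed & ~holdout).astype(float)

best_lam, best_rmse = None, float("inf")
for lam in [0.001, 0.01, 0.1, 1.0, 10.0]:
    U, V = als_wr(R, train_mask, lam=lam)
    score = rmse(R, holdout.astype(float), U, V)
    if score < best_rmse:
        best_lam, best_rmse = lam, score
print("best lambda:", best_lam, "holdout RMSE:", round(best_rmse, 4))
```

Doing this "in-line with iterations" would amount to re-scoring the holdout
set after each sweep and adjusting lambda, rather than restarting the
factorization from scratch for each candidate.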
