Guys, quick and broad question -- what does the roadmap for Mahout look like? Even just for the next two releases?
Right now, much of the project is a space for tinkering and tossing around bits of code, and that's OK for 0.1 or 0.2. I just wonder what the path to a proper finished product looks like. It'll take some agreement on who exactly the audience is, what they need and don't need, and what interface the project presents to those users. It takes work to design for that, bring the project into line around that design, document and test, etc. And -- it takes people with responsibility and authority to make it happen. I'm not sure we quite have those things yet. Until we do, this will be a 0.x project that nobody can really adopt for production.

It doesn't have to happen tomorrow, but what's our path from here to there? Spare time from even 10 people won't get the docs written, the code tidied, or the copy/paste that's going on refactored, redesigned and unified. People definitely have ideas about what the project should do -- I see lots of little bits of functionality being thrown into the pot. But is it adding up to something consistent and coherent? Should we talk seriously about it? "Machine learning" is too broad a remit.

It's not ruining my day or anything, but I'm sitting on a piece of the project that I put effort into: I designed it for practical use cases, made it clearly do a few things, do them well, and not try to do other things, and then documented, polished and tested it. So I'll be a little concerned if it's still attached to an early-0.x tinkering project this time next year. That's not cool for an Apache project anyway.

It may be presumptuous, but I volunteer to try to lead the effort to answer these questions. It's going to lead to some tough answers and more work in some cases, no matter who drives it. I'm hoping to do it sooner rather than later.