I've been following this thread with a great deal of interest, and I
think it's a neat project! I've read Dan's corresponding blog entry,
and will keep up with his blog as well, and am working through Ted's
K-Means Clustering At Scale paper now.
I'm particularly interested in the process of integrating the knn code
from GitHub into Mahout. That's not to diminish any other aspect of the
project; I'm just looking forward to learning what this involves in
detail.
Consequently, I'd just like to say that I really appreciate that you're
keeping the development of this project out in the open here.
Regards, Ray.
On 10/12/2012 01:45 PM, Ted Dunning wrote:
- Review the knn code from GitHub
- File an Individual Contributor License Agreement (ICLA) with Apache
- Change knn to fit the Mahout API
- Push back to Mahout
- Solicit current clustering users for metrics on their data (I can help
  with this)
- Write up data generation strategy with usable results
I'm not sure how long these tasks will take, because they are a bit big
for planning purposes, but they give a decent outline.
On Fri, Oct 12, 2012 at 1:34 PM, Dan Filimon <[email protected]> wrote:
Now, where do I start? What would a plan for the coming months look like?