I've been following this thread with great interest, and I think it's a neat project! I've read Dan's corresponding blog entry and will keep up with his blog, and I'm now working through Ted's K-Means Clustering At Scale paper.

I'm particularly interested in the process of integrating the knn code from github into Mahout. That's not to diminish any other aspect of the project; I'm just looking forward to learning in detail what this involves.

So I'd just like to say that I really appreciate you keeping the development of this project out in the open here.

Regards, Ray.

On 10/12/2012 01:45 PM, Ted Dunning wrote:
Review the knn code from github

File an individual contributors license agreement with Apache

Change knn to fit the Mahout API

Push back to Mahout

Solicit current clustering users for metrics on their data (I can help with this)

Write up data generation strategy with useable results

Not sure how long these tasks will take because they are a bit big for planning purposes, but they give a decent outline.

On Fri, Oct 12, 2012 at 1:34 PM, Dan Filimon <[email protected]> wrote:

Now, where do I start? What would a plan for the coming months look like?


