Hi all,

I am new to Mahout and am trying to learn its architecture and how it works, so that I can eventually implement some ML algorithms for it. I have not been able to find good documentation that explains the big picture and the end-to-end flow of an algorithm (I have searched Google, the mailing list, and the Mahout site). So I am thinking of writing some documentation that would make it easier for new developers to understand the architecture and end-to-end flow, and to start designing and coding new algorithms. Can someone please point me in the right direction: where should I start, and what should I refer to?
I am thinking of starting with one classification algorithm (probably Naive Bayes) and creating a template for the documentation along these lines:

1. Overview of the algorithm
2. Input data set (how to prepare it, with a sample data set)
3. Possibly a sequence diagram explaining the code flow (or is there a better way to represent this information?)
4. Output (how to read the output model and apply it to a real-world classification problem)

If you have any quick pointers on the design of Naive Bayes, or anything you would like added to the document template, please let me know. I would appreciate any guidance.

The goal is for new developers to ramp up quickly and understand how an algorithm is implemented, so that they can reuse it effectively.

I understand many of you are already mentoring for GSoC, but if someone has time to mentor me in this effort, I'll be glad to submit a formal application through http://community.apache.org/mentoringprogramme.html.

Thanks,
Joe
