Hi Federico,

Sean just sent out an excellent email, and I'd like to add a few things related to his points:
I think if you're looking at JIRA issues and constructing a good proposal
around that, you have a good start.
JIRA tickets are the place to start. That is where you will find everything we are currently working toward in Mahout, and they are a good place to officially post your proposals (in addition to the GSoC application page, of course). That way, as we peruse the open tickets, we can leave you feedback there, and you can adjust specific aspects of the proposal as necessary, just as you would with any other open ticket.
However, I feel that one of the best things you can do in a proposal is
convince that you know how much work it is, you know what the steps are, and
you know you can finish it even accounting for unexpected difficulty.

+1

Glancing over your proposal, the ideas are certainly good from a theoretical standpoint, but I would also love to see more implementation specifics, as well as plans for how you intend to test and what the timelines would be. Have you looked over how k-means is implemented in a map-reduce fashion? Do you understand how map-reduce works? Can you envision how you would build the map-reduce paradigm into your kernel smoother and LSH? Can you divide this project into phases and assess how long each phase would take?
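
To make the k-means question concrete, here is a rough sketch of how one k-means iteration splits into a map step and a reduce step. It is a single-machine Python illustration only, not Mahout's actual implementation; the function names (assign_point, recompute_centroid, kmeans_iteration) are hypothetical and chosen just for this example.

```python
from collections import defaultdict


def assign_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = min(range(len(centroids)),
                  key=lambda i: sq_dist(point, centroids[i]))
    return nearest, point


def recompute_centroid(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """Run one map / shuffle / reduce pass and return the new centroids."""
    # Shuffle phase: group the mapper output by key (centroid index).
    groups = defaultdict(list)
    for p in points:
        key, value = assign_point(p, centroids)
        groups[key].append(value)
    # A centroid that received no points is kept unchanged.
    return [recompute_centroid(groups[i]) if i in groups else tuple(centroids[i])
            for i in range(len(centroids))]
```

In a proposal, the equivalent exercise for the kernel smoother and LSH would be to state what the key and value are at each step, and what the reducer has to combine.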
I will also say personally that I would prefer to see GSoC projects that
focus on architecture, refactoring, performance tuning and measurement,
tests, etc, rather than implementing another algorithm. Mahout needs the
former more, I think. But I speak for myself and I am not mentoring.

Another excellent point, and one I agree with (though I also speak only for myself). At the very least, both implementing new algorithms and tuning existing code are important and worthy of GSoC projects. Something to keep in mind.

Shannon
