On Mar 12, 2010, at 1:22 AM, Robin Anil wrote:

> Shall I go and put some of the ideas up? I will do it as a whole for the
> project. Later we can re-assign things, maybe? How does that sound? Unlike
> other projects, we can't really go and put up a proposal like "Implement
> back-propagation" and expect a student to take it up and reduce things to
> map/reduce.
> 
> Some of the ideas (I am going to be really ambitious/vague here, but will
> write clear expectations or guidelines on what an ideal proposal is)
> 
> 1) Implement a cool classifier over map/reduce
> 2) Implement a cool clustering algorithm on map/reduce
> 3) Implement a meta-learner to plug in to various classifiers in Mahout and
> have bagging, boosting support.
> 4) Continuous performance benchmarking/dashboard maybe wrappers over EC2
> 5) Create matrix implementations with MySQL and NoSQL (HBase, Cassandra)
> access for all the algorithms to use.
> 6) Implement some of the ideas from Netflix top 5 to boost the
> recommendations package
> 7) Visualization tool for clustering, classification or recommendation.
> Ability to explain (optional)
> 8) Improve mahout-math package

9. Implement M/R Tika integration to take "rich" documents on HDFS and output
Vectors. Likely not a full Summer of Work there, but could be part of some
larger "Utils" capabilities focused on making it easier to consume Mahout.
Also included: Finish ARFF compatibility.
10. Benchmark.  Break the record?

I think we should still solicit ideas on the list here that we can put up on JIRA.

> 
> 
> Who is free to mentor this year? i.e. giving 5-6 hours weekly to a student,
> hearing them crib (sorry Ian and Isabel :P) and giving words of
> encouragement. And yes, code reviews.

I'm in.
