On 26.09.2011 Patrick Marchwiak wrote:
> I am a part-time graduate student currently taking a data mining
> course and am looking to potentially contribute to Mahout as a class
> project. I've noticed there are a few algorithms on the Algorithms
> wiki page that have yet to be implemented, such as Locally Weighted
> Linear Regression, Principal Components Analysis, Independent
> Component Analysis, and Gaussian Discriminative Analysis. As I'm new
> to these algorithms and machine learning in general I am seeking
> advice on which of these (if any) would be suitable to take on given a
> limited amount of time (a little over 2 months, juggled with a
> full-time job) and background knowledge. I have worked with Hadoop for
> over a year now and do have several years of Java experience. Any
> other suggestions would be welcome as well.
With only two months of part-time work available, and being new to machine learning, I would advise against trying to implement a new algorithm. I think it would make much more sense to come up with a project that uses Mahout to solve a specific task. If you need inspiration for what that task could be, a good idea might be to look at current or past machine learning challenges, e.g. see http://www.kaggle.com/. Alternatively, you could of course also work on a problem that is relevant to your regular work.

Other than that, it's always a good idea to check the Mahout JIRA for open issues that are waiting for helping hands.

Isabel
