Some more wild and wacky ideas. They might be out of scope for GSOC, but they would be nice-to-have features for Mahout. I would like to encourage all of you to put down your ideas here.
1. Data visualization tool backed by HDFS/HBase for inspecting clusters, topic models, etc. - It could have many map/reduce jobs which transform the clustering output, aggregate things, and produce interesting stats or visualizations of the data.
2. UIMA integration with Mahout? (Maybe a good project if the UIMA folks are taking in GSOC students.)

Robin

On Mon, Feb 1, 2010 at 6:17 PM, Isabel Drost <isa...@apache.org> wrote:
> On Wed Robin Anil <robin.a...@gmail.com> wrote:
> > Greetings! Fellow GSOC alums, administrators and dear mentors, the
> > next edition is right here. Details are given in the link below.
> >
> > https://groups.google.com/group/google-summer-of-code-discuss/browse_thread/thread/d839c0b02ac15b3f
>
> Some additional notes to committers:
>
> First of all, mentoring a GSoC student is a great experience, so if
> you do have some cycles left, I would highly recommend participating in
> GSoC as a mentor (thanks Grant for convincing me last year...).
>
> We had several successful students here at Mahout in past GSoC years.
> Each year there were strong proposals for projects within Mahout. As a
> result, projects usually turn out to be interesting for both mentor
> and student.
>
> One final note: If there is anyone on this list who might be interested
> in helping with general ASF GSoC logistics and administration tasks,
> please have a look at the newly founded community development project
> (d...@community.apache.org)
>
> > Maybe we could identify key areas in Mahout which we need to develop
> > apart from the ML implementations and list them for students to
> > see before they start trickling in.
>
> And motivate students to come up with their own ideas and discuss them
> on-list before submitting their proposals.
>
> > Some ideas:
> > Benchmarking Framework with EC2 wrappers
>
> +1 I would love to see that.
>
> > Commandline Console+Launcher like HBase and Hadoop
>
> +1
>
> > Online Tool/Query UI for Algorithms in Mahout (like CF)
> >
> > Possible ideas (I have no idea what I am talking about here, but
> > there are nice problems to solve)
> > Improvements in Math?
> > How to tackle management of datasets?
> > Error Recovery if a job fails?
>
> How to tackle management of learned classification models?
>
> Better tooling for Mahout integration? (Lucene for tokenization and
> analysers?, data import and export?)
>
> Isabel