Siddu,

If this is for an undergraduate class, I would suggest something that lets you work with basic data structures, such as building an inverted index over a few million documents (maybe Wikipedia pages?). You will also need to get a general feel for Hadoop.
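To give a rough idea of what an inverted index job looks like, here is a minimal single-machine sketch in Python. It only imitates the map and reduce steps in memory and does not use Hadoop at all. The function names (map_phase, reduce_phase, build_inverted_index) and the sample documents are made up for illustration, and the tokenizer is a deliberately naive assumption.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per distinct lowercase word in the document."""
    # Naive tokenizer (an assumption): runs of ASCII letters and digits only.
    for term in set(re.findall(r"[a-z0-9]+", text.lower())):
        yield term, doc_id


def reduce_phase(pairs):
    """Group document ids by term and return term -> sorted list of doc ids."""
    index = defaultdict(set)
    for term, doc_id in pairs:
        index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def build_inverted_index(docs):
    """Run the map step over every document, then reduce all emitted pairs."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    return reduce_phase(pairs)


# Hypothetical sample data, standing in for real pages.
docs = {
    "d1": "Hadoop runs MapReduce jobs",
    "d2": "Hive runs on Hadoop",
}
index = build_inverted_index(docs)
print(index["hadoop"])  # ['d1', 'd2']
print(index["hive"])    # ['d2']
```

On a real cluster the grouping by term would be done by the framework's shuffle between the map and reduce steps rather than by a dictionary in memory, which is part of the rewriting overhead to plan for.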
The University of Washington has some really nice project ideas for their distributed systems class:

http://www.cs.washington.edu/education/courses/cse490h/09wi/projects/490H.project.ideas.pdf

If you want to tackle something a little more advanced, take a look at Pete Skomoroch's articles on finding trends with Hadoop and Hive:

http://www.cloudera.com/blog/2009/07/31/tracking-trends-with-hadoop-and-hive-on-ec2/
http://www.cloudera.com/blog/2009/09/28/grouping-related-trends-with-hadoop-and-hive/

Things to keep in mind:

1.) Hadoop won't be as simple as writing a single Java app
2.) There will be some overhead involved in rewriting algorithms in MapReduce
3.) There will also be some overhead involved in setup and maintenance of the Hadoop cluster

Take these three things into account when planning your time for the project. A semester can seem a lot shorter when you spend too much of it on things other than implementing and testing your algorithm.

Good luck!

Josh Patterson
TVA

-----Original Message-----
From: Siddu [mailto:[email protected]]
Sent: Wednesday, October 14, 2009 6:09 AM
To: [email protected]
Cc: [email protected]
Subject: Project ideas !

Hello Hadoop Users,

Me and another friend of mine are looking out for some of the project ideas based on hadoop as a part of our curriculum .

Can you give us some pointers please

Thanks in advance !

Regards,
~Sid~
