2009/10/14 Patterson, Josh <[email protected]>

> Siddu,
> If this is for an undergraduate class, I would suggest something that
> allows you to get some work in with basic data structures such as
> building an inverted index over a few million documents (maybe Wikipedia
> pages?). You will also need to get a general feel for Hadoop.
>
> The University of Washington has some really nice project ideas for
> their distributed systems class:
>
> http://www.cs.washington.edu/education/courses/cse490h/09wi/projects/490H.project.ideas.pdf
>
> If you wanted to tackle something a little more advanced, then you could
> take a look at Pete Skomoroch's article on finding trends with Hadoop
> and Hive:
>
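
On the inverted-index idea quoted above: a minimal sketch of the map and reduce steps in plain Python, with no Hadoop involved. The function names and the simple lowercase tokenizer are illustrative assumptions, not part of any Hadoop API; a real job would spread these two steps across mappers and reducers.

```python
import re
from collections import defaultdict


def map_doc(doc_id, text):
    """Map step: emit (term, doc_id) once per distinct term in a document."""
    # Assumed tokenizer: lowercase runs of letters and digits.
    for term in set(re.findall(r"[a-z0-9]+", text.lower())):
        yield term, doc_id


def reduce_postings(pairs):
    """Reduce step: group document ids by term into sorted posting lists."""
    index = defaultdict(set)
    for term, doc_id in pairs:
        index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def build_index(docs):
    """Run the map step over every document, then the reduce step."""
    pairs = (pair for doc_id, text in docs.items()
             for pair in map_doc(doc_id, text))
    return reduce_postings(pairs)
```

Calling `build_index({1: "Hadoop runs MapReduce jobs", 2: "Hive runs on Hadoop"})` gives a dictionary that maps each term to the ids of the documents containing it.
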
Related: CUSUM charts are used to interpret noisy time series (e.g. readings collected from sensors). Perhaps a project could compute them efficiently with Hadoop?

http://www.variation.com/cpa/help/hs108.htm
http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc323.htm

Note: R, Pig, and Hive could be relevant to (or overlap with) this.

Amund
http://atbrox.com
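
PS: a minimal sketch of a CUSUM computation in Python, assuming the common two-sided tabular form (upper and lower cumulative sums with a slack value and a decision threshold). The names `target`, `k`, and `h` are illustrative choices, and the threshold values in any real use would have to be tuned to the data. Each sum depends on the previous one, so in a Hadoop setting one plausible split is to key the records by sensor and run this loop inside each reducer over that sensor's time-ordered values.

```python
def cusum(values, target, k, h):
    """Two-sided tabular CUSUM over a time-ordered sequence.

    target: the expected (in-control) mean of the series
    k:      slack; deviations smaller than k are ignored
    h:      decision threshold; a sum above h raises a signal
    Returns a list of (index, upper_sum, lower_sum, signal) tuples.
    """
    upper = lower = 0.0
    rows = []
    for i, x in enumerate(values):
        # Accumulate deviations above target + k, never dropping below zero.
        upper = max(0.0, upper + (x - target - k))
        # Accumulate deviations below target - k, never dropping below zero.
        lower = max(0.0, lower + (target - k - x))
        rows.append((i, upper, lower, upper > h or lower > h))
    return rows


def alarm_indices(values, target, k, h):
    """Indices at which either cumulative sum exceeds the threshold."""
    return [i for i, _, _, signal in cusum(values, target, k, h) if signal]
```

For a series that sits at the target and then shifts upward, the upper sum grows step by step until it crosses `h`, which is the point the chart would flag.
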
