The following page has been changed by Gal Nitzan:
http://wiki.apache.org/nutch/MapReduce

------------------------------------------------------------------------------
  [http://weblogs.java.net/blog/tomwhite/archive/2005/09/mapreduce.html#more 
"Excerpt from TomWhite's blog: MapReduce"][[BR]]
  
- MapReduce is the brainchild of Google and is very well documented by Jeffrey 
Dean and Sanjay Ghemawat in their paper 
[http://labs.google.com/papers/mapreduce.html "MapReduce: Simplified Data 
Processing on Large Clusters"]. In essence, it allows massive data sets to be 
processed in a distributed fashion by breaking the processing into many small 
computations of two types: a map operation that transforms the input into an 
intermediate representation, and a reduce function that recombines the 
intermediate representation into the final output. This processing model is 
ideal for the operations a search engine indexer like Nutch or Google needs to 
perform - like computing inlinks for URLs, or building inverted indexes - and 
it will 
[http://wiki.apache.org/nutch-data/attachments/Presentations/attachments/mapred.pdf
 "transform Nutch"] into a scalable, distributed search engine.
+ * MapReduce is the brainchild of Google and is very well documented by 
Jeffrey Dean and Sanjay Ghemawat in their paper 
[http://labs.google.com/papers/mapreduce.html "MapReduce: Simplified Data 
Processing on Large Clusters"].
  
+ * In essence, it allows massive data sets to be processed in a distributed 
fashion by breaking the processing into many small computations of two types:
+   1. A Map operation that transforms the input into an intermediate 
representation.
+   2. A Reduce operation that recombines the intermediate representation into 
the final output.
+ 
+ * This processing model is well suited to the operations that the indexer of 
a search engine such as Nutch or Google needs to perform, for example computing 
inlinks for URLs or building inverted indexes, and it will 
[http://wiki.apache.org/nutch-data/attachments/Presentations/attachments/mapred.pdf
 "transform Nutch"] into a scalable, distributed search engine.
+ 
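The two-step model described above can be illustrated with a minimal, single-process sketch in Python. It is not Nutch's implementation and it does not distribute any work; the names `map_reduce`, `emit_inlinks`, `collect_sources` and the sample `pages` data are assumptions made up for this illustration. It computes inlinks for URLs: the Map step turns each page into (target, source) pairs, and the Reduce step recombines the pairs for each target into a list of its inlinks.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Apply mapper to every record, group the emitted values by key,
    then apply reducer to each (key, values) group."""
    groups = defaultdict(list)
    for record in records:
        # Map: each record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce: each key's values are combined into one output entry.
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical input: (source URL, list of URLs the page links to).
pages = [
    ("http://a.example/", ["http://b.example/", "http://c.example/"]),
    ("http://b.example/", ["http://c.example/"]),
    ("http://c.example/", ["http://a.example/"]),
]


def emit_inlinks(page):
    """Map step: invert each outlink into a (target, source) pair."""
    source, outlinks = page
    for target in outlinks:
        yield target, source


def collect_sources(target, sources):
    """Reduce step: deduplicate and sort the sources linking to target."""
    return sorted(set(sources))


inlinks = map_reduce(pages, emit_inlinks, collect_sources)
for url in sorted(inlinks):
    print(url, "<-", inlinks[url])
```

In a real cluster the grouping step would be a distributed shuffle between many map and reduce tasks; here a dictionary stands in for it. An inverted index follows the same shape, with a mapper that emits (term, document) pairs instead of (target, source) pairs.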
