Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by Gal Nitzan:
http://wiki.apache.org/nutch/MapReduce

------------------------------------------------------------------------------
  [http://weblogs.java.net/blog/tomwhite/archive/2005/09/mapreduce.html#more 
"Excerpt from TomWhite's blog: MapReduce"][[BR]]
  
- * MapReduce is the brainchild of Google and is very well documented by 
Jeffrey Dean and Sanjay Ghemawat in their paper 
[http://labs.google.com/papers/mapreduce.html "MapReduce: Simplified Data 
Processing on Large Clusters"].
+  * MapReduce is the brainchild of Google and is very well documented by 
Jeffrey Dean and Sanjay Ghemawat in their paper 
[http://labs.google.com/papers/mapreduce.html "MapReduce: Simplified Data 
Processing on Large Clusters"].
  
- * In essence, it allows massive data sets to be processed in a distributed 
fashion by breaking the processing into many small computations of two types:
+  * In essence, it allows massive data sets to be processed in a distributed 
fashion by breaking the processing into many small computations of two types:
    1. A Map operation that transforms the input into an intermediate 
representation.
    2. A Reduce function that recombines the intermediate representation into 
the final output.
  
- * This processing model is ideal for the operations a search engine indexer 
like Nutch or Google needs to perform - like computing inlinks for URLs, or 
building inverted indexes - and it will 
[http://wiki.apache.org/nutch-data/attachments/Presentations/attachments/mapred.pdf
 "transform Nutch"] into a scalable, distributed search engine.
+  * This processing model is ideal for the operations a search engine indexer 
like Nutch or Google needs to perform - like computing inlinks for URLs, or 
building inverted indexes - and it will 
[http://wiki.apache.org/nutch-data/attachments/Presentations/attachments/mapred.pdf
 "transform Nutch"] into a scalable, distributed search engine.
  

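As a rough illustration of the two operation types described in the changed
page, here is a minimal single-process sketch in Python that computes inlinks
for URLs, one of the indexer tasks the page mentions. It is not Nutch's
implementation: the function names (map_outlinks, reduce_inlinks,
run_mapreduce) and the sample URLs are hypothetical, and the in-memory
grouping step only stands in for the distributed shuffle a real cluster
would perform.

```python
from collections import defaultdict


def map_outlinks(page_url, outlinks):
    """Map: turn one page's outlinks into intermediate (target, source) pairs."""
    for target in outlinks:
        yield target, page_url


def reduce_inlinks(target, sources):
    """Reduce: combine every source seen for one target into its inlink list."""
    return target, sorted(set(sources))


def run_mapreduce(pages):
    """Run map over every page, group pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for page_url, outlinks in pages.items():
        for key, value in map_outlinks(page_url, outlinks):
            grouped[key].append(value)  # stand-in for the shuffle phase
    return dict(reduce_inlinks(key, values) for key, values in grouped.items())


# Hypothetical input: each page mapped to the URLs it links to.
pages = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}

inlinks = run_mapreduce(pages)
```

Each map call looks at a single page and each reduce call at a single target
URL, so the calls are independent of one another. That independence is what
lets the work be spread across many machines.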