Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by OwenOMalley:
http://wiki.apache.org/lucene-hadoop/FrontPage

------------------------------------------------------------------------------
- Please contribute your knowledge about Hadoop here!
+ = Hadoop =
+ 
+ [http://lucene.apache.org/hadoop/ Hadoop] is a framework for managing 
applications across large clusters of information in such a way that the 
application does not need to worry about either reliability or locality. Hadoop 
uses a computational paradigm named [:HadoopMapReduce: Map/Reduce], where the 
application is divided into many fragments of work, each of which may be 
executed or reexecuted on any computer in the cluster. To support 
locality-transparency, Hadoop stores persistent data in a distributed file 
system that is designed for large streaming reads and fault tolerance.
+ 
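The Map/Reduce paragraph above can be illustrated with a minimal, single-process sketch. It is not Hadoop's API: the names `map_fragment`, `reduce_counts`, `run_with_retry`, and `word_count` are hypothetical, and a plain loop stands in for the cluster. It only shows the idea that work is split into independent fragments, any of which can safely be executed again after a failure because the map step has no side effects.

```python
from collections import defaultdict


def map_fragment(lines):
    """Map step: emit a (word, 1) pair for every word in one fragment."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_with_retry(task, fragment, attempts=3):
    """Run one fragment, re-executing it if the (simulated) worker fails."""
    last_error = None
    for _ in range(attempts):
        try:
            return task(fragment)
        except RuntimeError as error:  # assumption: failures surface as RuntimeError
            last_error = error
    raise last_error


def word_count(lines, fragment_size=2):
    """Split the input into fragments, map each one, then reduce the results."""
    fragments = [
        lines[i:i + fragment_size] for i in range(0, len(lines), fragment_size)
    ]
    pairs = []
    for fragment in fragments:
        pairs.extend(run_with_retry(map_fragment, fragment))
    return reduce_counts(pairs)
```

Calling `word_count(["a b", "b c", "a a"])` splits the three lines into two fragments, maps each independently, and reduces the emitted pairs into per-word totals. In a real cluster the fragments would run on different machines; here they run one after another.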
+ The intent is to scale Hadoop up to handle thousands of computers. The 
current high-water marks that have been reported are:
+  * !DataNodes: 620
+  * !TaskTrackers: 500
+ 
+ Hadoop was originally built as infrastructure for the 
[http://lucene.apache.org/nutch/ Nutch] project, which crawls the web and 
builds a search engine index for the crawled pages. Both Hadoop and Nutch are 
part of the [http://lucene.apache.org/java/docs/index.html Lucene] 
[http://www.apache.org/ Apache] project.
  
  == General Information ==
   * [http://lucene.apache.org/hadoop/ Hadoop Website ]
