[Lucene-hadoop Wiki] Trivial Update of "ImportantConcepts" by TedDunning

Apache Wiki Thu, 19 Jul 2007 20:09:08 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.


The following page has been changed by TedDunning:
http://wiki.apache.org/lucene-hadoop/ImportantConcepts

------------------------------------------------------------------------------
  
  * Task - Whereas a job describes all of the inputs, outputs, classes and 
libraries used in a map/reduce program, a task is the program that executes the 
individual map and reduce steps.
  
- * HDFS - stands for Hadoop Distributed File System.  This is how input and 
output files of Hadoop programs are normally stored.  The major advantage of 
HDFS is that it provides very high input and output speeds.  This is critical 
for good performance in highly parallel programs because, as the number of 
processors working on a problem increases, the overall demand for input data 
increases, as does the overall rate at which output is produced.  HDFS 
provides very high bandwidth by storing chunks of files scattered throughout 
the Hadoop cluster.  Because files are stored in multiple places and the 
location where each task runs is chosen carefully, tasks are placed near their 
input data, and output data is largely stored where it is created.
+ * [:DFS:HDFS] - stands for Hadoop Distributed File System.  This is how input 
and output files of Hadoop programs are normally stored.  The major advantage 
of HDFS is that it provides very high input and output speeds.  This is 
critical for good performance in highly parallel programs because, as the 
number of processors working on a problem increases, the overall demand for 
input data increases, as does the overall rate at which output is produced.  
HDFS provides very high bandwidth by storing chunks of files scattered 
throughout the Hadoop cluster.  Because files are stored in multiple places 
and the location where each task runs is chosen carefully, tasks are placed 
near their input data, and output data is largely stored where it is created.
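
The placement idea described in the HDFS entry above can be illustrated with a
small sketch.  This is not Hadoop's actual scheduler; the function name
assign_tasks, the node names and the block names are all hypothetical, and the
least-loaded tie-breaking rule is an assumption made for illustration.  It only
shows how storing each chunk on several nodes makes it easy to run a task on a
node that already holds its input.

```python
def assign_tasks(block_replicas, nodes):
    """Assign one map task per block, preferring a node that stores the block.

    block_replicas maps a block name to the list of nodes holding a copy.
    This is an illustrative sketch, not Hadoop's scheduling algorithm.
    """
    load = {node: 0 for node in nodes}
    assignment = {}
    for block, replicas in block_replicas.items():
        # Nodes in the cluster that hold a local copy of this block.
        local = [n for n in replicas if n in load]
        # Fall back to any node only if no replica holder is available.
        candidates = local if local else list(nodes)
        # Pick the least-loaded candidate; break ties by node name.
        chosen = min(candidates, key=lambda n: (load[n], n))
        load[chosen] += 1
        assignment[block] = chosen
    return assignment


# Hypothetical cluster: three nodes, each block stored on two of them.
block_replicas = {
    "part-0": ["node1", "node2"],
    "part-1": ["node2", "node3"],
    "part-2": ["node1", "node3"],
}
nodes = ["node1", "node2", "node3"]

assignment = assign_tasks(block_replicas, nodes)
# Count how many tasks read their input from the node they run on.
local_count = sum(1 for b, n in assignment.items() if n in block_replicas[b])
print(assignment, local_count)
```

With two copies of every block, each task in this example can be placed on a
node that already stores its input, so no input data has to cross the network.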
