Hi,
Just to give you a brief idea, Hadoop is a distributed computing platform
written in Java that lets you easily write and run applications that process
vast amounts of data. Applications are written using the MapReduce
programming model, which is typically used for processing large data-sets.
The data itself is stored in the Hadoop Distributed File System (HDFS).
For more information on Map/Reduce, refer to
http://wiki.apache.org/lucene-hadoop/HadoopMapReduce
For more information on HDFS, refer to
http://lucene.apache.org/hadoop/hdfs_design.html
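In case it helps, here is a sketch of the classic word-count job written
against the org.apache.hadoop.mapred API. Exact method signatures vary a
little between releases, so treat this as an illustration and check the
javadoc for the version you deploy:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class WordCount {

  // Mapper: emits (word, 1) for every token in an input line.
  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        output.collect(word, one);
      }
    }
  }

  // Reducer: sums the per-word counts emitted by the mappers.
  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(Map.class);
    conf.setReducerClass(Reduce.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}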
Please see my inline replies below for more precise answers to your questions.
-Ankur
-----Original Message-----
From: 王学超 [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 18, 2007 12:02 PM
To: [email protected]
Subject: give me some opinions, thanks
Hello, everyone!
Our company is preparing to use Hadoop to deal with massive amounts of data.
Because I don't know Hadoop well, I have the following questions. Please give
me some opinions. Thanks in advance!
1. What is specified at programming time vs. configured at run-time in terms
of parallelization, environment, etc.?
[Ankur]: Refer to
http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html
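Roughly speaking, the map/reduce logic and key/value types are fixed in code,
while the environment and default parallelism come from the configuration
files (hadoop-default.xml, hadoop-site.xml) and can be tuned per job. A rough
sketch, assuming the old JobConf API:

import org.apache.hadoop.mapred.JobConf;

public class RuntimeConfDemo {
  public static void main(String[] args) {
    // JobConf loads hadoop-default.xml and hadoop-site.xml, so
    // cluster-level settings come in from configuration at run time.
    JobConf conf = new JobConf();

    // Parallelism can be tuned per job without touching the
    // map/reduce code itself:
    conf.setNumMapTasks(20);   // only a hint; input splits decide the real number
    conf.setNumReduceTasks(4);

    // Any configured value can be read back by key:
    System.out.println("fs.default.name = " + conf.get("fs.default.name"));
  }
}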
2. What kind of out-of-the-box samples are packaged with the middleware?
[Ankur]: Refer to http://wiki.apache.org/lucene-hadoop/ , search for "Examples"
and click the link for each individual example to get a fair idea.
3. What programming models are supported? Please describe the nature of the
APIs provided.
[Ankur]: Map/Reduce is the only programming model supported. The APIs that a
typical map/reduce job writer will use are mostly related to the available
data types and supported data formats, Map/Reduce-specific classes and
interfaces, and utility classes and interfaces for HDFS access. You can refer
to the latest API javadoc, which is available at
http://lucene.apache.org/hadoop/docs/r0.15.1/api/index.html
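For the HDFS side, the FileSystem API is the main entry point. A minimal
sketch (the path /tmp/hello.txt is just an example):

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsDemo {
  public static void main(String[] args) throws Exception {
    // Picks up fs.default.name from the cluster's config files.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Write a small file.
    Path p = new Path("/tmp/hello.txt");
    FSDataOutputStream out = fs.create(p);
    out.writeBytes("hello from HDFS\n");
    out.close();

    // Read it back.
    FSDataInputStream in = fs.open(p);
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    System.out.println(reader.readLine());
    reader.close();
  }
}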