Hi Mike,

This might help address your question:
http://storageconference.org/2010/Papers/MSST/Shvachko.pdf

Regards,
Anupam

-----Original Message-----
From: panamamike [mailto:[email protected]]
Sent: Sunday, October 23, 2011 9:59 AM
To: [email protected]
Subject: Need help understanding Hadoop Architecture

I'm new to Hadoop. I've read a few articles and presentations aimed at explaining what Hadoop is and how it works. My current understanding is that Hadoop is an MPP system which leverages a large block size to quickly find data.

In theory, I understand how a large block size, combined with an MPP architecture and what I understand to be a massive indexing scheme via MapReduce, can be used to find data. What I don't understand is how, after you identify the appropriate 64MB block, you find the data you're specifically after. Does this mean the CPU has to search the entire 64MB block for the data of interest? If so, how does Hadoop know what data from that block to retrieve? I'm assuming the block is probably composed of one or more files. If not, I'm assuming the user isn't looking for the entire 64MB block but rather a portion of it.

Any help pointing to documentation, books, or articles on the subject would be much appreciated.

Regards,
Mike

--
View this message in context: http://old.nabble.com/Need-help-understanding-Hadoop-Architecture-tp32705405p32705405.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
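For what it's worth, the short answer to the "search the whole block" worry: an HDFS client addresses a file by byte offset, the NameNode maps each offset to a specific block (and its replica locations), and the client then seeks directly to the position inside that block. No scan of the 64MB block is needed to locate a byte range. A minimal sketch of that offset-to-block arithmetic (the block size here is the classic 64MB default; the offsets are illustrative, not from any real cluster):

```python
BLOCK_SIZE = 64 * 1024 * 1024  # classic HDFS default block size, 64 MB

def locate(file_offset):
    """Map a byte offset within an HDFS file to
    (block index, offset inside that block)."""
    return file_offset // BLOCK_SIZE, file_offset % BLOCK_SIZE

# A read starting at byte 150,000,000 of a file falls in block index 2,
# 15,782,272 bytes into that block -- blocks 0 and 1 are never touched.
block, pos = locate(150_000_000)
print(block, pos)
```

Finding record boundaries *within* that byte range is the job of the file format and the MapReduce InputFormat (e.g. splitting text on newlines), not of HDFS itself, which only stores and serves opaque byte ranges.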
