Hello, I would suggest you read at least this section of the HDFS Architecture document: http://hadoop.apache.org/common/docs/r0.20.204.0/hdfs_design.html#NameNode+and+DataNodes
This is the main part of the HDFS architecture; it describes how a client reads data from the different nodes. I would also suggest a good book: Tom White, "Hadoop: The Definitive Guide", 2nd Edition, 2010 (http://www.amazon.com/Hadoop-Definitive-Guide-Tom-White/dp/1449389732). There you will definitely find answers to all of your questions.

Regards,
Oleksiy

panamamike wrote:
>
> I'm new to Hadoop. I've read a few articles and presentations that are
> aimed at explaining what Hadoop is and how it works. Currently my
> understanding is that Hadoop is an MPP system which leverages a large
> block size to quickly find data. In theory, I understand how a large
> block size together with an MPP architecture, as well as what I
> understand to be a massive indexing scheme via MapReduce, can be used
> to find data.
>
> What I don't understand is how, after you identify the appropriate 64MB
> block, you find the data you're specifically after. Does this mean the
> CPU has to search the entire 64MB block for the data of interest? If
> so, how does Hadoop know what data from that block to retrieve?
>
> I'm assuming the block is probably composed of one or more files. If
> not, I'm assuming the user isn't looking for the entire 64MB block,
> rather a portion of it.
>
> Any pointers to documentation, books, or articles on the subject would
> be much appreciated.
>
> Regards,
>
> Mike
>

--
View this message in context: http://old.nabble.com/Need-help-understanding-Hadoop-Architecture-tp32705405p32722610.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
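P.S. On the specific point Mike asks about: the client never scans a whole 64MB block. A file is split into fixed-size blocks, the NameNode maps a byte offset in the file to the block holding it, and the client then reads only the byte range it needs from a DataNode storing that block. A minimal sketch of that offset arithmetic (plain Python, not real Hadoop code; the 64MB figure matches the default block size of that era):

```python
# Default HDFS block size in the 0.20.x releases: 64 MB.
BLOCK_SIZE = 64 * 1024 * 1024

def locate(offset):
    """Map a byte offset within a file to (block index, offset inside that block).

    This is the arithmetic the client-side library effectively performs
    before asking the NameNode which DataNodes hold that block.
    """
    return offset // BLOCK_SIZE, offset % BLOCK_SIZE

# A read at byte 150,000,000 of a file lands in the third block (index 2),
# roughly 15 MB into that block; only those bytes are fetched, not the
# entire 64MB block.
block_index, block_offset = locate(150_000_000)
print(block_index, block_offset)
```

What lives *inside* the block (record boundaries, keys, etc.) is a concern of the file format and of MapReduce input splits, not of HDFS itself; the book's HDFS and MapReduce chapters cover both sides.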
