Thanks again for your response. I could not understand a few things in your reply, so I would like to clarify them. Please find my questions inline.

On Thu, May 7, 2009 at 2:28 AM, Todd Lipcon <t...@cloudera.com> wrote:
> On Wed, May 6, 2009 at 1:46 PM, Foss User <foss...@gmail.com> wrote:
>> 2. Is the meta data for file blocks on data node kept in the
>> underlying OS's file system on namenode or is it kept in RAM of the
>> name node?
>>
>
> The block locations are kept in the RAM of the name node, and are updated
> whenever a Datanode does a "block report". This is why the namenode is in
> "safe mode" at startup until it has received block locations for some
> configurable percentage of blocks from the datanodes.
>

What is "safe mode" in the namenode? This concept is new to me. Could you
please explain it?

>
>>
>> 3. If no more mapper functions can be run on the node that
>> contains the data on which the mapper has to act, is Hadoop
>> intelligent enough to run the new mappers on some machines within the
>> same rack?
>>
>
> Yes, assuming you have configured a network topology script. Otherwise,
> Hadoop has no magical knowledge of your network infrastructure, and it
> treats the whole cluster as a single rack called /default-rack
>

Is it a network topology script or Java plugin code? AFAIK, we need to
write an implementation of the org.apache.hadoop.net.DNSToSwitchMapping
interface. Can we write it as a script or configuration file instead, and
avoid Java coding to achieve this? If so, how?
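
To make the last question concrete, here is a rough sketch of what I imagine such a script might look like. It is only a guess on my part: the host addresses, the rack names /rack1 and /rack2, the resolve helper, and the assumption that the script receives host addresses as command-line arguments and prints one rack path per argument are all mine, not something taken from your reply.

```python
#!/usr/bin/env python3
"""Hypothetical rack-mapping script: prints one rack path per host argument."""
import sys

# Hypothetical host-to-rack table; these addresses and rack names are made up.
RACKS = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

# Fallback rack name, taken from the /default-rack mentioned in the reply above.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return the rack path for each host, in the same order as the input."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Assumption: the hosts to resolve arrive as command-line arguments.
    print(" ".join(resolve(sys.argv[1:])))
```

If something along these lines is what you meant, how do I tell Hadoop where to find the script?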