Author: hairong
Date: Mon Jul  7 12:14:06 2008
New Revision: 674598

URL: http://svn.apache.org/viewvc?rev=674598&view=rev
Log:
Merge -r 674587:674588 from trunk to branch 0.18 to move the change log entry
for HADOOP-3688 into the release 0.18.0 section

Modified:
    hadoop/core/branches/branch-0.18/CHANGES.txt
    hadoop/core/branches/branch-0.18/docs/changes.html
    hadoop/core/branches/branch-0.18/docs/hdfs_design.html
    hadoop/core/branches/branch-0.18/docs/hdfs_design.pdf
    hadoop/core/branches/branch-0.18/docs/hdfs_quota_admin_guide.html
    hadoop/core/branches/branch-0.18/docs/hdfs_quota_admin_guide.pdf
    hadoop/core/branches/branch-0.18/docs/hdfs_user_guide.html
    hadoop/core/branches/branch-0.18/docs/hdfs_user_guide.pdf
    hadoop/core/branches/branch-0.18/src/docs/src/documentation/content/xdocs/hdfs_design.xml
    hadoop/core/branches/branch-0.18/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
    hadoop/core/branches/branch-0.18/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml

Modified: hadoop/core/branches/branch-0.18/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.18/CHANGES.txt?rev=674598&r1=674597&r2=674598&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.18/CHANGES.txt (original)
+++ hadoop/core/branches/branch-0.18/CHANGES.txt Mon Jul  7 12:14:06 2008
@@ -320,6 +320,8 @@
 
     HADOOP-3100. Develop tests to test the DFS command line interface. (mukund)
 
+    HADOOP-3688. Fix up HDFS docs. (Robert Chansler via hairong)
+
   OPTIMIZATIONS
 
     HADOOP-3274. The default constructor of BytesWritable creates empty 

Modified: hadoop/core/branches/branch-0.18/docs/changes.html
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.18/docs/changes.html?rev=674598&r1=674597&r2=674598&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.18/docs/changes.html (original)
+++ hadoop/core/branches/branch-0.18/docs/changes.html Mon Jul  7 12:14:06 2008
@@ -187,7 +187,7 @@
     </ol>
   </li>
   <li><a 
href="javascript:toggleList('release_0.18.0_-_unreleased_._improvements_')">  
IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(45)
+</a>&nbsp;&nbsp;&nbsp;(46)
     <ol id="release_0.18.0_-_unreleased_._improvements_">
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-2928";>HADOOP-2928</a>. Remove 
deprecated FileSystem.getContentLength().<br />(Lohit Vjayarenu via 
rangadi)</li>
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3130";>HADOOP-3130</a>. Make 
the connect timeout smaller for getFile.<br />(Amar Ramesh Kamat via ddas)</li>
@@ -278,6 +278,7 @@
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3606";>HADOOP-3606</a>. 
Updates the Streaming doc.<br />(Amareshwari Sriramadasu via ddas)</li>
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3532";>HADOOP-3532</a>. Add 
jdiff reports to the build scripts.<br />(omalley)</li>
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3100";>HADOOP-3100</a>. 
Develop tests to test the DFS command line interface.<br />(mukund)</li>
+      <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3688";>HADOOP-3688</a>. Fix up 
HDFS docs.<br />(Robert Chansler via hairong)</li>
     </ol>
   </li>
   <li><a 
href="javascript:toggleList('release_0.18.0_-_unreleased_._optimizations_')">  
OPTIMIZATIONS
@@ -565,7 +566,7 @@
     </ol>
   </li>
   <li><a 
href="javascript:toggleList('release_0.17.1_-_unreleased_._bug_fixes_')">  BUG 
FIXES
-</a>&nbsp;&nbsp;&nbsp;(4)
+</a>&nbsp;&nbsp;&nbsp;(5)
     <ol id="release_0.17.1_-_unreleased_._bug_fixes_">
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-1979";>HADOOP-1979</a>. Speed 
up fsck by adding a buffered stream.<br />(Lohit
 Vijaya Renu via omalley)</li>
@@ -574,6 +575,8 @@
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3571";>HADOOP-3571</a>. Fix 
bug in block removal used in lease recovery.<br />(shv)</li>
       <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3645";>HADOOP-3645</a>. 
MetricsTimeVaryingRate returns wrong value for
 metric_avg_time.<br />(Lohit Vijayarenu via hairong)</li>
+      <li><a 
href="http://issues.apache.org/jira/browse/HADOOP-3633";>HADOOP-3633</a>. 
Correct exception handling in DataXceiveServer, and throttle
+the number of xceiver threads in a data-node.<br />(shv)</li>
     </ol>
   </li>
 </ul>

Modified: hadoop/core/branches/branch-0.18/docs/hdfs_design.html
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.18/docs/hdfs_design.html?rev=674598&r1=674597&r2=674598&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.18/docs/hdfs_design.html (original)
+++ hadoop/core/branches/branch-0.18/docs/hdfs_design.html Mon Jul  7 12:14:06 2008
@@ -221,7 +221,7 @@
 </ul>
 </li>
 <li>
-<a href="#Namenode+and+Datanodes"> Namenode and Datanodes </a>
+<a href="#NameNode+and+DataNodes"> NameNode and DataNodes </a>
 </li>
 <li>
 <a href="#The+File+System+Namespace"> The File System Namespace </a>
@@ -236,7 +236,7 @@
 <a href="#Replica+Selection"> Replica Selection </a>
 </li>
 <li>
-<a href="#SafeMode"> SafeMode </a>
+<a href="#Safemode"> Safemode </a>
 </li>
 </ul>
 </li>
@@ -284,7 +284,7 @@
 <a href="#Accessibility"> Accessibility </a>
 <ul class="minitoc">
 <li>
-<a href="#DFSShell"> DFSShell </a>
+<a href="#FS+Shell"> FS Shell </a>
 </li>
 <li>
 <a href="#DFSAdmin"> DFSAdmin </a>
@@ -341,7 +341,7 @@
 <a name="N1004A"></a><a name="Simple+Coherency+Model"></a>
 <h3 class="h4"> Simple Coherency Model </h3>
 <p>
-        HDFS applications need a write-once-read-many access model for files. 
A file once created, written, and closed need not be changed. This assumption 
simplifies data coherency issues and enables high throughput data access. A 
MapReduce application or a web crawler application fits perfectly with this 
model. There is a plan to support appending-writes to files in the future. 
+        HDFS applications need a write-once-read-many access model for files. 
A file once created, written, and closed need not be changed. This assumption 
simplifies data coherency issues and enables high throughput data access. A 
Map/Reduce application or a web crawler application fits perfectly with this 
model. There is a plan to support appending-writes to files in the future. 
         </p>
 <a name="N10054"></a><a 
name="%E2%80%9CMoving+Computation+is+Cheaper+than+Moving+Data%E2%80%9D"></a>
 <h3 class="h4"> &ldquo;Moving Computation is Cheaper than Moving Data&rdquo; 
</h3>
@@ -357,51 +357,51 @@
 
  
     
-<a name="N10069"></a><a name="Namenode+and+Datanodes"></a>
-<h2 class="h3"> Namenode and Datanodes </h2>
+<a name="N10069"></a><a name="NameNode+and+DataNodes"></a>
+<h2 class="h3"> NameNode and DataNodes </h2>
 <div class="section">
 <p>
-      HDFS has a master/slave architecture. An HDFS cluster consists of a 
single <em>Namenode</em>, a master server that manages the file system 
namespace and regulates access to files by clients. In addition, there are a 
number of <em>Datanodes</em>, usually one per node in the cluster, which manage 
storage attached to the nodes that they run on. HDFS exposes a file system 
namespace and allows user data to be stored in files. Internally, a file is 
split into one or more blocks and these blocks are stored in a set of 
Datanodes. The Namenode executes file system namespace operations like opening, 
closing, and renaming files and directories. It also determines the mapping of 
blocks to Datanodes. The Datanodes are responsible for serving read and write 
requests from the file system&rsquo;s clients. The Datanodes also perform block 
creation, deletion, and replication upon instruction from the Namenode.
+      HDFS has a master/slave architecture. An HDFS cluster consists of a 
single NameNode, a master server that manages the file system namespace and 
regulates access to files by clients. In addition, there are a number of 
DataNodes, usually one per node in the cluster, which manage storage attached 
to the nodes that they run on. HDFS exposes a file system namespace and allows 
user data to be stored in files. Internally, a file is split into one or more 
blocks and these blocks are stored in a set of DataNodes. The NameNode executes 
file system namespace operations like opening, closing, and renaming files and 
directories. It also determines the mapping of blocks to DataNodes. The 
DataNodes are responsible for serving read and write requests from the file 
system&rsquo;s clients. The DataNodes also perform block creation, deletion, 
and replication upon instruction from the NameNode.
       </p>
 <div id="" style="text-align: center;">
 <img id="" class="figure" alt="HDFS Architecture" 
src="images/hdfsarchitecture.gif"></div>
 <p>
-      The Namenode and Datanode are pieces of software designed to run on 
commodity machines. These machines typically run a GNU/Linux operating system 
(<acronym title="operating system">OS</acronym>). HDFS is built using the Java 
language; any machine that supports Java can run the Namenode or the Datanode 
software. Usage of the highly portable Java language means that HDFS can be 
deployed on a wide range of machines. A typical deployment has a dedicated 
machine that runs only the Namenode software. Each of the other machines in the 
cluster runs one instance of the Datanode software. The architecture does not 
preclude running multiple Datanodes on the same machine but in a real 
deployment that is rarely the case.
+      The NameNode and DataNode are pieces of software designed to run on 
commodity machines. These machines typically run a GNU/Linux operating system 
(<acronym title="operating system">OS</acronym>). HDFS is built using the Java 
language; any machine that supports Java can run the NameNode or the DataNode 
software. Usage of the highly portable Java language means that HDFS can be 
deployed on a wide range of machines. A typical deployment has a dedicated 
machine that runs only the NameNode software. Each of the other machines in the 
cluster runs one instance of the DataNode software. The architecture does not 
preclude running multiple DataNodes on the same machine but in a real 
deployment that is rarely the case.
       </p>
 <p>
-      The existence of a single Namenode in a cluster greatly simplifies the 
architecture of the system. The Namenode is the arbitrator and repository for 
all HDFS metadata. The system is designed in such a way that user <em>data</em> 
never flows through the Namenode.
+      The existence of a single NameNode in a cluster greatly simplifies the 
architecture of the system. The NameNode is the arbitrator and repository for 
all HDFS metadata. The system is designed in such a way that user data never 
flows through the NameNode.
       </p>
 </div>
 
  
 
     
-<a name="N1008A"></a><a name="The+File+System+Namespace"></a>
+<a name="N10081"></a><a name="The+File+System+Namespace"></a>
 <h2 class="h3"> The File System Namespace </h2>
 <div class="section">
 <p>
       HDFS supports a traditional hierarchical file organization. A user or an 
application can create directories and store files inside these directories. 
The file system namespace hierarchy is similar to most other existing file 
systems; one can create and remove files, move a file from one directory to 
another, or rename a file. HDFS does not yet implement user quotas or access 
permissions. HDFS does not support hard links or soft links. However, the HDFS 
architecture does not preclude implementing these features.
       </p>
 <p>
-      The Namenode maintains the file system namespace. Any change to the file 
system namespace or its properties is recorded by the Namenode. An application 
can specify the number of replicas of a file that should be maintained by HDFS. 
The number of copies of a file is called the replication factor of that file. 
This information is stored by the Namenode.
+      The NameNode maintains the file system namespace. Any change to the file 
system namespace or its properties is recorded by the NameNode. An application 
can specify the number of replicas of a file that should be maintained by HDFS. 
The number of copies of a file is called the replication factor of that file. 
This information is stored by the NameNode.
       </p>
 </div>
 
  
 
     
-<a name="N10097"></a><a name="Data+Replication"></a>
+<a name="N1008E"></a><a name="Data+Replication"></a>
 <h2 class="h3"> Data Replication </h2>
 <div class="section">
 <p>
       HDFS is designed to reliably store very large files across machines in a 
large cluster. It stores each file as a sequence of blocks; all blocks in a 
file except the last block are the same size. The blocks of a file are 
replicated for fault tolerance. The block size and replication factor are 
configurable per file. An application can specify the number of replicas of a 
file. The replication factor can be specified at file creation time and can be 
changed later. Files in HDFS are write-once and have strictly one writer at any 
time. 
       </p>
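
As a concrete illustration of per-file block size and replication, the FileSystem.create overload takes both values at creation time; the file name and numbers below are assumptions for the sketch. The replication factor of an existing file can later be changed with setReplication, sketched under Decrease Replication Factor further below.

    // Sketch: choosing replication and block size when a file is created.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CreateWithReplication {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/data.seq");   // illustrative path

        // Three replicas and a 64 MB block size for this particular file.
        FSDataOutputStream out = fs.create(file,
            true,                 // overwrite
            4096,                 // I/O buffer size
            (short) 3,            // replication factor
            64L * 1024 * 1024);   // block size in bytes
        out.write(new byte[] {1, 2, 3});
        out.close();
      }
    }
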
 <p>
-      The Namenode makes all decisions regarding replication of blocks. It 
periodically receives a <em>Heartbeat</em> and a <em>Blockreport</em> from each 
of the Datanodes in the cluster. Receipt of a Heartbeat implies that the 
Datanode is functioning properly. A Blockreport contains a list of all blocks 
on a Datanode. 
+      The NameNode makes all decisions regarding replication of blocks. It 
periodically receives a Heartbeat and a Blockreport from each of the DataNodes 
in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning 
properly. A Blockreport contains a list of all blocks on a DataNode. 
     </p>
 <div id="" style="text-align: center;">
-<img id="" class="figure" alt="HDFS Datanodes" 
src="images/hdfsdatanodes.gif"></div>
-<a name="N100AD"></a><a name="Replica+Placement%3A+The+First+Baby+Steps"></a>
+<img id="" class="figure" alt="HDFS DataNodes" 
src="images/hdfsdatanodes.gif"></div>
+<a name="N1009E"></a><a name="Replica+Placement%3A+The+First+Baby+Steps"></a>
 <h3 class="h4"> Replica Placement: The First Baby Steps </h3>
 <p>
         The placement of replicas is critical to HDFS reliability and 
performance. Optimizing replica placement distinguishes HDFS from most other 
distributed file systems. This is a feature that needs lots of tuning and 
experience. The purpose of a rack-aware replica placement policy is to improve 
data reliability, availability, and network bandwidth utilization. The current 
implementation for the replica placement policy is a first effort in this 
direction. The short-term goals of implementing this policy are to validate it 
on production systems, learn more about its behavior, and build a foundation to 
test and research more sophisticated policies. 
@@ -418,76 +418,76 @@
 <p>
         The current, default replica placement policy described here is a work 
in progress.
         </p>
-<a name="N100C7"></a><a name="Replica+Selection"></a>
+<a name="N100B8"></a><a name="Replica+Selection"></a>
 <h3 class="h4"> Replica Selection </h3>
 <p>
         To minimize global bandwidth consumption and read latency, HDFS tries 
to satisfy a read request from a replica that is closest to the reader. If 
there exists a replica on the same rack as the reader node, then that replica 
is preferred to satisfy the read request. If an HDFS cluster spans multiple 
data centers, then a replica that is resident in the local data center is 
preferred over any remote replica.
         </p>
-<a name="N100D1"></a><a name="SafeMode"></a>
-<h3 class="h4"> SafeMode </h3>
+<a name="N100C2"></a><a name="Safemode"></a>
+<h3 class="h4"> Safemode </h3>
 <p>
-        On startup, the Namenode enters a special state called 
<em>Safemode</em>. Replication of data blocks does not occur when the Namenode 
is in the Safemode state. The Namenode receives Heartbeat and Blockreport 
messages from the Datanodes. A Blockreport contains the list of data blocks 
that a Datanode is hosting. Each block has a specified minimum number of 
replicas. A block is considered <em>safely replicated</em> when the minimum 
number of replicas of that data block has checked in with the Namenode. After a 
configurable percentage of safely replicated data blocks checks in with the 
Namenode (plus an additional 30 seconds), the Namenode exits the Safemode 
state. It then determines the list of data blocks (if any) that still have 
fewer than the specified number of replicas. The Namenode then replicates these 
blocks to other Datanodes.
+        On startup, the NameNode enters a special state called Safemode. 
Replication of data blocks does not occur when the NameNode is in the Safemode 
state. The NameNode receives Heartbeat and Blockreport messages from the 
DataNodes. A Blockreport contains the list of data blocks that a DataNode is 
hosting. Each block has a specified minimum number of replicas. A block is 
considered safely replicated when the minimum number of replicas of that data 
block has checked in with the NameNode. After a configurable percentage of 
safely replicated data blocks checks in with the NameNode (plus an additional 
30 seconds), the NameNode exits the Safemode state. It then determines the list 
of data blocks (if any) that still have fewer than the specified number of 
replicas. The NameNode then replicates these blocks to other DataNodes.
         </p>
 </div>
 
     
-<a name="N100E2"></a><a name="The+Persistence+of+File+System+Metadata"></a>
+<a name="N100CD"></a><a name="The+Persistence+of+File+System+Metadata"></a>
 <h2 class="h3"> The Persistence of File System Metadata </h2>
 <div class="section">
 <p>
-        The HDFS namespace is stored by the Namenode. The Namenode uses a 
transaction log called the <em>EditLog</em> to persistently record every change 
that occurs to file system <em>metadata</em>. For example, creating a new file 
in HDFS causes the Namenode to insert a record into the EditLog indicating 
this. Similarly, changing the replication factor of a file causes a new record 
to be inserted into the EditLog. The Namenode uses a file in its <em>local</em> 
host OS file system to store the EditLog. The entire file system namespace, 
including the mapping of blocks to files and file system properties, is stored 
in a file called the <em>FsImage</em>. The FsImage is stored as a file in the 
Namenode&rsquo;s local file system too.
+        The HDFS namespace is stored by the NameNode. The NameNode uses a 
transaction log called the EditLog to persistently record every change that 
occurs to file system metadata. For example, creating a new file in HDFS causes 
the NameNode to insert a record into the EditLog indicating this. Similarly, 
changing the replication factor of a file causes a new record to be inserted 
into the EditLog. The NameNode uses a file in its local host OS file system to 
store the EditLog. The entire file system namespace, including the mapping of 
blocks to files and file system properties, is stored in a file called the 
FsImage. The FsImage is stored as a file in the NameNode&rsquo;s local file 
system too.
         </p>
 <p>
-        The Namenode keeps an image of the entire file system namespace and 
file <em>Blockmap</em> in memory. This key metadata item is designed to be 
compact, such that a Namenode with 4 GB of RAM is plenty to support a huge 
number of files and directories. When the Namenode starts up, it reads the 
FsImage and EditLog from disk, applies all the transactions from the EditLog to 
the in-memory representation of the FsImage, and flushes out this new version 
into a new FsImage on disk. It can then truncate the old EditLog because its 
transactions have been applied to the persistent FsImage. This process is 
called a <em>checkpoint</em>. In the current implementation, a checkpoint only 
occurs when the Namenode starts up. Work is in progress to support periodic 
checkpointing in the near future.
+        The NameNode keeps an image of the entire file system namespace and 
file Blockmap in memory. This key metadata item is designed to be compact, such 
that a NameNode with 4 GB of RAM is plenty to support a huge number of files 
and directories. When the NameNode starts up, it reads the FsImage and EditLog 
from disk, applies all the transactions from the EditLog to the in-memory 
representation of the FsImage, and flushes out this new version into a new 
FsImage on disk. It can then truncate the old EditLog because its transactions 
have been applied to the persistent FsImage. This process is called a 
checkpoint. In the current implementation, a checkpoint only occurs when the 
NameNode starts up. Work is in progress to support periodic checkpointing in 
the near future.
         </p>
 <p>
-        The Datanode stores HDFS data in files in its local file system. The 
Datanode has no knowledge about HDFS files. It stores each block of HDFS data 
in a separate file in its local file system. The Datanode does not create all 
files in the same directory. Instead, it uses a heuristic to determine the 
optimal number of files per directory and creates subdirectories appropriately. 
It is not optimal to create all local files in the same directory because the 
local file system might not be able to efficiently support a huge number of 
files in a single directory. When a Datanode starts up, it scans through its 
local file system, generates a list of all HDFS data blocks that correspond to 
each of these local files and sends this report to the Namenode: this is the 
Blockreport. 
+        The DataNode stores HDFS data in files in its local file system. The 
DataNode has no knowledge about HDFS files. It stores each block of HDFS data 
in a separate file in its local file system. The DataNode does not create all 
files in the same directory. Instead, it uses a heuristic to determine the 
optimal number of files per directory and creates subdirectories appropriately. 
It is not optimal to create all local files in the same directory because the 
local file system might not be able to efficiently support a huge number of 
files in a single directory. When a DataNode starts up, it scans through its 
local file system, generates a list of all HDFS data blocks that correspond to 
each of these local files and sends this report to the NameNode: this is the 
Blockreport. 
         </p>
 </div>
 
 
     
-<a name="N10104"></a><a name="The+Communication+Protocols"></a>
+<a name="N100DD"></a><a name="The+Communication+Protocols"></a>
 <h2 class="h3"> The Communication Protocols </h2>
 <div class="section">
 <p>
-      All HDFS communication protocols are layered on top of the TCP/IP 
protocol. A client establishes a connection to a configurable <acronym 
title="Transmission Control Protocol">TCP</acronym> port on the Namenode 
machine. It talks the <em>ClientProtocol</em> with the Namenode. The Datanodes 
talk to the Namenode using the <em>DatanodeProtocol</em>. A Remote Procedure 
Call (<acronym title="Remote Procedure Call">RPC</acronym>) abstraction wraps 
both the ClientProtocol and the DatanodeProtocol. By design, the Namenode never 
initiates any RPCs. Instead, it only responds to RPC requests issued by 
Datanodes or clients. 
+      All HDFS communication protocols are layered on top of the TCP/IP 
protocol. A client establishes a connection to a configurable <acronym 
title="Transmission Control Protocol">TCP</acronym> port on the NameNode 
machine. It talks the ClientProtocol with the NameNode. The DataNodes talk to 
the NameNode using the DataNode Protocol. A Remote Procedure Call (<acronym 
title="Remote Procedure Call">RPC</acronym>) abstraction wraps both the Client 
Protocol and the DataNode Protocol. By design, the NameNode never initiates any 
RPCs. Instead, it only responds to RPC requests issued by DataNodes or clients. 
       </p>
 </div>
  
 
     
-<a name="N1011C"></a><a name="Robustness"></a>
+<a name="N100EF"></a><a name="Robustness"></a>
 <h2 class="h3"> Robustness </h2>
 <div class="section">
 <p>
-      The primary objective of HDFS is to store data reliably even in the 
presence of failures. The three common types of failures are Namenode failures, 
Datanode failures and network partitions.
+      The primary objective of HDFS is to store data reliably even in the 
presence of failures. The three common types of failures are NameNode failures, 
DataNode failures and network partitions.
       </p>
-<a name="N10125"></a><a 
name="Data+Disk+Failure%2C+Heartbeats+and+Re-Replication"></a>
+<a name="N100F8"></a><a 
name="Data+Disk+Failure%2C+Heartbeats+and+Re-Replication"></a>
 <h3 class="h4"> Data Disk Failure, Heartbeats and Re-Replication </h3>
 <p>
-        Each Datanode sends a Heartbeat message to the Namenode periodically. 
A network partition can cause a subset of Datanodes to lose connectivity with 
the Namenode. The Namenode detects this condition by the absence of a Heartbeat 
message. The Namenode marks Datanodes without recent Heartbeats as dead and 
does not forward any new <acronym title="Input/Output">IO</acronym> requests to 
them. Any data that was registered to a dead Datanode is not available to HDFS 
any more. Datanode death may cause the replication factor of some blocks to 
fall below their specified value. The Namenode constantly tracks which blocks 
need to be replicated and initiates replication whenever necessary. The 
necessity for re-replication may arise due to many reasons: a Datanode may 
become unavailable, a replica may become corrupted, a hard disk on a Datanode 
may fail, or the replication factor of a file may be increased. 
+        Each DataNode sends a Heartbeat message to the NameNode periodically. 
A network partition can cause a subset of DataNodes to lose connectivity with 
the NameNode. The NameNode detects this condition by the absence of a Heartbeat 
message. The NameNode marks DataNodes without recent Heartbeats as dead and 
does not forward any new <acronym title="Input/Output">IO</acronym> requests to 
them. Any data that was registered to a dead DataNode is not available to HDFS 
any more. DataNode death may cause the replication factor of some blocks to 
fall below their specified value. The NameNode constantly tracks which blocks 
need to be replicated and initiates replication whenever necessary. The 
necessity for re-replication may arise due to many reasons: a DataNode may 
become unavailable, a replica may become corrupted, a hard disk on a DataNode 
may fail, or the replication factor of a file may be increased. 
         </p>
-<a name="N10133"></a><a name="Cluster+Rebalancing"></a>
+<a name="N10106"></a><a name="Cluster+Rebalancing"></a>
 <h3 class="h4"> Cluster Rebalancing </h3>
 <p>
-        The HDFS architecture is compatible with <em>data rebalancing 
schemes</em>. A scheme might automatically move data from one Datanode to 
another if the free space on a Datanode falls below a certain threshold. In the 
event of a sudden high demand for a particular file, a scheme might dynamically 
create additional replicas and rebalance other data in the cluster. These types 
of data rebalancing schemes are not yet implemented. 
+        The HDFS architecture is compatible with data rebalancing schemes. A 
scheme might automatically move data from one DataNode to another if the free 
space on a DataNode falls below a certain threshold. In the event of a sudden 
high demand for a particular file, a scheme might dynamically create additional 
replicas and rebalance other data in the cluster. These types of data 
rebalancing schemes are not yet implemented. 
         </p>
-<a name="N10140"></a><a name="Data+Integrity"></a>
+<a name="N10110"></a><a name="Data+Integrity"></a>
 <h3 class="h4"> Data Integrity </h3>
 <p>
         <!-- XXX "checksum checking" sounds funny -->
-        It is possible that a block of data fetched from a Datanode arrives 
corrupted. This corruption can occur because of faults in a storage device, 
network faults, or buggy software. The HDFS client software implements checksum 
checking on the contents of HDFS files. When a client creates an HDFS file, it 
computes a checksum of each block of the file and stores these checksums in a 
separate hidden file in the same HDFS namespace. When a client retrieves file 
contents it verifies that the data it received from each Datanode matches the 
checksum stored in the associated checksum file. If not, then the client can 
opt to retrieve that block from another Datanode that has a replica of that 
block.
+        It is possible that a block of data fetched from a DataNode arrives 
corrupted. This corruption can occur because of faults in a storage device, 
network faults, or buggy software. The HDFS client software implements checksum 
checking on the contents of HDFS files. When a client creates an HDFS file, it 
computes a checksum of each block of the file and stores these checksums in a 
separate hidden file in the same HDFS namespace. When a client retrieves file 
contents it verifies that the data it received from each DataNode matches the 
checksum stored in the associated checksum file. If not, then the client can 
opt to retrieve that block from another DataNode that has a replica of that 
block.
         </p>
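
From the reader's side this verification is transparent: reading through the stream returned by open checks each block against its hidden checksum file, and an unrecoverable mismatch is assumed here to surface as a ChecksumException. A hedged sketch:

    // Sketch: checksum verification happens inside the client read path.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.ChecksumException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadWithVerification {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        try {
          FSDataInputStream in = fs.open(new Path("/user/demo/data.seq"));
          byte[] buf = new byte[4096];
          while (in.read(buf) > 0) {
            // each block's data is checked against the stored checksums;
            // on a mismatch the client may fall back to another replica
          }
          in.close();
        } catch (ChecksumException e) {
          System.err.println("corrupt block data: " + e.getMessage());
        }
      }
    }
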
-<a name="N1014C"></a><a name="Metadata+Disk+Failure"></a>
+<a name="N1011C"></a><a name="Metadata+Disk+Failure"></a>
 <h3 class="h4"> Metadata Disk Failure </h3>
 <p>
-        The FsImage and the EditLog are central data structures of HDFS. A 
corruption of these files can cause the HDFS instance to be non-functional. For 
this reason, the Namenode can be configured to support maintaining multiple 
copies of the FsImage and EditLog. Any update to either the FsImage or EditLog 
causes each of the FsImages and EditLogs to get updated synchronously. This 
synchronous updating of multiple copies of the FsImage and EditLog may degrade 
the rate of namespace transactions per second that a Namenode can support. 
However, this degradation is acceptable because even though HDFS applications 
are very <em>data</em> intensive in nature, they are not <em>metadata</em> 
intensive. When a Namenode restarts, it selects the latest consistent FsImage 
and EditLog to use.
+        The FsImage and the EditLog are central data structures of HDFS. A 
corruption of these files can cause the HDFS instance to be non-functional. For 
this reason, the NameNode can be configured to support maintaining multiple 
copies of the FsImage and EditLog. Any update to either the FsImage or EditLog 
causes each of the FsImages and EditLogs to get updated synchronously. This 
synchronous updating of multiple copies of the FsImage and EditLog may degrade 
the rate of namespace transactions per second that a NameNode can support. 
However, this degradation is acceptable because even though HDFS applications 
are very data intensive in nature, they are not metadata intensive. When a 
NameNode restarts, it selects the latest consistent FsImage and EditLog to use.
         </p>
 <p> 
-        The Namenode machine is a single point of failure for an HDFS cluster. 
If the Namenode machine fails, manual intervention is necessary. Currently, 
automatic restart and failover of the Namenode software to another machine is 
not supported.
+        The NameNode machine is a single point of failure for an HDFS cluster. 
If the NameNode machine fails, manual intervention is necessary. Currently, 
automatic restart and failover of the NameNode software to another machine is 
not supported.
         </p>
-<a name="N1015F"></a><a name="Snapshots"></a>
+<a name="N10129"></a><a name="Snapshots"></a>
 <h3 class="h4"> Snapshots </h3>
 <p>
         Snapshots support storing a copy of data at a particular instant of 
time. One usage of the snapshot feature may be to roll back a corrupted HDFS 
instance to a previously known good point in time. HDFS does not currently 
support snapshots but will in a future release.
@@ -496,40 +496,40 @@
  
 
     
-<a name="N1016A"></a><a name="Data+Organization"></a>
+<a name="N10134"></a><a name="Data+Organization"></a>
 <h2 class="h3"> Data Organization </h2>
 <div class="section">
-<a name="N10172"></a><a name="Data+Blocks"></a>
+<a name="N1013C"></a><a name="Data+Blocks"></a>
 <h3 class="h4"> Data Blocks </h3>
 <p>
-        HDFS is designed to support very large files. Applications that are 
compatible with HDFS are those that deal with large data sets. These 
applications write their data only once but they read it one or more times and 
require these reads to be satisfied at streaming speeds. HDFS supports 
write-once-read-many semantics on files. A typical block size used by HDFS is 
64 MB. Thus, an HDFS file is chopped up into 64 MB chunks, and if possible, 
each chunk will reside on a different Datanode.
+        HDFS is designed to support very large files. Applications that are 
compatible with HDFS are those that deal with large data sets. These 
applications write their data only once but they read it one or more times and 
require these reads to be satisfied at streaming speeds. HDFS supports 
write-once-read-many semantics on files. A typical block size used by HDFS is 
64 MB. Thus, an HDFS file is chopped up into 64 MB chunks, and if possible, 
each chunk will reside on a different DataNode.
         </p>
-<a name="N1017C"></a><a name="Staging"></a>
+<a name="N10146"></a><a name="Staging"></a>
 <h3 class="h4"> Staging </h3>
 <p>
-        A client request to create a file does not reach the Namenode 
immediately. In fact, initially the HDFS client caches the file data into a 
temporary local file. Application writes are transparently redirected to this 
temporary local file. When the local file accumulates data worth over one HDFS 
block size, the client contacts the Namenode. The Namenode inserts the file 
name into the file system hierarchy and allocates a data block for it. The 
Namenode responds to the client request with the identity of the Datanode and 
the destination data block. Then the client flushes the block of data from the 
local temporary file to the specified Datanode. When a file is closed, the 
remaining un-flushed data in the temporary local file is transferred to the 
Datanode. The client then tells the Namenode that the file is closed. At this 
point, the Namenode commits the file creation operation into a persistent 
store. If the Namenode dies before the file is closed, the file is lost. 
+        A client request to create a file does not reach the NameNode 
immediately. In fact, initially the HDFS client caches the file data into a 
temporary local file. Application writes are transparently redirected to this 
temporary local file. When the local file accumulates data worth over one HDFS 
block size, the client contacts the NameNode. The NameNode inserts the file 
name into the file system hierarchy and allocates a data block for it. The 
NameNode responds to the client request with the identity of the DataNode and 
the destination data block. Then the client flushes the block of data from the 
local temporary file to the specified DataNode. When a file is closed, the 
remaining un-flushed data in the temporary local file is transferred to the 
DataNode. The client then tells the NameNode that the file is closed. At this 
point, the NameNode commits the file creation operation into a persistent 
store. If the NameNode dies before the file is closed, the file is lost. 
         </p>
 <p>
         The above approach has been adopted after careful consideration of 
target applications that run on HDFS. These applications need streaming writes 
to files. If a client writes to a remote file directly without any client side 
buffering, the network speed and the congestion in the network impacts 
throughput considerably. This approach is not without precedent. Earlier 
distributed file systems, e.g. <acronym title="Andrew File 
System">AFS</acronym>, have used client side caching to improve performance. A 
POSIX requirement has been relaxed to achieve higher performance of data 
uploads. 
         </p>
-<a name="N1018F"></a><a name="Replication+Pipelining"></a>
+<a name="N10159"></a><a name="Replication+Pipelining"></a>
 <h3 class="h4"> Replication Pipelining </h3>
 <p>
-        When a client is writing data to an HDFS file, its data is first 
written to a local file as explained in the previous section. Suppose the HDFS 
file has a replication factor of three. When the local file accumulates a full 
block of user data, the client retrieves a list of Datanodes from the Namenode. 
This list contains the Datanodes that will host a replica of that block. The 
client then flushes the data block to the first Datanode. The first Datanode 
starts receiving the data in small portions (4 KB), writes each portion to its 
local repository and transfers that portion to the second Datanode in the list. 
The second Datanode, in turn starts receiving each portion of the data block, 
writes that portion to its repository and then flushes that portion to the 
third Datanode. Finally, the third Datanode writes the data to its local 
repository. Thus, a Datanode can be receiving data from the previous one in the 
pipeline and at the same time forwarding data to the next o
 ne in the pipeline. Thus, the data is pipelined from one Datanode to the next.
+        When a client is writing data to an HDFS file, its data is first 
written to a local file as explained in the previous section. Suppose the HDFS 
file has a replication factor of three. When the local file accumulates a full 
block of user data, the client retrieves a list of DataNodes from the NameNode. 
This list contains the DataNodes that will host a replica of that block. The 
client then flushes the data block to the first DataNode. The first DataNode 
starts receiving the data in small portions (4 KB), writes each portion to its 
local repository and transfers that portion to the second DataNode in the list. 
The second DataNode, in turn starts receiving each portion of the data block, 
writes that portion to its repository and then flushes that portion to the 
third DataNode. Finally, the third DataNode writes the data to its local 
repository. Thus, a DataNode can be receiving data from the previous one in the 
pipeline and at the same time forwarding data to the next o
 ne in the pipeline. Thus, the data is pipelined from one DataNode to the next.
         </p>
 </div>
 
     
-<a name="N1019A"></a><a name="Accessibility"></a>
+<a name="N10164"></a><a name="Accessibility"></a>
 <h2 class="h3"> Accessibility </h2>
 <div class="section">
 <p>
       HDFS can be accessed from applications in many different ways. Natively, 
HDFS provides a <a href="http://hadoop.apache.org/core/docs/current/api/">Java 
API</a> for applications to use. A C language wrapper for this Java API is also 
available. In addition, an HTTP browser can also be used to browse the files of 
an HDFS instance. Work is in progress to expose HDFS through the <acronym 
title="Web-based Distributed Authoring and Versioning">WebDAV</acronym> 
protocol. 
       </p>
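
Through the Java API mentioned above, a particular HDFS instance can also be addressed by URI rather than through the configured default file system; the host name and port below are assumptions for the sketch.

    // Sketch: addressing an HDFS instance explicitly by URI and listing the root.
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListByUri {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
            URI.create("hdfs://namenode.example.com:9000/"),   // illustrative address
            new Configuration());
        for (FileStatus stat : fs.listStatus(new Path("/"))) {
          System.out.println(stat.getPath());
        }
      }
    }
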
-<a name="N101AF"></a><a name="DFSShell"></a>
-<h3 class="h4"> DFSShell </h3>
+<a name="N10179"></a><a name="FS+Shell"></a>
+<h3 class="h4"> FS Shell </h3>
 <p>
-        HDFS allows user data to be organized in the form of files and 
directories. It provides a commandline interface called <em>DFSShell</em> that 
lets a user interact with the data in HDFS. The syntax of this command set is 
similar to other shells (e.g. bash, csh) that users are already familiar with. 
Here are some sample action/command pairs:
+        HDFS allows user data to be organized in the form of files and 
directories. It provides a commandline interface called  FS shell that lets a 
user interact with the data in HDFS. The syntax of this command set is similar 
to other shells (e.g. bash, csh) that users are already familiar with. Here are 
some sample action/command pairs:
         </p>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -559,12 +559,12 @@
         
 </table>
 <p>
-        DFSShell is targeted for applications that need a scripting language 
to interact with the stored data.
+        FS shell is targeted for applications that need a scripting language 
to interact with the stored data.
         </p>
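
The same command set can also be driven from Java through the FsShell tool, which is convenient when a program wants shell semantics without spawning bin/hadoop; the path used here is an assumption.

    // Sketch: running an FS shell command programmatically.
    import org.apache.hadoop.fs.FsShell;
    import org.apache.hadoop.util.ToolRunner;

    public class ShellFromJava {
      public static void main(String[] args) throws Exception {
        // Equivalent of: bin/hadoop dfs -ls /user/demo
        int rc = ToolRunner.run(new FsShell(), new String[] {"-ls", "/user/demo"});
        System.exit(rc);
      }
    }
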
-<a name="N10207"></a><a name="DFSAdmin"></a>
+<a name="N101CE"></a><a name="DFSAdmin"></a>
 <h3 class="h4"> DFSAdmin </h3>
 <p>
-        The <em>DFSAdmin</em> command set is used for administering an HDFS 
cluster. These are commands that are used only by an HDFS administrator. Here 
are some sample action/command pairs:
+        The DFSAdmin command set is used for administering an HDFS cluster. 
These are commands that are used only by an HDFS administrator. Here are some 
sample action/command pairs:
         </p>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -576,24 +576,24 @@
           
 <tr>
             
-<td colspan="1" rowspan="1"> Put a cluster in SafeMode </td> <td colspan="1" 
rowspan="1"> <span class="codefrag">bin/hadoop dfsadmin -safemode enter</span> 
</td>
+<td colspan="1" rowspan="1"> Put the cluster in Safemode </td> <td colspan="1" 
rowspan="1"> <span class="codefrag">bin/hadoop dfsadmin -safemode enter</span> 
</td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1"> Generate a list of Datanodes </td> <td 
colspan="1" rowspan="1"> <span class="codefrag">bin/hadoop dfsadmin 
-report</span> </td>
+<td colspan="1" rowspan="1"> Generate a list of DataNodes </td> <td 
colspan="1" rowspan="1"> <span class="codefrag">bin/hadoop dfsadmin 
-report</span> </td>
           
 </tr>
           
 <tr>
             
-<td colspan="1" rowspan="1"> Decommission Datanode <span 
class="codefrag">datanodename</span> </td><td colspan="1" rowspan="1"> <span 
class="codefrag">bin/hadoop dfsadmin -decommission datanodename</span> </td>
+<td colspan="1" rowspan="1"> Decommission DataNode <span 
class="codefrag">datanodename</span> </td><td colspan="1" rowspan="1"> <span 
class="codefrag">bin/hadoop dfsadmin -decommission datanodename</span> </td>
           
 </tr>
         
 </table>
-<a name="N10255"></a><a name="Browser+Interface"></a>
+<a name="N10219"></a><a name="Browser+Interface"></a>
 <h3 class="h4"> Browser Interface </h3>
 <p>
         A typical HDFS install configures a web server to expose the HDFS 
namespace through a configurable TCP port. This allows a user to navigate the 
HDFS namespace and view the contents of its files using a web browser.
@@ -601,27 +601,27 @@
 </div> 
 
     
-<a name="N10260"></a><a name="Space+Reclamation"></a>
+<a name="N10224"></a><a name="Space+Reclamation"></a>
 <h2 class="h3"> Space Reclamation </h2>
 <div class="section">
-<a name="N10266"></a><a name="File+Deletes+and+Undeletes"></a>
+<a name="N1022A"></a><a name="File+Deletes+and+Undeletes"></a>
 <h3 class="h4"> File Deletes and Undeletes </h3>
 <p>
-        When a file is deleted by a user or an application, it is not 
immediately removed from HDFS.  Instead, HDFS first renames it to a file in the 
<span class="codefrag">/trash</span> directory. The file can be restored 
quickly as long as it remains in <span class="codefrag">/trash</span>. A file 
remains in <span class="codefrag">/trash</span> for a configurable amount of 
time. After the expiry of its life in <span class="codefrag">/trash</span>, the 
Namenode deletes the file from the HDFS namespace. The deletion of a file 
causes the blocks associated with the file to be freed. Note that there could 
be an appreciable time delay between the time a file is deleted by a user and 
the time of the corresponding increase in free space in HDFS.
+        When a file is deleted by a user or an application, it is not 
immediately removed from HDFS.  Instead, HDFS first renames it to a file in the 
<span class="codefrag">/trash</span> directory. The file can be restored 
quickly as long as it remains in <span class="codefrag">/trash</span>. A file 
remains in <span class="codefrag">/trash</span> for a configurable amount of 
time. After the expiry of its life in <span class="codefrag">/trash</span>, the 
NameNode deletes the file from the HDFS namespace. The deletion of a file 
causes the blocks associated with the file to be freed. Note that there could 
be an appreciable time delay between the time a file is deleted by a user and 
the time of the corresponding increase in free space in HDFS.
         </p>
 <p>
         A user can Undelete a file after deleting it as long as it remains in 
the <span class="codefrag">/trash</span> directory. If a user wants to undelete 
a file that he/she has deleted, he/she can navigate the <span 
class="codefrag">/trash</span> directory and retrieve the file. The <span 
class="codefrag">/trash</span> directory contains only the latest copy of the 
file that was deleted. The <span class="codefrag">/trash</span> directory is 
just like any other directory with one special feature: HDFS applies specified 
policies to automatically delete files from this directory. The current default 
policy is to delete files from <span class="codefrag">/trash</span> that are 
more than 6 hours old. In the future, this policy will be configurable through 
a well defined interface.
         </p>
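
Assuming trash is enabled, the delete-then-restore flow described above might look as follows from a client: the FsShell -rm call deletes the way a user would, and the trash location shown simply mirrors the /trash convention described here and is illustrative rather than authoritative.

    // Sketch: delete via the shell path, then undelete by moving the retained copy back.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsShell;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.util.ToolRunner;

    public class TrashSketch {
      public static void main(String[] args) throws Exception {
        // Delete as a user would: bin/hadoop dfs -rm /user/demo/report.txt
        ToolRunner.run(new FsShell(), new String[] {"-rm", "/user/demo/report.txt"});

        // Undelete while the copy is still retained: move it back out of trash.
        // The trash path here is illustrative; the exact layout is configuration dependent.
        FileSystem fs = FileSystem.get(new Configuration());
        Path kept = new Path("/trash/user/demo/report.txt");
        Path original = new Path("/user/demo/report.txt");
        if (fs.exists(kept)) {
          fs.rename(kept, original);
        }
      }
    }
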
-<a name="N1028E"></a><a name="Decrease+Replication+Factor"></a>
+<a name="N10252"></a><a name="Decrease+Replication+Factor"></a>
 <h3 class="h4"> Decrease Replication Factor </h3>
 <p>
-        When the replication factor of a file is reduced, the Namenode selects 
excess replicas that can be deleted. The next Heartbeat transfers this 
information to the Datanode. The Datanode then removes the corresponding blocks 
and the corresponding free space appears in the cluster. Once again, there 
might be a time delay between the completion of the <span 
class="codefrag">setReplication</span> API call and the appearance of free 
space in the cluster.
+        When the replication factor of a file is reduced, the NameNode selects 
excess replicas that can be deleted. The next Heartbeat transfers this 
information to the DataNode. The DataNode then removes the corresponding blocks 
and the corresponding free space appears in the cluster. Once again, there 
might be a time delay between the completion of the <span 
class="codefrag">setReplication</span> API call and the appearance of free 
space in the cluster.
         </p>
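
The setReplication call referenced above is on FileSystem; a small sketch with an assumed path and target factor:

    // Sketch: lowering a file's replication factor.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DecreaseReplication {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Keep only two replicas of this (illustrative) file; excess replicas are
        // removed lazily, so the freed space appears in the cluster after a delay.
        boolean accepted = fs.setReplication(new Path("/user/demo/data.seq"), (short) 2);
        System.out.println("replication change accepted: " + accepted);
      }
    }
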
 </div>
 
 
     
-<a name="N1029C"></a><a name="References"></a>
+<a name="N10260"></a><a name="References"></a>
 <h2 class="h3"> References </h2>
 <div class="section">
 <p>

