Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "TestFaqPage" page has been changed by SomeOtherAccount. http://wiki.apache.org/hadoop/TestFaqPage?action=diff&rev1=3&rev2=4 -------------------------------------------------- = HDFS = - <<BR>> <<Anchor(3.1)>> '''1. [[#A3.1|If I add new data-nodes to the cluster will HDFS move the blocks to the newly added nodes in order to balance disk space utilization between the nodes?]]''' + == If I add new DataNodes to the cluster will HDFS move the blocks to the newly added nodes in order to balance disk space utilization between the nodes? == No, HDFS will not move blocks to new nodes automatically. However, newly created files will likely have their blocks placed on the new nodes. @@ -193, +193 @@ * [[http://developer.yahoo.com/hadoop/tutorial/module2.html#rebalancing|HDFS Tutorial: Rebalancing]]; * [[http://hadoop.apache.org/core/docs/current/commands_manual.html#balancer|HDFS Commands Guide: balancer]]. - <<BR>> <<Anchor(3.2)>> '''2. [[#A3.2|What is the purpose of the secondary name-node?]]''' + == What is the purpose of the secondary name-node? == The term "secondary name-node" is somewhat misleading. It is not a name-node in the sense that data-nodes cannot connect to the secondary name-node, and in no event it can replace the primary name-node in case of its failure. @@ -201, +201 @@ So if the name-node fails and you can restart it on the same physical node then there is no need to shutdown data-nodes, just the name-node need to be restarted. If you cannot use the old node anymore you will need to copy the latest image somewhere else. The latest image can be found either on the node that used to be the primary before failure if available; or on the secondary name-node. The latter will be the latest checkpoint without subsequent edits logs, that is the most recent name space modifications may be missing there. You will also need to restart the whole cluster in this case. - <<BR>> <<Anchor(3.3)>> '''3. [[#A3.3|Does the name-node stay in safe mode till all under-replicated files are fully replicated?]]''' + == Does the name-node stay in safe mode till all under-replicated files are fully replicated? == No. During safe mode replication of blocks is prohibited. The name-node awaits when all or majority of data-nodes report their blocks. @@ -211, +211 @@ Learn more about safe mode [[http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Safemode|in the HDFS Users' Guide]]. - <<BR>> <<Anchor(3.4)>> '''4. [[#A3.4|How do I set up a hadoop node to use multiple volumes?]]''' + == How do I set up a hadoop node to use multiple volumes? == ''Data-nodes'' can store blocks in multiple directories typically allocated on different local disk drives. In order to setup multiple directories one needs to specify a comma separated list of pathnames as a value of the configuration parameter [[http://hadoop.apache.org/core/docs/current/hadoop-default.html#dfs.data.dir|dfs.data.dir]]. Data-nodes will attempt to place equal amount of data in each of the directories. The ''name-node'' also supports multiple directories, which in the case store the name space image and the edits log. The directories are specified via the [[http://hadoop.apache.org/core/docs/current/hadoop-default.html#dfs.name.dir|dfs.name.dir]] configuration parameter. The name-node directories are used for the name space data replication so that the image and the log could be restored from the remaining volumes if one of them fails. - <<BR>> <<Anchor(3.5)>> '''5. 
- <<BR>> <<Anchor(3.5)>> '''5. [[#A3.5|What happens if one Hadoop client renames a file or a directory containing this file while another client is still writing into it?]]'''
+ == What happens if one Hadoop client renames a file or a directory containing this file while another client is still writing into it? ==

Starting with release hadoop-0.15, a file will appear in the name space as soon as it is created. If a writer is writing to a file and another client renames either the file itself or any of its path components, then the original writer will get an IOException either when it finishes writing to the current block or when it closes the file.

- <<BR>> <<Anchor(3.6)>> '''6. [[#A3.6|I want to make a large cluster smaller by taking out a bunch of nodes simultaneously. How can this be done?]]'''
+ == I want to make a large cluster smaller by taking out a bunch of nodes simultaneously. How can this be done? ==

On a large cluster, removing one or two data-nodes will not lead to any data loss, because the name-node will replicate their blocks as soon as it detects that the nodes are dead. With a large number of nodes getting removed or dying, the probability of losing data is higher.

@@ -236, +236 @@

The decommission process can be terminated at any time by editing the configuration or the exclude files and repeating the {{{-refreshNodes}}} command.

- <<BR>> <<Anchor(3.7)>> '''7. [[#A3.7|Wildcard characters doesn't work correctly in FsShell.]]'''
+ == Wildcard characters don't work correctly in FsShell. ==

When you issue a command in !FsShell, you may want to apply that command to more than one file. !FsShell provides a wildcard character to help you do so. The * (asterisk) character can be used to take the place of any set of characters. For example, if you would like to list all the files in your account which begin with the letter '''x''', you could use the ls command with the * wildcard:
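A sketch of such a listing (the file names are hypothetical; {{{hadoop fs}}} is the newer spelling of the older {{{hadoop dfs}}}):

{{{
# list files in the HDFS working directory whose names start with x;
# the quotes keep the local shell from expanding the * before FsShell sees it
hadoop fs -ls 'x*'
}}}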
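For the decommissioning question above, a sketch of the usual sequence, assuming {{{dfs.hosts.exclude}}} already points at an exclude file (host names and paths are made up):

{{{
# add the data-nodes to be retired to the exclude file named by dfs.hosts.exclude
echo "node101.example.com" >> /etc/hadoop/conf/dfs.exclude
echo "node102.example.com" >> /etc/hadoop/conf/dfs.exclude

# make the name-node re-read the include/exclude files and begin decommissioning
hadoop dfsadmin -refreshNodes
}}}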
