[Lucene-hadoop Wiki] Update of "Hadoop 0.14 Upgrade" by RaghuAngadi

Apache Wiki Tue, 21 Aug 2007 13:05:49 -0700

The following page has been changed by RaghuAngadi:
http://wiki.apache.org/lucene-hadoop/Hadoop_0%2e14_Upgrade

------------------------------------------------------------------------------
  = Upgrade Guide for Hadoop-0.14 =
+ 
+ '''XXX This document is still under development.''' It should be complete by the end of Aug 21st.
  
  This page describes upgrade information that is specific to Hadoop-0.14. The usual upgrade procedure described in the [:Hadoop_Upgrade: Hadoop Upgrade page] still applies to Hadoop-0.14.
  
@@ -19, +21 @@

  
  The rest of the document describes what happens once the cluster is started with the {{{-upgrade}}} option.
  
+ == Block CRC Upgrade ==
+ 
+ Hadoop-0.14 maintains checksums for HDFS data differently than earlier versions. Before Hadoop-0.14, the checksum for a file {{{f.txt}}} was stored in another HDFS file, {{{.f.txt.crc}}}. In Hadoop-0.14, there are no such ''shadow'' checksum files. Instead, the checksum is stored with each ''block'' of data at the ''datanode''. [http://issues.apache.org/jira/browse/HADOOP-1134 HADOOP-1134] describes this feature in great detail. To migrate to the new structure, each datanode reads the checksum data from the {{{.crc}}} files in HDFS for each of its blocks and stores the checksum next to the block in its local filesystem.
+ 
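As a rough illustration of what it means to keep checksums with each block instead of in a shadow file, the sketch below computes one CRC32 per fixed-size chunk of a block's bytes, the kind of data a datanode could store next to the block. The 512-byte chunk size, the use of CRC32, and the function names are assumptions made for illustration only; HADOOP-1134 remains the authority on the actual on-disk format.

```python
import zlib
from typing import List

BYTES_PER_CHECKSUM = 512  # assumed chunk size, for illustration only


def block_checksums(block_data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> List[int]:
    """Return one CRC32 value per chunk of the block (hypothetical helper)."""
    return [
        zlib.crc32(block_data[offset:offset + chunk_size])
        for offset in range(0, len(block_data), chunk_size)
    ]


def verify_block(block_data: bytes, stored: List[int],
                 chunk_size: int = BYTES_PER_CHECKSUM) -> bool:
    """Recompute the per-chunk checksums and compare them with the stored ones."""
    return block_checksums(block_data, chunk_size) == stored
```

In this sketch, a reader of the block would call {{{verify_block}}} with the checksums kept alongside it, with no lookup of a separate {{{.crc}}} file in HDFS.
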
+ Depending on the number of blocks and files in HDFS, the upgrade can take anywhere from a few minutes to a few hours.
+ 
+ There are three stages in this upgrade:
+  1. '''SafeMode''' : As in a normal restart of the cluster, the namenode waits for the datanodes in the cluster to report their blocks. The cluster may wait in this state for a long time if some of the datanodes do not report their blocks.
+  1. '''Datanode Upgrade''' : Once most of the blocks are reported, the namenode asks the registered datanodes to start their local upgrade. The namenode then waits for ''all'' the datanodes to complete their upgrade.
+  1. '''Deleting {{{.crc}}} files''' : The namenode deletes the {{{.crc}}} files that were previously used for storing checksums.
+ 
+ === Monitoring the Upgrade ===
+ 
+ The cluster stays in ''safeMode'' until the upgrade is complete. The HDFS web UI is a good place to check whether safeMode is on or off. As always, the log files from the ''namenode'' and the ''datanodes'' are useful when nothing else helps.
+ 
+ Once the cluster is started with the {{{-upgrade}}} option, the simplest way to monitor the upgrade is with the '{{{dfsadmin -upgradeProgress status}}}' command. Typical output from this command looks like this: {{{
+ $ bin/hadoop dfsadmin -upgradeProgress status
+ Distributed upgrade for version -6 is in progress. Status = 78%
+ 
+         Last Block Level Stats updated at : Mon Aug 13 22:23:30 UTC 2007
+         Last Block Level Stats : Total Blocks : 1054713
+                                  Fully Upgragraded : 40.94%
+                                  Minimally Upgraded : 52.13%
+                                  Under Upgraded : 6.93% (includes Un-upgraded blocks)
+                                  Un-upgraded : 6.93%
+                                  Errors : 0
+         Brief Datanode Status  : Avg completion of all Datanodes: 91.59% with 0 errors.
+                                  274 out of 893 nodes are not done.
+ }}} 
+ 
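For scripted monitoring, the output above could be parsed for its headline numbers. The following is a minimal sketch that assumes the output keeps the wording of the sample shown here; the function name and the returned keys are hypothetical, and the output in other upgrade states is not covered.

```python
import re
from typing import Dict, Optional

# Lines copied from the sample output above.
SAMPLE = """Distributed upgrade for version -6 is in progress. Status = 78%
        Brief Datanode Status  : Avg completion of all Datanodes: 91.59% with 0 errors.
                                 274 out of 893 nodes are not done.
"""


def parse_upgrade_status(text: str) -> Dict[str, Optional[float]]:
    """Extract the headline numbers; a field is None if its line is absent."""
    status = re.search(r"Status = (\d+)%", text)
    average = re.search(r"Avg completion of all Datanodes: ([\d.]+)%", text)
    pending = re.search(r"(\d+) out of (\d+) nodes are not done", text)
    return {
        "status_percent": int(status.group(1)) if status else None,
        "avg_datanode_completion": float(average.group(1)) if average else None,
        "nodes_not_done": int(pending.group(1)) if pending else None,
        "total_nodes": int(pending.group(2)) if pending else None,
    }
```

A wrapper script could run the {{{dfsadmin}}} command periodically, pass its output to {{{parse_upgrade_status}}}, and alert when {{{nodes_not_done}}} stops decreasing.
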
+  * {{{Status = 78%}}} : A rough approximation of how much of the upgrade is complete.
+  * {{{Block Level Stats}}} : Once the upgrade has started, the namenode iterates through all the blocks to check how many of them are upgraded. This information is useful on large clusters where some datanodes may never complete the upgrade of their blocks (discussed in later sections).
+    * {{{Fully Upgraded}}} : Percentage of blocks for which the expected number of replicas are upgraded. E.g. if a block has a replication of 3, it is considered ''fully upgraded'' if at least three datanodes that contain this block have finished updating their checksums.
+    * {{{Minimally Upgraded}}} : Percentage of blocks for which the number of upgraded replicas is at least {{{dfs.min.replication}}} (default 1) but less than the expected number of replicas.
+    * {{{Under Upgraded}}} : Percentage of blocks for which the number of upgraded replicas is less than {{{dfs.min.replication}}}.
+    * {{{Un-upgraded}}} : Percentage of blocks with zero upgraded replicas.
+  * {{{Brief Datanode Status}}} : Each datanode reports its progress to the namenode during the upgrade. This shows the average percent completion across all the datanodes, and how many datanodes have not yet completed their upgrade. For the upgrade to proceed to the next stage, all the datanodes must report completion of their local upgrade.
+
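The block-level categories above can be sketched as a small classification routine. This is an illustration of the definitions on this page, not the namenode's actual code; the function names are hypothetical, and {{{min_replication}}} stands in for the {{{dfs.min.replication}}} setting and is assumed to be at least 1.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_block(upgraded: int, expected: int, min_replication: int = 1) -> str:
    """Classify one block by its number of upgraded replicas."""
    if upgraded >= expected:
        return "fully upgraded"
    if upgraded >= min_replication:
        return "minimally upgraded"
    return "under upgraded"  # this category includes un-upgraded blocks


def block_level_stats(blocks: Iterable[Tuple[int, int]],
                      min_replication: int = 1) -> Dict[str, float]:
    """Return the percentage of blocks in each category.

    Each block is given as (upgraded_replicas, expected_replicas).
    """
    counts: Counter = Counter()
    total = 0
    for upgraded, expected in blocks:
        total += 1
        counts[classify_block(upgraded, expected, min_replication)] += 1
        if upgraded == 0:
            counts["un-upgraded"] += 1  # also counted as under upgraded
    if total == 0:
        return {}
    names = ("fully upgraded", "minimally upgraded", "under upgraded", "un-upgraded")
    return {name: 100.0 * counts[name] / total for name in names}
```

With a minimum replication of 1, every under upgraded block has zero upgraded replicas, which is why the two percentages match in the sample output shown earlier.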
