Author: szetszwo Date: Tue Feb 25 01:14:12 2014 New Revision: 1571508 URL: http://svn.apache.org/r1571508 Log: HDFS-5778. Add rolling upgrade user document.
Added: hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml Modified: hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt Modified: hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt?rev=1571508&r1=1571507&r2=1571508&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt (original) +++ hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/CHANGES_HDFS-5535.txt Tue Feb 25 01:14:12 2014 @@ -83,3 +83,5 @@ HDFS-5535 subtasks: Arpit Agarwal) HDFS-5583. Make DN send an OOB Ack on shutdown before restarting. (kihwal) + + HDFS-5778. Add rolling upgrade user document. (szetszwo) Added: hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml?rev=1571508&view=auto ============================================================================== --- hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml (added) +++ hadoop/common/branches/HDFS-5535/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml Tue Feb 25 01:14:12 2014 @@ -0,0 +1,270 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +<document xmlns="http://maven.apache.org/XDOC/2.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" + xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd"> + + <properties> + <title>HDFS Rolling Upgrade</title> + </properties> + + <body> + + <h1>HDFS Rolling Upgrade</h1> + <macro name="toc"> + <param name="section" value="0"/> + <param name="fromDepth" value="0"/> + <param name="toDepth" value="4"/> + </macro> + + <section name="Introduction" id="Introduction"> + <p> + <em>HDFS rolling upgrade</em> allows upgrading individual HDFS daemons. + For example, the datanodes can be upgraded independently of the namenodes. + A namenode can be upgraded independently of the other namenodes. + The namenodes can be upgraded independently of datanodes and journal nodes. + </p> + </section> + + <section name="Upgrade" id="Upgrade"> + <p> + In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility. + These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime. + In order to upgrade an HDFS cluster without downtime, the cluster must be set up with HA. + </p> + + <subsection name="Upgrade without Downtime" id="UpgradeWithoutDowntime"> + <p> + In an HA cluster, there are two or more <em>NameNodes (NNs)</em>, many <em>DataNodes (DNs)</em>, + a few <em>JournalNodes (JNs)</em> and a few <em>ZooKeeperNodes (ZKNs)</em>. + <em>JNs</em> are relatively stable and, in most cases, do not require an upgrade when HDFS is upgraded. 
+ In the rolling upgrade procedure described here, + only <em>NNs</em> and <em>DNs</em> are considered; <em>JNs</em> and <em>ZKNs</em> are not. + Upgrading <em>JNs</em> and <em>ZKNs</em> may incur cluster downtime. + </p> + + <h4>Upgrading Non-Federated Clusters</h4> + <p> + Suppose there are two namenodes <em>NN1</em> and <em>NN2</em>, + where <em>NN1</em> and <em>NN2</em> are respectively in active and standby states. + The following are the steps for upgrading an HA cluster: + </p> + <ol> + <li>Prepare Rolling Upgrade<ul> + <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade prepare</a></code>" + to create an fsimage for rollback. + </li> + <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade query</a></code>" + to check the status of the rollback image. + Wait and re-run the command until the "Proceed with rolling upgrade" message is shown. + </li> + </ul></li> + <li>Upgrade Active and Standby <em>NNs</em><ol> + <li>Shut down, upgrade and restart <em>NN2</em> as standby.</li> + <li>Fail over from <em>NN1</em> to <em>NN2</em> + so that <em>NN2</em> becomes active and <em>NN1</em> becomes standby.</li> + <li>Shut down, upgrade and restart <em>NN1</em> as standby.</li> + </ol></li> + <li>Upgrade <em>DNs</em><ol> + <li>Choose a small subset of datanodes (e.g. 
all datanodes under a particular rack). + <ol> + <li>Run "<code><a href="#dfsadmin_-shutdownDatanode">hdfs dfsadmin -shutdownDatanode &lt;DATANODE_HOST:IPC_PORT&gt; upgrade</a></code>" + to shut down one of the chosen datanodes.</li> + <li>Run "<code><a href="#dfsadmin_-getDatanodeInfo">hdfs dfsadmin -getDatanodeInfo &lt;DATANODE_HOST:IPC_PORT&gt;</a></code>" + to check and wait for the datanode to shut down.</li> + <li>Upgrade and restart the datanode.</li> + <li>Repeat the above steps for all the chosen datanodes in the subset.</li> + </ol></li> + <li>Repeat the above steps until all datanodes in the cluster are upgraded.</li> + </ol></li> + <li>Finalize Rolling Upgrade<ul> + <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade finalize</a></code>" + to finalize the rolling upgrade.</li> + </ul></li> + </ol> + + <h4>Upgrading Federated Clusters</h4> + <p> + In a federated cluster, there are multiple namespaces + and a pair of active and standby <em>NNs</em> for each namespace. + The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster + except that Step 1 and Step 4 are performed on each namespace + and Step 2 is performed on each pair of active and standby <em>NNs</em>, i.e. + </p> + <ol> + <li>Prepare Rolling Upgrade for Each Namespace</li> + <li>Upgrade Active and Standby <em>NN</em> pairs for Each Namespace</li> + <li>Upgrade <em>DNs</em></li> + <li>Finalize Rolling Upgrade for Each Namespace</li> + </ol> + + </subsection> + + <subsection name="Upgrade with Downtime" id="UpgradeWithDowntime"> + <p> + For non-HA clusters, + it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes. + However, datanodes can still be upgraded in a rolling manner. + </p> + + <h4>Upgrading Non-HA Clusters</h4> + <p> + In a non-HA cluster, there is a <em>NameNode (NN)</em>, a <em>SecondaryNameNode (SNN)</em> + and many <em>DataNodes (DNs)</em>. 
+ The procedure for upgrading a non-HA cluster is similar to upgrading an HA cluster + except that Step 2 "Upgrade Active and Standby <em>NNs</em>" is replaced by the following: + </p> + <ul> + <li>Upgrade <em>NN</em> and <em>SNN</em><ol> + <li>Shut down <em>SNN</em></li> + <li>Shut down, upgrade and restart <em>NN</em></li> + <li>Upgrade and restart <em>SNN</em></li> + </ol></li> + </ul> + </subsection> + </section> + + <section name="Downgrade and Rollback" id="DowngradeAndRollback"> + <p> + When the upgraded release is undesirable + or, in some unlikely case, the upgrade fails (due to bugs in the newer release), + administrators may choose to downgrade HDFS back to the pre-upgrade release + or roll back HDFS to the pre-upgrade release and the pre-upgrade state. + Both downgrade and rollback require cluster downtime and are not done in a rolling fashion. + </p> + <p> + Note that downgrade and rollback are possible only after a rolling upgrade is started and + before the upgrade is terminated. + An upgrade can be terminated by finalize, downgrade or rollback. + Therefore, it is impossible to run rollback after finalize or downgrade, + or to run downgrade after finalize. + </p> + + <subsection name="Downgrade" id="Downgrade"> + <p> + <em>Downgrade</em> restores the software back to the pre-upgrade release + and preserves the user data. + Suppose time <em>T</em> is the rolling upgrade start time and the upgrade is terminated by downgrade. + Then, the files created before or after <em>T</em> remain available in HDFS. + The files deleted before or after <em>T</em> remain deleted in HDFS. + </p> + <p> + A newer release is downgradable to the pre-upgrade release + only if both the namenode layout version and the datanode layout version + are not changed between these two releases. 
+ Below are the steps for downgrade: + </p> + <ul> + <li>Downgrade HDFS<ol> + <li>Shut down all <em>NNs</em> and <em>DNs</em>.</li> + <li>Restore the pre-upgrade release on all machines.</li> + <li>Start <em>NNs</em> with the + "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade downgrade</code></a>" option.</li> + <li>Start <em>DNs</em> normally.</li> + </ol></li> + </ul> + + </subsection> + + <subsection name="Rollback" id="Rollback"> + <p> + <em>Rollback</em> restores the software back to the pre-upgrade release + but also reverts the user data back to the pre-upgrade state. + Suppose time <em>T</em> is the rolling upgrade start time and the upgrade is terminated by rollback. + The files created before <em>T</em> remain available in HDFS but the files created after <em>T</em> become unavailable. + The files deleted before <em>T</em> remain deleted in HDFS but the files deleted after <em>T</em> are restored. + </p> + <p> + Rollback from a newer release to the pre-upgrade release is always supported. + Below are the steps for rollback: + </p> + <ul> + <li>Rollback HDFS<ol> + <li>Shut down all <em>NNs</em> and <em>DNs</em>.</li> + <li>Restore the pre-upgrade release on all machines.</li> + <li>Start <em>NNs</em> with the + "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade rollback</code></a>" option.</li> + <li>Start <em>DNs</em> normally.</li> + </ol></li> + </ul> + + </subsection> + </section> + + <section name="Commands and Startup Options for Rolling Upgrade" id="dfsadminCommands"> + + <subsection name="DFSAdmin Commands" id="dfsadminCommands"> + <h4><code>dfsadmin -rollingUpgrade</code></h4> + <source>hdfs dfsadmin -rollingUpgrade &lt;query|prepare|finalize&gt;</source> + <p> + Execute a rolling upgrade action. 
+ <ul><li>Options:<table> + <tr><td><code>query</code></td><td>Query the current rolling upgrade status.</td></tr> + <tr><td><code>prepare</code></td><td>Prepare a new rolling upgrade.</td></tr> + <tr><td><code>finalize</code></td><td>Finalize the current rolling upgrade.</td></tr> + </table></li></ul> + </p> + + <h4><code>dfsadmin -getDatanodeInfo</code></h4> + <source>hdfs dfsadmin -getDatanodeInfo &lt;DATANODE_HOST:IPC_PORT&gt;</source> + <p> + Get the information about the given datanode. + This command can be used for checking if a datanode is alive + like the Unix <code>ping</code> command. + </p> + + <h4><code>dfsadmin -shutdownDatanode</code></h4> + <source>hdfs dfsadmin -shutdownDatanode &lt;DATANODE_HOST:IPC_PORT&gt; [upgrade]</source> + <p> + Submit a shutdown request for the given datanode. + If the optional <code>upgrade</code> argument is specified, + clients accessing the datanode will be advised to wait for it to restart + and the fast start-up mode will be enabled. + When the restart does not happen in time, clients will time out and ignore the datanode. + In such a case, the fast start-up mode will also be disabled. + </p> + <p> + Note that the command does not wait for the datanode shutdown to complete. + The "<a href="#dfsadmin_-getDatanodeInfo">dfsadmin -getDatanodeInfo</a>" + command can be used for checking if the datanode shutdown is complete. + </p> + </subsection> + + <subsection name="NameNode Startup Options" id="dfsadminCommands"> + + <h4><code>namenode -rollingUpgrade</code></h4> + <source>hdfs namenode -rollingUpgrade &lt;downgrade|rollback&gt;</source> + <p> + Downgrade or rollback an ongoing rolling upgrade. 
+ </p> + <ul><li>Options:<table> + <tr><td><code>downgrade</code></td> + <td>Restores the namenode back to the pre-upgrade release + and preserves the user data.</td> + </tr> + <tr><td><code>rollback</code></td> + <td>Restores the namenode back to the pre-upgrade release + but also reverts the user data back to the pre-upgrade state.</td> + </tr> + </table></li></ul> + + </subsection> + + </section> + </body> +</document>
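Illustration only, not part of the committed file: the per-datanode steps documented above (request shutdown with the upgrade option, wait for the datanode to stop, then upgrade and restart it) could be scripted roughly as follows. This is a minimal sketch under stated assumptions. The names HDFS_CMD, DRY_RUN, run, wait_for_shutdown and upgrade_datanode are hypothetical. The script also assumes that "hdfs dfsadmin -getDatanodeInfo" exits with a non-zero status once the datanode is unreachable, which should be verified against the release in use. Installing the new release and restarting the datanode is site-specific and is left as a placeholder.

```shell
#!/usr/bin/env bash
# Sketch of the "Upgrade DNs" step; all helper names here are hypothetical.

HDFS_CMD="${HDFS_CMD:-hdfs}"   # path to the hdfs launcher (assumption)
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands, do not run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Poll getDatanodeInfo until it fails.
# Assumption: a non-zero exit status means the datanode has shut down.
wait_for_shutdown() {
  local dn="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $HDFS_CMD dfsadmin -getDatanodeInfo $dn"
    return 0
  fi
  while "$HDFS_CMD" dfsadmin -getDatanodeInfo "$dn" >/dev/null 2>&1; do
    sleep 5
  done
}

# Handle one datanode, given as DATANODE_HOST:IPC_PORT.
upgrade_datanode() {
  local dn="$1"
  run "$HDFS_CMD" dfsadmin -shutdownDatanode "$dn" upgrade
  wait_for_shutdown "$dn"
  # Placeholder: install the new release on the host and restart the datanode.
  echo "# upgrade and restart datanode $dn (site-specific step)"
}

# Process the chosen subset of datanodes one at a time.
for dn in "$@"; do
  upgrade_datanode "$dn"
done
```

With the default DRY_RUN=1, passing a list of DATANODE_HOST:IPC_PORT arguments only prints the dfsadmin commands that would be issued, so the sequence can be reviewed before running with DRY_RUN=0. Handling one datanode at a time follows the documented procedure of working through a small subset before moving on to the next.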