Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by Marc Harris:
http://wiki.apache.org/hadoop/Hadoop_Upgrade

The comment on the change is:
Change – to - in command line options

------------------------------------------------------------------------------

  == Instructions: ==
   1. Stop map-reduce cluster(s) [[BR]] {{{bin/stop-mapred.sh}}} [[BR]] and all client applications running on the DFS cluster.
-  2. Run {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files –blocks –locations > dfs-v-old-fsck-1.log}}} [[BR]] Fix DFS to the point there are no errors. The resulting file will contain complete block map of the file system. [[BR]] Note. Redirecting the {{{fsck}}} output is recommend for large clusters in order to avoid time consuming output to stdout.
+  2. Run {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files -blocks -locations > dfs-v-old-fsck-1.log}}} [[BR]] Fix DFS to the point where there are no errors. The resulting file will contain the complete block map of the file system. [[BR]] Note. Redirecting the {{{fsck}}} output is recommended for large clusters in order to avoid time-consuming output to stdout.
   3. Run {{{lsr}}} command: [[BR]] {{{bin/hadoop dfs -lsr / > dfs-v-old-lsr-1.log}}} [[BR]] The resulting file will contain the complete namespace of the file system.
   4. Run {{{report}}} command to create a list of data nodes participating in the cluster. [[BR]] {{{bin/hadoop dfsadmin -report > dfs-v-old-report-1.log}}}
   5. Optionally, copy all or only unrecoverable data stored in DFS to a local file system or a backup instance of DFS.

@@ -45, +45 @@

   15. Start DFS cluster. [[BR]] {{{bin/start-dfs.sh}}}
   16. Run report command: [[BR]] {{{bin/hadoop dfsadmin -report > dfs-v-new-report-1.log}}} [[BR]] and compare with {{{dfs-v-old-report-1.log}}} to ensure all data nodes previously belonging to the cluster are up and running.
   17. Run {{{lsr}}} command: [[BR]] {{{bin/hadoop dfs -lsr / > dfs-v-new-lsr-1.log}}} [[BR]] and compare with {{{dfs-v-old-lsr-1.log}}}.
These files should be identical unless the format of {{{lsr}}} reporting or the data structures have changed in the new version.
-  18. Run {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files –blocks –locations > dfs-v-new-fsck-1.log}}} [[BR]] and compare with {{{dfs-v-old-fsck-1.log}}}. These files should be identical, unless the {{{fsck}}} reporting format has changed in the new version.
+  18. Run {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files -blocks -locations > dfs-v-new-fsck-1.log}}} [[BR]] and compare with {{{dfs-v-old-fsck-1.log}}}. These files should be identical, unless the {{{fsck}}} reporting format has changed in the new version.
   19. Start map-reduce cluster [[BR]] {{{bin/start-mapred.sh}}}

  In case of failure the administrator should have the checkpoint files in order to be able to repeat the procedure from the appropriate point or to restart the old version of Hadoop. The {{{*.log}}} files should help in investigating what went wrong during the upgrade.

@@ -57, +57 @@

   2. The '''safe mode''' implementation will further help to prevent the name node from voluntary decisions on block deletion and replication.
   3. A '''faster fsck''' is required. ''Currently {{{fsck}}} processes 1-2 TB per minute.''
   4. Hadoop should provide a '''backup solution''' as a stand-alone application.
-  5. Introduce an explicit '''–upgrade option''' for DFS (See below) and a related
+  5. Introduce an explicit '''-upgrade option''' for DFS (See below) and a related
   6. '''finalize upgrade''' command.

  == Shutdown command: ==

@@ -116, +116 @@

   1. Stop map-reduce cluster(s) and all client applications running on the DFS cluster.
   2. Stop DFS using the shutdown command.
   3. Install the new version of Hadoop software.
-  4. Start DFS cluster with –upgrade option.
+  4. Start DFS cluster with -upgrade option.
   5. Start map-reduce cluster.
   6. Verify the components run properly and finalize the upgrade when convinced. This is done using the -finalizeUpgrade option to the hadoop dfsadmin command.
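The before/after comparisons in steps 16-18 can be scripted. The sketch below is illustrative only, not part of the wiki page: it assumes the {{{dfs-v-old-*.log}}} and {{{dfs-v-new-*.log}}} checkpoint files produced by the commands above sit in the current directory, and simply flags which pairs differ so the administrator knows where to look.

```shell
#!/bin/sh
# Sketch: compare pre- and post-upgrade DFS checkpoint logs (steps 16-18).
# Assumes the dfs-v-old-*.log / dfs-v-new-*.log files were produced by the
# bin/hadoop commands shown in the instructions; file names are from those steps.
for kind in report lsr fsck; do
  old="dfs-v-old-${kind}-1.log"
  new="dfs-v-new-${kind}-1.log"
  # diff -q exits 0 when files are identical; a missing file is reported as a difference
  if diff -q "$old" "$new" >/dev/null 2>&1; then
    echo "${kind}: identical"
  else
    echo "${kind}: differs - inspect with: diff $old $new"
  fi
done
```

Small differences in the {{{lsr}}} and {{{fsck}}} pairs may be benign (changed reporting format in the new version, per steps 17-18), so a reported difference is a prompt for manual inspection rather than a failure by itself.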
