wojiaodoubao commented on a change in pull request #2035:
URL: https://github.com/apache/hadoop/pull/2035#discussion_r442012436
##########
File path: hadoop-tools/hadoop-federation-balance/src/site/markdown/FederationBalance.md
##########
@@ -0,0 +1,177 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Federation Balance Guide
+=====================
+
+---
+
+ - [Overview](#Overview)
+ - [Usage](#Usage)
+     - [Basic Usage](#Basic_Usage)
+     - [RBF Mode And Normal Federation Mode](#RBF_Mode_And_Normal_Federation_Mode)
+     - [Command Options](#Command_Options)
+     - [Configuration Options](#Configuration_Options)
+ - [Architecture of Federation Balance](#Architecture_of_Federation_Balance)
+     - [Balance Procedure Scheduler](#Balance_Procedure_Scheduler)
+     - [DistCpFedBalance](#DistCpFedBalance)
+---
+
+Overview
+--------
+
+  Federation Balance is a tool for balancing data across different federation
+  namespaces. It uses [DistCp](../hadoop-distcp/DistCp.html) to copy data from
+  the source path to the target path. First, it creates a snapshot at the source
+  path and submits the initial distcp. Then it uses distcp diff to do the
+  incremental copy. Finally, when the source and the target are the same, it
+  updates the mount table in Router and moves the source to trash.
+
+  This document describes the usage and design of Federation Balance.
+
+Usage
+-----
+
+### Basic Usage
+
+  The federation balance tool supports both normal federation clusters and
+  router-based federation clusters. Take rbf as an example. Suppose we have a
+  mount entry in Router:
+
+    /foo/src --> hdfs://nn0:8020/foo/src
+
+  The command below runs a federation balance job. The first parameter is the
+  mount entry. The second one is the target path, which must include the target
+  cluster.
+
+    bash$ /bin/hadoop fedbalance submit /foo/src hdfs://nn1:8020/foo/dst
+
+  It copies data from hdfs://nn0:8020/foo/src to hdfs://nn1:8020/foo/dst
+  incrementally and finally updates the mount entry to:
+
+    /foo/src --> hdfs://nn1:8020/foo/dst
+
+  If the hadoop shell process exits unexpectedly, we can use the command below
+  to continue the unfinished job:
+
+    bash$ /bin/hadoop fedbalance continue
+
+  This scans the journal to find all the unfinished jobs, then recovers and
+  continues executing them.
+
+  If we want to balance in a normal federation cluster, use the command below.
+
+    bash$ /bin/hadoop fedbalance -router false submit hdfs://nn0:8020/foo/src hdfs://nn1:8020/foo/dst
+
+  The option `-router false` indicates that this is not a router-based
+  federation. The source path must include the source cluster.
+
+### RBF Mode And Normal Federation Mode
+
+  The federation balance tool has 2 modes:
+
+  * the router-based federation mode (rbf mode).
+  * the normal federation mode.
+
+  By default the command runs in the rbf mode. You can specify the rbf mode
+  explicitly by using the option `-router true`. The option `-router false`
+  specifies the normal federation mode.
+
+  In the rbf mode the first parameter is taken as the mount point. It disables
+  writes by setting the mount point read-only.
+
+  In the normal federation mode the first parameter is taken as the full path of
+  the source. The first parameter must include the source cluster. It disables
+  writes by cancelling the execute permission of the source path.
+
+  For details about disabling writes, see [DistCpFedBalance](#DistCpFedBalance).
+
+### Command Options
+
+Command `submit` has 5 options:
+
+| Option key                     | Description                          |
+| ------------------------------ | ------------------------------------ |
+| -router | This option specifies the mode of the command. `True` indicates the router-based federation mode. `False` indicates the normal federation mode. |
+| -forceCloseOpen | If `true`, the DIFF_DISTCP stage force-closes all open files when there is no diff. Otherwise it waits until there are no open files. The default value is `false`. |
+| -map | Max number of concurrent maps to use for copy. |
+| -bandwidth | Specify bandwidth per map in MB. |
+| -moveToTrash | If `true`, move the source path to trash after the job is done. Otherwise delete the source path directly. |
+
+### Configuration Options
+--------------------
+
+| Configuration key              | Description                          |
+| ------------------------------ | ------------------------------------ |
+| hadoop.hdfs.procedure.work.thread.num | The number of worker threads of the BalanceProcedureScheduler. Default is `10`. |

Review comment:
done
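
For readers of this thread, here is a minimal sketch of how the five `submit` options from the quoted table might be combined on one command line. The helper `build_fedbalance_cmd` and the `ROUTER`, `MAPS` and `BANDWIDTH_MB` variables are hypothetical names introduced only for illustration, and the numeric values are placeholders rather than recommended settings. It assumes `hadoop` is on the `PATH` and that all options precede `submit`, as `-router false` does in the quoted example. The function only prints the command, so it can be checked without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds (but does not run) a fedbalance submit command.
# Option names are taken from the quoted option table; the values below are
# illustrative assumptions only.
build_fedbalance_cmd() {
  local src="$1" dst="$2"
  local router="${ROUTER:-true}"         # rbf mode is the documented default
  local maps="${MAPS:-10}"               # assumed value for -map
  local bandwidth="${BANDWIDTH_MB:-10}"  # assumed value for -bandwidth, in MB
  printf 'hadoop fedbalance -router %s -forceCloseOpen false -map %s -bandwidth %s -moveToTrash true submit %s %s\n' \
    "$router" "$maps" "$bandwidth" "$src" "$dst"
}

# rbf mode: the first parameter is the mount point.
build_fedbalance_cmd /foo/src hdfs://nn1:8020/foo/dst

# Normal federation mode: the first parameter must include the source cluster.
ROUTER=false build_fedbalance_cmd hdfs://nn0:8020/foo/src hdfs://nn1:8020/foo/dst
```

Printing the command instead of executing it keeps the sketch side-effect free. Piping a printed line to `bash` would actually submit the job, so that step is best left to a deliberate manual action.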
