HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/b1238b21
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/b1238b21
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/b1238b21

Branch: refs/heads/HDFS-10285
Commit: b1238b217ce2e4b2314827bc8b6411fa25d883f0
Parents: bc4b9e6
Author: Rakesh Radhakrishnan <rake...@apache.org>
Authored: Fri Jul 14 22:36:09 2017 +0530
Committer: Rakesh Radhakrishnan <rake...@apache.org>
Committed: Tue Jul 31 12:09:17 2018 +0530

----------------------------------------------------------------------
 .../src/site/markdown/ArchivalStorage.md        | 51 ++++++++++++++++++--
 1 file changed, 48 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/b1238b21/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
----------------------------------------------------------------------
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md 
b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
index a56cf8b..9098616 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
@@ -97,8 +97,44 @@ The effective storage policy can be retrieved by the 
"[`storagepolicies -getStor
 
     The default storage type of a datanode storage location will be DISK if it 
does not have a storage type tagged explicitly.
 
-Mover - A New Data Migration Tool
----------------------------------
+Storage Policy Based Data Movement
+----------------------------------
+
+Setting a new storage policy on an already existing file/directory only changes the policy in the Namespace; it does not physically move the blocks across storage media.
+The following two options allow users to move the blocks according to the newly set policy. So, once a user changes/sets a new policy on a file/directory, the user should also perform one of the following options to achieve the desired data movement. Note that the two options cannot be allowed to run simultaneously.
+
+### <u>S</u>torage <u>P</u>olicy <u>S</u>atisfier (SPS)
+
+When a user changes the storage policy on a file/directory, the user can call the `HdfsAdmin` API `satisfyStoragePolicy()` to move the blocks as per the newly set policy.
+The SPS daemon thread runs along with the Namenode and periodically scans for mismatches between the newly set policy and the physical placement of blocks. It only tracks the files/directories for which the user has invoked `satisfyStoragePolicy()`. If SPS identifies blocks that need to be moved for a file, it schedules block movement tasks to the Datanodes. A Coordinator Datanode (C-DN) tracks all block movements associated with a file and notifies the Namenode of movement success/failure. If any movement fails, SPS re-attempts it by sending a new block movement task.
+
+SPS can be activated and deactivated dynamically without restarting the 
Namenode.
+
+Detailed design documentation can be found at [Storage Policy Satisfier (SPS) (HDFS-10285)](https://issues.apache.org/jira/browse/HDFS-10285).
+
+* **Note**: When a user invokes the `satisfyStoragePolicy()` API on a directory, SPS considers only the files immediately under that directory. Sub-directories won't be considered for satisfying the policy. It is the user's responsibility to call this API on directories recursively in order to cover all files under the sub-tree (see the sketch after the arguments table below).
+
+* HdfsAdmin API:
+        `public void satisfyStoragePolicy(final Path path) throws IOException`
+
+* Arguments:
+
+| Argument | Description |
+|:---- |:---- |
+| `path` | A path that requires block storage movement. |
+
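+The following is a minimal client-side sketch (illustration only, not part of this patch; the NameNode URI and paths are placeholders) showing how `satisfyStoragePolicy()` can be invoked. Because SPS only handles the immediate files of a directory, the sketch walks sub-directories explicitly:
+
+    import java.io.IOException;
+    import java.net.URI;
+    import java.net.URISyntaxException;
+    import org.apache.hadoop.conf.Configuration;
+    import org.apache.hadoop.fs.FileStatus;
+    import org.apache.hadoop.fs.FileSystem;
+    import org.apache.hadoop.fs.Path;
+    import org.apache.hadoop.hdfs.client.HdfsAdmin;
+
+    public class SatisfyPolicyExample {
+      public static void main(String[] args) throws IOException, URISyntaxException {
+        URI nnUri = new URI("hdfs://namenode:8020");   // placeholder NameNode URI
+        Configuration conf = new Configuration();
+        HdfsAdmin admin = new HdfsAdmin(nnUri, conf);
+        FileSystem fs = FileSystem.get(nnUri, conf);
+
+        // The storage policy is assumed to have been set already, e.g. via
+        // `hdfs storagepolicies -setStoragePolicy`. This call only schedules
+        // the block movement needed to satisfy that policy.
+        satisfyRecursively(admin, fs, new Path("/archive/2017"));
+      }
+
+      // SPS considers only the immediate files of a directory, so the caller
+      // walks the sub-tree itself and invokes the API on every directory.
+      static void satisfyRecursively(HdfsAdmin admin, FileSystem fs, Path dir)
+          throws IOException {
+        admin.satisfyStoragePolicy(dir);
+        for (FileStatus st : fs.listStatus(dir)) {
+          if (st.isDirectory()) {
+            satisfyRecursively(admin, fs, st.getPath());
+          }
+        }
+      }
+    }
+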
+#### Configurations:
+
+*   **dfs.storage.policy.satisfier.activate** - Used to activate or deactivate SPS. Configuring `true` activates SPS, and `false` deactivates it.
+
+*   **dfs.storage.policy.satisfier.recheck.timeout.millis** - A timeout (in milliseconds) after which SPS re-checks the results of block storage movement commands processed by the Coordinator Datanode.
+
+*   **dfs.storage.policy.satisfier.self.retry.timeout.millis** - A timeout (in milliseconds) after which SPS retries the block movement if no movement result has been reported by the Coordinator Datanode.
+
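+For reference, these properties are set in `hdfs-site.xml`. The snippet below is purely illustrative; the values shown are examples, not documented defaults:
+
+    <!-- Illustrative hdfs-site.xml snippet; the values are examples only. -->
+    <property>
+      <name>dfs.storage.policy.satisfier.activate</name>
+      <value>true</value>
+    </property>
+    <property>
+      <name>dfs.storage.policy.satisfier.recheck.timeout.millis</name>
+      <value>300000</value>
+    </property>
+    <property>
+      <name>dfs.storage.policy.satisfier.self.retry.timeout.millis</name>
+      <value>1800000</value>
+    </property>
+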
+### Mover - A New Data Migration Tool
 
 A new data migration tool is added for archiving data. The tool is similar to 
Balancer. It periodically scans the files in HDFS to check if the block 
placement satisfies the storage policy. For the blocks violating the storage 
policy, it moves the replicas to a different storage type in order to fulfill 
the storage policy requirement. Note that it always tries to move block 
replicas within the same node whenever possible. If that is not possible (e.g. 
when a node doesn’t have the target storage type) then it will copy the block 
replicas to another node over the network.
 
@@ -115,6 +151,10 @@ A new data migration tool is added for archiving data. The 
tool is similar to Ba
 
 Note that, when both -p and -f options are omitted, the default path is the 
root directory.
 
+#### Administrator notes:
+
+`StoragePolicySatisfier` and the `Mover` tool cannot run simultaneously. If a Mover instance has already been triggered and is running, SPS will be deactivated at startup. In that case, the administrator should make sure the Mover execution has finished and then activate SPS again. Similarly, when SPS is already activated, Mover cannot be run. If an administrator wants to run the Mover tool explicitly, they should make sure to deactivate SPS first and then run Mover. Please refer to the commands section to learn how to activate or deactivate SPS dynamically.
+
 Storage Policy Commands
 -----------------------
 
@@ -173,7 +213,8 @@ Get the storage policy of a file or a directory.
 
 ### Satisfy Storage Policy
 
-Schedule blocks to move based on file/directory policy. This command 
applicable only to the given path and its immediate children. Sub-directories 
won't be considered for satisfying the policy.
+Schedule blocks to move based on the file's/directory's current storage policy.
+Note: For a directory, only the immediate files under that directory are considered; sub-directories are not considered recursively.
 
 * Command:
 
@@ -193,4 +234,8 @@ Check the running status of Storage Policy Satisfier in 
namenode. If it is runni
 
         hdfs storagepolicies -isSPSRunning
 
+### Activate or Deactivate SPS without restarting Namenode
+If an administrator wants to activate or deactivate the SPS feature while the Namenode is running, they first need to update the desired value (`true` or `false`) for the configuration item `dfs.storage.policy.satisfier.activate` in the configuration file (`hdfs-site.xml`) and then run the following Namenode reconfig command:
+
+        hdfs dfsadmin -reconfig namenode <host:ipc_port> start
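+
+For example, to deactivate SPS, the flag could be set as below in `hdfs-site.xml` before running the reconfig command above (an illustrative snippet, complementing the full configuration example in the SPS section):
+
+    <property>
+      <name>dfs.storage.policy.satisfier.activate</name>
+      <value>false</value>
+    </property>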
 

