[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286040#comment-16286040
 ] 

Uma Maheswara Rao G commented on HDFS-10285:
--------------------------------------------

Hi [~chris.douglas], 
{quote}
Have any benchmarks been run, particularly with the SPS disabled?
{quote}

I benchmarked NameNode startup times on trunk and on the SPS branch with SPS 
disabled, to check whether the branch really impacts startup time. 

Here are the data points: 


Total INodes created for the test: INFO 
org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 26598566 
INodes.

*Restart times with trunk code:*
Run1: 2017-12-11 06:23:30,658 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage 
in *81153 msecs*
Run2: 2017-12-11 06:27:15,313 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage 
in *83717 msecs*
Run3: 2017-12-11 06:29:18,574 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage 
in *82620 msecs*

*Restart times with SPS branch:*
I added a log line to show the SPS flag: INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: spsEnabled          
= false
While checking the xattrs in addToInodeMap, the code first checks whether SPS 
is enabled:
{code}
        if (getBlockManager().isSPSEnabled()) {
            addStoragePolicySatisfier((INodeWithAdditionalFields) inode, xaf);
        }
{code} 

Run1: 2017-12-11 06:38:49,209 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage 
in *83874 msecs*
Run2: 2017-12-11 06:42:57,803 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage 
in *81013 msecs*
Run3: 2017-12-11 06:45:33,288 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage 
in *81817 msecs*

*So these runs show that, with SPS disabled, there is no noticeable impact on 
NN startup time.*
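
To make the comparison explicit, here is a minimal sketch that averages the six load times quoted above. The class and method names (StartupBenchmark, mean, spread) are only illustrative and are not part of the branch. The means differ by about 262 msecs (roughly 0.3%), which is much smaller than the spread between runs of the same build (2564 msecs on trunk, 2861 msecs on the SPS branch).

```java
import java.util.Arrays;

// Illustrative helper only; not part of Hadoop or the SPS branch.
public class StartupBenchmark {
    // "Finished loading FSImage" times in msecs, copied from the logs above.
    static final long[] TRUNK = {81153, 83717, 82620};
    static final long[] SPS_DISABLED = {83874, 81013, 81817};

    // Arithmetic mean of the samples.
    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Difference between the slowest and the fastest run.
    static long spread(long[] samples) {
        long max = Arrays.stream(samples).max().getAsLong();
        long min = Arrays.stream(samples).min().getAsLong();
        return max - min;
    }

    public static void main(String[] args) {
        double trunk = mean(TRUNK);
        double sps = mean(SPS_DISABLED);
        System.out.printf("trunk: mean %.0f msecs, spread %d msecs%n",
                trunk, spread(TRUNK));
        System.out.printf("SPS disabled: mean %.0f msecs, spread %d msecs%n",
                sps, spread(SPS_DISABLED));
        System.out.printf("difference of means: %.0f msecs (%.2f%%)%n",
                sps - trunk, (sps - trunk) / trunk * 100.0);
    }
}
```

With only three runs per build this is a rough sanity check, not a statistical test.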





> Storage Policy Satisfier in Namenode
> ------------------------------------
>
>                 Key: HDFS-10285
>                 URL: https://issues.apache.org/jira/browse/HDFS-10285
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. 
> These policies can be set on a directory or file to specify the user's 
> preference for where to store the physical blocks. When the user sets the 
> storage policy before writing data, the blocks can take advantage of the 
> storage policy preference and are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, 
> the blocks will already have been written with the default storage policy 
> (that is, DISK). The user then has to run the ‘Mover tool’ explicitly, 
> specifying all such file names as a list. In some distributed system 
> scenarios (e.g. HBase) it would be difficult to collect all the files and 
> run the tool, as different nodes can write files separately and the files 
> can have different paths.
> Another scenario is a rename: when the user renames a file from a directory 
> with one effective storage policy (inherited from the parent directory) 
> into a directory with a different effective storage policy, the inherited 
> policy is not copied from the source. The policy of the destination 
> file/dir parent takes effect instead. The rename operation is just a 
> metadata change in the Namenode; the physical blocks still remain placed 
> according to the source storage policy.
> So, tracking all such file names, which depend on business logic, across 
> distributed nodes (e.g. region servers) and running the Mover tool could 
> be difficult for admins. 
> The proposal here is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should 
> track such calls and send them to the DNs as movement commands. 
> Will post the detailed design thoughts document soon. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
