[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888592#comment-16888592 ]

Yiqun Lin commented on HDFS-14313:
----------------------------------

I took a deeper review of the patch. FSCachingGetSpaceUsed should be an 
HDFS-specific impl class; I prefer not to make any change in the Common module. 
We can do this extension the way HDFS DFSNetworkTopology did: add a new config 
to enable or disable it. The instance would then be constructed like this:
{code:java}
    if (enableFSCachingGetSpaceUsed) {
      this.dfsUsage = new FSCachingGetSpaceUsed.Builder().setBpid(bpid)
          .setVolume(volume)
          .setPath(bpDir)
          .setConf(conf)
          .setInitialUsed(loadDfsUsed())
          .build();
    } else {
      this.dfsUsage = new CachingGetSpaceUsed.Builder().setPath(bpDir)
          .setConf(conf)
          .setInitialUsed(loadDfsUsed())
          .build();
    }
{code}
{code:java}
+  @Override
+  public Set<? extends Replica> deepCopyReplica(String bpid)
+      throws IOException {
+    Set<? extends Replica> replicas =
+        new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
+            : volumeMap.replicas(bpid));
+    return Collections.unmodifiableSet(replicas);
+  }
{code}
Even though deepCopyReplica is currently only called from another thread, I 
would still prefer to make it an atomic operation in case it gets used in other 
places in the future. Can you take the dataset lock here?
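As a rough illustration of the locking suggestion above (not the actual DN code; the class and field names here are hypothetical stand-ins for volumeMap and the dataset lock), the deep copy would be taken while holding a lock so the snapshot is atomic with respect to concurrent mutations:

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: a simplified replica map whose deep copy is guarded
// by a lock, analogous to taking the DN dataset lock in deepCopyReplica.
class ReplicaMapSketch {
  private final Map<String, Set<String>> replicasByPool = new HashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void addReplica(String bpid, String replicaId) {
    lock.writeLock().lock();
    try {
      replicasByPool.computeIfAbsent(bpid, k -> new HashSet<>())
          .add(replicaId);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // The copy and the null check happen under the read lock, so callers get
  // a consistent, immutable snapshot even if writers run concurrently.
  Set<String> deepCopyReplica(String bpid) {
    lock.readLock().lock();
    try {
      Set<String> replicas = replicasByPool.get(bpid);
      return Collections.unmodifiableSet(replicas == null
          ? Collections.emptySet() : new HashSet<>(replicas));
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}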
{noformat}
fs.getspaceused.deep.copy.replica.threshold.ms
fs.getspaceused.replica.caching.threshold.ms
{noformat}
I don't think the above two configs are really needed once we make this the 
HDFS-specific GetSpaceUsed class. Instead, we can keep these thresholds 
hard-coded.

For the unit test {{TestReplicaCachingGetSpaceUsed}}, why not test this in a 
more realistic way? I would prefer to start a real minicluster, write actual 
block files, and then do the verification. The mocking approach needs many 
steps to set up the DN and is also not convenient for covering all the cases.

We also need a comparison test that uses FSCachingGetSpaceUsed and the default 
DU implementation respectively, to check whether the dfsUsed they calculate is 
the same.
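The core of such a comparison can be sketched outside the DN, with plain Java (the class and method names below are hypothetical, not part of the patch): the in-memory sum of replica sizes should match what a du-style walk of the block directory reports.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch of the comparison idea: the in-memory sum of block
// sizes should equal what a du-style scan of the block directory reports.
class SpaceUsedComparison {
  // Walks the directory and sums the sizes of all regular files,
  // similar to what `du` reports for apparent size.
  static long duStyle(Path dir) throws IOException {
    try (Stream<Path> files = Files.walk(dir)) {
      return files.filter(Files::isRegularFile)
          .mapToLong(p -> p.toFile().length())
          .sum();
    }
  }

  // Sums replica sizes held in memory, as FSCachingGetSpaceUsed would.
  static long inMemorySum(long[] replicaSizes) {
    long total = 0;
    for (long s : replicaSizes) {
      total += s;
    }
    return total;
  }
}
{code}

In the real test, the in-memory side would come from the replica map and the du side from the default implementation, asserting the two totals are equal after the blocks are written.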

[~leosun08], I suppose you may need to do a refactor for the latest patch.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> There are two problems with getting used space via DU/DF:
>  #  Running DU across lots of disks is very expensive, and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  DF is inaccurate when the disk is shared by multiple datanodes or 
> other servers.
>  Getting the hdfs used space from the FsDatasetImpl#volumeMap ReplicaInfos in 
> memory is very cheap and accurate. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
