[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888592#comment-16888592 ]
Yiqun Lin commented on HDFS-14313:
----------------------------------

Took a deep review of the patch. This FSCachingGetSpaceUsed should be an HDFS-specific impl class; I'd prefer not to make any change in the Common module. We can do this extension the way HDFS DFSNetworkTopology did: add a new config to enable or disable this. The instance would be obtained like this:
{code:java}
if (enableFSCachingGetSpace) {
  this.dfsUsage = new FSCachingGetSpaceUsed.Builder().setBpid(bpid)
      .setVolume(volume)
      .setPath(bpDir)
      .setConf(conf)
      .setInitialUsed(loadDfsUsed())
      .build();
} else {
  this.dfsUsage = new CachingGetSpaceUsed.Builder().setPath(bpDir)
      .setConf(conf)
      .setInitialUsed(loadDfsUsed())
      .build();
}
{code}
{code:java}
+  @Override
+  public Set<? extends Replica> deepCopyReplica(String bpid)
+      throws IOException {
+    Set<? extends Replica> replicas =
+        new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
+            : volumeMap.replicas(bpid));
+    return Collections.unmodifiableSet(replicas);
+  }
{code}
Even though deepCopyReplica is currently only used by one other thread, I would still prefer to make it an atomic operation in case it gets used in other places in the future. Can you take the dataset lock here?
{noformat}
fs.getspaceused.deep.copy.replica.threshold.ms
fs.getspaceused.replica.caching.threshold.ms
{noformat}
I don't think the above two configs are really needed once we make this an HDFS-specific GetSpaceUsed class. Instead, we can keep these values hard-coded. For the unit test {{TestReplicaCachingGetSpaceUsed}}, why not test this in a more realistic way? I'd prefer to start up a real minicluster, write actual block files, and then do the verification. The mocking approach needs many steps to set up the DN and is also not convenient for covering all the cases. We also need a comparison test that runs FSCachingGetSpaceUsed and the default DU way respectively, to check that the dfsUsed calculated is the same. [~leosun08], I suppose you may need to refactor the latest patch.
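The lock-plus-defensive-copy pattern requested above can be sketched in plain Java. This is only an illustration: {{ReplicaSnapshotDemo}}, its {{volumeMap}} of plain strings, and the {{datasetLock}} object are stand-ins for {{FsDatasetImpl}}'s replica map and dataset lock, not the real Hadoop types.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative stand-in for the DN's block-pool -> replicas map.
public class ReplicaSnapshotDemo {
    private static final Map<String, Set<String>> volumeMap = new HashMap<>();
    // Stand-in for the DN dataset lock the reviewer asks to take.
    private static final Object datasetLock = new Object();

    static {
        volumeMap.put("bp-1", new HashSet<>(Set.of("blk_1", "blk_2")));
    }

    // The snapshot is taken while holding the lock, so the caller sees a
    // consistent view; the unmodifiable copy means callers cannot mutate
    // the live map through the returned set.
    public static Set<String> deepCopyReplica(String bpid) {
        synchronized (datasetLock) {
            Set<String> replicas = volumeMap.get(bpid);
            Set<String> src =
                (replicas == null) ? Collections.<String>emptySet() : replicas;
            return Collections.unmodifiableSet(new HashSet<>(src));
        }
    }
}
```

Because the copy happens inside the synchronized block, a concurrent writer updating the map cannot interleave with the snapshot; the unmodifiable wrapper then makes the published view read-only.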
> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory
> instead of df/du
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch,
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch,
> HDFS-14313.005.patch, HDFS-14313.006.patch, HDFS-14313.007.patch
>
>
> The two existing ways of getting used space, DU and DF, are insufficient:
> # Running DU across lots of disks is very expensive, and running all of the
> processes at the same time creates a noticeable IO spike.
> # Running DF is inaccurate when the disk is shared by multiple datanodes or
> other servers.
> Getting hdfs used space from the FsDatasetImpl#volumeMap#ReplicaInfos in
> memory has very low overhead and is accurate.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)