[ 
https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880052#comment-16880052
 ] 

Wei-Chiu Chuang commented on HDFS-14313:
----------------------------------------

Thank you [~leosun08]. I was out for a few days.

I think overall the patch is almost ready. Please take care of a few nits that 
I spotted.

ReplicaCachingGetSpaceUsed#run() would throw an NPE if ExternalDatasetImpl is 
used, since ExternalDatasetImpl#deepCopyReplica() returns null. IMO, it should 
either throw an exception to indicate that the operation is not supported, or 
return an empty Collection.
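
To illustrate the two options, here is a minimal, self-contained sketch. It does not use the real HDFS classes: Replica, Dataset, EmptyDataset, UnsupportedDataset and usedSpace() are hypothetical stand-ins, and the actual deepCopyReplica() signature in the patch may differ.

```java
import java.util.Collection;
import java.util.Collections;

// Hypothetical, simplified stand-ins for the HDFS types discussed above.
class DeepCopySketch {
    /** Stand-in for a replica; only the byte count matters here. */
    static final class Replica {
        final long bytesOnDisk;
        Replica(long bytesOnDisk) { this.bytesOnDisk = bytesOnDisk; }
    }

    /** Stand-in for the dataset interface exposing deepCopyReplica(). */
    interface Dataset {
        Collection<Replica> deepCopyReplica(String bpid);
    }

    /** Option 1: return an empty Collection so callers can iterate safely. */
    static final class EmptyDataset implements Dataset {
        @Override
        public Collection<Replica> deepCopyReplica(String bpid) {
            return Collections.emptyList();
        }
    }

    /** Option 2: fail loudly so the caller knows the call is unsupported. */
    static final class UnsupportedDataset implements Dataset {
        @Override
        public Collection<Replica> deepCopyReplica(String bpid) {
            throw new UnsupportedOperationException(
                "deepCopyReplica is not supported by this dataset");
        }
    }

    /** Stand-in for the summing loop; a null return would cause an NPE here. */
    static long usedSpace(Dataset dataset, String bpid) {
        long used = 0;
        for (Replica r : dataset.deepCopyReplica(bpid)) {
            used += r.bytesOnDisk;
        }
        return used;
    }

    public static void main(String[] args) {
        // Empty collection: the loop simply does nothing.
        System.out.println(usedSpace(new EmptyDataset(), "bp-1"));
        try {
            usedSpace(new UnsupportedDataset(), "bp-1");
        } catch (UnsupportedOperationException e) {
            System.out.println("unsupported: " + e.getMessage());
        }
    }
}
```

Either option avoids the silent null; the empty Collection keeps callers simple, while the exception makes the unsupported case explicit.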

For the new configuration keys, please update them with a prefix. For example, 
deep.copy.replica.threshold.ms --> 
fs.getspaceused.deep.copy.replica.threshold.ms
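
As a rough sketch of what the prefixed key could look like in code: the class name, constant names and the readThresholdMs() helper below are hypothetical, and java.util.Properties is used only as a stand-in for the Hadoop Configuration object so that the example is self-contained.

```java
import java.util.Properties;

// Hypothetical holder for the prefixed key; names are illustrative only.
final class GetSpaceUsedKeys {
    /** Common prefix suggested in the review comment. */
    static final String PREFIX = "fs.getspaceused.";

    /** Resolves to fs.getspaceused.deep.copy.replica.threshold.ms */
    static final String DEEP_COPY_REPLICA_THRESHOLD_MS_KEY =
        PREFIX + "deep.copy.replica.threshold.ms";

    private GetSpaceUsedKeys() { }

    /**
     * Reads the threshold from a Properties stand-in, falling back to the
     * caller-supplied default when the key is not set.
     */
    static long readThresholdMs(Properties conf, long defaultMs) {
        String raw = conf.getProperty(DEEP_COPY_REPLICA_THRESHOLD_MS_KEY);
        return raw == null ? defaultMs : Long.parseLong(raw.trim());
    }
}
```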

Please add descriptions of the new configuration keys to 
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch
>
>
> The two existing ways of getting used space, DU and DF, both fall short.
>  #  Running DU across lots of disks is very expensive, and running all of the 
> processes at the same time creates a noticeable IO spike.
>  #  Running DF is inaccurate when the disk is shared by multiple datanodes or 
> other servers.
>  Getting hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory 
> has very little overhead and is accurate.



