[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869692#comment-16869692 ]
Wei-Chiu Chuang edited comment on HDFS-14313 at 6/21/19 5:05 PM:
-----------------------------------------------------------------
Yes, there is a potential data race problem.
{code:title=ReplicaMap.java}
  /**
   * Get a collection of the replicas for given block pool
   * This method is <b>not synchronized</b>. It needs to be synchronized
   * externally using the lock, both for getting the replicas
   * values from the map and iterating over it. Mutex can be accessed using
   * {@link #getLock()} method.
   *
   * @param bpid block pool id
   * @return a collection of the replicas belonging to the block pool
   */
  Collection<ReplicaInfo> replicas(String bpid) {
    return map.get(bpid);
  }
{code}
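To illustrate what the Javadoc above asks of callers, here is a minimal sketch of holding the lock across both the lookup and the iteration while summing replica sizes. ReplicaMapSketch, Replica, numBytes, add() and usedBytes() are hypothetical stand-ins written for this example, not the real Hadoop classes, and using a plain Object monitor as the lock is an assumption; the real ReplicaMap may expose a different lock type.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ReplicaMap; NOT the real Hadoop class.
class ReplicaMapSketch {
  // Hypothetical stand-in for ReplicaInfo: it only carries a byte count.
  static final class Replica {
    final long numBytes;
    Replica(long numBytes) { this.numBytes = numBytes; }
  }

  // Assumption: the mutex is a plain Object used with synchronized blocks.
  private final Object lock = new Object();
  private final Map<String, Collection<Replica>> map = new HashMap<>();

  Object getLock() { return lock; }

  void add(String bpid, Replica replica) {
    synchronized (lock) {
      map.computeIfAbsent(bpid, k -> new ArrayList<>()).add(replica);
    }
  }

  // Not synchronized, mirroring the quoted method: callers must hold the lock.
  Collection<Replica> replicas(String bpid) {
    return map.get(bpid);
  }

  // Caller-side pattern: take the lock once, then look up AND iterate inside it.
  static long usedBytes(ReplicaMapSketch replicaMap, String bpid) {
    long total = 0;
    synchronized (replicaMap.getLock()) {
      Collection<Replica> replicas = replicaMap.replicas(bpid);
      if (replicas != null) { // unknown block pool: nothing to add
        for (Replica r : replicas) {
          total += r.numBytes;
        }
      }
    }
    return total;
  }

  public static void main(String[] args) {
    ReplicaMapSketch replicaMap = new ReplicaMapSketch();
    replicaMap.add("BP-1", new Replica(100));
    replicaMap.add("BP-1", new Replica(200));
    System.out.println(usedBytes(replicaMap, "BP-1")); // prints 300
  }
}
```

Calling replicas(bpid) outside the synchronized block and iterating the result later would leave a window in which another thread can modify the collection, which is the kind of race the comment is pointing at.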
> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory
> instead of df/du
> ----------------------------------------------------------------------------------------
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, performance
> Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
> Reporter: Lisheng Sun
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch
>
>
> There are two ways to get used space, DU and DF, and both are insufficient.
> # Running DU across lots of disks is very expensive, and running all of the
> processes at the same time creates a noticeable IO spike.
> # Running DF is inaccurate when the disk is shared by multiple datanodes or
> other servers.
> Getting HDFS used space from FsDatasetImpl#volumeMap#ReplicaInfos in memory
> is very cheap and accurate.