[ https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhouyingchao updated HDFS-8045:
-------------------------------
Attachment: HDFS-8045-001.patch
Test with
"-Dtest=TestSimulatedFSDataset,TestNamenodeCapacityReport,TestFileCreation,TestDecommission"
> Incorrect calculation of NonDfsUsed and Remaining
> -------------------------------------------------
>
> Key: HDFS-8045
> URL: https://issues.apache.org/jira/browse/HDFS-8045
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.6.0
> Reporter: zhouyingchao
> Assignee: zhouyingchao
> Attachments: HDFS-8045-001.patch
>
>
> After reserving some space via the parameter "dfs.datanode.du.reserved", we
> noticed that the namenode usually reports NonDfsUsed of Datanodes as 0 even
> if we write some non-hdfs data to the volume. After some investigation, we
> think there is an issue in the calculation of FsVolumeImpl.getAvailable. The
> explanation follows.
> For a volume, let Raw represent the raw capacity, DfsUsed the space consumed
> by hdfs blocks, Reserved the reservation made through
> "dfs.datanode.du.reserved", RbwReserved the space reserved for rbw blocks,
> and RealNonDfsUsed the real value of NonDfsUsed (which includes non-hdfs
> files and metadata consumed by the local filesystem).
> In the current implementation, the available space of a volume is actually
> calculated as
> {code}
> min{Raw - Reserved - DfsUsed - RbwReserved, Raw - DfsUsed - RealNonDfsUsed}
> {code}
> Later on, the Namenode calculates NonDfsUsed of the volume as
> {code}
> Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw -
> DfsUsed - RealNonDfsUsed}
> {code}
> Given these calculations, we end up with:
> {code}
> if (Reserved + RbwReserved > RealNonDfsUsed) NonDfsUsed = RbwReserved;
> else NonDfsUsed = RealNonDfsUsed - Reserved;
> {code}
> Either way it is far from the correct value.
> After investigating the implementation, we believe that Reserved and
> RbwReserved should be subtracted from available in getAvailable since they
> are actually not available to hdfs in any sense. I'll post a patch soon.
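
The case analysis in the quoted description can be checked numerically. The following is only a minimal sketch of the formulas as stated in the report, not the actual FsVolumeImpl or namenode code: the class name VolumeSpaceSketch, the method names, and the sample numbers (plain units, think GB) are all hypothetical.

```java
// Hypothetical sketch: names are illustrative and are NOT Hadoop APIs.
final class VolumeSpaceSketch {

    // Available space as described in the report:
    // min{Raw - Reserved - DfsUsed - RbwReserved, Raw - DfsUsed - RealNonDfsUsed}
    static long available(long raw, long reserved, long dfsUsed,
                          long rbwReserved, long realNonDfsUsed) {
        long byConfiguredCapacity = raw - reserved - dfsUsed - rbwReserved;
        long byFreeDiskSpace = raw - dfsUsed - realNonDfsUsed;
        return Math.min(byConfiguredCapacity, byFreeDiskSpace);
    }

    // NonDfsUsed as the report says the namenode derives it:
    // Raw - Reserved - DfsUsed - available
    static long reportedNonDfsUsed(long raw, long reserved, long dfsUsed,
                                   long rbwReserved, long realNonDfsUsed) {
        return raw - reserved - dfsUsed
            - available(raw, reserved, dfsUsed, rbwReserved, realNonDfsUsed);
    }

    public static void main(String[] args) {
        // Assumed sample volume: Raw=1000, Reserved=100, DfsUsed=300, RbwReserved=1.
        // Case 1: Reserved + RbwReserved > RealNonDfsUsed (101 > 50).
        System.out.println(reportedNonDfsUsed(1000, 100, 300, 1, 50));  // prints 1
        // Case 2: Reserved + RbwReserved <= RealNonDfsUsed (101 <= 150).
        System.out.println(reportedNonDfsUsed(1000, 100, 300, 1, 150)); // prints 50
    }
}
```

With these assumed numbers, 50 units of real non-DFS data are reported as 1 (the value of RbwReserved), and 150 units are reported as 50 (RealNonDfsUsed - Reserved), which matches the two branches in the description.
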
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)