[
https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053202#comment-15053202
]
Chris Nauroth commented on HDFS-9038:
-------------------------------------
bq. I believe, post HDFS-5215, calculation turns out to be the correct one.
I disagree, because the post-HDFS-5215 calculation has broken previously
established operations workflows.
Non-DFS usage is "space unexpectedly consumed on data volumes by things that
are not HDFS". As a cluster administrator, I would monitor the non-DFS usage
value. If a node consistently showed a non-zero non-DFS usage, then that would
signal me to log in to the box, figure out where the space was consumed, free it
up, and then address the root cause (probably a rogue process running on the wrong
box or misconfigured to write to the wrong volume). It's important to fix
this, because high non-DFS usage reduces disk capacity that had been planned
for HDFS.
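As a rough illustration of that monitoring rule, here is a minimal Java sketch. The class name, method name, and sampling window are hypothetical and are not taken from Hadoop or from any particular monitoring tool; the sketch only assumes that non-DFS used values for a node are sampled periodically.

```java
import java.util.Arrays;
import java.util.List;

public class NonDfsUsageMonitor {
    /**
     * Hypothetical check: returns true when each of the most recent
     * `window` samples of non-DFS used (in bytes) is above zero.
     */
    public static boolean consistentlyNonZero(List<Long> samples, int window) {
        if (window <= 0 || samples.size() < window) {
            return false; // not enough history to call the usage consistent
        }
        int start = samples.size() - window;
        for (long sample : samples.subList(start, samples.size())) {
            if (sample <= 0) {
                return false; // a clean sample inside the window: no alert
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative sample values only.
        List<Long> noisy = Arrays.asList(0L, 512L, 2048L, 4096L);
        List<Long> clean = Arrays.asList(512L, 0L, 2048L, 0L);
        System.out.println(consistentlyNonZero(noisy, 3)); // prints "true"
        System.out.println(consistentlyNonZero(clean, 3)); // prints "false"
    }
}
```

A check like this is only useful if the reported value stays at zero on a healthy node, which is the property at stake in this issue.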
After HDFS-5215, this workflow no longer works. The monitoring will show false
positives because non-DFS usage now includes space covered by {{dfs.datanode.du.reserved}}. An
administrator would need to use additional checks, or simply disable this
monitoring due to the noise.
{{dfs.datanode.du.reserved}} is a special case on top of what I described
above. Setting a non-zero {{dfs.datanode.du.reserved}} means that the
administrator intentionally wants to hold back a portion of the volume,
essentially making it invisible to HDFS. This is the same reason that
{{dfs.datanode.du.reserved}} gets subtracted from the capacity calculation.
This is why I think 3 GB is incorrect in the example above. Non-DFS usage is
really "unexpected non-DFS usage". By setting {{dfs.datanode.du.reserved}} to
1 GB, you have stated that up to 1 GB of usage by something other than HDFS is
expected, so it's incorrect to count it as part of "unexpected non-DFS usage".
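To make the difference concrete, here is a simplified Java sketch of the two calculations. It is not the actual DataNode code or any of the attached patches; the class, method names, and the sample numbers are hypothetical, and all values are in whole GB for readability. It assumes capacity is the raw volume size minus the reserved space, as described above.

```java
public class NonDfsUsedSketch {
    /** Capacity visible to HDFS: raw volume size minus the reserved space. */
    public static long capacity(long raw, long reserved) {
        return Math.max(0L, raw - reserved);
    }

    /**
     * Reserved space counted towards non-DFS used: available space is
     * reduced by the reserved amount, so any non-DFS usage is reported,
     * even usage that fits inside the reservation.
     */
    public static long countingReserved(long raw, long reserved,
                                        long dfsUsed, long fsAvailable) {
        long available = Math.max(0L, fsAvailable - reserved);
        return Math.max(0L, capacity(raw, reserved) - dfsUsed - available);
    }

    /**
     * Reserved space excluded: only non-DFS usage beyond the reservation
     * is reported as (unexpected) non-DFS used.
     */
    public static long excludingReserved(long raw, long reserved,
                                         long dfsUsed, long fsAvailable) {
        long actualNonDfs = raw - dfsUsed - fsAvailable;
        return Math.max(0L, actualNonDfs - reserved);
    }

    public static void main(String[] args) {
        // Hypothetical volume: 100 GB raw, 1 GB reserved, 40 GB used by HDFS,
        // 3 GB used by other processes, leaving 57 GB free on the file system.
        System.out.println(countingReserved(100, 1, 40, 57));  // prints "3"
        System.out.println(excludingReserved(100, 1, 40, 57)); // prints "2"
        // Other usage of 1 GB stays within the reservation (59 GB free).
        System.out.println(countingReserved(100, 1, 40, 59));  // prints "1"
        System.out.println(excludingReserved(100, 1, 40, 59)); // prints "0"
    }
}
```

In the last case the second method reports zero, so a monitor keyed on non-zero non-DFS usage stays quiet as long as other usage fits inside the reservation.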
Bottom line: I have seen the post-HDFS-5215 calculation cause confusion for
administrators who had built a workflow around the old calculation.
> DFS reserved space is erroneously counted towards non-DFS used.
> ---------------------------------------------------------------
>
> Key: HDFS-9038
> URL: https://issues.apache.org/jira/browse/HDFS-9038
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.7.1
> Reporter: Chris Nauroth
> Assignee: Brahma Reddy Battula
> Attachments: HDFS-9038-002.patch, HDFS-9038-003.patch,
> HDFS-9038-004.patch, HDFS-9038-005.patch, HDFS-9038-006.patch,
> HDFS-9038-007.patch, HDFS-9038.patch
>
>
> HDFS-5215 changed the DataNode volume available space calculation to consider
> the reserved space held by the {{dfs.datanode.du.reserved}} configuration
> property. As a side effect, reserved space is now counted towards non-DFS
> used. I don't believe it was intentional to change the definition of non-DFS
> used. This issue proposes restoring the prior behavior: do not count
> reserved space towards non-DFS used.