Cached directory size in INodeDirectory can get permanently out of sync with
computed size, causing quota issues
-----------------------------------------------------------------------------------------------------------------
Key: HDFS-3061
URL: https://issues.apache.org/jira/browse/HDFS-3061
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.20.203.0
Reporter: Alex Holmes
It appears that an HDFS directory with a quota set can reach a state in which
the cached size for the directory permanently differs from the computed value.
When this happens, the following command:
{code}
hadoop fs -count -q /tmp/quota-test
{code}
results in the following output in the NameNode logs:
{code}
WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Inconsistent diskspace
for directory quota-test. Cached: 6000 Computed: 6072
{code}
I've observed both transient and persistent instances of this happening. In
the transient instances this warning goes away, but in the persistent instances
every invocation of the {{fs -count -q}} command yields the above warning.
I've seen instances where the actual disk usage of a directory is 25% of the
cached value in INodeDirectory, which creates problems since the quota code
uses this cached value to determine whether block write requests are permitted.
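To illustrate why a stale cached value matters, here is a minimal toy model in Java. It is not the INodeDirectory code: the class {{QuotaDriftSketch}} and all of its methods are hypothetical, and the drift is injected artificially because the actual cause is not yet known. It only shows that a check based on the cached total can reject a write that the real usage would allow.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a directory with a quota; NOT the real INodeDirectory code. */
class QuotaDriftSketch {
    private final long quota;        // disk-space quota in bytes
    private long cachedSize;         // running total that the quota check consults
    private final List<Long> fileSizes = new ArrayList<>();

    QuotaDriftSketch(long quota) {
        this.quota = quota;
    }

    /** Recomputes usage by summing the files that actually exist. */
    long computedSize() {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return total;
    }

    long cachedSize() {
        return cachedSize;
    }

    /** Admits a write only if the CACHED size plus the new bytes fits the quota. */
    boolean tryWrite(long bytes) {
        if (cachedSize + bytes > quota) {
            return false;
        }
        fileSizes.add(bytes);
        cachedSize += bytes;
        return true;
    }

    /** Hypothetical fault injection: skews the cache without touching any file. */
    void injectDrift(long extraBytes) {
        cachedSize += extraBytes;
    }

    boolean isConsistent() {
        return cachedSize == computedSize();
    }

    public static void main(String[] args) {
        QuotaDriftSketch dir = new QuotaDriftSketch(1000);
        dir.tryWrite(250);
        dir.injectDrift(750); // actual usage is now 25% of the cached value
        System.out.println("cached=" + dir.cachedSize()
                + " computed=" + dir.computedSize());
        System.out.println("100-byte write admitted: " + dir.tryWrite(100));
    }
}
```

In this model the 100-byte write is refused even though the computed usage plus 100 bytes is well under the quota, which mirrors the symptom described above.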
This isn't easy to reproduce - I am able to (inconsistently) get HDFS into this
state with a simple program which:
# Writes files into HDFS
# When a DSQuotaExceededException is encountered, removes all files created in
step 1
# Repeats step 1
I'm going to try to come up with a more repeatable test case to reproduce this
issue.
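The three steps above can be sketched roughly as follows. This is not the program I used: {{QuotaFs}}, {{InMemoryFs}}, {{QuotaExceeded}} and {{runRounds}} are hypothetical names, and the in-memory stand-in enforces its quota correctly, so it will not reproduce the bug by itself. Against a real cluster, the interface would presumably be backed by the HDFS client, with {{QuotaExceeded}} standing in for DSQuotaExceededException.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class QuotaReproSketch {
    /** Hypothetical stand-in for DSQuotaExceededException. */
    static class QuotaExceeded extends Exception {
        QuotaExceeded(String message) {
            super(message);
        }
    }

    /** Hypothetical minimal facade over the file system under test. */
    interface QuotaFs {
        void write(String path, long bytes) throws QuotaExceeded;
        void delete(String path);
    }

    /** In-memory stand-in that tracks usage correctly (no drift). */
    static class InMemoryFs implements QuotaFs {
        private final long quota;
        private final Map<String, Long> files = new LinkedHashMap<>();
        private long used;

        InMemoryFs(long quota) {
            this.quota = quota;
        }

        @Override
        public void write(String path, long bytes) throws QuotaExceeded {
            long previous = files.getOrDefault(path, 0L);
            if (used - previous + bytes > quota) {
                throw new QuotaExceeded("quota of " + quota + " bytes exceeded");
            }
            files.put(path, bytes);
            used += bytes - previous;
        }

        @Override
        public void delete(String path) {
            Long removed = files.remove(path);
            if (removed != null) {
                used -= removed;
            }
        }

        long used() {
            return used;
        }
    }

    /** Runs the write / delete-on-quota-error cycle; returns the number of quota errors. */
    static int runRounds(QuotaFs fs, int rounds, long fileBytes) {
        if (fileBytes <= 0) {
            throw new IllegalArgumentException("fileBytes must be positive");
        }
        int quotaHits = 0;
        int counter = 0;
        for (int round = 0; round < rounds; round++) {
            List<String> created = new ArrayList<>();
            try {
                while (true) {                       // step 1: write files
                    String path = "/tmp/quota-test/file-" + counter++;
                    fs.write(path, fileBytes);
                    created.add(path);
                }
            } catch (QuotaExceeded e) {              // step 2: clean up on quota error
                quotaHits++;
                for (String path : created) {
                    fs.delete(path);
                }
            }                                        // step 3: loop back to step 1
        }
        return quotaHits;
    }

    public static void main(String[] args) {
        InMemoryFs fs = new InMemoryFs(1000);
        int hits = runRounds(fs, 3, 300);
        System.out.println("quota errors=" + hits + " bytes left behind=" + fs.used());
    }
}
```

After each round, running {{fs -count -q}} on the directory and checking the NameNode log for the "Inconsistent diskspace" warning would show whether the cached and computed sizes have diverged.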