[ 
https://issues.apache.org/jira/browse/HDFS-10843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468993#comment-15468993
 ] 

Erik Krogen commented on HDFS-10843:
------------------------------------

I have realized that this bug is more severe than I previously thought. As 
described in the issue, it only produces a transient state where the cached and 
computed values briefly disagree. However, if the replication factor is changed 
while in this state or, similarly, while the block is under construction but 
has not yet been committed, the cached value is updated incorrectly and stays 
persistently wrong. All of these issues share one root cause and should be 
fixed together by ensuring that the storagespace values are always computed in 
a consistent way. Right now, different spots in the code handle the different 
states (under construction, committed, completed) in inconsistent ways. 
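
To illustrate what "computed in a consistent way" could look like, here is a 
minimal, self-contained sketch. It is not the NameNode code: the class 
ConsistentStoragespace, the BlockState enum, the Block holder and all method 
names are hypothetical. It assumes, following the suggestion in the issue 
description, that a committed block is charged at its final size. The point is 
only that the full recomputation and the replication-change delta both go 
through one chargedBytes() function, so they cannot disagree about any state.

```java
import java.util.List;

public class ConsistentStoragespace {
    /** Hypothetical block states, named after the ones in the comment. */
    public enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    /** Hypothetical block holder: bytes reported so far plus current state. */
    public static final class Block {
        final long numBytes;
        final BlockState state;

        public Block(long numBytes, BlockState state) {
            this.numBytes = numBytes;
            this.state = state;
        }
    }

    /** The single place that decides how many bytes one replica is charged. */
    public static long chargedBytes(Block b, long preferredBlockSize) {
        // Assumption: only a block still being written is charged a full block;
        // once committed, its length is final and is charged as-is.
        return b.state == BlockState.UNDER_CONSTRUCTION ? preferredBlockSize : b.numBytes;
    }

    /** Full recomputation of the space consumed by a file's blocks. */
    public static long storagespace(List<Block> blocks, long preferredBlockSize, int replication) {
        long total = 0;
        for (Block b : blocks) {
            total += chargedBytes(b, preferredBlockSize) * replication;
        }
        return total;
    }

    /** Delta to apply to a cached value when the replication factor changes. */
    public static long replicationDelta(List<Block> blocks, long preferredBlockSize,
                                        int oldReplication, int newReplication) {
        // Derived from the same function as the full recomputation.
        return storagespace(blocks, preferredBlockSize, newReplication)
             - storagespace(blocks, preferredBlockSize, oldReplication);
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
            new Block(4096, BlockState.COMPLETE),
            new Block(256, BlockState.COMMITTED));
        System.out.println("storagespace at replication 2: " + storagespace(blocks, 4096, 2));
        System.out.println("delta for replication 2 -> 3:  " + replicationDelta(blocks, 4096, 2, 3));
    }
}
```

With this shape, a replication change made while a block is under construction 
or committed adjusts the cached value by exactly the amount a later 
recomputation would expect.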

> Quota Feature Cached Size != Computed Size When Block Committed But Not 
> Completed
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-10843
>                 URL: https://issues.apache.org/jira/browse/HDFS-10843
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.6.0
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>
> Currently when a block has been committed but has not yet been completed, the 
> cached size (used for the quota feature) of the directory containing that 
> block differs from the computed size. This results in log messages of the 
> following form:
> bq. ERROR namenode.NameNode 
> (DirectoryWithQuotaFeature.java:checkStoragespace(141)) - BUG: Inconsistent 
> storagespace for directory /TestQuotaUpdate. Cached = 512 != Computed = 8192
> When a block is initially started under construction, the used space is 
> conservatively set to a full block. When the block is committed, the cached 
> size is updated to the final size of the block. However, the calculation of 
> the computed size uses the full block size until the block is completed, so 
> in the period where the block is committed but not completed they disagree. 
> To fix this we need to decide which is correct and fix the other to match. It 
> seems to me that the cached size is correct since once the block is committed 
> its size will not change. 
> This can be reproduced using the following steps:
> - Create a directory with a quota
> - Start writing to a file within this directory
> - Prevent all datanodes to which the file is written from communicating the 
> corresponding BlockReceivedAndDeletedRequestProto to the NN temporarily (i.e. 
> simulate a transient network partition/delay)
> - During this time, call DistributedFileSystem.getContentSummary() on the 
> directory with the quota
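
The mismatch described in the issue can be modeled with a small, 
self-contained toy program. This is a sketch of the described behavior, not 
the HDFS implementation: the class QuotaMismatchDemo and its methods are 
hypothetical, and the block size (4096), replication factor (2) and final 
block length (256) are assumptions chosen only so that the toy numbers line up 
with the 512 and 8192 figures in the log message above. The actual test 
parameters are not stated in the issue.

```java
public class QuotaMismatchDemo {
    /** Hypothetical block states, named after the ones in the issue. */
    public enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    static final long BLOCK_SIZE = 4096; // assumed full block size
    static final int REPLICATION = 2;    // assumed replication factor

    /** Cached path as described: full block at start, final size from commit onward. */
    public static long cached(BlockState state, long finalBytes) {
        long perReplica = (state == BlockState.UNDER_CONSTRUCTION) ? BLOCK_SIZE : finalBytes;
        return perReplica * REPLICATION;
    }

    /** Computed path as described: full block until the block is completed. */
    public static long computed(BlockState state, long finalBytes) {
        long perReplica = (state == BlockState.COMPLETE) ? finalBytes : BLOCK_SIZE;
        return perReplica * REPLICATION;
    }

    public static void main(String[] args) {
        long finalBytes = 256; // assumed final length of the block
        for (BlockState s : BlockState.values()) {
            long c = cached(s, finalBytes);
            long k = computed(s, finalBytes);
            System.out.printf("%-18s cached=%d computed=%d%s%n",
                s, c, k, (c == k) ? "" : "  <-- inconsistent");
        }
    }
}
```

In this model the two values agree while the block is under construction and 
again once it is complete; they differ only in the committed state, which is 
the window the reproduction steps are designed to hold open.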


