[ 
https://issues.apache.org/jira/browse/HDFS-10843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-10843:
-------------------------------
    Status: Patch Available  (was: Open)

I am attaching a patch which includes 6 new tests in 
{{TestDiskspaceQuotaUpdate}}:
- {{testComputedCachedSizesAgreeWhileCommitting}}
- {{testIncreaseReplicationWhileCommitting}}
- {{testDecreaseReplicationWhileCommitting}}
- {{testDecreaseReplicationBeforeCommitting}}
- {{testComputedCachedSizesAgreeBeforeCommitting}}
- {{testIncreaseReplicationBeforeCommitting}}

The first four all fail on the current build, 2.6 through trunk. 

The patch also includes a fix that moves the logic for updating the cached 
storagespace consumed by a file so that it runs when a block is completed, 
rather than when it is committed, to be consistent with how the rest of the 
code base treats the window between commit and completion. There are multiple 
code paths that cause a block to complete, and these paths originate from 
BlockManager, whereas only one code path (located in FSNamesystem) causes a 
block to commit. The fix therefore required an entry point from BlockManager 
back into FSNamesystem via a new public method on the Namesystem interface. I 
am open to suggestions for a cleaner way to achieve this.
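
To make the shape of that entry point concrete, here is a minimal sketch rather 
than the patch itself. Every name in it (NamesystemSketch, onBlockCompleted, 
FsNamesystemSketch, BlockManagerSketch) is a hypothetical stand-in, not the 
real HDFS class or method, and it assumes a single directory with a single 
block:

```java
// Hypothetical, simplified stand-ins for the HDFS classes; not the actual patch.

/** Stand-in for the Namesystem interface with one added callback (assumed name). */
interface NamesystemSketch {
    /** Invoked on block completion so the cached quota usage can be corrected. */
    void onBlockCompleted(long reservedBytes, long finalBytes);
}

/** Stand-in for FSNamesystem: owns the cached storagespace of one directory. */
class FsNamesystemSketch implements NamesystemSketch {
    private long cachedStoragespace;

    /** Starting a block conservatively reserves a full block. */
    void startBlock(long fullBlockBytes) {
        cachedStoragespace += fullBlockBytes;
    }

    /** In this sketch, commit no longer touches the cached value. */
    void commitBlock() {
        // intentionally empty: the adjustment is deferred to completion
    }

    /** Replace the full-block reservation with the block's final size. */
    @Override
    public void onBlockCompleted(long reservedBytes, long finalBytes) {
        cachedStoragespace += finalBytes - reservedBytes;
    }

    long getCachedStoragespace() {
        return cachedStoragespace;
    }
}

/** Stand-in for BlockManager: completion paths funnel through completeBlock. */
class BlockManagerSketch {
    private final NamesystemSketch namesystem;

    BlockManagerSketch(NamesystemSketch namesystem) {
        this.namesystem = namesystem;
    }

    void completeBlock(long reservedBytes, long finalBytes) {
        // Call back into the namesystem so the quota cache is updated here.
        namesystem.onBlockCompleted(reservedBytes, finalBytes);
    }
}
```

With this arrangement the cached value stays at the full-block reservation 
through commit and only drops to the final size once the block completes.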

The patch doesn't backport cleanly because numerous changes to the namenode 
have occurred (e.g. EC), but I have tested it back to 2.6, and after fixing 
compile errors it runs as expected.

> Quota Feature Cached Size != Computed Size When Block Committed But Not 
> Completed
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-10843
>                 URL: https://issues.apache.org/jira/browse/HDFS-10843
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.6.0
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>
> Currently when a block has been committed but has not yet been completed, the 
> cached size (used for the quota feature) of the directory containing that 
> block differs from the computed size. This results in log messages of the 
> following form:
> bq. ERROR namenode.NameNode 
> (DirectoryWithQuotaFeature.java:checkStoragespace(141)) - BUG: Inconsistent 
> storagespace for directory /TestQuotaUpdate. Cached = 512 != Computed = 8192
> When a block is initially started under construction, the used space is 
> conservatively set to a full block. When the block is committed, the cached 
> size is updated to the final size of the block. However, the calculation of 
> the computed size uses the full block size until the block is completed, so 
> in the period where the block is committed but not completed they disagree. 
> To fix this we need to decide which is correct and fix the other to match. It 
> seems to me that the cached size is correct since once the block is committed 
> its size will not change. 
> This can be reproduced using the following steps:
> - Create a directory with a quota
> - Start writing to a file within this directory
> - Prevent all datanodes to which the file is written from communicating the 
> corresponding BlockReceivedAndDeletedRequestProto to the NN temporarily (i.e. 
> simulate a transient network partition/delay)
> - During this time, call DistributedFileSystem.getContentSummary() on the 
> directory with the quota
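
The window described in the quoted issue can be illustrated with a toy model. 
This is not HDFS code: the class and method names are hypothetical, and the 
block size (8192), final length (512) and replication factor (1) are 
assumptions chosen only to match the numbers in the log message above. It 
models the behaviour before the fix, where the cached size shrinks at commit 
while the computed size counts a full block until completion:

```java
/** Toy model of one single-block file in a quota directory (not HDFS code). */
class QuotaMismatchSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    static final long BLOCK_SIZE = 8192;   // assumed preferred block size
    static final long FINAL_LENGTH = 512;  // assumed bytes actually written
    static final short REPLICATION = 1;    // assumed replication factor

    /** Cached value before the fix: drops to the final length at commit. */
    static long cachedSize(BlockState state) {
        long bytes = (state == BlockState.UNDER_CONSTRUCTION) ? BLOCK_SIZE : FINAL_LENGTH;
        return bytes * REPLICATION;
    }

    /** Computed value: counts a full block until the block is complete. */
    static long computedSize(BlockState state) {
        long bytes = (state == BlockState.COMPLETE) ? FINAL_LENGTH : BLOCK_SIZE;
        return bytes * REPLICATION;
    }

    public static void main(String[] args) {
        // Print both values for each state; only COMMITTED disagrees.
        for (BlockState s : BlockState.values()) {
            System.out.println(s + ": cached=" + cachedSize(s)
                    + " computed=" + computedSize(s));
        }
    }
}
```

Under these assumptions the two values agree while the block is under 
construction (8192) and after it completes (512), and disagree only in the 
committed state (cached 512, computed 8192).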



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
