[
https://issues.apache.org/jira/browse/HDFS-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332719#comment-15332719
]
Lei (Eddy) Xu commented on HDFS-10529:
--------------------------------------
Hi [~pranavprakash], thanks a lot for the fixes.
Would you mind adding a unit test to verify that the problem existed before and
is fixed after your patch?
Thanks!
> Df reports incorrect usage when appending less than block size
> --------------------------------------------------------------
>
> Key: HDFS-10529
> URL: https://issues.apache.org/jira/browse/HDFS-10529
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.2, 3.0.0-alpha1
> Reporter: Pranav Prakash
> Assignee: Pranav Prakash
> Priority: Minor
> Labels: datanode, fs, hdfs
> Attachments: HDFS-10529.000.patch
>
>
> Steps to recreate issue:
> 1. Create a 100MB file on HDFS cluster with 128MB blocksize and replication
> factor 3
> 2. Append 100MB to the file
> 3. Df reports around 900MB even though it should only be around 600MB.
> Looking at the blocks confirms that df is incorrect, as there exist only two
> blocks on each DN -- a 128MB block and a 72MB block.
> This issue seems to arise because BlockPoolSlice does not account for the
> delta increase in dfsUsage when an append happens to a partially-filled
> block, and instead naively adds the total block size. For instance, in the
> example scenario when the block is "filled" from 100 to 128MB,
> addFinalizedBlock() in BlockPoolSlice adds the size of the newly created
> block into the total instead of accounting for the difference/delta in block
> size between old and new. This has the effect of double-counting the old
> partially-filled block: it is counted once when it is first created (in the
> example scenario when the 100MB file is created) and again when it becomes
> part of the filled block (in the example scenario when the 128MB block is
> formed from the initial 100MB block). Thus the perceived size becomes 100MB +
> 128MB + 72MB = 300MB for each DN, or 900MB across the cluster.
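
The arithmetic in the description can be reproduced with a small standalone sketch. It is not the actual BlockPoolSlice code or the attached patch: the class DfsUsageSketch, the UsageTracker helper, its addFinalizedBlock(blockId, newLength) signature, and the block IDs are all hypothetical, chosen only to contrast adding the full block length with adding the delta over the previously counted length.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Hadoop.
class DfsUsageSketch {
    static final long MB = 1024L * 1024L;
    static final int REPLICATION = 3; // replication factor from the reported scenario

    /** Toy stand-in for per-DataNode usage accounting. */
    static class UsageTracker {
        private final boolean useDelta;
        // Last finalized length recorded for each block ID.
        private final Map<Long, Long> finalizedLengths = new HashMap<>();
        private long dfsUsed = 0L;

        UsageTracker(boolean useDelta) {
            this.useDelta = useDelta;
        }

        /** Records a (re)finalized block and updates the running usage total. */
        void addFinalizedBlock(long blockId, long newLength) {
            Long previous = finalizedLengths.put(blockId, newLength);
            long oldLength = (previous == null) ? 0L : previous;
            // Naive mode adds the whole block again; delta mode adds only the growth.
            dfsUsed += useDelta ? (newLength - oldLength) : newLength;
        }

        long getDfsUsed() {
            return dfsUsed;
        }
    }

    /** Replays the scenario on one DataNode: 100MB write, then a 100MB append. */
    static long simulate(boolean useDelta) {
        UsageTracker tracker = new UsageTracker(useDelta);
        tracker.addFinalizedBlock(1L, 100 * MB); // initial 100MB file
        tracker.addFinalizedBlock(1L, 128 * MB); // append fills block 1 to 128MB
        tracker.addFinalizedBlock(2L, 72 * MB);  // remaining 72MB goes to block 2
        return tracker.getDfsUsed();
    }

    public static void main(String[] args) {
        long naive = simulate(false) / MB;
        long delta = simulate(true) / MB;
        System.out.println("naive: " + naive + "MB per DN, " + naive * REPLICATION + "MB total");
        System.out.println("delta: " + delta + "MB per DN, " + delta * REPLICATION + "MB total");
    }
}
```

Under these assumptions the naive mode yields 300MB per DN (900MB across three replicas), while the delta mode yields 200MB per DN (600MB), matching the figures in the report.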