[ https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155977#comment-16155977 ]
Ravi Prakash edited comment on HDFS-6489 at 9/12/17 5:50 AM:
-------------------------------------------------------------
Thanks for your reply [~brahmareddy]! Sorry about the tangent on
{{FsDatasetImpl#removeOldReplica}}. I'm afraid I'm also not sure whether you are
the point person on this. Could you please redirect me to the right person if
you're not?
Let's focus on the {{HDFS6489.java}} test written and reported by Bogdan. I
see that it still fails on trunk. Here's the output:
{code}
$ java HDFS6489
doing small appends...
17/09/06 13:20:25 INFO hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073741835_1057
java.io.EOFException: Unexpected EOF while trying to read response from server
    at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:444)
    at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1750)
    at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1495)
    at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1469)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:737)
Exception in thread "main" java.io.IOException: All datanodes [DatanodeInfoWithStorage[127.0.0.1:9866,DS-af60f3f1-eb86-46c2-821a-8d2f1dcb339d,DISK]] are bad. Aborting...
    at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1549)
    at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1483)
    at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1469)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:737)
{code}
Why do you think that is?
Where is the code you posted last? I wasn't able to find it in trunk or branch-2.
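For readers who don't have the attachment handy, a minimal sketch of the kind of frequent-small-append repro that {{HDFS6489.java}} performs might look like the following. The file path, the 60M initial write, the 10-byte appends, and the 1000 iterations are illustrative assumptions on my part; the actual attachment is authoritative.
{code}
// Minimal sketch of a frequent-small-append repro against HDFS.
// Assumptions (not taken from the attachment): path "/tmp/hdfs6489-repro",
// 60M initial data, 1000 iterations of 10-byte appends.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SmallAppendRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();      // reads core-site.xml / hdfs-site.xml
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/tmp/hdfs6489-repro");   // assumed test path

    // Seed the file with roughly one block of data (60M, as in the description).
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.write(new byte[60 * 1024 * 1024]);
    }

    System.out.println("doing small appends...");
    // Re-open the last block for append many times, writing only a few bytes each time.
    for (int i = 0; i < 1000; i++) {
      try (FSDataOutputStream out = fs.append(file)) {
        out.write(new byte[10]);
      }
    }
    fs.close();
  }
}
{code}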
> DFS Used space is not correct computed on frequent append operations
> --------------------------------------------------------------------
>
> Key: HDFS-6489
> URL: https://issues.apache.org/jira/browse/HDFS-6489
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.2.0, 2.7.1, 2.7.2
> Reporter: stanley shi
> Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch,
> HDFS-6489.003.patch, HDFS-6489.004.patch, HDFS-6489.005.patch,
> HDFS-6489.006.patch, HDFS-6489.007.patch, HDFS6489.java
>
>
> The current implementation of the Datanode will increase the DFS used space
> on each block write operation. This is correct in most scenarios (creating a
> new file), but sometimes it behaves incorrectly (appending small data to a
> large block).
> For example, I have a file with only one block (say, 60M). Then I try to
> append to it very frequently, but each time I append only 10 bytes.
> On each append, DFS used will be increased by the length of the
> block (60M), not the actual data length (10 bytes).
> Consider a scenario where I use many clients to append concurrently to a large
> number of files (1000+), and assume the block size is 32M (half of the default
> value). Then DFS used will be increased by 1000*32M = 32G on each round of
> appends to the files, even though I actually write only 10K bytes; this will
> cause the datanode to report insufficient disk space on data writes.
> {quote}2014-06-04 15:27:34,719 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock
> BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException:
> Insufficient space for appending to FinalizedReplica, blk_1073742834_45306,
> FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda3        16G  2.9G   13G  20% /
> tmpfs           1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1        97M   32M   61M  35% /boot
> {quote}
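As a side note on the arithmetic in the quoted description: the snippet below is purely illustrative plain Java (not datanode code, and the class name is hypothetical). It just contrasts charging the full block length on every append, as described above, with the bytes actually written, using the numbers from the description.
{code}
// Illustrative only: the gap between the DFS-used increase described above
// (full block length charged per append) and the bytes actually written.
// Numbers (1000 files, 32M block size, 10-byte appends) come from the description.
public class DfsUsedAccounting {
  public static void main(String[] args) {
    long files = 1000;
    long blockSize = 32L * 1024 * 1024;  // 32M
    long appendBytes = 10;               // bytes actually appended per file

    long reportedIncrease = files * blockSize;   // 1000 * 32M, the "32G" in the description
    long actualIncrease = files * appendBytes;   // 1000 * 10 bytes, i.e. ~10K really written

    System.out.printf("reported DFS-used increase: %,d bytes (%,d MB)%n",
        reportedIncrease, reportedIncrease / (1024 * 1024));
    System.out.printf("actual data written:        %,d bytes%n", actualIncrease);
  }
}
{code}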