[ https://issues.apache.org/jira/browse/HDFS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155977#comment-16155977 ]
Ravi Prakash commented on HDFS-6489:
------------------------------------

Thanks for your reply, Brahma! Sorry about the tangent on {{FsDatasetImpl#removeOldReplica}}. I'm afraid I'm also not sure you are the point person on this; could you please redirect me to the right person if you're not?

Let's focus on the {{HDFS6489.java}} test written and reported by Bogdan. I see that it still fails on trunk. Here's the output:

{code}
$ java HDFS6489
doing small appends...
17/09/06 13:20:25 INFO hdfs.DataStreamer: Exception in createBlockOutputStream blk_1073741835_1057
java.io.EOFException: Unexpected EOF while trying to read response from server
	at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:444)
	at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1750)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1495)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1469)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:737)
Exception in thread "main" java.io.IOException: All datanodes [DatanodeInfoWithStorage[127.0.0.1:9866,DS-af60f3f1-eb86-46c2-821a-8d2f1dcb339d,DISK]] are bad. Aborting...
	at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1549)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1483)
	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1469)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:737)
{code}

Why do you think that is? Where is the code you posted last? I wasn't able to find it in trunk or branch-2.
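Since the attached {{HDFS6489.java}} is not inlined here, a minimal sketch of a small-append reproducer consistent with the output above may help readers following along. The file path, sizes, and iteration count below are assumptions, not the attached test:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical reproducer: create one large block, then append a few bytes
// at a time so the DataNode re-counts the replica on every append.
public class SmallAppendRepro {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at a running HDFS, e.g. hdfs://localhost:9000
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/tmp/small-append-test");

    // Write ~60M so the file has a single, mostly-full block.
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.write(new byte[60 * 1024 * 1024]);
    }

    System.out.println("doing small appends...");
    for (int i = 0; i < 1000; i++) {
      // Each append adds only 10 bytes; watch "DFS Used" on the DataNode.
      try (FSDataOutputStream out = fs.append(file)) {
        out.write(new byte[10]);
      }
    }
  }
}
{code}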
> DFS Used space is not correct computed on frequent append operations
> --------------------------------------------------------------------
>
>                 Key: HDFS-6489
>                 URL: https://issues.apache.org/jira/browse/HDFS-6489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.2.0, 2.7.1, 2.7.2
>            Reporter: stanley shi
>         Attachments: HDFS-6489.001.patch, HDFS-6489.002.patch, HDFS-6489.003.patch, HDFS-6489.004.patch, HDFS-6489.005.patch, HDFS-6489.006.patch, HDFS-6489.007.patch, HDFS6489.java
>
> The current implementation of the DataNode increases the DFS used space on each block write operation. This is correct in most scenarios (creating a new file), but it behaves incorrectly when appending small amounts of data to a large block.
> For example, I have a file with only one block (say, 60M), and I append to it very frequently, 10 bytes at a time. On each append, DFS used is increased by the length of the block (60M), not the actual data length (10 bytes).
> Consider a scenario where many clients append concurrently to a large number of files (1000+). Assuming a block size of 32M (half of the default value), DFS used is increased by 1000*32M = 32G on each round of appends, even though only about 10K bytes were actually written. This causes the DataNode to report insufficient disk space on data writes:
> {quote}2014-06-04 15:27:34,719 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1649188734-10.37.7.142-1398844098971:blk_1073742834_45306 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, FINALIZED{quote}
> But the actual disk usage:
> {quote}
> [root@hdsh143 ~]# df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda3        16G  2.9G   13G  20% /
> tmpfs           1.9G   72K  1.9G   1% /dev/shm
> /dev/sda1        97M   32M   61M  35% /boot
> {quote}
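To make the arithmetic in the quoted description concrete, here is a small self-contained sketch (plain Java; the class and variable names are invented for illustration, and this is not Hadoop's actual {{FsDatasetImpl}} bookkeeping) contrasting the observed per-append accounting with the delta-based accounting one would expect:

{code}
// Illustration only: hypothetical bookkeeping, not FsDatasetImpl code.
public class DfsUsedAccounting {
  public static void main(String[] args) {
    final long M = 1024L * 1024;
    long blockLen = 32 * M;  // 32M block, half the default size
    long appendBytes = 10;   // each append adds only 10 bytes
    int files = 1000;        // 1000+ files appended concurrently

    // Behavior described in the report: each append re-adds the whole
    // block length to DFS used.
    long buggyDelta = blockLen;       // +32M per append
    // Expected behavior: only the replica's growth is added.
    long expectedDelta = appendBytes; // +10 bytes per append

    // One round of appends across all files:
    System.out.println("buggy increase:    " + files * buggyDelta
        + " bytes (1000 * 32M, the ~32G cited in the description)");
    System.out.println("expected increase: " + files * expectedDelta
        + " bytes (~10K)");
  }
}
{code}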