I received a response from Todd Lipcon of Cloudera and am updating this 
thread in case others hit the same issue. I observed it on CDH3u1, and 
this was his response:

The "append" function is known to be broken in CDH3 and is likely to have bugs 
like this. We would recommend you advise your users to not use it. This is true 
of all releases of Hadoop 0.20.x (CDH and otherwise) and will be fixed in CDH4 
(upstream version 0.23 or above).
Sorry for the bad news. I'll look into this particular bug to make sure it 
isn't present in the upstream trunk, but it's unlikely to be fixed in a CDH3 
release.
Thanks -Todd

Thanks all,
-Shawn

From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
Sent: Saturday, November 05, 2011 4:56 AM
To: hdfs-user@hadoop.apache.org
Subject: Re: HDFS reporting varying number of under-replicated blocks; MISSING 
blocks

Hi,

It could be that the balancer is running. I tried to reproduce this and saw 
similar output while a job was running. Just an idea.

- alex
On Fri, Nov 4, 2011 at 2:43 PM, <shawn.higg...@thomsonreuters.com> wrote:

Hello all,

I am getting wildly varying reports of under-replicated blocks from dfsadmin 
and fsck, and I am wondering what is causing this. hadoop dfsadmin -metasave 
reports about 1,000 actual blocks awaiting replication and about 404,000 
MISSING blocks awaiting replication. How do I fix this? Why are there so many 
MISSING blocks? 
All data nodes have checked in and none are dead. The number of missing blocks 
has been increasing by about 50,000 to 100,000 each day for the past few days. 
Jobs run just fine and there does not appear to be data missing.

Please see output from the following commands below: hadoop fsck /, hadoop 
dfsadmin -report, hadoop dfsadmin -metasave, and the namenode web GUI.



DFSadmin report:
Configured Capacity: 59070545264640 (53.72 TB)
Present Capacity: 56104851291863 (51.03 TB)
DFS Remaining: 35451062198272 (32.24 TB)
DFS Used: 20653789093591 (18.78 TB)
DFS Used%: 36.81%
Under replicated blocks: 405571
Blocks with corrupt replicas: 110
Missing blocks: 0


Hadoop fsck /:
Total size:    6793151991960 B (Total open files size: 46894626752 B)
Total dirs:    1748
Total files:   229033 (Files currently being written: 486)
Total blocks (validated):      244648 (avg. block size 27767044 B) (Total open 
file blocks (not validated): 675)
Minimally replicated blocks:   244648 (100.0 %)
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       931 (0.38054675 %)
Mis-replicated blocks:         0 (0.0 %)
Default replication factor:    3
Average block replication:     2.9945922
Corrupt blocks:                0
Missing replicas:              1844 (0.25169903 %)
Number of data-nodes:          20
Number of racks:               1
FSCK ended at Fri Nov 04 08:26:25 CDT 2011 in 7222 milliseconds
The filesystem under path '/' is HEALTHY
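
The gap between the two tools can be made explicit by pulling the under-replicated count out of each output. The following is a minimal sketch in Python, assuming the line formats shown in the dfsadmin and fsck output above; the function name `under_replicated` and the embedded sample strings are illustrative and not part of any Hadoop tool.

```python
import re

# Matches both "Under replicated blocks: 405571" (dfsadmin -report)
# and "Under-replicated blocks:       931 (0.38 %)" (fsck).
# The formats are assumed from the output pasted in this thread.
COUNT_RE = re.compile(r"Under[- ]replicated blocks:\s*(\d+)")


def under_replicated(text):
    """Return the under-replicated block count found in text, or None."""
    match = COUNT_RE.search(text)
    return int(match.group(1)) if match else None


dfsadmin_report = "DFS Used%: 36.81%\nUnder replicated blocks: 405571\n"
fsck_report = "Under-replicated blocks:       931 (0.38054675 %)\n"

from_dfsadmin = under_replicated(dfsadmin_report)
from_fsck = under_replicated(fsck_report)
print("dfsadmin:", from_dfsadmin)
print("fsck:", from_fsck)
print("difference:", from_dfsadmin - from_fsck)
```

With the numbers above the difference is 404640, which is in the same range as the MISSING entries that metasave lists as awaiting replication.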



Hadoop dfsadmin -metasave excerpt:
231268 files and directories, 245326 blocks = 476594 total
Live Datanodes: 20
Dead Datanodes: 0
Metasave: Blocks waiting for replication: 404829
/file/logs/prod/A/20111031/kj9384-91-prod-oidu.20111028.213210: 
blk_-9102337703676304543_9404421 (replicas: l: 1 d: 0 c: 0 e: 0)  
33.112.130.196:50010 :
..........
: blk_9223167009736432600_10568409 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10574235 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10579879 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10582788 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10588269 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10595646 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10598074 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10600859 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10606703 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10612156 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10614776 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10624115 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10630240 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10634047 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10637486 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10644655 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10653784 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10657780 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10661247 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
: blk_9223167009736432600_10664704 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)
Metasave: Blocks being replicated: 0
Metasave: Blocks 0 waiting deletion from 0 datanodes.
Metasave: Number of datanodes: 20
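
In the excerpt above, a single block ID recurs with many different trailing numbers. One way to check how widespread that pattern is would be to group the MISSING entries by block ID. The following is a minimal sketch in Python, assuming the `blk_<id>_<generation stamp> MISSING` line format seen in the excerpt (the trailing number is taken to be the generation stamp, as in HDFS block naming); the function name and sample lines are illustrative only.

```python
import re
from collections import defaultdict

# Matches entries such as
# ": blk_9223167009736432600_10568409 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)".
# The line format is assumed from the metasave excerpt in this thread.
MISSING_RE = re.compile(r"blk_(-?\d+)_(\d+)\s+MISSING\b")


def group_missing_blocks(lines):
    """Map each block ID to the generation stamps reported as MISSING."""
    groups = defaultdict(list)
    for line in lines:
        match = MISSING_RE.search(line)
        if match:
            block_id, gen_stamp = match.groups()
            groups[block_id].append(int(gen_stamp))
    return dict(groups)


sample = [
    "blk_-9102337703676304543_9404421 (replicas: l: 1 d: 0 c: 0 e: 0)  33.112.130.196:50010 :",
    ": blk_9223167009736432600_10568409 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)",
    ": blk_9223167009736432600_10574235 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)",
    ": blk_9223167009736432600_10579879 MISSING (replicas: l: 0 d: 0 c: 0 e: 0)",
]

for block_id, stamps in group_missing_blocks(sample).items():
    print(f"blk_{block_id}: {len(stamps)} MISSING entries, "
          f"generation stamps {min(stamps)}..{max(stamps)}")
```

To run it against a real dump, pass the open metasave file object in place of `sample`. A few block IDs with very many generation stamps each would point at files that are being reopened repeatedly, rather than at lost data.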

Web GUI:
Cluster Summary
231267 files and directories, 245316 blocks = 476583 total. Heap Size is 626.94 
MB / 8.68 GB (7%)
Configured Capacity: 53.72 TB
DFS Used: 19.05 TB
Non DFS Used: 2.43 TB
DFS Remaining: 32.25 TB
DFS Used%: 35.46 %
DFS Remaining%: 60.02 %
Live Nodes: 20
Dead Nodes: 0
Decommissioning Nodes: 0
Number of Under-Replicated Blocks: 405943

Shawn Higgins
Thomson Reuters

shawn.higg...@thomsonreuters.com
thomsonreuters.com



--
Alexander Lorenz
http://mapredit.blogspot.com
