Could you grep for one of the block ids (say 4522585614366970680) in the namenode logs and post the matching lines here with timestamps?

thanks,
Raghu.
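
For anyone who would rather script the search than run grep by hand, here is a minimal sketch. It assumes only that the namenode log mentions blocks as blk_<id>, as the fsck output does; the function name find_block_lines is made up for illustration, and it makes no assumption about the log's timestamp format (it simply returns the whole matching line).

```python
import re
from typing import Iterable, Iterator


def find_block_lines(lines: Iterable[str], block_id: str) -> Iterator[str]:
    """Yield every line that mentions blk_<block_id>.

    block_id is the numeric part only and may be negative,
    e.g. "4522585614366970680" or "-5019714262388221150".
    """
    # (?!\d) keeps a longer id that merely starts with the same digits
    # from matching.
    pattern = re.compile(r"blk_" + re.escape(block_id) + r"(?!\d)")
    for line in lines:
        if pattern.search(line):
            yield line.rstrip("\n")
```

To use it, open each namenode log file and pass the file object as `lines`; the matching lines come back unchanged, so whatever timestamps the log carries are preserved.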

Chris Kline wrote:
I believe there was at least one good block (see fsck output). All data nodes were up at the time according to the web page. I grep'd the namenode log files for the under replicated blocks and only got an entry for when it was created and entries for when the replication was fixed after the HDFS restart. Here is the result of fsck:

$HADOOP_HOME/bin/hadoop fsck /
.......................................................
/data/hbase1/hregion_70236052/compaction.dir/hregion_70236052/info/done: Under replicated blk_1984980330938654629. Target Replicas is 3 but found 1 replica(s).
..
/data/hbase1/hregion_70236052/info/info/2807320534360768620: Under replicated blk_1717622121416314549. Target Replicas is 3 but found 1 replica(s).
.
/data/hbase1/hregion_70236052/info/mapfiles/2807320534360768620/data: Under replicated blk_-5019714262388221150. Target Replicas is 3 but found 1 replica(s).
.
/data/hbase1/hregion_70236052/info/mapfiles/2807320534360768620/index: Under replicated blk_4522585614366970680. Target Replicas is 3 but found 1 replica(s).
.........................................
/data/hbase1/log_10.100.11.63_1199307142676_60020/hlog.dat.000: Under replicated blk_-2871471426720379908. Target Replicas is 3 but found 1 replica(s).
.......
/data/hbase1/log_10.100.11.65_1199307142711_60020/hlog.dat.000: MISSING 1 blocks of total size 0 B.
.Status: CORRUPT
 Total size:    71009158262 B
 Total blocks:  16318 (avg. block size 4351584 B)
 Total dirs:    21416
 Total files:   16253
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       5 (0.03064101 %)
 Target replication factor:     3
 Real replication factor:       2.9993873


The filesystem under path '/' is CORRUPT


-Chris
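
To pull the under-replicated block ids out of fsck output like the above without copying them by hand, something along these lines may help. It is only a sketch: the regular expression is written against the line format shown in this particular output (0.15.0) and may need adjusting for other versions, and the names UnderReplicated and parse_under_replicated are made up for illustration.

```python
import re
from typing import List, NamedTuple


class UnderReplicated(NamedTuple):
    path: str
    block: str   # e.g. "blk_4522585614366970680"
    target: int  # target replica count
    found: int   # replicas actually found


# Matches lines such as:
# /some/path: Under replicated blk_123. Target Replicas is 3 but found 1 replica(s).
# Leading progress dots are tolerated in case they share the line.
_LINE = re.compile(
    r"^\.*(?P<path>/\S+): Under replicated (?P<block>blk_-?\d+)\. "
    r"Target Replicas is (?P<target>\d+) but found (?P<found>\d+) replica\(s\)\."
)


def parse_under_replicated(fsck_output: str) -> List[UnderReplicated]:
    """Return one record per 'Under replicated' line in fsck output."""
    records = []
    for line in fsck_output.splitlines():
        m = _LINE.match(line)
        if m:
            records.append(
                UnderReplicated(
                    m.group("path"),
                    m.group("block"),
                    int(m.group("target")),
                    int(m.group("found")),
                )
            )
    return records
```

The resulting block ids can then be searched for in the namenode logs one by one.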

On Jan 4, 2008, at 1:02 PM, Raghu Angadi wrote:

This is of course not expected. More detailed information or a log message would help. Do you know if there is at least one good block? Sometimes the remaining "good" block is actually corrupted and thus cannot replicate itself. Restarting might just have brought up datanodes that were down (for whatever reason) before the restart.

Raghu.

Chris Kline wrote:
fsck reports several under-replicated blocks, but these do not get fixed until I restart DFS. fsck also reports a missing block at the same time, but this should not affect the fixing of under-replicated blocks. Has anyone seen this before?
I'm running 0.15.0.
-Chris Kline


-Chris
