Re: Under replicated block doesn't get fixed until DFS restart

Raghu Angadi Mon, 07 Jan 2008 12:33:16 -0800

Could you grep for one of the block ids (say 4522585614366970680) in the namenode logs and post the matching lines here with timestamps?
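For anyone who prefers a script to grep, here is a minimal sketch of the same search. It assumes only that the namenode log is a plain text file and that the block id appears literally in the relevant lines; the function name and the log path in the usage comment are hypothetical, not taken from this thread.

```python
import sys
from typing import Iterable, Iterator


def lines_mentioning(block_id: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield every line containing block_id as a substring (grep-like)."""
    for line in lines:
        if block_id in line:
            yield line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python find_block.py 4522585614366970680 namenode.log
    block_id, log_path = sys.argv[1], sys.argv[2]
    with open(log_path, errors="replace") as log:
        for hit in lines_mentioning(block_id, log):
            print(hit)
```

Matching on the bare id rather than on `blk_<id>` mirrors the grep suggested above, so it also catches lines that print the id in some other form.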


thanks,
Raghu.

Chris Kline wrote:

I believe there was at least one good block (see fsck output). All datanodes were up at the time according to the web page. I grep'd the namenode log files for the under replicated blocks and only got an entry for when each was created and entries for when the replication was fixed after the HDFS restart. Here is the result of fsck:
$HADOOP_HOME/bin/hadoop fsck /
.......................................................
/data/hbase1/hregion_70236052/compaction.dir/hregion_70236052/info/done: Under replicated blk_1984980330938654629. Target Replicas is 3 but found 1 replica(s).
..
/data/hbase1/hregion_70236052/info/info/2807320534360768620: Under replicated blk_1717622121416314549. Target Replicas is 3 but found 1 replica(s).
.
/data/hbase1/hregion_70236052/info/mapfiles/2807320534360768620/data: Under replicated blk_-5019714262388221150. Target Replicas is 3 but found 1 replica(s).
.
/data/hbase1/hregion_70236052/info/mapfiles/2807320534360768620/index: Under replicated blk_4522585614366970680. Target Replicas is 3 but found 1 replica(s).
.........................................
/data/hbase1/log_10.100.11.63_1199307142676_60020/hlog.dat.000: Under replicated blk_-2871471426720379908. Target Replicas is 3 but found 1 replica(s).
.......
/data/hbase1/log_10.100.11.65_1199307142711_60020/hlog.dat.000: MISSING 1 blocks of total size 0 B.
.Status: CORRUPT
 Total size:    71009158262 B
 Total blocks:  16318 (avg. block size 4351584 B)
 Total dirs:    21416
 Total files:   16253
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       5 (0.03064101 %)
 Target replication factor:     3
 Real replication factor:       2.9993873


The filesystem under path '/' is CORRUPT
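To collect the affected block ids from a report like the one above (for example, to search the namenode logs for each of them), a small parser can help. This is only a sketch: the regular expression is inferred from the lines pasted in this message, not from the Hadoop source, and the names `UnderReplicated` and `under_replicated` are made up for illustration. Whitespace is matched loosely so that output mangled by mail line wrapping still parses.

```python
import re
from typing import List, NamedTuple


class UnderReplicated(NamedTuple):
    path: str
    block: str
    target: int
    found: int


# Pattern inferred from the fsck lines quoted in this thread (an assumption).
PATTERN = re.compile(
    r"(?P<path>/\S*?):\s*Under\s*replicated\s+(?P<block>blk_-?\d+)\.\s*"
    r"Target Replicas is\s*(?P<target>\d+)\s*but\s*found\s*(?P<found>\d+)\s*replica"
)


def under_replicated(fsck_output: str) -> List[UnderReplicated]:
    """Return one record per under-replicated block reported in fsck_output."""
    return [
        UnderReplicated(m["path"], m["block"], int(m["target"]), int(m["found"]))
        for m in PATTERN.finditer(fsck_output)
    ]
```

Feeding it the saved output of `$HADOOP_HOME/bin/hadoop fsck /` should yield one record per "Under replicated" line. Lines reporting missing blocks are deliberately ignored, since they follow a different format.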


-Chris

On Jan 4, 2008, at 1:02 PM, Raghu Angadi wrote:
This is of course not expected. More detailed info or log messages would help. Do you know if there is at least one good block? Sometimes the remaining "good" block might actually be corrupted and thus cannot replicate itself. Restarting might just have brought up the datanodes that were down (for whatever reason) before the restart.
Raghu.

Chris Kline wrote:
fsck reports several under replicated blocks, but these do not get fixed until I restart DFS. fsck also reports a missing block at the same time, but this should not affect the fixing of under replicated blocks. Has anyone seen this before?
I'm running 0.15.0.
-Chris Kline
-Chris

