Could you grep for one of the block ids (say 4522585614366970680) in the
namenode logs and post the matching lines here with timestamps?
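Something along these lines should do it. This is only a sketch: the helper
name find_block_in_logs is made up, and it assumes the namenode logs sit
under $HADOOP_LOG_DIR (or $HADOOP_HOME/logs) with "namenode" in the file
name, so adjust the directory and pattern to your install.

```shell
# find_block_in_logs BLOCK_ID [LOG_DIR]
# Hypothetical helper: prints every namenode log line mentioning the block.
# Assumption: log files match *namenode*.log* and each line starts with a
# timestamp, so a plain sort puts the matches in chronological order.
find_block_in_logs() {
    block_id="$1"
    log_dir="${2:-${HADOOP_LOG_DIR:-$HADOOP_HOME/logs}}"
    # -h drops file names, -F treats the id as a fixed string
    # (block ids can be negative, e.g. blk_-5019714262388221150).
    grep -hF -- "blk_${block_id}" "$log_dir"/*namenode*.log* | sort
}
```

For example, "find_block_in_logs 4522585614366970680" would search the
default log directory; pass a second argument to point it elsewhere.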
thanks,
Raghu.
Chris Kline wrote:
I believe there was at least one good block (see fsck output). All data
nodes were up at the time according to the web page. I grep'd the
namenode log files for the under replicated blocks and only got an entry
for when it was created and entries for when the replication was fixed
after the HDFS restart. Here is the result of fsck:
$HADOOP_HOME/bin/hadoop fsck /
.......................................................
/data/hbase1/hregion_70236052/compaction.dir/hregion_70236052/info/done: Under replicated blk_1984980330938654629. Target Replicas is 3 but found 1 replica(s).
..
/data/hbase1/hregion_70236052/info/info/2807320534360768620: Under replicated blk_1717622121416314549. Target Replicas is 3 but found 1 replica(s).
.
/data/hbase1/hregion_70236052/info/mapfiles/2807320534360768620/data: Under replicated blk_-5019714262388221150. Target Replicas is 3 but found 1 replica(s).
.
/data/hbase1/hregion_70236052/info/mapfiles/2807320534360768620/index: Under replicated blk_4522585614366970680. Target Replicas is 3 but found 1 replica(s).
.........................................
/data/hbase1/log_10.100.11.63_1199307142676_60020/hlog.dat.000: Under replicated blk_-2871471426720379908. Target Replicas is 3 but found 1 replica(s).
.......
/data/hbase1/log_10.100.11.65_1199307142711_60020/hlog.dat.000: MISSING 1 blocks of total size 0 B.
.Status: CORRUPT
Total size: 71009158262 B
Total blocks: 16318 (avg. block size 4351584 B)
Total dirs: 21416
Total files: 16253
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 5 (0.03064101 %)
Target replication factor: 3
Real replication factor: 2.9993873
The filesystem under path '/' is CORRUPT
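The under-replicated block ids can be pulled out of fsck output like the
above rather than copied by hand. A rough sketch, assuming each "Under
replicated blk_..." entry arrives on a single line of fsck's output; the
helper name under_replicated_ids is hypothetical:

```shell
# under_replicated_ids: reads fsck output on stdin and prints one block id
# per line for every "Under replicated blk_<id>" entry it finds.
# Assumption: the message format matches the output shown in this thread.
under_replicated_ids() {
    # Block ids may be negative, hence the optional "-" after "blk_".
    sed -n 's/.*Under replicated \(blk_-\{0,1\}[0-9][0-9]*\).*/\1/p'
}
```

Usage would be along the lines of
"$HADOOP_HOME/bin/hadoop fsck / | under_replicated_ids", with the resulting
ids then grepped for in the namenode logs.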
-Chris
On Jan 4, 2008, at 1:02 PM, Raghu Angadi wrote:
This is of course not expected. More detailed info or a log message
would help. Do you know if there is at least one good block?
Sometimes, the remaining "good" block might actually be corrupted and
thus can not replicate itself. Restarting might just have brought up
the datanodes that were down (for whatever reason) before the restart.
Raghu.
Chris Kline wrote:
fsck reports several under-replicated blocks, but these do not get
fixed until I restart DFS. fsck also reports a missing block at the
same time, but this should not affect the fixing of under-replicated
blocks. Has anyone seen this before?
I'm running 0.15.0.
-Chris Kline