Sorry, changed the title: without -> with

On Thu, Jul 7, 2011 at 9:14 PM, Wei Wu <woo....@gmail.com> wrote:
> Hi,
>
> We encountered a strange situation when restarting the NameNode: it cannot
> leave safe mode automatically. "The ratio of reported blocks 0.9986 has not
> reached the threshold 0.999". Our cluster has a total of 83,276,820 blocks.
> So, if the counter is right, we are missing about 116,587 blocks. But fsck
> reported that 83,276,779 blocks were healthy and 37 blocks were in open
> files. Only 4 blocks were marked as corrupt because their length is shorter
> than the existing ones. If the fsck result can be trusted, the ratio is
> higher than 0.999999, so the threshold should have been reached.
>
> I think the blockSafe counter may not be counting accurately. Is that
> possible? Our case is similar to the situation described in JIRA:
> https://issues.apache.org/jira/browse/HADOOP-2159 (our Hadoop release
> already includes this patch).
>
> Any suggestions?
>
> Wei
>
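
For anyone who wants to double-check the numbers in the quoted message, here is a minimal sketch of the arithmetic. It uses only the figures quoted above; the variable names are made up for illustration and are not Hadoop identifiers, and the reported ratio 0.9986 is itself rounded, so the implied count of missing blocks is only approximate.

```python
# Figures taken from the quoted message; names are illustrative only.
total_blocks = 83_276_820
reported_ratio = 0.9986   # from the NameNode safe mode message (rounded)
threshold = 0.999         # from the same message

# fsck figures from the quoted message.
healthy = 83_276_779
in_open_files = 37
corrupt = 4

# The three fsck categories should account for every block.
accounted_for = healthy + in_open_files + corrupt

# Blocks that the safe mode ratio implies were not reported (approximate).
implied_missing = int(total_blocks * (1 - reported_ratio))

# Ratio implied by fsck if only healthy blocks are counted as safe.
fsck_ratio = healthy / total_blocks

print("accounted for by fsck:", accounted_for)
print("implied missing blocks:", implied_missing)
print("fsck ratio above threshold:", fsck_ratio > threshold)
```

The fsck categories add up to the full block total, and the fsck ratio sits well above the threshold, so the two views of the cluster disagree by roughly a hundred thousand blocks.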