Re: Best practices to recover from Corrupt Namenode

Harsh J Mon, 16 Jan 2012 23:19:21 -0800

You ran into a corrupt-files issue, not namenode corruption (which generally 
refers to the fsimage or edits log getting corrupted).


Did your files not have enough replication to withstand the loss of one DN's 
disk? What exactly did fsck output? Did all block replicas go missing for 
your files?
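
To answer the fsck question, it helps to capture the report (for example with
`hadoop fsck / -files -blocks -locations`) and pull out the files that have
lost blocks. Below is a minimal Python sketch of that filtering step. The
function name, the sample paths, and the exact line shapes it matches
("<path>: MISSING ..." and "<path>: CORRUPT ...") are assumptions modelled on
typical fsck output, so check them against what your Hadoop version prints.

```python
import re

# Assumed line shape: "<path>: MISSING ..." or "<path>: CORRUPT ...".
# This is modelled on typical fsck output and may differ between versions.
_LOST = re.compile(r"^(/.*?): (MISSING|CORRUPT)\b")


def files_with_lost_blocks(fsck_text):
    """Return the sorted, de-duplicated paths flagged MISSING or CORRUPT."""
    paths = set()
    for line in fsck_text.splitlines():
        match = _LOST.match(line)
        if match:
            paths.add(match.group(1))
    return sorted(paths)


# Hypothetical report excerpt, for illustration only.
sample = """\
/user/praveenesh/out/part-00000: CORRUPT block blk_-1234567890
/user/praveenesh/out/part-00000: MISSING 1 blocks of total size 1048576 B.
/user/praveenesh/in/data.txt:  Under replicated blk_42. Target Replicas is 3 but found 1 replica(s).
"""

if __name__ == "__main__":
    for path in files_with_lost_blocks(sample):
        print(path)
```

Files that are only under-replicated are deliberately left out: they still
have at least one live replica and can be re-replicated, whereas files listed
here have blocks with no usable replica.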

On 17-Jan-2012, at 12:08 PM, praveenesh kumar wrote:

> Hi guys,
> 
> I just faced a weird situation in which one of the hard disks on a DN went
> down.
> Because of that, when I restarted the namenode, some of the blocks went
> missing and it reported my namenode as CORRUPT and in safe mode, which
> doesn't allow you to add or delete any files on HDFS.
> 
> I know we can turn off safe mode.
> The problem is how to deal with the Corrupt Namenode problem in this case --
> best practices.
> 
> In my case, I was lucky that all the missing blocks belonged to the outputs
> of M/R jobs I had run previously.
> So I just deleted all the files with missing blocks from HDFS to go
> from CORRUPT --> HEALTHY state.
> 
> But had it been the large input data files, deleting those files would not
> be a good solution.
> 
> So I wanted to know the best practices for dealing with this kind of
> problem, to go from CORRUPT NAMENODE --> HEALTHY NAMENODE.
> 
> Thanks,
> Praveenesh

--
Harsh J
Customer Ops. Engineer, Cloudera
