On 1/12/11 1:36 PM, Adam Phelps wrote:
Also, there apparently is a way of healing a corrupt edits file using
your favorite hex editor. There is a thread here:
http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201010.mbox/%3caanlktinbhmn1x8dlir-c4ibhja9nh46tns588cqcn...@mail.gmail.com%3e
Thanks for the link. Manually editing the edits file is our current
thought; a little understanding of the format should save us some pain.
I made a brief attempt at doing manual edits, but ended up taking a
different approach and made some changes (which I reverted after they'd
been used) to FSEditLog.java. I added a try/catch statement around the
code that was generating the NullPointerException to catch and ignore
that error, which appears to have allowed the namenode to come up
successfully. It looks like ~20 files were problematic, all apparently
temporary output from an MR job. At the moment everything seems to be
running correctly; we'll see if that continues.
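
For illustration, the catch-and-skip workaround described above can be
sketched roughly as follows. This is not the actual FSEditLog.java change:
the class EditReplaySketch, the method replaySkippingNpe, and the sample
paths are all hypothetical stand-ins, and the real edit-loading code is
structured differently. The sketch only shows the general pattern of
catching the NullPointerException per operation, recording what was
skipped, and continuing with the rest of the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the replay loop; NOT the real FSEditLog code.
class EditReplaySketch {

    /**
     * Applies each operation in order. An operation whose application throws
     * NullPointerException is skipped and returned to the caller so that it
     * can be reviewed afterwards.
     */
    static <T> List<T> replaySkippingNpe(List<T> ops, Consumer<T> apply) {
        List<T> skipped = new ArrayList<>();
        for (T op : ops) {
            try {
                apply.accept(op);
            } catch (NullPointerException e) {
                // Assumption: a bad entry can be dropped without corrupting
                // later state. Keep a record of everything that was skipped.
                skipped.add(op);
                System.err.println("Skipping edit that failed to apply: " + op);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> applied = new ArrayList<>();
        // The null entry simulates a record that triggers the NPE.
        List<String> ops = Arrays.asList("/user/a", null, "/user/b");
        List<String> skipped =
            replaySkippingNpe(ops, path -> applied.add(path.trim()));
        System.out.println("applied=" + applied + " skipped=" + skipped.size());
    }
}
```

Any operation skipped this way is simply never applied, so a list of
the skipped entries is worth keeping in order to check afterwards which
files were affected.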
Todd - let me know if there's any information that would be useful for
looking into this issue.
- Adam