Thanks to everyone who responded. Things are back on the air now - all the
replication issues seem to have gone away. I am wading through a detailed fsck
output now looking for specific problems on a file-by-file basis.
Just in case anybody is interested, we mirror our master nodes using DRBD.
It performed very well in this first "real world" test. If there is interest, I
can write up how we protect our master nodes in more detail and share it with
the community.
Thanks,
C G
Ted Dunning <[EMAIL PROTECTED]> wrote:
You don't need to correct over-replicated files.
The under-replicated files should cure themselves, but there is a problem on
old versions where that doesn't happen quite right.
You can use hadoop fsck / to get a list of the files that are broken and
there are options to copy what remains of them to lost+found or to delete
them.
Other than that, things should correct themselves fairly quickly.
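
For anyone who wants to triage an fsck summary like the one quoted below without reading it by eye, here is a minimal Python sketch. It only parses the "Label: value" lines of the summary text and separates the findings that need an operator (corrupt files, missing blocks) from the ones that should heal on their own (under- and over-replicated blocks), following the advice above. The names parse_fsck_summary, triage, and percent_under_replicated are made up for illustration and are not part of Hadoop. The sketch assumes the summary keeps the layout shown in this thread, which may differ between versions.

```python
import re

# Matches summary lines such as "Total blocks: 994252 (avg. block size ...)"
# or "Under-replicated blocks: 14958 (1.5044476 %)": label, colon, number.
_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z \-]*?):\s+([0-9][0-9.]*)")


def parse_fsck_summary(text):
    """Return {lowercased label: number} for each 'Label: value' line."""
    summary = {}
    for line in text.splitlines():
        line = line.lstrip("> ")  # tolerate quoted mail text
        match = _LINE.match(line)
        if match:
            label, value = match.groups()
            summary[label.lower()] = float(value) if "." in value else int(value)
    return summary


def triage(summary):
    """Split findings into (needs_action, self_healing) lists of labels."""
    needs_action = [
        key for key in ("corrupt files", "missing blocks")
        if summary.get(key, 0)
    ]
    self_healing = [
        key for key in ("under-replicated blocks", "over-replicated blocks")
        if summary.get(key, 0)
    ]
    return needs_action, self_healing


def percent_under_replicated(summary):
    """Under-replicated blocks as a percentage of all blocks."""
    total = summary.get("total blocks", 0)
    if not total:
        return 0.0
    return 100.0 * summary.get("under-replicated blocks", 0) / total


if __name__ == "__main__":
    import sys

    # Usage (assumed): hadoop fsck / | python triage_fsck.py
    parsed = parse_fsck_summary(sys.stdin.read())
    action, healing = triage(parsed)
    print("needs attention:", action)
    print("should self-heal:", healing)
    print("under-replicated: %.4f %%" % percent_under_replicated(parsed))
```

Anything listed under "needs attention" is what the fsck options mentioned above are for; as far as I know these are -move (copy what remains to lost+found) and -delete, but check the usage output of your Hadoop version before running either.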
On 5/11/08 8:23 PM, "C G" wrote:
> Hi All:
>
> We had a primary node failure over the weekend. When we brought the node
> back up and I ran hadoop fsck, it reported the file system as corrupt. I'm unsure
> how best to proceed. Any advice is greatly appreciated. If I've missed a
> Wiki page or documentation somewhere please feel free to tell me to RTFM and
> let me know where to look.
>
> Specific question: how do I clear under- and over-replicated files? Is the
> correct procedure to copy the file locally, delete from HDFS, and then copy
> back to HDFS?
>
> The fsck output is long, but the final summary is:
>
> Total size: 4899680097382 B
> Total blocks: 994252 (avg. block size 4928006 B)
> Total dirs: 47404
> Total files: 952070
> ********************************
> CORRUPT FILES: 2
> MISSING BLOCKS: 24
> MISSING SIZE: 1501009630 B
> ********************************
> Over-replicated blocks: 1 (1.0057812E-4 %)
> Under-replicated blocks: 14958 (1.5044476 %)
> Target replication factor: 3
> Real replication factor: 2.9849212
>
> The filesystem under path '/' is CORRUPT
>
>
> ---------------------------------
> Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it
> now.