I second that request... I use DRBD for another project where I work and definitely see its benefits, but I haven't tried it with Hadoop yet.
Thanks

On Tue, May 13, 2008 at 11:17 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'd love to see the DRBD+Hadoop write up! Not only would this be useful
> for Hadoop, I can see this being useful for Solr (master replication).
>
> Thanks,
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: C G <[EMAIL PROTECTED]>
> > To: [email protected]
> > Sent: Monday, May 12, 2008 2:40:57 PM
> > Subject: Re: HDFS corrupt...how to proceed?
> >
> > Thanks to everyone who responded. Things are back on the air now - all
> > the replication issues seem to have gone away. I am wading through a
> > detailed fsck output now looking for specific problems on a
> > file-by-file basis.
> >
> > Just in case anybody is interested, we mirror our master nodes using
> > DRBD. It performed very well in this first "real world" test. If there
> > is interest I can write up how we protect our master nodes in more
> > detail and share w/the community.
> >
> > Thanks,
> > C G
> >
> > Ted Dunning wrote:
> >
> > You don't need to correct over-replicated files.
> >
> > The under-replicated files should cure themselves, but there is a
> > problem on old versions where that doesn't happen quite right.
> >
> > You can use hadoop fsck / to get a list of the files that are broken,
> > and there are options to copy what remains of them to lost+found or to
> > delete them.
> >
> > Other than that, things should correct themselves fairly quickly.
> >
> > On 5/11/08 8:23 PM, "C G" wrote:
> >
> > > Hi All:
> > >
> > > We had a primary node failure over the weekend. When we brought the
> > > node back up and I ran Hadoop fsck, I see the file system is corrupt.
> > > I'm unsure how best to proceed. Any advice is greatly appreciated. If
> > > I've missed a Wiki page or documentation somewhere, please feel free
> > > to tell me to RTFM and let me know where to look.
> > >
> > > Specific question: how to clear under- and over-replicated files? Is
> > > the correct procedure to copy the file locally, delete from HDFS, and
> > > then copy back to HDFS?
> > >
> > > The fsck output is long, but the final summary is:
> > >
> > >  Total size:    4899680097382 B
> > >  Total blocks:  994252 (avg. block size 4928006 B)
> > >  Total dirs:    47404
> > >  Total files:   952070
> > >  ********************************
> > >    CORRUPT FILES:  2
> > >    MISSING BLOCKS: 24
> > >    MISSING SIZE:   1501009630 B
> > >  ********************************
> > >  Over-replicated blocks:  1 (1.0057812E-4 %)
> > >  Under-replicated blocks: 14958 (1.5044476 %)
> > >  Target replication factor: 3
> > >  Real replication factor:   2.9849212
> > >
> > > The filesystem under path '/' is CORRUPT
> > >
> > > ---------------------------------
> > > Be a better friend, newshound, and know-it-all with Yahoo! Mobile.
> > > Try it now.
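[Editor's note: the fsck cleanup Ted describes maps onto the command line roughly as follows. This is a sketch for the Hadoop of that era; the paths are examples, and depending on your version the filesystem shell may be `hadoop dfs` rather than `hadoop fs`.]

```shell
# List broken files along with their blocks and block locations.
hadoop fsck / -files -blocks -locations

# Move what remains of corrupt files into /lost+found on HDFS ...
hadoop fsck / -move

# ... or delete the corrupt files outright (destructive, no undo).
hadoop fsck / -delete

# Under-replicated blocks normally re-replicate on their own, but you can
# also re-queue a file by re-setting its target replication factor
# (-w waits until the target is reached):
hadoop fs -setrep -w 3 /path/to/file
```

Over-replicated blocks need no action at all; the namenode trims the extra replicas itself.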
