Re: HDFS corrupt...how to proceed?

2008-05-16 Thread Michael Di Domenico
> To: core-user@hadoop.apache.org > Sent: Monday, May 12, 2008 2:40:57 PM > Subject: Re: HDFS corrupt...how to proceed? > > Thanks to everyone who responded. Things are back on the air now - all the replication issues seem t

Re: HDFS corrupt...how to proceed?

2008-05-13 Thread Otis Gospodnetic
> To: core-user@hadoop.apache.org > Sent: Monday, May 12, 2008 2:40:57 PM > Subject: Re: HDFS corrupt...how to proceed? > > Thanks to everyone who responded. Things are back on the air now - all the replication issues seem to have gone away. I am wading through a detail

Re: HDFS corrupt...how to proceed?

2008-05-12 Thread C G
Thanks to everyone who responded. Things are back on the air now - all the replication issues seem to have gone away. I am wading through a detailed fsck output now looking for specific problems on a file-by-file basis. Just in case anybody is interested, we mirror our master nodes using
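
One way to do the file-by-file triage described above, sketched on the assumption that the detailed report came from hadoop fsck with the -files/-blocks flags (the exact problem markers, such as CORRUPT and MISSING, vary slightly by release):

  hadoop fsck / -files -blocks > /tmp/fsck-detail.txt   # capture the detailed report once
  grep -iE 'CORRUPT|MISSING' /tmp/fsck-detail.txt       # pull out just the problem files and blocks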

Re: HDFS corrupt...how to proceed?

2008-05-12 Thread Ted Dunning
You don't need to correct over-replicated files. The under-replicated files should cure themselves, but there is a problem on old versions where that doesn't happen quite right. You can use hadoop fsck / to get a list of the files that are broken and there are options to copy what remains of th
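
For reference, a minimal sketch of the fsck invocations being described here, assuming a contemporary (0.16/0.17-era) hadoop client; flag names and behavior may differ by release:

  hadoop fsck / -files -blocks -locations   # per-file detail: blocks and which datanodes hold them
  hadoop fsck / -move                       # salvage: move what remains of corrupt files to /lost+found
  hadoop fsck / -delete                     # discard: remove corrupt files outright

The -move option moves the salvageable blocks of broken files into /lost+found; -delete only makes sense when the data can be regenerated elsewhere.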

Re: HDFS corrupt...how to proceed?

2008-05-12 Thread lohit
Sunday, May 11, 2008 9:55:40 PM Subject: Re: HDFS corrupt...how to proceed? The system hosting the namenode experienced an OS panic and shut down, we subsequently rebooted it. Currently we don't believe there is/was a bad disk or other hardware problem. Something interesting: I'v

Re: HDFS corrupt...how to proceed?

2008-05-11 Thread C G
Yes, several of our logging apps had accumulated backlogs of data and were "eager" to write to HDFS. Dhruba Borthakur <[EMAIL PROTECTED]> wrote: Is it possible that new files were being created by running applications between the first and second fsck runs? thanks, dhruba On Sun, May 11, 2

Re: HDFS corrupt...how to proceed?

2008-05-11 Thread Dhruba Borthakur
Is it possible that new files were being created by running applications between the first and second fsck runs? thanks, dhruba On Sun, May 11, 2008 at 8:55 PM, C G <[EMAIL PROTECTED]> wrote: > The system hosting the namenode experienced an OS panic and shut down, we > subsequently rebooted it.

Re: HDFS corrupt...how to proceed?

2008-05-11 Thread C G
The system hosting the namenode experienced an OS panic and shut down, we subsequently rebooted it. Currently we don't believe there is/was a bad disk or other hardware problem. Something interesting: I've run fsck twice, the first time it gave the result I posted. The second time I sti
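
A small sketch of one way to compare back-to-back fsck runs like the two mentioned here (file names are arbitrary, and the per-file progress dots make the diff noisy unless filtered out):

  hadoop fsck / > fsck-run1.txt
  hadoop fsck / > fsck-run2.txt
  diff fsck-run1.txt fsck-run2.txt | less   # see which files/blocks changed status between runs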

Re: HDFS corrupt...how to proceed?

2008-05-11 Thread Dhruba Borthakur
Did one datanode fail or did the namenode fail? By "fail" do you mean that the system was rebooted or was there a bad disk that caused the problem? thanks, dhruba On Sun, May 11, 2008 at 7:23 PM, C G <[EMAIL PROTECTED]> wrote: > Hi All: > > We had a primary node failure over the weekend. When

HDFS corrupt...how to proceed?

2008-05-11 Thread C G
Hi All: We had a primary node failure over the weekend. When we brought the node back up and I ran Hadoop fsck, I see the file system is corrupt. I'm unsure how best to proceed. Any advice is greatly appreciated. If I've missed a Wiki page or documentation somewhere please feel free t
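
For context, the check that produces that verdict is a plain fsck of the root path; the summary at the end reports totals for corrupt, missing, and under-replicated blocks and closes with a health status line (exact wording varies by version):

  hadoop fsck /    # prints per-file dots, then a summary ending in a line such as
                   # "The filesystem under path '/' is HEALTHY" or "... is CORRUPT"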