I would suggest running fsck with all of its reporting options:
hadoop fsck / -files -blocks -locations 
This will give you details of which blocks are missing and which files they 
belong to. The fsck output depends on the current state of the namenode and its 
knowledge of the blocks. The fact that your two outputs differ suggests that 
the namenode's state has been updated, meaning blocks that were missing earlier 
may be reported now. Check with the full options to see which blocks from which 
files are missing. 
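As an illustration, you can filter the verbose fsck report down to just the problem lines; the grep pattern below is an assumption about how the report flags bad blocks (the exact wording varies by Hadoop version), not a guaranteed format:

```shell
# Sketch: run fsck verbosely and keep only the lines that flag missing or
# corrupt blocks, along with the file paths they belong to.
# "MISSING" / "CORRUPT" markers are assumptions about the report format.
hadoop fsck / -files -blocks -locations | grep -E "MISSING|CORRUPT"
```

From there you can decide per file whether to restore it from a source copy or remove it.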
Thanks,
Lohit

----- Original Message ----
From: C G <[EMAIL PROTECTED]>
To: [email protected]
Sent: Sunday, May 11, 2008 9:55:40 PM
Subject: Re: HDFS corrupt...how to proceed?

The system hosting the namenode experienced an OS panic and shut down, we 
subsequently rebooted it.  Currently we don't believe there is/was a bad disk 
or other hardware problem.
  
  Something interesting:  I've run fsck twice; the first time it gave the 
result I posted.  The second time it still declared the FS to be corrupt, but 
said:
  [many rows of periods deleted]
  ..........Status: CORRUPT
Total size:    4900076384766 B
Total blocks:  994492 (avg. block size 4927215 B)
Total dirs:    47404
Total files:   952310
Over-replicated blocks:        0 (0.0 %)
Under-replicated blocks:       0 (0.0 %)
Target replication factor:     3
Real replication factor:       3.0
  
The filesystem under path '/' is CORRUPT

  So it seems like it's fixing some problems on its own?
  
  Thanks,
  C G
  
Dhruba Borthakur <[EMAIL PROTECTED]> wrote:
  Did one datanode fail or did the namenode fail? By "fail" do you mean
that the system was rebooted or was there a bad disk that caused the
problem?

thanks,
dhruba

On Sun, May 11, 2008 at 7:23 PM, C G wrote:
> Hi All:
>
> We had a primary node failure over the weekend. When we brought the node back 
> up and ran Hadoop fsck, I saw that the file system is corrupt. I'm unsure how 
> best to proceed. Any advice is greatly appreciated. If I've missed a Wiki 
> page or documentation somewhere please feel free to tell me to RTFM and let 
> me know where to look.
>
> Specific question: how do I clear under- and over-replicated files? Is the 
> correct procedure to copy the file locally, delete it from HDFS, and then 
> copy it back to HDFS?
>
> The fsck output is long, but the final summary is:
>
> Total size: 4899680097382 B
> Total blocks: 994252 (avg. block size 4928006 B)
> Total dirs: 47404
> Total files: 952070
> ********************************
> CORRUPT FILES: 2
> MISSING BLOCKS: 24
> MISSING SIZE: 1501009630 B
> ********************************
> Over-replicated blocks: 1 (1.0057812E-4 %)
> Under-replicated blocks: 14958 (1.5044476 %)
> Target replication factor: 3
> Real replication factor: 2.9849212
>
> The filesystem under path '/' is CORRUPT
>
>
>

