Thanks Andrzej,

Works a treat, which is more than I can say for my thoroughly broken DFS :(
I used:

  # bin/hadoop org.apache.hadoop.dfs.DFSck /user/root/crawl

and got:

  Status: CORRUPT
   Total size:    35381601016 B
   Total blocks:  2719 (avg. block size 13012725 B)
   Total dirs:    459
   Total files:   1751
    ********************************
    CORRUPT FILES:        1082
    MISSING BLOCKS:       1579
    MISSING SIZE:         20189790423 B
    ********************************
   Over-replicated blocks:        0 (0.0 %)
   Under-replicated blocks:       0 (0.0 %)
   Target replication factor:     3
   Real replication factor:       3.0

This, of course, is true.

For other users wanting to know how to break DFS (or not), one way is to add a new master node to the cluster and misconfigure "fs.default.name" in nutch/hadoop-site.xml:

  <name>fs.default.name</name>
  <value>nutch0:50000</value>

My error was that I intended to run nutch0 as job.tracker, but not as a datanode. So, when I ran bin/start-all.sh to start the cluster, it seemed to replicate the non-existent filesystem on nutch0, thereby starting to delete all my precious data.

One way to learn!

Thanks,

Monu Ogbe

----- Original Message -----
From: "Andrzej Bialecki (JIRA)" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, March 23, 2006 6:52 PM
Subject: [jira] Updated: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

>     [ http://issues.apache.org/jira/browse/HADOOP-101?page=all ]
>
> Andrzej Bialecki updated HADOOP-101:
> -------------------------------------
>
>    Attachment: DFSck.java
>
>> DFSck - fsck-like utility for checking DFS volumes
>> --------------------------------------------------
>>
>>          Key: HADOOP-101
>>          URL: http://issues.apache.org/jira/browse/HADOOP-101
>>      Project: Hadoop
>>         Type: New Feature
>>   Components: dfs
>>     Versions: 0.2
>>     Reporter: Andrzej Bialecki
>>     Assignee: Andrzej Bialecki
>>  Attachments: DFSck.java
>>
>> This is a utility to check health status of a DFS volume, and collect some additional statistics.
>
> --
> This message is automatically generated by JIRA.
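
To put a number on the damage in the DFSck report pasted in this message, here is a rough Python sketch. It assumes the report keeps the "Label: value" layout shown in that paste. The names parse_report and damage_summary are made up for illustration and are not part of Hadoop or DFSck.

```python
import re

# Summary figures copied from the DFSck report in this message.
REPORT = """\
Status: CORRUPT
 Total size:    35381601016 B
 Total blocks:  2719 (avg. block size 13012725 B)
 Total dirs:    459
 Total files:   1751
  CORRUPT FILES:        1082
  MISSING BLOCKS:       1579
  MISSING SIZE:         20189790423 B
"""


def parse_report(text):
    """Collect the leading integer of each 'Label: value' line, keyed by label.

    Hypothetical helper: lines whose value is not a number (such as the
    'Status' line) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z][A-Za-z .-]*?):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def damage_summary(fields):
    """Express the missing or corrupt totals as percentages of the whole."""
    return {
        "missing_bytes_pct": 100.0 * fields["MISSING SIZE"] / fields["Total size"],
        "missing_blocks_pct": 100.0 * fields["MISSING BLOCKS"] / fields["Total blocks"],
        "corrupt_files_pct": 100.0 * fields["CORRUPT FILES"] / fields["Total files"],
    }


if __name__ == "__main__":
    for name, pct in damage_summary(parse_report(REPORT)).items():
        print(f"{name}: {pct:.1f}%")
```

On these figures, a little over half of the bytes and blocks are missing, and a somewhat larger share of the files is affected, since a file counts as corrupt as soon as any one of its blocks is gone.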
