Perhaps we should implement a system in which removed blocks linger on the datanodes for a long period of time (a week?), or until the space is needed, FIFO?

Ditto for the metadata? That would make archeology around failures easier.
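Roughly, the datanode-side bookkeeping for that could look like the sketch below (class and method names here are made up for illustration, not existing Hadoop code): deletions go into a queue and are only applied once the retention window expires or the space is actually needed, oldest first.

import java.io.File;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: instead of deleting a "removed" block immediately,
// hold it on the datanode and purge only after a retention period, or
// when disk space runs low (oldest entries first).
public class DeferredBlockDeleter {
  private static class PendingDelete {
    final File blockFile;
    final long removedAtMillis;
    PendingDelete(File blockFile, long removedAtMillis) {
      this.blockFile = blockFile;
      this.removedAtMillis = removedAtMillis;
    }
  }

  private final Deque<PendingDelete> pending = new ArrayDeque<PendingDelete>();
  private final long retentionMillis;

  public DeferredBlockDeleter(long retentionMillis) {
    this.retentionMillis = retentionMillis;   // e.g. 7 days
  }

  /** Called instead of deleting the block file right away. */
  public synchronized void scheduleDelete(File blockFile) {
    pending.addLast(new PendingDelete(blockFile, System.currentTimeMillis()));
  }

  /** Purge blocks older than the retention window. */
  public synchronized void purgeExpired() {
    long now = System.currentTimeMillis();
    while (!pending.isEmpty()
        && now - pending.peekFirst().removedAtMillis > retentionMillis) {
      pending.removeFirst().blockFile.delete();
    }
  }

  /** Reclaim space FIFO when the volume is running low. */
  public synchronized long reclaim(long bytesNeeded) {
    long freed = 0;
    while (freed < bytesNeeded && !pending.isEmpty()) {
      PendingDelete p = pending.removeFirst();
      freed += p.blockFile.length();
      p.blockFile.delete();
    }
    return freed;
  }
}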

On Mar 23, 2006, at 4:56 PM, Doug Cutting (JIRA) wrote:

[ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371659 ]

Doug Cutting commented on HADOOP-101:
-------------------------------------

I like that this does not use anything more than the client API to check the server. That keeps the server core lean and mean. The use of RPCs effectively restricts the impact of the scan on the FS.
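For illustration, a purely client-side scan is roughly this much code (method names here follow later FileSystem releases, so treat it as a sketch rather than the 0.2-era API): list the namespace and ask the namenode, over RPC, where each file's blocks live.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of a client-side-only scan: list every file and ask the namenode
// (over RPC) where its blocks live, flagging blocks with no locations.
public class ClientSideScan {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    scan(fs, new Path("/"));
  }

  static void scan(FileSystem fs, Path p) throws Exception {
    for (FileStatus stat : fs.listStatus(p)) {
      if (stat.isDirectory()) {
        scan(fs, stat.getPath());
      } else {
        BlockLocation[] locs =
            fs.getFileBlockLocations(stat, 0, stat.getLen());
        for (BlockLocation loc : locs) {
          if (loc.getHosts().length == 0) {
            System.out.println("MISSING: " + stat.getPath()
                + " @ offset " + loc.getOffset());
          }
        }
      }
    }
  }
}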

A datanode operation that streams through a block without transferring it over the wire won't correctly check checksums using our existing mechanism. To check file content we could instead simply implement a MapReduce job that streams through all the files in the FS. This would not take much code: nothing additional in the core. MapReduce should handle the locality, so that most data shouldn't go over the wire.
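As a rough sketch of such a job (written against the newer MapReduce API purely for illustration): a map-only job whose mappers read every split and discard the records, so the ordinary DFS read path verifies the checksums and a bad block surfaces as a task failure.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

// Sketch: a map-only job that reads every file in the FS and discards the
// data. Reading through the DFS client verifies checksums, and since map
// tasks are scheduled near their splits, most bytes stay local.
public class ChecksumScan {

  public static class DiscardMapper
      extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // No output: we only care that the bytes were read (and checksummed).
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "checksum-scan");
    job.setJarByClass(ChecksumScan.class);
    job.setMapperClass(DiscardMapper.class);
    job.setNumReduceTasks(0);                        // map-only
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(NullOutputFormat.class);
    FileInputFormat.setInputDirRecursive(job, true); // walk the whole namespace
    FileInputFormat.addInputPath(job, new Path(args.length > 0 ? args[0] : "/"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

For binary files a byte-oriented input format would be safer than TextInputFormat, but the effect is the same: every block gets read, mostly from a local replica.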

BTW, blocks not used by any file are not known to the namenode, are they? When they're reported by a datanode, the datanode is told to remove them.


DFSck - fsck-like utility for checking DFS volumes
--------------------------------------------------

         Key: HADOOP-101
         URL: http://issues.apache.org/jira/browse/HADOOP-101
     Project: Hadoop
        Type: New Feature
  Components: dfs
    Versions: 0.2
    Reporter: Andrzej Bialecki
    Assignee: Andrzej Bialecki
 Attachments: DFSck.java

This is a utility to check the health status of a DFS volume and collect some additional statistics.


