[
https://issues.apache.org/jira/browse/HADOOP-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548478
]
rangadi edited comment on HADOOP-2012 at 12/4/07 4:31 PM:
---------------------------------------------------------------
I am preparing a patch for this Jira. I am thinking of disabling this currently
for Windows. This is because, as part of this feature, we want to modify the
block metadata file. Until now, once a file is written on the Datanode, it is
never modified. This issue needs to be fixed for appends anyway, I think. The
reasons why this is a pain on Windows:
# A file may not be opened for writing if some other thread is reading from it
(I need to verify this).
# {{file.renameTo(existingFile)}} is not allowed. This is simpler to handle.
# {{fileOpenForRead.renameTo(newFile)}} is not allowed. This is harder to fix,
since the Datanode does not keep track of which files are being read, etc. To
replace a file, we need to wait till all the readers are done (see the sketch
after this list).
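To illustrate points 2 and 3, here is a minimal sketch of the replace-via-rename pattern the metadata update would rely on; the {{replaceFile}} helper, its names, and its error handling are hypothetical, not the attached patch:

{code:java}
// Sketch only (not part of the patch): replace an existing file via rename,
// with a delete-then-rename fallback for Windows.
import java.io.File;
import java.io.IOException;

public class ReplaceFileSketch {
  static void replaceFile(File tmp, File target) throws IOException {
    // On POSIX, renameTo() atomically replaces an existing target.
    if (tmp.renameTo(target)) {
      return;
    }
    // On Windows, rename over an existing file fails (point 2 above), so
    // fall back to delete-then-rename. The delete itself still fails while
    // a reader holds the target open (points 1 and 3 above).
    if (!target.delete()) {
      throw new IOException("Could not delete " + target);
    }
    if (!tmp.renameTo(target)) {
      throw new IOException("Could not rename " + tmp + " to " + target);
    }
  }
}
{code}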
> Periodic verification at the Datanode
> -------------------------------------
>
> Key: HADOOP-2012
> URL: https://issues.apache.org/jira/browse/HADOOP-2012
> Project: Hadoop
> Issue Type: New Feature
> Components: dfs
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Fix For: 0.16.0
>
> Attachments: HADOOP-2012.patch, HADOOP-2012.patch, HADOOP-2012.patch,
> HADOOP-2012.patch
>
>
> Currently, on-disk data corruption in data blocks is detected only when a block
> is read by the client or by another datanode. These errors would be detected
> much earlier if the datanode periodically verified the data checksums for its
> local blocks.
> Some of the issues to consider:
> - How often should we check the blocks (no more often than once every couple of
> weeks?)
> - How do we keep track of when a block was last verified (there is a .meta
> file associated with each block).
> - What action to take once a corruption is detected.
> - Scanning should be done at a very low priority, with the rest of the datanode
> disk traffic in mind.
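For illustration, a minimal sketch of what the per-block verification step could look like, assuming the .meta file stores one 4-byte CRC32 per fixed-size chunk of block data; the class name, chunk size, and on-disk layout here are assumptions, not the attached patch:

{code:java}
// Hypothetical sketch of verifying one block against its .meta checksums.
import java.io.*;
import java.util.zip.CRC32;

public class BlockVerifySketch {
  static final int BYTES_PER_CHUNK = 512;   // assumed checksum chunk size

  static boolean verifyBlock(File blockFile, File metaFile) throws IOException {
    try (DataInputStream meta = new DataInputStream(
             new BufferedInputStream(new FileInputStream(metaFile)));
         InputStream data =
             new BufferedInputStream(new FileInputStream(blockFile))) {
      byte[] chunk = new byte[BYTES_PER_CHUNK];
      int n;
      // A real scanner would throttle these reads so verification stays at
      // very low priority relative to regular datanode disk traffic.
      while ((n = readFully(data, chunk)) > 0) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, n);
        long expected = meta.readInt() & 0xffffffffL;  // stored checksum
        if (crc.getValue() != expected) {
          return false;  // corruption detected; caller decides the action
        }
      }
      return true;
    }
  }

  // Read as many bytes as possible into buf; returns the count (0 at EOF).
  private static int readFully(InputStream in, byte[] buf) throws IOException {
    int total = 0;
    while (total < buf.length) {
      int n = in.read(buf, total, buf.length - total);
      if (n < 0) break;
      total += n;
    }
    return total;
  }
}
{code}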