[
https://issues.apache.org/jira/browse/HADOOP-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548478
]
rangadi edited comment on HADOOP-2012 at 12/4/07 4:31 PM:
---------------------------------------------------------------
I am preparing a patch for this Jira. I am thinking of disabling this currently
for Windows. This is because, as part of this feature, we want to modify the
block metadata file. Until now, once a file is written on the Datanode, it is
never modified. This issue needs to be fixed for appends anyway, I think. The
reasons why this is a pain on Windows:
# A file may not be opened for writing if some other thread is reading from it
(I need to verify this).
# {{file.renameTo(existingFile)}} is not allowed. This is simpler to handle.
# {{fileOpenForRead.renameTo(newFile)}} is not allowed. This is harder to fix,
since the Datanode does not keep track of which files are being read, etc. To
replace a file, we need to wait till all the readers are done (see the sketch
after this list).
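To illustrate points 2 and 3, here is a minimal sketch of the replace-via-rename pattern the metadata update would rely on; the {{replaceFile}} helper, its names, and its error handling are hypothetical, not the attached patch:

{code:java}
// Sketch only (not part of the patch): replace an existing file via rename,
// with a delete-then-rename fallback for Windows.
import java.io.File;
import java.io.IOException;

public class ReplaceFileSketch {
  static void replaceFile(File tmp, File target) throws IOException {
    // On POSIX, renameTo() atomically replaces an existing target.
    if (tmp.renameTo(target)) {
      return;
    }
    // On Windows, rename over an existing file fails (point 2 above), so
    // fall back to delete-then-rename. The delete itself still fails while
    // a reader holds the target open (points 1 and 3 above).
    if (!target.delete()) {
      throw new IOException("Could not delete " + target);
    }
    if (!tmp.renameTo(target)) {
      throw new IOException("Could not rename " + tmp + " to " + target);
    }
  }
}
{code}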
> Periodic verification at the Datanode
> -------------------------------------
>
> Key: HADOOP-2012
> URL: https://issues.apache.org/jira/browse/HADOOP-2012
> Project: Hadoop
> Issue Type: New Feature
> Components: dfs
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Fix For: 0.16.0
>
> Attachments: HADOOP-2012.patch, HADOOP-2012.patch, HADOOP-2012.patch,
> HADOOP-2012.patch
>
>
> Currently, on-disk data corruption in data blocks is detected only when a block
> is read by the client or by another datanode. These errors would be detected
> much earlier if the datanode periodically verified the data checksums for its
> local blocks.
> Some of the issues to consider:
> - How often should we check the blocks (no more often than once every couple of
> weeks?)
> - How do we keep track of when a block was last verified (there is a .meta
> file associated with each block).
> - What action to take once a corruption is detected.
> - Scanning should be done at a very low priority, with the rest of the datanode
> disk traffic in mind.
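For illustration, a minimal sketch of what the per-block verification step could look like, assuming the .meta file stores one 4-byte CRC32 per fixed-size chunk of block data; the class name, chunk size, and on-disk layout here are assumptions, not the attached patch:

{code:java}
// Hypothetical sketch of verifying one block against its .meta checksums.
import java.io.*;
import java.util.zip.CRC32;

public class BlockVerifySketch {
  static final int BYTES_PER_CHUNK = 512;   // assumed checksum chunk size

  static boolean verifyBlock(File blockFile, File metaFile) throws IOException {
    try (DataInputStream meta = new DataInputStream(
             new BufferedInputStream(new FileInputStream(metaFile)));
         InputStream data =
             new BufferedInputStream(new FileInputStream(blockFile))) {
      byte[] chunk = new byte[BYTES_PER_CHUNK];
      int n;
      // A real scanner would throttle these reads so verification stays at
      // very low priority relative to regular datanode disk traffic.
      while ((n = readFully(data, chunk)) > 0) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, n);
        long expected = meta.readInt() & 0xffffffffL;  // stored checksum
        if (crc.getValue() != expected) {
          return false;  // corruption detected; caller decides the action
        }
      }
      return true;
    }
  }

  // Read as many bytes as possible into buf; returns the count (0 at EOF).
  private static int readFully(InputStream in, byte[] buf) throws IOException {
    int total = 0;
    while (total < buf.length) {
      int n = in.read(buf, total, buf.length - total);
      if (n < 0) break;
      total += n;
    }
    return total;
  }
}
{code}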