[ https://issues.apache.org/jira/browse/HDFS-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529254#comment-13529254 ]
Todd Lipcon commented on HDFS-4300:
-----------------------------------
We don't currently have an md5 file for the edits. We do detect that the
transfer failed, but we write straight to the final path instead of to a
temporary file. If we just write to a .tmp file and do the fsync/rename dance
on successful transfer, we should be able to avoid the issue.
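
For illustration, the fsync/rename approach could look roughly like the sketch
below. This is not the actual TransferFsImage code: the class name
TmpFileDownload, the method downloadAtomically, and the expectedLength
parameter (standing in for however the transfer failure is detected) are all
hypothetical, and the sketch does not fsync the parent directory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not part of Hadoop. */
public class TmpFileDownload {

    /**
     * Copies the stream to a sibling ".tmp" file, fsyncs it, and only then
     * renames it to finalPath. If the transfer is short or fails, the .tmp
     * file is removed and finalPath is never created.
     */
    public static void downloadAtomically(InputStream in, Path finalPath,
                                          long expectedLength) throws IOException {
        Path tmp = finalPath.resolveSibling(finalPath.getFileName() + ".tmp");
        boolean success = false;
        try {
            long written = 0;
            try (FileChannel out = FileChannel.open(tmp,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE)) {
                byte[] buf = new byte[64 * 1024];
                int n;
                while ((n = in.read(buf)) != -1) {
                    ByteBuffer bb = ByteBuffer.wrap(buf, 0, n);
                    while (bb.hasRemaining()) {
                        out.write(bb);
                    }
                    written += n;
                }
                // fsync: flush data and metadata before the file becomes visible
                out.force(true);
            }
            // Assumption: the caller knows the expected size of the transfer.
            if (written != expectedLength) {
                throw new IOException("Truncated transfer: got " + written
                        + " bytes, expected " + expectedLength);
            }
            // Rename into place; readers see either no file or a complete one.
            Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);
            success = true;
        } finally {
            if (!success) {
                Files.deleteIfExists(tmp);
            }
        }
    }
}
```

With this shape, a failed transfer leaves nothing at the finalized path, so a
later attempt can tell the file is still missing and download it again.
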
> TransferFsImage.downloadEditsToStorage should use a tmp file for destination
> ----------------------------------------------------------------------------
>
> Key: HDFS-4300
> URL: https://issues.apache.org/jira/browse/HDFS-4300
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.2-alpha
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
>
> Currently, in TransferFsImage.downloadEditsToStorage, we download the edits
> file directly to its finalized path. So, if the transfer fails in the middle,
> a half-written file is left and cannot be distinguished from a correct file.
> So, future checkpoints by the 2NN will fail, since the file is truncated in
> the middle -- but it won't ever download a good copy because it thinks it
> already has the proper file.