[ 
https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193466#comment-13193466
 ] 

Aaron T. Myers commented on HDFS-2759:
--------------------------------------

Sorry, I should have mentioned what I did for testing. No tests are included 
since writing a test seems like it would require adding a test hook in the 
EditLogFileOutputStream#flushAndSync method, which seems undesirable since that 
method is highly performance-sensitive. I verified that the bug exists by 
adding a "System.exit(0)" after pre-allocation, but before calling 
FileChannel#force. This indeed resulted in a file full of FF, which if an NN 
were to try to read would be interpreted as an invalid header version number. 
By moving the pre-allocation after the call to FileChannel#force, we guarantee 
that the valid data will hit the edit log before writing FFs for the purpose of 
pre-allocation.

In the course of preparing and testing this patch, I think I've discovered 
another potential issue, though. Note that in the call to FileChannel#force, we 
pass "false" which indicates that FileChannel#force does not need to wait for 
metadata of the file to be synced to disk, only the data. I'm not sure of the 
precise semantics of not syncing metadata. In particular, is the file length 
included? If not, I think this has the potential to cause some edits to not 
actually be read back from disk after a crash.

The explanation, per the comment, is that syncing metadata is unnecessary 
because of pre-allocation. I don't think that's reasonable, though, since 
EditLogFileOutputStream#preallocate doesn't call sync itself, which means that 
the file length might never get updated upon returning from 
EditLogFileOutputStream#flushAndSync.

Do people agree this is a potential problem? If so, I can either fix it here, 
or file a new JIRA.
                
> Pre-allocate HDFS edit log files after writing version number
> -------------------------------------------------------------
>
>                 Key: HDFS-2759
>                 URL: https://issues.apache.org/jira/browse/HDFS-2759
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, name-node
>    Affects Versions: 0.24.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-2759.patch
>
>
> In HDFS-2709 it was discovered that there's a potential race wherein edits 
> log files are pre-allocated before the version number is written into the 
> header of the file. This can cause the NameNode to read an invalid HDFS 
> layout version, and hence fail to read the edit log file. We should write the 
> header, then pre-allocate the rest of the file after this point.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to