[ https://issues.apache.org/jira/browse/HDFS-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402549#comment-13402549 ]
Colin Patrick McCabe commented on HDFS-3574:
--------------------------------------------
It seems like there are still some TOCTOU (time-of-check-to-time-of-use)
races here, since we're getting the length of the file and then reading the
file afterwards. What if the length of the file changes in between? Maybe we
don't care about this case, but perhaps there should be a comment noting that
this race exists.
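One way to narrow the length race is to open the file first and take the length from the open handle, then send exactly that many bytes. The following is only a sketch under my own assumptions: OpenThenMeasure and send() are hypothetical names, not anything in the patch, and the OutputStream stands in for the servlet response body.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the HDFS patch.
final class OpenThenMeasure {
    /**
     * Opens the file, reads its size from the open channel, and copies
     * exactly that many bytes. Fails loudly if the file shrinks mid-copy.
     */
    static long send(Path file, OutputStream out) throws IOException {
        // Throws NoSuchFileException here if the file is already gone,
        // before any length has been announced to the client.
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long declared = ch.size(); // the value that would go into Content-Length
            ByteBuffer buf = ByteBuffer.allocate(8192);
            long sent = 0;
            while (sent < declared) {
                buf.clear();
                buf.limit((int) Math.min(buf.capacity(), declared - sent));
                int n = ch.read(buf);
                if (n < 0) {
                    throw new IOException("file shrank: declared " + declared
                        + " bytes but only " + sent + " were available");
                }
                out.write(buf.array(), 0, n);
                sent += n;
            }
            return sent;
        }
    }
}
```

This does not remove the race entirely: a file truncated after the size is read still causes a failure partway through the transfer. It does make that failure visible rather than silent, and bytes appended after the size is read are never sent.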
I also don't completely understand the logic behind this:
{code}
if (!imageFile.exists()) {
  // Potential race where the file was deleted while we were in the
  // process of setting headers!
  throw new FileNotFoundException();
}
// send fsImage
TransferFsImage.getFileServer(response, imageFile, fis, getThrottler(conf));
{code}
Can't the imageFile be deleted immediately after the imageFile.exists() check
and before the call to TransferFsImage#getFileServer? So the original TOCTOU
isn't really fixed either.
It seems like the only real fix would be creating a temporary hard link to
the file, so that if a user deleted the original file we were concerned with,
the data would not be lost. Alternatively, we could buffer the file in memory
ahead of time, so that we know the length of what we're going to send and it
can't disappear. But that probably isn't feasible for large files.
> Fix small race and do some cleanup in GetImageServlet
> -----------------------------------------------------
>
> Key: HDFS-3574
> URL: https://issues.apache.org/jira/browse/HDFS-3574
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 3.0.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Minor
> Attachments: hdfs-3574.txt
>
>
> There's a very small race window in GetImageServlet, if the following
> interleaving occurs:
> - The Storage object returns some local file in the storage directory (eg an
> edits file or image file)
> - *Race*: some other process removes the file
> - GetImageServlet calls file.length() which returns 0, since it doesn't
> exist. It thus faithfully sets the Content-Length header to 0
> - getFileClient() throws FileNotFoundException when trying to open the file.
> But, since we call response.getOutputStream() before this, the headers have
> already been sent, so we fail to send the "404" or "500" response that we
> should.
> Thus, the client sees a Content-Length of 0 followed by 0 bytes of content,
> and thinks it has successfully downloaded the target file, when in fact it
> has downloaded an empty one.
> I saw this in practice during the "edits synchronization" phase of recovery
> while working on HDFS-3077, though it could apply on existing code paths, as
> well, I believe.
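The behavior described above, where the length of a missing file is reported as 0 rather than as an error, is easy to reproduce in isolation. A small sketch follows, with MissingFileLength as a hypothetical class name; it contrasts java.io.File.length() with java.nio.file.Files.size(), which throws instead.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class, not part of the HDFS patch.
final class MissingFileLength {
    /** File.length() returns 0L when the file does not exist. */
    static long legacyLength(File f) {
        return f.length();
    }

    /** Files.size() throws (NoSuchFileException) when the file does not exist. */
    static long strictLength(File f) throws IOException {
        return Files.size(f.toPath());
    }
}
```

Using the throwing variant when computing Content-Length would at least turn the silent 0 into an exception that can be mapped to an error response before the headers go out.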