[ 
https://issues.apache.org/jira/browse/HDFS-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3574:
------------------------------

    Attachment: hdfs-3574.txt

This patch fixes the above race as follows:

- After setting the headers, we check again that the file exists. If it 
doesn't exist at that point, we throw a FileNotFoundException (FNFE) before 
opening the response output stream, so an error status can still be sent. We 
pass the already-opened file stream (from before the exists check) into 
{{getFileServer(...)}} so that we don't have a time-of-check-to-time-of-use 
(TOCTOU) bug here.
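
For readers without the attachment, the ordering described above can be sketched roughly as follows. This is an illustration only, not the patch itself: {{ServeFileSketch}}, {{FakeResponse}} and {{serveFile}} are hypothetical names, and {{FakeResponse}} merely stands in for the servlet response so that the example runs without a servlet container.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the open-first, re-check, then stream ordering. */
class ServeFileSketch {

  /** Assumed stand-in for a servlet response; records whether the body was opened. */
  static class FakeResponse {
    final Map<String, String> headers = new LinkedHashMap<>();
    final ByteArrayOutputStream body = new ByteArrayOutputStream();
    boolean committed = false;

    void setHeader(String name, String value) {
      headers.put(name, value);
    }

    OutputStream getOutputStream() {
      committed = true; // models "headers have been sent"
      return body;
    }
  }

  static void serveFile(File file, FakeResponse response) throws IOException {
    // Open the file first: this throws FileNotFoundException if it is already gone.
    try (FileInputStream fis = new FileInputStream(file)) {
      response.setHeader("Content-Length", String.valueOf(file.length()));

      // Re-check after setting headers, while an error can still be reported.
      if (!file.exists()) {
        throw new FileNotFoundException(file.toString());
      }

      // Only now open the response body, and copy from the already-opened stream.
      OutputStream out = response.getOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = fis.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("fsimage", ".tmp");
    Files.write(f.toPath(), new byte[] {1, 2, 3});

    FakeResponse ok = new FakeResponse();
    serveFile(f, ok);
    System.out.println("Content-Length=" + ok.headers.get("Content-Length")
        + " bodyBytes=" + ok.body.size());

    if (!f.delete()) {
      throw new IOException("could not delete " + f);
    }
    FakeResponse missing = new FakeResponse();
    try {
      serveFile(f, missing);
    } catch (FileNotFoundException e) {
      System.out.println("missing file, committed=" + missing.committed);
    }
  }
}
```

Because the response body is opened only after both the file open and the exists check have succeeded, a missing file surfaces as an exception while the response is still uncommitted.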

I also did a little cleanup and made some stuff public for later use in 
HDFS-3077. I hope it's OK to do these trivial changes in this same JIRA. If 
it's a big problem I'll move them elsewhere.

Unfortunately I didn't write a unit test for this, as it's a somewhat difficult 
race.
                
> Fix small race and do some cleanup in GetImageServlet
> -----------------------------------------------------
>
>                 Key: HDFS-3574
>                 URL: https://issues.apache.org/jira/browse/HDFS-3574
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: hdfs-3574.txt
>
>
> There's a very small race window in GetImageServlet, if the following 
> interleaving occurs:
> - The Storage object returns some local file in the storage directory (e.g. an 
> edits file or image file)
> - *Race*: some other process removes the file
> - GetImageServlet calls file.length(), which returns 0 since the file no 
> longer exists. It thus faithfully sets the Content-Length header to 0
> - getFileServer() throws FileNotFoundException when trying to open the file. 
> But, since we call response.getOutputStream() before this, the headers have 
> already been sent, so we fail to send the "404" or "500" response that we 
> should.
> Thus, the client sees a Content-Length of 0 followed by 0 bytes of 
> content, and thinks it has successfully downloaded the target file, when in 
> fact it has downloaded an empty one.
> I saw this in practice during the "edits synchronization" phase of recovery 
> while working on HDFS-3077, though I believe it could also apply to existing 
> code paths.
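
The silent failure in the description hinges on {{java.io.File.length()}} returning 0 for a file that does not exist, which is indistinguishable from a genuinely empty file. A minimal sketch of that behaviour follows; {{MissingFileLengthSketch}} and its methods are hypothetical names, and the use of {{java.nio.file.Files.size}} for comparison is an illustration, not something the patch is stated to use.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical sketch: how a missing file looks through two length APIs. */
class MissingFileLengthSketch {

  /** java.io.File.length() reports 0 when the file does not exist. */
  static long legacyLength(File file) {
    return file.length();
  }

  /** Files.size() throws an IOException (typically NoSuchFileException) instead. */
  static long strictLength(File file) throws IOException {
    return Files.size(file.toPath());
  }

  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("edits", ".tmp");
    if (!f.delete()) {
      throw new IOException("could not delete " + f);
    }

    // The file is gone, yet the legacy call still returns a plausible value.
    System.out.println("legacyLength=" + legacyLength(f));

    try {
      strictLength(f);
    } catch (IOException e) {
      System.out.println("strictLength failed: " + e.getClass().getSimpleName());
    }
  }
}
```

A Content-Length header computed from the first call therefore cannot, on its own, tell the client whether the file was empty or had been removed.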


        
