[ https://issues.apache.org/jira/browse/HDFS-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3574:
------------------------------

    Description: 
There's a very small race window in GetImageServlet, if the following 
interleaving occurs:
- The Storage object returns some local file in the storage directory (e.g. an 
edits file or image file)
- *Race*: some other process removes the file
- GetImageServlet calls file.length(), which returns 0 since the file no longer 
exists. It thus faithfully sets the Content-Length header to 0
- getFileClient() throws FileNotFoundException when trying to open the file. 
But, since we call response.getOutputStream() before this, the headers have 
already been sent, so we fail to send the "404" or "500" response that we 
should.

Thus, the client sees a Content-Length of 0 followed by 0 bytes of content, 
and thinks it has successfully downloaded the target file, when in fact it 
has downloaded an empty one.
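
The two Java behaviours this interleaving depends on can be checked in 
isolation. The following is a minimal sketch, not code from GetImageServlet; 
the class and method names (MissingFileDemo, createThenDelete, 
openThrowsFileNotFound) are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Illustrative only: hypothetical helper class, not part of HDFS.
final class MissingFileDemo {

    /** Creates a 3-byte temp file, deletes it, and returns the stale File object. */
    static File createThenDelete() {
        try {
            File f = File.createTempFile("edits_demo", ".tmp");
            Files.write(f.toPath(), new byte[] {1, 2, 3});
            // Stands in for the "other process" that removes the file.
            Files.delete(f.toPath());
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if opening the file for reading fails with FileNotFoundException. */
    static boolean openThrowsFileNotFound(File f) {
        try (FileInputStream in = new FileInputStream(f)) {
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

File.length() is documented to return 0L for a file that does not exist, so 
the length check alone cannot tell a missing file from an empty one; only the 
later open fails.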

I saw this in practice during the "edits synchronization" phase of recovery 
while working on HDFS-3077, though I believe it could also apply to existing 
code paths.
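
One possible way to close the window, sketched under assumptions rather than 
taken from any patch: open the file first, and only touch the response once 
the open has succeeded. The Response interface and RecordingResponse class 
below are hypothetical stand-ins for the servlet response so that the example 
runs without the servlet API; serveFile is likewise an illustrative name.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Illustrative only: hypothetical names, not the GetImageServlet implementation.
final class ServeFileSketch {

    /** Hypothetical stand-in for the servlet response (NOT the real servlet API). */
    interface Response {
        void sendError(int status, String message);
        void setContentLength(long length);
        OutputStream getOutputStream();
    }

    /** In-memory Response that records what the server side did. */
    static final class RecordingResponse implements Response {
        int status = 200;
        long contentLength = -1;
        final ByteArrayOutputStream body = new ByteArrayOutputStream();
        public void sendError(int s, String message) { status = s; }
        public void setContentLength(long length) { contentLength = length; }
        public OutputStream getOutputStream() { return body; }
    }

    static void serveFile(File file, Response response) {
        FileInputStream opened;
        try {
            // Open BEFORE any header is set, so a missing file can still
            // be reported with an error status.
            opened = new FileInputStream(file);
        } catch (FileNotFoundException e) {
            response.sendError(404, "File not found: " + file.getName());
            return;
        }
        try (FileInputStream in = opened) {
            // Take the length from the open handle, not from the path.
            response.setContentLength(in.getChannel().size());
            OutputStream out = response.getOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this ordering a file that disappears before the open produces an error 
status instead of an empty "successful" body. It does not cover a file that 
is truncated while it is being streamed.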

  was:
There's a very small race window in GetImageServlet, if the following 
interleaving occurs:
- The Storage object returns some local file in the storage directory (eg an 
edits file or image file)
- *Race*: some other process removes the file
- GetImageServlet calls file.length() which returns 0, since it doesn't exist. 
It thus faithfully sets the Content-Length header to 0
- getFileClient() throws FileNotFoundException when trying to open the file. 
But, since we call response.getOutputStream() before this, the headers have 
already been sent, so we fail to send the "404" or "500" response that we 
should.

Thus, the client sees a 0-length Content-Length followed by 0 lengths of 
content, and thinks it successfully has downloaded the target file, where in 
fact it downloads an empty one.

I have filed this as a subtask of HDFS-3077 since I saw it only in practice 
during the "edits synchronization" phase of recovery during that work. Though 
it could apply on existing code paths, as well, I believe.

    
> Fix small race and do some cleanup in GetImageServlet
> -----------------------------------------------------
>
>                 Key: HDFS-3574
>                 URL: https://issues.apache.org/jira/browse/HDFS-3574
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>
> There's a very small race window in GetImageServlet, if the following 
> interleaving occurs:
> - The Storage object returns some local file in the storage directory (e.g. an 
> edits file or image file)
> - *Race*: some other process removes the file
> - GetImageServlet calls file.length(), which returns 0 since the file no 
> longer exists. It thus faithfully sets the Content-Length header to 0
> - getFileClient() throws FileNotFoundException when trying to open the file. 
> But, since we call response.getOutputStream() before this, the headers have 
> already been sent, so we fail to send the "404" or "500" response that we 
> should.
>
> Thus, the client sees a Content-Length of 0 followed by 0 bytes of content, 
> and thinks it has successfully downloaded the target file, when in fact it 
> has downloaded an empty one.
>
> I saw this in practice during the "edits synchronization" phase of recovery 
> while working on HDFS-3077, though I believe it could also apply to existing 
> code paths.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
