[ https://issues.apache.org/jira/browse/HADOOP-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704010#action_12704010 ]
dhruba borthakur commented on HADOOP-2757:
------------------------------------------

You are referring to dfs.datanode.socket.write.timeout. These are configurable parameters, and I have already set them to an appropriate value, e.g. 20 seconds, because I want real-time-ish behaviour. If all the datanodes in the pipeline die, the client detects an error and aborts; that is the intended behaviour. If a datanode is not actually dead but merely hangs, the client will hang too. This patch does not fix that problem.

The main motivation for this patch is to detect namenode failures early. If a client is writing to a block, it might take a while for the block to fill up; that time depends on the rate at which the client is writing data. If the client is only trickling data into the block, it will not hit the dfs.datanode.socket.write.timeout for a while. In the existing code in trunk, the lease recovery thread will eventually detect the NN problem, but it does nothing to terminate the threads that were writing to the block. The patch does this.

> Should DFS outputstream's close wait forever?
> ---------------------------------------------
>
>                 Key: HADOOP-2757
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2757
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>         Attachments: softMount1.patch, softMount1.patch, softMount2.patch
>
>
> Currently {{DFSOutputStream.close()}} waits forever if the Namenode keeps throwing {{NotYetReplicated}} exceptions, for whatever reason. It's pretty annoying for a user. Should the loop inside close have a timeout? If so, how much? It could probably be something like 10 minutes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
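
To make the 20-second write timeout mentioned above concrete, here is a minimal sketch of how a client could set dfs.datanode.socket.write.timeout through the Configuration API, assuming the value is interpreted in milliseconds; the file path and the exact value are illustrative only, not part of the patch.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShortWriteTimeoutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed to be in milliseconds; 20000 gives the 20-second,
    // real-time-ish behaviour described in the comment above.
    conf.setInt("dfs.datanode.socket.write.timeout", 20000);

    // Writes made through this FileSystem instance pick up the shorter timeout.
    FileSystem fs = FileSystem.get(conf);
    fs.create(new Path("/tmp/example")).close();
  }
}
{code}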
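
On the question in the description of whether close() should wait forever: below is a hedged sketch of a bounded completeFile retry loop. The tryCompleteFile() helper and the 400 ms back-off are hypothetical stand-ins, not the actual DFSOutputStream code.

{code:java}
import java.io.IOException;

/** Sketch of a bounded close() wait; not the actual DFSOutputStream code. */
public class BoundedCloseSketch {

  // Hypothetical stand-in for the namenode completeFile() RPC, which returns
  // false while the last block is not yet replicated.
  private boolean tryCompleteFile() {
    return true;
  }

  public void close(long closeTimeoutMillis) throws IOException {
    long deadline = System.currentTimeMillis() + closeTimeoutMillis;
    boolean fileComplete = false;
    while (!fileComplete) {
      fileComplete = tryCompleteFile();
      if (!fileComplete) {
        if (System.currentTimeMillis() > deadline) {
          // Instead of looping forever, surface the problem to the caller.
          throw new IOException("Could not complete file within "
              + closeTimeoutMillis + " ms; namenode still reports the last "
              + "block as not yet replicated");
        }
        try {
          Thread.sleep(400); // brief pause between completeFile retries
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while waiting to close file");
        }
      }
    }
  }
}
{code}

Calling close(10 * 60 * 1000) would correspond to the 10-minute suggestion in the description.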