[
https://issues.apache.org/jira/browse/HADOOP-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589629#action_12589629
]
Raghu Angadi commented on HADOOP-3124:
--------------------------------------
Regarding unit tests: this patch essentially just makes a constant configurable.
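Just to illustrate what "making the constant configurable" could look like (a minimal sketch using the standard Hadoop Configuration API; the key name and default below are placeholders, not necessarily what the attached patch uses):
{code}
import org.apache.hadoop.conf.Configuration;

public class WriteTimeoutConfig {
  // Placeholder default: 8 minutes, in milliseconds.
  static final int DEFAULT_WRITE_TIMEOUT = 8 * 60 * 1000;

  // Read the datanode write timeout from the configuration instead of
  // relying on a hard-coded constant.
  public static int getWriteTimeout(Configuration conf) {
    return conf.getInt("dfs.datanode.socket.write.timeout",
                       DEFAULT_WRITE_TIMEOUT);
  }
}
{code}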
Regarding what the default should be (and why I think 2 min is certainly too low):
My understanding of what this timeout is for:
- Only to catch rare exceptions (like some bugs, hardware failures, kernel
hangs, etc.).
- Should be long enough that writes don't fail just because a node is currently
loaded.
What this is *not* for:
- To improve performance.
- To reduce long tail because of slow nodes.
-- This needs to be handled at a different level (e.g. NameNode not scheduling
so many blocks on such nodes, speculative execution in M/R)
- Unlike the M/R level or the application level, DFS does not know whether the data
it is being asked to write can easily be regenerated by another task or can simply
be discarded. So it should try its best to write to the requested number of replicas.
If you define this timeout differently, then it is quite possible that a much
smaller value is ok. Please suggest a different value (preferably along with
the redefinition).
8 min may not be the right value either. Even on 4-disk nodes, just running the
'generateData' stage of gridmix, we have seen that 2 min is not enough. On a
heavily loaded cluster running multiple jobs on 2-disk machines, the required
timeout might be much larger. That's why making this configurable helps.
One change we could make is to use different write-timeout values for data
written to DFS and for data read from DFS (the DataNode-to-DFSClient write).
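A rough sketch of that idea (both key names and defaults are hypothetical, just to show the shape of the change):
{code}
import org.apache.hadoop.conf.Configuration;

public class DataNodeTimeouts {
  // Timeout used while data is written into the DFS pipeline.
  public static int pipelineWriteTimeout(Configuration conf) {
    return conf.getInt("dfs.datanode.pipeline.write.timeout", 8 * 60 * 1000);
  }

  // Timeout used while a DataNode writes data back to a reading DFSClient;
  // this could reasonably be much smaller than the pipeline timeout.
  public static int clientTransferWriteTimeout(Configuration conf) {
    return conf.getInt("dfs.datanode.client.write.timeout", 2 * 60 * 1000);
  }
}
{code}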
> DFS data node should not use hard coded 10 minutes as write timeout.
> --------------------------------------------------------------------
>
> Key: HADOOP-3124
> URL: https://issues.apache.org/jira/browse/HADOOP-3124
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.17.0
> Reporter: Runping Qi
> Assignee: Raghu Angadi
> Fix For: 0.18.0
>
> Attachments: HADOOP-3124.patch, HADOOP-3124.patch
>
>
> This problem happens in 0.17 trunk.
> I saw reducers wait 10 minutes while writing data to DFS and then time out.
> The client retried and timed out again after another 19 minutes.
> After looking into the code, it seems that the DFS data node uses 10 minutes
> as the timeout for writing data into the data node pipeline.
> I think we have three issues:
> 1. The 10-minute timeout value is too big for writing a chunk of data (64K)
> through the data node pipeline.
> 2. The timeout value should not be hard-coded.
> 3. Different datanodes in a pipeline should use different timeout values for
> writing downstream.
> A reasonable value may be (20 secs * numOfDataNodesInTheDownStreamPipe).
> For example, if the replication factor is 3, the client uses 60 secs, the
> first datanode uses 40 secs, and the second datanode uses 20 secs.
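The staggered scheme suggested in the description above could be computed along these lines (a sketch only; the 20-second unit and the example values come from the description, while the class and method names are made up):
{code}
public class PipelineWriteTimeouts {
  // 20 seconds per datanode remaining downstream in the pipeline.
  static final int TIMEOUT_PER_DOWNSTREAM_NODE_MS = 20 * 1000;

  // numDownstreamNodes: datanodes still downstream of this writer.
  public static int writeTimeout(int numDownstreamNodes) {
    return TIMEOUT_PER_DOWNSTREAM_NODE_MS * numDownstreamNodes;
  }

  public static void main(String[] args) {
    // Replication factor 3: client 60 secs, first datanode 40 secs,
    // second datanode 20 secs.
    System.out.println(writeTimeout(3)); // 60000 ms
    System.out.println(writeTimeout(2)); // 40000 ms
    System.out.println(writeTimeout(1)); // 20000 ms
  }
}
{code}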