[jira] [Commented] (HDFS-4301) 2NN image transfer timeout problematic

Andrew Wang (JIRA) Wed, 10 Apr 2013 15:55:20 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628393#comment-13628393
 ]


Andrew Wang commented on HDFS-4301:
-----------------------------------

I poked at this some, manually setting an artificially low timeout and transfer 
bandwidth. The behavior I see is the SNN successfully fetching big edits from 
the NN (indicating that this is in fact a socket timeout rather than a timeout 
on the whole transfer) but timing out in the {{putimage}} back to the NN. This 
is because of the HTTP GET nested inside the {{putimage}} GET handler: the 
{{putimage}} request sees no data until the nested {{getimage}} finishes, so we 
hit the timeout whenever that nested transfer takes longer than the timeout.

The best fix here is to do away with this nested GET business and use an HTTP 
POST or PUT instead. I'll look into doing this more involved fix, since there 
are already workarounds in the meantime.
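
For reference, a rough sketch of what the PUT-based direction could look like on 
the wire. This is not a patch and not the HDFS servlet code: it uses plain JDK 
classes, and the class name, the {{/imagetransfer}} path, the chunk size and the 
timeouts are assumptions chosen for illustration. The sender streams the image 
bytes in the request body, so data moves on the same connection the timeout is 
watching instead of waiting on a second, nested request.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch class; none of these names come from HDFS.
final class PutImageSketch {

    /** Streams the given bytes to url with an HTTP PUT and returns the status code. */
    static int putBytes(URL url, byte[] image, int timeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setConnectTimeout(timeoutMs);
        conn.setReadTimeout(timeoutMs);
        // Stream in chunks rather than buffering the whole body in memory.
        conn.setChunkedStreamingMode(64 * 1024);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(image);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Starts a local receiver that counts the bytes in the PUT body, uploads
     * the image to it, and returns the number of bytes the receiver saw.
     */
    static long uploadToLocalServer(byte[] image) {
        AtomicLong received = new AtomicLong();
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/imagetransfer", exchange -> {
                long n = 0;
                try (InputStream in = exchange.getRequestBody()) {
                    byte[] buf = new byte[8192];
                    int r;
                    while ((r = in.read(buf)) != -1) {
                        n += r;
                    }
                }
                received.set(n);
                exchange.sendResponseHeaders(200, -1); // -1: no response body
                exchange.close();
            });
            server.start();

            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/imagetransfer").toURL();
            int status = putBytes(url, image, 5000);
            if (status != 200) {
                throw new IOException("unexpected status " + status);
            }
            return received.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        long n = uploadToLocalServer(new byte[200_000]);
        System.out.println("receiver saw " + n + " bytes");
    }
}
```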
                
> 2NN image transfer timeout problematic
> --------------------------------------
>
>                 Key: HDFS-4301
>                 URL: https://issues.apache.org/jira/browse/HDFS-4301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0, 2.0.3-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wang
>            Priority: Critical
>
> HDFS-1490 added a timeout on image transfer. But, it seems like the timeout 
> is actually applying to the entirety of the image transfer operation. So, if 
> the image or edits are large (multiple GB) or the transfer is heavily 
> throttled, it is likely to time out repeatedly.

