[ 
https://issues.apache.org/jira/browse/HADOOP-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117140#comment-13117140
 ] 

Steve Loughran commented on HADOOP-6726:
----------------------------------------

What kind of retry/backoff should go in the client? Or should it just bail out 
and let callers write their own?

One thing that may make sense in the client is a bit of jitter, to handle the 
"4000 boxes reboot simultaneously" problem; exponential backoff + jitter may be 
even better. Again, these could be options.
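
To make the jitter idea concrete, here is a minimal sketch of capped 
exponential backoff with "full" jitter (each delay is drawn uniformly between 
0 and the current ceiling). This is not Hadoop code: the RetryBackoff class, 
its method names, and the 100 ms / 2 s values mentioned below are all 
hypothetical, picked only for illustration.

```java
import java.util.Random;

/** Hypothetical helper; not part of org.apache.hadoop.ipc.Client. */
final class RetryBackoff {
    private final long baseDelayMs;
    private final long maxDelayMs;
    private final Random random;

    RetryBackoff(long baseDelayMs, long maxDelayMs, Random random) {
        if (baseDelayMs <= 0 || maxDelayMs < baseDelayMs) {
            throw new IllegalArgumentException("need 0 < baseDelayMs <= maxDelayMs");
        }
        this.baseDelayMs = baseDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.random = random;
    }

    /** Ceiling for the given 0-based attempt: base * 2^attempt, capped at max. */
    long ceilingMs(int attempt) {
        long delay = baseDelayMs;
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            // Compare against max/2 first so the doubling can never overflow.
            delay = (delay > maxDelayMs / 2) ? maxDelayMs : delay * 2;
        }
        return delay;
    }

    /** Full jitter: a uniformly random delay in [0, ceilingMs(attempt)]. */
    long nextDelayMs(int attempt) {
        long ceiling = ceilingMs(attempt);
        long jittered = (long) (random.nextDouble() * (ceiling + 1.0));
        return Math.min(ceiling, jittered);
    }
}
```

With a 100 ms base and a 2 s cap, the ceilings for attempts 0 to 5 are 100, 
200, 400, 800, 1600 and 2000 ms. The random draw under each ceiling is what 
keeps thousands of clients from retrying in lockstep after a mass reboot.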
                
> can't control maxRetries in case of SocketTimeoutException
> ----------------------------------------------------------
>
>                 Key: HADOOP-6726
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6726
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.20.2
>            Reporter: Johannes Zillmann
>
> One can set _ipc.client.connect.max.retries_ for 
> _org.apache.hadoop.ipc.Client_.
> This takes effect on IOException but not on SocketTimeoutException.
> Client$Connection:307:
> {code:java}
>           } catch (SocketTimeoutException toe) {
>             /* The max number of retries is 45,
>              * which amounts to 20s*45 = 15 minutes retries.
>              */
>             handleConnectionFailure(timeoutFailures++, 45, toe);
>           } catch (IOException ie) {
>             handleConnectionFailure(ioFailures++, maxRetries, ie);
>           }
> {code}


        
