[ 
https://issues.apache.org/jira/browse/HADOOP-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061489#comment-13061489
 ] 

Eli Collins commented on HADOOP-7380:
-------------------------------------

Latest patch looks good.

The following is implicit in the change, but I think it's worth stating here in 
the jira explicitly: in FailoverOnNetworkExceptionRetry#shouldRetry we don't 
fail over and retry if we're making a non-idempotent call and there's an 
IOException or SocketException that's not Connect, NoRouteToHost, UnknownHost, 
or Standby. The rationale of course is that the operation may have reached the 
server, and retrying elsewhere could leave us in an inconsistent state. This 
means that if a client doing a create/delete gets a SocketTimeoutException 
(which is an IOE) or an EOF SocketException, the exception will be thrown all 
the way up to the caller of FileSystem/FileContext. That's reasonable because 
only the user of the API at this level has sufficient knowledge of how to 
handle the failure, e.g. if they get such an exception after issuing a delete 
they can check if the file still exists and, if so, re-issue the delete 
(however, they may also not want to do this, and FileContext doesn't know 
which).
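
To make the rule above concrete, here is a rough, self-contained sketch of the decision as I read it. It is not the code from the patch: the class name FailoverDecisionSketch, the Decision enum, the decide() method and the local StandbyException stand-in are all hypothetical, and the sketch ignores things like fail-over counts and any fallback policy for non-IO exceptions.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical illustration only; names here are not taken from the patch.
public class FailoverDecisionSketch {

    public enum Decision { FAIL, FAILOVER_AND_RETRY }

    // Local stand-in (assumption) for the "Standby" exception mentioned above,
    // so the sketch compiles without Hadoop on the classpath.
    public static class StandbyException extends IOException {
        public StandbyException(String msg) { super(msg); }
    }

    public static Decision decide(Exception e, boolean idempotent) {
        // These failures mean the call cannot have been executed by an
        // active server, so failing over is safe even for non-idempotent calls.
        if (e instanceof ConnectException
                || e instanceof NoRouteToHostException
                || e instanceof UnknownHostException
                || e instanceof StandbyException) {
            return Decision.FAILOVER_AND_RETRY;
        }
        // Any other IOException (SocketException is a subclass) may have
        // occurred after the server acted on the request, so only
        // idempotent calls are retried elsewhere.
        if (e instanceof IOException) {
            return idempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
        }
        // Simplification (assumption): everything else is surfaced to the caller.
        return Decision.FAIL;
    }

    public static void main(String[] args) {
        // A timed-out delete (non-idempotent) versus a refused connection.
        System.out.println(decide(new SocketTimeoutException("timed out"), false));
        System.out.println(decide(new ConnectException("refused"), false));
    }
}
```

In the FAIL case the exception reaches the FileSystem/FileContext caller, who can then decide whether checking for the file and re-issuing the delete is appropriate.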

Minor comments:
* Need to mark the new Interfaces with @InterfaceStability.Evolving
* The new create methods in RetryProxy need javadocs (or you could move the 
javadoc to your new methods and add the FailoverProxyProvider param if you feel 
we're OD'ing on javadocs here)
* @param lines shouldn't end with a period (e.g. RetryPolicy and 
FailoverProxyProvider)
* Should RetryAction be in RetryPolicy or do you expect there to be a class 
here eventually?
* Wonder if the LOG.info in RetryInvocationFailure for the fail-over case 
should be a warning.

I think this change is sufficiently decoupled from HDFS-1973 that we can check 
it into trunk before we branch for HA.

> Common portion of HDFS-1973
> ---------------------------
>
>                 Key: HADOOP-7380
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7380
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: ipc
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>             Fix For: 0.23.0
>
>         Attachments: hadoop-7380.0.patch, hadoop-7380.1.patch, 
> hadoop-7380.2.patch
>
>
> Implementing client failover will likely require changes to {{o.a.h.io.ipc}} 
> and/or {{o.a.h.io.retry}}. This JIRA is to track those changes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
