[
https://issues.apache.org/jira/browse/HADOOP-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061489#comment-13061489
]
Eli Collins commented on HADOOP-7380:
-------------------------------------
Latest patch looks good.
The following is implicit in the change, but I think it's worth stating here in
the jira explicitly: in FailoverOnNetworkExceptionRetry#shouldRetry we don't
fail-over and retry if we're making a non-idempotent call and there's an
IOException or SocketException that's not Connect, NoRouteToHost, UnknownHost,
or Standby. The rationale of course is that the operation may have reached the
server and retrying elsewhere could leave us in an inconsistent state. This
means that if a client doing a create/delete gets a SocketTimeoutException
(which is an IOE) or an EOF SocketException, the exception will be thrown all
the way up to the caller of FileSystem/FileContext. That's reasonable because
only the user of the API at this level has sufficient knowledge of how to
handle the failure, eg if they get such an exception after issuing a delete
they can check if the file still exists and if so re-issue the delete (however
they may also not want to do this, and FileContext doesn't know which).
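To make the rule above concrete, here is a rough, self-contained sketch of the
decision as described in this comment. It is not the code from the patch: the
class name, the Decision enum, the decide() method and the local
StandbyException stand-in are all hypothetical, and the handling of idempotent
calls and of non-IOExceptions is an assumption inferred from the description.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.UnknownHostException;

/** Illustrative only; names are hypothetical and not taken from the patch. */
final class FailoverDecisionSketch {

  /** Hypothetical outcome names used by this sketch. */
  enum Decision { FAILOVER_AND_RETRY, FAIL }

  /** Local stand-in for Hadoop's StandbyException so the sketch compiles alone. */
  static class StandbyException extends IOException {
    StandbyException(String msg) { super(msg); }
  }

  /**
   * @param e the exception raised by the RPC attempt
   * @param idempotent whether the invoked method is safe to repeat
   */
  static Decision decide(Exception e, boolean idempotent) {
    // These failures mean the call cannot have been executed by an active
    // server, so failing over is safe even for non-idempotent calls.
    if (e instanceof ConnectException
        || e instanceof NoRouteToHostException
        || e instanceof UnknownHostException
        || e instanceof StandbyException) {
      return Decision.FAILOVER_AND_RETRY;
    }
    // Any other IOException (SocketException and SocketTimeoutException are
    // both IOExceptions): the call may have reached the server, so only
    // retry elsewhere if repeating it is harmless (assumed from the comment).
    if (e instanceof IOException) {
      return idempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
    }
    // Assumption for this sketch: everything else is handed to the caller.
    return Decision.FAIL;
  }
}
```

With this sketch, a delete (non-idempotent) that hits a SocketTimeoutException
yields FAIL and the exception surfaces to the FileSystem/FileContext caller,
while the same delete hitting a ConnectException yields FAILOVER_AND_RETRY.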
Minor comments:
* Need to mark the new Interfaces with @InterfaceStability.Evolving
* The new create methods in RetryProxy need javadocs (or you could move the
javadoc to your new methods and add the FailoverProxyProvider param if you feel
we're OD'ing on javadocs here)
* @param lines shouldn't end with a period (eg RetryPolicy and
FailOverProxyProvider)
* Should RetryAction be in RetryPolicy or do you expect there to be a class
here eventually?
* Wonder if the LOG.info in RetryInvocationFailure for the fail-over case
should be a warning.
I think this change is sufficiently decoupled from HDFS-1973 that we can check
it into trunk before we branch for HA.
> Common portion of HDFS-1973
> ---------------------------
>
> Key: HADOOP-7380
> URL: https://issues.apache.org/jira/browse/HADOOP-7380
> Project: Hadoop Common
> Issue Type: New Feature
> Components: ipc
> Affects Versions: 0.23.0
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Fix For: 0.23.0
>
> Attachments: hadoop-7380.0.patch, hadoop-7380.1.patch,
> hadoop-7380.2.patch
>
>
> Implementing client failover will likely require changes to {{o.a.h.io.ipc}}
> and/or {{o.a.h.io.retry}}. This JIRA is to track those changes.