[
https://issues.apache.org/jira/browse/HDFS-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720204#comment-16720204
]
Kitti Nanasi commented on HDFS-14134:
-------------------------------------
I agree that retrying getXAttr on the "attr could not find" IOException is
wasteful, and that we need a better approach than the current one.
However, we also have to keep in mind that the FailoverOnNetworkExceptionRetry
policy is used by many parts of the code, so changing it is somewhat risky. I
think the idea behind the previous design is that non-remote IOExceptions may
be network-related, so they are worth retrying if the operation is
idempotent.
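
To make the distinction concrete, here is a minimal sketch of that decision.
It is not the Hadoop implementation: RemoteError, Action and decide are
hypothetical stand-ins, and a real policy would still need to treat
server-side conditions such as a standby NameNode separately.

```java
import java.io.IOException;
import java.net.ConnectException;

class RetryDecisionSketch {
    // Hypothetical subset of retry actions, named after those in the issue.
    enum Action { FAIL, FAILOVER_AND_RETRY }

    // Hypothetical stand-in for an exception raised on the server and
    // relayed to the client (assumption: not the real Hadoop class).
    static class RemoteError extends IOException {
        RemoteError(String message) { super(message); }
    }

    // Decide what to do with a failed call.
    static Action decide(Exception e, boolean idempotent) {
        if (e instanceof RemoteError) {
            // The server answered; retrying would produce the same answer.
            return Action.FAIL;
        }
        if (e instanceof IOException && idempotent) {
            // Possibly network related, and safe to repeat.
            return Action.FAILOVER_AND_RETRY;
        }
        return Action.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(decide(new RemoteError("could not find attr"), true));
        System.out.println(decide(new ConnectException("connection refused"), true));
        System.out.println(decide(new ConnectException("connection refused"), false));
    }
}
```

With this split, the getXAttr case fails fast, while a connection error on an
idempotent call is still retried as it is today.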
> Idempotent operations throwing RemoteException should not be retried by the
> client
> ----------------------------------------------------------------------------------
>
> Key: HDFS-14134
> URL: https://issues.apache.org/jira/browse/HDFS-14134
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, hdfs-client, ipc
> Reporter: Lukas Majercak
> Assignee: Lukas Majercak
> Priority: Critical
> Attachments: HDFS-14134.001.patch, HDFS-14134.002.patch,
> HDFS-14134.003.patch, HDFS-14134.004.patch, HDFS-14134.005.patch,
> HDFS-14134_retrypolicy_change_proposal.pdf
>
>
> Currently, some operations that throw IOException on the NameNode are
> evaluated by RetryPolicy as FAILOVER_AND_RETRY, but they should just fail
> fast.
> For example, when calling getXAttr("user.some_attr", file) where the file
> does not have the attribute, NN throws an IOException with message "could not
> find attr". The current client retry policy determines the action for that to
> be FAILOVER_AND_RETRY. The client then fails over and retries until it
> reaches the maximum number of retries. Ideally, the client should be able
> to tell that this exception is normal and fail fast.
> Moreover, even if the action was FAIL, the RetryInvocationHandler looks at
> all the retry actions from all requests, and FAILOVER_AND_RETRY takes
> precedence over FAIL action.
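
The precedence rule in the last paragraph of the description can be modelled
roughly as below. This is only a sketch: Action and combine are hypothetical
names, and the code models the described behaviour, not the actual
RetryInvocationHandler.

```java
import java.util.Arrays;
import java.util.List;

class ActionPrecedenceSketch {
    // Declared in increasing precedence, per the issue description.
    enum Action { FAIL, FAILOVER_AND_RETRY }

    // Combine the per-request actions: the highest-precedence action wins.
    static Action combine(List<Action> actions) {
        Action result = Action.FAIL;
        for (Action a : actions) {
            if (a.ordinal() > result.ordinal()) {
                result = a;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // One FAILOVER_AND_RETRY is enough to override any number of FAILs.
        System.out.println(combine(Arrays.asList(
                Action.FAIL, Action.FAILOVER_AND_RETRY, Action.FAIL)));
    }
}
```

Under this rule, correcting the policy for a single request is not sufficient
as long as any other request still yields FAILOVER_AND_RETRY.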