[
https://issues.apache.org/jira/browse/HDFS-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17238886#comment-17238886
]
Ye Ni edited comment on HDFS-14937 at 11/25/20, 7:21 PM:
---------------------------------------------------------
[~xuzq_zander], [~xkrogen], [~vagarychen], could you share the background of
this change, or the issue it was meant to address?
We recently found an issue: when an Observer is down (machine taken offline
for maintenance), the client gets an InterruptedIOException, for example
ConnectTimeoutException. However, the request sent to the Observer is not
failed over to the Active NN; the exception is thrown directly, so all
requests fail. Note that a client created after the Observer went down is
not affected, because it never treats the down machine as an Observer and so
never hits this code path.
I believe that in this case we should continue and try the Active.
This change is in OSS trunk and 3+ but not in 2.10+; I don't know whether
that is intentional.
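A minimal sketch of the behavior proposed above, not the actual ObserverReadProxyProvider code. It relies on one fact: java.net.SocketTimeoutException (the parent of Hadoop's ConnectTimeoutException) is a subclass of InterruptedIOException, so a check for InterruptedIOException alone also catches timeouts. The class name, the Call interface, and both method names are hypothetical, and SocketTimeoutException stands in for ConnectTimeoutException so the example has no Hadoop dependency.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketTimeoutException;
import java.util.List;

public class ObserverFailoverSketch {

    /** Hypothetical stand-in for one RPC attempt against a NameNode. */
    public interface Call<T> {
        T invoke() throws IOException;
    }

    /**
     * A timeout is an InterruptedIOException subclass, but it signals an
     * unreachable node rather than an interrupted caller, so it is excluded.
     */
    public static boolean isRealInterrupt(IOException e) {
        return e instanceof InterruptedIOException
                && !(e instanceof SocketTimeoutException);
    }

    /** Try each Observer in order; fall back to the Active if none answers. */
    public static <T> T readWithFallback(List<Call<T>> observers, Call<T> active)
            throws IOException {
        for (Call<T> observer : observers) {
            try {
                return observer.invoke();
            } catch (IOException e) {
                if (isRealInterrupt(e)) {
                    throw e; // the caller was interrupted: stop immediately
                }
                // Timeout or other I/O failure: move on to the next node.
            }
        }
        return active.invoke();
    }

    public static void main(String[] args) throws IOException {
        Call<String> deadObserver = () -> {
            throw new SocketTimeoutException("connect timed out");
        };
        Call<String> active = () -> "served by active";
        // Prints "served by active": the timeout does not abort the read.
        System.out.println(readWithFallback(List.of(deadObserver), active));
    }
}
```

With this distinction, a real interrupt still propagates immediately, which appears to be the intent of HDFS-14937, while a connect timeout against a down Observer falls through to the Active.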
[~inigoiri]
> [SBN read] ObserverReadProxyProvider should throw InterruptException
> --------------------------------------------------------------------
>
> Key: HDFS-14937
> URL: https://issues.apache.org/jira/browse/HDFS-14937
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: xuzq
> Assignee: xuzq
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14937-trunk-001.patch, HDFS-14937-trunk-002.patch
>
>
> ObserverReadProxyProvider should throw InterruptException immediately if an
> Observer catches an InterruptException while invoking.