[
https://issues.apache.org/jira/browse/HDFS-17030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727215#comment-17727215
]
ASF GitHub Bot commented on HDFS-17030:
---------------------------------------
xinglin opened a new pull request, #5700:
URL: https://github.com/apache/hadoop/pull/5700
### Description of PR
Added support to fail fast when detecting an unreachable/unresponsive standby
NN in ObserverReaderProxy
### How was this patch tested?
* Unit tests
```
~/p/h/t/hadoop-hdfs-project (HDFS-17030)> mvn test
-Dtest="TestObserverReadProxyProvider.java"
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running
org.apache.hadoop.hdfs.server.namenode.ha.TestObserverReadProxyProvider
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
1.136 s - in
org.apache.hadoop.hdfs.server.namenode.ha.TestObserverReadProxyProvider
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0
```
* Tested in a test cluster
+ We took a heap dump of a standby NN.
```
bash-4.2$ jmap -F -dump:format=b,file=heapdump-25801.hprof 25801
Attaching to process ID 25801, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.172-b11
Dumping heap to heapdump-25801.hprof ...
```
+ The existing Hadoop binary took more than 2 minutes to complete the list
operation, because we set _ipc.client.rpc-timeout.ms_ to 2 minutes.
```
[xinglin@ltx1-hcl14866 ~]$ time hdfs dfs -ls /tmp/testFile.txt
23/05/24 23:07:05 INFO fs.FileBasedMountTableLoader: TID: 1 -
Loading mount table from
hdfs://ltx1-yugiohnn01.grid.linkedin.com:9000/mounttable/linkfs/ltx1-yugioh-router-mountpoints.json.
-rw-r--r
```
> Limit wait time for getHAServiceState in ObserverReaderProxy
> ------------------------------------------------------------
>
> Key: HDFS-17030
> URL: https://issues.apache.org/jira/browse/HDFS-17030
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Assignee: Xing Lin
> Priority: Minor
>
> When NameNode HA is enabled and a standby NN is not responsive, we have
> observed that it takes a long time to serve a request, even though we have a
> healthy observer or active NN.
> Basically, when a standby is down, the RPC client (re)tries to connect to
> that standby for _ipc.client.connect.timeout_ *
> _ipc.client.connect.max.retries.on.timeouts_ before giving up. When we take a
> heap dump at a standby, the NN still accepts the socket connection but it
> does not respond to these RPC requests, so we time out after
> _ipc.client.rpc-timeout.ms_. This adds significant latency. For clusters
> at LinkedIn, we set _ipc.client.rpc-timeout.ms_ to 120 seconds, and thus a
> request takes more than 2 minutes to complete when we take a heap
> dump at a standby. This has been causing user job failures.
> We could set _ipc.client.rpc-timeout.ms_ to a smaller value when sending
> getHAServiceState requests in ObserverReaderProxy (for user RPC requests, we
> would still use the original value from the config). However, that would double
> the number of socket connections between clients and the NN.
> The proposal is to add a timeout on getHAServiceState() calls in
> ObserverReaderProxy, so that we wait only up to that timeout for an NN to
> respond with its HA state. Once the timeout expires, we move on to the next
> NN.
>
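One way the proposed timeout could work is sketched below. This is a minimal illustration, not the code in the PR: the class `HaStateProbeSketch`, the `State` enum, the `probe` helper, and the 200 ms timeout are all hypothetical names and values, and the `Callable` stands in for the real getHAServiceState() RPC. The idea is to run the probe on a separate thread and bound the wait with `Future.get(timeout)`, so a hung standby costs at most the probe timeout instead of the full _ipc.client.rpc-timeout.ms_.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HaStateProbeSketch {
  /** Hypothetical stand-in for the HA states an NN can report. */
  public enum State { ACTIVE, STANDBY, OBSERVER }

  // Daemon threads so a probe stuck on a hung NN never blocks JVM exit.
  private static final ExecutorService PROBE_POOL =
      Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "ha-state-probe");
        t.setDaemon(true);
        return t;
      });

  /**
   * Runs the (assumed) getHAServiceState call with a bounded wait.
   * Returns the reported state, or null if the NN failed or did not
   * answer within timeoutMs, in which case the caller moves on.
   */
  public static State probe(Callable<State> getHAServiceState, long timeoutMs) {
    Future<State> future = PROBE_POOL.submit(getHAServiceState);
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the probe thread; give up on this NN
      return null;
    } catch (ExecutionException e) {
      return null;         // the probe itself failed, e.g. connection refused
    } catch (InterruptedException e) {
      future.cancel(true);
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return null;
    }
  }

  public static void main(String[] args) {
    // Simulated NNs: a standby that hangs (as during a heap dump)
    // and a healthy observer that answers immediately.
    Callable<State> hungStandby = () -> {
      Thread.sleep(5_000);
      return State.STANDBY;
    };
    Callable<State> healthyObserver = () -> State.OBSERVER;

    for (Callable<State> nn : Arrays.asList(hungStandby, healthyObserver)) {
      State state = probe(nn, 200);
      if (state == null) {
        continue; // unreachable or unresponsive: try the next NN
      }
      System.out.println("Using NN in state " + state);
      break;
    }
  }
}
```

Because the probe runs on its own thread over the existing call path, this approach would not need a second connection with a shorter RPC timeout, which is the drawback noted for the alternative above.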