xinglin commented on PR #6183: URL: https://github.com/apache/hadoop/pull/6183#issuecomment-1846575811
My understanding is that this is a similar issue to the one I tried to fix in [HDFS-17030](https://issues.apache.org/jira/browse/HDFS-17030): when a JN is unresponsive (either down or hung), the starting NN still tries to connect to it, with retries. As a result, it waits for `ipc.client.connect.timeout` * `ipc.client.connect.max.retries.on.timeouts` when the NN cannot establish a socket to the journal node, or for `ipc.client.rpc-timeout.ms` when a socket is established but the journal node fails to send back a response.
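
To make the two waits concrete, here is a rough sketch of the arithmetic in plain Java. The class name, method names, and the numeric values are assumptions chosen for illustration; check `core-default.xml` or your own configuration for the values in effect on your cluster. The sketch takes the product described above at face value, and the exact number of connection attempts the IPC client makes may differ by one.

```java
import java.time.Duration;

// Hypothetical helper, not part of Hadoop: estimates how long a starting NN
// may block on one unresponsive JN under the two cases described above.
public class JournalNodeWaitEstimate {

    // Case 1: no socket can be established.
    // Approximate wait = connect timeout * max retries on timeouts.
    public static Duration connectWait(long connectTimeoutMs, int maxRetriesOnTimeouts) {
        return Duration.ofMillis(Math.multiplyExact(connectTimeoutMs, (long) maxRetriesOnTimeouts));
    }

    // Case 2: the socket is established but the JN never responds.
    // Wait = RPC timeout.
    public static Duration rpcWait(long rpcTimeoutMs) {
        return Duration.ofMillis(rpcTimeoutMs);
    }

    public static void main(String[] args) {
        // Illustrative values only (assumptions, not read from any config).
        long connectTimeoutMs = 20_000L;   // stands in for ipc.client.connect.timeout
        int maxRetriesOnTimeouts = 45;     // stands in for ipc.client.connect.max.retries.on.timeouts
        long rpcTimeoutMs = 120_000L;      // stands in for ipc.client.rpc-timeout.ms

        System.out.println("JN unreachable: ~"
                + connectWait(connectTimeoutMs, maxRetriesOnTimeouts).toMinutes() + " min");
        System.out.println("JN connected but silent: ~"
                + rpcWait(rpcTimeoutMs).getSeconds() + " s");
    }
}
```

With these illustrative values, the unreachable case comes to roughly 15 minutes, much longer than the 2 minutes of the silent case, which is why a single JN that is down can dominate NN startup time.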