[ https://issues.apache.org/jira/browse/HDFS-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002739#comment-17002739 ]
Fei Hui commented on HDFS-15078:
--------------------------------

{quote}
The issue is the first router which c, That client did failover to another router, triggered a new call and the second router completed the call, and the first call came after this.
{quote}
Getting an EOFException makes the client fail over to another router. Later the second router completed the call, and the call on the first router came after it.

{quote}
If such a case where one Router is delaying, I think without client connection crashing still issues like these can come up.
{quote}
Yes. This fix can only resolve the problem in some scenarios. HDFS-15079 tracks the high-level problem. In our scenarios, this fix works.

> RBF: Should check connection channel before sending rpc to namenode
> -------------------------------------------------------------------
>
>                 Key: HDFS-15078
>                 URL: https://issues.apache.org/jira/browse/HDFS-15078
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rbf
>    Affects Versions: 3.3.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: HDFS-15078.001.patch, HDFS-15078.002.patch
>
>
> dfsrouter logs show that
> {quote}
> 2019-12-20 04:11:26,724 WARN org.apache.hadoop.ipc.Server: IPC Server handler 6400 on 8888, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.83.164.11:56908 Call#2 Retry#0: output error
> 2019-12-20 04:11:26,724 INFO org.apache.hadoop.ipc.Server: IPC Server handler 125 on 8888 caught an exception
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
>         at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2731)
>         at org.apache.hadoop.ipc.Server.access$2100(Server.java:134)
>         at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1089)
>         at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1161)
>         at org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2109)
>         at org.apache.hadoop.ipc.Server$Connection.access$400(Server.java:1229)
>         at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:631)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2245)
> {quote}
> Maybe checking the connection between client and router is better before sending rpc to namenode
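
For illustration only, the idea in the issue title (check the client's connection channel before forwarding the RPC) can be sketched with plain java.nio types. This is not the attached patch and it uses no Hadoop classes: `ChannelGuard`, `FakeClientChannel` and `forwardIfOpen` are hypothetical names, and the in-memory channel exists only so the sketch runs without a network.

```java
import java.nio.channels.Channel;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Hadoop or of the patch. */
final class ChannelGuard {

    /** Minimal in-memory Channel, used only to make the sketch runnable. */
    static final class FakeClientChannel implements Channel {
        private volatile boolean open = true;
        @Override public boolean isOpen() { return open; }
        @Override public void close() { open = false; }
    }

    /**
     * Runs the downstream call only while the client channel is still open.
     * Returns Optional.empty() when the client has already disconnected.
     * Assumes the downstream call never returns null.
     */
    static <T> Optional<T> forwardIfOpen(Channel clientChannel, Supplier<T> downstreamCall) {
        if (clientChannel == null || !clientChannel.isOpen()) {
            // Nobody is left to receive the reply, so skip the downstream RPC.
            return Optional.empty();
        }
        return Optional.of(downstreamCall.get());
    }

    public static void main(String[] args) {
        AtomicInteger downstreamCalls = new AtomicInteger();
        FakeClientChannel client = new FakeClientChannel();
        // Stand-in for the RPC that the router would send to the namenode.
        Supplier<String> create = () -> {
            downstreamCalls.incrementAndGet();
            return "created";
        };

        System.out.println(forwardIfOpen(client, create)); // channel open: call is forwarded
        client.close();                                    // client fails over to another router
        System.out.println(forwardIfOpen(client, create)); // channel closed: call is skipped
        System.out.println("downstream calls: " + downstreamCalls.get());
    }
}
```

The check is best-effort: the channel can still close after `isOpen()` returns true and before the downstream call completes, which is consistent with the comment above that the fix covers only some scenarios.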