[ https://issues.apache.org/jira/browse/HDFS-17357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811179#comment-17811179 ]
ASF GitHub Bot commented on HDFS-17357:
---------------------------------------

hfutatzhanghb commented on PR #6502:
URL: https://github.com/apache/hadoop/pull/6502#issuecomment-1911716925

   @LiuGuH Hi, sir. I have one question here: could you please explain in more detail how the connection leaks? I see that we invoke setKeepAlive(true) in `DataStreamer#createSocketForPipeline` and `DataXceiver#writeBlock`. Thanks a lot.

> EC: NioInetPeer.close() should close socket connection.
> -------------------------------------------------------
>
>                 Key: HDFS-17357
>                 URL: https://issues.apache.org/jira/browse/HDFS-17357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: liuguanghua
>            Assignee: liuguanghua
>            Priority: Major
>              Labels: pull-request-available
>
> NioInetPeer.close() currently does not close the socket connection.
>
> In my environment, all data is stored with EC.
> I found more than 30,000 (3w+) leaked connections on a datanode, along with
> many WARN messages like the one below.
> 2024-01-22 15:27:57,500 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> hostname:50010:DataXceiverServer
>
> When any exception occurs in DataXceiverServer, it executes closeStream:
> IOUtils.closeStream(peer) -> Peer.close() -> NioInetPeer.close()
> But NioInetPeer.close() does not close the socket connection, and
> this leads to connection leakage.
> The close() methods of the other Peer subclasses are implemented with
> socket.close(). See EncryptedPeer, DomainPeer, BasicInetPeer.
>
> This problem can be reproduced as follows:
> (1) A client writes data to HDFS.
> (2) The datanode Xceiver count reaches the maximum set by
> DFS_DATANODE_MAX_RECEIVER_THREADS_KEY; the new Xceiver fails and throws an
> IOException, and the socket is not released.
> (3) The client crashes, so that no new data is added, or client.close is
> executed.
> (4) Socket connections between datanodes are leaked.
>
> The leaked connections look like this:
> dn1
> dn1:57042 dn2:50010 ESTABLISHED
> dn2
> dn2:50010 dn1:57042 ESTABLISHED

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
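
For illustration only, the close path described in the issue (IOUtils.closeStream(peer) -> Peer.close() -> socket.close()) can be sketched with plain java.net classes. This is not the Hadoop code and not necessarily the change made in the PR: `SketchPeer`, `PeerCloseSketch`, and the failing stream are hypothetical names invented for this example. It assumes only that a peer wraps a socket plus an input and an output stream, and it shows one way to make sure the socket is released even when closing a stream throws.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class PeerCloseSketch {

    /** Hypothetical stand-in for a socket-backed peer; NOT Hadoop's NioInetPeer. */
    static final class SketchPeer implements Closeable {
        private final Socket socket;
        private final InputStream in;
        private final OutputStream out;

        SketchPeer(Socket socket, InputStream in, OutputStream out) {
            this.socket = socket;
            this.in = in;
            this.out = out;
        }

        @Override
        public void close() throws IOException {
            // Close the streams first, but always reach socket.close(),
            // even if one of the stream close() calls throws.
            try {
                in.close();
            } finally {
                try {
                    out.close();
                } finally {
                    socket.close();
                }
            }
        }
    }

    /** Returns true if the socket is closed even though closing the stream failed. */
    static boolean socketClosedAfterFailingStreamClose() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // Assumption for the demo: a stream whose close() fails and does
            // not release the underlying connection by itself.
            InputStream failingIn = new InputStream() {
                @Override public int read() { return -1; }
                @Override public void close() throws IOException {
                    throw new IOException("simulated stream close failure");
                }
            };
            OutputStream noopOut = new OutputStream() {
                @Override public void write(int b) { /* discard */ }
            };
            SketchPeer peer = new SketchPeer(client, failingIn, noopOut);
            try {
                peer.close();
            } catch (IOException expected) {
                // The stream failure still propagates; the socket is closed anyway.
            }
            return client.isClosed();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("socket closed: " + socketClosedAfterFailingStreamClose());
    }
}
```

The nested try/finally is a deliberate choice in this sketch: the stream exception still reaches the caller, while the explicit socket.close() does not depend on the streams releasing the connection themselves.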