lchq created ZOOKEEPER-4736:
-------------------------------

             Summary: socket fd leak
                 Key: ZOOKEEPER-4736
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4736
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client, server
    Affects Versions: 3.9.0, 3.8.0, 3.7.0, 3.6.3
         Environment: zookeeper 3.6.3 !4cea510a57af58c08e73d146e8535ee4.jpg!
            Reporter: lchq
         Attachments: 4cea510a57af58c08e73d146e8535ee4.jpg, IMG_20230815_114433.jpg

If the network service is unavailable (for example after "ifdown eth0" or "service network stop"), a ZooKeeper client process running on that node leaks file descriptors. The leak is triggered by calling "new ZooKeeper(..)".

While the network is down, ClientCnxn::SendThread::run() repeatedly calls startConnect(), and every attempt fails with "SocketException: Network is unreachable". The exception handler catches this exception and calls SendThread::cleanup() to clean up. However, ClientCnxnSocketNIO::registerAndConnect registers the socket with the selector first and only then calls sock.connect, so when connect fails the socket's fd cannot be closed.

Swapping the order of sock.connect and sock.register fixes the issue. It does not change the original behavior, because sock.register only takes effect when selector.select(waitTimeOut) is called.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
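
The ordering problem described in this issue can be illustrated with plain java.nio, outside of ZooKeeper. The following is only a minimal sketch under stated assumptions, not the actual ClientCnxnSocketNIO code or the proposed patch: the class name, method name and address are hypothetical, and an unresolved address is used as a stand-in for "Network is unreachable" so that connect() fails immediately without touching the network. It relies on the documented java.nio behavior that closing a registered channel only cancels its key, and that the key stays in the selector's key set until the next selection operation. On common JDK implementations the underlying descriptor appears to be released only at that point as well, which would match the leak reported here.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

class RegisterOrderSketch {
    // Hypothetical stand-in for an unreachable server: an unresolved address
    // makes connect() throw at once, with no DNS lookup or network access.
    static final InetSocketAddress BAD_ADDR =
            InetSocketAddress.createUnresolved("zk.invalid", 2181);

    /** Returns how many keys the selector still holds after one failed attempt. */
    static int staleKeysAfterFailure(boolean registerFirst) throws IOException {
        try (Selector selector = Selector.open()) {
            SocketChannel sock = SocketChannel.open();
            sock.configureBlocking(false);
            try {
                if (registerFirst) {
                    // Order described in the issue: register, then connect.
                    sock.register(selector, SelectionKey.OP_CONNECT);
                    sock.connect(BAD_ADDR); // throws; the key already exists
                } else {
                    // Proposed order: connect, then register.
                    sock.connect(BAD_ADDR); // throws; register is never reached
                    sock.register(selector, SelectionKey.OP_CONNECT);
                }
            } catch (UnresolvedAddressException e) {
                // Roughly what an error handler would do on failure.
                sock.close();
            }
            // No select() has run, mirroring a retry loop that keeps failing
            // in connect before it ever reaches selector.select(..).
            return selector.keys().size();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("register first: " + staleKeysAfterFailure(true));
        System.out.println("connect first:  " + staleKeysAfterFailure(false));
    }
}
```

With register-first, the cancelled key is still held by the selector after close(), because nothing has run a selection operation yet. With connect-first, the failed channel was never registered, so close() has nothing left to defer.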