[ https://issues.apache.org/jira/browse/ZOOKEEPER-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682124#comment-16682124 ]
Michael Han commented on ZOOKEEPER-1441:
----------------------------------------

PortAssignment itself is fine, and if everyone used it there would be no conflicts, because PortAssignment would be the single source of truth for port allocation. The problem is that not every process running on the test machine uses PortAssignment, even though most, if not all, of the ZK unit tests do. So if heavy workloads are running on the test machine while the ZK unit tests run, port conflicts can occur.

>> I never actually got why PortAssigment tries to bind the port before returns

What PortAssignment implements is a "reserve and release" pattern for port allocation. This is better than a "choose a port but do not reserve it" approach, because it is very unlikely that the OS, regardless of how it allocates actual ports to processes, will hand out the same port to two consecutive socket bind calls. So by creating the socket via bind and then immediately closing it, we buy ourselves some time during which the OS will not reuse the same port for a subsequent socket call. That time varies, however, so there is still a race condition: by the time we actually bind the port again, it may already have been grabbed by another process.

The ZK server requires an unbound port number to be passed to it (otherwise it can't bind the port), but because of the same race condition the port may already be taken by the time the server tries to bind. The only way to guarantee atomicity in this case is to have the ZK server ask the OS for a port and bind it immediately.

> Some test cases are failing because Port bind issue.
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-1441
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1441
>             Project: ZooKeeper
>          Issue Type: Test
>          Components: server, tests
>            Reporter: kavita sharma
>            Assignee: Michael Han
>            Priority: Major
>              Labels: flaky, flaky-test
>
> very frequently testcases are failing because of :
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind(Native Method)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
> 	at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:111)
> 	at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:112)
> 	at org.apache.zookeeper.server.quorum.QuorumPeer.<init>(QuorumPeer.java:514)
> 	at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:156)
> 	at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
> 	at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
> may be because of Port Assignment so please give me some suggestions if
> someone is also facing same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
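
To illustrate the two approaches contrasted in the comment, here is a minimal Java sketch. It is not the actual PortAssignment code: the class name PortAllocationSketch and the methods reserveAndRelease and bindEphemeral are hypothetical, and the sketch assumes the server could accept an already-bound socket, which the comment says the ZK server does not currently do.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical illustration only; not ZooKeeper's PortAssignment.
final class PortAllocationSketch {

    // "Reserve and release": bind once to find a port that is free right now,
    // then close the socket and hand back only the port number. Between this
    // call returning and the caller binding the port again, another process
    // can still take it, which is the race described in the comment.
    static int reserveAndRelease() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Atomic alternative: ask the OS for any free port (port 0) and keep the
    // bound socket open. No other process can bind this port while the
    // returned socket remains open.
    static ServerSocket bindEphemeral() throws IOException {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        int candidate = reserveAndRelease();
        System.out.println("released candidate port: " + candidate);
        // Race window: 'candidate' is unbound here and may be taken by anyone.

        try (ServerSocket server = bindEphemeral()) {
            System.out.println("bound and held port: " + server.getLocalPort());
        }
    }
}
```

With the second approach, a test would read the port from getLocalPort() after the socket is bound, instead of choosing a number up front and hoping it is still free.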