[
https://issues.apache.org/jira/browse/ZOOKEEPER-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682124#comment-16682124
]
Michael Han commented on ZOOKEEPER-1441:
----------------------------------------
PortAssignment itself is fine, and if every process used it there would be no
conflicts, because PortAssignment would be the single source of truth for port
allocation. However, the problem here is that not every process running on the
test machine uses PortAssignment, even though most, if not all, ZK unit tests
do. So if heavy workloads are running on the test machine while the ZK unit
tests run, port conflicts can occur.
>> I never actually got why PortAssigment tries to bind the port before returns
What PortAssignment implements is a "reserve and release" pattern for port
allocation, which is better than a "choose a port but do not reserve it"
approach, because it is very unlikely that the OS, regardless of how it
allocates ports to processes, will hand out the same port for two consecutive
bind calls. Thus, by binding a socket to the port and then immediately closing
it, we buy ourselves some time during which the OS will not hand the same port
to a subsequent bind call. The length of this window varies, however, so there
is a race condition: by the time we actually bind the port again, it may
already have been grabbed by another process. The ZK server requires an unbound
port number to be passed to it (otherwise it can't bind the port), but because
of the same race condition the port may already be taken when the server tries
to bind. The only way to guarantee atomicity in this case is to have the ZK
server ask the OS for a port and bind it immediately.
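
To make the difference concrete, here is a minimal sketch of the two approaches
using plain java.net.ServerSocket. It is not PortAssignment's or the ZK server's
actual code. The class and method names are hypothetical, and the probe uses
port 0 for simplicity, which is an assumption rather than how PortAssignment
picks its candidate ports.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical illustration only; not taken from the ZooKeeper code base.
public class PortBindingSketch {

    // "Reserve and release": bind a probe socket, remember its port number,
    // then close it. The caller receives only a number, so another process
    // may grab the port before the caller binds it again (the race above).
    static int reserveAndRelease() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Atomic alternative: ask the OS for a free port (port 0) and keep the
    // bound socket open. There is no window in which the port is unowned.
    static ServerSocket bindToOsAssignedPort() throws IOException {
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        int reserved = reserveAndRelease();
        System.out.println("reserved but no longer held: " + reserved);

        try (ServerSocket server = bindToOsAssignedPort()) {
            System.out.println("bound and still held: " + server.getLocalPort());
        }
    }
}
```

With the second method the caller reads the chosen port from the live socket
via getLocalPort() instead of passing a pre-selected number to the server.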
> Some test cases are failing because Port bind issue.
> ----------------------------------------------------
>
> Key: ZOOKEEPER-1441
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1441
> Project: ZooKeeper
> Issue Type: Test
> Components: server, tests
> Reporter: kavita sharma
> Assignee: Michael Han
> Priority: Major
> Labels: flaky, flaky-test
>
> Test cases fail very frequently because of:
> java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind(Native Method)
> at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
> at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:111)
> at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:112)
> at org.apache.zookeeper.server.quorum.QuorumPeer.<init>(QuorumPeer.java:514)
> at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:156)
> at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
> at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
> This may be caused by PortAssignment, so please share any suggestions if
> you are also facing the same problem.