[ https://issues.apache.org/jira/browse/ZOOKEEPER-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ZOOKEEPER-3510: -------------------------------------- Labels: pull-request-available (was: ) > Frequent 'zkServer.sh stop' failures when running C test suite > -------------------------------------------------------------- > > Key: ZOOKEEPER-3510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3510 > Project: ZooKeeper > Issue Type: Bug > Reporter: Damien Diederen > Priority: Minor > Labels: pull-request-available > > As mentioned in > https://github.com/apache/zookeeper/pull/1054#discussion_r314208678 : > There is a {{sleep 3}} statement in {{zkServer.sh restart}}. I am unable to > unearth the history of that particular line, but I believe part—if not all—of > that {{sleep}} should be part of {{zkServer.sh stop}}. > I frequently observe {{FAILED TO START}} errors in the C test suite; the logs > consistently show that those are caused by {{java.net.BindException: Address > already in use}}. Adding a simple {{sleep 1}} before {{echo STOPPED}} > "fixes" it for me. I will submit an initial PR with the corresponding change > and a commit message akin to: > ---- > ZOOKEEPER-XXXX: Make zkServer.sh stop more reliable > Kill is asynchronous, and without the sleep, the server's TCP port can still > be busy when the next server is started—causing flaky runs of the C client's > test suite. > (It would probably be better to spin a few times, probing with ps -p.) > ---- > As noted above, the sleep is far from optimal, an adaptive mechanism would be > better—but I do not want to make the first iteration too complicated. -- This message was sent by Atlassian JIRA (v7.6.14#76016)