GitHub user neykov commented on a diff in the pull request:
https://github.com/apache/brooklyn-server/pull/560#discussion_r101483274
--- Diff: launcher/src/main/java/org/apache/brooklyn/launcher/BrooklynWebServer.java ---
@@ -447,7 +450,15 @@ public synchronized void start() throws Exception {
rootContext.setTempDirectory(Os.mkdirs(new File(webappTempDir, "war-root")));
server.setHandler(handlers);
- server.start();
+ try {
+ server.start();
+ } catch (BindException e) {
+ // port discovery routines may take some time to clear, e.g. 250ms for SO_TIMEOUT
+ // tests fail because of this; see if adding a delay improves things
+ log.warn("Initial server start-up failed binding (retrying after a delay): "+e);
+ Time.sleep(Duration.millis(500));
+ server.start();
--- End diff --
Merely binding a socket (which is all the `isPortAvailable` check does) won't
leave the port in a `TIME_WAIT` state. `TIME_WAIT` is the result of a previously
closed connection (e.g. from a previous test) or of a client connecting in the
small window while the probe socket is open.
One thing that should cut down on false positives from that check is removing
[`setReuseAddress(true)`](https://github.com/apache/brooklyn-server/blob/master/utils/common/src/main/java/org/apache/brooklyn/util/net/Networking.java#L105).
The `SO_TIMEOUT=250` set in the same method has no effect, since it only applies
to reads, which we don't do here, so it can go away as well. Note that the
`TIME_WAIT` timeout could be in the minutes range.
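To make the suggestion concrete, here is a minimal sketch of a probe with both options dropped. `PortProbe` is a hypothetical class used only for illustration; it is not the existing `Networking` code, and the real method may need to check more than a plain TCP bind.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; not the Brooklyn Networking class.
public class PortProbe {

    /** Returns true if a plain TCP bind on the wildcard address succeeds right now. */
    public static boolean isPortAvailable(int port) {
        // Create the socket unbound so the option is applied before bind().
        try (ServerSocket probe = new ServerSocket()) {
            // The initial SO_REUSEADDR setting of a ServerSocket is platform-dependent,
            // so switch it off explicitly instead of relying on the default.
            probe.setReuseAddress(false);
            // No setSoTimeout(): SO_TIMEOUT only affects accept(), which is never called here.
            probe.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            // Bind failed (e.g. address already in use), so report the port as unavailable.
            return false;
        }
    }
}
```

With reuse disabled, the probe should fail in roughly the same situations where a later non-reusing bind would fail, rather than reporting a port as free that the server then cannot bind.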
Instead of papering over the problem, I suggest we go for the root cause
directly; the Apache servers are an excellent test bed for it. I can do the
networking changes if you prefer.
LGTM other than this.