[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes

2017-02-16 Thread ahgittin
GitHub user ahgittin commented on a diff in the pull request:

https://github.com/apache/brooklyn-server/pull/560#discussion_r101513076
  
--- Diff: launcher/src/main/java/org/apache/brooklyn/launcher/BrooklynWebServer.java ---
@@ -447,7 +450,15 @@ public synchronized void start() throws Exception {
 rootContext.setTempDirectory(Os.mkdirs(new File(webappTempDir, "war-root")));
 
 server.setHandler(handlers);
-server.start();
+try {
+server.start();
+} catch (BindException e) {
+// port discovery routines may take some time to clear, e.g. 250ms for SO_TIMEOUT
+// tests fail because of this; see if adding a delay improves things
+log.warn("Initial server start-up failed binding (retrying after a delay): "+e);
+Time.sleep(Duration.millis(500));
+server.start();
--- End diff --

@neykov Why would removing `setReuseAddress(true)` reduce false positives?
As long as we make the same call when we start the server, the behaviour
should be identical. (Or are you thinking that the root cause is that the
server-start call isn't using that setting, and that is what you'd fix?)

I agree that finding and fixing the root cause (and recognising and logging
it better when we encounter it) would be even better, but I don't know what
it is!

I don't think we run tests in parallel, so I suspect that isn't the issue.
Letting sockets pick their own ports is a big change, so I'd rather not go
down that line.
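
For readers unfamiliar with the "sockets pick their own ports" option mentioned above, here is a minimal Java sketch of what it usually means: binding to port 0 so that the operating system assigns a free port. The class and method names are hypothetical and are not taken from Brooklyn; how the chosen port would then be propagated to the rest of the launcher is exactly the "big change" left out here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; not part of brooklyn-server.
final class EphemeralPortSketch {

    /** Binds a listening socket to an OS-chosen free port (port 0 means "any free port"). */
    static ServerSocket bindToAnyFreePort() throws IOException {
        // A backlog of 0 selects the platform default; loopback keeps the sketch local-only.
        return new ServerSocket(0, 0, InetAddress.getLoopbackAddress());
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket socket = bindToAnyFreePort()) {
            // The port is only known after binding, so callers must read it back.
            System.out.println("bound to port " + socket.getLocalPort());
        }
    }
}
```

Because the socket that discovers the port is the same one that keeps it, there is no gap between a probe and the real bind. The cost is that every consumer of the port number has to obtain it after start-up rather than from configuration.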


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes

2017-02-16 Thread ahgittin
GitHub user ahgittin reopened a pull request:

https://github.com/apache/brooklyn-server/pull/560

Attempted non-determinite test fixes

Prompted by the following (from 
https://builds.apache.org/job/brooklyn-server-pull-requests/1730/ ):

```
Test Result (3 failures / +3)


org.apache.brooklyn.launcher.CleanOrphanedLocationsIntegrationTest.testCleanedCopiedPersistedState

org.apache.brooklyn.launcher.CleanOrphanedLocationsIntegrationTest.testSelectionWithOrphanedLocationsInData

org.apache.brooklyn.launcher.BrooklynLauncherTest.testErrorsCaughtByApiAndRestApiWorks
```

The first two have been observed in several places, and might be fixed by 
the first commit here.  The third one is less frequent, but the second commit 
here might help and should improve logging.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ahgittin/brooklyn-server orphan-test-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/brooklyn-server/pull/560.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #560


commit d07c2af901d50f7a2c2767f1175bf990c327489c
Author: Alex Heneveld 
Date:   2017-02-16T09:09:33Z

attempt to fix non-det failures by forcing a persist

commit 08608b1132bc42319a1f216c02159d2e0d76e31c
Author: Alex Heneveld 
Date:   2017-02-16T09:32:01Z

a retry and extra logging when bind exception happens

to fix non-det test failures on server, due to bind conflicts






[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes

2017-02-16 Thread ahgittin
GitHub user ahgittin closed the pull request at:

https://github.com/apache/brooklyn-server/pull/560




[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes

2017-02-16 Thread neykov
GitHub user neykov commented on a diff in the pull request:

https://github.com/apache/brooklyn-server/pull/560#discussion_r101483274
  
--- Diff: launcher/src/main/java/org/apache/brooklyn/launcher/BrooklynWebServer.java ---
@@ -447,7 +450,15 @@ public synchronized void start() throws Exception {
 rootContext.setTempDirectory(Os.mkdirs(new File(webappTempDir, "war-root")));
 
 server.setHandler(handlers);
-server.start();
+try {
+server.start();
+} catch (BindException e) {
+// port discovery routines may take some time to clear, e.g. 250ms for SO_TIMEOUT
+// tests fail because of this; see if adding a delay improves things
+log.warn("Initial server start-up failed binding (retrying after a delay): "+e);
+Time.sleep(Duration.millis(500));
+server.start();
--- End diff --

Just binding a socket (which is all the `isPortAvailable` check does) won't
leave the port in a `TIME_WAIT` state. `TIME_WAIT` is the result of a
previously closed connection (e.g. from a previous test) or of a client
trying to connect in the small window while the probe socket is open.
One thing that would reduce false positives from the port check is removing
[`setReuseAddress(true)`](https://github.com/apache/brooklyn-server/blob/master/utils/common/src/main/java/org/apache/brooklyn/util/net/Networking.java#L105).
The `SO_TIMEOUT=250` in the same method isn't used at all, since it applies to
reads, which we don't do here, so it can go away as well. Note that the
`TIME_WAIT` timeout could be in the range of minutes.

Instead of papering over the problem, I suggest we go for the root cause
directly; the Apache servers are an excellent test bed for it. I can do the
networking changes if you prefer?

LGTM other than this.
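
To make the suggestion above concrete, the following is a rough Java sketch of a TCP port probe that leaves out both `setReuseAddress(true)` and the `SO_TIMEOUT` setting. It is an assumption-laden illustration, not the actual `Networking.isPortAvailable` code: the class and method names are hypothetical, and it only checks TCP on the wildcard address.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; not the real Networking.isPortAvailable.
final class PortProbeSketch {

    /**
     * Returns true if a TCP listening socket could be bound to the port right now.
     * SO_REUSEADDR is switched off explicitly so that the probe does not succeed
     * merely because address reuse was requested. No SO_TIMEOUT is set because
     * the probe never calls accept().
     */
    static boolean isTcpPortBindable(int port) {
        try (ServerSocket probe = new ServerSocket()) {   // created unbound
            probe.setReuseAddress(false);                 // must be set before bind()
            probe.bind(new InetSocketAddress(port));      // wildcard address
            return true;
        } catch (IOException e) {
            return false;  // typically a BindException: the port is not usable
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket holder = new ServerSocket(0)) {
            int taken = holder.getLocalPort();
            System.out.println(taken + " bindable while held? " + isTcpPortBindable(taken));
        }
    }
}
```

Any probe of this kind remains racy: another process can take the port between the probe closing and the real server binding, so a check like this narrows the window for failures rather than eliminating it.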




[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes

2017-02-16 Thread ahgittin
GitHub user ahgittin opened a pull request:

https://github.com/apache/brooklyn-server/pull/560

Attempted non-determinite test fixes

