[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes
GitHub user ahgittin commented on a diff in the pull request:

https://github.com/apache/brooklyn-server/pull/560#discussion_r101513076

--- Diff: launcher/src/main/java/org/apache/brooklyn/launcher/BrooklynWebServer.java ---
@@ -447,7 +450,15 @@ public synchronized void start() throws Exception {
         rootContext.setTempDirectory(Os.mkdirs(new File(webappTempDir, "war-root")));
         server.setHandler(handlers);
-        server.start();
+        try {
+            server.start();
+        } catch (BindException e) {
+            // port discovery routines may take some time to clear, e.g. 250ms for SO_TIMEOUT
+            // tests fail because of this; see if adding a delay improves things
+            log.warn("Initial server start-up failed binding (retrying after a delay): "+e);
+            Time.sleep(Duration.millis(500));
+            server.start();
--- End diff --

@neykov Why would `setReuseAddress(true)` improve false positives? As long as we make the same call when we start the server it should be identical? (Or are you thinking that the root cause is that the server-start call isn't using that, and that is what you'd fix?)

I agree finding and fixing the root cause (and recognising and logging better when we encounter it?) is even better, but I don't know what it is! I don't think we do parallel tests so I suspect that isn't the issue. Letting sockets pick their own ports is a big change so I'd rather not go down that line.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
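For illustration only, the single retry in the diff above could be generalised into a bounded retry that logs each failed bind. This is a minimal sketch, not code from the pull request: `BindRetry`, `Starter` and `startWithRetry` are hypothetical names, and plain `Thread.sleep` and `System.err` stand in for Brooklyn's `Time.sleep` and `log.warn`.

```java
import java.net.BindException;

// Hypothetical helper, not part of Brooklyn: retries a start action when it
// fails with BindException, up to a fixed number of attempts.
final class BindRetry {

    /** Stand-in for something like server.start() that may throw. */
    interface Starter {
        void start() throws Exception;
    }

    /**
     * Runs the starter, retrying after delayMillis whenever it throws
     * BindException. Returns the number of attempts that were needed.
     * The last BindException is rethrown once maxAttempts is exhausted.
     */
    static int startWithRetry(Starter starter, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                starter.start();
                return attempt;
            } catch (BindException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                // Record every failed bind so intermittent conflicts stay visible in logs.
                System.err.println("Start-up failed binding on attempt " + attempt
                        + " (retrying after " + delayMillis + "ms): " + e);
                Thread.sleep(delayMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated server: the first start fails with a bind conflict, the second succeeds.
        int attempts = startWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new BindException("Address already in use");
            }
        }, 3, 100);
        System.out.println("started after " + attempts + " attempt(s)");
    }
}
```

A retry like this only hides the conflict; the logged exceptions are what would help track down the root cause discussed in this thread.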
[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes
GitHub user ahgittin reopened a pull request:

https://github.com/apache/brooklyn-server/pull/560

Attempted non-determinite test fixes

Prompted by the following (from https://builds.apache.org/job/brooklyn-server-pull-requests/1730/ ):

```
Test Result (3 failures / +3)
org.apache.brooklyn.launcher.CleanOrphanedLocationsIntegrationTest.testCleanedCopiedPersistedState
org.apache.brooklyn.launcher.CleanOrphanedLocationsIntegrationTest.testSelectionWithOrphanedLocationsInData
org.apache.brooklyn.launcher.BrooklynLauncherTest.testErrorsCaughtByApiAndRestApiWorks
```

The first two have been observed in several places, and might be fixed by the first commit here. The third one is less frequent, but the second commit here might help and should improve logging.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ahgittin/brooklyn-server orphan-test-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/brooklyn-server/pull/560.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #560

commit d07c2af901d50f7a2c2767f1175bf990c327489c
Author: Alex Heneveld
Date: 2017-02-16T09:09:33Z

attempt to fix non-det failures by forcing a persist

commit 08608b1132bc42319a1f216c02159d2e0d76e31c
Author: Alex Heneveld
Date: 2017-02-16T09:32:01Z

a retry and extra logging when bind exception happens to fix non-det test failures on server, due to bind conflicts
[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes
GitHub user ahgittin closed the pull request at: https://github.com/apache/brooklyn-server/pull/560
[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes
GitHub user neykov commented on a diff in the pull request:

https://github.com/apache/brooklyn-server/pull/560#discussion_r101483274

--- Diff: launcher/src/main/java/org/apache/brooklyn/launcher/BrooklynWebServer.java ---
@@ -447,7 +450,15 @@ public synchronized void start() throws Exception {
         rootContext.setTempDirectory(Os.mkdirs(new File(webappTempDir, "war-root")));
         server.setHandler(handlers);
-        server.start();
+        try {
+            server.start();
+        } catch (BindException e) {
+            // port discovery routines may take some time to clear, e.g. 250ms for SO_TIMEOUT
+            // tests fail because of this; see if adding a delay improves things
+            log.warn("Initial server start-up failed binding (retrying after a delay): "+e);
+            Time.sleep(Duration.millis(500));
+            server.start();
--- End diff --

Just binding to a socket (the `isPortAvailable` check) won't result in a `TIME_WAIT` state. `TIME_WAIT` is the result of a previously closed connection (from a previous test) or of a client trying to connect in the small window while the probe socket is open.

One thing that will reduce false positives in the tests is removing [`setReuseAddress(true)`](https://github.com/apache/brooklyn-server/blob/master/utils/common/src/main/java/org/apache/brooklyn/util/net/Networking.java#L105). The `SO_TIMEOUT=250` in the same method isn't used at all, since it applies to reads, which we don't do here, so it can go away as well.

The `TIME_WAIT` timeout could be in the minutes range. Instead of papering over the problem, I suggest we go for the root cause directly; the Apache servers are an excellent test bed for it. I can do the networking changes if you prefer?

LGTM other than this.
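To make the socket options in this comment concrete, here is a minimal sketch of a bind-only port probe. It is not Brooklyn's `Networking.isPortAvailable`: the class `PortProbe` and the `reuseAddress` parameter are hypothetical and exist only for illustration. Per the `java.net.ServerSocket` documentation, enabling `SO_REUSEADDR` before `bind` allows the bind to succeed while a previous connection is still in a timeout state, and `setSoTimeout` on a `ServerSocket` only bounds `accept()`, which a bind-only probe never calls, so it is left out.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch, not the Brooklyn implementation.
final class PortProbe {

    /**
     * Returns true if a listening socket can be bound to the given port on
     * the wildcard address. The reuseAddress flag is applied before bind,
     * which is the only point at which SO_REUSEADDR has an effect.
     */
    static boolean isPortAvailable(int port, boolean reuseAddress) {
        try (ServerSocket probe = new ServerSocket()) {
            probe.setReuseAddress(reuseAddress);
            // No setSoTimeout here: it would only limit accept(), which is never called.
            probe.bind(new InetSocketAddress(port));
            return true;
        } catch (IOException e) {
            // Covers BindException ("Address already in use") and other bind failures.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hold an ephemeral port, then probe it while it is still in use.
        try (ServerSocket held = new ServerSocket(0)) {
            int port = held.getLocalPort();
            System.out.println("port " + port + " available while held? "
                    + isPortAvailable(port, false));
        }
    }
}
```

If the probe and the real server use different `SO_REUSEADDR` settings, the two can disagree about whether a port with lingering `TIME_WAIT` connections is free, which is one way a probe could report a port as available that the server then fails to bind.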
[GitHub] brooklyn-server pull request #560: Attempted non-determinite test fixes
GitHub user ahgittin opened a pull request:

https://github.com/apache/brooklyn-server/pull/560

Attempted non-determinite test fixes

Prompted by the following (from https://builds.apache.org/job/brooklyn-server-pull-requests/1730/ ):

```
Test Result (3 failures / +3)
org.apache.brooklyn.launcher.CleanOrphanedLocationsIntegrationTest.testCleanedCopiedPersistedState
org.apache.brooklyn.launcher.CleanOrphanedLocationsIntegrationTest.testSelectionWithOrphanedLocationsInData
org.apache.brooklyn.launcher.BrooklynLauncherTest.testErrorsCaughtByApiAndRestApiWorks
```

The first two have been observed in several places, and might be fixed by the first commit here. The third one is less frequent, but the second commit here might help and should improve logging.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ahgittin/brooklyn-server orphan-test-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/brooklyn-server/pull/560.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #560

commit d07c2af901d50f7a2c2767f1175bf990c327489c
Author: Alex Heneveld
Date: 2017-02-16T09:09:33Z

attempt to fix non-det failures by forcing a persist

commit 08608b1132bc42319a1f216c02159d2e0d76e31c
Author: Alex Heneveld
Date: 2017-02-16T09:32:01Z

a retry and extra logging when bind exception happens to fix non-det test failures on server, due to bind conflicts