We haven't converted flakyTest to run in parallel containers yet. I'll add a story for that.
--Jens

On Wed, Aug 17, 2016 at 10:53 AM, Bruce Schuchardt <bschucha...@pivotal.io> wrote:

> The membership-port-range changes have been checked in for a while now.
>
> On 8/17/2016 at 10:51 AM, Kirk Lund wrote:
>
>> Have we made all of the changes that we think will help prevent
>> BindException failures?
>>
>> Last night's nightly build failed with one again:
>>
>> :geode-core:flakyTest
>>
>> com.gemstone.gemfire.security.ClientAuthenticationDUnitTest > testCredentialsForNotifications FAILED
>>     com.gemstone.gemfire.test.dunit.RMIException: While invoking com.gemstone.gemfire.security.ClientAuthenticationTestCase$$Lambda$26/1964608307.call in VM 0 running on Host asf902.gq1.ygridcore.net with 4 VMs
>>         at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:389)
>>         at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:355)
>>         at com.gemstone.gemfire.test.dunit.VM.invoke(VM.java:320)
>>         at com.gemstone.gemfire.security.ClientAuthenticationTestCase.doTestCredentialsForNotifications(ClientAuthenticationTestCase.java:456)
>>         at com.gemstone.gemfire.security.ClientAuthenticationDUnitTest.testCredentialsForNotifications(ClientAuthenticationDUnitTest.java:82)
>>
>>         Caused by:
>>         java.lang.AssertionError: Got unexpected exception when starting server
>>
>>             Caused by:
>>             java.net.BindException: Failed to create server socket on null[60,026]
>>
>>                 Caused by:
>>                 java.net.BindException: Address already in use
>>
>> 193 tests completed, 1 failed, 6 skipped
>>
>> On Thu, Aug 4, 2016 at 10:38 AM, Bruce Schuchardt <bschucha...@pivotal.io> wrote:
>>
>>> I've pushed the port-range changes that I described in my last email on
>>> this subject.
>>>
>>> On 8/1/2016 at 5:33 PM, Kirk Lund wrote:
>>>
>>>> I think that the changes mentioned by Jens and Bruce obviate the need to
>>>> do what I was proposing.
>>>>
>>>> -Kirk
>>>>
>>>> On Fri, Jul 29, 2016 at 3:41 PM, Bruce Schuchardt <bschucha...@pivotal.io> wrote:
>>>>
>>>>> I'm making another change that will help.
>>>>>
>>>>> One of the problems with these tests is that they will choose a random
>>>>> port for a Cache Server or some other component and only use the port
>>>>> after opening a cache. Doing that allows the communications/membership
>>>>> component to grab two ports. AvailablePort restricts the ports it hands
>>>>> out to the range [20000, 30000], so if we restrict the
>>>>> communications/membership component to use ports outside of that range
>>>>> it will help avoid collisions.
>>>>>
>>>>> On 7/29/2016 at 3:23 PM, Nabarun Nag wrote:
>>>>>
>>>>>> +1 for the retry.
>>>>>>
>>>>>> In my opinion, maintaining available port lists may be hard as we move
>>>>>> towards running test modules in parallel. Also, some non-Geode entity
>>>>>> may come up and pick up a port, hence we will need to constantly
>>>>>> refresh/update the list before/after each test run. (10000 ports need
>>>>>> to be checked as per Geode's getRandomWildcardBindPortNumber.)
>>>>>>
>>>>>> Also, for the GEODE-1600 fix, DUnitLauncher now passes 0 as the port
>>>>>> number while creating a locator. The system assigns it an available
>>>>>> port number while starting the server, rather than getting a random
>>>>>> available port number first and then asking things to be started on
>>>>>> that port (race conditions ensue).
>>>>>>
>>>>>> On Fri, Jul 29, 2016 at 2:36 PM William Markito <wmark...@pivotal.io> wrote:
>>>>>>
>>>>>>> Why not create a JUnit rule that returns available ports and keeps
>>>>>>> track of ports being used?
>>>>>>>
>>>>>>> I've cloned this gist from somewhere (don't remember now) but I've
>>>>>>> been planning to send it for discussion...
>>>>>>>
>>>>>>> https://gist.github.com/markito/b5be3fc570c7c7c84e6d09e064a6e898
>>>>>>>
>>>>>>> Still talking about rules, I've played a bit with the TemporaryFolder
>>>>>>> rule and that's very useful as well, especially to clean up after test
>>>>>>> runs and to avoid conflicts.
>>>>>>>
>>>>>>> http://junit.org/junit4/javadoc/4.12/org/junit/rules/TemporaryFolder.html
>>>>>>>
>>>>>>> Just my 2c
>>>>>>>
>>>>>>> On Fri, Jul 29, 2016 at 1:54 PM, Hitesh Khamesra <hitesh...@yahoo.com.invalid> wrote:
>>>>>>>
>>>>>>>> Is there any possibility of running multiple tests at the same time
>>>>>>>> on that machine?
>>>>>>>>
>>>>>>>> -Hitesh
>>>>>>>>
>>>>>>>> From: Kirk Lund <kl...@pivotal.io>
>>>>>>>> To: geode <dev@geode.incubator.apache.org>
>>>>>>>> Sent: Friday, July 29, 2016 1:21 PM
>>>>>>>> Subject: Flaky tests failing with BindException
>>>>>>>>
>>>>>>>> Many of our flaky tests are flaky because they use AvailablePort or
>>>>>>>> AvailablePortHelper to find randomly available ports. They then later
>>>>>>>> fail with a BindException because the port is already in use by the
>>>>>>>> time the test uses it.
>>>>>>>>
>>>>>>>> Here's a proposal for a temporary fix:
>>>>>>>>
>>>>>>>> The module geode-junit contains a JUnit 4 rule called RetryRule. We
>>>>>>>> could modify RetryRule to only retry if a BindException (or
>>>>>>>> configurable exception/s) is detected. This rule would then be dropped
>>>>>>>> into every test that uses AvailablePort or AvailablePortHelper. Then
>>>>>>>> if the test fails with a BindException, it would automatically retry
>>>>>>>> (once or twice or whatever we decide to configure RetryRule with). If
>>>>>>>> the test fails without any detected BindException, then it would just
>>>>>>>> fail without retrying.
>>>>>>>>
>>>>>>>> Opinions on this?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Kirk
>>>>>>>
>>>>>>> --
>>>>>>> ~/William
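
For reference, the port-0 approach mentioned in the thread for the GEODE-1600 fix can be illustrated with plain java.net. This is only a minimal sketch, not Geode code: the class EphemeralPortSketch and its helper methods are hypothetical names made up for this example. It shows a server socket bound to port 0, so the operating system picks a free port at bind time and the caller reads the assigned port back from the socket that is already bound, instead of probing for a free port first and binding to it later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical illustration class; not part of Geode or its test framework.
final class EphemeralPortSketch {

    // Binds to port 0, which asks the OS to choose a free port at bind time.
    // The socket is returned still open, so there is no gap between finding
    // the port and using it in which another process could take it.
    static ServerSocket bindToEphemeralPort() {
        try {
            return new ServerSocket(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Closes the socket, wrapping the checked exception for brevity.
    static void close(ServerSocket socket) {
        try {
            socket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ServerSocket server = bindToEphemeralPort();
        // The assigned port is only known after the bind has succeeded.
        System.out.println("OS assigned port " + server.getLocalPort());
        close(server);
    }
}
```

The trade-off is that a test cannot know the port number up front; it has to ask the started component for it (here via getLocalPort()) and pass that value on to whatever needs to connect.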