Bill Burcham created GEODE-8589:
-----------------------------------

             Summary: make 
locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted test deterministic
                 Key: GEODE-8589
                 URL: https://issues.apache.org/jira/browse/GEODE-8589
             Project: Geode
          Issue Type: Improvement
          Components: membership
            Reporter: Bill Burcham


A recent commit added a new test: 
{{MembershipIntegrationTest.locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}}

That's a good test. But it would be better if it ran faster and was less 
susceptible to timing issues. The problem is that the logic we are trying to 
test, {{GMSJoinLeave.leave()}} uses {{System.currentTimeMillis()}} to get the 
current time and it uses {{Thread.sleep()}} to sleep:

{code:java}
        long now = System.currentTimeMillis();
...
            if (state.joinedMembersContacted <= 0 && ((now >= locatorGiveUpTime 
&&
                tries >= minimumRetriesBeforeBecomingCoordinator) ||
                state.locatorsContacted >= locators.size())) {
...
            if (System.currentTimeMillis() > giveupTime) {
              break;
            }
...
              Thread.sleep(retrySleep);
...
              giveupTime = System.currentTimeMillis() + timeout;
...
      if (!this.isJoined) {
        logger.debug("giving up attempting to join the distributed system after 
"
            + (System.currentTimeMillis() - startTime) + "ms");
      }
...
      if (!this.isJoined && state.hasContactedAJoinedLocator) {
        throw new MemberStartupException("Unable to join the distributed system 
in "
            + (System.currentTimeMillis() - startTime) + "ms");
      }
{code}

The opportunity is to _inject_ objects that handle these two functions (time 
keeping and sleeping). This will enable us to artifically control these in our 
test! Sleeping doesn't have to take any time at all. And we can make time pass 
as quickly as we want to.

For an example of how this can be done, see [PR 
#5422|https://github.com/apache/geode/pull/5422] for GEODE-6950.

This particular test 
({{locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}}) is a little 
bit more involved than that one in that more objects are involved. In addition 
to {{GMSJoinLeave}}, other classes involved in the test also need current time 
and need to sleep.

When this ticket is complete, {{GMSJoinLeave}} and TBD other classes will have

* functional interfaces {{MillisecondProvider}} and {{Sleeper}}, currently 
defined inside {{PrimaryHandler}} will be moved higher in the (internal) 
hierarchy for use by other membership classes
* {{GMSJoinLeave}} will take a {{MillisecondProvider}} and {{Sleeper}} in its 
constructor and will delegate to those instead of calling 
{{System.currentTimeMillis()}} and {{Thread.sleep()}} directly
* TBD other classes may require injection of {{MillisecondProvider}} and 
{{Sleeper}}
* TBD other changes may be necessary in cases where collaborating classes 
currently spin up threads or synchronize between threads e.g. calling {{wait()}}
* {{locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}} will no 
longer employ {{Thread.sleep()}} nor will it call {{CompletableFuture.get(long 
timeout, TimeUnit unit)}}—instead it will operate deterministically (ideally in 
the same thread but not necessarily) with respect to the unit under test.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to