[ 
https://issues.apache.org/jira/browse/GEODE-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126939#comment-17126939
 ] 

ASF GitHub Bot commented on GEODE-7591:
---------------------------------------

kirklund commented on a change in pull request #5182:
URL: https://github.com/apache/geode/pull/5182#discussion_r436033017



##########
File path: 
geode-core/src/distributedTest/java/org/apache/geode/distributed/internal/ClusterDistributionManagerDUnitTest.java
##########
@@ -370,6 +371,33 @@ public void testWaitForViewInstallation() {
         .untilAsserted(() -> 
assertThat(waitForViewInstallationDone.get()).isTrue());
   }
 
+  /**
+   * show that waitForViewInstallation works as expected when distribution 
manager is closed
+   * while waiting for the latest membership view to install
+   */
+  @Test
+  public void testWaitForViewInstallationDisconnectDS() {
+    InternalDistributedSystem system = getSystem();
+    ClusterDistributionManager dm = (ClusterDistributionManager) 
system.getDM();
+    MembershipView<InternalDistributedMember> view = 
dm.getDistribution().getView();
+
+    AtomicBoolean waitForViewInstallationDone = new AtomicBoolean();
+    executorService.submit(() -> {
+      try {
+        dm.waitForViewInstallation(view.getViewId() + 1);
+        waitForViewInstallationDone.set(true);
+      } catch (InterruptedException e) {
+        errorCollector.addError(e);
+      }
+    });
+
+    await().timeout(2000, TimeUnit.MILLISECONDS);

Review comment:
       I don't think await() without an until of some sort does anything. It's 
the until clauses that start the waiting. I recommend using a CyclicBarrier to 
sync up the actions of the two threads and a Future to wait for completion:
   ```
   CyclicBarrier cyclicBarrier = new CyclicBarrier(2);
   
   Future<Void> future = executorService.submit(() -> {
     try {
       cyclicBarrier.await(getTimeout().toMillis(), MILLISECONDS);
       dm.waitForViewInstallation(view.getViewId() + 1);
     } catch (InterruptedException e) {
       errorCollector.addError(e);
     }
   });
   
   cyclicBarrier.await(getTimeout().toMillis(), MILLISECONDS);
   system.disconnect();
   
   future.getTimeout().toMillis(), MILLISECONDS);
   ```
   There's still no guarantee that the executor thread will actually be deep 
inside dm.waitForViewInstallation when the main thread calls disconnect, but 
there will be some percentage of CI runs that are hitting this scenario as 
desired. So as long as the test remains passing 100% of the time, we're 
confident enough that the scenario is tested and passing.
   
   To really guarantee the exact combination of two threads being in specific 
places would require some crazy test hooks down in dm and/or system which would 
probably require some unnatural acts of coding (not recommended). Another 
approach is to write an IntegrationTest or UnitTest in which some of the 
classes involved are mocks or spies but even then it's nice to have the dunit 
coverage as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> potential hang
> --------------
>
>                 Key: GEODE-7591
>                 URL: https://issues.apache.org/jira/browse/GEODE-7591
>             Project: Geode
>          Issue Type: Improvement
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Assignee: Jakov Varenina
>            Priority: Major
>
> This method in ClusterDistributionManager waits for a new membership view to 
> be installed, but if the cache is being closed while waiting the method could 
> hang because it only checks for cache closure if the object it's waiting on 
> is notified.  We should change the wait() to have a timeout so that the 
> `stopper` is polled periodically
> {code:java}
> void waitForViewInstallation(long id) throws InterruptedException {
>   if (id <= membershipViewIdAcknowledged) {
>     return;
>   }
>   synchronized (membershipViewIdGuard) {
>     while (membershipViewIdAcknowledged < id && 
> !stopper.isCancelInProgress()) {
>       if (logger.isDebugEnabled()) {
>         logger.debug("waiting for view {}.  Current DM view processed by all 
> listeners is {}", id,
>             membershipViewIdAcknowledged);
>       }
>       membershipViewIdGuard.wait();
>     }
>   }
> }
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to