[jira] [Commented] (GEODE-10017) Fix new ITs unstability for TCs that involve members restart

ASF GitHub Bot (Jira) Wed, 02 Mar 2022 07:58:06 -0800


    [ https://issues.apache.org/jira/browse/GEODE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500255#comment-17500255 ]


ASF GitHub Bot commented on GEODE-10017:
----------------------------------------

pdxcodemonkey commented on a change in pull request #919:
URL: https://github.com/apache/geode-native/pull/919#discussion_r817836093



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -324,7 +324,9 @@ TEST(FunctionExecutionTest, testThatFunctionExecutionThrowsExceptionNonHA) {
       .withType("PARTITION")
       .execute();
 
-  cluster.getServers()[2].stop();
+  auto &targetServer = cluster.getServers()[2];
+  targetServer.stop();
+  targetServer.wait();

Review comment:
       Why this separation of concerns between stop() and wait()?  Is there 
ever a circumstance where we call stop() and don't intend for the process to 
actually be terminated yet when the function returns?

##########
File path: cppcache/integration/test/PdxInstanceTest.cpp
##########
@@ -395,9 +395,8 @@ TEST(PdxInstanceTest, testInstancePutAfterRestart) {
     cv_status.wait(lock, [&status] { return !status; });
   }
 
-  std::this_thread::sleep_for(std::chrono::seconds{30});
-
   for (auto& server : cluster.getServers()) {
+    server.wait();

Review comment:
       Okay I see this used separately here in several places, but... would it 
be necessary to call wait() in the other places if stop() just actually made 
sure the process was terminated?
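
To make the question concrete, here is a minimal sketch of the two contracts being compared. FakeServer is a hypothetical stand-in (a background thread plays the role of the server process); it is not the geode-native test framework's Server class, and stopAndWait() is an illustrative name rather than an existing API.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a server handle in an integration test.
// A background thread simulates the server process.
class FakeServer {
 public:
  FakeServer()
      : worker_([this] {
          // "Run" until a stop is requested.
          while (!stopRequested_.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds{5});
          }
          // Simulated shutdown latency: the process does not die instantly.
          std::this_thread::sleep_for(std::chrono::milliseconds{50});
          running_.store(false);
        }) {}

  ~FakeServer() {
    stop();
    wait();
  }

  FakeServer(const FakeServer&) = delete;
  FakeServer& operator=(const FakeServer&) = delete;

  // Asynchronous contract: requests shutdown and returns immediately.
  void stop() { stopRequested_.store(true); }

  // Blocks until the simulated process has terminated.
  void wait() {
    if (worker_.joinable()) worker_.join();
  }

  // Synchronous contract: the process is gone once this returns.
  void stopAndWait() {
    stop();
    wait();
    assert(!running_.load());
  }

  bool isRunning() const { return running_.load(); }

 private:
  // Declared before worker_ so they are initialized before the thread starts.
  std::atomic<bool> stopRequested_{false};
  std::atomic<bool> running_{true};
  std::thread worker_;
};
```

With the synchronous form, a test could restart the server right after the call without a fixed sleep. Keeping stop() and wait() separate only pays off if a caller wants to request shutdown of several servers first and then wait for all of them; whether that case exists here is the open question.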





> Fix new ITs unstability for TCs that involve members restart
> ------------------------------------------------------------
>
>                 Key: GEODE-10017
>                 URL: https://issues.apache.org/jira/browse/GEODE-10017
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>            Reporter: Mario Salazar de Torres
>            Assignee: Mario Salazar de Torres
>            Priority: Major
>              Labels: needsTriage, pull-request-available
>
> *GIVEN* an integration TC in which a server needs to be restarted
> *WHEN* the server is starting up again
> *THEN* +it might+ happen that the TC gets stuck
> ----
> *Additional information.* This issue does not always happen, and I've seen 
> it more frequently with the latest version of Geode server (1.15.0).
> Some examples of affected TCs are:
>  * RegisterKeysTest.RegisterKeySetAndClusterRestart
>  * PartitionRegionWithRedundancyTest.putgetWithSingleHop
>  * ···
> Also, this is normally the exec flow for the TCs that get stuck:
>  # Setup cluster
>  # Do TC specific ops
>  # Stop server(s)/Cluster shutdown
>  # Start server(s)
> In all cases, the server gets stuck at step 4.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
