devmadhuu commented on code in PR #5614:
URL: https://github.com/apache/ozone/pull/5614#discussion_r1398676547
##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/node/TestQueryNode.java:
##########
@@ -114,25 +117,32 @@ public void testHealthyNodesCount() throws Exception {
}
@Test
- @Timeout(10)
public void testStaleNodesCount() throws Exception {
- cluster.shutdownHddsDatanode(0);
- cluster.shutdownHddsDatanode(1);
-
- GenericTestUtils.waitFor(() ->
- cluster.getStorageContainerManager().getNodeCount(STALE) == 2,
- 100, 4 * 1000);
-
- int nodeCount = scmClient.queryNode(null, STALE,
- HddsProtos.QueryScope.CLUSTER, "").size();
- assertEquals(2, nodeCount, "Mismatch of expected nodes count");
+ CompletableFuture.runAsync(() -> {
+ cluster.shutdownHddsDatanode(0);
+ cluster.shutdownHddsDatanode(1);
+ }, executor);
+ GenericTestUtils.waitFor(() -> numOfDatanodes -
+ cluster.getStorageContainerManager().getScmNodeManager()
+ .getNodeCount(NodeStatus.inServiceHealthy()) >= 1, 100, 10 * 1000);
Review Comment:
> If I understand the fix correctly, the problem was that one or both nodes
may have already become dead by the time we checked for 2 stale nodes.
>
> Do we need this additional wait, or is the "stale + dead == 2" wait below
enough?
Thanks @adoroszlai for the review. I have removed this extra wait on
unhealthy nodes. The "stale + dead == 2" wait alone is sufficient.
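
For reference, the reasoning behind the "stale + dead == 2" condition can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code in the PR: `FakeNodeCounts`, `NodeState`, `waitFor` and `twoNodesDown` are hypothetical stand-ins for the SCM node manager and `GenericTestUtils.waitFor`, and no Ozone classes are used. It shows why counting stale and dead nodes together tolerates a node that has already moved from STALE to DEAD before the check runs.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class StaleOrDeadWait {
    // Hypothetical node states; the real ones live in the HDDS protos.
    enum NodeState { HEALTHY, STALE, DEAD }

    // Hypothetical stand-in for the node manager's per-state counters.
    static final class FakeNodeCounts {
        private final Map<NodeState, Integer> counts = new EnumMap<>(NodeState.class);

        FakeNodeCounts(int healthy) {
            for (NodeState s : NodeState.values()) {
                counts.put(s, 0);
            }
            counts.put(NodeState.HEALTHY, healthy);
        }

        synchronized void move(NodeState from, NodeState to) {
            if (counts.get(from) == 0) {
                throw new IllegalStateException("no node in state " + from);
            }
            counts.merge(from, -1, Integer::sum);
            counts.merge(to, 1, Integer::sum);
        }

        synchronized int get(NodeState s) {
            return counts.get(s);
        }

        // Read both counters under one lock so a STALE -> DEAD move
        // cannot be observed halfway through.
        synchronized int staleOrDead() {
            return counts.get(NodeState.STALE) + counts.get(NodeState.DEAD);
        }
    }

    // Polls until the check passes; returns false on timeout or interrupt.
    static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!check.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    // The condition under discussion: two nodes are down, whether stale or dead.
    static boolean twoNodesDown(FakeNodeCounts c) {
        return c.staleOrDead() == 2;
    }

    public static void main(String[] args) {
        FakeNodeCounts c = new FakeNodeCounts(3);
        c.move(NodeState.HEALTHY, NodeState.STALE);
        c.move(NodeState.HEALTHY, NodeState.STALE);
        c.move(NodeState.STALE, NodeState.DEAD); // one node has already gone dead
        System.out.println("stale == 2: " + (c.get(NodeState.STALE) == 2));
        System.out.println("stale + dead == 2: " + waitFor(() -> twoNodesDown(c), 10, 1000));
    }
}
```

In this sketch a wait on `STALE == 2` alone would time out once either node has been marked dead, while the combined count stays at 2 regardless of how far each node has progressed.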
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]