rpuch commented on code in PR #4614:
URL: https://github.com/apache/ignite-3/pull/4614#discussion_r1825728926
##########
modules/cli/src/main/java/org/apache/ignite/internal/cli/decorators/ClusterStatusDecorator.java:
##########
@@ -45,4 +46,17 @@ public TerminalOutput decorate(ClusterStatus data) {
data.nodeCount(), fg(Color.RED).mark("not initialized")
));
}
+
+ private static String status(ClusterStatus status) {
+ switch (status) {
+ case MS_MAJORITY_LOST:
+ return fg(Color.RED).mark("Metastore majority lost");
+ case HEALTHY:
+ return fg(Color.GREEN).mark("active");
+ case CMG_MAJORITY_LOST:
+ return fg(Color.RED).mark("CMG majority lost");
Review Comment:
The ordering is a bit strange: why is `HEALTHY` in the middle?
##########
modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterState.java:
##########
@@ -34,24 +36,31 @@
*/
@Schema(description = "Information about current cluster state.")
public class ClusterState {
- @Schema(description = "List of cluster management group nodes. These nodes
are responsible for maintaining RAFT cluster topology.")
+ @Schema(description = "List of cluster management group nodes. These nodes
are responsible for maintaining RAFT cluster topology.",
+ requiredMode = RequiredMode.REQUIRED)
@IgniteToStringInclude
private final Collection<String> cmgNodes;
- @Schema(description = "List of metastorage nodes. These nodes are
responsible for storing RAFT cluster metadata.")
+ @Schema(description = "List of metastorage nodes. These nodes are
responsible for storing RAFT cluster metadata.",
+ requiredMode = RequiredMode.REQUIRED)
@IgniteToStringInclude
private final Collection<String> msNodes;
- @Schema(description = "Version of Apache Ignite that the cluster was
created on.")
+ @Schema(description = "Version of Apache Ignite that the cluster was
created on.", requiredMode = RequiredMode.REQUIRED)
private final String igniteVersion;
- @Schema(description = "Unique tag that identifies the cluster.")
+ @Schema(description = "Unique tag that identifies the cluster.",
+ requiredMode = RequiredMode.REQUIRED)
private final ClusterTag clusterTag;
@Schema(description = "IDs the cluster had before.")
@IgniteToStringInclude
private final @Nullable List<UUID> formerClusterIds;
+ @Schema(description = "Cluster status.",
+ requiredMode = RequiredMode.REQUIRED)
+ private final ClusterStatus clusterStatus;
Review Comment:
It seems that entities from different domains are mixed here. This
`ClusterState` class represents the CMG cluster state that is the state stored
in the CMG related to the cluster itself. But `ClusterStatus` is some volatile
thing, it might change during runtime however it wants, even if the CMG cluster
state does not change.
I believe we should not mix them together. Could a distinct API endpoint
(and CLI command) be added to access this volatile information?
##########
modules/cli/src/main/java/org/apache/ignite/internal/cli/call/cluster/status/ClusterState.java:
##########
@@ -38,15 +39,26 @@ public class ClusterStatus {
private final List<String> metadataStorageNodes;
- private ClusterStatus(int nodeCount, boolean initialized, String name,
- boolean connected, String nodeUrl, List<String> cmgNodes,
List<String> metadataStorageNodes) {
+ private final ClusterStatus clusterStatus;
+
+ private ClusterState(
+ int nodeCount,
+ boolean initialized,
+ String name,
+ boolean connected,
+ String nodeUrl,
+ List<String> cmgNodes,
+ List<String> metadataStorageNodes,
+ ClusterStatus clusterStatus
+ ) {
this.nodeCount = nodeCount;
this.initialized = initialized;
this.name = name;
this.connected = connected;
this.nodeUrl = nodeUrl;
this.cmgNodes = cmgNodes;
this.metadataStorageNodes = metadataStorageNodes;
Review Comment:
How about making defensive copies? Better safe than sorry, and this class
should not have a huge amount of instances
##########
modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterStatus.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.rest.api.cluster;
+
+/**
+ * Current health status of the cluster.
+ */
+public enum ClusterStatus {
+ /**
+ * The cluster is completely healthy. Minor losses in any of the groups
are possible,
+ * but this does not affect the operation of the cluster.
+ */
+ HEALTHY,
+
+ /**
+ * The metastore group has lost its majority. Almost all of the cluster
functions are inoperative.
+ * To restore their operation, it is necessary to return the majority to
the metastore group.
Review Comment:
```suggestion
* To restore their operation, it is necessary to restore the majority
to the metastore group.
```
##########
modules/rest/src/main/java/org/apache/ignite/internal/rest/cluster/ClusterManagementController.java:
##########
@@ -84,16 +113,31 @@ public CompletableFuture<Void> init(@Body InitCommand
initCommand) {
});
}
- private static ClusterState
mapClusterState(org.apache.ignite.internal.cluster.management.ClusterState
clusterState) {
+ private ClusterState
mapClusterState(org.apache.ignite.internal.cluster.management.ClusterState
clusterState) {
return new ClusterState(
clusterState.cmgNodes(),
clusterState.metaStorageNodes(),
clusterState.igniteVersion().toString(),
new ClusterTag(clusterState.clusterTag().clusterName(),
clusterState.clusterTag().clusterId()),
- clusterState.formerClusterIds()
+ clusterState.formerClusterIds(),
+ mapClusterStatus(clusterState)
);
}
+ private ClusterStatus
mapClusterStatus(org.apache.ignite.internal.cluster.management.ClusterState
clusterState) {
+ Set<String> metaStorageNodes = clusterState.metaStorageNodes();
+ long presentMetaStorageNodes = topologyService.allMembers().stream()
+ .map(ClusterNode::name)
+ .filter(metaStorageNodes::contains)
+ .count();
+
+ if (presentMetaStorageNodes <= metaStorageNodes.size() / 2) {
+ return ClusterStatus.MS_MAJORITY_LOST;
Review Comment:
The fact that we have enough nodes in the physical topology does not
necessarily imply we have an MG majority. For instance, something on levels
higher than the network might prevent a leader to be elected. The most accurate
way would be to try executing a read command against the MS group (like reading
its current revision) and handle a timeout.
##########
modules/cli/src/main/java/org/apache/ignite/internal/cli/call/cluster/status/ClusterState.java:
##########
@@ -18,11 +18,12 @@
package org.apache.ignite.internal.cli.call.cluster.status;
import java.util.List;
+import org.apache.ignite.rest.client.model.ClusterStatus;
/**
* Class that represents the cluster status.
*/
-public class ClusterStatus {
+public class ClusterState {
Review Comment:
We already have an entity called 'cluster state' in the CMG domain. This one
is a different entity and yet it's called identically. Could a distinct name be
devised? Something like ClusterRuntimeState, maybe.
##########
modules/rest-api/src/main/java/org/apache/ignite/internal/rest/api/cluster/ClusterStatus.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.rest.api.cluster;
+
+/**
+ * Current health status of the cluster.
+ */
+public enum ClusterStatus {
+ /**
+ * The cluster is completely healthy. Minor losses in any of the groups
are possible,
+ * but this does not affect the operation of the cluster.
+ */
+ HEALTHY,
+
+ /**
+ * The metastore group has lost its majority. Almost all of the cluster
functions are inoperative.
+ * To restore their operation, it is necessary to return the majority to
the metastore group.
+ */
+ MS_MAJORITY_LOST,
+
+ /**
+ * The cluster management group has lost its majority. The cluster is
completely inoperative until the majority is returned.
Review Comment:
```suggestion
* The cluster management group has lost its majority. The cluster is
completely inoperative until the majority is restored.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]