[ 
https://issues.apache.org/jira/browse/IGNITE-26547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029607#comment-18029607
 ] 

Vladislav Pyatkov commented on IGNITE-26547:
--------------------------------------------

h2. Test Plan
h3. Unit Test

Test the CmgRaftService and MetaStorageServiceImpl services in isolation, 
without starting a full cluster.
h4. CmgRaftService Tests — Majority Lost
 # Deploy CMG on three nodes and verify that all API requests work correctly.
 # Stop the majority (2 out of 3 nodes).
 # Every API method must return a CompletableFuture that completes 
exceptionally with a SystemGroupUnavailableException (the message must clearly 
identify the CMG as the unavailable group).
 # Start one of the stopped nodes (the majority is restored).
 # Verify that all API methods function correctly again.
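
The check in step 3 could be expressed with a small helper like the one below. This is only a sketch: the SystemGroupUnavailableException class here is a local stand-in for the exception named in the plan, and the helper name and the 5-second timeout are assumptions, not part of the existing code base.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class UnavailableAssertions {
    /** Hypothetical stand-in for the exception named in the test plan. */
    public static class SystemGroupUnavailableException extends RuntimeException {
        public SystemGroupUnavailableException(String message) {
            super(message);
        }
    }

    /**
     * Returns true when the future completes exceptionally within the timeout with a
     * SystemGroupUnavailableException whose message mentions the given group name.
     */
    public static boolean failsWithGroupUnavailable(CompletableFuture<?> future, String groupName) {
        try {
            future.get(5, TimeUnit.SECONDS);
            return false; // Completed normally, so the group was available.
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            return cause instanceof SystemGroupUnavailableException
                    && cause.getMessage() != null
                    && cause.getMessage().contains(groupName);
        } catch (TimeoutException e) {
            return false; // The call hung instead of failing fast.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A test would call the helper once per API method, passing "CMG" or "MG" depending on the service under test.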

h4. CmgRaftService Tests — All Nodes Unavailable
 # Deploy CMG on three nodes and verify that all API requests work correctly.
 # Stop all CMG nodes.
 # Every API method must return a CompletableFuture that completes 
exceptionally with a SystemGroupUnavailableException (the message must clearly 
identify the CMG as the unavailable group).
 # Bring all three nodes back online (CMG is restored).
 # Verify that all API methods work correctly again.

h4. MetaStorageServiceImpl Tests — Majority Lost
 # Deploy MG on three nodes and verify that all API requests work correctly.
 # Stop the majority (2 out of 3 nodes).
 # Every API method must return a CompletableFuture that completes 
exceptionally with a SystemGroupUnavailableException (the message must clearly 
identify the MG as the unavailable group).
 # Start one of the stopped nodes (the majority is restored).
 # Verify that all API methods work correctly again.

h4. MetaStorageServiceImpl Tests — All Nodes Unavailable
 # Deploy MG on three nodes and verify that all API requests work correctly.
 # Stop all MG nodes.
 # Every API method must return a CompletableFuture that completes 
exceptionally with a SystemGroupUnavailableException (the message must clearly 
identify the MG as the unavailable group).
 # Bring all three nodes back online (MG is restored).
 # Verify that all API methods work correctly again.

h3. Integration Testing
h4. RO Transactions (Implicit)
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone (specifying a data node filter that includes only 
nodes 2 and 3), create a table, and preload data.
 # Stop the MG node.
 # Attempt to retrieve all records (SELECT * FROM table); a 
SystemGroupUnavailableException should be thrown (the message should clearly 
indicate that it refers to the MG).

h4. RW Transactions (Implicit)
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone (specifying a data node filter that includes only 
nodes 2 and 3), create a table, and preload data.
 # Stop the MG node.
 # Attempt to insert an additional record (INSERT INTO table (id, val)); a 
SystemGroupUnavailableException should be thrown (the message should clearly 
indicate that it refers to the MG).

h4. RO Transactions (Explicit)
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone (specifying a data node filter that includes only 
nodes 2 and 3), create a table, and preload data.
 # Stop the MG node.
 # Attempt to retrieve all records within an explicit transaction (SELECT * 
FROM table); a SystemGroupUnavailableException should be thrown (the message 
should clearly indicate that it refers to the MG), and the transaction should 
then be rolled back.

h4. RW Transactions (Explicit)
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone (specifying a data node filter that includes only 
nodes 2 and 3), create a table, and preload data.
 # Stop the MG node.
 # Attempt to insert an additional record within an explicit transaction 
(INSERT INTO table (id, val)); a SystemGroupUnavailableException should be 
thrown (the message should clearly indicate that it refers to the MG), and the 
transaction should then be rolled back.
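
The "exception, then rollback" expectation of the explicit-transaction cases might look roughly like this. The Tx interface, RecordingTx, and the local exception class are hypothetical stand-ins used only to keep the sketch self-contained; they are not the Ignite transaction API.

```java
import java.util.ArrayList;
import java.util.List;

public class ExplicitTxSketch {
    /** Hypothetical stand-in for the exception named in the test plan. */
    public static class SystemGroupUnavailableException extends RuntimeException {
        public SystemGroupUnavailableException(String message) {
            super(message);
        }
    }

    /** Hypothetical minimal transaction handle (not the real Ignite interface). */
    public interface Tx {
        void commit();

        void rollback();
    }

    /** Runs the operation; rolls back and rethrows if the system group is unavailable. */
    public static void runInTx(Tx tx, Runnable operation) {
        try {
            operation.run();
        } catch (SystemGroupUnavailableException e) {
            tx.rollback();
            throw e;
        }
        tx.commit();
    }

    /** Records which terminal call was made, so a test can assert on it. */
    public static class RecordingTx implements Tx {
        public final List<String> calls = new ArrayList<>();

        @Override
        public void commit() {
            calls.add("commit");
        }

        @Override
        public void rollback() {
            calls.add("rollback");
        }
    }
}
```

In the real test the operation would be the SELECT or INSERT from step 4, and the assertion would be that the transaction ends up rolled back and never committed.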

h4. Non-Transactional Calls via Public API — Retrieving Tables or a Specific 
Table
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone, create a table, and preload data.
 # Stop the MG node.
 # Attempt to retrieve the table via the API 
(ignite.tables().table("table_name")); a SystemGroupUnavailableException should 
be thrown (the message should clearly indicate that it refers to the MG).

h4. Non-Transactional Calls via Public API — Database Object Creation
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone, create a table, and preload data.
 # Stop the MG node.
 # Attempt to create a table (CREATE TABLE IF NOT EXISTS TEST(ID INT PRIMARY 
KEY, NAME VARCHAR) ZONE TEST_ZONE); a SystemGroupUnavailableException should be 
thrown.

h4. System Procedures Using CMG and/or MG — Adding a Node to the Topology
 # Start three nodes, with the CMG hosted on one of them.
 # Create a distribution zone, create a table, and preload data.
 # Stop the CMG node.
 # Attempt to start a fourth node (which was not previously part of the 
cluster). The join is expected to hang while the CMG is unavailable.
 # Bring the CMG node back online.
 # Verify that the fourth node joins the topology and that the cluster now has 
four nodes.
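
One way to assert that the join is still hanging (rather than failing) is to wait on its future with a short timeout. The sketch below assumes the node start is exposed as a CompletableFuture; the helper name is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class JoinHangSketch {
    /** Returns true if the future is still incomplete after waiting for the given time. */
    public static boolean stillPending(CompletableFuture<?> future, long millis) {
        try {
            future.get(millis, TimeUnit.MILLISECONDS);
            return false; // Completed normally.
        } catch (TimeoutException e) {
            return true; // Still hanging, as expected while the CMG is down.
        } catch (ExecutionException e) {
            return false; // Failed instead of hanging.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }
}
```

The test would expect stillPending to return true while the CMG node is stopped and false once the CMG node is back and the join has completed.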

h4. System Procedures Using CMG and/or MG — Deferred Scale Up/Down Adjust
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone (specifying a data node filter and sufficiently 
large values for data_nodes_auto_adjust_scale_up and 
data_nodes_auto_adjust_scale_down), create a table, and preload data.
 # Remove node 2 and add node 4; both nodes must match the filter criteria.
 # Verify that no adjustment has occurred yet (data nodes = \{2,3}).
 # Stop the MG node.
 # Wait long enough for both auto-adjust timers to expire.
 # Bring the MG node back online.
 # Wait long enough and verify that the auto-adjust has completed (data 
nodes = \{3,4}).
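
The "wait long enough and verify" steps are probably best implemented as polling with a deadline instead of a fixed sleep. A minimal sketch follows; the helper name and its parameters are assumptions.

```java
import java.util.function.BooleanSupplier;

public class AwaitSketch {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time.
     */
    public static boolean waitForCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // Timed out without the condition holding.
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

For the last step, the condition would compare the zone's current data nodes with the expected set \{3,4}; how the data nodes are read depends on the test harness and is not shown here.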

h4. Cluster Restart
 # Start three nodes, with the MG hosted on one of them.
 # Create a distribution zone, create a table, and preload data.
 # Query the data and verify that all preloaded records are available.
 # Stop the cluster nodes in random order, introducing a delay of 1–5 seconds 
between each shutdown.
 # Start all cluster nodes again, introducing a delay of 1–5 seconds between 
each startup.
 # Repeat steps 3–5 several times.
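
The randomized shutdown order and delays from step 4 could be generated up front, so that a failing run can be reproduced from the random seed. This is only an illustrative sketch; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RestartPlanSketch {
    /** One planned action: the node to stop and the pause before the next action. */
    public static final class Step {
        public final int node;
        public final long delayMillis;

        Step(int node, long delayMillis) {
            this.node = node;
            this.delayMillis = delayMillis;
        }
    }

    /** Shuffles the nodes and assigns each a delay between 1 and 5 seconds. */
    public static List<Step> shutdownPlan(List<Integer> nodes, Random random) {
        List<Integer> order = new ArrayList<>(nodes);
        Collections.shuffle(order, random);

        List<Step> plan = new ArrayList<>();
        for (int node : order) {
            long delay = 1000 + random.nextInt(4001); // 1000..5000 ms inclusive
            plan.add(new Step(node, delay));
        }
        return plan;
    }
}
```

Logging the seed passed to Random at the start of each iteration makes it possible to replay the exact order and delays that led to a failure.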

> Extend test coverage for CMG/MG restarts
> ----------------------------------------
>
>                 Key: IGNITE-26547
>                 URL: https://issues.apache.org/jira/browse/IGNITE-26547
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladislav Pyatkov
>            Assignee: Vladislav Pyatkov
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.2
>
>
> h3. Motivation
> Cases of unavailability of system replication groups (CMG, MG) can occur in 
> the cluster involuntarily, for example, when nodes leave the topology, as 
> well as intentionally when the topology is modified by a user with 
> reassignment of system group nodes.
> Currently, the system behavior is undefined when any of these system groups 
> are unavailable, and tests for this scenario are absent.
> h3. Definition of done
> A set of tests should be created to verify the system behavior under system 
> group unavailability.
> Implementation will be carried out after the design and support are provided 
> by the system.


