hanahmily commented on issue #13229:
URL: https://github.com/apache/skywalking/issues/13229#issuecomment-3624046526

   You need to create a new integration test that verifies data replication 
works correctly when a data node fails. This involves defining a group with 
`replication > 1`, writing data using existing helpers, and managing the 
cluster lifecycle to simulate a node failure.
   
   Here is the detailed implementation guide:
   
   ### 1. Define a Group with Replication > 1
   
   Existing test groups in `pkg/test/stream/testdata` and 
`pkg/test/measure/testdata/groups` likely use the default replication (usually 
1). You need to create a new group definition file (or modify an existing one 
for this specific test) that explicitly sets the replication factor.
   
     * **Location:** Create a new YAML file, for example 
`pkg/test/measure/testdata/groups/replicated.yaml` (or the `stream` 
equivalent).
     * **Content:** Add the `replica_num` field under `resource_opts`.
   
   ```yaml
   # Example: pkg/test/measure/testdata/groups/replicated.yaml
   metadata:
     name: "replicated_group"
   catalog: MEASURE # or STREAM
   resource_opts:
     shard_num: 2
     segment_interval:
       unit: UNIT_DAY
       num: 1
     ttl:
       unit: UNIT_DAY
       num: 7
     # This is the key change required by the issue
     replica_num: 2 
   ```
   
   ### 2. Reuse `init.go` to Write Data
   
   The file `test/cases/init.go` likely contains helper functions to load these 
YAML definitions and write sample data to the server. You should leverage these 
functions to avoid rewriting data ingestion logic.
   
     * **Import the package:** 
`github.com/apache/skywalking-banyandb/test/cases`
     * **Usage:** Look for a helper that is called along the lines of 
`cases.Write(conn, "path/to/testdata")`. The exact name and signature may 
differ, so check `init.go` first.
     * **Implementation:** In your test, after the cluster is up, call this 
function pointing to your new `replicated_group` data.
   
   ### 3. Implement the Integration Test (Start Cluster & Stop Node)
   
   Create a new test file (e.g., 
`test/integration/distributed/replication/replication_test.go`, matching the 
layout in Section 4). Use `test/integration/handoff/handoff_suite_test.go` as 
your template because it already handles cluster bootstrapping.
   
   **Key Steps in the Test:**
   
   1.  **Bootstrap Cluster:** Start a BanyanDB cluster with **3 Data Nodes**. 
Three nodes let you stop one and still leave two running, which is enough to 
hold both replicas of the group.
   2.  **Create Resources:** Apply the group schema created in Step 1.
   3.  **Write Data:** Use the helper from Step 2.
   4.  **Simulate Failure:** Stop one of the data nodes programmatically.
   5.  **Verification:** Query the data to ensure it is still available from 
the remaining nodes.
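   
   The reasoning behind the 3-node setup can be checked with a toy model. The 
sketch below is a self-contained simulation, not BanyanDB code: the `cluster` 
type, the round-robin placement in `owners`, and the helper 
`survivesAnySingleFailure` are all hypothetical and only assume that each 
shard's copies land on distinct nodes. It writes every shard, stops each node in 
turn, and reports whether all shards stay readable.
   
   ```go
   package main

   import "fmt"

   // cluster is a toy in-memory model (hypothetical, not BanyanDB's placement logic).
   type cluster struct {
   	alive  []bool           // alive[i] reports whether node i is running
   	copies int              // number of copies kept per shard (assumed <= node count)
   	data   []map[int]string // data[i] maps shard ID to the value stored on node i
   }

   func newCluster(nodes, copies int) *cluster {
   	c := &cluster{alive: make([]bool, nodes), copies: copies, data: make([]map[int]string, nodes)}
   	for i := range c.alive {
   		c.alive[i] = true
   		c.data[i] = map[int]string{}
   	}
   	return c
   }

   // owners places a shard on `copies` consecutive nodes, starting at shard % nodes.
   func (c *cluster) owners(shard int) []int {
   	out := make([]int, 0, c.copies)
   	for k := 0; k < c.copies; k++ {
   		out = append(out, (shard+k)%len(c.alive))
   	}
   	return out
   }

   // write stores the value on every running owner of the shard.
   func (c *cluster) write(shard int, value string) {
   	for _, n := range c.owners(shard) {
   		if c.alive[n] {
   			c.data[n][shard] = value
   		}
   	}
   }

   // read returns the value from the first running owner that has it.
   func (c *cluster) read(shard int) (string, bool) {
   	for _, n := range c.owners(shard) {
   		if !c.alive[n] {
   			continue
   		}
   		if v, ok := c.data[n][shard]; ok {
   			return v, true
   		}
   	}
   	return "", false
   }

   // stop simulates a node failure.
   func (c *cluster) stop(node int) { c.alive[node] = false }

   // survivesAnySingleFailure writes all shards, stops one node, and checks that
   // every shard is still readable. It repeats this for each possible victim.
   func survivesAnySingleFailure(nodes, copies, shards int) bool {
   	for victim := 0; victim < nodes; victim++ {
   		c := newCluster(nodes, copies)
   		for s := 0; s < shards; s++ {
   			c.write(s, fmt.Sprintf("v%d", s))
   		}
   		c.stop(victim)
   		for s := 0; s < shards; s++ {
   			if _, ok := c.read(s); !ok {
   				return false
   			}
   		}
   	}
   	return true
   }

   func main() {
   	fmt.Println(survivesAnySingleFailure(3, 2, 2)) // two copies per shard
   	fmt.Println(survivesAnySingleFailure(3, 1, 2)) // one copy per shard
   }
   ```
   
   In this model, two copies per shard survive the loss of any single node, 
while one copy does not. The real test should make the same comparison against 
an actual cluster: query after stopping a node and expect the same results as 
before the failure.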
   
   To implement this, you could refer to 
https://github.com/apache/skywalking-banyandb/blob/main/test/integration/handoff/handoff_suite_test.go.
   
   ### 4. Directory Structure
   
   Create the following folder structure:
   
   ```text
   skywalking-banyandb/
   └── test/
       └── integration/
           └── distributed/
               └── replication/
                   ├── replication_suite_test.go
                   └── replication_test.go
   ```
   
   

