hanahmily commented on issue #13229:
URL: https://github.com/apache/skywalking/issues/13229#issuecomment-3624046526
You need to create a new integration test that verifies data replication
works correctly when a data node fails. This involves defining a group with
`replication > 1`, writing data using existing helpers, and managing the
cluster lifecycle to simulate a node failure.
Here is the detailed implementation guide:
### 1. Define a Group with Replication > 1
Existing test groups in `pkg/test/stream/testdata` and
`pkg/test/measure/testdata/groups` likely use the default replication (usually
1). You need to create a new group definition file (or modify an existing one
for this specific test) that explicitly sets the replication factor.
* **Location:** Create a new YAML file, for example
`pkg/test/measure/testdata/groups/replicated.yaml` (or `stream` equivalent).
* **Content:** Add the `replica_num` field under `resource_opts`.
```yaml
# Example: pkg/test/measure/testdata/groups/replicated.yaml
metadata:
  name: "replicated_group"
catalog: MEASURE # or STREAM
resource_opts:
  shard_num: 2
  segment_interval:
    unit: UNIT_DAY
    num: 1
  ttl:
    unit: UNIT_DAY
    num: 7
  # This is the key change required by the issue
  replica_num: 2
```
### 2. Reuse `init.go` to Write Data
The file `test/cases/init.go` likely contains helper functions to load these
YAML definitions and write sample data to the server. You should leverage these
functions to avoid rewriting data ingestion logic.
* **Import the package:**
`github.com/apache/skywalking-banyandb/test/cases`
* **Usage:** Look for a function signature like `cases.Write(conn,
"path/to/testdata")`
* **Implementation:** In your test, after the cluster is up, call this
function pointing to your new `replicated_group` data.
### 3. Implement the Integration Test (Start Cluster & Stop Node)
Create a new test file (e.g.,
`test/integration/replication/replication_test.go`). You should use
`test/integration/handoff/handoff_suite_test.go` as your template because it
already handles cluster bootstrapping.
**Key Steps in the Test:**
1. **Bootstrap Cluster:** Start a BanyanDB cluster with **3 Data Nodes**.
Three nodes let you stop one and still leave two running, which is enough to
hold both replicas (and a majority, if the cluster requires one).
2. **Create Resources:** Apply the group schema created in Step 1.
3. **Write Data:** Use the helper from Step 2.
4. **Simulate Failure:** Stop one of the data nodes programmatically.
5. **Verification:** Query the data to ensure it is still available from
the remaining nodes.
To implement this, you could refer to
https://github.com/apache/skywalking-banyandb/blob/main/test/integration/handoff/handoff_suite_test.go.
### 4. Directory Structure
Create the following folder structure:
```text
skywalking-banyandb/
└── test/
    └── integration/
        └── distributed/
            └── replication/
                ├── replication_suite_test.go
                └── replication_test.go
```