GitHub user ethervoid created a discussion: Master node became unresponsive 
after restarting one of the replicas

Hello everyone!

We've been using kvrocks for a while. To give a bit of context on how we're 
working with it: our system has 2 replicas and 1 master node as part of a 
Sentinel cluster.

When we want to release an update, we first release to the replicas and 
restart them, then do a manual failover with Sentinel, and finally release to 
the former master node.
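The rollout order described above can be written down as a small plan. This is only a sketch: `rollout_plan`, the step labels, the host names, and the master name `mymaster` are hypothetical and not taken from our setup. The only real command referenced is Sentinel's `SENTINEL FAILOVER <master-name>`.

```python
from typing import List, Tuple


def rollout_plan(master: str, replicas: List[str],
                 master_name: str = "mymaster") -> List[Tuple[str, str]]:
    """Return the ordered rollout steps as (action, target) pairs.

    The action labels are hypothetical; "sentinel_failover" stands for
    issuing `SENTINEL FAILOVER <master_name>` against a Sentinel instance.
    """
    # 1. Release to each replica and restart it, one at a time.
    steps = [("release_and_restart", replica) for replica in replicas]
    # 2. Promote one of the updated replicas with a manual failover.
    steps.append(("sentinel_failover", master_name))
    # 3. The former master is now a replica, so it can be released last.
    steps.append(("release_and_restart", master))
    return steps
```

The problem reported here showed up during step 1, right after the first replica was restarted and before any failover was issued.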

Following this workflow today, we found that after releasing the changes to 
the first replica, our master became unresponsive for some reason. We started 
to see gaps in our Grafana metrics, as you can see below:

<img width="463" alt="Captura de pantalla 2022-07-13 a las 19 05 48" 
src="https://user-images.githubusercontent.com/741240/178790699-de3a82e7-3127-483e-b5ac-4778372773fd.png">

<img width="428" alt="Captura de pantalla 2022-07-13 a las 19 05 40" 
src="https://user-images.githubusercontent.com/741240/178790781-9a22389d-e5f0-45d1-8b1d-61bf18f70c24.png">

We connected to the machine and checked the Docker container. It was running, 
and it showed the following logs for that time window:

```
E0713 12:22:43.009356 14447 replication.cc:111] Write error while sending batch 
to slave: Broken pipe. batches: 
0x243130360D0A7B11FE880D0000000200000003013201250B5F5F6E616D6573706163650000000C735F363737333231333331342F1B266EB733CEAC30060181F782F22601250B5F5F6E616D6573706163650000000C735F363737333231333331342F1B266EB733CEAC6105302E3735300D0A
E0713 12:23:08.211652    32 redis_cmd.cc:3533] checkWALBoundary with sequence: 
58132926866, but GetWALIter return older sequence: 58132926860
E0713 12:43:06.192111  9671 replication.cc:111] Write error while sending batch 
to slave: Broken pipe. batches: 0x2431350D0A510D26890D000000000000000301320D0A
```
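The `batches:` value in the replication errors can be unpacked to see which write was in flight when the pipe broke. The sketch below assumes that the hex dump is a RESP bulk string (`$<length>\r\n<payload>\r\n`) and that the payload begins with a RocksDB WriteBatch header (an 8-byte little-endian sequence number followed by a 4-byte little-endian record count). That layout is an assumption, not something the log states, and `parse_batch` is a hypothetical helper.

```python
import struct


def parse_batch(hex_dump: str) -> dict:
    """Decode a `batches:` hex dump into its framing and header fields."""
    # Drop the optional "0x" prefix and convert the hex text to bytes.
    text = hex_dump[2:] if hex_dump.lower().startswith("0x") else hex_dump
    raw = bytes.fromhex(text)

    # RESP bulk string framing: b"$<length>\r\n" precedes the payload.
    header, sep, rest = raw.partition(b"\r\n")
    if not sep or not header.startswith(b"$"):
        raise ValueError("not a RESP bulk string")
    declared_length = int(header[1:])
    payload = rest[:declared_length]

    # Assumed WriteBatch header: fixed64 sequence, then fixed32 count.
    sequence, count = struct.unpack_from("<QI", payload, 0)
    return {
        "declared_length": declared_length,
        "payload_length": len(payload),
        "sequence": sequence,
        "count": count,
    }
```

Applied to the shorter dump from the 12:43:06 line, this yields a declared length of 15 bytes and a sequence number of 58135547217, which is in the same range as the sequences reported by `checkWALBoundary`.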

We stopped writing to that node, and after a few minutes it recovered and 
became responsive again without any other intervention.

Could this be a bug, an issue, or a misconfiguration?

GitHub link: https://github.com/apache/incubator-kvrocks/discussions/728

