GitHub user ethervoid created a discussion: Master node got unresponsive after restart one of the replicas
Hello everyone!

We've been using kvrocks for a while. To give a bit of context on how we work with it: our system has 2 replicas and 1 master node as part of a Sentinel cluster. When we want to release an update, we first release to the replicas and restart them, then we do a manual failover with Sentinel, and after that we release to the former master node.

Using this workflow today, we found that after releasing the changes to the first replica, our master became unresponsive. We started to see gaps in our Grafana metrics, as you can see:

<img width="463" alt="Captura de pantalla 2022-07-13 a las 19 05 48" src="https://user-images.githubusercontent.com/741240/178790699-de3a82e7-3127-483e-b5ac-4778372773fd.png">
<img width="428" alt="Captura de pantalla 2022-07-13 a las 19 05 40" src="https://user-images.githubusercontent.com/741240/178790781-9a22389d-e5f0-45d1-8b1d-61bf18f70c24.png">

We connected to the machine and checked the Docker container, which was running, with the following logs for that time window:

```
E0713 12:22:43.009356 14447 replication.cc:111] Write error while sending batch to slave: Broken pipe. batches: 0x243130360D0A7B11FE880D0000000200000003013201250B5F5F6E616D6573706163650000000C735F363737333231333331342F1B266EB733CEAC30060181F782F22601250B5F5F6E616D6573706163650000000C735F363737333231333331342F1B266EB733CEAC6105302E3735300D0A
E0713 12:23:08.211652 32 redis_cmd.cc:3533] checkWALBoundary with sequence: 58132926866, but GetWALIter return older sequence: 58132926860
E0713 12:43:06.192111 9671 replication.cc:111] Write error while sending batch to slave: Broken pipe. batches: 0x2431350D0A510D26890D000000000000000301320D0A
```

We stopped writing to that node, and after some minutes it recovered and became responsive again without us doing anything else.

Could this be a bug, a known issue, or a misconfiguration?
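As an aid for reading the `batches` values in the log, here is a minimal Python sketch that decodes one of them. It rests on two assumptions that are not confirmed in this thread: that each dump is a RESP bulk string (`$<length>\r\n<payload>\r\n`), and that the payload is a RocksDB WriteBatch, whose header is an 8-byte little-endian sequence number followed by a 4-byte little-endian record count. The function name `parse_batch` is hypothetical.

```python
import struct


def parse_batch(hex_dump: str):
    """Decode a `batches:` hex dump from the log.

    Assumption: the dump is a RESP bulk string wrapping a RocksDB
    WriteBatch. Returns (payload_length, sequence, count).
    """
    # Drop the leading "0x" if present, then turn the hex text into bytes.
    hex_text = hex_dump[2:] if hex_dump.startswith("0x") else hex_dump
    raw = bytes.fromhex(hex_text)

    # A RESP bulk string starts with "$<length>\r\n".
    if raw[:1] != b"$":
        raise ValueError("not a RESP bulk string")
    header_end = raw.index(b"\r\n")
    length = int(raw[1:header_end])

    # The payload is the next `length` bytes after the header's CRLF.
    payload = raw[header_end + 2 : header_end + 2 + length]
    if len(payload) != length:
        raise ValueError("payload shorter than the declared length")

    # WriteBatch header (assumed): fixed64 sequence, then fixed32 count.
    sequence, count = struct.unpack_from("<QI", payload, 0)
    return length, sequence, count


if __name__ == "__main__":
    # The shorter batch from the 12:43:06 log line.
    print(parse_batch("0x2431350D0A510D26890D000000000000000301320D0A"))
```

Under these assumptions, the 12:43:06 batch decodes to a 15-byte payload with sequence 58135547217, which is in the same range as the sequence numbers reported by the `checkWALBoundary` line. That suggests the master was still trying to stream recent WAL entries to the replica when the connection dropped.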
GitHub link: https://github.com/apache/incubator-kvrocks/discussions/728
