GitHub user bseto created a discussion: SetClusterNodes causing timeouts and mass disconnect/reconnects when under heavy load
Hi, I'm wondering if anyone else is having this problem, and whether there's anything I can do to help find a resolution.

# Problem

Note: this only happens on our production cluster under heavy load.

When adding an empty shard (so no migrations are involved) using the kvrocks-controller `create shard ...`, our readers and writers start mass disconnecting and reconnecting, and they keep doing so until we move reads off the cluster. The same thing happens when we migrate slots, but I think I've narrowed it down to `SetClusterNodes`: perhaps the `kCmdExclusive` lock is held too long and commands time out?

We deploy to 4 regions. During testing in November and December, the 2nd and 3rd largest clusters showed this behaviour whenever I added an empty shard or did any migration. As a hail-mary, I had Claude Opus read the kvrocks codebase to see if it could find anything, and it suggested setting `persist-cluster-nodes-enabled no`, since persisting the nodes file could add to how long the exclusive lock needs to be held. It's January now and the load is similar (slightly lower than in November/December). After setting `persist-cluster-nodes-enabled no`, migrations on the 2nd and 3rd largest clusters no longer show the issue, but the largest cluster still does.

# Testing

## Attempt at reproducing outside production

I spent a week in November trying to reproduce this issue on a non-production cluster and was unable to. I had 8 r6idn.large nodes in the cluster and 16 c6gn.2xlarge nodes driving the load test (a mix of readers and writers), and could not trigger the issue.

## In production

Details of our largest cluster:

- kvrocks version: 2.12.1
- Instance type: r6idn.8xlarge
- Number of nodes: 44
- Operations: 3.4M ops/s (3M read, 130k hsetexpire, 130k hmget); each node averages 80k ops/s at peak times

We were originally using the [rueidis](https://github.com/redis/rueidis) client to connect to the cluster, and we tried changing its timeout settings and pipelining. I originally thought it might be the way the client handled `MOVED` errors, but getting the timeouts even when we just add an empty shard ruled that out (I had also modified the client to track whether we were getting MOVED errors, and we weren't).

We then moved to [go-redis](https://github.com/redis/go-redis) and played with its timeouts, including ignoring context timeouts. Following this [document](https://uptrace.dev/blog/golang-context-timeout) about Go context timeouts being potentially harmful, we set go-redis to a 10-second timeout and added our own wrapping function that times out the original request while letting the underlying Redis connection live rather than being torn down. The idea was that future requests to the go-redis client would fail fast with pool-exhaustion errors from go-redis instead of spamming the kvrocks cluster. This still didn't work; somehow requests would still time out.

Currently we're using go-redis with the `Limiter` interface to implement a circuit breaker: whenever we see a large burst of errors, we open the breaker for a while before connecting again. But recovery still takes 5-10 minutes, and it isn't ideal, since we'd still have 40+ migrations to do, each ending in a disconnect/reconnect storm.
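For anyone unfamiliar with go-redis's `Limiter` hook, here is a minimal sketch of the kind of breaker I mean. It is simplified and not our production code: the threshold, cool-down, and node address are placeholder values, and since (as far as I can tell) `ClusterOptions` doesn't expose `Limiter` directly, it wires the breaker in through the `NewClient` hook.

```go
package main

import (
	"errors"
	"sync"
	"time"

	"github.com/redis/go-redis/v9"
)

// ErrCircuitOpen is returned by Allow while the breaker is open; go-redis
// then fails the command immediately without touching the connection pool.
var ErrCircuitOpen = errors.New("circuit breaker open")

// breaker is a small failure-counting circuit breaker that satisfies the
// go-redis Limiter interface (Allow / ReportResult).
type breaker struct {
	mu          sync.Mutex
	failures    int
	openedAt    time.Time
	maxFailures int           // placeholder threshold
	openFor     time.Duration // placeholder cool-down
}

func (b *breaker) Allow() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if !b.openedAt.IsZero() {
		if time.Since(b.openedAt) < b.openFor {
			return ErrCircuitOpen // still cooling down: reject immediately
		}
		// Cool-down elapsed: close the breaker and let traffic through again.
		b.openedAt = time.Time{}
		b.failures = 0
	}
	return nil
}

func (b *breaker) ReportResult(err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil || errors.Is(err, redis.Nil) {
		b.failures = 0
		return
	}
	b.failures++
	if b.failures >= b.maxFailures {
		b.openedAt = time.Now() // trip the breaker
	}
}

func newClusterClient(addrs []string) *redis.ClusterClient {
	return redis.NewClusterClient(&redis.ClusterOptions{
		Addrs: addrs,
		// Attach a breaker to every per-node client the cluster client creates.
		NewClient: func(opt *redis.Options) *redis.Client {
			opt.Limiter = &breaker{maxFailures: 100, openFor: 30 * time.Second}
			return redis.NewClient(opt)
		},
	})
}

func main() {
	cc := newClusterClient([]string{"kvrocks-node-1:6666"}) // placeholder address
	defer cc.Close()
}
```

Here each per-node client gets its own breaker, which seems to map better onto the fact that only a fraction of the nodes get hammered than a single shared breaker would.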
Note: I'm also noticing it's not every node. Usually only about 10% of the nodes in the cluster get hammered hard and have clients continuously connecting and disconnecting.

# Thoughts

Is there a way I can confirm that `SetClusterNodes` is indeed taking too long and is the culprit? Or are there any ideas on what settings we could change?

GitHub link: https://github.com/apache/kvrocks/discussions/3331
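One thing I plan to try for the first question, assuming kvrocks records `CLUSTERX SETNODES` in its slowlog like a regular command (I haven't verified that), is to lower `slowlog-log-slower-than` on a couple of the affected nodes and dump the slowlog right after a topology change. A rough go-redis sketch, with the node address, threshold, and sleep as placeholders:

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()

	// Point a plain client at one suspect node; we want that node's own slowlog.
	node := redis.NewClient(&redis.Options{Addr: "kvrocks-node-1:6666"}) // placeholder
	defer node.Close()

	// Log anything slower than 20ms (value is in microseconds; placeholder threshold).
	if err := node.ConfigSet(ctx, "slowlog-log-slower-than", "20000").Err(); err != nil {
		panic(err)
	}

	// ... trigger the `create shard` / migration from kvrocks-controller here ...
	time.Sleep(30 * time.Second)

	// Dump recent slowlog entries and pick out the clusterx ones.
	entries, err := node.SlowLogGet(ctx, 128).Result()
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		if len(e.Args) > 0 && strings.EqualFold(e.Args[0], "clusterx") {
			fmt.Printf("%s took %s: %v\n", e.Time.Format(time.RFC3339), e.Duration, e.Args)
		}
	}
}
```

It might also be worth scanning the same window for ordinary commands with unusually long durations, since those would show up if they were stuck behind the exclusive lock (assuming the recorded duration includes that wait, which I'm not sure about).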
