wofr opened a new issue, #8036: URL: https://github.com/apache/apisix/issues/8036
### Description

I run a GKE cluster with 3 nodes. Besides several applications, I also deployed the APISIX gateway on the cluster (chart: apisix, repoURL: https://charts.apiseven.com/, targetRevision: "0.11.0"), which deploys an etcd cluster (version 3.4.14) with 3 nodes.

Now it gets funny: the etcd cluster builds up fine and everything is OK until 5:00 am every day. At that time the 3rd member leaves the cluster, while the second node stays fine. (See the logs below.)

Logs (etcd-0 node)

```
2022-09-29 04:59:04.652 CEST etcd {"caller":"etcdserver/zap_raft.go:77", "level":"info", "logger":"raft", "msg":"90126cc714381e07 switched to configuration voters=(3177002992052145560 10381479693335928327)", "ts":"2022-09-29T02:59:04.652Z"}
2022-09-29 04:59:04.653 CEST etcd {"caller":"membership/cluster.go:472", "cluster-id":"b0d7015fda1525c8", "level":"info", "local-member-id":"90126cc714381e07", "msg":"removed member", "removed-remote-peer-id":"3ff1b5cd453a87df", "removed-remote-peer-urls":[…], "ts":"2022-09-29T02:59:04.653Z"}
2022-09-29 04:59:04.653 CEST etcd {"caller":"rafthttp/peer.go:330", "level":"info", "msg":"stopping remote peer", "remote-peer-id":"3ff1b5cd453a87df", "ts":"2022-09-29T02:59:04.653Z"}
```

Logs (etcd-2 node)

```
04:59:04.655 CEST{caller: rafthttp/stream.go:421, error: EOF, level: warn, local-member-id: 3ff1b5cd453a87df, msg: lost TCP streaming connection with remote peer, remote-peer-id: 90126cc714381e07, stream-reader-type: stream MsgApp v2, ts: 2022-09-29T02:59:04.654Z}
04:59:04.678 CEST{caller: rafthttp/stream.go:421, error: EOF, level: warn, local-member-id: 3ff1b5cd453a87df, msg: lost TCP streaming connection with remote peer, remote-peer-id: 90126cc714381e07, stream-reader-type: stream Message, ts: 2022-09-29T02:59:04.656Z}
04:59:04.678 CEST{caller: etcdserver/zap_raft.go:77, level: info, logger: raft, msg: 3ff1b5cd453a87df switched to configuration voters=(3177002992052145560 10381479693335928327), ts: 2022-09-29T02:59:04.653Z}
04:59:04.678 CEST{caller: membership/cluster.go:472, cluster-id: b0d7015fda1525c8, level: info, local-member-id: 3ff1b5cd453a87df, msg: removed member, removed-remote-peer-id: 3ff1b5cd453a87df, removed-remote-peer-urls: […], ts: 2022-09-29T02:59:04.657Z}
04:59:04.678 CEST{caller: rafthttp/peer_status.go:66, error: failed to dial 90126cc714381e07 on stream MsgApp v2 (the member has been permanently removed from the cluster), level: warn, msg: peer became inactive (message send to peer failed), peer-id: 90126cc714381e07, ts: 2022-09-29T02:59:04.659Z}
04:59:04.678 CEST{caller: etcdserver/server.go:1150, error: the member has been permanently removed from the cluster, level: warn, msg: server error, ts: 2022-09-29T02:59:04.659Z}
04:59:04.678 CEST{caller: etcdserver/server.go:1151, level: warn, msg: data-dir used by this member must be removed, ts: 2022-09-29T02:59:04.659Z}
04:59:04.678 CEST{caller: rafthttp/peer.go:330, level: info, msg: stopping remote peer, remote-peer-id: 2c16fb63879f0d98, ts: 2022-09-29T02:59:04.660Z}
```

I've observed this behaviour several times now and have no idea what causes it. To me it looks like a "problem of the GKE" rather than of etcd; nevertheless, maybe some of you have an idea what could cause it.

Fun fact: it is always the 3rd node of the etcd cluster that gets removed at 5 am.

### Environment

- APISIX version (run `apisix version`): /usr/local/openresty/luajit/bin/luajit ./apisix/cli/apisix.lua version 2.15.0
- Operating system (run `uname -a`): Linux apisix-6dffdc8545-jn8sp 5.10.127+ #1 SMP Sat Jul 16 08:53:19 UTC 2022 x86_64 Linux
- OpenResty / Nginx version (run `openresty -V` or `nginx -V`): nginx version: openresty/1.21.4.1 built by gcc 10.3.1 20210424 (Alpine 10.3.1_git20210424) built with OpenSSL 1.1.1g 21 Apr 2020
- etcd version, if relevant (run `curl http://127.0.0.1:9090/v1/server_info`): 3.4.14

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
