MirtoBusico opened a new issue, #8566:
URL: https://github.com/apache/apisix/issues/8566
### Description
Hi all,
I have a K3s cluster with three worker nodes (plus one master) and APISIX 2.15.1
installed as a LoadBalancer service using the Helm chart.
Every node is a KVM virtual machine on the same host.
After a host crash, the three etcd pods never come online.
The logs of the first etcd pod (apisix-etcd-0) show:
```
{"level":"warn","ts":"2022-12-23T17:15:46.357Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_USER=etcd"}
{"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd"]}
{"level":"warn","ts":"2022-12-23T17:15:46.357Z","caller":"etcdmain/etcd.go:446","msg":"found invalid file under data directory","filename":"member_id","data-dir":"/bitnami/etcd/data"}
{"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/bitnami/etcd/data","dir-type":"member"}
{"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"]}
{"level":"info","ts":"2022-12-23T17:15:46.357Z","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["http://0.0.0.0:2379"]}
{"level":"info","ts":"2022-12-23T17:15:46.358Z","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.4","git-sha":"08407ff76","go-version":"go1.16.15","go-os":"linux","go-arch":"amd64","max-cpu-set":6,"max-cpu-available":6,"member-initialized":true,"name":"apisix-etcd-0","data-dir":"/bitnami/etcd/data","wal-dir":"","wal-dir-dedicated":"","member-dir":"/bitnami/etcd/data/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://apisix-etcd-0.apisix-etcd-headless.apisix.svc.cluster.local:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://apisix-etcd-0.apisix-etcd-headless.apisix.svc.cluster.local:2379","http://apisix-etcd.apisix.svc.cluster.local:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-size-bytes":2147483648,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2022-12-23T17:15:46.358Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/bitnami/etcd/data/member/snap/db","took":"159.119µs"}
{"level":"info","ts":"2022-12-23T17:15:46.473Z","caller":"etcdserver/server.go:508","msg":"recovered v2 store from snapshot","snapshot-index":200002,"snapshot-size":"26 kB"}
{"level":"warn","ts":"2022-12-23T17:15:46.474Z","caller":"snap/db.go:88","msg":"failed to find [SNAPSHOT-INDEX].snap.db","snapshot-index":200002,"snapshot-file-path":"/bitnami/etcd/data/member/snap/0000000000030d42.snap.db","error":"snap: snapshot file doesn't exist"}
{"level":"panic","ts":"2022-12-23T17:15:46.474Z","caller":"etcdserver/server.go:515","msg":"failed to recover v3 backend from snapshot","error":"failed to find database snapshot file (snap: snapshot file doesn't exist)","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.NewServer\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdserver/server.go:515\ngo.etcd.io/etcd/server/v3/embed.StartEtcd\n\t/go/src/go.etcd.io/etcd/release/etcd/server/embed/etcd.go:245\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcd\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/etcd.go:228\ngo.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/etcd.go:123\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/main.go:40\nmain.main\n\t/go/src/go.etcd.io/etcd/release/etcd/server/main.go:32\nruntime.main\n\t/go/gos/go1.16.15/src/runtime/proc.go:225"}
panic: failed to recover v3 backend from snapshot
goroutine 1 [running]:
```
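For anyone reading the log: the missing file name appears to be the snapshot index written as 16 zero-padded hexadecimal digits plus `.snap.db`. The sketch below checks that reading against the warning line from the log. The helper name `snap_db_name` is hypothetical, and the naming rule is inferred from this log line rather than taken from the etcd documentation.

```python
import json
from pathlib import PurePosixPath

# Warning line copied from the apisix-etcd-0 log (joined onto one line).
LOG_LINE = (
    '{"level":"warn","ts":"2022-12-23T17:15:46.474Z","caller":"snap/db.go:88",'
    '"msg":"failed to find [SNAPSHOT-INDEX].snap.db","snapshot-index":200002,'
    '"snapshot-file-path":"/bitnami/etcd/data/member/snap/0000000000030d42.snap.db",'
    '"error":"snap: snapshot file doesn\'t exist"}'
)


def snap_db_name(snapshot_index: int) -> str:
    """Hypothetical helper: index as 16 zero-padded hex digits plus '.snap.db'."""
    return f"{snapshot_index:016x}.snap.db"


entry = json.loads(LOG_LINE)

# File name derived from the "snapshot-index" field of the log entry.
expected = snap_db_name(entry["snapshot-index"])

# File name etcd actually looked for, taken from "snapshot-file-path".
actual = PurePosixPath(entry["snapshot-file-path"]).name

print(expected)            # 0000000000030d42.snap.db
print(expected == actual)  # True
```

So the panic refers to a database snapshot for index 200002 that is absent from `/bitnami/etcd/data/member/snap/`, while the v2 store snapshot at the same index was recovered.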
How can I recover etcd?
Recreating an empty etcd would also be fine.
### Environment
- APISIX version (run apisix version):
root@apisix-64fffcfb4c-55vhw:/usr/local/apisix# apisix version
/usr/local/openresty/luajit/bin/luajit ./apisix/cli/apisix.lua version
2.15.1
root@apisix-64fffcfb4c-55vhw:/usr/local/apisix#
- Operating system (run uname -a):
root@apisix-64fffcfb4c-55vhw:/usr/local/apisix# uname -a
Linux apisix-64fffcfb4c-55vhw 5.15.0-53-generic #59-Ubuntu SMP Mon Oct 17 18:53:30 UTC 2022 x86_64 GNU/Linux
root@apisix-64fffcfb4c-55vhw:/usr/local/apisix#
- OpenResty / Nginx version (run openresty -V or nginx -V):
- etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info):
- APISIX Dashboard version, if relevant: 2.13.0
- Plugin runner version, for issues related to plugin runners:
- LuaRocks version, for installation issues (run luarocks --version):