Dear All,

Recently I found that my six VMs, one for each node, have trouble running etcd_process. I think I got the config file right, because at first etcd_process was running well on all of them and “clearwater-etcdctl cluster-health” reported everything as healthy. But somehow they suddenly
I tried “monit restart etcd_process”, but it still failed. Here is some command output for more information.

(On the ellis node)

root@ellis1:/var/log# clearwater-etcdctl cluster-health
cluster may be unhealthy: failed to list members
Error: client: etcd cluster is unavailable or misconfigured
error #0: dial tcp 192.168.2.206:4000: getsockopt: connection refused

(.206 is homestead’s IP)

cat /var/log/boot.log
………..
………..
………..
zmq_msg_recv: Resource temporarily unavailable
Configuring monit for only localhost access
Error: dial tcp 192.168.2.205:4000: getsockopt: no route to host
Rejoining cluster...
Etcd failed to come up - exiting

root@ellis1:/var/log/clearwater-etcd# cat clearwater-etcd.log
………….
………….
…………
2016-10-21 10:11:46.686827 I | etcdmain: etcd Version: 2.2.5
2016-10-21 10:11:46.686888 I | etcdmain: Git SHA: bc9ddf2
2016-10-21 10:11:46.686895 I | etcdmain: Go Version: go1.5.3
2016-10-21 10:11:46.686902 I | etcdmain: Go OS/Arch: linux/amd64
2016-10-21 10:11:46.686913 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2016-10-21 10:11:46.686953 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-10-21 10:11:46.687015 I | etcdmain: listening for peers on http://192.168.2.206:2380
2016-10-21 10:11:46.687039 I | etcdmain: listening for client requests on http://192.168.2.206:4000
2016-10-21 10:11:46.689639 I | etcdserver: recovered store from snapshot at index 10001
2016-10-21 10:11:46.689654 I | etcdserver: name = 192-168-2-206
2016-10-21 10:11:46.689660 I | etcdserver: data dir = /var/lib/clearwater-etcd/192.168.2.206
2016-10-21 10:11:46.689668 I | etcdserver: member dir = /var/lib/clearwater-etcd/192.168.2.206/member
2016-10-21 10:11:46.689674 I | etcdserver: heartbeat = 100ms
2016-10-21 10:11:46.689680 I | etcdserver: election = 1000ms
2016-10-21 10:11:46.689686 I | etcdserver: snapshot count = 10000
2016-10-21 10:11:46.689696 I | etcdserver: advertise client URLs = http://192.168.2.206:4000
2016-10-21 10:11:46.689717 I | etcdserver: loaded cluster information from store: <nil>
2016-10-21 10:11:46.726159 I | etcdserver: restarting member 4cb5fd19beaa1750 in cluster 877b90a46cdaaa83 at commit index 14044
2016-10-21 10:11:46.727646 I | raft: 4cb5fd19beaa1750 became follower at term 814
2016-10-21 10:11:46.727690 I | raft: newRaft 4cb5fd19beaa1750 [peers: [1226bb321c91a88e,4cb5fd19beaa1750,8ac8820f24de7303,a4a5d4f826d5740a], term: 814, commit: 14044, applied: 10001, lastindex: 14045, lastterm: 814]
2016-10-21 10:11:46.734589 I | rafthttp: the connection with 1226bb321c91a88e became active
2016-10-21 10:11:46.739284 E | rafthttp: failed to dial 8ac8820f24de7303 on stream Message (dial tcp 192.168.2.202:2380: getsockopt: connection refused)
2016-10-21 10:11:46.740156 E | rafthttp: failed to dial 8ac8820f24de7303 on stream MsgApp v2 (dial tcp 192.168.2.202:2380: getsockopt: connection refused)
2016-10-21 10:11:46.745962 I | etcdserver: starting server... [version: 2.2.5, cluster version: 2.2]
2016-10-21 10:11:46.747252 E | rafthttp: failed to dial a4a5d4f826d5740a on stream Message (dial tcp 192.168.2.203:2380: getsockopt: connection refused)
2016-10-21 10:11:46.747394 E | rafthttp: failed to dial a4a5d4f826d5740a on stream MsgApp v2 (dial tcp 192.168.2.203:2380: getsockopt: connection refused)
2016-10-21 10:11:46.756637 I | rafthttp: the connection with 1226bb321c91a88e became inactive
2016-10-21 10:11:46.756660 E | rafthttp: failed to read 1226bb321c91a88e on stream Message (net/http: request canceled)
2016-10-21 10:11:46.756687 N | etcdserver: removed member 1226bb321c91a88e from cluster 877b90a46cdaaa83
2016-10-21 10:11:46.756766 D | etcdserver: skipped updating attributes of removed member 1226bb321c91a88e
2016-10-21 10:11:46.756853 C | etcdserver: nodeToMember should never fail: raftAttributes key doesn't exist
panic: nodeToMember should never fail: raftAttributes key doesn't exist

It seems the nodes cannot connect to each other, but when I tested with ping, they could still ping one another.

Can anyone give us some advice or a solution? Thank you.

--
This email may contain confidential information. Please do not use or disclose it in any way and delete it if you are not the intended recipient.
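
P.S. Since ping only tests ICMP reachability and says nothing about whether a process is listening on the etcd ports, a TCP-level check might help narrow this down. Below is a minimal sketch, assuming Python 3 is available on the nodes. The helper names check_port and report are hypothetical, and the ports (2380 for peers, 4000 for clients) are simply the ones that appear in the logs above.

```python
import socket


def check_port(host, port, timeout=2.0):
    """Try a plain TCP connect and describe the outcome as a short string.

    Hypothetical helper for illustration; it only checks whether a TCP
    connection can be opened and does not speak the etcd protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout (for example, packets dropped).
        return "timeout"
    except OSError as exc:
        # Anything else, such as "No route to host".
        return "error: %s" % exc


def report(hosts, ports, timeout=2.0):
    """Return one 'host:port status' line per host and port combination."""
    lines = []
    for host in hosts:
        for port in ports:
            status = check_port(host, port, timeout)
            lines.append("%s:%d %s" % (host, port, status))
    return lines
```

For example, printing the result of report(["192.168.2.202", "192.168.2.203", "192.168.2.205", "192.168.2.206"], [2380, 4000]) from each VM would give one line per address and port. A "refused" result would suggest the host is reachable but etcd is not listening there, while a "timeout" or a "No route to host" error would point more toward a routing or firewall problem.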
