Dear All,

Recently I found that my 6 VMs (one for each node) are having trouble running
etcd_process.
I think the config file is correct, because at first etcd_process was running
well on all of them and “clearwater-etcdctl cluster-health” reported every
member as healthy.
But somehow they suddenly stopped working.

I tried “monit restart etcd_process”, but it still failed.

Here is some command output for more information (taken on the ellis node).

root@ellis1:/var/log# clearwater-etcdctl cluster-health
cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured
error #0: dial tcp 192.168.2.206:4000: getsockopt: connection refused
(.206 is homestead’s IP)

cat /var/log/boot.log
………..
………..
………..
zmq_msg_recv: Resource temporarily unavailable
Configuring monit for only localhost access
Error:  dial tcp 192.168.2.205:4000: getsockopt: no route to host
Rejoining cluster...
Etcd failed to come up - exiting

root@ellis1:/var/log/clearwater-etcd# cat clearwater-etcd.log
………….
………….
…………
2016-10-21 10:11:46.686827 I | etcdmain: etcd Version: 2.2.5
2016-10-21 10:11:46.686888 I | etcdmain: Git SHA: bc9ddf2
2016-10-21 10:11:46.686895 I | etcdmain: Go Version: go1.5.3
2016-10-21 10:11:46.686902 I | etcdmain: Go OS/Arch: linux/amd64
2016-10-21 10:11:46.686913 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2016-10-21 10:11:46.686953 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-10-21 10:11:46.687015 I | etcdmain: listening for peers on http://192.168.2.206:2380
2016-10-21 10:11:46.687039 I | etcdmain: listening for client requests on http://192.168.2.206:4000
2016-10-21 10:11:46.689639 I | etcdserver: recovered store from snapshot at index 10001
2016-10-21 10:11:46.689654 I | etcdserver: name = 192-168-2-206
2016-10-21 10:11:46.689660 I | etcdserver: data dir = /var/lib/clearwater-etcd/192.168.2.206
2016-10-21 10:11:46.689668 I | etcdserver: member dir = /var/lib/clearwater-etcd/192.168.2.206/member
2016-10-21 10:11:46.689674 I | etcdserver: heartbeat = 100ms
2016-10-21 10:11:46.689680 I | etcdserver: election = 1000ms
2016-10-21 10:11:46.689686 I | etcdserver: snapshot count = 10000
2016-10-21 10:11:46.689696 I | etcdserver: advertise client URLs = http://192.168.2.206:4000
2016-10-21 10:11:46.689717 I | etcdserver: loaded cluster information from store: <nil>
2016-10-21 10:11:46.726159 I | etcdserver: restarting member 4cb5fd19beaa1750 in cluster 877b90a46cdaaa83 at commit index 14044
2016-10-21 10:11:46.727646 I | raft: 4cb5fd19beaa1750 became follower at term 814
2016-10-21 10:11:46.727690 I | raft: newRaft 4cb5fd19beaa1750 [peers: [1226bb321c91a88e,4cb5fd19beaa1750,8ac8820f24de7303,a4a5d4f826d5740a], term: 814, commit: 14044, applied: 10001, lastindex: 14045, lastterm: 814]
2016-10-21 10:11:46.734589 I | rafthttp: the connection with 1226bb321c91a88e became active
2016-10-21 10:11:46.739284 E | rafthttp: failed to dial 8ac8820f24de7303 on stream Message (dial tcp 192.168.2.202:2380: getsockopt: connection refused)
2016-10-21 10:11:46.740156 E | rafthttp: failed to dial 8ac8820f24de7303 on stream MsgApp v2 (dial tcp 192.168.2.202:2380: getsockopt: connection refused)
2016-10-21 10:11:46.745962 I | etcdserver: starting server... [version: 2.2.5, cluster version: 2.2]
2016-10-21 10:11:46.747252 E | rafthttp: failed to dial a4a5d4f826d5740a on stream Message (dial tcp 192.168.2.203:2380: getsockopt: connection refused)
2016-10-21 10:11:46.747394 E | rafthttp: failed to dial a4a5d4f826d5740a on stream MsgApp v2 (dial tcp 192.168.2.203:2380: getsockopt: connection refused)
2016-10-21 10:11:46.756637 I | rafthttp: the connection with 1226bb321c91a88e became inactive
2016-10-21 10:11:46.756660 E | rafthttp: failed to read 1226bb321c91a88e on stream Message (net/http: request canceled)
2016-10-21 10:11:46.756687 N | etcdserver: removed member 1226bb321c91a88e from cluster 877b90a46cdaaa83
2016-10-21 10:11:46.756766 D | etcdserver: skipped updating attributes of removed member 1226bb321c91a88e
2016-10-21 10:11:46.756853 C | etcdserver: nodeToMember should never fail: raftAttributes key doesn't exist
panic: nodeToMember should never fail: raftAttributes key doesn't exist


It looks as if the nodes cannot connect to each other, but when I test with
ping, they can still reach each other.
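
One thing that may be worth checking: ping only exercises ICMP, whereas the
errors above are TCP failures ("connection refused", "no route to host") on
ports 2380 and 4000. Below is a minimal Python sketch of a TCP-level check.
It assumes only the IPs and ports that appear in the output above; the HOSTS
list and the function names are illustrative, not part of Clearwater or etcd.

```python
import socket

# Assumption: these are just the addresses that appear in the logs above;
# replace them with the real list of nodes in the deployment.
HOSTS = ["192.168.2.202", "192.168.2.203", "192.168.2.205", "192.168.2.206"]
# 2380 is the etcd peer port and 4000 the etcd client port seen in the logs.
PORTS = [2380, 4000]


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, no route to host, and timeouts.
        return False


def check_all(hosts, ports, timeout=2.0):
    """Map every (host, port) pair to whether a TCP connection succeeded."""
    return {(h, p): tcp_port_open(h, p, timeout) for h in hosts for p in ports}


if __name__ == "__main__":
    for (host, port), ok in sorted(check_all(HOSTS, PORTS).items()):
        print(f"{host}:{port} {'open' if ok else 'unreachable'}")
```

If ping succeeds but a port shows as unreachable here, that would suggest
either that etcd is not listening on that node or that a firewall is blocking
the port, rather than a general network outage.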
Can anyone offer some advice or a solution?
Thank you.

