Hi Peter,

If you stop etcd, you will no longer be able to add nodes to the cluster or remove them. You will also lose automatic config application, so you will need to apply changes to shared_config, or to any JSON files, on each node and manually restart the appropriate processes. See http://clearwater.readthedocs.io/en/stable/Modifying_Clearwater_settings.html for details on how to do this. You will need to use the instructions for deployments not using automatic clustering.
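Because changes are no longer propagated automatically, it may be worth checking after a manual edit that every node still holds the same shared_config. A minimal sketch, assuming each node's copy has first been fetched into a local directory; the helper name configs_match and the file names in the usage comment are hypothetical and not part of Clearwater:

```shell
#!/bin/sh
# Hypothetical helper (not a Clearwater tool): succeeds only if every file
# given as an argument is byte-for-byte identical to the first one.
configs_match() {
    if [ "$#" -lt 2 ]; then
        echo "usage: configs_match FILE FILE..." >&2
        return 2
    fi
    reference=$1
    shift
    for f in "$@"; do
        # cmp -s is silent; it exits non-zero if the files differ
        # or if either file cannot be read.
        if ! cmp -s "$reference" "$f"; then
            echo "MISMATCH: $f differs from $reference (or is unreadable)" >&2
            return 1
        fi
    done
    return 0
}

# Example use, assuming copies were fetched beforehand (names are made up):
#   configs_match shared_config.sprout shared_config.homestead shared_config.bono
```

The comparison is deliberately done on local copies, so the sketch makes no assumption about how the files are collected from the nodes.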
To make sure that everything is properly stopped, you should run the following monit commands:

    sudo monit stop -g etcd
    sudo monit stop -g clearwater_queue_manager
    sudo monit stop clearwater_cluster_manager
    sudo monit stop clearwater_config_manager

You should also be aware that rebooting, or editing any monit config files, may cause etcd to start running again, and you may have to run the above commands to stop it again.

Yours,
Chris

From: Peter Skrzynski [mailto:[email protected]]
Sent: 29 April 2016 00:05
To: Chris Elford (projectclearwater.org) <[email protected]>
Subject: RE: Homestead etcd failing and using lots of memory

Thanks Chris.

OK, I will try the latest release (I have to do a manual build, add some custom code and recompile). In the meantime, is it acceptable to run “sudo monit stop etcd_process” on all my nodes, since I will not need to add or remove nodes?

Cheers,
Peter.

From: Chris Elford (projectclearwater.org) [mailto:[email protected]]
Sent: Friday, 29 April 2016 3:53 a.m.
To: Peter Skrzynski
Cc: [email protected]
Subject: RE: Homestead etcd failing and using lots of memory

Thanks Peter,

It looks like you may be hitting https://github.com/Metaswitch/clearwater-etcd/issues/264 and etcd is stuck in a bad state which it can’t recover from. We’ve fixed this issue in release 95 by upgrading to a later version of etcd. Do you hit the same issue running the latest release?

Yours,
Chris

From: Peter Skrzynski [mailto:[email protected]]
Sent: 27 April 2016 07:27
To: Chris Elford (projectclearwater.org) <[email protected]>
Subject: RE: Homestead etcd failing and using lots of memory

Hi Chris,

Here are the requested details, with the (large) log file attached. I don’t think any of my nodes are healthy. It is just that my homestead and sprout are using massive amounts of memory, but bono and ibcf are not.
BTW, I had increased the memory size to 10 GB (from 4 GB) as an experiment, because my sprout/homestead nodes were using lots of swap memory.

- Homestead gave “context deadline exceeded” and “cluster may be unhealthy…”
- Bono gave “context deadline exceeded” and “cluster may be unhealthy…”
- Sprout gave “context deadline exceeded” and “cluster may be unhealthy…”
- Ibcf gave a valid answer to member list…

[bono]nec@ibcf:~$ clearwater-etcdctl member list
2e0eda3ad6bc6e1e: name=192-168-10-100 peerURLs=http://192.168.10.100:2380 clientURLs=http://192.168.10.100:4000
6401532bea5930f8: name=192-168-10-200 peerURLs=http://192.168.10.200:2380 clientURLs=http://192.168.10.200:4000
8c632555af4d958d: name=192-168-10-10 peerURLs=http://192.168.10.10:2380 clientURLs=http://192.168.10.10:4000
8ea0d0c11d6c5ba9: name=192-168-10-20 peerURLs=http://192.168.10.20:2380 clientURLs=http://192.168.10.20:4000
[bono]nec@ibcf:~$

and “cluster may be unhealthy…”

From homestead…

[homestead]nec@homestead-1:~$ clearwater-etcdctl member list
Context deadline exceeded
[homestead]nec@homestead-1:~$ clearwater-etcdctl cluster-health
cluster may be unhealthy: failed to connect [http://192.168.10.100:4000 http://192.168.10.200:4000 http://192.168.10.10:4000 http://192.168.10.20:4000]
[homestead]nec@homestead-1:~$ ls -l /var/log/clearwater-etcd/
total 78624
-rw-r--r-- 1 clearwater-etcd clearwater-etcd  7768834 Apr 27 11:15 clearwater-etcd.log
-rw-r--r-- 1 clearwater-etcd clearwater-etcd 69199875 Apr 24 06:35 clearwater-etcd.log.1
-rw-r--r-- 1 clearwater-etcd clearwater-etcd  1201118 Apr 17 06:32 clearwater-etcd.log.2.gz
-rw-r--r-- 1 clearwater-etcd clearwater-etcd  2328508 Jan 24 06:37 clearwater-etcd.log.3.gz
[homestead]nec@homestead-1:~$ more /etc/clearwater/local_config
home_domain=bbip2.nec.co.nz
sprout_hostname=sprout.bbip2.nec.co.nz
chronos_hostname=localhost:7253
hs_hostname=hs.bbip2.nec.co.nz:8888
hs_provisioning_hostname=hs.bbip2.nec.co.nz:8889
snmp_ip=192.168.10.230
local_ip=192.168.10.200
public_ip=192.168.10.200
public_hostname=homestead-1.bbip2.nec.co.nz
etcd_cluster="192.168.10.10,192.168.10.20,192.168.10.100,192.168.10.200"
[homestead]nec@homestead-1:~$

Many thanks,
Peter.

From: Chris Elford (projectclearwater.org) [mailto:[email protected]]
Sent: Tuesday, 26 April 2016 10:48 p.m.
To: Peter Skrzynski; [email protected]
Subject: RE: Homestead etcd failing and using lots of memory

Hi Peter,

We tend to deploy Project Clearwater with many small nodes, with as little as 4 GB of RAM, so I’m surprised to see etcd using over 6 GB. I have a suspicion that something may be going wrong with the clustering process. Etcd produces some diagnostics which should help us to track down the problem. Can you collect:

• The output from running clearwater-etcdctl member list on a healthy node, and on your homestead node.
• The output from running clearwater-etcdctl cluster-health on a healthy node, and on your homestead node.
• The contents of /var/log/clearwater-etcd/ on your homestead node.
• A copy of /etc/clearwater/local_config from a healthy node, and from your homestead node.

That should tell us what etcd thinks it should be doing, and why it is eating up so much memory.

Yours,
Chris

From: Clearwater [mailto:[email protected]] On Behalf Of Peter Skrzynski
Sent: 26 April 2016 03:39
To: [email protected]
Subject: [Project Clearwater] Homestead etcd failing and using lots of memory

Hi,

I am running a Release-89 bono/sprout/homestead/ibcf/dns configuration. The etcd process on Homestead appears to be consuming a massive amount of memory and is often restarting. The Sprout etcd process is also using a large amount of memory, whereas the bono and ibcf etcd processes are stable and using little memory.

Is there a logical explanation for homestead doing this? Or is there a problem? (Calls are being processed OK.)

Any guidance would be most appreciated.

Regards,
Peter.
The following messages are appearing in the console window…

[screenshot: console messages]

And the top command as follows…

[screenshot: top output]

Peter Skrzynski
Technical Lead
NEC New Zealand Limited
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
