Re: [Project Clearwater] Homestead etcd failing and using lots of memory

Chris Elford (projectclearwater.org) Fri, 29 Apr 2016 04:05:33 -0700

Hi Peter,

If you stop etcd, you will no longer be able to add nodes to the cluster, or 
remove them. You will also lose automatic config application, so you will need 
to apply changes to shared_config, or any JSON files, on each node yourself, 
and manually restart the appropriate processes. See 
http://clearwater.readthedocs.io/en/stable/Modifying_Clearwater_settings.html 
for details on how to do this. You will need to use the instructions for 
deployments not using automatic clustering.
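
Without automatic config application, one easy mistake is to let the copies of 
shared_config drift apart between nodes. The following is a minimal sketch of a 
check, assuming you have already fetched each node's copy onto one machine (for 
example with scp). The function name and the file names in the usage line are 
hypothetical and not part of Clearwater.

```shell
#!/bin/sh
# Sketch only: report whether every file given as an argument has the same
# contents as the first one. configs_match is a hypothetical helper name.
configs_match() {
    first="$1"
    shift
    for other in "$@"; do
        # cmp -s prints nothing and exits non-zero when the files differ.
        if ! cmp -s "$first" "$other"; then
            echo "MISMATCH: $other differs from $first"
            return 1
        fi
    done
    echo "OK: all copies match $first"
}
```

Usage, with hypothetical file names for the fetched copies:

configs_match shared_config.sprout shared_config.homestead shared_config.bono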


To make sure that everything is properly stopped, you should run the following 
monit commands:

sudo monit stop -g etcd
sudo monit stop -g clearwater_queue_manager
sudo monit stop clearwater_cluster_manager
sudo monit stop clearwater_config_manager

You should also be aware that rebooting, or editing any monit config files may 
cause etcd to start running again, and you may have to run the above commands 
to stop it again.
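
Since the commands above may need to be re-run after a reboot or a monit config 
change, it can help to wrap them in a small shell function. This is only a 
sketch: the function name and the MONIT override are illustrative assumptions, 
while the monit group and service names are copied verbatim from the commands 
above.

```shell
#!/bin/sh
# Sketch only. MONIT is a hypothetical override so the function can be
# dry-run (e.g. MONIT=echo); by default it calls the real monit via sudo.
MONIT="${MONIT:-sudo monit}"

stop_etcd_services() {
    # Group and service names are taken verbatim from the email above.
    # Stop at the first failure so a problem is not silently skipped.
    $MONIT stop -g etcd &&
    $MONIT stop -g clearwater_queue_manager &&
    $MONIT stop clearwater_cluster_manager &&
    $MONIT stop clearwater_config_manager
}
```

Run stop_etcd_services on each node, and again whenever etcd has come back. 
"sudo monit summary" then shows the state monit reports for each service.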

Yours,

Chris


From: Peter Skrzynski [mailto:[email protected]]
Sent: 29 April 2016 00:05
To: Chris Elford (projectclearwater.org) <[email protected]>
Subject: RE: Homestead etcd failing and using lots of memory

Thanks Chris.
OK, I will try the latest release (I have to do a manual build, adding some 
custom code and recompiling).
In the meantime, is it acceptable to run “sudo monit stop etcd_process” on all 
my nodes, since I will not need to add or remove nodes?
Cheers,
Peter.

From: Chris Elford (projectclearwater.org) [mailto:[email protected]]
Sent: Friday, 29 April 2016 3:53 a.m.
To: Peter Skrzynski
Cc: 
[email protected]<mailto:[email protected]>
Subject: RE: Homestead etcd failing and using lots of memory

Thanks Peter,

It looks like you may be hitting 
https://github.com/Metaswitch/clearwater-etcd/issues/264 and etcd is stuck in a 
bad state which it can’t recover from. We’ve fixed this issue in release 95 by 
upgrading to a later version of etcd. Do you hit the same issue running the 
latest release?

Yours,

Chris

From: Peter Skrzynski [mailto:[email protected]]
Sent: 27 April 2016 07:27
To: Chris Elford (projectclearwater.org) 
<[email protected]<mailto:[email protected]>>
Subject: RE: Homestead etcd failing and using lots of memory


Hi Chris,
Here are the requested details, and the (large) log file attached.
I don’t think any of my nodes are healthy.
The difference is that my homestead and sprout nodes are using massive amounts 
of memory, whereas bono and ibcf are not.
BTW, I had increased the memory size to 10GB (from 4GB) as an experiment, 
because my sprout/homestead nodes were using lots of swap memory.


- Homestead gave “context deadline exceeded” and “cluster may be unhealthy…”

- Bono gave “context deadline exceeded” and “cluster may be unhealthy…”

- Sprout gave “context deadline exceeded” and “cluster may be unhealthy…”

- Ibcf gave a valid answer to member list…
[bono]nec@ibcf:~$ clearwater-etcdctl member list
2e0eda3ad6bc6e1e: name=192-168-10-100 peerURLs=http://192.168.10.100:2380 clientURLs=http://192.168.10.100:4000
6401532bea5930f8: name=192-168-10-200 peerURLs=http://192.168.10.200:2380 clientURLs=http://192.168.10.200:4000
8c632555af4d958d: name=192-168-10-10 peerURLs=http://192.168.10.10:2380 clientURLs=http://192.168.10.10:4000
8ea0d0c11d6c5ba9: name=192-168-10-20 peerURLs=http://192.168.10.20:2380 clientURLs=http://192.168.10.20:4000
[bono]nec@ibcf:~$
and “cluster may be unhealthy…”


From homestead…
[homestead]nec@homestead-1:~$ clearwater-etcdctl member list
context deadline exceeded
[homestead]nec@homestead-1:~$ clearwater-etcdctl cluster-health
cluster may be unhealthy: failed to connect [http://192.168.10.100:4000 http://192.168.10.200:4000 http://192.168.10.10:4000 http://192.168.10.20:4000]
[homestead]nec@homestead-1:~$ ls -l /var/log/clearwater-etcd/
total 78624
-rw-r--r-- 1 clearwater-etcd clearwater-etcd  7768834 Apr 27 11:15 clearwater-etcd.log
-rw-r--r-- 1 clearwater-etcd clearwater-etcd 69199875 Apr 24 06:35 clearwater-etcd.log.1
-rw-r--r-- 1 clearwater-etcd clearwater-etcd  1201118 Apr 17 06:32 clearwater-etcd.log.2.gz
-rw-r--r-- 1 clearwater-etcd clearwater-etcd  2328508 Jan 24 06:37 clearwater-etcd.log.3.gz
[homestead]nec@homestead-1:~$ more /etc/clearwater/local_config
home_domain=bbip2.nec.co.nz
sprout_hostname=sprout.bbip2.nec.co.nz
chronos_hostname=localhost:7253
hs_hostname=hs.bbip2.nec.co.nz:8888
hs_provisioning_hostname=hs.bbip2.nec.co.nz:8889
snmp_ip=192.168.10.230

local_ip=192.168.10.200
public_ip=192.168.10.200
public_hostname=homestead-1.bbip2.nec.co.nz
etcd_cluster="192.168.10.10,192.168.10.20,192.168.10.100,192.168.10.200"
[homestead]nec@homestead-1:~$
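
One quick cross-check on output like the above is whether the members etcd 
reports agree with the etcd_cluster setting in local_config. The sketch below 
compares the two; it assumes the member list output has been saved to a file, 
and all three function names are hypothetical helpers rather than Clearwater 
tooling.

```shell
#!/bin/sh
# Sketch only: compare etcd_cluster in local_config with the client
# addresses printed by "clearwater-etcdctl member list".

# Print the etcd_cluster addresses from a local_config file, one per line.
config_members() {
    grep '^etcd_cluster=' "$1" | cut -d= -f2 | tr -d '"' | tr ',' '\n' | sort
}

# Read saved "member list" output on stdin and print each member's client
# address (the host part of clientURLs=http://HOST:PORT), one per line.
member_addresses() {
    grep -o 'clientURLs=http://[0-9.]*' | sed 's|.*//||' | sort
}

# $1: path to local_config, $2: file holding saved "member list" output.
compare_members() {
    cfg=$(config_members "$1")
    mem=$(member_addresses < "$2")
    if [ "$cfg" = "$mem" ]; then
        echo "match"
    else
        echo "differ"
        return 1
    fi
}
```

Usage, on a node where member list still answers:

clearwater-etcdctl member list > members.txt
compare_members /etc/clearwater/local_config members.txt

With the values shown in this email, both lists contain the same four 
addresses, so the membership itself looks consistent with the config.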

Many thanks,
Peter.

From: Chris Elford (projectclearwater.org) [mailto:[email protected]]
Sent: Tuesday, 26 April 2016 10:48 p.m.
To: Peter Skrzynski; 
[email protected]<mailto:[email protected]>
Subject: RE: Homestead etcd failing and using lots of memory

Hi Peter,

We tend to deploy Project Clearwater with many small nodes, with as little as 
4GB of RAM, so I’m surprised to see etcd using over 6GB. I have a suspicion 
that something may be going wrong with the clustering process.

Etcd produces some diagnostics which should help us to track down the problem. 
Can you collect:

• The output from running clearwater-etcdctl member list on a healthy node, 
  and on your homestead node.

• The output from running clearwater-etcdctl cluster-health on a healthy node, 
  and on your homestead node.

• The contents of /var/log/clearwater-etcd/ on your homestead node.

• A copy of /etc/clearwater/local_config from a healthy node, and from your 
  homestead node.

That should tell us what etcd thinks it should be doing, and why it is eating 
up so much memory.
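
The four items requested above can be gathered in one go with something like 
the sketch below. The commands and paths are the ones listed above; the 
function name, the output layout and the three override variables (ETCDCTL, 
ETCD_LOG_DIR, LOCAL_CONFIG) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: collect the etcd diagnostics listed above into one tarball.
# The three variables are hypothetical overrides; the defaults are the
# command and paths named in the email.
ETCDCTL="${ETCDCTL:-clearwater-etcdctl}"
ETCD_LOG_DIR="${ETCD_LOG_DIR:-/var/log/clearwater-etcd}"
LOCAL_CONFIG="${LOCAL_CONFIG:-/etc/clearwater/local_config}"

collect_etcd_diags() {
    out_dir="$1"
    mkdir -p "$out_dir" || return 1
    # Capture stderr too and keep going on failure: an error message such
    # as a timeout is itself part of the diagnostics.
    $ETCDCTL member list    > "$out_dir/member_list.txt"    2>&1
    $ETCDCTL cluster-health > "$out_dir/cluster_health.txt" 2>&1
    cp -r "$ETCD_LOG_DIR" "$out_dir/etcd_logs"
    cp "$LOCAL_CONFIG" "$out_dir/local_config"
    # Bundle everything as <out_dir>.tar.gz next to the directory.
    tar -czf "$out_dir.tar.gz" -C "$(dirname "$out_dir")" "$(basename "$out_dir")"
}
```

Usage, run once on a healthy node and once on the homestead node (the 
directory name is arbitrary):

collect_etcd_diags /tmp/etcd-diags-$(hostname)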

Yours,

Chris

From: Clearwater [mailto:[email protected]] On 
Behalf Of Peter Skrzynski
Sent: 26 April 2016 03:39
To: 
[email protected]<mailto:[email protected]>
Subject: [Project Clearwater] Homestead etcd failing and using lots of memory

Hi,
I am running a Release-89 bono/sprout/homestead/ibcf/dns configuration.
The etcd process on Homestead appears to be consuming a massive amount of 
memory and is often restarting.
The etcd process on Sprout is also using a large amount of memory,
whereas the etcd processes on bono and ibcf are stable and use little memory.
Is there a logical explanation for homestead doing this?
Or is there a problem?
(Calls are being processed OK)
Any guidance would be most appreciated.
Regards,
Peter.

The following messages are appearing in the console window…

[Inline screenshot of the console messages; image not preserved in the archive]


And the top command as follows…
[Inline screenshot of the top output; image not preserved in the archive]





Peter Skrzynski
Technical Lead
NEC New Zealand Limited
NEC House, Level 6, 40 Taranaki Street, PO Box 1936, Wellington 6011, New 
Zealand
T: 043816257, M: 0274849530, F: +6443811110
[email protected]<mailto:[email protected]>
nz.nec.com<http://nz.nec.com>


_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
