Hi community,
I have solved the "does not exist" issue (it was due to file permissions),
but the Sprout cluster was then left in an unstable state.
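In case it helps anyone hitting the same symptom: the root cause was that the sprout process, which runs as a non-root user, could not open /etc/clearwater/cluster_settings. A minimal Python sketch of the permission check (the path is from the logs below; the helper name is my own, not part of Clearwater):

```python
import stat

def readable_by_other(mode):
    """True if the 'other' read bit is set, i.e. any user can open the file."""
    return bool(mode & stat.S_IROTH)

# 0o600: only the owner (root) can read it, so sprout's non-root user hits
# "Failed to open '/etc/clearwater/cluster_settings'".
print(readable_by_other(0o600))
# 0o644: world-readable, which lets the sprout process read the file.
print(readable_by_other(0o644))
```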
I have tried to decommission the new Sprout, but clearwater-cluster-manager
is unable to complete the query:
UTC ERROR common_etcd_synchronizer.py:139 (thread ChronosPlugin): 10.4.0.130
caught EtcdException("Unable to decode server response:
HTTPConnectionPool(host='10.4.0.130', port=4000): Read timed out.",) when
trying to read with index 1478037 - pause before retry
Is there a way to manually decommission a node from a deployment and leave
the cluster stable?
Thanks.
From: Clearwater [mailto:[email protected]] On
behalf of Nicola Principe
Sent: Friday, 09 October 2015 12:38
To: [email protected]
Subject: [Clearwater] - Sprout_process does not exist, elastic scaling
Hi community,
I have a PCW deployment with 3 Sprouts and 2 Homesteads.
I have tried to add a 4th Sprout following the automatic cluster scaling
instructions, but it does not work.
On one of the Sprouts already in the deployment I see this (10.4.0.130 is the
new Sprout):
Describing the Sprout Memcached cluster in site site1:
The local node is in this cluster
The cluster is *not* stable
10.4.0.157 is in state normal
10.4.0.156 is in state normal
10.4.0.159 is in state normal
10.4.0.130 is in state joining, acknowledged change
Describing the Sprout Chronos cluster in site site1:
The local node is in this cluster
The cluster is *not* stable
10.4.0.157 is in state normal
10.4.0.156 is in state normal
10.4.0.159 is in state normal
10.4.0.130 is in state joining, acknowledged change
But on the new Sprout node the sprout_process does not exist:
[sprout]manager@sprout-4:/var/log/sprout$ sudo monit status
The Monit daemon 5.8.1 uptime: 7m
Process 'sprout_process'
status Does not exist
monitoring status Monitored
data collected Fri, 09 Oct 2015 12:23:25
Program 'poll_sprout_sip'
status Initializing
monitoring status Initializing
data collected Fri, 09 Oct 2015 12:15:54
Program 'poll_sprout_http'
status Initializing
monitoring status Initializing
data collected Fri, 09 Oct 2015 12:15:54
In the logs I can see the following:
09-10-2015 10:22:33.009 UTC Error memcached_config.cpp:133: Failed to open
'/etc/clearwater/cluster_settings'
09-10-2015 10:22:33.009 UTC Error memcachedstore.cpp:184: Failed to read
config, keeping previous settings
09-10-2015 10:22:33.010 UTC Error main.cpp:1885: Cluster settings file
'/etc/clearwater/cluster_settings' does not contain a valid set of servers
...but the cluster_settings file was generated automatically by etcd.
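For anyone comparing their own file: as far as I understand from the Clearwater docs, the parser at memcached_config.cpp expects cluster_settings to contain a servers line (plus a new_servers line while a scale-up is in progress), along these lines (IPs illustrative, taken from the cluster status above):

```
servers=10.4.0.156:11211,10.4.0.157:11211,10.4.0.159:11211
new_servers=10.4.0.130:11211,10.4.0.156:11211,10.4.0.157:11211,10.4.0.159:11211
```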
Do you have any suggestions on how to sort this out?
Thanks,
Nicola
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org