Hi Devendra,

It looks like your etcd cluster is in a bad state. The error output says "Joining existing cluster... Not joining an unhealthy cluster", which suggests this node was once part of a healthy cluster and that the cluster has since lost quorum. Can you provide some more information about what led to the deployment being in this state? It is possible that the etcd data became corrupt at some point due to a node failure, which would explain why the Ellis node is now attempting to rejoin the cluster. However, because the cluster as a whole is unhealthy, the node cannot rejoin it.
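
One quick way to test whether the cluster members can reach each other is a plain TCP connect check. The following is a minimal Python 3 sketch, not part of Clearwater: the function names are hypothetical, and it assumes local_config lives at /etc/clearwater/local_config. It reads the etcd_cluster list and tries ports 2380 and 4000 on each IP.

```python
import socket

# Assumed location of the node's local_config; adjust if yours differs.
CONFIG_PATH = "/etc/clearwater/local_config"
# Ports named in this thread as needing to be open between nodes.
ETCD_PORTS = (2380, 4000)


def parse_etcd_cluster(config_text):
    """Return the list of IPs from the etcd_cluster line, or [] if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("etcd_cluster="):
            value = line.split("=", 1)[1].strip().strip('"')
            return [ip.strip() for ip in value.split(",") if ip.strip()]
    return []


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def main():
    try:
        with open(CONFIG_PATH) as f:
            members = parse_etcd_cluster(f.read())
    except OSError as exc:
        print("Could not read %s: %s" % (CONFIG_PATH, exc))
        return
    for ip in members:
        for port in ETCD_PORTS:
            state = "open" if can_connect(ip, port) else "unreachable"
            print("%s:%d %s" % (ip, port, state))


if __name__ == "__main__":
    main()
```

Run it on each node in turn. An "unreachable" result can mean either that traffic is blocked or that etcd is simply not running on the target, so a failure against the Ellis node's own IP is expected while its etcd process is down.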

I would suggest that you:

·  Ensure all nodes in the deployment can contact each other at the IPs listed in the 'etcd_cluster' parameter, and that they have ports 2380 and 4000 open to traffic.
·  Check whether other nodes are also in an unhealthy state.
   o  If multiple nodes have entered a bad state, you will need to re-create the etcd cluster from scratch. To do this, follow the process at https://clearwater.readthedocs.io/en/stable/Handling_Multiple_Failed_Nodes.html
   o  If only your Ellis node is unable to rejoin the cluster, the likely cause is that traffic from the Ellis node cannot reach the other members of the cluster.
·  Check that the local_config on each node is correct, and that the 'etcd_cluster' parameter lists the IPs of all nodes in your cluster.

Hopefully this gets your deployment up and running. Let us know how you get on. If the steps above don't resolve it, take a look at the logs under /var/log/clearwater-etcd/ to see whether anything there helps guide you.

Cheers,
Adam

From: Clearwater [mailto:[email protected]] On Behalf Of Devendra Singh
Sent: 16 October 2017 11:05
To: [email protected]
Subject: [Project Clearwater] etcd cluster is unavailable or misconfigured

Hi,

I am getting the error below in a manual installation (Bono, Ellis, Vellum, Homer, Dime, Sprout) on six machines.

[ellis]ist@ellis:~$ sudo monit summary
Monit 5.18.1 uptime: 54m
Service Name                      Status                      Type
node-ellis                        Running                     System
ntp_process                       Running                     Process
nginx_process                     Running                     Process
mysql_process                     Running                     Process
ellis_process                     Running                     Process
clearwater_queue_manager_pro...   Running                     Process
etcd_process                      Execution failed | Does...  Process
clearwater_diags_monitor_pro...   Running                     Process
clearwater_config_manager_pr...   Running                     Process
clearwater_cluster_manager_p...   Running                     Process
nginx_ping                        Status ok                   Program
nginx_uptime                      Status ok                   Program
monit_uptime                      Status ok                   Program
poll_ellis                        Status ok                   Program
poll_ellis_https                  Status ok                   Program
clearwater_queue_manager_uptime   Status ok                   Program
etcd_uptime                       Wait parent                 Program
poll_etcd_cluster                 Wait parent                 Program
poll_etcd                         Wait parent                 Program
[ellis]ist@ellis:~$
[ellis]ist@ellis:~$ sudo service clearwater-etcd start
Error: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
Joining existing cluster...
Not joining an unhealthy cluster

------------------------------------------------------------------
local_config
---------------------------------------------------------------------
#Local IP configuration
local_ip=172.16.1.23
public_ip=172.16.1.23
public_hostname=ellis
etcd_cluster="172.16.1.23,172.16.2.133,172.16.4.195,172.16.5.22,172.16.4.37,172.16.1.142"

shared_config
---------------------------------------------------------------------
# Deployment definitions
home_domain=example.com
sprout_hostname=sprout.example.com
chronos_hostname=vellum.example.com:7253
hs_hostname=hs.example.com:8888
hs_provisioning_hostname=hs.example.com:8889
sprout_impi_store=vellum.example.com
cassandra_hostname=vellum.example.com
xdms_hostname=homer.example.com:7888
dime_session_store=vellum.example.com
upstream_port=0

# Email server configuration
smtp_smarthost=172.16.1.23
smtp_username=username
smtp_password=password
[email protected]

# Keys (you can change this secret to something else)
signup_key=secret
turn_workaround=secret
ellis_api_key=secret
ellis_cookie_key=secret

Please let me know if I have configured anything incorrectly.

Thanks and Regards,
Devendra
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
