Thanks for replying, Adam. All the nodes have started working, but *clearwater_cluster_manager_process* does not exist:

[bono]ubuntu@bono:~$ sudo monit summary
[sudo] password for ubuntu:
Monit 5.18.1 uptime: 22h 7m

 Service Name                       Status                        Type
 node-bono                          Running                       System
 restund_process                    Running                       Process
 ntp_process                        Running                       Process
 clearwater_queue_manager_pro...    Running                       Process
 etcd_process                       Running                       Process
 clearwater_diags_monitor_pro...    Running                       Process
 clearwater_config_manager_pr...    Running                       Process
*clearwater_cluster_manager_p...    Execution failed | Does...    Process*
 bono_process                       Running                       Process
 poll_restund                       Status ok                     Program
 monit_uptime                       Status ok                     Program
 clearwater_queue_manager_uptime    Status ok                     Program
 etcd_uptime                        Status ok                     Program
 poll_etcd_cluster                  Status ok                     Program
 poll_etcd                          Status ok                     Program
 poll_bono                          Status ok                     Program

How can I get it running? It is showing "Execution failed | Does not exist" on every node except vellum.
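This is roughly what I have been trying in order to dig into that failure, in case it helps (assuming the monit service names shown above; the underlying init service name clearwater-cluster-manager is my assumption):

    # Check whether the package that provides the process is installed at all
    dpkg -l | grep clearwater-cluster-manager

    # Ask monit for the detailed status of the failing check
    sudo monit status clearwater_cluster_manager_process

    # Try restarting via monit, or directly via the init service
    sudo monit restart clearwater_cluster_manager_process
    sudo service clearwater-cluster-manager restart   # service name is an assumption

Is that the right approach, or is there a supported way to bring it back?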
In parallel, I am also installing Clearwater from scratch using static IPs (NAT + Host-only network). Please guide me on a solution for that as well, if possible.
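For that fresh install, this is roughly the network and local_config layout I have in mind, in case you can spot a problem with it (treating eth1 as the host-only adapter; the exact keys in /etc/clearwater/local_config, the hostname, and using the host-only address for both IPs are my assumptions, not a confirmed recommendation):

    # /etc/network/interfaces -- eth0 stays on the NAT adapter (DHCP),
    # eth1 is the host-only adapter with a permanent static address
    auto eth1
    iface eth1 inet static
        address 192.168.56.110
        netmask 255.255.255.0

    # /etc/clearwater/local_config -- since no clients outside the host-only
    # network need to reach the deployment, both IPs point at the host-only
    # address here (assumption)
    local_ip=192.168.56.110
    public_ip=192.168.56.110
    public_hostname=bono-1.cw.local          # hypothetical hostname
    etcd_cluster=192.168.56.110,192.168.56.111,192.168.56.112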
Thanks,
Sunil

On Thu, Apr 19, 2018 at 3:46 PM, Sunil Kumar <[email protected]> wrote:

> The lost node is not the master node, but the IPs of the master nodes changed and I have updated them.
>
> thanks
>
> On Thu, Apr 19, 2018 at 3:23 PM, Sunil Kumar <[email protected]> wrote:
>
>> Hi Adam,
>>
>> Thanks a lot for replying. I am using VirtualBox to install the VMs. Earlier I was using a bridged adapter, so each VM takes its network IP from DHCP, and I set public_ip and local_ip to the same address.
>>
>> As you mention, for static IPs I tried *NAT + Host-only Network* (NAT as the primary interface eth0), but all nodes get the same IP on eth0 from NAT, 10.0.2.15 (is it fine for all nodes to have the same IP there, since I am not using that IP?), and I assigned static host-only IPs such as 192.168.56.110, etc.
>>
>> 1) Can I set both local_ip and public_ip to the host-only IP (192.168.56.110 etc.), or should public_ip be the IP of the host machine on which VirtualBox is installed, since with NAT the VMs use the host's IP as the public IP to reach the outside world?
>>
>> 2) Is public_ip necessary at all? I only want to run stress testing within the same network; I don't want to set up the numbers on clients like Zoiper.
>>
>> 3) Is port forwarding necessary with *NAT + Host-only Network*? The nodes are able to communicate with each other over the host-only network, so I don't think port forwarding is needed.
>>
>> 4) I just want to run stress testing to handle 1 lakh (100,000) calls/sec. How many sprout and vellum nodes are needed for that many calls?
>>
>> Thanks,
>> Sunil
>>
>> On Thu, Apr 19, 2018 at 2:42 PM, Adam Lindley <[email protected]> wrote:
>>
>>> Hi Sunil,
>>>
>>> I’m afraid the steps you’ve taken are not supported in Project Clearwater deployments: both changing the ‘local_ip’ of a node, and removing nodes just by deleting the VMs.
>>>
>>> On the first point, you need to be able to give your VMs permanent static IP addresses.
>>>
>>> On the second, by deleting the VMs in your cluster, your underlying etcd cluster has lost quorum. I would suggest http://clearwater.readthedocs.io/en/stable/Handling_Multiple_Failed_Nodes.htm as a starting point for recovering information from it. However, as your single remaining node will likely also have problems due to the local IP changing, you may simply want to redeploy from scratch.
>>>
>>> More generally, you seem to have hit a substantial number of issues in deploying Project Clearwater, which is both not what we want and not what the experience of many other users seems to be. I would suggest taking a wider look over our provided documentation, and making sure your environment matches our expectations and that you’re clear on our processes. This should make your next deployment a lot smoother.
>>>
>>> Cheers, and good luck,
>>>
>>> Adam
>>>
>>> *From:* Clearwater [mailto:[email protected]] *On Behalf Of* Sunil Kumar
>>> *Sent:* 19 April 2018 07:16
>>> *To:* [email protected]
>>> *Subject:* Re: [Project Clearwater] Unable to contact the etcd cluster
>>>
>>> Hi,
>>>
>>> The nodes with IPs 10.224.61.109, 10.224.61.112, etc. are no longer there; I have deleted those nodes directly. It looks like they are still in the etcd cluster. Can you please tell me how to remove them?
>>>
>>> [IST Apr 19 19:32:45] error : 'etcd_process' process is not running
>>> [IST Apr 19 19:32:45] info : 'etcd_process' trying to restart
>>> [IST Apr 19 19:32:45] info : 'etcd_process' restart: /bin/bash
>>> [IST Apr 19 19:33:15] error : 'etcd_process' failed to restart (exit status -1) -- /bin/bash: Program timed out -- zmq_msg_recv: Resource temporarily unavailable
>>> cat: /var/run/clearwater-etcd/clearwater-etcd.pid: No such file or directory
>>> cat: /var/run/clearwater-etcd/clearwater-etcd.pid: No such file or directory
>>> context deadline excee
>>> [IST Apr 19 19:33:25] error : 'etcd_process' process is not running
>>> [IST Apr 19 19:33:25] info : 'etcd_process' trying to restart
>>> [IST Apr 19 19:33:25] info : 'etcd_process' restart: /bin/bash
>>> [IST Apr 19 19:33:55] error : 'etcd_process' failed to restart (exit status -1) -- /bin/bash: Program timed out -- zmq_msg_recv: Resource temporarily unavailable
>>> client: etcd cluster is unavailable or misconfigured; error #0: *dial tcp 10.224.61.109:4000*: getsockopt: no route to host
>>> ; error #1: dial tcp 10.224.61.47:4000: getsockopt: co
>>> [IST Apr 19 19:34:05] error : 'etcd_process' process is not running
>>> [IST Apr 19 19:34:05] info : 'etcd_process' trying to restart
>>> [IST Apr 19 19:34:05] info : 'etcd_process' restart: /bin/bash
>>> [IST Apr 19 19:34:36] error : 'etcd_process' failed to restart (exit status 2) -- /bin/bash: zmq_msg_recv: Resource temporarily unavailable
>>> context deadline exceeded
>>>
>>> On Thu, Apr 19, 2018 at 11:03 AM, Sunil Kumar <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> Can anybody help me with this? After the IPs were lost, I updated the IPs in local_config and DNS and restarted the services. The extra VMs were deleted; for example, I had 3 sprout nodes, so 2 were deleted.
>>>
>>> [vellum]ubuntu@vellum:~$ cw-config upload shared_config
>>> Unable to contact the etcd cluster.
>>>
>>> thanks
>>> sunil
>>>
>>> _______________________________________________
>>> Clearwater mailing list
>>> [email protected]
>>> http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
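On my earlier question in the quoted thread about removing the deleted nodes from etcd, this is roughly what I intend to try from the surviving node (plain etcdctl v2 commands against the port-4000 client endpoint seen in the logs above; the endpoint IP and member ID are placeholders, and with quorum already lost these may simply fail, which is presumably why the Handling_Multiple_Failed_Nodes procedure or a redeploy was suggested):

    # From the surviving node, list the members etcd still thinks are in the cluster
    etcdctl --endpoints http://192.168.56.110:4000 member list

    # Remove a dead member using the ID printed by the previous command (placeholder ID)
    etcdctl --endpoints http://192.168.56.110:4000 member remove 8e9e05c52164694d

    # Check overall cluster health afterwards
    etcdctl --endpoints http://192.168.56.110:4000 cluster-health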
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
