Hi Austin,

It sounds like your etcd process could be regularly restarting. Can you send me the etcd logs (in /var/log/clearwater-etcd) and monit logs (/var/log/monit.log) from your Sprout node?

Thanks,
Ellie

From: Clearwater [mailto:[email protected]] On Behalf Of Austin Marston
Sent: 23 October 2015 07:22
To: [email protected]
Subject: [Clearwater] SIP stress not working

Hi all,

I manually deployed a Clearwater infrastructure with one bono, sprout, ellis, ralf, homer, and hs. My SIP testing seems to be working fine, but my stress testing is not working at all. I created a new node for SIP testing, following https://github.com/Metaswitch/crest/blob/dev/docs/Bulk-Provisioning%20Numbers.md

Note: Whenever I check what might be wrong, I get a different status for my nodes. The clearwater cluster_manager seems to fail most of the time on bono and sprout, and when I check the cluster health, the results are always different. For instance, when I run clearwater-etcdctl cluster-health, I see that the cluster is healthy, but sometimes my bono node (172-16-1-20) or sprout node (172-16-1-20) is reported as not healthy.

[17:07:14][sprout]user@cw-002:/var/log/sprout$ clearwater-etcdctl cluster-health
cluster is healthy
member 27a940d2104e9692 is unhealthy
member 2ea8f3a5eea05584 is healthy
member 5fdc25bd4ae527c0 is healthy
member d26088cb54745bbc is healthy
member e525c6a4ed161686 is healthy
member f5765a98a56e9c4a is healthy

[17:07:24]user@cw-002:/var/log/sprout$ clearwater-etcdctl member list
27a940d2104e9692: name=172-16-1-20 peerURLs=http://172.16.1.20:2380 clientURLs=http://172.16.1.20:4000
2ea8f3a5eea05584: name=172-16-1-22 peerURLs=http://172.16.1.22:2380 clientURLs=http://172.16.1.22:4000
5fdc25bd4ae527c0: name=172-16-1-25 peerURLs=http://172.16.1.25:2380 clientURLs=http://172.16.1.25:4000
d26088cb54745bbc: name=172-16-1-24 peerURLs=http://172.16.1.24:2380 clientURLs=http://172.16.1.24:4000
e525c6a4ed161686: name=172-16-1-21 peerURLs=http://172.16.1.21:2380 clientURLs=http://172.16.1.21:4000
f5765a98a56e9c4a: name=172-16-1-23 peerURLs=http://172.16.1.23:2380 clientURLs=http://172.16.1.23:4000

[17:10:00][sprout]user@cw-002:/var/log/sprout$ clearwater-etcdctl cluster-health
cluster is healthy
member 27a940d2104e9692 is healthy
member 2ea8f3a5eea05584 is healthy
member 5fdc25bd4ae527c0 is healthy
member d26088cb54745bbc is healthy
member e525c6a4ed161686 is healthy
member f5765a98a56e9c4a is healthy

I attach my SIP stress logs and my sprout logs. I was running the test between 13:57 and 14:05 on 22 October. Please let me know if you have any idea why this is going wrong. I have certainly forgotten something that might be obvious, but I cannot catch it!

Thanks,
Austin

sprout_20151022T140000Z.txt <https://drive.google.com/file/d/0BwD2rKlmArODN2V3NnJMeC1Vcms/view?usp=drive_web>
all_sip_log.tgz <https://drive.google.com/file/d/0BwD2rKlmArODRXBsQXpySHhuSGc/view?usp=drive_web>
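To see which node is flapping between runs, the cluster-health and member list outputs quoted above can be joined on the member ID. The following is only a minimal Python sketch: it assumes the output format is exactly as pasted in this thread (it may differ between etcd versions), and the function names (parse_health, parse_members, unhealthy_nodes) are hypothetical helpers, not part of Clearwater or etcd.

```python
import re

# Line formats assumed from the output pasted in this thread.
HEALTH_RE = re.compile(r"^member (\w+) is (healthy|unhealthy)$")
MEMBER_RE = re.compile(r"^(\w+): name=(\S+)")


def parse_health(text):
    """Map member ID -> True (healthy) / False (unhealthy)."""
    status = {}
    for line in text.splitlines():
        match = HEALTH_RE.match(line.strip())
        if match:
            status[match.group(1)] = match.group(2) == "healthy"
    return status


def parse_members(text):
    """Map member ID -> node name from 'member list' output."""
    names = {}
    for line in text.splitlines():
        match = MEMBER_RE.match(line.strip())
        if match:
            names[match.group(1)] = match.group(2)
    return names


def unhealthy_nodes(health_text, member_text):
    """Return the sorted names of members reported as unhealthy.

    Falls back to the raw member ID if it is not in the member list.
    """
    names = parse_members(member_text)
    health = parse_health(health_text)
    return sorted(names.get(mid, mid) for mid, ok in health.items() if not ok)


# Sample input copied (and shortened) from the 17:07:14 run in this thread.
SAMPLE_HEALTH = """cluster is healthy
member 27a940d2104e9692 is unhealthy
member 2ea8f3a5eea05584 is healthy
"""

SAMPLE_MEMBERS = """27a940d2104e9692: name=172-16-1-20 peerURLs=http://172.16.1.20:2380 clientURLs=http://172.16.1.20:4000
2ea8f3a5eea05584: name=172-16-1-22 peerURLs=http://172.16.1.22:2380 clientURLs=http://172.16.1.22:4000
"""

if __name__ == "__main__":
    print(unhealthy_nodes(SAMPLE_HEALTH, SAMPLE_MEMBERS))
```

Running this over output captured every few seconds, for example saved from a shell loop, would show whether the same member keeps dropping out, which would fit with etcd restarting on that node.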
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
