Hi Lianjie, I’m sorry, the documentation I pointed you at is out of date (as of when we added support for multiple networks). I’ve added an issue to track fixing this up (https://github.com/Metaswitch/cpp-common/issues/274)
Instead, we now recommend querying statistics over SNMP (http://clearwater.readthedocs.org/en/latest/Clearwater_SNMP_Statistics/index.html). To use this, install the relevant clearwater-snmp-handler-* package (e.g. clearwater-snmp-handler-sprout for Sprout nodes). This makes the Clearwater statistics available over port 161. As an example, to query statistics on a node, copy the Project Clearwater MIB onto the node (found at https://github.com/Metaswitch/clearwater-snmp-handlers/blob/master/PROJECT-CLEARWATER-MIB), install snmp, and run “snmpwalk -v 2c -c clearwater localhost -M <folder containing the Project Clearwater MIB> PROJECT-CLEARWATER-MIB:<type of node>”. If you need to query the 0MQ interface, you’ll need to copy the C++ version of the ruby statistics script you were using onto your node (in https://github.com/Metaswitch/cpp-common/tree/master/scripts/stats-c), then build the C++ code (install libzmq3-dev, and run “g++ -o cw_stat cw_stat.cpp –lzmq”), and then run “./cw_stat <type of server> <name of statistic>” (e.g. “./cw_stat sprout latency_us”). Hope this helps, Ellie From: Lianjie Cao [mailto:[email protected]] Sent: 02 April 2015 16:29 To: Eleanor Merry Cc: [email protected]; Sharma, Puneet; Sonia Fahmy Subject: Re: [Clearwater] Sprout clustering problem Hi Ellie, Thanks a lot for the information! We are trying this out. One more question is about Clearwater-snmp packages. From the log, it seems the snmp libraries are not loaded successfully, so we couldn't poll any snmp info from Clearwater nodes. Can you advise us how to get it work? Thanks, Lianjie On Thu, Apr 2, 2015 at 11:03 AM, Eleanor Merry <[email protected]<mailto:[email protected]>> wrote: Hi Lianjie, This is due to a bug with how we pull out the memcached information from /etc/clearwater/cluster_settings, as we don’t strip off any whitespace. In your case, memcached tries to connect to “192.168.1.21:1121”, “<space>192.168.1.22:1121” and “<space>192.168.1.23:1121”, where the last two don’t exist. This means that reading/writing to memcached regularly fails. I’ve raised an issue to track fixing this (see https://github.com/Metaswitch/cpp-common/issues/273) – in the meantime can you please remove the whitespace from your cluster_settings files, and reload each Sprout (“sudo service sprout reload”)? Also, did you use the same file when there were two Sprouts? If so then I believe you’ve probably been hitting this issue already, except registrations would have succeeded as at least one replica was a valid memcached instance (as the replication factor is two, at least one of the replicas would have been the valid “192.168.1.21:1121” entry”). Ellie From: Lianjie Cao [mailto:[email protected]<mailto:[email protected]>] Sent: 30 March 2015 16:34 To: Eleanor Merry; [email protected]<mailto:[email protected]>; Sharma, Puneet; Sonia Fahmy Subject: Re: [Clearwater] Sprout clustering problem Hi Ellie, memcached runs fine on all 3 Sprout nodes: [sprout]cw@sprout-1:~$ ps -ef | grep memcached memcache 1091 1 0 16:44 ? 00:00:35 /usr/bin/memcached -v -e ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l 192.168.1.21 -c 4096 root 26144 1107 0 18:38 ? 00:00:00 [poll_memcached.] <defunct> cw 26173 15526 0 18:38 pts/0 00:00:00 grep --color=auto memcached [sprout]cw@sprout-2:~$ ps -ef | grep memcached memcache 1044 1 0 16:44 ? 00:00:00 /usr/bin/memcached -v -e ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l 192.168.1.22 -c 4096 root 25877 1035 0 18:39 ? 00:00:00 [poll_memcached.] <defunct> cw 25906 15808 0 18:39 pts/0 00:00:00 grep --color=auto memcached [sprout]cw@sprout-3:~$ ps -ef | grep memcached memcache 1101 1 0 17:43 ? 00:00:00 /usr/bin/memcached -v -e ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l 192.168.1.23 -c 4096 root 12949 1075 0 18:39 ? 00:00:00 [poll_memcached.] <defunct> cw 12975 3317 0 18:39 pts/0 00:00:00 grep --color=auto memcached memcached logs look alright: [sprout]cw@sprout-1:~$ tail -20 /var/log/memcached.log profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap Initiating shutdown profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap [sprout]cw@sprout-2:~$ tail -20 /var/log/memcached.log profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap Initiating shutdown profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap [sprout]cw@sprout-3:~$ tail -20 /var/log/memcached.log Supplying the following features: LRU, compare and swap Initiating shutdown profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip profiling:/var/lib/jenkins:Cannot create directory profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap Loaded engine: Default engine v0.1 Supplying the following features: LRU, compare and swap syslog also doesn't report errors except those about snmp, which is actually another problem I am having. Mar 29 17:43:28 sprout-3 snmpd[1620]: NET-SNMP version 5.4.3 Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: prNames Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: prNames Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: prErrMessage Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: prErrMessage Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: prErrorFlag Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: memErrorName Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: memErrorName Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: memSwapErrorMsg Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: memSwapErrorMsg Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: memSwapError Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: extNames Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: extNames Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: extOutput Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: extOutput Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: extResult Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: dskPath Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: dskPath Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: dskErrorMsg Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: dskErrorMsg Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: dskErrorFlag Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: laNames Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: laNames Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: laErrMessage Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: laErrMessage Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: laErrorFlag Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: fileName Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: fileName Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: fileErrorMsg Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: fileErrorMsg Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: fileErrorFlag Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: snmperrErrMessage Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: snmperrErrMessage Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: Unknown payload OID Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: snmperrErrorFlag Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: unknown monitor OID Mar 29 18:53:24 sprout-3 snmpd[1062]: Turning on AgentX master support. Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/sprout_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/bono_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/homestead_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/alarm_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/cdiv_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/memento_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: /usr/lib/clearwater/memento_as_handler.so: cannot open shared object file: No such file or directory Mar 29 18:53:24 sprout-3 snmpd[1062]: net-snmp: 33 error(s) in config file(s) Mar 29 18:53:24 sprout-3 snmpd[1062]: NET-SNMP version 5.4.3 Also, I couldn't poll snmp information from any node of Clearwater using https://github.com/Metaswitch/cpp-common/tree/master/scripts/stats. It doesn't return any response: cw@gateway:~/stats/bin$ ./cw_stat 192.168.1.11 6669 client_count Error: No response from host, ensure hostname and port are correct Error: Terminating due to previous errors NTP on 3 Sprout nodes are synchronizing with the same server which generates SIPp traffic for testing. And they are all upgraded to the latest release. Attached are Sprout logs and SIPp message and error logs of a new test with 10 reg-unreg calls (number 6500000000-650000009, 1 call/sec) for simplicity of analysis. The 7 failed calls are all due to the second REGISTER message was returned with 401 or 500. It seems like for the three successfully registered numbers (6500000002, 6500000005 and 6500000008), the first and second REGISTER messages were handled by the same Sprout node. And the failed ones were handled by two different Sprout nodes. For instance, 6500000000's first REGISTER message was processed by Sprout-1 which was successfull. And the second REGISTER message was processed by Sprout-2. Sprout-2 could read AV information from previous step and retrieve register information from Homestead, but failed on reading it afterwards. So, it finally generated 500 response. You can check the complete logs for more details. Could this be some synchronization problem of memcached? Interesting thing is, if I disable Sprout-3, everything works fine. nfv@Z420-ctl:~/cw-test/openstack/trace/3-sprout$ grep 6500000000 sp-*-sprout_20150330T020000Z.txt sp-1-sprout_20150330T020000Z.txt:REGISTER sip:[email protected]<mailto:sip%[email protected]> SIP/2.0 sp-1-sprout_20150330T020000Z.txt:From: <sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1 sp-1-sprout_20150330T020000Z.txt:To: <sip:[email protected]<mailto:sip%[email protected]>> sp-1-sprout_20150330T020000Z.txt:Contact: <sip:[email protected]:31135;transport=UDP> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.562 UTC Debug pjutils.cpp:470: Private identity defaulted from public identity = [email protected]<mailto:[email protected]> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.566 UTC Debug httpconnection.cpp:553: Sending HTTP request : http://hs.hp-clearwater.com:8888/impi/6500000000%40hp-clearwater.com/av?impu=sip%3A6500000000%40hp-clearwater.com (trying 192.168.1.31) on new connection sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug avstore.cpp:66: Set AV for [email protected]<mailto:[email protected]>\6a661a5278f64006 sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug memcachedstore.cpp:540: Writing 148 bytes to table av key [email protected]<mailto:[email protected]>\6a661a5278f64006, CAS = 0, expiry = 40 sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug memcachedstore.cpp:193: Key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 hashes to vbucket 77 via hash 0x15f14fcd sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.878 UTC Debug memcachedstore.cpp:560: 2 write replicas for key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.878 UTC Debug memcachedstore.cpp:812: Attempting to add data for key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.881 UTC Debug authentication.cpp:396: Sending {"impi": "[email protected]<mailto:[email protected]>", "impu": "sip:[email protected]<mailto:sip%[email protected]>", "nonce": "6a661a5278f64006"} to Chronos to set AV timer sp-1-sprout_20150330T020000Z.txt:From: <sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1 sp-1-sprout_20150330T020000Z.txt:To: <sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bKPjWD7evZbxVle9AjzwRDtxOfsPP46A2SBN sp-2-sprout_20150330T020000Z.txt:REGISTER sip:[email protected]<mailto:sip%[email protected]> SIP/2.0 sp-2-sprout_20150330T020000Z.txt:From: <sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1 sp-2-sprout_20150330T020000Z.txt:To: <sip:[email protected]<mailto:sip%[email protected]>> sp-2-sprout_20150330T020000Z.txt:Contact: <sip:[email protected]:31135;transport=UDP> sp-2-sprout_20150330T020000Z.txt:Authorization: Digest username="[email protected]<mailto:[email protected]>", realm="hp-clearwater.com<http://hp-clearwater.com>", nonce="6a661a5278f64006", uri="sip:15.9.114.162:5060<http://15.9.114.162:5060>", response="46d9005162ed55dd1629c6766c7b50c6", algorithm=MD5, cnonce="6b8b4567", opaque="4abd75561ff7aa78", qop=auth, nc=00000001,integrity-protected=ip-assoc-pending sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.893 UTC Debug memcachedstore.cpp:193: Key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 hashes to vbucket 77 via hash 0x15f14fcd sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug memcachedstore.cpp:365: 2 read replicas for key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug memcachedstore.cpp:408: Read for av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 on replica 0 returned SUCCESS sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug memcachedstore.cpp:451: Read 148 bytes from table av key [email protected]<mailto:[email protected]>\6a661a5278f64006, CAS = 1 sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug avstore.cpp:99: Retrieved AV for [email protected]<mailto:[email protected]>\6a661a5278f64006 sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug avstore.cpp:66: Set AV for [email protected]<mailto:[email protected]>\6a661a5278f64006 sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug memcachedstore.cpp:540: Writing 167 bytes to table av key [email protected]<mailto:[email protected]>\6a661a5278f64006, CAS = 1, expiry = 40 sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug memcachedstore.cpp:193: Key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 hashes to vbucket 77 via hash 0x15f14fcd sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug memcachedstore.cpp:560: 2 write replicas for key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.909 UTC Debug registrar.cpp:518: Process REGISTER for public ID sip:[email protected]<mailto:sip%[email protected]> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.909 UTC Debug hssconnection.cpp:563: Making Homestead request for /impu/sip%3A6500000000%40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com<http://40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.915 UTC Debug httpconnection.cpp:553: Sending HTTP request : http://hs.hp-clearwater.com:8888/impu/sip%3A6500000000%40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com (trying 192.168.1.31) on new connection sp-2-sprout_20150330T020000Z.txt: <PrivateID>[email protected]<mailto:[email protected]></PrivateID> sp-2-sprout_20150330T020000Z.txt: <Identity>sip:[email protected]<mailto:sip%[email protected]></Identity> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug hssconnection.cpp:349: Processing Identity node from HSS XML - sip:[email protected]<mailto:sip%[email protected]> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug registrar.cpp:618: REGISTER for public ID sip:[email protected]<mailto:sip%[email protected]> uses AOR sip:[email protected]<mailto:sip%[email protected]> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug regstore.cpp:120: Get AoR data for sip:[email protected]<mailto:sip%[email protected]> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug memcachedstore.cpp:193: Key reg\\sip:[email protected]<mailto:sip%[email protected]> hashes to vbucket 14 via hash 0xe25f38e sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug memcachedstore.cpp:365: 2 read replicas for key reg\\sip:[email protected]<mailto:sip%[email protected]> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.966 UTC Debug memcachedstore.cpp:423: Read for reg\\sip:[email protected]<mailto:sip%[email protected]> on replica 0 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE) sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Debug memcachedstore.cpp:423: Read for reg\\sip:[email protected]<mailto:sip%[email protected]> on replica 1 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE) sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Error memcachedstore.cpp:512: Failed to read data for reg\\sip:[email protected]<mailto:sip%[email protected]> from 2 replicas sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Error registrar.cpp:249: Failed to get AoR binding for sip:[email protected]<mailto:sip%[email protected]> from store sp-2-sprout_20150330T020000Z.txt:From: <sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1 sp-2-sprout_20150330T020000Z.txt:To: <sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bKPjuv9-F1vnED88Va2irPvoVQDyFmJ0s.kw sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.897 UTC Debug memcachedstore.cpp:193: Key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 hashes to vbucket 77 via hash 0x15f14fcd sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.914 UTC Debug memcachedstore.cpp:365: 2 read replicas for key av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug memcachedstore.cpp:408: Read for av\\[email protected]<mailto:[email protected]>\6a661a5278f64006 on replica 0 returned SUCCESS sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug memcachedstore.cpp:451: Read 167 bytes from table av key [email protected]<mailto:[email protected]>\6a661a5278f64006, CAS = 2 sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug avstore.cpp:99: Retrieved AV for [email protected]<mailto:[email protected]>\6a661a5278f64006 Thanks, Lianjie On Fri, Mar 27, 2015 at 2:59 PM Eleanor Merry <[email protected]<mailto:[email protected]>> wrote: Hi, It looks like there's an issue with the underlying memcached process. Can you check whether memcached is running on all your Sprout nodes, and check the memcached logs in /var/log/memcached.log and /var/log/syslog? Also, can you confirm that all your nodes are running Anna Karenina, and that NTP is running on your Sprout nodes? Ellie -----Original Message----- From: [email protected]<mailto:[email protected]> [mailto:[email protected]<mailto:[email protected]>] On Behalf Of Lianjie Cao Sent: 26 March 2015 15:55 To: [email protected]<mailto:[email protected]>; Sharma, Puneet; Sonia Fahmy Subject: [Clearwater] Sprout clustering problem Hi, Our current Clearwater deployment has 1 Bono node, 2 Sprout nodes, 2 Homestead nodes and 1 Homer node running Anna Karenina Release. Everything works fine. However, when we are trying to add the third Sprout node, we get error responses during registration test, such as "500 Server Internal Error" and ''401 Unauthorized" for the second register request. To cluster Sprout nodes, we follow the instruction to modify cluster_settings and chronos.conf as below. [sprout]cw@sprout-3:~$ cat /etc/clearwater/cluster_settings servers=192.168.1.21:11211<http://192.168.1.21:11211>, 192.168.1.22:11211<http://192.168.1.22:11211>, 192.168.1.23:11211<http://192.168.1.23:11211> [sprout]cw@sprout-3:~$ cat /etc/chronos/chronos.conf [http] bind-address = 192.168.1.23 bind-port = 7253 threads = 50 [logging] folder = /var/log/chronos level = 2 [cluster] localhost = 192.168.1.23 node = 192.168.1.21 node = 192.168.1.22 node = 192.168.1.23 [alarms] enabled = true [exceptions] max_ttl = 600 The three nodes share the same files except "bind-address = 192.168.1.23" and "localhost = 192.168.1.23" in chronos.conf. In the Sprout log, we found errors messages like this on all of the 3 Sprout nodes with log level 5 (complete logs are attached): 26-03-2015 07:49:16.717 UTC Debug registrar.cpp:618: REGISTER for public ID sip:[email protected]<mailto:sip%[email protected]> uses AOR sip:[email protected]<mailto:sip%[email protected]> 26-03-2015 07:49:16.717 UTC Debug regstore.cpp:120: Get AoR data for sip:[email protected]<mailto:sip%[email protected]> 26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:193: Key reg\\ sip:[email protected]<mailto:sip%[email protected]> hashes to vbucket 14 via hash 0xe25f38e 26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:365: 2 read replicas for key reg\\sip:[email protected]<mailto:sip%[email protected]> 26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:400: Attempt to read from replica 0 (connection 0x7f6ce000ca30) 26-03-2015 07:49:16.718 UTC Debug memcachedstore.cpp:423: Read for reg\\ sip:[email protected]<mailto:sip%[email protected]> on replica 0 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE) 26-03-2015 07:49:16.718 UTC Debug memcachedstore.cpp:400: Attempt to read from replica 1 (connection 0x7f6ce00129e0) 26-03-2015 07:49:16.720 UTC Debug memcachedstore.cpp:423: Read for reg\\ sip:[email protected]<mailto:sip%[email protected]> on replica 1 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE) 26-03-2015 07:49:16.720 UTC Error memcachedstore.cpp:512: Failed to read data for reg\\sip:[email protected]<mailto:sip%[email protected]> from 2 replicas 26-03-2015 07:49:16.720 UTC Debug registrar.cpp:242: Retrieved AoR data (nil) 26-03-2015 07:49:16.720 UTC Error registrar.cpp:249: Failed to get AoR binding for sip:[email protected]<mailto:sip%[email protected]> from store 26-03-2015 07:49:16.720 UTC Debug pjsip: endpoint Response msg 500/REGISTER/cseq=2 (tdta0x7f6ce0081900) created 26-03-2015 07:49:16.720 UTC Verbose common_sip_processing.cpp:136: TX 449 bytes Response msg 500/REGISTER/cseq=2 (tdta0x7f6ce0081900) to TCP 192.168.1.11:36197<http://192.168.1.11:36197>: 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:234: Set up new view 1 for thread 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:241: Setting up server 0 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS --POLL-TIMEOUT=250 --BINARY-PROTOCOL) 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:243: Set up connection 0x7f6d10041460 to server 192.168.1.21:11211<http://192.168.1.21:11211> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:254: Setting server to IP address 192.168.1.21 port 11211 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:241: Setting up server 1 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS --POLL-TIMEOUT=250 --BINARY-PROTOCOL) 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:243: Set up connection 0x7f6d10045be0 to server 192.168.1.22:11211<http://192.168.1.22:11211> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:254: Setting server to IP address 192.168.1.22 port 11211 26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:241: Setting up server 2 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS --POLL-TIMEOUT=250 --BINARY-PROTOCOL) 26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:243: Set up connection 0x7f6d1004b3c0 to server 192.168.1.23:11211<http://192.168.1.23:11211> 26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:254: Setting server to IP address 192.168.1.23 port 11211 26-03-2015 07:49:16.938 UTC Debug memcachedstore.cpp:560: 2 write replicas for key av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e 26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:612: Attempt conditional write to vbucket 93 on replica 0 (connection 0x7f6d1004b3c0), CAS = 0, expiry = 40 26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:812: Attempting to add data for key av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e 26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:822: Attempting memcached ADD command 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:912: ADD/CAS returned rc = 5 (WRITE FAILURE) (140106396906432) getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE, Name or service not known, host: 192.168.1.23:11211<http://192.168.1.23:11211> -> libmemcached/connect.cc:197 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:612: Attempt conditional write to vbucket 93 on replica 1 (connection 0x7f6d10045be0), CAS = 0, expiry = 40 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:812: Attempting to add data for key av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:822: Attempting memcached ADD command 26-03-2015 07:49:16.945 UTC Debug memcachedstore.cpp:912: ADD/CAS returned rc = 5 (WRITE FAILURE) (140106396883936) getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE, Name or service not known, host: 192.168.1.22:11211<http://192.168.1.22:11211> -> libmemcached/connect.cc:197 26-03-2015 07:49:16.945 UTC Error memcachedstore.cpp:715: Failed to write data for av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e to 2 replicas 26-03-2015 07:49:16.945 UTC Error avstore.cpp:71: Failed to write Authentication Vector for private_id [email protected]<mailto:[email protected]> 26-03-2015 07:49:16.945 UTC Verbose common_sip_processing.cpp:136: TX 569 bytes Response msg 401/REGISTER/cseq=1 (tdta0x7f6d10006fa0) to TCP 192.168.1.11:40630<http://192.168.1.11:40630>: Can you help us on this? Thanks, Lianjie _______________________________________________ Clearwater mailing list [email protected] http://lists.projectclearwater.org/listinfo/clearwater
