Hi Lianjie,

This is due to a bug with how we pull out the memcached information from 
/etc/clearwater/cluster_settings, as we don’t strip off any whitespace. In your 
case, memcached tries to connect to
“192.168.1.21:1121”, “<space>192.168.1.22:1121” and “<space>192.168.1.23:1121”, 
where the last two don’t exist. This means that reading/writing to memcached 
regularly fails.

I’ve raised an issue to track fixing this (see 
https://github.com/Metaswitch/cpp-common/issues/273) – in the meantime can you 
please remove the whitespace from your cluster_settings files, and reload each 
Sprout (“sudo service sprout reload”)?

Also, did you use the same file when there were two Sprouts? If so then I 
believe you’ve probably been hitting this issue already, except registrations 
would have succeeded as at least one replica was a valid memcached instance (as 
the replication factor is two, at least one of the replicas would have been the 
valid “192.168.1.21:1121” entry”).

Ellie

From: Lianjie Cao [mailto:[email protected]]
Sent: 30 March 2015 16:34
To: Eleanor Merry; [email protected]; Sharma, Puneet; 
Sonia Fahmy
Subject: Re: [Clearwater] Sprout clustering problem

Hi Ellie,
memcached runs fine on all 3 Sprout nodes:
[sprout]cw@sprout-1:~$ ps -ef | grep memcached
memcache  1091     1  0 16:44 ?        00:00:35 /usr/bin/memcached -v -e 
ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l 192.168.1.21 
-c 4096
root     26144  1107  0 18:38 ?        00:00:00 [poll_memcached.] <defunct>
cw       26173 15526  0 18:38 pts/0    00:00:00 grep --color=auto memcached

[sprout]cw@sprout-2:~$ ps -ef | grep memcached
memcache  1044     1  0 16:44 ?        00:00:00 /usr/bin/memcached -v -e 
ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l 192.168.1.22 
-c 4096
root     25877  1035  0 18:39 ?        00:00:00 [poll_memcached.] <defunct>
cw       25906 15808  0 18:39 pts/0    00:00:00 grep --color=auto memcached

[sprout]cw@sprout-3:~$ ps -ef | grep memcached
memcache  1101     1  0 17:43 ?        00:00:00 /usr/bin/memcached -v -e 
ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l 192.168.1.23 
-c 4096
root     12949  1075  0 18:39 ?        00:00:00 [poll_memcached.] <defunct>
cw       12975  3317  0 18:39 pts/0    00:00:00 grep --color=auto memcached

memcached logs look alright:
[sprout]cw@sprout-1:~$ tail -20 /var/log/memcached.log
profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap
Initiating shutdown
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap

[sprout]cw@sprout-2:~$ tail -20 /var/log/memcached.log
profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap
Initiating shutdown
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap

[sprout]cw@sprout-3:~$ tail -20 /var/log/memcached.log
Supplying the following features: LRU, compare and swap
Initiating shutdown
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
profiling:/var/lib/jenkins:Cannot create directory
profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap
Loaded engine: Default engine v0.1
Supplying the following features: LRU, compare and swap

syslog also doesn't report errors except those about snmp, which is actually 
another problem I am having.
Mar 29 17:43:28 sprout-3 snmpd[1620]: NET-SNMP version 5.4.3
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: prNames
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: prNames
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: prErrMessage
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: prErrMessage
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: prErrorFlag
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: memErrorName
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: memErrorName
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: memSwapErrorMsg
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: memSwapErrorMsg
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: memSwapError
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: extNames
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: extNames
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: extOutput
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: extOutput
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: extResult
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: dskPath
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: dskPath
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: dskErrorMsg
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: dskErrorMsg
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: dskErrorFlag
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: laNames
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: laNames
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: laErrMessage
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: laErrMessage
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: laErrorFlag
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: fileName
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: fileName
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: fileErrorMsg
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: fileErrorMsg
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: fileErrorFlag
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: snmperrErrMessage
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: snmperrErrMessage
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
Unknown payload OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: snmperrErrorFlag
Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144: Error: 
unknown monitor OID
Mar 29 18:53:24 sprout-3 snmpd[1062]: Turning on AgentX master support.
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/sprout_handler.so: cannot open shared object file: No such 
file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/bono_handler.so: cannot open shared object file: No such 
file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/homestead_handler.so: cannot open shared object file: No 
such file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/alarm_handler.so: cannot open shared object file: No such 
file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/cdiv_handler.so: cannot open shared object file: No such 
file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/memento_handler.so: cannot open shared object file: No such 
file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed: 
/usr/lib/clearwater/memento_as_handler.so: cannot open shared object file: No 
such file or directory
Mar 29 18:53:24 sprout-3 snmpd[1062]: net-snmp: 33 error(s) in config file(s)
Mar 29 18:53:24 sprout-3 snmpd[1062]: NET-SNMP version 5.4.3

Also, I couldn't poll snmp information from any node of Clearwater using 
https://github.com/Metaswitch/cpp-common/tree/master/scripts/stats.
It doesn't return any response:
cw@gateway:~/stats/bin$ ./cw_stat 192.168.1.11 6669 client_count
Error: No response from host, ensure hostname and port are correct
Error: Terminating due to previous errors

NTP on 3 Sprout nodes are synchronizing with the same server which generates 
SIPp traffic for testing.
And they are all upgraded to the latest release.

Attached are Sprout logs and SIPp message and error logs of a new test with 10 
reg-unreg calls (number 6500000000-650000009, 1 call/sec) for simplicity of 
analysis.
The 7 failed calls are all due to the second REGISTER message was returned with 
401 or 500.
It seems like for the three successfully registered numbers (6500000002, 
6500000005 and 6500000008), the first and second REGISTER messages were handled 
by the same Sprout node.
And the failed ones were handled by two different Sprout nodes.
For instance, 6500000000's first REGISTER message was processed by Sprout-1 
which was successfull. And the second REGISTER message was processed by 
Sprout-2.
Sprout-2 could read AV information from previous step and retrieve register 
information from Homestead, but failed on reading it afterwards. So, it finally 
generated 500 response.
You can check the complete logs for more details.
Could this be some synchronization problem of memcached? Interesting thing is, 
if I disable Sprout-3, everything works fine.

nfv@Z420-ctl:~/cw-test/openstack/trace/3-sprout$ grep 6500000000 
sp-*-sprout_20150330T020000Z.txt
sp-1-sprout_20150330T020000Z.txt:REGISTER 
sip:[email protected]<mailto:sip%[email protected]> 
SIP/2.0
sp-1-sprout_20150330T020000Z.txt:From: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1
sp-1-sprout_20150330T020000Z.txt:To: 
<sip:[email protected]<mailto:sip%[email protected]>>
sp-1-sprout_20150330T020000Z.txt:Contact: 
<sip:[email protected]:31135;transport=UDP>
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.562 UTC Debug 
pjutils.cpp:470: Private identity defaulted from public identity = 
[email protected]<mailto:[email protected]>
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.566 UTC Debug 
httpconnection.cpp:553: Sending HTTP request : 
http://hs.hp-clearwater.com:8888/impi/6500000000%40hp-clearwater.com/av?impu=sip%3A6500000000%40hp-clearwater.com
 (trying 192.168.1.31) on new connection
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug 
avstore.cpp:66: Set AV for 
[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug 
memcachedstore.cpp:540: Writing 148 bytes to table av key 
[email protected]<mailto:[email protected]>\6a661a5278f64006,
 CAS = 0, expiry = 40
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug 
memcachedstore.cpp:193: Key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
 hashes to vbucket 77 via hash 0x15f14fcd
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.878 UTC Debug 
memcachedstore.cpp:560: 2 write replicas for key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.878 UTC Debug 
memcachedstore.cpp:812: Attempting to add data for key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.881 UTC Debug 
authentication.cpp:396: Sending {"impi": 
"[email protected]<mailto:[email protected]>", "impu": 
"sip:[email protected]<mailto:sip%[email protected]>", 
"nonce": "6a661a5278f64006"} to Chronos to set AV timer
sp-1-sprout_20150330T020000Z.txt:From: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1
sp-1-sprout_20150330T020000Z.txt:To: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bKPjWD7evZbxVle9AjzwRDtxOfsPP46A2SBN
sp-2-sprout_20150330T020000Z.txt:REGISTER 
sip:[email protected]<mailto:sip%[email protected]> 
SIP/2.0
sp-2-sprout_20150330T020000Z.txt:From: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1
sp-2-sprout_20150330T020000Z.txt:To: 
<sip:[email protected]<mailto:sip%[email protected]>>
sp-2-sprout_20150330T020000Z.txt:Contact: 
<sip:[email protected]:31135;transport=UDP>
sp-2-sprout_20150330T020000Z.txt:Authorization: Digest 
username="[email protected]<mailto:[email protected]>", 
realm="hp-clearwater.com<http://hp-clearwater.com>", nonce="6a661a5278f64006", 
uri="sip:15.9.114.162:5060<http://15.9.114.162:5060>", 
response="46d9005162ed55dd1629c6766c7b50c6", algorithm=MD5, cnonce="6b8b4567", 
opaque="4abd75561ff7aa78", qop=auth, 
nc=00000001,integrity-protected=ip-assoc-pending
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.893 UTC Debug 
memcachedstore.cpp:193: Key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
 hashes to vbucket 77 via hash 0x15f14fcd
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug 
memcachedstore.cpp:365: 2 read replicas for key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug 
memcachedstore.cpp:408: Read for 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
 on replica 0 returned SUCCESS
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug 
memcachedstore.cpp:451: Read 148 bytes from table av key 
[email protected]<mailto:[email protected]>\6a661a5278f64006,
 CAS = 1
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug 
avstore.cpp:99: Retrieved AV for 
[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug 
avstore.cpp:66: Set AV for 
[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug 
memcachedstore.cpp:540: Writing 167 bytes to table av key 
[email protected]<mailto:[email protected]>\6a661a5278f64006,
 CAS = 1, expiry = 40
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug 
memcachedstore.cpp:193: Key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
 hashes to vbucket 77 via hash 0x15f14fcd
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug 
memcachedstore.cpp:560: 2 write replicas for key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.909 UTC Debug 
registrar.cpp:518: Process REGISTER for public ID 
sip:[email protected]<mailto:sip%[email protected]>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.909 UTC Debug 
hssconnection.cpp:563: Making Homestead request for 
/impu/sip%3A6500000000%40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com<http://40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.915 UTC Debug 
httpconnection.cpp:553: Sending HTTP request : 
http://hs.hp-clearwater.com:8888/impu/sip%3A6500000000%40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com
 (trying 192.168.1.31) on new connection
sp-2-sprout_20150330T020000Z.txt:               
<PrivateID>[email protected]<mailto:[email protected]></PrivateID>
sp-2-sprout_20150330T020000Z.txt:                               
<Identity>sip:[email protected]<mailto:sip%[email protected]></Identity>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug 
hssconnection.cpp:349: Processing Identity node from HSS XML - 
sip:[email protected]<mailto:sip%[email protected]>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug 
registrar.cpp:618: REGISTER for public ID 
sip:[email protected]<mailto:sip%[email protected]> 
uses AOR 
sip:[email protected]<mailto:sip%[email protected]>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug 
regstore.cpp:120: Get AoR data for 
sip:[email protected]<mailto:sip%[email protected]>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug 
memcachedstore.cpp:193: Key 
reg\\sip:[email protected]<mailto:sip%[email protected]>
 hashes to vbucket 14 via hash 0xe25f38e
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug 
memcachedstore.cpp:365: 2 read replicas for key 
reg\\sip:[email protected]<mailto:sip%[email protected]>
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.966 UTC Debug 
memcachedstore.cpp:423: Read for 
reg\\sip:[email protected]<mailto:sip%[email protected]>
 on replica 0 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP 
FAILURE)
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Debug 
memcachedstore.cpp:423: Read for 
reg\\sip:[email protected]<mailto:sip%[email protected]>
 on replica 1 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP 
FAILURE)
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Error 
memcachedstore.cpp:512: Failed to read data for 
reg\\sip:[email protected]<mailto:sip%[email protected]>
 from 2 replicas
sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Error 
registrar.cpp:249: Failed to get AoR binding for 
sip:[email protected]<mailto:sip%[email protected]> 
from store
sp-2-sprout_20150330T020000Z.txt:From: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=2619reg1
sp-2-sprout_20150330T020000Z.txt:To: 
<sip:[email protected]<mailto:sip%[email protected]>>;tag=z9hG4bKPjuv9-F1vnED88Va2irPvoVQDyFmJ0s.kw
sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.897 UTC Debug 
memcachedstore.cpp:193: Key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
 hashes to vbucket 77 via hash 0x15f14fcd
sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.914 UTC Debug 
memcachedstore.cpp:365: 2 read replicas for key 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug 
memcachedstore.cpp:408: Read for 
av\\[email protected]<mailto:[email protected]>\6a661a5278f64006
 on replica 0 returned SUCCESS
sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug 
memcachedstore.cpp:451: Read 167 bytes from table av key 
[email protected]<mailto:[email protected]>\6a661a5278f64006,
 CAS = 2
sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug 
avstore.cpp:99: Retrieved AV for 
[email protected]<mailto:[email protected]>\6a661a5278f64006


Thanks,
Lianjie

On Fri, Mar 27, 2015 at 2:59 PM Eleanor Merry 
<[email protected]<mailto:[email protected]>> wrote:
Hi,

It looks like there's an issue with the underlying memcached process. Can you 
check whether memcached is running on all your Sprout nodes, and check the 
memcached logs in /var/log/memcached.log and /var/log/syslog?

Also, can you confirm that all your nodes are running Anna Karenina, and that 
NTP is running on your Sprout nodes?

Ellie

-----Original Message-----
From: 
[email protected]<mailto:[email protected]>
 
[mailto:[email protected]<mailto:[email protected]>]
 On Behalf Of Lianjie Cao
Sent: 26 March 2015 15:55
To: 
[email protected]<mailto:[email protected]>;
 Sharma, Puneet; Sonia Fahmy
Subject: [Clearwater] Sprout clustering problem

Hi,

Our current Clearwater deployment has 1 Bono node, 2 Sprout nodes, 2 Homestead 
nodes and 1 Homer node running Anna Karenina Release. Everything works fine.
However, when we are trying to add the third Sprout node, we get error 
responses during registration test, such as "500 Server Internal Error" and
''401 Unauthorized" for the second register request.

To cluster Sprout nodes, we follow the instruction to modify cluster_settings 
and chronos.conf as below.

[sprout]cw@sprout-3:~$ cat /etc/clearwater/cluster_settings 
servers=192.168.1.21:11211<http://192.168.1.21:11211>, 
192.168.1.22:11211<http://192.168.1.22:11211>, 
192.168.1.23:11211<http://192.168.1.23:11211>

[sprout]cw@sprout-3:~$ cat /etc/chronos/chronos.conf [http] bind-address = 
192.168.1.23 bind-port = 7253 threads = 50

[logging]
folder = /var/log/chronos
level = 2

[cluster]
localhost = 192.168.1.23
node = 192.168.1.21
node = 192.168.1.22
node = 192.168.1.23

[alarms]
enabled = true

[exceptions]
max_ttl = 600



The three nodes share the same files except "bind-address = 192.168.1.23"
and "localhost = 192.168.1.23" in chronos.conf.

In the Sprout log, we found errors messages like this on all of the 3 Sprout 
nodes with log level 5 (complete logs are attached):

26-03-2015 07:49:16.717 UTC Debug registrar.cpp:618: REGISTER for public ID 
sip:[email protected]<mailto:sip%[email protected]> 
uses AOR 
sip:[email protected]<mailto:sip%[email protected]>
26-03-2015 07:49:16.717 UTC Debug regstore.cpp:120: Get AoR data for 
sip:[email protected]<mailto:sip%[email protected]>
26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:193: Key reg\\ 
sip:[email protected]<mailto:sip%[email protected]> 
hashes to vbucket 14 via hash 0xe25f38e
26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:365: 2 read replicas for 
key 
reg\\sip:[email protected]<mailto:sip%[email protected]>
26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:400: Attempt to read from 
replica 0 (connection 0x7f6ce000ca30)
26-03-2015 07:49:16.718 UTC Debug memcachedstore.cpp:423: Read for reg\\ 
sip:[email protected]<mailto:sip%[email protected]> on 
replica 0 returned error 2
(getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE)
26-03-2015 07:49:16.718 UTC Debug memcachedstore.cpp:400: Attempt to read from 
replica 1 (connection 0x7f6ce00129e0)
26-03-2015 07:49:16.720 UTC Debug memcachedstore.cpp:423: Read for reg\\ 
sip:[email protected]<mailto:sip%[email protected]> on 
replica 1 returned error 2
(getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE)
26-03-2015 07:49:16.720 UTC Error memcachedstore.cpp:512: Failed to read data 
for 
reg\\sip:[email protected]<mailto:sip%[email protected]>
 from 2 replicas
26-03-2015 07:49:16.720 UTC Debug registrar.cpp:242: Retrieved AoR data
(nil)
26-03-2015 07:49:16.720 UTC Error registrar.cpp:249: Failed to get AoR binding 
for sip:[email protected]<mailto:sip%[email protected]> 
from store
26-03-2015 07:49:16.720 UTC Debug pjsip:       endpoint Response msg
500/REGISTER/cseq=2 (tdta0x7f6ce0081900) created
26-03-2015 07:49:16.720 UTC Verbose common_sip_processing.cpp:136: TX 449 bytes 
Response msg 500/REGISTER/cseq=2 (tdta0x7f6ce0081900) to TCP
192.168.1.11:36197<http://192.168.1.11:36197>:

26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:234: Set up new view 1 for 
thread
26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:241: Setting up server
0 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS
--POLL-TIMEOUT=250 --BINARY-PROTOCOL)
26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:243: Set up connection
0x7f6d10041460 to server 192.168.1.21:11211<http://192.168.1.21:11211>
26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:254: Setting server to IP 
address 192.168.1.21 port 11211
26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:241: Setting up server
1 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS
--POLL-TIMEOUT=250 --BINARY-PROTOCOL)
26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:243: Set up connection
0x7f6d10045be0 to server  192.168.1.22:11211<http://192.168.1.22:11211>
26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:254: Setting server to IP 
address  192.168.1.22 port 11211
26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:241: Setting up server
2 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS
--POLL-TIMEOUT=250 --BINARY-PROTOCOL)
26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:243: Set up connection
0x7f6d1004b3c0 to server  192.168.1.23:11211<http://192.168.1.23:11211>
26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:254: Setting server to IP 
address  192.168.1.23 port 11211
26-03-2015 07:49:16.938 UTC Debug memcachedstore.cpp:560: 2 write replicas for 
key 
av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e
26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:612: Attempt conditional 
write to vbucket 93 on replica 0 (connection 0x7f6d1004b3c0), CAS = 0, expiry = 
40
26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:812: Attempting to add 
data for key 
av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e
26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:822: Attempting memcached 
ADD command
26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:912: ADD/CAS returned rc = 
5 (WRITE FAILURE)
(140106396906432) getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE, Name 
or service not known,  host:  192.168.1.23:11211<http://192.168.1.23:11211> ->
libmemcached/connect.cc:197
26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:612: Attempt conditional 
write to vbucket 93 on replica 1 (connection 0x7f6d10045be0), CAS = 0, expiry = 
40
26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:812: Attempting to add 
data for key 
av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e
26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:822: Attempting memcached 
ADD command
26-03-2015 07:49:16.945 UTC Debug memcachedstore.cpp:912: ADD/CAS returned rc = 
5 (WRITE FAILURE)
(140106396883936) getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE, Name 
or service not known,  host:  192.168.1.22:11211<http://192.168.1.22:11211> ->
libmemcached/connect.cc:197
26-03-2015 07:49:16.945 UTC Error memcachedstore.cpp:715: Failed to write data 
for 
av\\[email protected]<mailto:[email protected]>\3a44ea384df9161e
 to 2 replicas
26-03-2015 07:49:16.945 UTC Error avstore.cpp:71: Failed to write 
Authentication Vector for private_id 
[email protected]<mailto:[email protected]>
26-03-2015 07:49:16.945 UTC Verbose common_sip_processing.cpp:136: TX 569 bytes 
Response msg 401/REGISTER/cseq=1 (tdta0x7f6d10006fa0) to TCP
192.168.1.11:40630<http://192.168.1.11:40630>:



Can you help us on this?

Thanks,
Lianjie
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater

Reply via email to