Hi Ellie,

Thanks a lot for the information! We are trying this out.

One more question is about Clearwater-snmp packages. From the log, it seems
the snmp libraries are not loaded successfully, so we couldn't poll any
snmp info from Clearwater nodes.

Can you advise us how to get it work?

Thanks,
Lianjie

On Thu, Apr 2, 2015 at 11:03 AM, Eleanor Merry <[email protected]
> wrote:

>  Hi Lianjie,
>
>
>
> This is due to a bug with how we pull out the memcached information from
> /etc/clearwater/cluster_settings, as we don’t strip off any whitespace. In
> your case, memcached tries to connect to
>
> “192.168.1.21:1121”, “<space>192.168.1.22:1121” and
> “<space>192.168.1.23:1121”, where the last two don’t exist. This means
> that reading/writing to memcached regularly fails.
>
>
>
> I’ve raised an issue to track fixing this (see
> https://github.com/Metaswitch/cpp-common/issues/273) – in the meantime
> can you please remove the whitespace from your cluster_settings files, and
> reload each Sprout (“sudo service sprout reload”)?
>
>
>
> Also, did you use the same file when there were two Sprouts? If so then I
> believe you’ve probably been hitting this issue already, except
> registrations would have succeeded as at least one replica was a valid
> memcached instance (as the replication factor is two, at least one of the
> replicas would have been the valid “192.168.1.21:1121” entry”).
>
>
>
> Ellie
>
>
>
> *From:* Lianjie Cao [mailto:[email protected]]
> *Sent:* 30 March 2015 16:34
> *To:* Eleanor Merry; [email protected]; Sharma,
> Puneet; Sonia Fahmy
> *Subject:* Re: [Clearwater] Sprout clustering problem
>
>
>
> Hi Ellie,
>
> memcached runs fine on all 3 Sprout nodes:
>
>  [sprout]cw@sprout-1:~$ ps -ef | grep memcached
>
> memcache  1091     1  0 16:44 ?        00:00:35 /usr/bin/memcached -v -e
> ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l
> 192.168.1.21 -c 4096
>
> root     26144  1107  0 18:38 ?        00:00:00 [poll_memcached.] <defunct>
>
> cw       26173 15526  0 18:38 pts/0    00:00:00 grep --color=auto memcached
>
>
>
> [sprout]cw@sprout-2:~$ ps -ef | grep memcached
>
> memcache  1044     1  0 16:44 ?        00:00:00 /usr/bin/memcached -v -e
> ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l
> 192.168.1.22 -c 4096
>
> root     25877  1035  0 18:39 ?        00:00:00 [poll_memcached.] <defunct>
>
> cw       25906 15808  0 18:39 pts/0    00:00:00 grep --color=auto memcached
>
>
>
> [sprout]cw@sprout-3:~$ ps -ef | grep memcached
>
> memcache  1101     1  0 17:43 ?        00:00:00 /usr/bin/memcached -v -e
> ignore_vbucket=true;cache_size=512000000 -p 11211 -u memcache -l
> 192.168.1.23 -c 4096
>
> root     12949  1075  0 18:39 ?        00:00:00 [poll_memcached.] <defunct>
>
> cw       12975  3317  0 18:39 pts/0    00:00:00 grep --color=auto memcached
>
>
>
> memcached logs look alright:
>
>  [sprout]cw@sprout-1:~$ tail -20 /var/log/memcached.log
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
> Initiating shutdown
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
>
>
> [sprout]cw@sprout-2:~$ tail -20 /var/log/memcached.log
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
> Initiating shutdown
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
>
>
> [sprout]cw@sprout-3:~$ tail -20 /var/log/memcached.log
>
> Supplying the following features: LRU, compare and swap
>
> Initiating shutdown
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-cache.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-thread.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-stats.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-memcached.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-hash.gcda:Skip
>
> profiling:/var/lib/jenkins:Cannot create directory
>
> profiling:/var/lib/jenkins/workspace/memcached/memcached-daemon.gcda:Skip
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
> Loaded engine: Default engine v0.1
>
> Supplying the following features: LRU, compare and swap
>
>
>
> syslog also doesn't report errors except those about snmp, which is
> actually another problem I am having.
>
>   Mar 29 17:43:28 sprout-3 snmpd[1620]: NET-SNMP version 5.4.3
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: prNames
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: prNames
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: prErrMessage
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: prErrMessage
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: prErrorFlag
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: memErrorName
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: memErrorName
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: memSwapErrorMsg
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: memSwapErrorMsg
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: memSwapError
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: extNames
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: extNames
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: extOutput
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: extOutput
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: extResult
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: dskPath
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: dskPath
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: dskErrorMsg
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: dskErrorMsg
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: dskErrorFlag
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: laNames
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: laNames
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: laErrMessage
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: laErrMessage
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: laErrorFlag
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: fileName
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: fileName
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: fileErrorMsg
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID: fileErrorMsg
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: fileErrorFlag
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: payload OID: snmperrErrMessage
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Unknown payload OID:
> snmperrErrMessage
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: Unknown payload OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: trigger OID: snmperrErrorFlag
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: /etc/snmp/snmpd.conf: line 144:
> Error: unknown monitor OID
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: Turning on AgentX master support.
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/sprout_handler.so: cannot open shared object file: No
> such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/bono_handler.so: cannot open shared object file: No
> such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/homestead_handler.so: cannot open shared object file:
> No such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/alarm_handler.so: cannot open shared object file: No
> such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/cdiv_handler.so: cannot open shared object file: No
> such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/memento_handler.so: cannot open shared object file: No
> such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: dlopen failed:
> /usr/lib/clearwater/memento_as_handler.so: cannot open shared object file:
> No such file or directory
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: net-snmp: 33 error(s) in config
> file(s)
>
> Mar 29 18:53:24 sprout-3 snmpd[1062]: NET-SNMP version 5.4.3
>
>
>
> Also, I couldn't poll snmp information from any node of Clearwater using
> https://github.com/Metaswitch/cpp-common/tree/master/scripts/stats.
>
> It doesn't return any response:
>
>   cw@gateway:~/stats/bin$ ./cw_stat 192.168.1.11 6669 client_count
>
> Error: No response from host, ensure hostname and port are correct
>
> Error: Terminating due to previous errors
>
>
>
> NTP on 3 Sprout nodes are synchronizing with the same server which
> generates SIPp traffic for testing.
>
> And they are all upgraded to the latest release.
>
>
>
> Attached are Sprout logs and SIPp message and error logs of a new test
> with 10 reg-unreg calls (number 6500000000-650000009, 1 call/sec) for
> simplicity of analysis.
>
> The 7 failed calls are all due to the second REGISTER message was returned
> with 401 or 500.
>
> It seems like for the three successfully registered numbers
> (6500000002, 6500000005 and 6500000008), the first and second REGISTER
> messages were handled by the same Sprout node.
>
> And the failed ones were handled by two different Sprout nodes.
>
> For instance, 6500000000's first REGISTER message was processed by
> Sprout-1 which was successfull. And the second REGISTER message was
> processed by Sprout-2.
>
> Sprout-2 could read AV information from previous step and retrieve
> register information from Homestead, but failed on reading it afterwards.
> So, it finally generated 500 response.
>
> You can check the complete logs for more details.
>
> Could this be some synchronization problem of memcached? Interesting thing
> is, if I disable Sprout-3, everything works fine.
>
>
>
>  nfv@Z420-ctl:~/cw-test/openstack/trace/3-sprout$ grep 6500000000
> sp-*-sprout_20150330T020000Z.txt
>
> sp-1-sprout_20150330T020000Z.txt:REGISTER sip:[email protected]
> SIP/2.0
>
> sp-1-sprout_20150330T020000Z.txt:From: <sip:[email protected]
> >;tag=2619reg1
>
> sp-1-sprout_20150330T020000Z.txt:To: <sip:[email protected]>
>
> sp-1-sprout_20150330T020000Z.txt:Contact: <
> sip:[email protected]:31135;transport=UDP>
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.562 UTC Debug
> pjutils.cpp:470: Private identity defaulted from public identity =
> [email protected]
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.566 UTC Debug
> httpconnection.cpp:553: Sending HTTP request :
> http://hs.hp-clearwater.com:8888/impi/6500000000%40hp-clearwater.com/av?impu=sip%3A6500000000%40hp-clearwater.com
> (trying 192.168.1.31) on new connection
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug
> avstore.cpp:66: Set AV for [email protected]\6a661a5278f64006
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug
> memcachedstore.cpp:540: Writing 148 bytes to table av key
> [email protected]\6a661a5278f64006, CAS = 0, expiry = 40
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.634 UTC Debug
> memcachedstore.cpp:193: Key av\\[email protected]\6a661a5278f64006
> hashes to vbucket 77 via hash 0x15f14fcd
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.878 UTC Debug
> memcachedstore.cpp:560: 2 write replicas for key av\\
> [email protected]\6a661a5278f64006
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.878 UTC Debug
> memcachedstore.cpp:812: Attempting to add data for key av\\
> [email protected]\6a661a5278f64006
>
> sp-1-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.881 UTC Debug
> authentication.cpp:396: Sending {"impi": "[email protected]",
> "impu": "sip:[email protected]", "nonce": "6a661a5278f64006"}
> to Chronos to set AV timer
>
> sp-1-sprout_20150330T020000Z.txt:From: <sip:[email protected]
> >;tag=2619reg1
>
> sp-1-sprout_20150330T020000Z.txt:To: <sip:[email protected]
> >;tag=z9hG4bKPjWD7evZbxVle9AjzwRDtxOfsPP46A2SBN
>
> sp-2-sprout_20150330T020000Z.txt:REGISTER sip:[email protected]
> SIP/2.0
>
> sp-2-sprout_20150330T020000Z.txt:From: <sip:[email protected]
> >;tag=2619reg1
>
> sp-2-sprout_20150330T020000Z.txt:To: <sip:[email protected]>
>
> sp-2-sprout_20150330T020000Z.txt:Contact: <
> sip:[email protected]:31135;transport=UDP>
>
> sp-2-sprout_20150330T020000Z.txt:Authorization: Digest username="
> [email protected]", realm="hp-clearwater.com",
> nonce="6a661a5278f64006", uri="sip:15.9.114.162:5060",
> response="46d9005162ed55dd1629c6766c7b50c6", algorithm=MD5,
> cnonce="6b8b4567", opaque="4abd75561ff7aa78", qop=auth,
> nc=00000001,integrity-protected=ip-assoc-pending
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.893 UTC Debug
> memcachedstore.cpp:193: Key av\\[email protected]\6a661a5278f64006
> hashes to vbucket 77 via hash 0x15f14fcd
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug
> memcachedstore.cpp:365: 2 read replicas for key av\\
> [email protected]\6a661a5278f64006
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug
> memcachedstore.cpp:408: Read for 
> av\\[email protected]\6a661a5278f64006
> on replica 0 returned SUCCESS
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug
> memcachedstore.cpp:451: Read 148 bytes from table av key
> [email protected]\6a661a5278f64006, CAS = 1
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.903 UTC Debug
> avstore.cpp:99: Retrieved AV for [email protected]
> \6a661a5278f64006
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug
> avstore.cpp:66: Set AV for [email protected]\6a661a5278f64006
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug
> memcachedstore.cpp:540: Writing 167 bytes to table av key
> [email protected]\6a661a5278f64006, CAS = 1, expiry = 40
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug
> memcachedstore.cpp:193: Key av\\[email protected]\6a661a5278f64006
> hashes to vbucket 77 via hash 0x15f14fcd
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.904 UTC Debug
> memcachedstore.cpp:560: 2 write replicas for key av\\
> [email protected]\6a661a5278f64006
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.909 UTC Debug
> registrar.cpp:518: Process REGISTER for public ID
> sip:[email protected]
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.909 UTC Debug
> hssconnection.cpp:563: Making Homestead request for /impu/sip%3A6500000000%
> 40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.915 UTC Debug
> httpconnection.cpp:553: Sending HTTP request :
> http://hs.hp-clearwater.com:8888/impu/sip%3A6500000000%40hp-clearwater.com/reg-data?private_id=6500000000%40hp-clearwater.com
> (trying 192.168.1.31) on new connection
>
> sp-2-sprout_20150330T020000Z.txt:               <PrivateID>
> [email protected]</PrivateID>
>
> sp-2-sprout_20150330T020000Z.txt:                               <Identity>
> sip:[email protected]</Identity>
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug
> hssconnection.cpp:349: Processing Identity node from HSS XML -
> sip:[email protected]
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug
> registrar.cpp:618: REGISTER for public ID sip:[email protected]
> uses AOR sip:[email protected]
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug
> regstore.cpp:120: Get AoR data for sip:[email protected]
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug
> memcachedstore.cpp:193: Key reg\\sip:[email protected] hashes
> to vbucket 14 via hash 0xe25f38e
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.958 UTC Debug
> memcachedstore.cpp:365: 2 read replicas for key reg\\
> sip:[email protected]
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.966 UTC Debug
> memcachedstore.cpp:423: Read for reg\\sip:[email protected] on
> replica 0 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP
> FAILURE)
>
> sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Debug
> memcachedstore.cpp:423: Read for reg\\sip:[email protected] on
> replica 1 returned error 2 (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP
> FAILURE)
>
> *sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Error
> memcachedstore.cpp:512: Failed to read data for
> reg\\sip:[email protected] <sip%[email protected]>
> from 2 replicas*
>
> *sp-2-sprout_20150330T020000Z.txt:30-03-2015 02:45:10.970 UTC Error
> registrar.cpp:249: Failed to get AoR binding for
> sip:[email protected] <sip%[email protected]> from
> store*
>
> sp-2-sprout_20150330T020000Z.txt:From: <sip:[email protected]
> >;tag=2619reg1
>
> sp-2-sprout_20150330T020000Z.txt:To: <sip:[email protected]
> >;tag=z9hG4bKPjuv9-F1vnED88Va2irPvoVQDyFmJ0s.kw
>
> sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.897 UTC Debug
> memcachedstore.cpp:193: Key av\\[email protected]\6a661a5278f64006
> hashes to vbucket 77 via hash 0x15f14fcd
>
> sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.914 UTC Debug
> memcachedstore.cpp:365: 2 read replicas for key av\\
> [email protected]\6a661a5278f64006
>
> sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug
> memcachedstore.cpp:408: Read for 
> av\\[email protected]\6a661a5278f64006
> on replica 0 returned SUCCESS
>
> sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug
> memcachedstore.cpp:451: Read 167 bytes from table av key
> [email protected]\6a661a5278f64006, CAS = 2
>
> sp-3-sprout_20150330T020000Z.txt:30-03-2015 02:45:40.915 UTC Debug
> avstore.cpp:99: Retrieved AV for [email protected]
> \6a661a5278f64006
>
>
>
>
>
> Thanks,
>
> Lianjie
>
>
>
> On Fri, Mar 27, 2015 at 2:59 PM Eleanor Merry <
> [email protected]> wrote:
>
> Hi,
>
> It looks like there's an issue with the underlying memcached process. Can
> you check whether memcached is running on all your Sprout nodes, and check
> the memcached logs in /var/log/memcached.log and /var/log/syslog?
>
> Also, can you confirm that all your nodes are running Anna Karenina, and
> that NTP is running on your Sprout nodes?
>
> Ellie
>
> -----Original Message-----
> From: [email protected] [mailto:
> [email protected]] On Behalf Of Lianjie Cao
> Sent: 26 March 2015 15:55
> To: [email protected]; Sharma, Puneet; Sonia Fahmy
> Subject: [Clearwater] Sprout clustering problem
>
> Hi,
>
> Our current Clearwater deployment has 1 Bono node, 2 Sprout nodes, 2
> Homestead nodes and 1 Homer node running Anna Karenina Release. Everything
> works fine.
> However, when we are trying to add the third Sprout node, we get error
> responses during registration test, such as "500 Server Internal Error" and
> ''401 Unauthorized" for the second register request.
>
> To cluster Sprout nodes, we follow the instruction to modify
> cluster_settings and chronos.conf as below.
>
> [sprout]cw@sprout-3:~$ cat /etc/clearwater/cluster_settings servers=
> 192.168.1.21:11211, 192.168.1.22:11211, 192.168.1.23:11211
>
> [sprout]cw@sprout-3:~$ cat /etc/chronos/chronos.conf [http] bind-address
> = 192.168.1.23 bind-port = 7253 threads = 50
>
> [logging]
> folder = /var/log/chronos
> level = 2
>
> [cluster]
> localhost = 192.168.1.23
> node = 192.168.1.21
> node = 192.168.1.22
> node = 192.168.1.23
>
> [alarms]
> enabled = true
>
> [exceptions]
> max_ttl = 600
>
>
>
> The three nodes share the same files except "bind-address = 192.168.1.23"
> and "localhost = 192.168.1.23" in chronos.conf.
>
> In the Sprout log, we found errors messages like this on all of the 3
> Sprout nodes with log level 5 (complete logs are attached):
>
> 26-03-2015 07:49:16.717 UTC Debug registrar.cpp:618: REGISTER for public
> ID sip:[email protected] uses AOR
> sip:[email protected]
> 26-03-2015 07:49:16.717 UTC Debug regstore.cpp:120: Get AoR data for
> sip:[email protected]
> 26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:193: Key reg\\
> sip:[email protected] hashes to vbucket 14 via hash 0xe25f38e
> 26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:365: 2 read replicas
> for key reg\\sip:[email protected]
> 26-03-2015 07:49:16.717 UTC Debug memcachedstore.cpp:400: Attempt to read
> from replica 0 (connection 0x7f6ce000ca30)
> 26-03-2015 07:49:16.718 UTC Debug memcachedstore.cpp:423: Read for reg\\
> sip:[email protected] on replica 0 returned error 2
> (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE)
> 26-03-2015 07:49:16.718 UTC Debug memcachedstore.cpp:400: Attempt to read
> from replica 1 (connection 0x7f6ce00129e0)
> 26-03-2015 07:49:16.720 UTC Debug memcachedstore.cpp:423: Read for reg\\
> sip:[email protected] on replica 1 returned error 2
> (getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE)
> 26-03-2015 07:49:16.720 UTC Error memcachedstore.cpp:512: Failed to read
> data for reg\\sip:[email protected] from 2 replicas
> 26-03-2015 07:49:16.720 UTC Debug registrar.cpp:242: Retrieved AoR data
> (nil)
> 26-03-2015 07:49:16.720 UTC Error registrar.cpp:249: Failed to get AoR
> binding for sip:[email protected] from store
> 26-03-2015 07:49:16.720 UTC Debug pjsip:       endpoint Response msg
> 500/REGISTER/cseq=2 (tdta0x7f6ce0081900) created
> 26-03-2015 07:49:16.720 UTC Verbose common_sip_processing.cpp:136: TX 449
> bytes Response msg 500/REGISTER/cseq=2 (tdta0x7f6ce0081900) to TCP
> 192.168.1.11:36197:
>
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:234: Set up new view
> 1 for thread
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:241: Setting up server
> 0 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS
> --POLL-TIMEOUT=250 --BINARY-PROTOCOL)
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:243: Set up connection
> 0x7f6d10041460 to server 192.168.1.21:11211
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:254: Setting server
> to IP address 192.168.1.21 port 11211
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:241: Setting up server
> 1 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS
> --POLL-TIMEOUT=250 --BINARY-PROTOCOL)
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:243: Set up connection
> 0x7f6d10045be0 to server  192.168.1.22:11211
> 26-03-2015 07:49:16.933 UTC Debug memcachedstore.cpp:254: Setting server
> to IP address  192.168.1.22 port 11211
> 26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:241: Setting up server
> 2 for connection 0x7f6d10040e80 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS
> --POLL-TIMEOUT=250 --BINARY-PROTOCOL)
> 26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:243: Set up connection
> 0x7f6d1004b3c0 to server  192.168.1.23:11211
> 26-03-2015 07:49:16.936 UTC Debug memcachedstore.cpp:254: Setting server
> to IP address  192.168.1.23 port 11211
> 26-03-2015 07:49:16.938 UTC Debug memcachedstore.cpp:560: 2 write replicas
> for key av\\[email protected]\3a44ea384df9161e
> 26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:612: Attempt
> conditional write to vbucket 93 on replica 0 (connection 0x7f6d1004b3c0),
> CAS = 0, expiry = 40
> 26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:812: Attempting to
> add data for key av\\[email protected]\3a44ea384df9161e
> 26-03-2015 07:49:16.939 UTC Debug memcachedstore.cpp:822: Attempting
> memcached ADD command
> 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:912: ADD/CAS returned
> rc = 5 (WRITE FAILURE)
> (140106396906432) getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE,
> Name or service not known,  host:  192.168.1.23:11211 ->
> libmemcached/connect.cc:197
> 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:612: Attempt
> conditional write to vbucket 93 on replica 1 (connection 0x7f6d10045be0),
> CAS = 0, expiry = 40
> 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:812: Attempting to
> add data for key av\\[email protected]\3a44ea384df9161e
> 26-03-2015 07:49:16.942 UTC Debug memcachedstore.cpp:822: Attempting
> memcached ADD command
> 26-03-2015 07:49:16.945 UTC Debug memcachedstore.cpp:912: ADD/CAS returned
> rc = 5 (WRITE FAILURE)
> (140106396883936) getaddrinfo() or getnameinfo() HOSTNAME LOOKUP FAILURE,
> Name or service not known,  host:  192.168.1.22:11211 ->
> libmemcached/connect.cc:197
> 26-03-2015 07:49:16.945 UTC Error memcachedstore.cpp:715: Failed to write
> data for av\\[email protected]\3a44ea384df9161e to 2 replicas
> 26-03-2015 07:49:16.945 UTC Error avstore.cpp:71: Failed to write
> Authentication Vector for private_id [email protected]
> 26-03-2015 07:49:16.945 UTC Verbose common_sip_processing.cpp:136: TX 569
> bytes Response msg 401/REGISTER/cseq=1 (tdta0x7f6d10006fa0) to TCP
> 192.168.1.11:40630:
>
>
>
> Can you help us on this?
>
> Thanks,
> Lianjie
>
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater

Reply via email to