Hi Shiva,

Thanks for sending the full log files – those are very useful, and I think they’ve allowed me to pinpoint the problem.

The root cause of the issue is that, when Sprout tries to read from the memcached database, it times out:

13-07-2015 06:29:29.940 UTC Debug memcachedstore.cpp:241: Setting up server 0 for connection 0x7ff4f8076f40 (--CONNECT-TIMEOUT=10 --SUPPORT-CAS --POLL-TIMEOUT=250 --BINARY-PROTOCOL)
13-07-2015 06:29:29.940 UTC Debug memcachedstore.cpp:243: Set up connection 0x7ff4f80b4aa0 to server 10.81.31.98:11211
13-07-2015 06:29:29.940 UTC Debug memcachedstore.cpp:254: Setting server to IP address 10.81.31.98 port 11211
13-07-2015 06:29:29.990 UTC Debug memcachedstore.cpp:365: 1 read replicas for key av\\[email protected]\56bb2100288c05f8
13-07-2015 06:29:29.990 UTC Debug memcachedstore.cpp:400: Attempt to read from replica 0 (connection 0x7ff4f80b4aa0)
13-07-2015 06:29:30.091 UTC Debug memcachedstore.cpp:423: Read for av\\[email protected]\56bb2100288c05f8 on replica 0 returned error 31 (A TIMEOUT OCCURRED)
13-07-2015 06:29:30.091 UTC Error memcachedstore.cpp:512: Failed to read data for av\\[email protected]\56bb2100288c05f8 from 1 replicas

I think this is because the IP address in your /etc/clearwater/cluster_settings file is 10.81.31.98. However, from the logs you’ve sent, it looks like Sprout’s local IP address is 11.0.0.9 – which would explain why it can’t reach its local memcached if it’s trying to use 10.81.31.98.

Can you update /etc/clearwater/cluster_settings to read ‘servers=11.0.0.9:11211’, and run “sudo service sprout reload”? This should solve your problem. If it doesn’t, can you try updating it to ‘servers=10.0.0.55:11211’ instead? This is the IP address reported by netstat below.

If you continue to have problems, can you clarify the IP address configuration on this machine? In particular, there seem to be three IP addresses referenced – 11.0.0.9, 10.0.0.55 and 10.81.31.98 – and I’m not clear on why there are three or what the difference between them is.

Best regards,
Rob

P.S.
I don’t think you’re signed up to the Clearwater mailing list – if you sign up at http://lists.projectclearwater.org/listinfo/clearwater , your requests won’t be delayed in the moderation queue, and you may see messages from other users who have solved similar issues.

-- 
Rob Day
Software Engineer, Project Clearwater

From: Shiva Charan [mailto:[email protected]]
Sent: 13 July 2015 10:39
To: Robert Day; [email protected]
Subject: Re: [Clearwater] zoiper client number registration fails

Hi Rob,

Following the mails I saw on the community list, I tried removing the ralf node from my deployment and tried to make the calls again, but they didn't go through. I have attached the complete logs (from both the sprout and bono nodes), covering a restart of the services and a call register request, which I stopped as the request failed.

Also, please let me know how to install a fresh deployment with the current version or release that I have, so that I can double-check whether this is an environment issue.

Thanks,
Shiva

On Fri, Jul 10, 2015 at 4:39 PM, Shiva Charan <[email protected]> wrote:

Hi Rob,

I also wanted to add the syslog of sprout, which says that the HTTP request to ralf is failing. How do I fix this? I can telnet to ralf on the mentioned port and it goes through.

**************error*******************
1005 - Description: http://11.0.0.10:10888/call-id/OTY4MDljYTlmYjZhMGFkNzZkOGMxMzY3MWI2ZmYzNTM. failed to communicate with HTTP server 11.0.0.10 with curl error No error code 0.
@@Cause: An HTTP connection attempt failed to the specified server with the specified error code.
@@Effect: This condition impacts the ability to register, subscribe, or make a call.
@@Action: (1). Check to see if the specified host has failed. (2). Check to see if there is TCP connectivity to the host by using ping and/or Wireshark.
*****************************************

************telnet****************************
[sprout]sprout1@sprout1:/var/log$ telnet 11.0.0.10 10888
Trying 11.0.0.10...
Connected to 11.0.0.10.
Escape character is '^]'.
^]
telnet>
*****************************

Same behaviour is observed in bono syslog.

Thanks,
Shiva

On Wed, Jul 8, 2015 at 7:52 PM, Shiva Charan <[email protected]> wrote:

Hi Rob,

Thanks for the reply. I have attached the log for sprout. I have also restarted the services and reloaded sprout to check the cluster settings.

I have checked cluster_settings on both ralf and sprout again, and it seems to be correct. As for the memcached service, it's running on port 11211.

root@sprout1:/var/log/sprout# netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.0.0.55:11211         0.0.0.0:*               LISTEN      1673/memcached
tcp        0      0 127.0.0.1:7253          0.0.0.0:*               LISTEN      1874/chronos
tcp        0      0 255.255.255.255:7253    0.0.0.0:*               LISTEN      1874/chronos
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      867/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      730/sshd
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN      4111/0
tcp        0      0 127.0.0.1:2812          0.0.0.0:*               LISTEN      1006/monit
tcp        0      0 10.0.0.55:5054          0.0.0.0:*               LISTEN      1063/sprout
tcp        0      0 127.0.0.1:9888          0.0.0.0:*               LISTEN      1063/sprout
tcp        0      0 10.0.0.55:9888          0.0.0.0:*               LISTEN      1063/sprout
tcp6       0      0 :::53                   :::*                    LISTEN      867/dnsmasq
tcp6       0      0 :::22                   :::*                    LISTEN      730/sshd
tcp6       0      0 ::1:6010                :::*                    LISTEN      4111/0

Should I change the cluster settings anywhere apart from the /etc/clearwater/cluster_settings file?
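
The comparison at the heart of this thread (the addresses listed in cluster_settings versus the address memcached is actually listening on) can be sketched in a few lines of Python. This is only an illustrative sketch and not a Clearwater tool: the function names are hypothetical, and it assumes that cluster_settings holds a single `servers=` line of comma-separated `host:port` entries and that the netstat text has the column layout shown above. In a multi-node cluster the file would also list remote servers, which would never appear in the local netstat output, so the result is only meaningful for the local node's own entry.

```python
def parse_cluster_servers(settings_text):
    """Return (host, port) pairs from the servers= line of cluster_settings text."""
    servers = []
    for line in settings_text.splitlines():
        line = line.strip()
        if not line.startswith("servers="):
            continue
        # Assumed format: servers=host:port[,host:port...]
        for entry in line[len("servers="):].split(","):
            entry = entry.strip()
            if entry:
                host, _, port = entry.rpartition(":")
                servers.append((host, int(port)))
    return servers


def parse_listeners(netstat_text, program="memcached"):
    """Return (host, port) pairs that `program` listens on, from `netstat -tnlp` text."""
    listeners = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Columns: Proto Recv-Q Send-Q Local Foreign State PID/Program
        if (len(fields) >= 7 and fields[5] == "LISTEN"
                and fields[6].endswith("/" + program)):
            host, _, port = fields[3].rpartition(":")
            listeners.append((host, int(port)))
    return listeners


def servers_without_listener(servers, listeners):
    """Return configured servers that no local listener matches."""
    wildcard_hosts = {"0.0.0.0", "::"}  # a wildcard listener accepts any local address
    missing = []
    for host, port in servers:
        matched = any(
            port == l_port and (l_host == host or l_host in wildcard_hosts)
            for l_host, l_port in listeners
        )
        if not matched:
            missing.append((host, port))
    return missing


if __name__ == "__main__":
    # Values taken from this thread: the configured server and the netstat line.
    settings = "servers=10.81.31.98:11211\n"
    netstat = "tcp 0 0 10.0.0.55:11211 0.0.0.0:* LISTEN 1673/memcached\n"
    print(servers_without_listener(parse_cluster_servers(settings),
                                   parse_listeners(netstat)))
```

Run against the values from this thread, the sketch reports 10.81.31.98:11211 as a configured server with no matching local memcached listener, while a setting of servers=10.0.0.55:11211 would report nothing.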

Thanks,
Shiva

On Wed, Jul 8, 2015 at 4:46 PM, Robert Day <[email protected]> wrote:

Hi Shiva,

The logs you sent were just from Bono, but everything looks normal there – could you send me a log file from Sprout as well?

poll_cassandra_ring being in ‘Waiting’ state is normal – that shouldn’t be causing this problem.

It looks similar to previous problems we’ve seen where Sprout couldn’t contact memcached (e.g. because cluster_settings was wrong), which is why I asked if your IP configuration was correct. It’s worth double-checking this, though, so can you:

• Check that /etc/clearwater/cluster_settings is valid
• Run ‘sudo service sprout reload’ to force /etc/clearwater/cluster_settings to be re-read (and check the Sprout logs and syslog for any errors or warnings)
• Run ‘netstat -tnlp’ to check that memcached is listening on port 11211?

Thanks,
Rob

-- 
Rob Day
Software Engineer, Project Clearwater

From: Shiva Charan [mailto:[email protected]]
Sent: 08 July 2015 10:50
To: Robert Day
Subject: Re: [Clearwater] zoiper client number registration fails

Hi Rob,

Just to add: earlier we had an issue with the cluster settings, which we changed, and it then went through. Are there any other changes needed for this issue?

Thanks,
Shiva

On Thu, Jul 2, 2015 at 3:39 PM, Shiva Charan <[email protected]> wrote:

Hi Rob,

Yes, I have double-checked the IP configuration and everything is in place. I have attached the log file and the homestead monit status output. poll_cassandra_ring is still in the waiting state; will that make a difference?

Thanks,
Shiva

On Wed, Jul 1, 2015 at 2:21 AM, Robert Day <[email protected]> wrote:

Hi Shiva,

The logs you attached seem to be very short (covering only a 10-millisecond period, and one message in the REGISTER flow), so I haven’t been able to diagnose the issue from them.
Is it possible for you to copy off and attach the full log files, rather than a short extract from them?

The only things that immediately come to mind as possible causes are:

• If your environment uses DHCP, the IP addresses of your VMs may have changed, and you may need to change configuration to reflect this.
• The Cassandra database on Homestead may not be running (or may be frequently restarting). Running ‘sudo monit status’ on Homestead should show whether this is the case.

Best,
Rob

-- 
Rob Day
Software Engineer, Project Clearwater

From: Shiva Charan [mailto:[email protected]]
Sent: 30 June 2015 15:25
To: Robert Day
Subject: Re: [Clearwater] zoiper client number registration fails

Hi Rob,

Due to an infrastructure issue we had to restart all the virtual machines, after which the calls (from the zoiper client) are not being registered. Attached here are the bono and sprout logs. Any inputs?

Shiva

On Mon, Jun 22, 2015 at 12:23 PM, Shiva Charan <[email protected]> wrote:

Hi Rob,

It's working now :). There was an IP mismatch in the cluster_settings file.

Shiva

On Fri, Jun 19, 2015 at 6:46 PM, Shiva Charan <[email protected]> wrote:

Hi Rob,

Thanks for your reply. Sorry about the bono log; please find the log files attached, along with the syslog of the bono node. The syslog shows an error regarding smtp and also a few files missing in the lib folder.
Also attached is the config file I am using. The clearwater versions of the nodes are as follows:

[ellis]clearwater1@Ellis1:/etc/clearwater$ dpkg-query -W ellis
ellis 1.0-150605.133914
[bono]clearwater2@clearwater:~$ dpkg-query -W bono
bono 1.0-150216.174621
[sprout]sprout1@sprout1:~$ dpkg-query -W sprout
sprout 1.0-150329.161304
[homer]clearwater4@clearwater:~$ dpkg-query -W homer
homer 1.0-150213.175004
[homestead]clearwater@ubuntu:~$ dpkg-query -W homestead
homestead 1.0-150227.194739
[ralf]clearwater6@clearwater:~$ dpkg-query -W ralf
ralf 1.0-150213.143613

Thanks,
Shiva

On Fri, Jun 19, 2015 at 1:10 AM, Robert Day <[email protected]> wrote:

Hi Shiva,

It looks like those logs might have been truncated – they only contain about 30 lines each, but I’d expect more. Could you send the full log files through from Bono, Sprout and Homestead? (Homestead logs will be useful in investigating authentication failures like this, as they’ll contain information about the subscriber state.)

Could you also send the contents of /var/log/syslog? We log high-level troubleshooting logs there, so if there are connectivity issues or similar configuration problems, that might show up in a log there.

Thanks,
Rob

-- 
Rob Day
Software Engineer, Project Clearwater

From: [email protected] [mailto:[email protected]] On Behalf Of Shiva Charan
Sent: 17 June 2015 19:26
To: [email protected]
Subject: [Clearwater] zoiper client number registration fails

Hi,

In my manual install setup all the services are running, but when I try to register a number created from the UI, the registration fails. Logs for Bono and Sprout are attached to this mail. The log_current for bono is also shown below.

Bono log_current.txt

17-06-2015 18:00:01.267 UTC Call-Disconnected: CALL_ID=NDc4MjY4Nzk5NzkyMWExMWZiMGFhNzY4MDU4YTUzZDY. REASON=401
17-06-2015 18:00:01.328 UTC Call-Disconnected: CALL_ID=NDc4MjY4Nzk5NzkyMWExMWZiMGFhNzY4MDU4YTUzZDY. REASON=401
17-06-2015 18:00:01.390 UTC Call-Disconnected: CALL_ID=NDc4MjY4Nzk5NzkyMWExMWZiMGFhNzY4MDU4YTUzZDY. REASON=401
17-06-2015 18:00:01.451 UTC Call-Disconnected:

Please suggest.

Shiva
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater
