Hi,

Thanks for sending the logs across. The Sprout logs contain a number of entries like the following:

13-07-2015 06:34:52.596 UTC Debug authentication.cpp:486: Authorization header in request
13-07-2015 06:34:52.596 UTC Debug memcachedstore.cpp:193: Key av\\[email protected]\0813efe7732eac54 hashes to vbucket 96 via hash 0xbff291e0
13-07-2015 06:34:52.596 UTC Debug memcachedstore.cpp:365: 1 read replicas for key av\\[email protected]\0813efe7732eac54
13-07-2015 06:34:52.596 UTC Debug memcachedstore.cpp:400: Attempt to read from replica 0 (connection 0x7ff4d40ffd60)
13-07-2015 06:34:52.646 UTC Debug memcachedstore.cpp:423: Read for av\\[email protected]\0813efe7732eac54 on replica 0 returned error 47 (SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY)
13-07-2015 06:34:52.646 UTC Error memcachedstore.cpp:512: Failed to read data for av\\[email protected]\0813efe7732eac54 from 1 replicas

These show Sprout parsing the Authorization header in a REGISTER and trying to retrieve the authentication vector from memcached. The read fails with the error “SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY”, and any later writes to memcached fail with the same error. This causes the REGISTER to fail.

Looking at one of the write errors:

13-07-2015 06:34:52.649 UTC Debug memcachedstore.cpp:824: Attempting memcached ADD command
13-07-2015 06:34:52.649 UTC Debug memcachedstore.cpp:914: ADD/CAS returned rc = 5 (WRITE FAILURE) (140689506565472) SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY, host: 10.81.31.98:11211 -> libmemcached/connect.cc:633

This shows Sprout attempting to connect to memcached on host 10.81.31.98. Is this the correct Sprout address? Can you check that the values in /etc/clearwater/cluster_settings are correct for your current system (see more details at http://clearwater.readthedocs.org/en/latest/Old_Manual_Install/index.html#larger-scale-deployments, under "Clustering Sprout")?
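In case it is useful, here is a rough Python sketch for pulling the failing memcached hosts out of a Sprout log and listing what cluster_settings contains, so the two can be compared against the real Sprout addresses. The function names are made up for illustration, and the parser assumes cluster_settings holds a line of the form servers=<ip>:<port>,<ip>:<port>; please check that against the file on your own system before relying on it.

```python
import re
from collections import Counter

# Matches the "host: <ip>:<port>" text that appears in the memcached
# write-failure log line quoted above.
HOST_RE = re.compile(r"host:\s*(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def failing_memcached_hosts(log_text):
    """Count how often each host:port appears in a memcached failure line."""
    hosts = Counter()
    for line in log_text.splitlines():
        if "SERVER HAS FAILED" not in line:
            continue
        for ip, port in HOST_RE.findall(line):
            hosts[f"{ip}:{port}"] += 1
    return hosts


def parse_cluster_settings(text):
    """Return the servers listed in cluster_settings as a list of strings.

    Assumption: the file contains a line such as
    servers=10.0.0.1:11211,10.0.0.2:11211
    """
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "servers":
            return [s.strip() for s in value.split(",") if s.strip()]
    return []
```

To use it, read the Sprout log and /etc/clearwater/cluster_settings into strings, pass them to the two functions, and compare both results with the addresses your Sprout nodes actually have. Read failures that do not name a host (like the "error 47" line) are skipped, so the count only reflects lines that include a host.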
Ellie

From: Sudha Koushik [mailto:[email protected]]
Sent: 14 July 2015 06:37
To: Eleanor Merry
Cc: [email protected]
Subject: Re: [Clearwater] Fiailed to register SIP call

Hi Ellie,

Thanks for the reply. I have attached the log files for both the Sprout and Bono nodes, covering the period from when I restarted the services, made a register request from the Zoiper client, and closed the request. The logs also contain entries for "transaction destroyed"; I assume this is because of the failed registration?

I have confirmed communication between all the nodes on the relevant ports, as I came across suggestions that the issue is usually with communication.

Thanks,
Sudhakoushik B

On Fri, Jul 10, 2015 at 7:49 PM, Eleanor Merry <[email protected]> wrote:

Hi,

The logs you’ve attached don’t appear to have any registration attempts in them (the Sprout log only contains OPTIONS polls). Can you please retry the registration and send me the logs? I’d also like the logs from Bono.

I would guess that the HTTP errors are caused by Sprout and Bono attempting to send an ACR to Ralf, and this failing. Do you have a CDF (not part of Project Clearwater) in your deployment? If you don't, then you can remove the ralf_hostname setting from your /etc/clearwater/config on Sprout and Bono (and delete your Ralf).

Ellie

From: Sudha Koushik [mailto:[email protected]]
Sent: 10 July 2015 13:39
To: Eleanor Merry
Cc: [email protected]
Subject: Re: [Clearwater] Fiailed to register SIP call

Hi Ellie,

Thanks for the reply. Sprout is able to telnet to Ralf on the mentioned port:

[sprout]sprout1@sprout1:/var/log$ telnet 11.0.0.10 10888
Trying 11.0.0.10...
Connected to 11.0.0.10.
Escape character is '^]'.
^]
telnet>

The same syslog errors are also seen on the Bono node, with the HTTP request to the Ralf node failing. I have attached the logs for the Sprout node and the Ralf node, taken after making a registration request and stopping it once the registration failed.

Thanks,
Sudhakoushik B

On Thu, Jul 9, 2015 at 10:14 PM, Eleanor Merry <[email protected]> wrote:

Hi,

The dlopen errors in syslog are benign, and can be ignored (we’re currently fixing these up).

Can you confirm that the Sprout and Ralf nodes can communicate over port 10888, and that the Sprout and Ralf processes are running reliably? Can you also send me the debug logs from Sprout and Ralf for a call? You can turn on debug logging by creating/editing the file /etc/clearwater/user_settings, adding log_level=5 and then restarting Sprout/Ralf (service <sprout/ralf> stop - they're automatically restarted by monit). The logs are output in /var/log/<ralf/sprout>/<ralf/sprout>_<date>.

Ellie

From: [email protected] [mailto:[email protected]] On Behalf Of Sudha Koushik
Sent: 09 July 2015 15:12
To: [email protected]
Subject: [Clearwater] Fiailed to register SIP call

Hi all,

I have a multi-VM setup of Clearwater, and the syslog file on the Sprout node is reporting the following errors. How do I resolve the "file not found" issue? Also, the HTTP request going to the Ralf node seems to be failing even though both machines can reach each other.

***********syslog*******************
dlopen failed: /usr/lib/clearwater/bono_handler.so: cannot open shared object file: No such file or directory
Jul 9 12:35:13 sprout1 snmpd[27209]: dlopen failed: /usr/lib/clearwater/homestead_handler.so: cannot open shared object file: No such file or directory
Jul 9 12:35:13 sprout1 snmpd[27209]: dlopen failed: /usr/lib/clearwater/alarm_handler.so: cannot open shared object file: No such file or directory
Jul 9 12:35:13 sprout1 snmpd[27209]: dlopen failed: /usr/lib/clearwater/cdiv_handler.so: cannot open shared object file: No such file or directory
Jul 9 12:35:13 sprout1 snmpd[27209]: dlopen failed: /usr/lib/clearwater/memento_handler.so: cannot open shared object file: No such file or directory
Jul 9 12:35:13 sprout1 snmpd[27209]: dlopen failed: /usr/lib/clearwater/memento_as_handler.so: cannot open shared object file: No such file or directory
Jul 9 13:46:15 sprout1 sprout[1059]: 1005 - Description: http://11.0.0.10:10888/call-id/OTY4MDljYTlmYjZhMGFkNzZkOGMxMzY3MWI2ZmYzNTM. failed to communicate with HTTP server 11.0.0.10 with curl error No error code 0. @@Cause: An HTTP connection attempt failed to the specified server with the specified error code. @@Effect: This condition impacts the ability to register, subscribe, or make a call. @@Action: (1). Check to see if the specified host has failed. (2). Check to see if there is TCP connectivity to the host by using ping and/or Wireshark
************************************

Thanks for your Co-operation

Regards,
Sudhakoushik B

--
Best Regards
B.S.KoushiK
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater
