Hi David, 

Yes, that crash would explain the missing timeout (given that it happened to both Chronos services). 

We've got a known issue with Chronos (described at 
https://github.com/Metaswitch/chronos/issues/19) that's causing these crashes. 
The problem is now understood and a fix is in progress. 
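
If it helps to confirm how often this is happening on each node, a rough 
sketch along these lines could count the crash markers in the Chronos output. 
It assumes each crash writes a "Signal N caught" line, as in the output quoted 
below, and that the output is in plain text files under /var/log/chronos. The 
names count_crashes and count_crashes_in_file are illustrative only and are 
not part of Chronos or Clearwater.

```python
import re

# Matches the crash marker seen in the quoted Chronos output,
# e.g. "Signal 11 caught". The exact format is taken from that output only.
CRASH_RE = re.compile(r"^Signal (\d+) caught")


def count_crashes(lines):
    """Return a dict mapping signal number -> number of crash markers seen."""
    counts = {}
    for line in lines:
        match = CRASH_RE.match(line.strip())
        if match:
            signal = int(match.group(1))
            counts[signal] = counts.get(signal, 0) + 1
    return counts


def count_crashes_in_file(path):
    """Count crash markers in one log file (the path is supplied by the caller)."""
    with open(path, errors="replace") as log:
        return count_crashes(log)
```

Calling count_crashes_in_file on each file under /var/log/chronos on both 
nodes should show whether the crashes line up with the registrations that 
never timed out.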

Ellie


-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Luong, 
David
Sent: 25 April 2014 16:17
To: [email protected]
Subject: Re: [Clearwater] Registrar doesn't update HSS for registrations that 
have expired

Thanks Ellie for the pointers.


We are using sprout 1.0-140414.183451

Sprout log indicates the timer was registered with Chronos without errors.

25-04-2014 14:29:42.822 Debug httpconnection.cpp:456: Sending HTTP request : http://localhost:7253/timers (try 0) on new connection
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header http/1.1200ok with value
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header location with value http://localhost:7253/timers/375ae6f84000000112a40020c0400c10
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header date with value Fri,25Apr201414:29:42GMT
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header content-length with value 0
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header content-type with value text/html;charset=ISO-8859-1
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header with value
25-04-2014 14:29:42.822 Debug httpconnection.cpp:467: Received HTTP response :
 
However, Chronos crashed around the same time and restarted, and this seems 
to happen often. Could this be a config problem? We have 2 sprouts in a 
cluster, and Chronos crashes on both nodes. Could this be why we are not 
receiving the timeout event that triggers registration timeout handling?

Signal 11 caught

Basic stack dump:
/usr/bin/chronos[0x438a9e]
/usr/bin/chronos[0x438186]
/usr/bin/chronos[0x439061]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f60eaf1b4a0]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNKSs4sizeEv+0x3)[0x7f60eb854dd3]
/usr/bin/chronos[0x41c032]
/usr/bin/chronos[0x41dbc9]
/usr/bin/chronos[0x432a76]
/usr/bin/chronos[0x43277a]
/usr/bin/chronos[0x432701]
/usr/bin/chronos[0x43249b]
/usr/bin/chronos[0x43084e]
/usr/bin/chronos[0x430410]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f60ec4c3e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f60eafd93fd]

Advanced stack dump (requires gdb):
sh: 1: /usr/bin/gdb: not found

gdb failed with return code 32512
25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
25-04-2014 14:30:42.256 Status globals.cpp:64: Bind port: 7253
25-04-2014 14:30:42.256 Status globals.cpp:68: Cluster local address: xx.xx.xx.97
25-04-2014 14:30:42.256 Status globals.cpp:73: Cluster nodes:
25-04-2014 14:30:42.256 Status globals.cpp:76:  - xx.xx.xx.97
25-04-2014 14:30:42.256 Status globals.cpp:76:  - xx.xx.xx.197
25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
25-04-2014 14:30:42.257 Status globals.cpp:64: Bind port: 7253
25-04-2014 14:30:42.257 Status globals.cpp:68: Cluster local address: xx.xx.xx.97
25-04-2014 14:30:42.257 Status globals.cpp:73: Cluster nodes:
25-04-2014 14:30:42.257 Status globals.cpp:76:  - xx.xx.xx.97
25-04-2014 14:30:42.257 Status globals.cpp:76:  - xx.xx.xx.197



Regards.
David.


On 4/25/14 8:50 AM, "Eleanor Merry" <[email protected]> wrote:

>Hi David,
>
>One reason this could happen is that the initial write to Chronos 
>fails, meaning that it never tells sprout when the registrations expire.
>
>Can you check if you are seeing any logs in sprout (in 
>/var/log/sprout/) of the form "Error httpconnection.cpp:536:
>http://localhost:7253/timers/00f90c494000001b0000060108108000 failed at 
>server 127.0.0.1 : Couldn't connect to server"? Can you also check the 
>chronos logs (/var/log/chronos) and the monit logs (/var/log/monit.log) 
>for any reported errors?
>
>If there isn't a problem with Chronos, then can you look in the debug 
>logs for Sprout/Homestead for reported errors. Sprout and Homestead 
>logs are in /var/log/sprout and /var/log/homestead, and to set the logs 
>to debug level you will need to create/edit the file 
>/etc/clearwater/user_settings, add log_level=5, and then run "service 
><sprout/homestead> stop" to restart the sprout/homestead servers 
>(they're automatically restarted by monit).
>
>Also, what version of sprout are you running ("dpkg-query -W sprout")?
>
>Ellie
>
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of 
>Luong, David
>Sent: 24 April 2014 21:11
>To: [email protected]
>Subject: [Clearwater] Registrar doesn't update HSS for registrations 
>that have expired
>
>Hi,
>
>According to the following snippet of code in sprout/handlers.cpp, the 
>HSS should receive a SAR with the Server-Assignment-Type AVP set to 
>TIMEOUT_DEREGISTRATION when a registration expires. However, we do not 
>see any SAR being sent from homestead. Is this a known issue? Any help 
>is appreciated.
>
>
>    if (all_bindings_expired)
>    {
>      //LCOV_EXCL_START
>      LOG_DEBUG("All bindings have expired based on a Chronos callback - triggering deregistration at the HSS");
>      _cfg->_hss->update_registration_state(_aor_id, "", HSSConnection::DEREG_TIMEOUT, 0);
>      //LCOV_EXCL_STOP
>    }
>
>
>Regards.
>
>David.
>_______________________________________________
>Clearwater mailing list
>[email protected]
>http://lists.projectclearwater.org/listinfo/clearwater

_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater