Ellie,

We updated to "Elite" and have not seen any Chronos crashes since. However,
registration expiry is still not triggering an update to the HSS.

01-05-2014 19:35:06.135 Verbose httpstack.cpp:231: Handling request for
URL /timers, args (null)
01-05-2014 19:35:06.135 Verbose httpstack.cpp:64: Sending response 200 to
request for URL /timers, args (null)
01-05-2014 19:35:06.135 Debug regstore.cpp:102: Get AoR data for
sip:[email protected]
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:260: Key
reg\\sip:[email protected] hashes to vbucket 92 via hash 0x16f11ddc
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:304: 2 read replicas for
key reg\\sip:[email protected]
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:337: Attempt to read from
replica 0 (connection 0x7fcec00151b0)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:372: Read for
reg\\sip:[email protected] on replica 0 returned error 20 (NO SERVERS
DEFINED)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:337: Attempt to read from
replica 1 (connection 0x7fcec0010990)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:343: Fetch result
01-05-2014 19:35:06.136 Debug memcachedstore.cpp:351: Found record on
replica 1
01-05-2014 19:35:06.136 Debug memcachedstore.cpp:392: Read 623 bytes from
table reg key sip:[email protected], CAS = 1440
01-05-2014 19:35:06.136 Debug regstore.cpp:450: Deserialize 1 bindings
01-05-2014 19:35:06.136 Debug regstore.cpp:457: Binding
<urn:gsma:imei:35526604-120549-1>:1
01-05-2014 19:35:06.136 Debug regstore.cpp:482: Deserialize 1 path headers
01-05-2014 19:35:06.136 Debug regstore.cpp:488: Deserialized path header
sip:[email protected]:5058;transport=TCP;lr;ob
01-05-2014 19:35:06.136 Debug regstore.cpp:496: Deserialize 0 subscriptions
01-05-2014 19:35:06.136 Debug regstore.cpp:114: Data store returned a
record, CAS = 1440
01-05-2014 19:35:06.136 Debug httpconnection.cpp:456: Sending HTTP request
: http://localhost:7253/timers/3f5931dd4000000512a40020c0400c10 (try 0) on
new connection
01-05-2014 19:35:06.138 Debug httpconnection.cpp:467: Received HTTP
response :
01-05-2014 19:35:06.138 Debug handlers.cpp:60: Retrieved AoR data
0x7fcec0108820
01-05-2014 19:35:06.138 Debug regstore.cpp:190: All bindings have expired,
so this is a deregistration for AOR sip:[email protected]
01-05-2014 19:35:06.138 Debug regstore.cpp:201: Set AoR data for
sip:[email protected], CAS=1440, expiry = 1398972916
01-05-2014 19:35:06.138 Debug regstore.cpp:369: Serialize 0 bindings
01-05-2014 19:35:06.138 Debug regstore.cpp:406: Serialize 0 subscriptions


Should we expect the following debug log entry to appear after the entries
above?

"All bindings have expired based on a Chronos callback - triggering
deregistration at the HSS"
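
For anyone who wants to check their own sprout log for this, here is a minimal sketch. It assumes the log has already been read into memory with one entry per string (unwrapped, unlike the pasted excerpt above). The helper name hss_dereg_logged is hypothetical and is not part of sprout; it only looks for the two message texts quoted in this thread.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of sprout. Returns true if the log contains
// the regstore "All bindings have expired, ..." entry followed, at some later
// point, by the "triggering deregistration at the HSS" entry.
bool hss_dereg_logged(const std::vector<std::string>& lines)
{
  // Message fragments taken from the log excerpt and code snippet in this thread.
  const std::string expired = "All bindings have expired,";
  const std::string dereg = "triggering deregistration at the HSS";

  bool seen_expired = false;
  for (const std::string& line : lines)
  {
    if (line.find(expired) != std::string::npos)
    {
      // regstore noticed that every binding for the AoR has expired.
      seen_expired = true;
    }
    else if (seen_expired && line.find(dereg) != std::string::npos)
    {
      // The follow-up entry we are asking about.
      return true;
    }
  }
  return false;
}
```

Run against the excerpt above, this would return false, because the second entry never appears.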


Regards,
David.



On 4/25/14 11:36 AM, "Eleanor Merry" <[email protected]> wrote:

>Hi David, 
>
>Yes, that crash will do it (given that it happened to both Chronos
>services). 
>
>We've got a known issue with Chronos (described at
>https://github.com/Metaswitch/chronos/issues/19) that's causing these
>crashes - the problem is understood now though and a fix is in progress.
>
>Ellie
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of
>Luong, David
>Sent: 25 April 2014 16:17
>To: [email protected]
>Subject: Re: [Clearwater] Registrar doesn't update HSS for registrations
>that have expired
>
>Thanks Ellie for the pointers.
>
>
>We are using sprout    1.0-140414.183451
>
>The Sprout log indicates the timer was registered with Chronos without errors.
>
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:456: Sending HTTP request
>: http://localhost:7253/timers (try 0) on new connection
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header
>http/1.1200ok with value
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header
>location with value
>http://localhost:7253/timers/375ae6f84000000112a40020c0400c10
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header
>date with value Fri,25Apr201414:29:42GMT
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header
>content-length with value 0
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header
>content-type with value text/html;charset=ISO-8859-1
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header
>with value
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:467: Received HTTP
>response :
> 
>However, Chronos crashed around the same time and restarted. This seems to
>happen often. Could this be a config problem? We have 2 sprouts in a
>cluster, and Chronos crashes on both nodes. Could this be why we are not
>receiving the timeout event that triggers registration timeout handling?
>
>Signal 11 caught
>
>Basic stack dump:
>/usr/bin/chronos[0x438a9e]
>/usr/bin/chronos[0x438186]
>/usr/bin/chronos[0x439061]
>/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f60eaf1b4a0]
>/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNKSs4sizeEv+0x3)[0x7f60eb854dd3]
>/usr/bin/chronos[0x41c032]
>/usr/bin/chronos[0x41dbc9]
>/usr/bin/chronos[0x432a76]
>/usr/bin/chronos[0x43277a]
>/usr/bin/chronos[0x432701]
>/usr/bin/chronos[0x43249b]
>/usr/bin/chronos[0x43084e]
>/usr/bin/chronos[0x430410]
>/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f60ec4c3e9a]
>/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f60eafd93fd]
>
>Advanced stack dump (requires gdb):
>sh: 1: /usr/bin/gdb: not found
>
>gdb failed with return code 32512
>25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
>25-04-2014 14:30:42.256 Status globals.cpp:64: Bind port: 7253
>25-04-2014 14:30:42.256 Status globals.cpp:68: Cluster local address:
>xx.xx.xx.97
>25-04-2014 14:30:42.256 Status globals.cpp:73: Cluster nodes:
>25-04-2014 14:30:42.256 Status globals.cpp:76:  - xx.xx.xx.97
>25-04-2014 14:30:42.256 Status globals.cpp:76:  - xx.xx.xx.197
>25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
>25-04-2014 14:30:42.257 Status globals.cpp:64: Bind port: 7253
>25-04-2014 14:30:42.257 Status globals.cpp:68: Cluster local address:
>xx.xx.xx.97
>25-04-2014 14:30:42.257 Status globals.cpp:73: Cluster nodes:
>25-04-2014 14:30:42.257 Status globals.cpp:76:  - xx.xx.xx.97
>25-04-2014 14:30:42.257 Status globals.cpp:76:  - xx.xx.xx.197
>
>
>
>Regards.
>David.
>
>
>On 4/25/14 8:50 AM, "Eleanor Merry" <[email protected]> wrote:
>
>>Hi David,
>>
>>One reason this could happen is that the initial write to Chronos
>>fails, meaning that it never tells sprout when the registrations expire.
>>
>>Can you check if you are seeing any logs in sprout (in
>>/var/log/sprout/) of the form "Error httpconnection.cpp:536:
>>http://localhost:7253/timers/00f90c494000001b0000060108108000 failed at
>>server 127.0.0.1 : Couldn't connect to server"? Can you also check the
>>chronos logs (/var/log/chronos) and the monit logs (/var/log/monit.log)
>>for any reported errors?
>>
>>If there isn't a problem with Chronos, can you look in the Sprout and
>>Homestead debug logs for reported errors? The logs are in
>>/var/log/sprout and /var/log/homestead. To set them to debug level,
>>create or edit /etc/clearwater/user_settings, add log_level=5, and then
>>run "service <sprout/homestead> stop" to restart the sprout/homestead
>>servers (they're automatically restarted by monit).
>>
>>Also, what version of sprout are you running ("dpkg-query -W sprout")?
>>
>>Ellie
>>
>>
>>
>>-----Original Message-----
>>From: [email protected]
>>[mailto:[email protected]] On Behalf Of
>>Luong, David
>>Sent: 24 April 2014 21:11
>>To: [email protected]
>>Subject: [Clearwater] Registrar doesn't update HSS for registrations
>>that have expired
>>
>>Hi,
>>
>>According to the following snippet of code in sprout/handlers.cpp, the
>>HSS should receive a SAR with Server-Assignment-Type AVP set to value
>>TIMEOUT_DEREGISTRATION when a registration expires. However we do not
>>see any SAR being sent from homestead. Is this a known issue? Any help
>>is appreciated.
>>
>>
>>    if (all_bindings_expired)
>>    {
>>      //LCOV_EXCL_START
>>      LOG_DEBUG("All bindings have expired based on a Chronos callback - triggering deregistration at the HSS");
>>      _cfg->_hss->update_registration_state(_aor_id, "", HSSConnection::DEREG_TIMEOUT, 0);
>>      //LCOV_EXCL_STOP
>>    }
>>
>>
>>Regards.
>>
>>David.
>>_______________________________________________
>>Clearwater mailing list
>>[email protected]
>>http://lists.projectclearwater.org/listinfo/clearwater
>
