Hi David, 

I've looked into this, and there's a bug in how the RegistrationTimeout handler 
is told whether all the bindings have expired. I've raised an issue at 
https://github.com/Metaswitch/sprout/issues/530 and a fix is in progress.

Once this issue is fixed, you should see the debug log entry "All bindings 
have expired based on a Chronos callback - triggering deregistration at the 
HSS" at the point you suggested. 
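
The expected flow can be sketched roughly as follows. This is only an 
illustration, not the sprout code: AoR, expire_bindings and handle_timeout are 
made-up names, and bindings are reduced to a map from binding ID to expiry 
time. The point is just that the store has to tell the handler when the last 
binding has gone, so that the handler can log the message and notify the HSS.

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for an AoR record:
// binding ID -> absolute expiry time in seconds.
struct AoR
{
  std::map<std::string, int64_t> bindings;
};

// Removes every binding whose expiry time has passed and reports, via the
// return value, whether the AoR is left with no bindings at all.
bool expire_bindings(AoR& aor, int64_t now)
{
  for (auto it = aor.bindings.begin(); it != aor.bindings.end(); )
  {
    if (it->second <= now)
    {
      it = aor.bindings.erase(it);
    }
    else
    {
      ++it;
    }
  }
  return aor.bindings.empty();
}

// Hypothetical timeout handler. Returns true if it would notify the HSS.
bool handle_timeout(AoR& aor, int64_t now)
{
  bool all_bindings_expired = expire_bindings(aor, now);
  if (all_bindings_expired)
  {
    std::cout << "All bindings have expired based on a Chronos callback - "
                 "triggering deregistration at the HSS" << std::endl;
    // The real handler calls update_registration_state(...) at this point.
  }
  return all_bindings_expired;
}
```

With two bindings expiring at 100 and 200, a callback at time 150 removes only 
the first and does not touch the HSS; a callback at time 200 removes the last 
one and takes the deregistration branch.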

Ellie

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Luong, 
David
Sent: 01 May 2014 22:17
To: [email protected]
Subject: Re: [Clearwater] Registrar doesn't update HSS for registrations that 
have expired

Ellie,


We updated to "Elite" and have not seen any Chronos crashes. However, 
registration expiration is still not updating the HSS.

01-05-2014 19:35:06.135 Verbose httpstack.cpp:231: Handling request for URL 
/timers, args (null)
01-05-2014 19:35:06.135 Verbose httpstack.cpp:64: Sending response 200 to 
request for URL /timers, args (null)
01-05-2014 19:35:06.135 Debug regstore.cpp:102: Get AoR data for 
sip:[email protected]
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:260: Key 
reg\\sip:[email protected] hashes to vbucket 92 via hash 0x16f11ddc
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:304: 2 read replicas for key 
reg\\sip:[email protected]
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:337: Attempt to read from 
replica 0 (connection 0x7fcec00151b0)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:372: Read for 
reg\\sip:[email protected] on replica 0 returned error 20 (NO SERVERS
DEFINED)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:337: Attempt to read from 
replica 1 (connection 0x7fcec0010990)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:343: Fetch result
01-05-2014 19:35:06.136 Debug memcachedstore.cpp:351: Found record on replica 1
01-05-2014 19:35:06.136 Debug memcachedstore.cpp:392: Read 623 bytes from table 
reg key sip:[email protected], CAS = 1440
01-05-2014 19:35:06.136 Debug regstore.cpp:450: Deserialize 1 bindings
01-05-2014 19:35:06.136 Debug regstore.cpp:457: Binding
<urn:gsma:imei:35526604-120549-1>:1
01-05-2014 19:35:06.136 Debug regstore.cpp:482: Deserialize 1 path headers
01-05-2014 19:35:06.136 Debug regstore.cpp:488: Deserialized path header 
sip:[email protected]:5058;transport=TCP;lr;ob
01-05-2014 19:35:06.136 Debug regstore.cpp:496: Deserialize 0 subscriptions
01-05-2014 19:35:06.136 Debug regstore.cpp:114: Data store returned a record, 
CAS = 1440
01-05-2014 19:35:06.136 Debug httpconnection.cpp:456: Sending HTTP request
: http://localhost:7253/timers/3f5931dd4000000512a40020c0400c10 (try 0) on new 
connection
01-05-2014 19:35:06.138 Debug httpconnection.cpp:467: Received HTTP response :
01-05-2014 19:35:06.138 Debug handlers.cpp:60: Retrieved AoR data
0x7fcec0108820
01-05-2014 19:35:06.138 Debug regstore.cpp:190: All bindings have expired, so 
this is a deregistration for AOR sip:[email protected]
01-05-2014 19:35:06.138 Debug regstore.cpp:201: Set AoR data for 
sip:[email protected], CAS=1440, expiry = 1398972916
01-05-2014 19:35:06.138 Debug regstore.cpp:369: Serialize 0 bindings
01-05-2014 19:35:06.138 Debug regstore.cpp:406: Serialize 0 subscriptions


Is it correct for us to expect this debug log entry to follow the previous 
entries?


"All bindings have expired based on a Chronos callback - triggering 
deregistration at the HSS"


Regards,
David.



On 4/25/14 11:36 AM, "Eleanor Merry" <[email protected]> wrote:

>Hi David,
>
>Yes, that crash will do it (given that it happened to both Chronos 
>services).
>
>We've got a known issue with Chronos (described at
>https://github.com/Metaswitch/chronos/issues/19) that's causing these 
>crashes - the problem is understood now though and a fix is in progress.
>
>Ellie
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of 
>Luong, David
>Sent: 25 April 2014 16:17
>To: [email protected]
>Subject: Re: [Clearwater] Registrar doesn't update HSS for 
>registrations that have expired
>
>Thanks Ellie for the pointers.
>
>
>We are using sprout 1.0-140414.183451
>
>Sprout log indicates the timer was registered with Chronos without errors.
>
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:456: Sending HTTP 
>request
>: http://localhost:7253/timers (try 0) on new connection
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header 
>http/1.1200ok with value
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header 
>location with value
>http://localhost:7253/timers/375ae6f84000000112a40020c0400c10
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header 
>date with value Fri,25Apr201414:29:42GMT
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header 
>content-length with value 0
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header 
>content-type with value text/html;charset=ISO-8859-1
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header 
>with value
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:467: Received HTTP 
>response :
> 
>However, Chronos crashed around the same time and restarted. It seems 
>this has happened often. Could this be a config problem? We have 2 
>sprouts in a cluster, and Chronos crashes on both nodes. Could this be 
>why we are not receiving the timeout event that triggers registration 
>timeout handling?
>
>Signal 11 caught
>
>Basic stack dump:
>/usr/bin/chronos[0x438a9e]
>/usr/bin/chronos[0x438186]
>/usr/bin/chronos[0x439061]
>/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f60eaf1b4a0]
>/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNKSs4sizeEv+0x3)[0x7f60eb854dd3]
>/usr/bin/chronos[0x41c032]
>/usr/bin/chronos[0x41dbc9]
>/usr/bin/chronos[0x432a76]
>/usr/bin/chronos[0x43277a]
>/usr/bin/chronos[0x432701]
>/usr/bin/chronos[0x43249b]
>/usr/bin/chronos[0x43084e]
>/usr/bin/chronos[0x430410]
>/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f60ec4c3e9a]
>/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f60eafd93fd]
>
>Advanced stack dump (requires gdb):
>sh: 1: /usr/bin/gdb: not found
>
>gdb failed with return code 32512
>25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
>25-04-2014 14:30:42.256 Status globals.cpp:64: Bind port: 7253
>25-04-2014 14:30:42.256 Status globals.cpp:68: Cluster local address:
>xx.xx.xx.97
>25-04-2014 14:30:42.256 Status globals.cpp:73: Cluster nodes:
>25-04-2014 14:30:42.256 Status globals.cpp:76:  - xx.xx.xx.97
>25-04-2014 14:30:42.256 Status globals.cpp:76:  - xx.xx.xx.197
>25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
>25-04-2014 14:30:42.257 Status globals.cpp:64: Bind port: 7253
>25-04-2014 14:30:42.257 Status globals.cpp:68: Cluster local address:
>xx.xx.xx.97
>25-04-2014 14:30:42.257 Status globals.cpp:73: Cluster nodes:
>25-04-2014 14:30:42.257 Status globals.cpp:76:  - xx.xx.xx.97
>25-04-2014 14:30:42.257 Status globals.cpp:76:  - xx.xx.xx.197
>
>
>
>Regards.
>David.
>
>
>On 4/25/14 8:50 AM, "Eleanor Merry" <[email protected]> wrote:
>
>>Hi David,
>>
>>One reason this could happen is that the initial write to Chronos 
>>fails, meaning that it never tells sprout when the registrations expire.
>>
>>Can you check if you are seeing any logs in sprout (in
>>/var/log/sprout/) of the form "Error httpconnection.cpp:536:
>>http://localhost:7253/timers/00f90c494000001b0000060108108000 failed 
>>at server 127.0.0.1 : Couldn't connect to server"? Can you also check 
>>the chronos logs (/var/log/chronos) and the monit logs 
>>(/var/log/monit.log) for any reported errors?
>>
>>If there isn't a problem with Chronos, then can you look in the debug 
>>logs for Sprout/Homestead for reported errors? Sprout and Homestead 
>>logs are in /var/log/sprout and /var/log/homestead. To set the logs to 
>>debug level you will need to create/edit the file 
>>/etc/clearwater/user_settings, add log_level=5, and then run "service 
>><sprout/homestead> stop" to restart the sprout/homestead servers 
>>(they're automatically restarted by monit).
>>
>>Also, what version of sprout are you running ("dpkg-query -W sprout")?
>>
>>Ellie
>>
>>
>>
>>-----Original Message-----
>>From: [email protected]
>>[mailto:[email protected]] On Behalf Of 
>>Luong, David
>>Sent: 24 April 2014 21:11
>>To: [email protected]
>>Subject: [Clearwater] Registrar doesn't update HSS for registrations 
>>that have expired
>>
>>Hi,
>>
>>According to the following snippet of code in sprout/handlers.cpp, the 
>>HSS should receive a SAR with the Server-Assignment-Type AVP set to 
>>TIMEOUT_DEREGISTRATION when a registration expires. However, we do not 
>>see any SAR being sent from homestead. Is this a known issue? Any help 
>>is appreciated.
>>
>>
>>    if (all_bindings_expired)
>>    {
>>      //LCOV_EXCL_START
>>      LOG_DEBUG("All bindings have expired based on a Chronos callback - triggering deregistration at the HSS");
>>      _cfg->_hss->update_registration_state(_aor_id, "", HSSConnection::DEREG_TIMEOUT, 0);
>>      //LCOV_EXCL_STOP
>>    }
>>
>>
>>Regards.
>>
>>David.
>>_______________________________________________
>>Clearwater mailing list
>>[email protected]
>>http://lists.projectclearwater.org/listinfo/clearwater
>
