Ellie,
We updated to "Elite" and have not seen any Chronos crashes. However, registration expiration is still not updating the HSS.

01-05-2014 19:35:06.135 Verbose httpstack.cpp:231: Handling request for URL /timers, args (null)
01-05-2014 19:35:06.135 Verbose httpstack.cpp:64: Sending response 200 to request for URL /timers, args (null)
01-05-2014 19:35:06.135 Debug regstore.cpp:102: Get AoR data for sip:[email protected]
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:260: Key reg\\sip:[email protected] hashes to vbucket 92 via hash 0x16f11ddc
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:304: 2 read replicas for key reg\\sip:[email protected]
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:337: Attempt to read from replica 0 (connection 0x7fcec00151b0)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:372: Read for reg\\sip:[email protected] on replica 0 returned error 20 (NO SERVERS DEFINED)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:337: Attempt to read from replica 1 (connection 0x7fcec0010990)
01-05-2014 19:35:06.135 Debug memcachedstore.cpp:343: Fetch result
01-05-2014 19:35:06.136 Debug memcachedstore.cpp:351: Found record on replica 1
01-05-2014 19:35:06.136 Debug memcachedstore.cpp:392: Read 623 bytes from table reg key sip:[email protected], CAS = 1440
01-05-2014 19:35:06.136 Debug regstore.cpp:450: Deserialize 1 bindings
01-05-2014 19:35:06.136 Debug regstore.cpp:457: Binding <urn:gsma:imei:35526604-120549-1>:1
01-05-2014 19:35:06.136 Debug regstore.cpp:482: Deserialize 1 path headers
01-05-2014 19:35:06.136 Debug regstore.cpp:488: Deserialized path header sip:[email protected]:5058;transport=TCP;lr;ob
01-05-2014 19:35:06.136 Debug regstore.cpp:496: Deserialize 0 subscriptions
01-05-2014 19:35:06.136 Debug regstore.cpp:114: Data store returned a record, CAS = 1440
01-05-2014 19:35:06.136 Debug httpconnection.cpp:456: Sending HTTP request : http://localhost:7253/timers/3f5931dd4000000512a40020c0400c10 (try 0) on new connection
01-05-2014 19:35:06.138 Debug httpconnection.cpp:467: Received HTTP response :
01-05-2014 19:35:06.138 Debug handlers.cpp:60: Retrieved AoR data 0x7fcec0108820
01-05-2014 19:35:06.138 Debug regstore.cpp:190: All bindings have expired, so this is a deregistration for AOR sip:[email protected]
01-05-2014 19:35:06.138 Debug regstore.cpp:201: Set AoR data for sip:[email protected], CAS=1440, expiry = 1398972916
01-05-2014 19:35:06.138 Debug regstore.cpp:369: Serialize 0 bindings
01-05-2014 19:35:06.138 Debug regstore.cpp:406: Serialize 0 subscriptions

Is it correct for us to expect this debug log entry to follow the previous entries?

"All bindings have expired based on a Chronos callback - triggering deregistration at the HSS"

Regards,
David.

On 4/25/14 11:36 AM, "Eleanor Merry" <[email protected]> wrote:

>Hi David,
>
>Yes, that crash will do it (given that it happened to both Chronos
>services).
>
>We've got a known issue with Chronos (described at
>https://github.com/Metaswitch/chronos/issues/19) that's causing these
>crashes - the problem is understood now though and a fix is in progress.
>
>Ellie
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of
>Luong, David
>Sent: 25 April 2014 16:17
>To: [email protected]
>Subject: Re: [Clearwater] Registrar doesn't update HSS for registrations
>that have expired
>
>Thanks Ellie for the pointers.
>
>
>We are using sprout 1.0-140414.183451
>
>Sprout log indicates the timer was registered with Chronos without errors.
>
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:456: Sending HTTP request : http://localhost:7253/timers (try 0) on new connection
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header http/1.1200ok with value
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header location with value http://localhost:7253/timers/375ae6f84000000112a40020c0400c10
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header date with value Fri,25Apr201414:29:42GMT
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header content-length with value 0
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header content-type with value text/html;charset=ISO-8859-1
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header with value
>25-04-2014 14:29:42.822 Debug httpconnection.cpp:467: Received HTTP response :
>
>However, Chronos crashed around the same time and restarted. Seems like
>this happened often. Could this be a config problem? We have 2 sprouts in
>a cluster. Chronos crashes on both nodes. Could this be the cause of us
>not receiving the timeout event that triggers registration timeout
>handling?
>
>Signal 11 caught
>
>Basic stack dump:
>/usr/bin/chronos[0x438a9e]
>/usr/bin/chronos[0x438186]
>/usr/bin/chronos[0x439061]
>/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f60eaf1b4a0]
>/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNKSs4sizeEv+0x3)[0x7f60eb854dd3]
>/usr/bin/chronos[0x41c032]
>/usr/bin/chronos[0x41dbc9]
>/usr/bin/chronos[0x432a76]
>/usr/bin/chronos[0x43277a]
>/usr/bin/chronos[0x432701]
>/usr/bin/chronos[0x43249b]
>/usr/bin/chronos[0x43084e]
>/usr/bin/chronos[0x430410]
>/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f60ec4c3e9a]
>/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f60eafd93fd]
>
>Advanced stack dump (requires gdb):
>sh: 1: /usr/bin/gdb: not found
>
>gdb failed with return code 32512
>25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
>25-04-2014 14:30:42.256 Status globals.cpp:64: Bind port: 7253
>25-04-2014 14:30:42.256 Status globals.cpp:68: Cluster local address: xx.xx.xx.97
>25-04-2014 14:30:42.256 Status globals.cpp:73: Cluster nodes:
>25-04-2014 14:30:42.256 Status globals.cpp:76: - xx.xx.xx.97
>25-04-2014 14:30:42.256 Status globals.cpp:76: - xx.xx.xx.197
>25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
>25-04-2014 14:30:42.257 Status globals.cpp:64: Bind port: 7253
>25-04-2014 14:30:42.257 Status globals.cpp:68: Cluster local address: xx.xx.xx.97
>25-04-2014 14:30:42.257 Status globals.cpp:73: Cluster nodes:
>25-04-2014 14:30:42.257 Status globals.cpp:76: - xx.xx.xx.97
>25-04-2014 14:30:42.257 Status globals.cpp:76: - xx.xx.xx.197
>
>
>
>Regards.
>David.
>
>
>On 4/25/14 8:50 AM, "Eleanor Merry" <[email protected]> wrote:
>
>>Hi David,
>>
>>One reason this could happen is that the initial write to Chronos
>>fails, meaning that it never tells sprout when the registrations expire.
>>
>>Can you check if you are seeing any logs in sprout (in
>>/var/log/sprout/) of the form "Error httpconnection.cpp:536:
>>http://localhost:7253/timers/00f90c494000001b0000060108108000 failed at
>>server 127.0.0.1 : Couldn't connect to server"? Can you also check the
>>chronos logs (/var/log/chronos) and the monit logs (/var/log/monit.log)
>>for any reported errors?
>>
>>If there isn't a problem with Chronos, then can you look in the debug
>>logs for Sprout/Homestead for reported errors. Sprout and Homestead
>>logs are in /var/log/sprout and /var/log/homestead, and to set the logs
>>to debug level you will need to create/edit the file
>>/etc/clearwater/user_settings, add log_level=5, and then run "service
>><sprout/homestead> stop" to restart the sprout/homestead servers
>>(they're automatically restarted by monit).
>>
>>Also, what version of sprout are you running ("dpkg-query -W sprout")?
>>
>>Ellie
>>
>>
>>
>>-----Original Message-----
>>From: [email protected]
>>[mailto:[email protected]] On Behalf Of
>>Luong, David
>>Sent: 24 April 2014 21:11
>>To: [email protected]
>>Subject: [Clearwater] Registrar doesn't update HSS for registrations
>>that have expired
>>
>>Hi,
>>
>>According to the following snippet of code in sprout/handlers.cpp, the
>>HSS should receive a SAR with Server-Assignment-Type AVP set to value
>>TIMEOUT_DEREGISTRATION when a registration expires. However we do not
>>see any SAR being sent from homestead. Is this a known issue? Any help
>>is appreciated.
>>
>>
>>  if (all_bindings_expired)
>>  {
>>    //LCOV_EXCL_START
>>    LOG_DEBUG("All bindings have expired based on a Chronos callback - triggering deregistration at the HSS");
>>    _cfg->_hss->update_registration_state(_aor_id, "", HSSConnection::DEREG_TIMEOUT, 0);
>>    //LCOV_EXCL_STOP
>>  }
>>
>>
>>Regards.
>>
>>David.
>>_______________________________________________
>>Clearwater mailing list
>>[email protected]
>>http://lists.projectclearwater.org/listinfo/clearwater
>
>_______________________________________________
>Clearwater mailing list
>[email protected]
>http://lists.projectclearwater.org/listinfo/clearwater

_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater
