Hi David,

Yes, that crash will do it (given that it happened to both Chronos services).

We've got a known issue with Chronos (described at https://github.com/Metaswitch/chronos/issues/19) that's causing these crashes - the problem is now understood, though, and a fix is in progress.

Ellie

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Luong, David
Sent: 25 April 2014 16:17
To: [email protected]
Subject: Re: [Clearwater] Registrar doesn't update HSS for registrations that have expired

Thanks Ellie for the pointers.

We are using sprout 1.0-140414.183451

The sprout log indicates the timer was registered with Chronos without errors.

25-04-2014 14:29:42.822 Debug httpconnection.cpp:456: Sending HTTP request : http://localhost:7253/timers (try 0) on new connection
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header http/1.1200ok with value
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header location with value http://localhost:7253/timers/375ae6f84000000112a40020c0400c10
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header date with value Fri,25Apr201414:29:42GMT
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header content-length with value 0
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header content-type with value text/html;charset=ISO-8859-1
25-04-2014 14:29:42.822 Debug httpconnection.cpp:736: Received header with value
25-04-2014 14:29:42.822 Debug httpconnection.cpp:467: Received HTTP response :

However, Chronos crashed around the same time and restarted. This seems to happen often. Could this be a config problem? We have 2 sprouts in a cluster, and Chronos crashes on both nodes. Could this be why we are not receiving the timeout event that triggers registration timeout handling?

Signal 11 caught
Basic stack dump:
/usr/bin/chronos[0x438a9e]
/usr/bin/chronos[0x438186]
/usr/bin/chronos[0x439061]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f60eaf1b4a0]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNKSs4sizeEv+0x3)[0x7f60eb854dd3]
/usr/bin/chronos[0x41c032]
/usr/bin/chronos[0x41dbc9]
/usr/bin/chronos[0x432a76]
/usr/bin/chronos[0x43277a]
/usr/bin/chronos[0x432701]
/usr/bin/chronos[0x43249b]
/usr/bin/chronos[0x43084e]
/usr/bin/chronos[0x430410]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f60ec4c3e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f60eafd93fd]

Advanced stack dump (requires gdb):
sh: 1: /usr/bin/gdb: not found
gdb failed with return code 32512
25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
25-04-2014 14:30:42.256 Status globals.cpp:64: Bind port: 7253
25-04-2014 14:30:42.256 Status globals.cpp:68: Cluster local address: xx.xx.xx.97
25-04-2014 14:30:42.256 Status globals.cpp:73: Cluster nodes:
25-04-2014 14:30:42.256 Status globals.cpp:76: - xx.xx.xx.97
25-04-2014 14:30:42.256 Status globals.cpp:76: - xx.xx.xx.197
25-04-2014 14:30:42.256 Status globals.cpp:60: Bind address: 0.0.0.0
25-04-2014 14:30:42.257 Status globals.cpp:64: Bind port: 7253
25-04-2014 14:30:42.257 Status globals.cpp:68: Cluster local address: xx.xx.xx.97
25-04-2014 14:30:42.257 Status globals.cpp:73: Cluster nodes:
25-04-2014 14:30:42.257 Status globals.cpp:76: - xx.xx.xx.97
25-04-2014 14:30:42.257 Status globals.cpp:76: - xx.xx.xx.197

Regards.

David.

On 4/25/14 8:50 AM, "Eleanor Merry" <[email protected]> wrote:

>Hi David,
>
>One reason this could happen is that the initial write to Chronos
>fails, meaning that it never tells sprout when the registrations expire.
>
>Can you check if you are seeing any logs in sprout (in
>/var/log/sprout/) of the form "Error httpconnection.cpp:536:
>http://localhost:7253/timers/00f90c494000001b0000060108108000 failed at
>server 127.0.0.1 : Couldn't connect to server"?
>Can you also check the
>chronos logs (/var/log/chronos) and the monit logs (/var/log/monit.log)
>for any reported errors?
>
>If there isn't a problem with Chronos, then can you look in the debug
>logs for Sprout/Homestead for reported errors. Sprout and Homestead
>logs are in /var/log/sprout and /var/log/homestead, and to set the logs
>to debug level you will need to create/edit the file
>/etc/clearwater/user_settings, add log_level=5, and then run "service
><sprout/homestead> stop" to restart the sprout/homestead servers
>(they're automatically restarted by monit).
>
>Also, what version of sprout are you running ("dpkg-query -W sprout")?
>
>Ellie
>
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of
>Luong, David
>Sent: 24 April 2014 21:11
>To: [email protected]
>Subject: [Clearwater] Registrar doesn't update HSS for registrations
>that have expired
>
>Hi,
>
>According to the following snippet of code in sprout/handlers.cpp, the
>HSS should receive a SAR with Server-Assignment-Type AVP set to value
>TIMEOUT_DEREGISTRATION when a registration expires. However we do not
>see any SAR being sent from homestead. Is this a known issue? Any help
>is appreciated.
>
>
>  if (all_bindings_expired)
>  {
>    //LCOV_EXCL_START
>    LOG_DEBUG("All bindings have expired based on a Chronos callback - triggering deregistration at the HSS");
>    _cfg->_hss->update_registration_state(_aor_id, "", HSSConnection::DEREG_TIMEOUT, 0);
>    //LCOV_EXCL_STOP
>  }
>
>
>Regards.
>
>David.
>_______________________________________________
>Clearwater mailing list
>[email protected]
>http://lists.projectclearwater.org/listinfo/clearwater
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater
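
The log checks suggested in this thread could be scripted roughly as follows. This is only a sketch, not part of Clearwater: the function names are hypothetical, and the two patterns are assumptions based solely on the log excerpts quoted above (the "Error httpconnection.cpp:NNN: <timer URL> failed" form from sprout and the "Signal 11 caught" line from Chronos). Other versions may log differently.

```python
import re
from typing import Iterable, List

# Patterns inferred from the log lines quoted in this thread (an assumption,
# not a documented log format). The source line number after
# "httpconnection.cpp:" is matched loosely in case it differs between builds.
SPROUT_TIMER_ERROR = re.compile(
    r"Error httpconnection\.cpp:\d+: (?P<url>http://\S+/timers\S*) failed"
)
CHRONOS_CRASH = re.compile(r"Signal (?P<signal>\d+) caught")


def failed_timer_requests(lines: Iterable[str]) -> List[str]:
    """Return the Chronos timer URLs that sprout logged as failed."""
    urls = []
    for line in lines:
        match = SPROUT_TIMER_ERROR.search(line)
        if match:
            urls.append(match.group("url"))
    return urls


def crash_signals(lines: Iterable[str]) -> List[int]:
    """Return the signal number of each crash recorded in a Chronos log."""
    signals = []
    for line in lines:
        match = CHRONOS_CRASH.search(line)
        if match:
            signals.append(int(match.group("signal")))
    return signals
```

To use it, open a log file from /var/log/sprout or /var/log/chronos and pass the file object to the matching function, for example `crash_signals(open(path, errors="replace"))`, where `path` is whichever log file exists on the node. An empty result from `failed_timer_requests` together with a non-empty result from `crash_signals` would match the situation David describes: the timer was set successfully, but Chronos crashed before it could pop.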
