So, I see three things of interest in the logs here. 1) The SIGHUP signal received
> 2015-07-28 14:49:10.378 22487 INFO oslo_service.service [-] Caught > SIGHUP, exiting Do we know what's sending this to signal to Designate? 2) The first exception, "Calling waitall() ..." > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > AssertionError: Calling waitall() from within one of the GreenPool's > greenthreads will never terminate. We believe this to be effectively harmless, though still an open bug: https://bugs.launchpad.net/designate/+bug/1442141 3) The second exception, "Could not bind ..." > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup [-] Could > not bind to 0.0.0.0:9001 <http://0.0.0.0:9001> after trying for 30 seconds What kernel version are you running? Can you try: a) Removing the socket.SO_REUSEADDR option from the API socket Approx L211 of designate/service.py, comment this line out: > sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) b) Putting socket.SO_REUSEADDR back in, and adding socket.SO_REUSEPORT. This will need a 3.9+ kernel. Approx L211 of designate/service.py, add this line: > sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1) Thanks, Kiall On 29/07/15 09:35, Jaime Fernández wrote: > Hi Kiall, > > We haven't found the cause yet. Even moving the host to the same > network, the error still happens :( > > We decided to migrate to rabbitmq. While migrating OST to rabbitmq, I've > connected to the local rabbitmq, and I've also seen the same problem > (all designate processes and rabbitmq in the same host). In this > occasion, the designate-api process died after 5 hours. > > Here it is the logs: > > 2015-07-28 12:37:52.730 22487 INFO eventlet.wsgi > [req-84e3674a-860e-46bf-b7a7-b9866866152d noauth-user > 4e3b6c0108f04b309737522a9deee9d8 - - -] 127.0.0.1 - - [28/Jul/2015 > 12:37:52] "GET /v1/domains/38f8f79d-0f6c-42fb-abd7-98ae8cc87fc5/records > HTTP/1.1" 200 1757 0.049479 > 2015-07-28 14:49:10.378 22487 INFO oslo_service.service [-] Caught > SIGHUP, exiting > 2015-07-28 14:49:10.379 22487 INFO designate.service [-] Stopping api > service > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup [-] Error > stopping thread. > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup Traceback > (most recent call last): > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/oslo.service-0.4.0-py2.7.egg/oslo_service/threadgroup.py", > line 107, in _stop_threads > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup x.stop() > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/oslo.service-0.4.0-py2.7.egg/oslo_service/threadgroup.py", > line 48, in stop > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > self.thread.kill() > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/greenthread.py", > line 240, in kill > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup return > kill(self, *throw_args) > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/greenthread.py", > line 294, in kill > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > g.throw(*throw_args) > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/greenthread.py", > line 214, in main > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup result > = function(*args, **kwargs) > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/designate/service.py", line 230, in > _wsgi_handle > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > log=loggers.WritableLogger(logger)) > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/wsgi.py", > line 842, in server > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > pool.waitall() > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/greenpool.py", > line 117, in waitall > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > "Calling waitall() from within one of the " \ > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > AssertionError: Calling waitall() from within one of the GreenPool's > greenthreads will never terminate. > 2015-07-28 14:49:10.379 22487 ERROR oslo_service.threadgroup > 2015-07-28 14:49:10.382 22487 WARNING oslo_config.cfg [-] Option > "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from > group "DEFAULT". > 2015-07-28 14:49:10.383 22487 INFO designate.service [-] Stopping api > service > 2015-07-28 14:49:10.385 22487 WARNING oslo_config.cfg [-] Option > "rabbit_password" from group "DEFAULT" is deprecated. Use option > "rabbit_password" from group "oslo_messaging_rabbit". > 2015-07-28 14:49:10.386 22487 WARNING oslo_config.cfg [-] Option > "rabbit_userid" from group "DEFAULT" is deprecated. Use option > "rabbit_userid" from group "oslo_messaging_rabbit". > 2015-07-28 14:49:10.396 22487 INFO designate.service [-] Starting api > service (version: 1.0.0) > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup [-] Could > not bind to 0.0.0.0:9001 <http://0.0.0.0:9001> after trying for 30 seconds > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup Traceback > (most recent call last): > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/oslo.service-0.4.0-py2.7.egg/oslo_service/threadgroup.py", > line 154, in wait > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup x.wait() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/oslo.service-0.4.0-py2.7.egg/oslo_service/threadgroup.py", > line 51, in wait > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup return > self.thread.wait() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/greenthread.py", > line 175, in wait > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup return > self._exit_event.wait() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/event.py", > line 121, in wait > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup return > hubs.get_hub().switch() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/hubs/hub.py", > line 294, in switch > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup return > self.greenlet.switch() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/eventlet-0.17.4-py2.7.egg/eventlet/greenthread.py", > line 214, in main > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup result > = function(*args, **kwargs) > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/.venv/local/lib/python2.7/site-packages/oslo.service-0.4.0-py2.7.egg/oslo_service/service.py", > line 623, in run_service > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup > service.start() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/designate/service.py", line 173, in > start > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup socket > = self._wsgi_get_socket() > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup File > "/home/sysadmin/openstack/designate/designate/service.py", line 209, in > _wsgi_get_socket > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup 'port': > self._service_config.api_port}) > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup > RuntimeError: Could not bind to 0.0.0.0:9001 <http://0.0.0.0:9001> after > trying for 30 seconds > 2015-07-28 14:49:40.445 22487 ERROR oslo_service.threadgroup > 2015-07-28 14:49:40.446 22487 INFO designate.service [-] Stopping api > service > > It looks like a problem in Oslo (SIGHUP) that ends the designate-api > process. > > The same problem appears in designate-central but it looks like it is > restored (the designate-central PID is not changed): > 2015-07-28 14:47:53.446 22486 INFO designate.policy > [req-8f9c5aad-0d0e-4f00-8594-a987f88082f9 - - - - -] Policy check > succeeded for rule 'find_domains' on target {'tenant_id': None} > 2015-07-28 14:49:10.366 22486 INFO oslo_service.service [-] Caught > SIGHUP, exiting > 2015-07-28 14:49:10.367 22486 INFO designate.service [-] Stopping > central service > 2015-07-28 14:49:10.368 22486 WARNING oslo_config.cfg [-] Option > "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from > group "DEFAULT". > 2015-07-28 14:49:10.369 22486 INFO designate.service [-] Stopping > central service > 2015-07-28 14:49:10.388 22486 WARNING oslo_config.cfg [-] Option > "rabbit_password" from group "DEFAULT" is deprecated. Use option > "rabbit_password" from group "oslo_messaging_rabbit". > 2015-07-28 14:49:10.388 22486 WARNING oslo_config.cfg [-] Option > "rabbit_userid" from group "DEFAULT" is deprecated. Use option > "rabbit_userid" from group "oslo_messaging_rabbit". > 2015-07-28 14:49:10.398 22486 INFO designate.central.service [-] NOT > checking for TLDs > 2015-07-28 14:49:10.398 22486 WARNING designate.central.service [-] > Managed Resource Tenant ID is not properly configured > 2015-07-28 14:49:10.399 22486 INFO designate.service [-] Starting > central service (version: 1.0.0) > > To restore the service (I'm testing that API is working with: curl > http://localhost:9001/v1/domains), I killed designate-central and > designate-pool-manager. I restarted designate-api, designate-central and > designate-pool-manager but the api did not work yet. Then I restarted > the process designate-mdns, and the api started working correctly. > > In mdns logs, I also have: > > 2015-07-28 14:49:10.359 22489 INFO oslo_service.service [-] Caught > SIGHUP, exiting > 2015-07-28 14:49:10.360 22489 INFO designate.service [-] Stopping mdns > service > 2015-07-28 14:49:10.362 22489 WARNING oslo_config.cfg [-] Option > "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from > group "DEFAULT". > 2015-07-28 14:49:10.363 22489 INFO designate.service [-] Stopping mdns > service > 2015-07-28 14:49:10.366 22489 WARNING oslo_config.cfg [-] Option > "rabbit_password" from group "DEFAULT" is deprecated. Use option > "rabbit_password" from group "oslo_messaging_rabbit". > 2015-07-28 14:49:10.378 22489 WARNING oslo_config.cfg [-] Option > "rabbit_userid" from group "DEFAULT" is deprecated. Use option > "rabbit_userid" from group "oslo_messaging_rabbit". > 2015-07-28 14:49:10.383 22489 INFO designate.service [-] Starting mdns > service (version: 1.0.0) > 2015-07-28 14:49:10.383 22489 INFO designate.dnsutils [-] Opening TCP > Listening Socket on 0.0.0.0:5354 <http://0.0.0.0:5354> > 2015-07-28 14:49:10.383 22489 INFO designate.dnsutils [-] Opening UDP > Listening Socket on 0.0.0.0:5354 <http://0.0.0.0:5354> > 2015-07-28 14:49:10.384 22489 INFO designate.service [-] _handle_tcp > thread started > 2015-07-28 14:49:10.384 22489 INFO designate.service [-] _handle_udp > thread started > 2015-07-28 15:03:20.492 22489 INFO designate.policy > [req-f9065d06-e1c8-4934-82f7-00f6e16fab32 - - - - -] Policy check > succeeded for rule 'all_tenants' on target {} > > I've added the rabbitmq configuration to oslo_messaging_rabbit section > to fix the warning in logs (although I'm almost sure that it won't fix > the problem): > [oslo_messaging_rabbit] > # RabbitMQ Config > rabbit_userid = designate > rabbit_password = designate > > Why oslo.service receives a SIGHUP? The version installed in my host is > oslo.service-0.4.0. > > > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > __________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev