Public bug reported:

When sending a SIGTERM to the main glance-api process, api.log shows:

2017-08-31 13:10:30.996 10618 INFO glance.common.wsgi [-] Removed dead child 10628
2017-08-31 13:10:31.004 10618 INFO glance.common.wsgi [-] Started child 10642
2017-08-31 13:10:31.006 10642 INFO eventlet.wsgi.server [-] (10642) wsgi starting up on https://10.162.184.83:5510
2017-08-31 13:10:31.008 10642 INFO eventlet.wsgi.server [-] (10642) wsgi exited, is_accepting=True
2017-08-31 13:10:31.009 10642 INFO glance.common.wsgi [-] Child 10642 exiting normally

This happens because kill_children sends a SIGTERM to all children, and wait_on_children restarts one as soon as it notices a dead child.

We noticed this because it triggered fencing in our cloud's pacemaker setup: systemd seems to have a race condition in the cgroup code that should detect that all related services have terminated, so stopping the unit timed out.

# systemctl status openstack-glance-api
● openstack-glance-api.service - OpenStack Image Service API server
   Loaded: loaded (/usr/lib/systemd/system/openstack-glance-api.service; disabled; vendor preset: disabled)
   Active: deactivating (final-sigterm) since Thu 2017-08-31 10:13:48 UTC; 1min 14s ago
 Main PID: 25077 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 512)
   CGroup: /system.slice/openstack-glance-api.service

Aug 31 10:13:48 d08-9e-01-b4-9e-42 systemd[1]: Stopping OpenStack Image Service API server...
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: openstack-glance-api.service: State 'stop-final-sigterm' timed out. Killing.
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: Stopped OpenStack Image Service API server.
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: openstack-glance-api.service: Unit entered failed state.
Aug 31 10:15:21 d08-9e-01-b4-9e-42 systemd[1]: openstack-glance-api.service: Failed with result 'timeout'.
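
To make the kill_children / wait_on_children interaction described above concrete, here is a minimal sketch of the suspected sequence. It is a toy model, not glance's actual code: the Supervisor class, the fake PIDs, the reap method and the guard_shutdown switch are all hypothetical names used only for illustration. It assumes the parent is in the middle of its wait loop when the SIGTERM handler runs, so the next reaped child is replaced before the loop rechecks whether the server is still running.

```python
import itertools


class Supervisor:
    """Toy model of a pre-fork parent process (hypothetical, not glance code)."""

    def __init__(self, workers, guard_shutdown):
        self._pids = itertools.count(100)     # fake PIDs, no real fork()
        self.guard_shutdown = guard_shutdown  # hypothetical switch for a fix
        self.running = True
        self.children = set()
        self.log = []
        for _ in range(workers):
            self.spawn_child()

    def spawn_child(self):
        pid = next(self._pids)
        self.children.add(pid)
        self.log.append(f"Started child {pid}")
        return pid

    def kill_children(self):
        # Models the SIGTERM handler: mark shutdown and signal every child.
        self.running = False
        self.log.append("SIGTERM sent to all children")

    def reap(self, pid):
        # Models one wait-loop iteration after a child was reported dead.
        self.children.discard(pid)
        self.log.append(f"Removed dead child {pid}")
        if self.guard_shutdown and not self.running:
            return None  # shutting down: do not replace the worker
        return self.spawn_child()


if __name__ == "__main__":
    for guard in (False, True):
        sup = Supervisor(workers=2, guard_shutdown=guard)
        sup.kill_children()
        sup.reap(100)
        print(f"guard_shutdown={guard}")
        for line in sup.log:
            print("   ", line)
```

Without the guard, the model logs "Removed dead child" followed by "Started child" after the SIGTERM, which matches the order seen in api.log. Checking the running flag before respawning is one possible way to avoid that; whether it fits the real wait_on_children would need to be checked against the glance source.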

** Affects: glance
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1714240

Title:
  glance re-spawns a child when terminating

Status in Glance:
  New