Hi everyone,

Thank you so much for the work on this; I'm sure we can make progress on it together. I have noticed that this only occurs in master and never in the stable branches. Also, it only occurs under Ubuntu (so maybe it's something related to the mod_wsgi version?).
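To chase the mod_wsgi angle, a quick way to confirm which version Apache actually loaded is to look for the banner it logs at startup. A minimal sketch, assuming the stock Ubuntu error-log path (the path is an assumption, not taken from the job logs):

```python
import re

# mod_wsgi reports itself in Apache's startup banner, e.g.
# "Apache/2.4.18 (Ubuntu) mod_wsgi/4.3.0 Python/2.7.12 configured ..."
# The log path below is the Ubuntu default and may differ per deployment.
with open("/var/log/apache2/error.log") as log:
    for line in log:
        match = re.search(r"mod_wsgi/([\d.]+)", line)
        if match:
            print("loaded mod_wsgi version:", match.group(1))
            break
```

Comparing what the Ubuntu job loads against the jobs where the failure does not occur would quickly confirm or rule out this theory.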
Given that we don't have any "master" built packages for Ubuntu, we test against the latest release, which is the Pike release:

https://github.com/openstack/puppet-openstack-integration/blob/master/manifests/repos.pp#L6-L10

I've noticed the issue is not as present in older branches but is much more visible in master.

Thanks,
Mohammed

On Tue, Nov 14, 2017 at 6:21 AM, Tobias Urdin <tobias.ur...@crystone.com> wrote:
> Yes, I've been scavenging the logs for any kind of indicator of what
> might have gone wrong, but I can't see anything related to a deadlock,
> even though I'm very certain that's the issue; I just don't know
> what's causing it.
>
> Perhaps we will need to recreate this issue and then troubleshoot it
> manually.
> The apache2 mod_wsgi config should be OK according to the docs [1].
>
> Best regards
>
> [1] http://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIDaemonProcess.html
>
> On 11/14/2017 11:12 AM, Jens Harbott wrote:
>> 2017-11-14 8:24 GMT+00:00 Tobias Urdin <tobias.ur...@crystone.com>:
>>> Trying to trace this: tempest calls the POST /servers/<instance id>/action
>>> API endpoint of the nova compute API.
>>>
>>> https://github.com/openstack/tempest/blob/master/tempest/lib/services/compute/floating_ips_client.py#L82
>>>
>>> Nova then takes the request and tries to do the floating IP association
>>> using the neutron server API.
>>>
>>> http://logs.openstack.org/47/514347/1/check/puppet-openstack-integration-4-scenario001-tempest-ubuntu-xenial/ed5a657/logs/nova/nova-api.txt.gz
>>>
>>> 2017-10-29 23:12:35.521 17800 ERROR nova.api.openstack.compute.floating_ips
>>> [req-7f810cc7-a498-4bf4-b27e-8fc80d652785 42526a28b1a14c629b83908b2d75c647
>>> 2493426e6a3c4253a60c0b7eb35cfe19 - default default] Unable to associate
>>> floating IP 172.24.5.17 to fixed IP 10.100.0.8 for instance
>>> d265626a-77c1-4d2f-8260-46abe548293e. Error: Request to
>>> https://127.0.0.1:9696/v2.0/floatingips/2e3fa334-d6ac-443c-b5ba-eeb521d6324c
>>> timed out: ConnectTimeout: Request to
>>> https://127.0.0.1:9696/v2.0/floatingips/2e3fa334-d6ac-443c-b5ba-eeb521d6324c
>>> timed out
>>>
>>> Checking that timestamp in the neutron-server logs:
>>> http://paste.openstack.org/show/626240/
>>>
>>> We can see that right before that timestamp, at 23:12:30.377, and right
>>> after it, at 23:12:35.611, everything seems to be doing fine. So there
>>> is some connectivity issue between where the Nova API runs and the
>>> neutron API, causing a timeout.
>>>
>>> Now some more questions:
>>>
>>> * Why is the return code 400? Are we being fooled, or is it actually a
>>>   connection timeout?
>>> * Is the neutron API stuck, causing the failed connection? All traffic
>>>   goes over loopback, so the chance of a problem there is very low.
>>> * Is any firewall catching this? Not likely, since the agent processes
>>>   requests right before and after.
>>>
>>> I can't find anything interesting in the other system logs that could
>>> explain this. Back to the logs!
>>
>> I'm pretty certain that this is a deadlock between nova and neutron,
>> though I cannot put my finger on the exact spot yet.
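The failure mode being described would look roughly like this. A minimal, self-contained sketch, assuming (this is an assumption, not something the logs confirm) that each nova-api process serves requests with a single thread and that neutron makes a synchronous call back into nova-api before answering the floating-ip update; all names below are hypothetical stand-ins, not OpenStack code:

```python
"""Hypothetical sketch of the suspected nova <-> neutron deadlock."""
import os
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# One worker mimics a mod_wsgi daemon process running with threads=1.
nova_api = ThreadPoolExecutor(max_workers=1)

def nova_notify(event):
    # The callback neutron waits for; it can only run on nova's one thread.
    return "acked %s" % event

def neutron_update_floatingip(fip):
    # neutron blocks on a synchronous call back into nova-api (assumption).
    return nova_api.submit(nova_notify, fip).result()

def nova_associate_floatingip(fip):
    # nova's only request thread blocks on neutron, which blocks on nova.
    return neutron_update_floatingip(fip)

try:
    # The client gives up after its timeout, just like nova's neutron
    # client did after ~30 s in the logs (shortened here to 5 s).
    nova_api.submit(nova_associate_floatingip, "172.24.5.17").result(timeout=5)
except TimeoutError:
    print("timed out: nova's single thread is waiting on itself")
    os._exit(0)  # the worker thread is wedged for good; force-exit the demo
```

With a single thread per process, the callback queues behind the very request that is waiting for it, and both sides stall until a client-side timeout fires, which would produce exactly the ConnectTimeout above.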
>> But looking at the neutron log that you extracted, you can see that
>> neutron does indeed try to give a successful answer to the FIP request
>> just after nova has given up waiting for it (the timeout seems to be
>> 30 s here):
>>
>> 2017-10-29 23:12:35.932 18958 INFO neutron.wsgi
>> [req-e737b7dd-ed9c-46a7-911b-eb77efe11aa8
>> 42526a28b1a14c629b83908b2d75c647 2493426e6a3c4253a60c0b7eb35cfe19 -
>> default default] 127.0.0.1 "PUT
>> /v2.0/floatingips/2e3fa334-d6ac-443c-b5ba-eeb521d6324c HTTP/1.1"
>> status: 200 len: 746 time: 30.4427412
>>
>> Also, looking at
>> http://logs.openstack.org/47/514347/1/check/puppet-openstack-integration-4-scenario001-tempest-ubuntu-xenial/ed5a657/logs/apache_config/10-nova_api_wsgi.conf.txt.gz
>> it seems that nova-api is started with two processes and one thread.
>> I'm not sure whether that means two processes with one thread each or
>> only one thread in total, but either way nova-api might be getting
>> stuck there.
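For reference, the relevant directives in the linked config would look something like the lines below. This is an illustrative reconstruction based on the "two processes, one thread" reading above, not a copy of the job's actual 10-nova_api_wsgi.conf:

```apache
# Illustrative values only. With threads=1, each daemon process can
# service exactly one request at a time, so a request that blocks
# waiting on a callback into the same service starves the pool.
WSGIDaemonProcess nova-api processes=2 threads=1 user=nova group=nova
WSGIProcessGroup nova-api
```

For what it's worth, mod_wsgi's threads option is per daemon process, so this would be two processes with one thread each. If the starvation theory holds, bumping threads (or processes) in that file should make the timeout much harder to reproduce, which would be a cheap way to test the hypothesis before digging further.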