Hi John, I did a lot more testing and now have a lot more details:

The issue does not happen in bionic-queens, but it started happening in 
bionic-rocky.
What is different between queens and rocky: queens uses Python 2.7, while 
rocky uses Python 3. Rocky also serves the API through a different 
mechanism: it runs nova-api as a WSGI application under apache2 (I don't 
know the proper name for the mechanism queens used before). The log 
messages are very different between the versions, and even the log file 
names differ: rocky writes to nova-api-wsgi.log plus an additional 
nova-api-os-compute.log, while queens writes to nova-osapi_compute.log.

I found this commit [1] that apparently transitions from one to the other, 
but I haven't dug in enough. It may not even be relevant: judging by its 
date, the commit is already included in queens, so the difference might 
instead be a customization in how the service is deployed through the 
Debian packages.

To reproduce the issue:
1) Deploy OpenStack, make sure to have debug and verbose enabled
2) Create a VM
3) Create a Volume
4) Edit the nova code on the nova-compute node and inject a time.sleep(45) 
at [2] to leave time to reload apache2 in the middle of the request (a 
sketch of this edit follows the list)
5) SSH to the nova-api node as root and have two terminals ready, one with 
the command "invoke-rc.d apache2 reload" and the other tailing 
/var/log/nova/nova-api-wsgi.log
6) Send a request to attach the volume to the VM: "openstack server add 
volume <vm> <vol>"
7) While tailing the logs, as soon as you see the message "exchange 'nova' 
topic 'compute.<hostname>'", invoke the apache2 reload
8) The API call to add the volume will fail with a 500 Internal Server 
Error a few seconds later
9) Wait 1 minute and run "openstack server add volume" again; it will say 
that the volume is already attached.
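
A sketch of the step-4 edit (the method name here is illustrative; the 
real attach point is the line referenced in [2] in 
nova/compute/manager.py):

    import time

    def attach_volume(self, context, instance, bdm):
        # Injected delay: keeps the request in flight on nova-compute for
        # 45 seconds, long enough to run "invoke-rc.d apache2 reload" on
        # the nova-api node while the synchronous RPC call is still
        # waiting for its reply.
        time.sleep(45)
        # ... rest of the original method body unchanged ...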


Observations:
a) The timing is easy to hit, but it doesn't always fail; for me it failed 
95% of the time
b) If you reload the service before the RPC message is sent from nova-api 
to nova-compute (step 7 above), the API call to the client is not dropped, 
nor is the thread; it still works as intended. This means that the 
threads, client socket connections and state are not being lost during the 
reload; it is ONLY the synchronous rabbitmq sessions that are failing (see 
the sketch after this list)
c) For me the biggest indication that the service is being reloaded is 
that when you run the reload command, the next log entry is a big header 
dumping all the configuration variables, which is the header printed when 
the nova API service is first started.
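
Here is my hypothetical sketch of why only the RabbitMQ side breaks (this 
is not nova's actual code; nova does its RPC through oslo.messaging, and 
the target and method below are illustrative). A synchronous call 
publishes the request and then blocks on a reply queue owned by the 
calling worker process, so recycling that process orphans the reply:

    import oslo_messaging
    from oslo_config import cfg

    transport = oslo_messaging.get_rpc_transport(cfg.CONF)
    target = oslo_messaging.Target(topic='compute', server='<hostname>')
    client = oslo_messaging.RPCClient(transport, target, timeout=60)

    # call() publishes to exchange 'nova', topic 'compute.<hostname>' (the
    # log line from step 7) and then blocks, consuming from a reply queue
    # tied to this worker process. If apache2 recycles the worker while
    # nova-compute is still inside the injected time.sleep(45), nothing is
    # left to consume the reply, and the in-flight request dies with a 500.
    client.call({}, 'attach_volume', volume_id='<vol>')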


In conclusion, the problem does not seem to be in apache2, especially 
considering that bionic-queens and bionic-rocky use the same version of 
apache2 and the same logrotate script. It seems to be in how nova handles 
the requests. Based on observation (b) above, nova handles the reload 
mostly well; only the synchronous rabbitmq RPC calls are not handled 
properly.

[1] 
https://github.com/openstack/nova/commit/d3c084f23448d1890bfda4a06de246f2be3c1279
[2] 
https://github.com/openstack/nova/blob/1ad11b13884baeaa6ed9f8f5818f4d176f4d3134/nova/compute/manager.py#L8005
