> * nova-api should receive an acknowledgement from nova-compute. It is
>   unclear to me why today it uses a non-reply mechanism - probably to
>   free the worker as fast as it can.

Yes, wherever possible, we want the API to return immediately and let
the action complete later. Making a wholesale change to blocking calls
from the API to any other service is not a good idea, IMHO.

> * Change the task_state mechanism to prevent this kind of a stuck
>   state to stay in the DB. nova-compute can be the one that writes the
>   task_state to the DB, but this is not enough of course, but maybe
>   there's another way?

Setting the task_state in the API is our way of limiting/locking the
operation. If the request sits in the queue for a long time, the lock
keeps a user from reissuing the command a bunch of times, which would
add load to the API and/or jam up the queue with a thousand requests
for the same operation just because it's taking a while.

> * nova-api could start a timer for the action to complete. If the
>   shelving operation hasn't completed in X seconds, it will clean it
>   by itself and rollback\try-again.

I have wanted to make a change for a while that involves a TTL on
messages, along with a deadline record so that we can know when to
retry or revert things that were in flight. This requires a lot of
machinery to accomplish, and is probably interwoven with the task
concept we've had on the back burner for a while. The complexity of
moving nova to this sort of scheme means that nobody has picked it up
as of yet, but it's certainly in the minds of many of us as something
we need to do before too long.

--Dan

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
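
To make the two ideas above concrete (task_state set up front as a lock,
plus a deadline after which a stuck operation can be cleared), here is a
minimal sketch. It is not nova code: `Instance`, `try_begin_task`,
`expire_if_overdue`, and the `deadline` field are all hypothetical names
invented for illustration, and a real implementation would need an atomic
compare-and-set in the database rather than an in-memory check.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record; not nova's model."""
    uuid: str
    task_state: Optional[str] = None
    deadline: Optional[float] = None  # hypothetical "deadline record"


def try_begin_task(inst: Instance, task: str, ttl: float,
                   now: Optional[float] = None) -> bool:
    """Claim the instance for `task` only if nothing else is in flight.

    Returns False when an operation is already queued or running, so a
    repeated request is rejected instead of piling up in the queue.
    """
    now = time.time() if now is None else now
    if inst.task_state is not None:
        return False
    # Assumption: in a real service this check-and-set must be atomic.
    inst.task_state = task
    inst.deadline = now + ttl
    return True


def expire_if_overdue(inst: Instance, now: Optional[float] = None) -> bool:
    """Clear a claim whose deadline has passed so it can be retried/reverted."""
    now = time.time() if now is None else now
    overdue = (inst.task_state is not None
               and inst.deadline is not None
               and now >= inst.deadline)
    if overdue:
        inst.task_state = None
        inst.deadline = None
    return overdue
```

In this sketch a second shelve request is refused while the first is
pending, and the claim is only released once the (assumed) TTL has
elapsed. Deciding whether to retry or revert at that point is the part
that needs the extra machinery mentioned above.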