Hi folks!

In case you're not aware already, I'm working on solving the "node is locked" problems that are breaking users (tracked at https://etherpad.openstack.org/p/ironic-locking-reform). We have retries in place in the client, but we all agree that this is not the final solution.

One of the things we've figured out is that we actually have server-side retries in task_manager.acquire. They're nice and configurable. Alas, we have one place that checks reservations without going through task_manager: https://github.com/openstack/ironic/blob/master/ironic/api/controllers/v1/node.py#L401-L403 (note that this check is actually racy).

I'd like to ask for your opinions on how to solve it. I have 3 ideas:
1. Just implement retries at the API level (possibly splitting a common function out of task_manager).
2. Move the update to the conductor instead of doing it directly in the API.
3. Don't check the reservation when updating a node. At all.
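
To make idea 1 a bit more concrete, here is a rough sketch of what a shared retry helper could look like. It is only an illustration under assumptions: NodeLocked is a local stand-in for the real "node is locked" error, and retry_on_node_locked, attempts and interval are hypothetical names, not the existing task_manager code or its config options.

```python
import time


class NodeLocked(Exception):
    """Hypothetical stand-in for the "node is locked" error."""


def retry_on_node_locked(func, attempts=3, interval=1.0, sleep=time.sleep):
    """Call func(), retrying while it raises NodeLocked.

    attempts and interval are illustrative parameters, not real
    configuration option names.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except NodeLocked:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts:
                raise
            sleep(interval)


def make_flaky_update(locked_for):
    """Build a fake update that reports a lock for the first N calls."""
    calls = {"count": 0}

    def update():
        calls["count"] += 1
        if calls["count"] <= locked_for:
            raise NodeLocked("node is locked")
        return "updated"

    return update, calls
```

With something like this, the API-side check would be wrapped as retry_on_node_locked(do_update), so a short-lived reservation turns into a brief wait instead of an immediate error. Note that retrying alone does not remove the race in the check itself.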

Ideas?

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)