On Tue, Jul 21, 2015 at 02:24:14PM +0200, Dmitry Tantsur wrote:
> Hi folks!
>
> If you're not aware already, I'm working on solving "node is locked"
> problems breaking users (and tracking it at
> https://etherpad.openstack.org/p/ironic-locking-reform). We have retries in
> place in client, but we all agree that it's not the eventual solution.
>
> One of the things we've figured out is that we actually have server-side
> retries - in task_manager.acquire. They're nice and configurable. Alas, we
> have one place that checks reservations without task_manager:
> https://github.com/openstack/ironic/blob/master/ironic/api/controllers/v1/node.py#L401-L403
> (note that this check is actually racy)
>
> I'd like to ask your opinions on how to solve it? I have 3 ideas:
> 1. Just implement retries on API level (possibly split away a common
>    function from task_manager).
> 2. Move update to conductor instead of doing it directly in API.
> 3. Don't check reservation when updating node. At all.
>
> Ideas?

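For reference, idea 1 (retries at the API level, shared with task_manager) could look roughly like the sketch below. This is only an illustration under stated assumptions: the NodeLocked class here is a local stand-in rather than Ironic's own exception, and call_with_lock_retries, attempts, and interval are hypothetical names, not Ironic's actual helper or configuration options.

```python
import time


class NodeLocked(Exception):
    """Local stand-in for a "node is locked" error (not Ironic's class)."""


def call_with_lock_retries(func, attempts=3, interval=1.0, sleep=time.sleep):
    """Call func(), retrying while it raises NodeLocked.

    attempts and interval are illustrative parameters, not Ironic's
    real configuration options.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except NodeLocked:
            if attempt == attempts:
                # Out of retries: let the caller report the conflict.
                raise
            sleep(interval)


if __name__ == "__main__":
    calls = {"count": 0}

    def update_node():
        # Hypothetical update that finds the node reserved twice.
        calls["count"] += 1
        if calls["count"] < 3:
            raise NodeLocked("node is reserved")
        return "updated"

    print(call_with_lock_retries(update_node, attempts=5, interval=0.1))
```

A helper like this only narrows the window in which the caller sees a lock error. It does not by itself make a separate reservation check any less racy.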
So, it looks like the only reason we check the reservation field here is
because we want to return a 409 for "node is locked" rather than a 400,
right? do_node_deploy and such will raise a NodeLocked, which should do
the same as this check.

It's unclear to me why we can't just remove this check and let the
conductor deal with it.

// jim

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
