Public bug reported:

Under heavy load, floating IP operations can trigger a lock wait timeout, causing the operation itself to fail.
The timeout is caused by the usual untimely eventlet yield, which can be triggered in many places during the operation. The chances of this happening are increased by the fact that _update_fip_assoc (called within a DB transaction) interacts with the NSX backend several times. Unfortunately, it is not practical to change the plugin logic so that _update_fip_assoc no longer goes to the backend, especially because the fix would be so extensive that it could hardly be backported. An attempt in this direction also did not provide a solution: https://review.openstack.org/#/c/138078/

** Affects: neutron
     Importance: Undecided
         Status: Won't Fix

** Affects: neutron/juno
     Importance: High
     Assignee: Salvatore Orlando (salvatore-orlando)
         Status: New

** Affects: vmware-nsx
     Importance: High
     Assignee: Salvatore Orlando (salvatore-orlando)
         Status: In Progress

** Also affects: neutron
   Importance: Undecided
       Status: New

** No longer affects: neutron

** Also affects: neutron
   Importance: Undecided
       Status: New

** Also affects: neutron/juno
   Importance: Undecided
       Status: New

** No longer affects: neutron

** Also affects: neutron
   Importance: Undecided
       Status: New

** Changed in: neutron/juno
     Assignee: (unassigned) => Salvatore Orlando (salvatore-orlando)

** Changed in: neutron
       Status: New => Won't Fix

** Changed in: neutron/juno
   Importance: Undecided => High

** Changed in: vmware-nsx
   Importance: Undecided => High

https://bugs.launchpad.net/bugs/1410777

Title:
  Floating IP ops lock wait timeout

Status in OpenStack Neutron (virtual network service):
  Won't Fix
Status in neutron juno series:
  New
Status in VMware NSX:
  In Progress

Bug description:
  Under heavy load, floating IP operations can trigger a lock wait timeout, causing the operation itself to fail.
  The timeout is caused by the usual untimely eventlet yield, which can be triggered in many places during the operation. The chances of this happening are increased by the fact that _update_fip_assoc (called within a DB transaction) interacts with the NSX backend several times. Unfortunately, it is not practical to change the plugin logic so that _update_fip_assoc no longer goes to the backend, especially because the fix would be so extensive that it could hardly be backported. An attempt in this direction also did not provide a solution: https://review.openstack.org/#/c/138078/
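To make the failure mode in this report easier to picture, here is a minimal sketch of it. It is not the plugin code: it assumes plain threads in place of eventlet greenthreads, a `threading.Lock` in place of the database row lock, and a `time.sleep` in place of a round trip to the NSX backend. Every name in it (`row_lock`, `LOCK_WAIT_TIMEOUT`, `slow_backend_call`, `run_demo`, and the simplified `update_fip_assoc`) is hypothetical and chosen only for illustration.

```python
import threading
import time

# Stand-in for the database row lock held by the floating IP transaction.
row_lock = threading.Lock()
LOCK_WAIT_TIMEOUT = 0.2  # seconds; deliberately tiny for the demo


class LockWaitTimeout(Exception):
    """Raised when the lock cannot be acquired within LOCK_WAIT_TIMEOUT."""


def slow_backend_call(delay):
    # Stand-in for a backend round trip. Sleeping lets other workers run,
    # much like a yield during network I/O.
    time.sleep(delay)


def update_fip_assoc(backend_delay, started):
    # Hypothetical simplification: the backend call happens while the
    # lock (the "transaction") is still held.
    if not row_lock.acquire(timeout=LOCK_WAIT_TIMEOUT):
        raise LockWaitTimeout()
    try:
        started.set()
        slow_backend_call(backend_delay)
    finally:
        row_lock.release()


def run_demo(backend_delay):
    """Run two competing operations and report how the second one ends."""
    started = threading.Event()
    first = threading.Thread(
        target=update_fip_assoc, args=(backend_delay, started)
    )
    first.start()
    started.wait()  # ensure the first operation already holds the lock
    try:
        update_fip_assoc(0, threading.Event())
        outcome = "ok"
    except LockWaitTimeout:
        outcome = "lock wait timeout"
    first.join()
    return outcome
```

With these assumptions, `run_demo(0.8)` returns "lock wait timeout" because the first operation holds the lock for longer than the second one is willing to wait, while `run_demo(0.0)` returns "ok". The longer the work done while the lock is held, the more likely a competing operation is to time out.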