[openstack-dev] [neutron] Cross-server locking for neutron server
Hello everyone!

Some recent change requests ([1], [2]) show that there are a number of issues with locking DB resources in Neutron. One of them is the initialization of drivers, which can be performed simultaneously by several neutron servers. In this case locking is essential to avoid conflicts, and is now mostly done via SQLAlchemy's with_lockmode() method, which emits SELECT .. FOR UPDATE, locking the selected rows within a transaction. As Mike Bayer has already stated [3], this statement is not supported by Galera and, what's more, on PostgreSQL the lock does not work when the table is empty. That is why there is a need for a simple solution that would allow cross-server locking and work with every backend.

The first thing that comes to mind is to create a table containing all locks acquired by various pieces of code. Code that wishes to access a table that needs locking would have to perform the following steps:

1. Check whether the lock is already acquired, using SELECT lock_name FROM the cross_server_locks table.
2. If the SELECT returned nothing, acquire the lock by inserting it into the cross_server_locks table. Otherwise, wait and try again until a timeout is reached.
3. After the code has executed, release the lock by deleting the corresponding entry from the cross_server_locks table.

The locking process can be implemented by decorating a function that performs a transaction with a special function, or as a context manager.

Thus, I wanted to ask the community whether this approach deserves consideration and, if yes, it would be necessary to decide on the format of an entry in the cross_server_locks table: how a lock_name should be formed, whether to support different locking modes, etc.
[1] https://review.openstack.org/#/c/101982/
[2] https://review.openstack.org/#/c/107350/
[3] https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy#Pessimistic_Locking_-_SELECT_FOR_UPDATE

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
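The three steps above can be sketched as a context manager. This is only an illustration of the proposal, not Neutron code: sqlite3 stands in for the real backend, and the table and function names are made up. Note that relying on the table's unique key (INSERT and catch the conflict) instead of a separate SELECT closes the race between steps 1 and 2:

```python
import contextlib
import sqlite3
import time


@contextlib.contextmanager
def cross_server_lock(conn, lock_name, timeout=10.0, retry_interval=0.1):
    """Acquire lock_name by inserting a row; block until acquired or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The PRIMARY KEY constraint guarantees only one server can
            # insert a given lock_name; everyone else gets a conflict.
            with conn:
                conn.execute(
                    "INSERT INTO cross_server_locks (lock_name) VALUES (?)",
                    (lock_name,))
            break
        except sqlite3.IntegrityError:
            if time.monotonic() > deadline:
                raise TimeoutError("could not acquire %s" % lock_name)
            time.sleep(retry_interval)
    try:
        yield
    finally:
        # Step 3: release by deleting the row.
        with conn:
            conn.execute("DELETE FROM cross_server_locks WHERE lock_name = ?",
                         (lock_name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cross_server_locks (lock_name TEXT PRIMARY KEY)")

with cross_server_lock(conn, "driver-init"):
    # Critical section: the row exists while the lock is held.
    held = conn.execute(
        "SELECT COUNT(*) FROM cross_server_locks").fetchone()[0]
print(held)  # 1
```

A real implementation would also need a lock-holder identifier and a timestamp so that locks left behind by a crashed server can be reaped.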
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Please do not re-invent locking.. the way we reinvented locking in Heat. ;) There are well known distributed coordination services such as Zookeeper and etcd, and there is an abstraction for them already called tooz: https://git.openstack.org/cgit/stackforge/tooz/

Excerpts from Elena Ezhova's message of 2014-07-30 09:09:27 -0700:
> [...]
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
There's also no need to use locks at all for this (distributed or otherwise). You can use a compare and update strategy with an exponential backoff similar to the approach taken here: https://review.openstack.org/#/c/109837/

I'd have to look at the Neutron code, but I suspect that a simple strategy of issuing the UPDATE SQL statement with a WHERE condition that is constructed to take into account the expected current record state would do the trick...

Best,
-jay

On 07/30/2014 09:33 AM, Clint Byrum wrote:
> [...]
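The compare-and-update-with-backoff strategy can be sketched as below. Again sqlite3 stands in for the real backend and the table, column, and state names are invented for illustration; the key point is that the WHERE clause encodes the state the caller last read, and a rowcount of 0 means another server got there first:

```python
import random
import sqlite3
import time


def compare_and_update(conn, driver_id, expected, new, max_tries=5):
    """Atomically move a row from `expected` to `new` state.

    If another server changed the row since we last read it, the WHERE
    clause matches nothing, rowcount is 0, and we back off and retry.
    """
    delay = 0.01
    for _ in range(max_tries):
        with conn:
            cur = conn.execute(
                "UPDATE drivers SET state = ? WHERE id = ? AND state = ?",
                (new, driver_id, expected))
        if cur.rowcount == 1:
            return True  # we won the race
        # Exponential backoff with jitter before retrying.
        time.sleep(delay + random.uniform(0, delay))
        delay *= 2
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drivers (id TEXT PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO drivers VALUES ('vxlan', 'uninitialized')")

won = compare_and_update(conn, 'vxlan', 'uninitialized', 'initialized')
lost = compare_and_update(conn, 'vxlan', 'uninitialized', 'initialized')
print(won, lost)  # True False
```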
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On 7/30/14, 9:41 AM, Jay Pipes jaypi...@gmail.com wrote:
> I'd have to look at the Neutron code, but I suspect that a simple
> strategy of issuing the UPDATE SQL statement with a WHERE condition
> that is constructed to take into account the expected current record
> state would do the trick...

I'm assuming the locking is for serializing code, whereas for what you describe above, is there some reason we wouldn't just use a transaction?

Thanks,
doug
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On 07/30/2014 09:48 AM, Doug Wiegley wrote:
> I'm assuming the locking is for serializing code, whereas for what you
> describe above, is there some reason we wouldn't just use a transaction?

Because you can't do a transaction from two different threads... The compare and update strategy is for avoiding the use of SELECT FOR UPDATE.

Best,
-jay
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Excerpts from Doug Wiegley's message of 2014-07-30 09:48:17 -0700:
> I'm assuming the locking is for serializing code, whereas for what you
> describe above, is there some reason we wouldn't just use a transaction?

I believe the code in question is doing something like this:

1) Check DB for initialized SDN controller driver
2) Not initialized - initialize the SDN controller via its API
3) Record in DB that it is initialized

Step (2) above needs serialization, not (3). Compare and update will end up working like a distributed lock anyway, because the db model will have to be changed to have an initializing state, and then if initializing fails, you'll have to have a timeout.. and stealing for stuck processes. Sometimes a distributed lock is actually a simpler solution.

Tooz will need work, no doubt. Perhaps if we call it 'oslo.locking' it will make more sense. Anyway, my point stands: trust the experts, avoid reinventing locking. And if you don't like tooz, extract the locking code from Heat and turn it into an oslo.locking library or something.
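The "initializing state plus timeout and stealing" shape of compare-and-update can be sketched as follows. This is not Neutron code; sqlite3, the schema, and all names are invented to illustrate the point. One UPDATE claims the right to initialize, and a claim older than the timeout may be stolen:

```python
import sqlite3
import time

STALE_AFTER = 60.0  # seconds before a stuck claim may be stolen


def try_claim_init(conn, driver_id, now=None):
    """Claim the right to initialize a driver, stealing stuck claims.

    Returns True if this server won the claim. The single UPDATE is the
    atomic compare-and-update: it matches either an unclaimed row, or a
    claim whose timestamp is older than STALE_AFTER.
    """
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "UPDATE drivers SET state = 'initializing', claimed_at = ? "
            "WHERE id = ? AND (state = 'uninitialized' "
            "      OR (state = 'initializing' AND claimed_at < ?))",
            (now, driver_id, now - STALE_AFTER))
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drivers (id TEXT PRIMARY KEY, state TEXT, claimed_at REAL)")
conn.execute("INSERT INTO drivers VALUES ('sdn', 'uninitialized', 0)")

first = try_claim_init(conn, 'sdn')                          # wins the claim
second = try_claim_init(conn, 'sdn')                         # loses: claim is fresh
later = try_claim_init(conn, 'sdn', now=time.time() + 120)   # steals the stale claim
print(first, second, later)  # True False True
```

As Clint notes, once you add the intermediate state, the timeout, and the stealing, this behaves very much like a lock.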
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
i.e. 'optimistic locking' as opposed to the 'pessimistic locking' referenced in the 3rd link of the email starting the thread.

--
Kevin Benton
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
From: Jay Pipes jaypi...@gmail.com, July 30, 2014 at 09:59:15:
> Because you can't do a transaction from two different threads... The
> compare and update strategy is for avoiding the use of SELECT FOR UPDATE.

As a quick example of the optimistic locking you describe (UPDATE with WHERE clause) you can take a look at the Keystone "consume trust" logic: https://review.openstack.org/#/c/97059/14/keystone/trust/backends/sql.py

Line 93 does the initial query, then on lines 108 and 115 we do the update and check to see how many rows were affected.

Feel free to hit me up if I can help in any way on this.

Cheers,
Morgan

—
Morgan Fainberg
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On 07/30/2014 10:05 AM, Kevin Benton wrote:
> i.e. 'optimistic locking' as opposed to the 'pessimistic locking'
> referenced in the 3rd link of the email starting the thread.

No, there's no locking.
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Hello,

I stopped improving the vxlan population and SELECT FOR UPDATE removal [1] because I am not sure the current approach is the right one for handling vxlan/gre tenant pools:

1. Do we really need to populate vxlan/gre tenant pools? The neutron-server could also choose a vxlan VNI randomly from vni_ranges and try to allocate it, retrying until the allocation succeeds. I did not verify, but MAC address allocation should use the same principle. It is efficient as long as used_vnis is small enough (under ~50%) compared to the vni_ranges size. I am about to propose an update of neutron.plugins.ml2.drivers.helpers [2] in this direction.

2. Do we need to populate/update vxlan/gre tenant pools on neutron-server restart? A specific command could populate/update them (neutron-db-manage populate / neutron-db-populate).

Any thoughts?

[1] https://review.openstack.org/#/c/101982
[2] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/helpers.py

Cedric
ZZelle@IRC
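The random-allocation-with-retry idea Cedric describes can be sketched like this. A plain Python set stands in for the allocations table, and all names are invented; in the real driver the membership test and add would be a single INSERT whose unique constraint arbitrates concurrent allocations:

```python
import random


def allocate_vni(used_vnis, vni_range, max_tries=20):
    """Pick a random VNI from the range and retry on collision.

    While the pool is under ~50% used, the expected number of tries is
    below 2, so no pre-populated pool table is needed.
    """
    low, high = vni_range
    for _ in range(max_tries):
        vni = random.randint(low, high)
        if vni not in used_vnis:
            used_vnis.add(vni)
            return vni
    raise RuntimeError("VNI pool exhausted or heavily contended")


used = set()
vnis = [allocate_vni(used, (1, 100)) for _ in range(50)]
print(len(set(vnis)))  # 50 distinct VNIs
```

The trade-off is that as the pool fills past the halfway point, the retry count climbs, which is exactly the 50% efficiency bound mentioned above.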
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Using the UPDATE WHERE statement you described is referred to as optimistic locking. [1]

[1] https://docs.jboss.org/jbossas/docs/Server_Configuration_Guide/4/html/The_CMP_Engine-Optimistic_Locking.html

--
Kevin Benton
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On 07/30/2014 10:53 AM, Kevin Benton wrote:
> Using the UPDATE WHERE statement you described is referred to as
> optimistic locking. [1]

SQL != JBoss. In the database world, optimistic locking is an entirely separate animal: http://en.wikipedia.org/wiki/Lock_(database)

And what I am describing is not optimistic lock concurrency in databases.

-jay
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Maybe I misunderstood your approach then. I thought you were suggesting a node performs an UPDATE record WHERE record = last_state_node_saw query and then checks the number of affected rows. That's optimistic locking by every definition I've heard of it. It matches the following statement from the wiki article you linked to as well:

> The latter situation (optimistic locking) is only appropriate when there
> is less chance of someone needing to access the record while it is
> locked; otherwise it cannot be certain that the update will succeed
> because the attempt to update the record will fail if another user
> updates the record first.

Did I misinterpret how your approach works?

--
Kevin Benton
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On 07/30/2014 12:21 PM, Kevin Benton wrote:
> Did I misinterpret how your approach works?

The record is never locked in my approach, which is why I don't like to think of it as optimistic locking. It's more like optimistic read and update with retry if certain conditions continue to be met... :)

To be very precise, the record is never locked explicitly -- either through the use of SELECT FOR UPDATE or some explicit file or distributed lock. InnoDB won't even hold a lock on anything, as it will simply add a new version to the row using its MGCC (sometimes called MVCC) methods.

The technique I am showing in the patch relies on the behaviour of the SQL UPDATE statement with a WHERE clause that contains certain columns and values from the original view of the record. The UPDATE will be a NOOP when some other thread has updated the record in between the time the first thread read the record and the time it attempted to update it. The caller of UPDATE can detect this NOOP by checking the number of affected rows, and retry the UPDATE if certain conditions remain kosher...

So, there's actually no locks taken in the entire process, which is why I object to the term optimistic locking :) I think where the confusion has been is that the initial SELECT and the following UPDATE statements are *not* done in the context of a single SQL transaction...

Best,
-jay
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Excerpts from Jay Pipes's message of 2014-07-30 13:53:38 -0700:
> So, there's actually no locks taken in the entire process, which is why
> I object to the term optimistic locking :)

This is all true at a low level, Jay. But if you're serializing something outside the DB by using the "doing it" versus "done it" state, it still acts like a lock.
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
In fact there are more applications for distributed locking than just accessing data in database. One of such use cases is serializing access to devices. This is what is not yet hardly needed, but will be as we get more service drivers working with appliances. It would be great if some existing library could be adopted for it. Thanks, Eugene. On Thu, Jul 31, 2014 at 12:53 AM, Jay Pipes jaypi...@gmail.com wrote: On 07/30/2014 12:21 PM, Kevin Benton wrote: Maybe I misunderstood your approach then. I though you were suggesting where a node performs an UPDATE record WHERE record = last_state_node_saw query and then checks the number of affected rows. That's optimistic locking by every definition I've heard of it. It matches the following statement from the wiki article you linked to as well: The latter situation (optimistic locking) is only appropriate when there is less chance of someone needing to access the record while it is locked; otherwise it cannot be certain that the update will succeed because the attempt to update the record will fail if another user updates the record first. Did I misinterpret how your approach works? The record is never locked in my approach, why is why I don't like to think of it as optimistic locking. It's more like optimistic read and update with retry if certain conditions continue to be met... :) To be very precise, the record is never locked explicitly -- either through the use of SELECT FOR UPDATE or some explicit file or distributed lock. InnoDB won't even hold a lock on anything, as it will simply add a new version to the row using its MGCC (sometimes called MVCC) methods. The technique I am showing in the patch relies on the behaviour of the SQL UPDATE statement with a WHERE clause that contains certain columns and values from the original view of the record. 
The behaviour of the UPDATE statement will be a NOOP when some other thread has updated the record in between the time that the first thread read the record, and the time the first thread attempted to update the record. The caller of UPDATE can detect this NOOP by checking the number of affected rows, and retry the UPDATE if certain conditions remain kosher... So, there's actually no locks taken in the entire process, which is why I object to the term optimistic locking :) I think where the confusion has been is that the initial SELECT and the following UPDATE statements are *not* done in the context of a single SQL transaction... Best, -jay On Wed, Jul 30, 2014 at 11:07 AM, Jay Pipes jaypi...@gmail.com mailto:jaypi...@gmail.com wrote: On 07/30/2014 10:53 AM, Kevin Benton wrote: Using the UPDATE WHERE statement you described is referred to as optimistic locking. [1] https://docs.jboss.org/__jbossas/docs/Server___ Configuration_Guide/4/html/__The_CMP_Engine-Optimistic___Locking.html https://docs.jboss.org/jbossas/docs/Server_ Configuration_Guide/4/html/The_CMP_Engine-Optimistic_Locking.html SQL != JBoss. It's not optimistic locking in the database world. In the database world, optimistic locking is an entirely separate animal: http://en.wikipedia.org/wiki/__Lock_(database) http://en.wikipedia.org/wiki/Lock_(database) And what I am describing is not optimistic lock concurrency in databases. 
-jay

--
Kevin Benton
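The retry-on-NOOP technique Jay describes can be sketched in a few lines. This is an illustrative stand-in, not Neutron code: the `routers` table, the `bump_status` helper, and the retry limit are made up for the example, and sqlite3 from the standard library plays the role of the real database backend.

```python
# Sketch of compare-and-swap via UPDATE ... WHERE: read a row, then issue an
# UPDATE whose WHERE clause requires the columns to still hold the values we
# read, and inspect the affected-row count to detect a lost race.
import sqlite3


def bump_status(conn, router_id, new_status, max_retries=5):
    """Set a router's status, retrying if another writer races us."""
    for _ in range(max_retries):
        row = conn.execute(
            "SELECT status FROM routers WHERE id = ?", (router_id,)
        ).fetchone()
        if row is None:
            return False
        seen_status = row[0]
        # The WHERE clause includes the value we just read. If another
        # writer changed it in the meantime, this UPDATE is a NOOP and
        # rowcount is 0, so we loop and try again.
        cur = conn.execute(
            "UPDATE routers SET status = ? WHERE id = ? AND status = ?",
            (new_status, router_id, seen_status))
        conn.commit()
        if cur.rowcount == 1:
            return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE routers (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO routers VALUES (1, 'DOWN')")
conn.commit()
```

Note that, as Jay says, the SELECT and the UPDATE are deliberately not wrapped in one transaction; correctness comes from the rowcount check, not from isolation.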
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
Yes, we are talking about the same thing. I think the term 'optimistic locking' comes from what happens during the SQL transaction. The SQL engine converts the read (the WHERE clause) and the update (the UPDATE clause) into a single atomic operation. The atomicity guarantee requires an internal lock in the database on the record after it is found by the WHERE clause but before the UPDATE is completed, to prevent a simultaneous query with the same WHERE clause from updating the record at the same time. So the lock (or some serialization strategy) is still there, but it's just buried deep in the SQL engine internals, where it belongs. :-)

On Jul 30, 2014 2:00 PM, Jay Pipes jaypi...@gmail.com wrote: [snip]
--
Kevin Benton
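For contrast with the lock-free approach being debated, the cross_server_locks table proposed at the start of the thread could look roughly like the sketch below. The schema, function names, and timeout values are illustrative assumptions, not a settled design; sqlite3 from the standard library stands in for the real backend. Using an INSERT against a primary key as the acquire step makes the "check then acquire" of steps 1-2 a single atomic operation, which works the same on MySQL, Galera, and PostgreSQL.

```python
# Sketch of a cross-server lock table: acquiring a lock is an INSERT into a
# table whose PRIMARY KEY is the lock name, so a second acquirer gets an
# integrity error instead of a duplicate row -- no SELECT FOR UPDATE needed.
import sqlite3
import time


def acquire_lock(conn, lock_name, timeout=5.0, interval=0.1):
    """Try to acquire lock_name, retrying until the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO cross_server_locks (lock_name) VALUES (?)",
                    (lock_name,))
            return True
        except sqlite3.IntegrityError:
            time.sleep(interval)  # lock held elsewhere; wait and retry
    return False


def release_lock(conn, lock_name):
    with conn:
        conn.execute("DELETE FROM cross_server_locks WHERE lock_name = ?",
                     (lock_name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cross_server_locks (lock_name TEXT PRIMARY KEY)")
```

A real implementation would also need an owner/expiry column so that locks left behind by a crashed server can be reaped, which is exactly the kind of subtlety the later messages warn about.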
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On 07/30/2014 02:29 PM, Kevin Benton wrote: Yes, we are talking about the same thing. I think the term 'optimistic locking' comes from what happens during the SQL transaction. The SQL engine converts the read (the WHERE clause) and the update (the UPDATE clause) into a single atomic operation. The atomicity guarantee requires an internal lock in the database on the record after it is found by the WHERE clause but before the UPDATE is completed, to prevent a simultaneous query with the same WHERE clause from updating the record at the same time. So the lock (or some serialization strategy) is still there, but it's just buried deep in the SQL engine internals, where it belongs. :-)

Gah, let us resolve to agree then :) Yes, a mutex is held on a section/page of the write-ahead transaction log for a moment in order to guarantee durability, so sure, yes, there is a lock held. Just not on the record itself :P

-jay

On Jul 30, 2014 2:00 PM, Jay Pipes jaypi...@gmail.com wrote: [snip]
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
I'll just start by saying I'm not the expert in what should be the solution for Neutron here (this is its developers' ultimate decision), but I just wanted to add my thoughts.

Jay's solution looks/sounds like a spin lock with a test-and-set [1] (imho still a lock, no matter the makeup you put on it). It seems similar in concept to https://review.openstack.org/#/c/97059/ which I also saw recently. I'm starting to feel, though, that we should be figuring out how to use a proven-correct locking mechanism (kazoo - zookeeper; tooz - memcache, redis, or zookeeper...) and avoid the premature optimization that we seem to be falling into when creating our own types of spin locks, optimistic locks, and so on...

I'd much rather prefer correctness that *might* be a little slower over a solution that is hard to debug, hard to reason about, and requires retry magic numbers/hacks (for example, that prior keystone review has a magic 10-iteration limit; after all, who really knows what that magic number should be...), especially in cases where it is really necessary (I can't qualify to say whether this Neutron situation is appropriate for this). Maybe this is the appropriate time to focus on correct (maybe slower, maybe requiring zookeeper or redis...) solutions instead of reinventing another solution that we will regret in the future. I'd rather not put my operators through hell (they will be the ones left in the middle of the night trying to figure out why the lock didn't lock) when I can avoid it...

Just my 2 cents,

[1] http://en.wikipedia.org/wiki/Test-and-set

-Josh

On Jul 30, 2014, at 1:53 PM, Jay Pipes jaypi...@gmail.com wrote: [snip]
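Josh's characterization of the UPDATE-and-check-rowcount pattern as "a spin lock with a test-and-set" can be illustrated in-process. Python has no atomic test-and-set primitive, so a threading.Lock emulates the atomicity of the hardware instruction here; the class and its names are illustrative only, not from any of the reviews under discussion.

```python
# A minimal spin lock built on (emulated) test-and-set: atomically read the
# flag and set it, and keep spinning while the old value was already set.
import threading
import time


class SpinLock:
    def __init__(self):
        self._flag = False
        self._guard = threading.Lock()  # stands in for atomic TAS

    def _test_and_set(self):
        """Atomically read the flag and set it; return the old value."""
        with self._guard:
            old = self._flag
            self._flag = True
            return old

    def acquire(self, timeout=1.0, interval=0.01):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if not self._test_and_set():
                return True  # flag was clear; we now hold the lock
            time.sleep(interval)  # spin: someone else holds it
        return False

    def release(self):
        with self._guard:
            self._flag = False
```

The shape is the same as the database version: attempt an atomic set, detect failure, sleep, retry until some magic timeout, which is exactly the retry-tuning problem Josh is objecting to.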
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
It's not about distributed locking. It's about allowing multiple threads to make some sort of progress in the face of a contentious piece of code. Obstruction-free and lock-free algorithms are preferred, IMO, over lock-based solutions that sit there and block while something else is doing something. And yeah, I'm using the term lock-free here when, in fact, there is the possibility of a lock being held for a very short amount of time in the low-level storage engine code...

That said, I completely agree with you on using existing technologies and not reinventing wheels where appropriate. If I needed a distributed lock, I wouldn't reinvent one in Python ;)

Best, -jay

On 07/30/2014 03:16 PM, Joshua Harlow wrote: [snip]
Re: [openstack-dev] [neutron] Cross-server locking for neutron server
On Jul 30, 2014, at 3:27 PM, Jay Pipes jaypi...@gmail.com wrote: It's not about distributed locking. It's about allowing multiple threads to make some sort of progress in the face of a contentious piece of code.

Sure: multiple threads, multiple processes, multiple remote/distributed/cooperating processes (all pretty much the same imho, at least at a high level), all trying to get their work done (like good little worker bees).

Obstruction-free and lock-free algorithms are preferred, IMO, over lock-based solutions that sit there and block while something else is doing something.

Understandable; preferring the ones that can move forward under contention is great. It's just that experience, knowledge, and the people who have implemented them tell us that lock-free solutions are usually many times harder to get right and to reason about than just sitting there and blocking. Both can work. I'm just more of a fan of making it work right (keyword: `right`) first (using whichever solution is fine with the Neutron folks), then coming back and making it work better as needed.

And yeah, I'm using the term lock-free here when, in fact, there is the possibility of a lock being held for a very short amount of time in the low-level storage engine code... That said, I completely agree with you on using existing technologies and not reinventing wheels where appropriate. If I needed a distributed lock, I wouldn't reinvent one in Python ;)

+2

Best, -jay

I'll go back to being quiet now, :-P

-Josh

On 07/30/2014 03:16 PM, Joshua Harlow wrote: [snip]