Re: [openstack-dev] [Neutron] DB: transaction isolation and related questions

Clint Byrum Wed, 19 Nov 2014 13:16:35 -0800

Excerpts from Mike Bayer's message of 2014-11-19 10:05:35 -0800:
> 
> > On Nov 18, 2014, at 1:38 PM, Eugene Nikanorov <[email protected]> 
> > wrote:
> > 
> > Hi neutron folks,
> > 
> > There is an ongoing effort to refactor some neutron DB logic to be 
> > compatible with galera/mysql which doesn't support locking 
> > (with_lockmode('update')).
> > 
> > Some code paths that used locking in the past were rewritten to retry the 
> > operation if they detect that an object was modified concurrently.
> > The problem here is that all DB operations (CRUD) are performed in the 
> > scope of some transaction that makes complex operations to be executed in 
> > atomic manner.
> > For mysql the default transaction isolation level is 'REPEATABLE READ' 
> > which means that once the code issue a query within a transaction, this 
> > query will return the same result while in this transaction (e.g. the 
> > snapshot is taken by the DB during the first query and then reused for the 
> > same query).
> > In other words, the retry logic like the following will not work:
> > 
> > def allocate_obj():
> >     with session.begin(subtrans=True):
> >          for i in xrange(n_retries):
> >               obj = session.query(Model).filter_by(filters)
> >               count = session.query(Model).filter_by(id=obj.id 
> > <http://obj.id/>).update({'allocated': True})
> >               if count:
> >                    return obj
> > 
> > since usually methods like allocate_obj() is called from within another 
> > transaction, we can't simply put transaction under 'for' loop to fix the 
> > issue.
> 
> has this been confirmed?  the point of systems like repeatable read is not 
> just that you read the “old” data, it’s also to ensure that updates to that 
> data either proceed or fail explicitly; locking is also used to prevent 
> concurrent access that can’t be reconciled.  A lower isolation removes these 
> advantages.  
>


Yes this is confirmed and fails reliably on Galera based systems.

> I ran a simple test in two MySQL sessions as follows:
> 
> session 1:
> 
> mysql> create table some_table(data integer) engine=innodb;
> Query OK, 0 rows affected (0.01 sec)
> 
> mysql> insert into some_table(data) values (1);
> Query OK, 1 row affected (0.00 sec)
> 
> mysql> begin;
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> select data from some_table;
> +------+
> | data |
> +------+
> |    1 |
> +------+
> 1 row in set (0.00 sec)
> 
> 
> session 2:
> 
> mysql> begin;
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> update some_table set data=2 where data=1;
> Query OK, 1 row affected (0.00 sec)
> Rows matched: 1  Changed: 1  Warnings: 0
> 
> then back in session 1, I ran:
> 
> mysql> update some_table set data=3 where data=1;
> 
> this query blocked;  that’s because session 2 has placed a write lock on the 
> table.  this is the effect of repeatable read isolation.

With Galera this session might happen on another node. There is no
distributed lock, so this would not block...

> 
> while it blocked, I went to session 2 and committed the in-progress 
> transaction:
> 
> mysql> commit;
> Query OK, 0 rows affected (0.00 sec)
> 
> then session 1 unblocked, and it reported, correctly, that zero rows were 
> affected:
> 
> Query OK, 0 rows affected (7.29 sec)
> Rows matched: 0  Changed: 0  Warnings: 0
> 
> the update had not taken place, as was stated by “rows matched":
> 
> mysql> select * from some_table;
> +------+
> | data |
> +------+
> |    1 |
> +------+
> 1 row in set (0.00 sec)
> 
> the code in question would do a retry at this point; it is checking the 
> number of rows matched, and that number is accurate.
> 
> if our code did *not* block at the point of our UPDATE, then it would have 
> proceeded, and the other transaction would have overwritten what we just did, 
> when it committed.   I don’t know that read committed is necessarily any 
> better here.
> 
> now perhaps, with Galera, none of this works correctly.  That would be a 
> different issue in which case sure, we should use whatever isolation is 
> recommended for Galera.  But I’d want to potentially peg it to the fact that 
> Galera is in use, or not.
> 
> would love also to hear from Jay Pipes on this since he literally wrote the 
> book on MySQL ! :)

What you missed is that with Galera the commit that happened last would
be rolled back. This is a reality in many scenarios on SQL databases and
should be handled _regardless_ of Galera. It is a valid way to handle
deadlocks on single node DBs as well (pgsql will do this sometimes).

One simply cannot rely on multi-statement transactions to always succeed.

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [Neutron] DB: transaction isolation and related questions

Reply via email to