Hi,

Also, it would be nice to confirm that the SELECT ... FOR UPDATE statements
really are what is causing the deadlocks. Since these show up as row lock
waits in the single-node case, pt-query-digest run against the slow log of
a single node can help determine this.
pt-query-digest /path/to/slow.log --order-by InnoDB_rec_lock_wait:sum > digest-rec_lock_waits.txt

It will show which statements waited the most for row locks; with
multi-node writing, these will most likely be the ones causing the
deadlock issues.

On Tue, May 20, 2014 at 2:27 PM, Peter Boros <peter.bo...@percona.com> wrote:
> Hi,
>
> I would like to shed some additional light on this for those who were
> not there. SELECT ... FOR UPDATE does lock on a single node, and as was
> pointed out earlier in this thread, a simple solution is to write to
> only one node at a time. HAProxy can be set up with both backends; see
> this blog post for an example:
> http://www.mysqlperformanceblog.com/2012/06/20/percona-xtradb-cluster-reference-architecture-with-haproxy/
>
> In a nutshell, and with a bit of oversimplification, Galera replicates
> in write sets. A write set is practically a row-based binary log event
> plus some metadata, which is good for two things: you can look at two
> write sets and tell whether they conflict, and you can look at a write
> set and a database and tell whether the write set is applicable to that
> database. At commit time, the transaction is transferred to all the
> other cluster nodes in parallel. On each remote node, the new
> transaction is compared to every other transaction waiting in the apply
> queue, and it is checked whether it is applicable to the database. If
> the transaction is not conflicting and it is applicable, it is queued,
> and the node signals back that the commit can proceed. There is a nice
> drawing about this here:
>
> http://www.percona.com/doc/percona-xtradb-cluster/5.6/features/multimaster-replication.html
>
> Because of this, the locks taken by SELECT ... FOR UPDATE are not
> replicated. Between nodes, Galera uses optimistic locking: we assume
> that there will be no conflicts during the certification process
> described above. If there is a conflict, the transaction is rolled back
> on the originating node, and this is when you receive the error message
> in question. A failed transaction is something that can happen at any
> time with any database engine that has any "interesting feature", and
> when a transaction fails, the application should know what to do with
> it. A conflict that in the single-node case would be a wait on row
> locks becomes, with Galera replication, a rollback. A rollback is a
> much more expensive operation (data has to be copied back from undo),
> so if there are lots of failures like this, performance will suffer.
>
> So this is not a deadlock in the classical sense. Still, InnoDB can
> roll back a transaction at any time because of a deadlock (any database
> engine can do that, including PostgreSQL), and the application should
> be able to handle this.
>
> As was noted earlier, writing to only one node at a time is a good way
> to avoid this. With multiple nodes written, storage engine level writes
> will still happen on every node, because every node has the whole data
> set. Writing to multiple nodes can be beneficial because parsing SQL is
> much more expensive than just applying a row-based binary log event, so
> you can see some performance improvement if all nodes are written.
>
> I would discourage using any type of multi-master replication without
> understanding how conflict resolution works in the chosen solution.
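A quick note to make the "the application should know what to do with it"
part above concrete: the certification failure is reported to the client
as an ordinary deadlock error, so the usual answer is to retry the whole
transaction. A minimal, hypothetical sketch with SQLAlchemy follows; the
run_with_retries/session_factory/work names and the retry and backoff
values are made up for the example, not taken from any existing code.

    import time
    from sqlalchemy.exc import OperationalError

    MYSQL_ER_LOCK_DEADLOCK = 1213  # "Deadlock found when trying to get lock"

    def run_with_retries(session_factory, work, max_retries=3):
        # 'work' performs all statements of one transaction on the session
        # it is given; it must be safe to execute again from the beginning.
        for attempt in range(max_retries):
            session = session_factory()
            try:
                result = work(session)
                session.commit()
                return result
            except OperationalError as exc:
                session.rollback()
                # For most MySQL drivers, exc.orig.args[0] is the error code.
                last_try = attempt == max_retries - 1
                if exc.orig.args[0] != MYSQL_ER_LOCK_DEADLOCK or last_try:
                    raise
                time.sleep(0.1 * (attempt + 1))  # brief backoff, then retry
            finally:
                session.close()

Anything like this has to wrap the whole transaction, not individual
statements, since the whole transaction is what gets rolled back.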
> In case of Galera, if row locks were replicated over the network, it
> would act the same way as a single server, but it would be really
> slow. If SELECT ... FOR UPDATE is only used to achieve consistent
> reads (read your own writes), that can be achieved with
> wsrep_causal_reads. I am happy to help with avoiding SELECT ... FOR
> UPDATE if somebody can tell me the use cases.
>
> On Tue, May 20, 2014 at 10:53 AM, Julien Danjou <jul...@danjou.info> wrote:
>> On Mon, May 19 2014, Jay Pipes wrote:
>>
>>> I think at that point I mentioned that there were a number of places
>>> that were using the SELECT ... FOR UPDATE construct in Nova (in
>>> SQLAlchemy, it's the with_lockmode('update') modification of the
>>> query object). Peter promptly said that was a problem. MySQL Galera
>>> does not support SELECT ... FOR UPDATE, since it has no concept of
>>> cross-node locking of records, and results are non-deterministic.
>>
>> So you send a command that's not supported and the whole software
>> deadlocks? Is there a bug number about that or something? I cannot
>> understand how this can be possible and considered as something
>> normal (that's the feeling I have reading your mail, I may be wrong).
>>
>>> We have a number of options:
>>>
>>> 1) Stop using MySQL Galera for databases of projects that contain
>>> with_lockmode('update')
>>>
>>> 2) Put a big old warning in the docs somewhere about the problem of
>>> potential deadlocks or odd behaviour with Galera in these projects
>>>
>>> 3) For Nova and Neutron, remove the use of with_lockmode('update')
>>> and instead use a coarse-grained file lock or a distributed lock
>>> manager for those areas where we need deterministic reads or
>>> quiescence.
>>>
>>> 4) For the Nova db quota driver, refactor the driver to either use a
>>> non-locking method for reservation and quota queries, or move the
>>> driver out into its own project (or use something like Climate and
>>> make sure that Climate uses a non-blocking algorithm for those
>>> queries...)
>>>
>>> Thoughts?
>>
>> 5) Stop leveling down our development, and rely on and leverage a
>> powerful RDBMS that provides interesting features, such as
>> PostgreSQL.
>>
>> Sorry, had to say it, but it's pissing me off to see the low quality
>> of the work that is done around SQL in OpenStack.
>>
>> --
>> Julien Danjou
>> /* Free Software hacker
>>    http://julien.danjou.info */
>
> --
> Peter Boros, Consultant, Percona
> Telephone: +1 888 401 3401 ext 546
> Emergency: +1 888 401 3401 ext 911
> Skype: percona.pboros

--
Peter Boros, Consultant, Percona
Telephone: +1 888 401 3401 ext 546
Emergency: +1 888 401 3401 ext 911
Skype: percona.pboros
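P.S. To make the wsrep_causal_reads suggestion above a bit more concrete:
enabling it for a session makes a read wait until the node has applied
everything that was committed in the cluster before the read started,
which gives read-your-own-writes without taking row locks. It does not
give the mutual exclusion that SELECT ... FOR UPDATE provides, only
consistent reads. A minimal sketch with SQLAlchemy; the connection URL
and the table and column names are purely illustrative:

    from sqlalchemy import create_engine, text

    # Hypothetical URL; any node of the cluster can be used.
    engine = create_engine("mysql://user:password@node1/nova")

    with engine.connect() as conn:
        # Wait for already-replicated write sets before reading, instead
        # of locking rows with SELECT ... FOR UPDATE.
        conn.execute(text("SET SESSION wsrep_causal_reads = 1"))
        row = conn.execute(
            text("SELECT hard_limit FROM quotas WHERE project_id = :p"),
            {"p": "some-project"},
        ).fetchone()

If the read also needs to block concurrent writers, this is not a
replacement; that case needs something like the file lock or distributed
lock manager mentioned in option 3 above.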