I think at that point I mentioned that there were a number of places that
were using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's
the with_lockmode('update') modification of the query object). Peter
promptly said that was a problem. MySQL Galera does not support SELECT ...
FOR UPDATE, since it has no concept of cross-node locking of records and
results are non-deterministic.

So you send a command that's not supported and the whole software
deadlocks? Is there a bug number about that or something? I cannot
understand how this can be possible and considered as something normal
(that's the feeling I have reading your mail, I may be wrong).

Yes, you entirely misread the email.

The whole system does not deadlock -- in fact, it's not even a deadlock
that is causing the problem, as you would have known had you read the
email. The error is reported as a deadlock, but it is actually a timeout
failure to certify the working set, which is not the same thing as a
deadlock.
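
To make the distinction concrete: because the failure is a failed
certification rather than two transactions waiting on each other, the usual
way to cope with it is to re-run the whole transaction. Below is a minimal
sketch of that idea, not something this thread prescribes. DeadlockError,
retry_on_deadlock and reserve are hypothetical names made up for the
illustration; a real deployment would catch whatever exception its DB driver
raises for the "deadlock" error.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's 'deadlock' exception."""


def retry_on_deadlock(max_attempts=3, delay=0.0):
    """Re-run the wrapped transaction when it fails with DeadlockError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_attempts:
                        raise  # out of attempts, let the caller see it
                    time.sleep(delay * attempt)  # simple linear back-off
        return wrapper
    return decorator


attempts = []  # records each call so the demo can be inspected


@retry_on_deadlock(max_attempts=3)
def reserve():
    """Pretend transaction that fails certification twice, then commits."""
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockError("working set failed certification")
    return "committed"


result = reserve()
```

Note that a retry only helps if the wrapped function re-reads its data on
every attempt; replaying a write computed from a stale read would just
reintroduce the race.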

We have a number of options:

1) Stop using MySQL Galera for databases of projects that contain

2) Put a big old warning in the docs somewhere about the problem of
potential deadlocks or odd behaviour with Galera in these projects

3) For Nova and Neutron, remove the use of with_lockmode('update') and
instead use a coarse-grained file lock or a distributed lock manager for
those areas where we need deterministic reads or quiescence.

4) For the Nova db quota driver, refactor the driver to either use a
non-locking method for reservation and quota queries or move the driver out
into its own projects (or use something like Climate and make sure that
Climate uses a non-blocking algorithm for those queries...)


5) Stop leveling down our development, and rely on and leverage a powerful
RDBMS that provides interesting features, such as PostgreSQL.
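
As a rough illustration of what a "non-locking method" for reservations
(option 4 above) could look like, here is a compare-and-swap style update:
read the current usage without a lock, then write it back only if nobody
changed it in the meantime, and retry otherwise. This is a sketch under
assumptions, not Nova's actual quota driver. The quota_usages table, its
columns and the reserve function are hypothetical, and sqlite3 is used only
so that the example is self-contained.

```python
import sqlite3


def reserve(conn, project_id, amount, max_attempts=5):
    """Try to reserve `amount` units without SELECT ... FOR UPDATE."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT in_use, hard_limit FROM quota_usages "
            "WHERE project_id = ?",
            (project_id,),
        ).fetchone()
        if row is None:
            raise KeyError(project_id)
        in_use, hard_limit = row
        if in_use + amount > hard_limit:
            return False  # over quota, nothing written
        # The write only matches if in_use is still the value we read.
        cur = conn.execute(
            "UPDATE quota_usages SET in_use = ? "
            "WHERE project_id = ? AND in_use = ?",
            (in_use + amount, project_id, in_use),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True  # our write won
        # Someone else updated the row first: loop and re-read.
    raise RuntimeError("too much contention, giving up")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quota_usages ("
    "project_id TEXT PRIMARY KEY, in_use INTEGER, hard_limit INTEGER)"
)
conn.execute("INSERT INTO quota_usages VALUES ('demo', 8, 10)")
conn.commit()

first = reserve(conn, "demo", 2)   # fits: 8 + 2 <= 10
second = reserve(conn, "demo", 1)  # would exceed the limit
in_use = conn.execute(
    "SELECT in_use FROM quota_usages WHERE project_id = 'demo'"
).fetchone()[0]
```

The point of the design is that the conflict check happens in the WHERE
clause of the UPDATE itself, so correctness does not depend on a row lock
being held between the read and the write.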

For the record, there's nothing about this that affects PostgreSQL
deployments. There's also little in the PostgreSQL community that will help
anyone with write load balancing, and nothing that supports the kind of
thing MySQL Galera supports -- synchronous working-set replication.

So, instead of being a snarky person who thinks anything that doesn't use
PostgreSQL is worthless, how about just letting those of us who work with
multiple DBs talk about solving a problem?

Sorry, had to say it, but it's pissing me off to see the low quality of
the work that is done around SQL in OpenStack.

Hmm, this coming from Ceilometer...


