Please lets not disregard zookeeper (zab based) or etc.d or other raft[1] based 
projects.

I'd also not rather bet on concoord (which seems to be maintained as a 
university project by 2 people @ cornell).

Even if zookeeper is java the constructs provided by kazoo[3] should be 
directly usable (tooz[2] provides nice helpers for these and we should consider 
making tooz better if it doesn't fit imho).

In my opinion we should focus on tooz maturation and having provably correct 
drivers in it for lock management (integrate zookeeper or etc.d or other impls 
at this level) and let deployers decide what they are comfortable using (I'd be 
ok with zookeeper, as it's already well used here at yahoo).

[1] http://raftconsensus.github.io/
[2] https://github.com/stackforge/tooz
[3] https://github.com/python-zk/kazoo/tree/master/kazoo/recipe

From: Salvatore Orlando <sorla...@nicira.com<mailto:sorla...@nicira.com>>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org<mailto:openstack-dev@lists.openstack.org>>
Date: Monday, May 19, 2014 at 3:26 PM
To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org<mailto:openstack-dev@lists.openstack.org>>
Subject: Re: [openstack-dev] [nova][neutron][mysql] IMPORTANT: MySQL Galera 
does *not* support SELECT ... FOR UPDATE

Some comments inline.

Salvatore


On 19 May 2014 20:32, sridhar basam 
<sridhar.ba...@gmail.com<mailto:sridhar.ba...@gmail.com>> wrote:



On Mon, May 19, 2014 at 1:30 PM, Jay Pipes 
<jaypi...@gmail.com<mailto:jaypi...@gmail.com>> wrote:
Stackers,

On Friday in Atlanta, I had the pleasure of moderating the database session at 
the Ops Meetup track. We had lots of good discussions and heard important 
feedback from operators on DB topics.

For the record, I would not bring this point up so publicly unless I believed 
it was a serious problem affecting a large segment of users. When doing an 
informal survey of the users/operators in the room at the start of the session, 
out of approximately 200 people in the room, only a single person was using 
PostgreSQL, about a dozen were using standard MySQL master/slave replication, 
and the rest were using MySQL Galera clustering. So, this is a real issue for a 
large segment of the operators -- or at least the ones at the session. :)


​We are one of those operators that use Galera for replicating our mysql 
databases. We used to  see issues with deadlocks when having multiple mysql 
writers in our mysql cluster. As a workaround we have our haproxy configuration 
in an active-standby configuration for our mysql VIP.

I seem to recall we had a lot of the deadlocks happen through Neutron. When we 
go through our Icehouse testing, we will redo our multimaster mysql setup and 
provide feedback on the issues we see.

The SELECT... FOR UPDATE issue is going to be a non trivial one for neutron as 
well. Some components, like IPAM, heavily rely on it.
However, Neutron is a lot more susceptible to deadlock problems than nova 
because it does not implement at the moment a retry mechanism.
This is something which should be added during the Juno release cycle 
regardless of all the other enhancement currently being planned, such as task 
oriented operations.

thanks,
         Sridhar


Peter Boros, from Percona, was able to provide some insight on MySQL Galera 
topics, and one issue came up that is likely the cause of a lot of heartache 
for operators who use MySQL Galera (or Percona XtraDB Cluster).

We were discussing whether people had seen deadlock issues [1] when using MySQL 
Galera in their deployment, and were brainstorming on why deadlocks might be 
seen. I had suggested that perhaps Nova's use of autoincrementing primary keys 
may have been the cause. Peter pretty quickly dispatched that notion, saying 
that Galera automatically handles autoincrementing keys using managed 
innodb_autoincrement_increment and innodb_autoincrement_offset config options.

I think at that point I mentioned that there were a number of places that were 
using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's the 
with_lockmode('update') modification of the query object). Peter promptly said 
that was a problem. MySQL Galera does not support SELECT ... FOR UPDATE, since 
it has no concept of cross-node locking of records and results are 
non-deterministic.

So... what to do?

For starters, some information on the use of with_lockmode() in Nova and 
Neutron...

Within Nova, there are actually only a few places where with_lockmode('update') 
is used. Unfortunately, the use of with_lockmode('update') is in the quota 
code, which tends to wrap largish blocks of code within the Nova compute 
execution code.

Within Neutron, however, the use of with_lockmode('update') is all over the 
place. There are 44 separate uses of it in 11 different files.


I will report on a separate thread on this, so that we can have an assessment 
of where locking statements are used and why.

We have a number of options:

I thin option 0 should be to rework/redesign the code, where possible, to avoid 
DB-level locking at all.


1) Stop using MySQL Galera for databases of projects that contain 
with_lockmode('update')

This looks hideous, but I am afraid this is what all people wishing to deploy 
Icehouse should consider doing.


2) Put a big old warning in the docs somewhere about the problem of potential 
deadlocks or odd behaviour with Galera in these projects

3) For Nova and Neutron, remove the use of with_lockmode('update') and instead 
use a coarse-grained file lock or a distributed lock manager for those areas 
where we need deterministic reads or quiescence.

We had an attempt at implementing a sort of distributed lock for neutron: 
https://review.openstack.org/#/c/34695/
Beyond the implementation reservations on this patch, one thing that should be 
noticed, probably needless to say, is that distributed coordination is 
something that should never be taken in a light-hearted way.
Once all the non-locking solution have been ruled out, distributed coordination 
among processes could be considered. In that case I think it might be better to 
use some OTS software rather than working out some home grown solution (I 
surely do not see space for a new project here)
On a side note, I'm rather ignorant on python frameworks for distributed 
coordination... concoord? Is zookeper something that should be ruled out 
because of language restrictions?



4) For the Nova db quota driver, refactor the driver to either use a 
non-locking method for reservation and quota queries or move the driver out 
into its own projects (or use something like Climate and make sure that Climate 
uses a non-blocking algorithm for those queries...)

Thoughts?

-jay

[1] http://lists.openstack.org/pipermail/openstack/2014-May/007202.html

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org<mailto:OpenStack-dev@lists.openstack.org>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org<mailto:OpenStack-dev@lists.openstack.org>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to