Eugene, I just spent about an hour playing around with an example
scenario in both MySQL and PostgreSQL using READ COMMITTED and
REPEATABLE READ isolation levels. The scenario I used had a single row
being updated, with a loop and a check on rows affected.
*You are 100% correct that setting the transaction isolation level to
READ COMMITTED works in the retry loop*.
I stand corrected, and humbled :) Please accept my apologies.
One thing I did note, though, is that changing the isolation level of an
*already-started transaction* does not change the current transaction's
isolation level -- the new isolation level only takes effect once the
previously started transaction is committed or rolled back. So,
regarding line 107 in your proposed patch here:
https://review.openstack.org/#/c/129288/5/neutron/plugins/ml2/drivers/helpers.py
From what I could find out in my research, the setting of the isolation
level needs to be done *outside* of the session.begin() call, otherwise
the isolation level will not take effect until that transaction is
committed or rolled back. Of course, if SQLAlchemy is doing some
auto-commit or something in the session, then you may not see this
effect, but I certainly was able to see this in my testing in MySQL
client sessions... so I'm a little perplexed as to how your code works
on already-started transactions. The documentation on the MySQL site
backs up what I observed:
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
"...the statement sets the default transaction level for all subsequent
transactions performed within the current session."
All the best, and thanks for the informative lesson of the week!
-jay
On 11/21/2014 03:24 AM, Eugene Nikanorov wrote:
Comments inline:
On Thu, Nov 20, 2014 at 4:34 AM, Jay Pipes <jaypi...@gmail.com> wrote:
So while the SELECTs may return different data on successive calls
when you use the READ COMMITTED isolation level, the UPDATE
statements will continue to return 0 rows affected **if they attempt
to change rows that have been changed since the start of the
transaction**
The reason that changing the isolation level to READ COMMITTED
appears to work for the code in question:
https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/helpers.py#L98
is likely because the SELECT ... LIMIT 1 query is returning a
different row on successive attempts (though since there is no ORDER
BY on the query, the returned row of the query is entirely
unpredictable (line 112)). Since data from that returned row is used
in the UPDATE statement (line 118 and 124), *different* rows are
actually being changed by successive UPDATE statements.
Not really: we're updating the same row we selected. That's ensured by
'raw_segment', which actually contains the 'gre_id' (or similar) attribute.
So in each iteration we're working with the same row, but in different
iterations READ COMMITTED allows us to see different data and hence work
with a different row.
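For reference, the select-then-guarded-update pattern being discussed can be sketched with a toy table. This is purely illustrative (sqlite, invented column names, single connection so no real race can occur); the actual helpers.py code runs SQLAlchemy against the Neutron schema:

```python
import sqlite3

# Toy stand-in for a segment-allocation table; schema and names are
# invented for illustration, not the real Neutron tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE segments (gre_id INTEGER PRIMARY KEY, allocated INTEGER)")
conn.executemany("INSERT INTO segments VALUES (?, 0)", [(1,), (2,), (3,)])
conn.commit()

def allocate_segment(conn, retries=5):
    """SELECT a free row (LIMIT 1, no ORDER BY, as in the code under
    discussion), then try to claim exactly that row with a guarded
    UPDATE. 0 rows affected means a concurrent writer claimed it
    between our SELECT and UPDATE, so we re-SELECT and retry."""
    for _ in range(retries):
        row = conn.execute(
            "SELECT gre_id FROM segments WHERE allocated = 0 LIMIT 1").fetchone()
        if row is None:
            return None  # nothing left to allocate
        cur = conn.execute(
            "UPDATE segments SET allocated = 1 "
            "WHERE gre_id = ? AND allocated = 0", (row[0],))
        # End this attempt's transaction so the next SELECT sees fresh data.
        conn.commit()
        if cur.rowcount == 1:
            return row[0]  # claimed it
        # rowcount == 0: lost the race on this row; loop and re-SELECT
    return None

print(allocate_segment(conn))  # claims one of the free gre_ids
```

In this single-connection demo the retry branch never fires; the point is only to show where the rows-affected check and the per-attempt commit sit.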
What this means is that for this *very particular* case, setting the
transaction isolation level to READ COMMITTED will work presumably
most of the time on MySQL, but it's not an appropriate solution for
the generalized problem domain of the SELECT FOR UPDATE. If you need
to issue a SELECT and an UPDATE in a retry loop, and you are
attempting to update the same row or rows (for instance, in the
quota reservation or resource allocation scenarios), this solution
will not work, even with READ COMMITTED. This is why I say it's not
really appropriate, and a better general solution is to use separate
transactions for each loop in the retry mechanic.
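A sketch of that recommendation for the same-row case (e.g. a quota-style counter): each attempt is its own short transaction, and the UPDATE is guarded by the value that was just read, compare-and-swap style. Again a toy sqlite schema with invented names, not the actual oslo or Neutron code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotas (resource TEXT PRIMARY KEY, in_use INTEGER)")
conn.execute("INSERT INTO quotas VALUES ('ports', 0)")
conn.commit()

def reserve(conn, amount, retries=5):
    """Each attempt reads the current value and issues an UPDATE guarded
    by that value, then commits. 0 rows affected means a concurrent
    writer changed the row; re-read in a fresh transaction and retry,
    instead of looping inside one long-lived transaction."""
    for _ in range(retries):
        (in_use,) = conn.execute(
            "SELECT in_use FROM quotas WHERE resource = 'ports'").fetchone()
        cur = conn.execute(
            "UPDATE quotas SET in_use = ? "
            "WHERE resource = 'ports' AND in_use = ?",
            (in_use + amount, in_use))
        conn.commit()  # close this attempt's transaction before retrying
        if cur.rowcount == 1:
            return in_use + amount
    raise RuntimeError("too much contention")

print(reserve(conn, 2))  # -> 2
```

Because every attempt begins a new transaction, the re-read sees committed data regardless of whether the backend defaults to REPEATABLE READ or READ COMMITTED, which is the point of the suggestion above.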
By saying 'this solution will not work', which issues exactly do you mean?
Btw, I agree on using a separate transaction for each loop; the problem
is that the transaction is usually not 'local' to the method where the
retry loop resides.
The issue is about doing the retry within a single transaction.
That's not what I recommend doing. I recommend doing short, separate
transactions instead of long-lived, multi-statement
transactions and relying on the behaviour of the DB's isolation
level (default or otherwise) to "solve" the problem of reading
changes to a record that you intend to update.
"instead of long-lived, multi-statement transactions" - that's really
what would require quite a large code redesign.
So far, finding a way to move the retry logic higher up the stack, above
the nested transactions, seems more appropriate.
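One shape for moving the retry above the transaction is a wrapper at the top of the call stack. This is a hypothetical sketch with invented names; it assumes the wrapped function opens and commits its own complete transaction on each call, and raises when its guarded UPDATE matched no rows:

```python
import functools

class StaleDataError(Exception):
    """Raised by a transactional function when its guarded UPDATE
    affected 0 rows (a concurrent writer got there first)."""

def retry_on_stale(attempts=5):
    """Re-invoke the wrapped function on StaleDataError. Because the
    function is expected to begin and commit its own short transaction
    per call, every retry sees freshly committed data."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except StaleDataError:
                    if attempt == attempts - 1:
                        raise
        return wrapper
    return decorator

# Toy usage: simulate losing the race twice, then succeeding.
calls = []

@retry_on_stale(attempts=5)
def allocate():
    calls.append(1)
    if len(calls) < 3:
        raise StaleDataError()
    return "allocated"

print(allocate())  # -> allocated, after 3 attempts
```

The transaction management stays inside the wrapped method; only the loop moves up, which is roughly the redesign being described.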
Thanks,
Eugene.
Cheers,
-jay
Also, thanks Clint for the clarification about the example scenario
described by Mike Bayer.
Initially the issue was discovered with concurrent tests in a
multi-master environment with Galera as the DB backend.
Thanks,
Eugene
On Thu, Nov 20, 2014 at 12:20 AM, Mike Bayer <mba...@redhat.com> wrote:
On Nov 19, 2014, at 3:47 PM, Ryan Moats <rmo...@us.ibm.com> wrote:
> BTW, I view your examples from oslo as helping make my argument for me
> (and I don't think that was your intent :) )
I disagree with that, as IMHO the differences in producing MM in the app
layer against arbitrary backends (PostgreSQL vs. DB2 vs. MariaDB vs. ???)
will incur a lot more “bifurcation” than a system that targets only a
handful of existing MM solutions. The example I referred to in oslo.db is
dealing with distinct, non-MM backends. That level of DB-specific code and
more is a given if we are building an MM system against multiple backends
generically.

It’s not possible to say which approach would be better or worse at the
level of “how much database-specific application logic do we need”,
though in my experience, no matter what one is trying to do, the answer
is always “tons”; we’re dealing not just with databases but also Python
drivers that have a vast amount of differences in behaviors, at every
level. On top of all of that, hand-rolled MM adds just that much more
application code to be developed and maintained, which also claims it
will do a better job than mature(ish?) database systems designed to do
the same job against a specific backend.
> > My reason for asking this question here is that if the community
> > wants to consider #2, then these problems are the place to start
> > crafting that solution - if we solve the conflicts inherent with the
> > two concurrent thread scenarios, then I think we will find that
> > we've solved the multi-master problem essentially "for free”.
>
> Maybe I’m missing something: if we learn how to write out a row such
> that a concurrent transaction against the same row doesn’t throw us
> off, where is the part where that data is replicated atomically to
> databases running concurrently on other IP numbers, and how does that
> come out of that effort “for free”? A home-rolled “multi master”
> scenario would have to start with a system that has multiple
> create_engine() calls, since we need to communicate directly to
> multiple database servers. From there it gets really crazy. Where’s
> all that?
Boiled down, what you are talking about here w.r.t. concurrent
transactions is really conflict resolution, which is the hardest part of
implementing multi-master (as a side note, using locking in this case is
the equivalent of option #1).

All I wished to point out is that there are other ways to solve the
conflict resolution that could then be leveraged into a multi-master
scenario.

As for the parts that I glossed over, once conflict resolution is
separated out, replication turns into a much simpler problem with
well-understood patterns, and so I view that part as coming "for free."

Ryan
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev