Re: [openstack-dev] [all][oslo.db] Repeatable Read considered harmful

2015-03-31 Thread Salvatore Orlando
On 31 March 2015 at 17:22, Mike Bayer  wrote:

>
>
> Eugene Nikanorov  wrote:
>
> > Hi Matthew,
> >
> > I'll add just 2c:
> >
> > We've tried to move from repeatable-read to read committed in the Neutron
> project.
> > This change actually caused multiple deadlocks during regular
> tempest test runs.
> > That is a known problem (the issue with eventlet and the current MySQL
> client library),
> > but anyway, at least one major openstack project is not ready to move to
> read-committed.
> >
> > Also, a particular transaction isolation level's performance is highly
> affected by the DB usage pattern.
> > Is there any research on how read-committed affects the performance of
> openstack projects?
>
> So I would add that I think the altering of transaction isolation level
> should be done on a per-method basis; that is, methods that definitely need
> the effects of a certain isolation level should run it locally, so that the
> change can be made safely without having to deal with moving the entire
> application space over to a new mode of operation. I’ve added methods to
> SQLAlchemy specifically to make this achievable at the ORM level in
> response
> to [1], documented at [2].
>

I totally agree that the transaction isolation level should be set on a
per-method basis, because ultimately it's impossible to define a single
rule that fits all cases.
Considering repeatable read a "harmful" mode is, in my opinion, an
exaggerated generalization. But I guess "considered harmful" has by now
become just a trendy and catchy mailing list subject.
Anyway, a wilful relaxation of transaction isolation is something that
should be handled carefully - and this is why explicitly setting the
desired isolation level for each transaction represents the optimal
solution.
As Eugene said, there are examples where globally lowering it triggers more
issues, and not just because of the infamous eventlet issue.

That said, I agree that the dichotomy between MySQL's and Postgres'
default isolation levels has always annoyed me: for every transaction we
write or review, we have to think carefully about whether it will be safe
under both read committed and repeatable read. From what I gather, the
changes in SQLAlchemy mentioned by Mike will also enable us to set an
application-specific isolation default, so applications will no longer
depend on the backend's default.
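
For illustration only, such an application-wide default could be pinned
at engine creation time, roughly like this (a minimal sketch, not actual
Neutron code; the URL is made up):

from sqlalchemy import create_engine

# Pin the isolation level when the engine is created, so behaviour no
# longer depends on whatever the backend happens to default to.
engine = create_engine(
    "mysql://user:secret@dbhost/neutron",
    isolation_level="READ COMMITTED",
)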

Salvatore



>
> The addition of specific isolation levels to enginefacade [3] will be
> straightforward and very clean. The API as it stands now, assuming
> decorator use, looks like:
>
> @enginefacade.writer
> def some_api_method(context):
>     context.session.do_something()
>
> Specifying a particular isolation level would look like this:
>
> @enginefacade.writer.with_read_committed
> def some_api_method(context):
>     context.session.do_something()
>
>
>
> [1] https://review.openstack.org/#/c/148339/
> [2]
> http://docs.sqlalchemy.org/en/rel_0_9/orm/session_transaction.html#setting-isolation-for-individual-transactions
> [3] https://review.openstack.org/#/c/138215/
>
>
>
>
> >
> > Thanks,
> > Eugene.
> >
> > On Fri, Feb 6, 2015 at 7:59 PM, Matthew Booth  wrote:
> > I was surprised recently to discover that MySQL uses repeatable read for
> > transactions by default. Postgres uses read committed by default, and
> > SQLite uses serializable. We don't set the isolation level explicitly
> > anywhere, so our applications are running under different isolation
> > levels depending on backend. This doesn't sound like a good idea to me.
> > It's one thing to support multiple sql syntaxes, but different isolation
> > levels have different semantics. Supporting that is much harder, and
> > currently we're not even trying.
> >
> > I'm aware that the same isolation level on different databases will
> > still have subtly different semantics, but at least they should agree on
> > the big things. I think we should pick one, and it should be read
> committed.
> >
> > Also note that 'repeatable read' on both MySQL and Postgres is actually
> > snapshot isolation, which isn't quite the same thing. For example, it
> > doesn't get phantom reads.
> >
> > The most important reason I think we need read committed is recovery
> > from concurrent changes within the scope of a single transaction. To
> > date, in Nova at least, this hasn't been an issue as transactions have
> > had an extremely small scope. However, we're trying to expand that scope
> > with the new enginefacade in oslo.db:
> > https://review.openstack.org/#/c/138215/ . With this expanded scope,
> > transaction failure in a library function can't simply be replayed
> > because the transaction scope is larger than the function.
> >
> > So, 3 concrete examples of how repeatable read will make Nova worse:
> >
> > * https://review.openstack.org/#/c/140622/
> >
> > This was committed to Nova recently. Note how it involves a retry in the
> > case of concurrent change. This works fine, because the retry creates a new transaction.

Re: [openstack-dev] [all][oslo.db] Repeatable Read considered harmful

2015-03-31 Thread Mike Bayer


Eugene Nikanorov  wrote:

> Hi Matthew,
> 
> I'll add just 2c:
> 
> We've tried to move from repeatable-read to read committed in the Neutron
> project.
> This change actually caused multiple deadlocks during regular tempest
> test runs.
> That is a known problem (the issue with eventlet and the current MySQL
> client library),
> but anyway, at least one major openstack project is not ready to move to 
> read-committed.
> 
> Also, a particular transaction isolation level's performance is highly
> affected by the DB usage pattern.
> Is there any research on how read-committed affects the performance of
> openstack projects?

So I would add that I think the altering of transaction isolation level
should be done on a per-method basis; that is, methods that definitely need
the effects of a certain isolation level should run it locally, so that the
change can be made safely without having to deal with moving the entire
application space over to a new mode of operation. I’ve added methods to
SQLAlchemy specifically to make this achievable at the ORM level in response
to [1], documented at [2].
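
For context, the ORM-level mechanism documented at [2] looks roughly
like this (a minimal sketch; 'engine' is assumed to be an existing
Engine):

from sqlalchemy.orm import Session

session = Session(bind=engine)
# The isolation level is applied to the connection as the session first
# procures it, and reverts once the connection is returned to the pool.
session.connection(execution_options={'isolation_level': 'READ COMMITTED'})
# ... ORM work here runs under READ COMMITTED ...
session.commit()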

The addition of specific isolation levels to enginefacade [3] will be
straightforward and very clean. The API as it stands now, assuming
decorator use, looks like:

@enginefacade.writer
def some_api_method(context):
    context.session.do_something()

Specifying a particular isolation level would look like this:

@enginefacade.writer.with_read_committed
def some_api_method(context):
    context.session.do_something()



[1] https://review.openstack.org/#/c/148339/
[2] 
http://docs.sqlalchemy.org/en/rel_0_9/orm/session_transaction.html#setting-isolation-for-individual-transactions
[3] https://review.openstack.org/#/c/138215/




> 
> Thanks,
> Eugene.
> 
> On Fri, Feb 6, 2015 at 7:59 PM, Matthew Booth  wrote:
> I was surprised recently to discover that MySQL uses repeatable read for
> transactions by default. Postgres uses read committed by default, and
> SQLite uses serializable. We don't set the isolation level explicitly
> anywhere, so our applications are running under different isolation
> levels depending on backend. This doesn't sound like a good idea to me.
> It's one thing to support multiple sql syntaxes, but different isolation
> levels have different semantics. Supporting that is much harder, and
> currently we're not even trying.
> 
> I'm aware that the same isolation level on different databases will
> still have subtly different semantics, but at least they should agree on
> the big things. I think we should pick one, and it should be read committed.
> 
> Also note that 'repeatable read' on both MySQL and Postgres is actually
> snapshot isolation, which isn't quite the same thing. For example, it
> doesn't get phantom reads.
> 
> The most important reason I think we need read committed is recovery
> from concurrent changes within the scope of a single transaction. To
> date, in Nova at least, this hasn't been an issue as transactions have
> had an extremely small scope. However, we're trying to expand that scope
> with the new enginefacade in oslo.db:
> https://review.openstack.org/#/c/138215/ . With this expanded scope,
> transaction failure in a library function can't simply be replayed
> because the transaction scope is larger than the function.
> 
> So, 3 concrete examples of how repeatable read will make Nova worse:
> 
> * https://review.openstack.org/#/c/140622/
> 
> This was committed to Nova recently. Note how it involves a retry in the
> case of concurrent change. This works fine, because the retry creates
> a new transaction. However, if the transaction was larger than the scope
> of this function this would not work, because each iteration would
> continue to read the old data. The solution to this is to create a new
> transaction. However, because the transaction is outside of the scope of
> this function, the only thing we can do locally is fail. The caller then
> has to re-execute the whole transaction, or fail itself.
> 
> This is a local concurrency problem which can be very easily handled
> locally, but not if we're using repeatable read.
> 
> *
> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L4749
> 
> Nova has multiple functions of this type which attempt to update a
> key/value metadata table. I'd expect to find multiple concurrency issues
> with this if I stopped to give it enough thought, but concentrating just
> on what's there, notice how the retry loop starts a new transaction. If
> we want to get to a place where we don't do that, with repeatable read
> we're left failing the whole transaction.
> 
> * https://review.openstack.org/#/c/136409/
> 
> This one isn't upstream, yet. It's broken, and I can't currently think
> of a solution if we're using repeatable read.
> 
> The issue is atomic creation of a shared resource. We want to handle a
> creation race safely. This patch:
> 
> * Attempts to read the default (it will normally exist)
> * Creates a new one if it doesn't exist

Re: [openstack-dev] [all][oslo.db] Repeatable Read considered harmful

2015-03-31 Thread Eugene Nikanorov
Hi Matthew,

I'll add just 2c:

We've tried to move from repeatable-read to read committed in the Neutron
project.
This change actually caused multiple deadlocks during regular tempest
test runs.
That is a known problem (the issue with eventlet and the current MySQL
client library),
but anyway, at least one major openstack project is not ready to move to
read-committed.

Also, a particular transaction isolation level's performance is highly
affected by the DB usage pattern.
Is there any research on how read-committed affects the performance of
openstack projects?

Thanks,
Eugene.

On Fri, Feb 6, 2015 at 7:59 PM, Matthew Booth  wrote:

> I was surprised recently to discover that MySQL uses repeatable read for
> transactions by default. Postgres uses read committed by default, and
> SQLite uses serializable. We don't set the isolation level explicitly
> anywhere, so our applications are running under different isolation
> levels depending on backend. This doesn't sound like a good idea to me.
> It's one thing to support multiple sql syntaxes, but different isolation
> levels have different semantics. Supporting that is much harder, and
> currently we're not even trying.
>
> I'm aware that the same isolation level on different databases will
> still have subtly different semantics, but at least they should agree on
> the big things. I think we should pick one, and it should be read
> committed.
>
> Also note that 'repeatable read' on both MySQL and Postgres is actually
> snapshot isolation, which isn't quite the same thing. For example, it
> doesn't get phantom reads.
>
> The most important reason I think we need read committed is recovery
> from concurrent changes within the scope of a single transaction. To
> date, in Nova at least, this hasn't been an issue as transactions have
> had an extremely small scope. However, we're trying to expand that scope
> with the new enginefacade in oslo.db:
> https://review.openstack.org/#/c/138215/ . With this expanded scope,
> transaction failure in a library function can't simply be replayed
> because the transaction scope is larger than the function.
>
> So, 3 concrete examples of how repeatable read will make Nova worse:
>
> * https://review.openstack.org/#/c/140622/
>
> This was committed to Nova recently. Note how it involves a retry in the
> case of concurrent change. This works fine, because the retry creates
> a new transaction. However, if the transaction was larger than the scope
> of this function this would not work, because each iteration would
> continue to read the old data. The solution to this is to create a new
> transaction. However, because the transaction is outside of the scope of
> this function, the only thing we can do locally is fail. The caller then
> has to re-execute the whole transaction, or fail itself.
>
> This is a local concurrency problem which can be very easily handled
> locally, but not if we're using repeatable read.
>
> *
>
> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L4749
>
> Nova has multiple functions of this type which attempt to update a
> key/value metadata table. I'd expect to find multiple concurrency issues
> with this if I stopped to give it enough thought, but concentrating just
> on what's there, notice how the retry loop starts a new transaction. If
> we want to get to a place where we don't do that, with repeatable read
> we're left failing the whole transaction.
>
> * https://review.openstack.org/#/c/136409/
>
> This one isn't upstream, yet. It's broken, and I can't currently think
> of a solution if we're using repeatable read.
>
> The issue is atomic creation of a shared resource. We want to handle a
> creation race safely. This patch:
>
> * Attempts to read the default (it will normally exist)
> * Creates a new one if it doesn't exist
> * Goes back to the start if creation failed due to a duplicate
>
> Seems fine, but it will fail because the re-read will continue not to
> return the new value under repeatable read (no phantom reads). The only
> way to see the new row is a new transaction. As this will no longer be
> in the scope of this function, the only solution will be to fail. Read
> committed could continue without failing.
>
> Incidentally, this currently works by using multiple transactions, which
> we are trying to avoid. It has also been suggested that in this specific
> instance the default security group could be created with the project.
> However, that would both be more complicated, because it would require
> putting a hook into another piece of code, and less robust, because it
> wouldn't recover if somebody deleted the default security group.
>
>
> To summarise, with repeatable read we're forced to abort the current
> transaction to deal with certain relatively common classes of
> concurrency issue, whereas with read committed we can safely recover. If
> we want to reduce the number of transactions we're using, which we do,
> the impact of this is going to dramatically increase.

[openstack-dev] [all][oslo.db] Repeatable Read considered harmful

2015-02-06 Thread Matthew Booth
I was surprised recently to discover that MySQL uses repeatable read for
transactions by default. Postgres uses read committed by default, and
SQLite uses serializable. We don't set the isolation level explicitly
anywhere, so our applications are running under different isolation
levels depending on backend. This doesn't sound like a good idea to me.
It's one thing to support multiple sql syntaxes, but different isolation
levels have different semantics. Supporting that is much harder, and
currently we're not even trying.

I'm aware that the same isolation level on different databases will
still have subtly different semantics, but at least they should agree on
the big things. I think we should pick one, and it should be read committed.

Also note that 'repeatable read' on both MySQL and Postgres is actually
snapshot isolation, which isn't quite the same thing. For example, it
doesn't get phantom reads.

The most important reason I think we need read committed is recovery
from concurrent changes within the scope of a single transaction. To
date, in Nova at least, this hasn't been an issue as transactions have
had an extremely small scope. However, we're trying to expand that scope
with the new enginefacade in oslo.db:
https://review.openstack.org/#/c/138215/ . With this expanded scope,
transaction failure in a library function can't simply be replayed
because the transaction scope is larger than the function.

So, 3 concrete examples of how repeatable read will make Nova worse:

* https://review.openstack.org/#/c/140622/

This was committed to Nova recently. Note how it involves a retry in the
case of concurrent change. This works fine, because the retry creates
a new transaction. However, if the transaction was larger than the scope
of this function this would not work, because each iteration would
continue to read the old data. The solution to this is to create a new
transaction. However, because the transaction is outside of the scope of
this function, the only thing we can do locally is fail. The caller then
has to re-execute the whole transaction, or fail itself.

This is a local concurrency problem which can be very easily handled
locally, but not if we're using repeatable read.
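
To make the failure mode concrete, here is a hedged sketch of such a
retry confined to a single transaction. 'Usage' is a hypothetical model
invented for the example, not the actual Nova code:

from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Usage(Base):
    # Hypothetical model for illustration; not a real Nova table.
    __tablename__ = 'usage'
    id = Column(Integer, primary_key=True)
    version = Column(Integer, nullable=False, default=0)
    in_use = Column(Integer, nullable=False, default=0)

def update_in_use(session, resource_id, delta, max_retries=5):
    for _ in range(max_retries):
        row = (session.query(Usage).filter_by(id=resource_id)
               .populate_existing().one())  # refresh, don't reuse stale state
        # Optimistic compare-and-swap: only update if nobody else has
        # bumped the version since we read it.
        matched = (session.query(Usage)
                   .filter_by(id=resource_id, version=row.version)
                   .update({'in_use': row.in_use + delta,
                            'version': row.version + 1},
                           synchronize_session=False))
        if matched:
            return
        # Under read committed the re-read at the top of the loop sees
        # the concurrent writer's committed row, so the next attempt can
        # win. Under repeatable read the re-read returns the same
        # snapshot every time, so retrying is futile and the transaction
        # can only be aborted.
    raise Exception('too many concurrent updates')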

*
https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L4749

Nova has multiple functions of this type which attempt to update a
key/value metadata table. I'd expect to find multiple concurrency issues
with this if I stopped to give it enough thought, but concentrating just
on what's there, notice how the retry loop starts a new transaction. If
we want to get to a place where we don't do that, with repeatable read
we're left failing the whole transaction.
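
The shape of that retry loop is roughly the following (a sketch with a
hypothetical 'InstanceMetadata' model, analogous to the 'Usage' sketch
above; not the actual Nova function):

from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session

def set_metadata_key(engine, instance_id, key, value, max_retries=5):
    # Each attempt deliberately opens a *new* session, and therefore a
    # new transaction with a fresh snapshot. That is exactly what a
    # larger, caller-owned transaction scope would take away.
    for _ in range(max_retries):
        session = Session(bind=engine)
        try:
            row = (session.query(InstanceMetadata)
                   .filter_by(instance_id=instance_id, key=key)
                   .first())
            if row is None:
                session.add(InstanceMetadata(instance_id=instance_id,
                                             key=key, value=value))
            else:
                row.value = value
            session.commit()
            return
        except IntegrityError:
            # A concurrent writer inserted the key first; the next
            # iteration's new transaction will be able to see their row.
            session.rollback()
        finally:
            session.close()
    raise Exception('too many concurrent metadata writers')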

* https://review.openstack.org/#/c/136409/

This one isn't upstream, yet. It's broken, and I can't currently think
of a solution if we're using repeatable read.

The issue is atomic creation of a shared resource. We want to handle a
creation race safely. This patch:

* Attempts to read the default (it will normally exist)
* Creates a new one if it doesn't exist
* Goes back to the start if creation failed due to a duplicate

Seems fine, but it will fail because the re-read will continue not to
return the new value under repeatable read (no phantom reads). The only
way to see the new row is a new transaction. As this will no longer be
in the scope of this function, the only solution will be to fail. Read
committed could continue without failing.
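
For reference, a hedged sketch of the pattern and where it breaks.
'SecurityGroup' is a hypothetical model with a unique constraint on
(project_id, name); the function is assumed to run inside a larger,
already-open transaction, so only a SAVEPOINT is available on failure:

from sqlalchemy.exc import IntegrityError

def ensure_default_group(session, project_id, max_retries=5):
    # 'SecurityGroup' is hypothetical; definition omitted for brevity.
    for _ in range(max_retries):
        group = (session.query(SecurityGroup)
                 .filter_by(project_id=project_id, name='default')
                 .first())
        if group is not None:
            return group                  # the common case
        try:
            group = SecurityGroup(project_id=project_id, name='default')
            with session.begin_nested():  # SAVEPOINT, not a new txn
                session.add(group)
            return group
        except IntegrityError:
            # A concurrent transaction created the row first. Under read
            # committed the re-read at the top of the loop now finds it;
            # under repeatable read the re-read repeats our snapshot,
            # finds nothing, and the insert fails again, so the only way
            # out is to fail the enclosing transaction.
            pass
    raise Exception('lost the creation race; transaction must be aborted')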

Incidentally, this currently works by using multiple transactions, which
we are trying to avoid. It has also been suggested that in this specific
instance the default security group could be created with the project.
However, that would both be more complicated, because it would require
putting a hook into another piece of code, and less robust, because it
wouldn't recover if somebody deleted the default security group.


To summarise, with repeatable read we're forced to abort the current
transaction to deal with certain relatively common classes of
concurrency issue, whereas with read committed we can safely recover. If
we want to reduce the number of transactions we're using, which we do,
the impact of this is going to dramatically increase. We should
standardise on read committed.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev