Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Matthew Booth
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 14/11/14 14:40, Dan Smith wrote:
>> I don't think it's a problem. It puts a practical limit on the
>> scope of an 'api call' which can be covered by a single database
>> transaction though, because it would be difficult to arrange for
>> 2 RPC calls to both use the same DB transaction on the remote
>> end. I think we agree on this.
> 
> Sure, but since RPC calls can be queued for *minutes* this is
> probably not a realistic problem, right?

More violent agreement ;) I think scoping this to just conductor is
appropriate and useful. Compare and swap at the object level would be
a useful mechanism for safety across multiple rpc calls.

Matt

- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRmJiQACgkQNEHqGdM8NJASFgCdFsz8Bm9LcIqTBf+uZSe0/hmE
LG0An2D9L3vyI8QSAUxNDj3fpfSFt0tx
=32fN
-END PGP SIGNATURE-

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Dan Smith
> I don't think it's a problem. It puts a practical limit on the scope
> of an 'api call' which can be covered by a single database transaction
> though, because it would be difficult to arrange for 2 RPC calls to
> both use the same DB transaction on the remote end. I think we agree
> on this.

Sure, but since RPC calls can be queued for *minutes* this is probably
not a realistic problem, right?

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Matthew Booth
On 13/11/14 18:26, Mike Bayer wrote:
> 
>> On Nov 13, 2014, at 3:52 AM, Nikola Đipanov 
>> wrote:
>> 
>> On 11/13/2014 02:45 AM, Dan Smith wrote:
 I’m not sure if I’m seeing the second SELECT here either but
 I’m less familiar with what I’m looking at.
 compute_node_update() does the one SELECT as we said, then it
 doesn’t look like self._from_db_object() would emit any further
 SQL specific to that row.
>>> 
>>> I don't think you're missing anything. I don't see anything in
>>> that object code, or the other db/sqlalchemy/api.py code that
>>> looks like a second select. Perhaps he was referring to two
>>> *queries*, being the initial select and the following update?
>>> 
>> 
>> FWIW - I think an example Matt was giving me yesterday was block
>> devices where we have:
>> 
>> @require_context def block_device_mapping_update(context, bdm_id,
>> values, legacy=True): _scrub_empty_str_values(values,
>> ['volume_size']) values = _from_legacy_values(values, legacy,
>> allow_updates=True) query
>> =_block_device_mapping_get_query(context).filter_by(id=bdm_id) 
>> query.update(values) return query.first()
>> 
>> which gets called from object save()
> 
> OK well there, that is still a single UPDATE statement and then a
> SELECT.   It’s using an aggregate UPDATE so there is no load up front
> required.   Unless _from_legacy_values() does something, that’s still
> just UPDATE + SELECT, just not in the usual order.  I’d suggest this
> method be swapped around to load the object first, then use the
> traditional flush process to flush it, as regular flush is a lot more
> reliable, so I’d agree this method is awkward and should be fixed,
> but I’m not sure there’s a second SELECT there.

Indeed, looks like there's just a single select here. Aggregate does,
however, fetch twice. This was the first one I found, and I may have
'seen' this pattern in more places than it actually exists.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Matthew Booth
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 13/11/14 19:19, Dan Smith wrote:
>>> Unfortunately this model doesn't apply to Nova objects, which
>>> are persisted remotely. Unless I've missed something, SQLA
>>> doesn't run on Nova Compute at all. Instead, when Nova Compute
>>> calls object.save() this results in an RPC call to Nova
>>> Conductor, which persists the object in the DB using SQLA.
>>> Compute wouldn't be able to use common DB transactions without
>>> some hairy lifecycle management in Conductor, so Compute apis
>>> need to be explicitly aware of this.
>> 
>> So just a note to Dan, this is an example of where I keep
>> hearing this about Nova objects.I’ve discussed this with Dan
>> and if I understand him correctly, I think the idea is that a
>> Nova Compute call can be organized such that the objects layer
>> interacts with the database layer in a more coarse-grained
>> fashion, if that is desired, so if you really need several things
>> to happen in one DB transaction, you should organize the relevant
>> objects code to work that way.
> 
> Instance.save() is a single thing. It implies that
> Instance.metadata, Instance.system_metadata, Instance.info_cache,
> Instance.security_groups, Instance.numa_topology,
> Instance.pci_requests, etc should all be written to the database
> atomically (or fail). We don't do it atomically and in a 
> transaction right now, but only because db/api is broken into lots
> of small pieces (all of which existed before objects).
> 
> If there was a good way to do this:
> 
> with transaction: save_instance_data() save_instance_metadata() 
> save_instance_system_metadata() ...etc
> 
> Then we'd do that at the backend, achieve atomicity, and the client
> (the compute node) wouldn't notice, or care, beyond the fact that
> it had assumed that was happening all along. It sounds like Mike's
> facade will provide us a nice way to clean up the db/api calls that
> the objects are using to persist data in such a way that we can do
> the above safely like we should have been doing all along.

Agree.

> 
> Does that make sense?
> 
>> Still for me to get my head around is how often we are in fact 
>> organizing the bridge between objects / database such that we
>> are using the database most effectively, and not breaking up a
>> single logical operation into many individual transactions.   I
>> know that Nova objects doesn’t mandate that this occur but I
>> still want to learn if perhaps it tends to “encourage” that
>> pattern to emerge - it’s hard for me to make that guess right now
>> because I haven’t surveyed nova objects very much at all as I’ve
>> been really trying to stick with getting database patterns sane
>> to start with.
> 
> I don't agree that it encourages anything relating to how you
> interact with the database, one way or the other. Almost all of our
> objects are organized in the exact way that previously we had 
> dicts-of-dicts-of-dicts and an RPC call to flush things to the
> database. We've changed very little of that access pattern.
> 
> I think we should push back to Matt to provide a description of why
> he thinks that this is a problem.

I don't think it's a problem. It puts a practical limit on the scope
of an 'api call' which can be covered by a single database transaction
though, because it would be difficult to arrange for 2 RPC calls to
both use the same DB transaction on the remote end. I think we agree
on this.

Matt
- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRlyHYACgkQNEHqGdM8NJCQsQCgmEgZGyCjIJ3zQ+mjo14AQTeo
58UAn3nhO2OfOm/xZ3yIMhlVp0EhJm+0
=xjYZ
-END PGP SIGNATURE-

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Dan Smith
>> Unfortunately this model doesn't apply to Nova objects, which are 
>> persisted remotely. Unless I've missed something, SQLA doesn't run
>> on Nova Compute at all. Instead, when Nova Compute calls
>> object.save() this results in an RPC call to Nova Conductor, which
>> persists the object in the DB using SQLA. Compute wouldn't be able
>> to use common DB transactions without some hairy lifecycle
>> management in Conductor, so Compute apis need to be explicitly
>> aware of this.
> 
> So just a note to Dan, this is an example of where I keep hearing
> this about Nova objects.I’ve discussed this with Dan and if I
> understand him correctly, I think the idea is that a Nova Compute
> call can be organized such that the objects layer interacts with the
> database layer in a more coarse-grained fashion, if that is desired,
> so if you really need several things to happen in one DB transaction,
> you should organize the relevant objects code to work that way.

Instance.save() is a single thing. It implies that Instance.metadata,
Instance.system_metadata, Instance.info_cache, Instance.security_groups,
Instance.numa_topology, Instance.pci_requests, etc should all be written
to the database atomically (or fail). We don't do it atomically and in a
transaction right now, but only because db/api is broken into lots of
small pieces (all of which existed before objects).

If there was a good way to do this:

  with transaction:
save_instance_data()
save_instance_metadata()
save_instance_system_metadata()
...etc

Then we'd do that at the backend, achieve atomicity, and the client (the
compute node) wouldn't notice, or care, beyond the fact that it had
assumed that was happening all along. It sounds like Mike's facade will
provide us a nice way to clean up the db/api calls that the objects are
using to persist data in such a way that we can do the above safely like
we should have been doing all along.

Does that make sense?

> Still for me to get my head around is how often we are in fact
> organizing the bridge between objects / database such that we are
> using the database most effectively, and not breaking up a single
> logical operation into many individual transactions.   I know that
> Nova objects doesn’t mandate that this occur but I still want to
> learn if perhaps it tends to “encourage” that pattern to emerge -
> it’s hard for me to make that guess right now because I haven’t
> surveyed nova objects very much at all as I’ve been really trying to
> stick with getting database patterns sane to start with.

I don't agree that it encourages anything relating to how you interact
with the database, one way or the other. Almost all of our objects are
organized in the exact way that previously we had
dicts-of-dicts-of-dicts and an RPC call to flush things to the database.
We've changed very little of that access pattern.

I think we should push back to Matt to provide a description of why he
thinks that this is a problem.

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Mike Bayer

> On Nov 13, 2014, at 5:47 AM, Matthew Booth  wrote:
> 
> Unfortunately this model doesn't apply to Nova objects, which are
> persisted remotely. Unless I've missed something, SQLA doesn't run on
> Nova Compute at all. Instead, when Nova Compute calls object.save() this
> results in an RPC call to Nova Conductor, which persists the object in
> the DB using SQLA. Compute wouldn't be able to use common DB
> transactions without some hairy lifecycle management in Conductor, so
> Compute apis need to be explicitly aware of this.


So just a note to Dan, this is an example of where I keep hearing this about 
Nova objects.I’ve discussed this with Dan and if I understand him 
correctly, I think the idea is that a Nova Compute call can be organized such 
that the objects layer interacts with the database layer in a more 
coarse-grained fashion, if that is desired, so if you really need several 
things to happen in one DB transaction, you should organize the relevant 
objects code to work that way.

Still for me to get my head around is how often we are in fact organizing the 
bridge between objects / database such that we are using the database most 
effectively, and not breaking up a single logical operation into many 
individual transactions.   I know that Nova objects doesn’t mandate that this 
occur but I still want to learn if perhaps it tends to “encourage” that pattern 
to emerge - it’s hard for me to make that guess right now because I haven’t 
surveyed nova objects very much at all as I’ve been really trying to stick with 
getting database patterns sane to start with.  



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Mike Bayer

> On Nov 13, 2014, at 3:52 AM, Nikola Đipanov  wrote:
> 
> On 11/13/2014 02:45 AM, Dan Smith wrote:
>>> I’m not sure if I’m seeing the second SELECT here either but I’m less
>>> familiar with what I’m looking at. compute_node_update() does the
>>> one SELECT as we said, then it doesn’t look like
>>> self._from_db_object() would emit any further SQL specific to that
>>> row.
>> 
>> I don't think you're missing anything. I don't see anything in that
>> object code, or the other db/sqlalchemy/api.py code that looks like a
>> second select. Perhaps he was referring to two *queries*, being the
>> initial select and the following update?
>> 
> 
> FWIW - I think an example Matt was giving me yesterday was block devices
> where we have:
> 
> @require_context
> def block_device_mapping_update(context, bdm_id, values, legacy=True):
>_scrub_empty_str_values(values, ['volume_size'])
>values = _from_legacy_values(values, legacy, allow_updates=True)
>query =_block_device_mapping_get_query(context).filter_by(id=bdm_id)
>query.update(values)
>return query.first()
> 
> which gets called from object save()

OK well there, that is still a single UPDATE statement and then a SELECT.   
It’s using an aggregate UPDATE so there is no load up front required.   Unless 
_from_legacy_values() does something, that’s still just UPDATE + SELECT, just 
not in the usual order.  I’d suggest this method be swapped around to load the 
object first, then use the traditional flush process to flush it, as regular 
flush is a lot more reliable, so I’d agree this method is awkward and should be 
fixed, but I’m not sure there’s a second SELECT there.



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Dan Smith
> Can we guarantee that the lifetime of a context object in conductor is
> a single rpc call, and that the object cannot be referenced from any
> other thread?

Yes, without a doubt.

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 13/11/14 14:26, Dan Smith wrote:
>> On 12/11/14 19:39, Mike Bayer wrote:
>>> lets keep in mind my everyone-likes-it-so-far proposal for
>>> reader() and writer(): https://review.openstack.org/#/c/125181/
>>> (this is where it’s going to go as nobody has -1’ed it, so in
>>> absence of any “no way!” votes I have to assume this is what
>>> we’re going with).
>> 
>> Dan,
>> 
>> Note that this model, as I understand it, would conflict with
>> storing context in NovaObject.
> 
> Why do you think that? As you pointed out, the above model is
> purely SQLA code, which is run by an object, long after the context
> has been resolved, the call has been remoted, etc.

Can we guarantee that the lifetime of a context object in conductor is
a single rpc call, and that the object cannot be referenced from any
other thread? Seems safer just to pass it around.

Matt
- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRkwTMACgkQNEHqGdM8NJBHMwCdF6RpkpFSXitHfGfOmL0Iw/wr
f/8AnRxozN/LusnermjbZffmvuyoFub7
=S6KI
-END PGP SIGNATURE-

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Dan Smith
> On 12/11/14 19:39, Mike Bayer wrote:
>> lets keep in mind my everyone-likes-it-so-far proposal for reader()
>> and writer(): https://review.openstack.org/#/c/125181/   (this is
>> where it’s going to go as nobody has -1’ed it, so in absence of any
>> “no way!” votes I have to assume this is what we’re going with).
> 
> Dan,
> 
> Note that this model, as I understand it, would conflict with storing
> context in NovaObject.

Why do you think that? As you pointed out, the above model is purely
SQLA code, which is run by an object, long after the context has been
resolved, the call has been remoted, etc.

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 12/11/14 19:39, Mike Bayer wrote:
> lets keep in mind my everyone-likes-it-so-far proposal for reader()
> and writer(): https://review.openstack.org/#/c/125181/   (this is
> where it’s going to go as nobody has -1’ed it, so in absence of any
> “no way!” votes I have to assume this is what we’re going with).

Dan,

Note that this model, as I understand it, would conflict with storing
context in NovaObject.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 12/11/14 19:39, Mike Bayer wrote:
> 
>> On Nov 12, 2014, at 12:45 PM, Dan Smith  wrote:
>> 
>>> I personally favour having consistent behaviour across the board.
>>> How about updating them all to auto-refresh by default for
>>> consistency, but adding an additional option to save() to disable
>>> it for particular calls?
>> 
>> I think these should be two patches: one to make them all
>> auto-refresh, and another to make it conditional. That serves the
>> purpose of (a) bisecting a regression to one or the other, and (b)
>> we can bikeshed on the interface and appropriateness of the
>> don't-refresh flag :)
>> 
>>> I also suggest a tactical fix to any object which fetches itself
>>> twice on update (e.g. Aggregate).
>> 
>> I don't see that being anything other than an obvious win, unless
>> there is some obscure reason for it. But yeah, seems like a good
>> thing to do.
> 
> lets keep in mind my everyone-likes-it-so-far proposal for reader()
> and writer(): https://review.openstack.org/#/c/125181/   (this is
> where it’s going to go as nobody has -1’ed it, so in absence of any
> “no way!” votes I have to assume this is what we’re going with).

FWIW, it got my +1, too. Looks great.

> in this system, the span of session use is implicit within the
> context and/or decorator, and when writer() is specified, a commit()
> can be implicit as well.  IMHO there should be no “.save()” at all,
> at least as far as database writing is concerned. SQLAlchemy
> doesn’t need boilerplate like that - just let the ORM work normally:
> 
> @sql.writer def some_other_api_method(context): someobject =
> context.session.query(SomeObject)….one() 
> someobject.change_some_state()
> 
> # done!
> 
> if you want an explicit refresh, then just do so:
> 
> @sql.writer def some_other_api_method(context): someobject =
> context.session.query(SomeObject)….one() 
> someobject.change_some_state()
> 
> context.session.flush() context.session.refresh(someobject) # do
> something with someobject

Unfortunately this model doesn't apply to Nova objects, which are
persisted remotely. Unless I've missed something, SQLA doesn't run on
Nova Compute at all. Instead, when Nova Compute calls object.save() this
results in an RPC call to Nova Conductor, which persists the object in
the DB using SQLA. Compute wouldn't be able to use common DB
transactions without some hairy lifecycle management in Conductor, so
Compute apis need to be explicitly aware of this.

However, it absolutely makes sense for a single Conductor api call to
use a single transaction.

> however, seeing as this is all one API method the only reason you’d
> want to refresh() is that you think something has happened between
> that flush() and the refresh() that would actually show up, I can’t
> imagine what that would be looking for, unless maybe some large
> amount of operations took up a lot of time between the flush() and
> the refresh().

Given the above constraints, the problem I'm actually trying to solve is
when another process modifies an object underneath us between multiple,
remote transactions. This is one of the motivations for compare-and-swap
over row locking on read. Another is that the length of some API calls
makes holding a row lock for that long undesirable.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 12/11/14 23:23, Mike Bayer wrote:
> 
>> On Nov 12, 2014, at 10:56 AM, Matthew Booth  wrote:
>>
>> For brevity, I have conflated what happens in object.save() with what
>> happens in db.api. Where the code lives isn't relevant here: I'm only
>> looking at what happens.
>>
>> Specifically, the following objects refresh themselves on save:
>>
>> Aggregate
>> BlockDeviceMapping
>> ComputeNode
> 
>> Excluding irrelevant complexity, the general model for objects which
>> refresh on update is:
>>
>> object = 
>> object.update()
>> object.save()
>> return 
>>
>> Some objects skip out the second select and return the freshly saved
>> object. That is, a save involves an update + either 1 or 2 selects.
> 
> If I may inquire as to the irrelevant complexity, I’m trying to pinpoint 
> where you see this happening.

The irrelevant complexity is mostly munging values before they are
inserted into the db. While this needs to be there, I don't think it's
important to the post.

Matt

-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 13/11/14 08:52, Nikola Đipanov wrote:
> On 11/13/2014 02:45 AM, Dan Smith wrote:
>>> I’m not sure if I’m seeing the second SELECT here either but I’m less
>>> familiar with what I’m looking at. compute_node_update() does the
>>> one SELECT as we said, then it doesn’t look like
>>> self._from_db_object() would emit any further SQL specific to that
>>> row.
>>
>> I don't think you're missing anything. I don't see anything in that
>> object code, or the other db/sqlalchemy/api.py code that looks like a
>> second select. Perhaps he was referring to two *queries*, being the
>> initial select and the following update?
>>
> 
> FWIW - I think an example Matt was giving me yesterday was block devices
> where we have:
> 
> @require_context
> def block_device_mapping_update(context, bdm_id, values, legacy=True):
> _scrub_empty_str_values(values, ['volume_size'])
> values = _from_legacy_values(values, legacy, allow_updates=True)
> query =_block_device_mapping_get_query(context).filter_by(id=bdm_id)
> query.update(values)
> return query.first()
> 
> which gets called from object save()

Yes, this is one example, another is Aggregate. I already had a big list
in the post and didn't want a second one.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Nikola Đipanov
On 11/13/2014 02:45 AM, Dan Smith wrote:
>> I’m not sure if I’m seeing the second SELECT here either but I’m less
>> familiar with what I’m looking at. compute_node_update() does the
>> one SELECT as we said, then it doesn’t look like
>> self._from_db_object() would emit any further SQL specific to that
>> row.
> 
> I don't think you're missing anything. I don't see anything in that
> object code, or the other db/sqlalchemy/api.py code that looks like a
> second select. Perhaps he was referring to two *queries*, being the
> initial select and the following update?
> 

FWIW - I think an example Matt was giving me yesterday was block devices
where we have:

@require_context
def block_device_mapping_update(context, bdm_id, values, legacy=True):
_scrub_empty_str_values(values, ['volume_size'])
values = _from_legacy_values(values, legacy, allow_updates=True)
query =_block_device_mapping_get_query(context).filter_by(id=bdm_id)
query.update(values)
return query.first()

which gets called from object save()

N.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Dan Smith
> I’m not sure if I’m seeing the second SELECT here either but I’m less
> familiar with what I’m looking at. compute_node_update() does the
> one SELECT as we said, then it doesn’t look like
> self._from_db_object() would emit any further SQL specific to that
> row.

I don't think you're missing anything. I don't see anything in that
object code, or the other db/sqlalchemy/api.py code that looks like a
second select. Perhaps he was referring to two *queries*, being the
initial select and the following update?

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Mike Bayer

> On Nov 12, 2014, at 6:23 PM, Mike Bayer  wrote:
> 
> 
> If I may inquire as to the irrelevant complexity, I’m trying to pinpoint 
> where you see this happening.
> 
> When we talk about updating a ComputeNode, because I’m only slightly familiar 
> with Nova’s codebase, I assume we are looking at “def compute_node_update()” 
> on line 633 of nova/db/sqlalchemy/api.py ?
> 
> if that’s the full extent of it, I’m not seeing the second select:
> 
>def compute_node_update(context, compute_id, values):
>"""Updates the ComputeNode record with the most recent data."""
> 
>session = get_session()
>with session.begin():
>compute_ref = _compute_node_get(context, compute_id, 
> session=session)
>values['updated_at'] = timeutils.utcnow()
>datetime_keys = ('created_at', 'deleted_at', 'updated_at')
>convert_objects_related_datetimes(values, *datetime_keys)
>compute_ref.update(values)
> 
>return compute_ref
> 
> so “with session.begin()”, when that context ends, will emit the flush of the 
> compute_ref, and then commit the transaction.  The Session by default has a 
> behavior “expire_on_commit”, which means that when this compute_ref is 
> returned to the outside world, the first thing that accesses anything on it 
> *will* emit a SELECT for the row again.  However, as far as I can tell the 
> expire_on_commit flag is turned off.   get_session() returns from oslo.db’s 
> EngineFacade (the subject of my previously mentioned blueprint), and that 
> passes through “expire_on_commit” of False by default.  It is definitely 
> False when oslo.db does the sessionmaker and I see no code that is setting it 
> to True anywhere.The save() method is not used here either, but even if 
> it is, NovaBase.save() calls into ModelBase.save() which just calls a 
> flush(), shouldn’t be emitting a SELECT either.
> 
> Let me know if a. I’m looking in the right place, b. if this second SELECT is 
> actually observed; if it’s occurring I’d like to understand better what we’re 
> looking at.

ah, we’re talking about the objects level:

class ComputeNode(base.NovaPersistentObject, base.NovaObject):
# …

@base.remotable
def save(self, context, prune_stats=False):
# NOTE(belliott) ignore prune_stats param, no longer relevant

updates = self.obj_get_changes()
updates.pop('id', None)
self._convert_stats_to_db_format(updates)
self._convert_host_ip_to_db_format(updates)
self._convert_supported_instances_to_db_format(updates)

db_compute = db.compute_node_update(context, self.id, updates)
self._from_db_object(context, self, db_compute)

I’m not sure if I’m seeing the second SELECT here either but I’m less familiar 
with what I’m looking at. compute_node_update() does the one SELECT as we 
said, then it doesn’t look like self._from_db_object() would emit any further 
SQL specific to that row.I see that it calls upon 
db_compute.get('supported_instances’) but that’s something different.






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Mike Bayer

> On Nov 12, 2014, at 10:56 AM, Matthew Booth  wrote:
> 
> For brevity, I have conflated what happens in object.save() with what
> happens in db.api. Where the code lives isn't relevant here: I'm only
> looking at what happens.
> 
> Specifically, the following objects refresh themselves on save:
> 
> Aggregate
> BlockDeviceMapping
> ComputeNode

> Excluding irrelevant complexity, the general model for objects which
> refresh on update is:
> 
> object = 
> object.update()
> object.save()
> return 
> 
> Some objects skip out the second select and return the freshly saved
> object. That is, a save involves an update + either 1 or 2 selects.

If I may inquire as to the irrelevant complexity, I’m trying to pinpoint where 
you see this happening.

When we talk about updating a ComputeNode, because I’m only slightly familiar 
with Nova’s codebase, I assume we are looking at “def compute_node_update()” on 
line 633 of nova/db/sqlalchemy/api.py ?

if that’s the full extent of it, I’m not seeing the second select:

def compute_node_update(context, compute_id, values):
"""Updates the ComputeNode record with the most recent data."""

session = get_session()
with session.begin():
compute_ref = _compute_node_get(context, compute_id, 
session=session)
values['updated_at'] = timeutils.utcnow()
datetime_keys = ('created_at', 'deleted_at', 'updated_at')
convert_objects_related_datetimes(values, *datetime_keys)
compute_ref.update(values)

return compute_ref

so “with session.begin()”, when that context ends, will emit the flush of the 
compute_ref, and then commit the transaction.  The Session by default has a 
behavior “expire_on_commit”, which means that when this compute_ref is returned 
to the outside world, the first thing that accesses anything on it *will* emit 
a SELECT for the row again.  However, as far as I can tell the expire_on_commit 
flag is turned off.   get_session() returns from oslo.db’s EngineFacade (the 
subject of my previously mentioned blueprint), and that passes through 
“expire_on_commit” of False by default.  It is definitely False when oslo.db 
does the sessionmaker and I see no code that is setting it to True anywhere.
The save() method is not used here either, but even if it is, NovaBase.save() 
calls into ModelBase.save() which just calls a flush(), shouldn’t be emitting a 
SELECT either.

Let me know if a. I’m looking in the right place, b. if this second SELECT is 
actually observed; if it’s occurring I’d like to understand better what we’re 
looking at.




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Mike Bayer

> On Nov 12, 2014, at 12:45 PM, Dan Smith  wrote:
> 
>> I personally favour having consistent behaviour across the board. How
>> about updating them all to auto-refresh by default for consistency,
>> but adding an additional option to save() to disable it for particular
>> calls?
> 
> I think these should be two patches: one to make them all auto-refresh,
> and another to make it conditional. That serves the purpose of (a)
> bisecting a regression to one or the other, and (b) we can bikeshed on
> the interface and appropriateness of the don't-refresh flag :)
> 
>> I also suggest a tactical fix to any object which fetches itself twice
>> on update (e.g. Aggregate).
> 
> I don't see that being anything other than an obvious win, unless there
> is some obscure reason for it. But yeah, seems like a good thing to do.

lets keep in mind my everyone-likes-it-so-far proposal for reader() and 
writer(): https://review.openstack.org/#/c/125181/   (this is where it’s going 
to go as nobody has -1’ed it, so in absence of any “no way!” votes I have to 
assume this is what we’re going with).

in this system, the span of session use is implicit within the context and/or 
decorator, and when writer() is specified, a commit() can be implicit as well.  
IMHO there should be no “.save()” at all, at least as far as database writing 
is concerned. SQLAlchemy doesn’t need boilerplate like that - just let the 
ORM work normally:

@sql.writer
def some_other_api_method(context):
someobject = context.session.query(SomeObject)….one()
someobject.change_some_state()

# done!

if you want an explicit refresh, then just do so:

@sql.writer
def some_other_api_method(context):
someobject = context.session.query(SomeObject)….one()
someobject.change_some_state()

context.session.flush()
context.session.refresh(someobject)
# do something with someobject

however, seeing as this is all one API method the only reason you’d want to 
refresh() is that you think something has happened between that flush() and the 
refresh() that would actually show up, I can’t imagine what that would be 
looking for, unless maybe some large amount of operations took up a lot of time 
between the flush() and the refresh().



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Dan Smith
> I personally favour having consistent behaviour across the board. How
> about updating them all to auto-refresh by default for consistency,
> but adding an additional option to save() to disable it for particular
> calls?

I think these should be two patches: one to make them all auto-refresh,
and another to make it conditional. That serves the purpose of (a)
bisecting a regression to one or the other, and (b) we can bikeshed on
the interface and appropriateness of the don't-refresh flag :)

> I also suggest a tactical fix to any object which fetches itself twice
> on update (e.g. Aggregate).

I don't see that being anything other than an obvious win, unless there
is some obscure reason for it. But yeah, seems like a good thing to do.

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Matthew Booth
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 12/11/14 16:22, Dan Smith wrote:
>> An initial inconsistency I have noticed is that some objects
>> refresh themselves from the database when calling save(), but
>> others don't.
> 
> I agree that it would be ideal for all objects to behave the same
> in this regard. I expect that in practice, it's not necessary for
> all objects to do this, and since it is an extra step/query in some
> places, it's not without cost to just always refresh (there is no
> reason to refresh a Flavor object AFAIK, for instance).
> 
>> The lack of consistency in behaviour is obviously a problem, and
>> I can't think of any good reason for a second select for objects
>> which do that. However, I don't think it is good design for
>> save() to refresh the object at all, and the reason is
>> concurrency. The cached contents of a Nova object are *always*
>> potentially stale. A refresh does nothing to change that, because
>> the contents are again potentially stale as soon as it returns.
>> Handling this requires concurrency primitives which we don't
>> currently have (see the larger context I mentioned above).
>> Refreshing an object's contents might reduce the probability of a
>> race, but it doesn't fix it. Callers who want a refresh after
>> save can always call object.refresh(), but for others it's just
>> wasted hits on the db.
> 
> Doing a refresh() after a save() is more than just another DB hit,
> it's another RPC round-trip, which is far more expensive.
> 
> There is a lot of code in the compute manager (and elsewhere I'm
> sure) that expects to get back the refreshed object after a save
> (our instance update functions have always exhibited this behavior,
> so there is code built to expect it). Any time something could
> change async on the object that might affect what we're about to do
> will benefit from this implicit refreshing. A good example is when
> we're doing something long-running on an instance and it is deleted
> underneath us. If we didn't get the deleted=True change after a
> save(), we might go continue doing a lot of extra work before we
> notice it the next time.
> 
> It's not that doing so prevents the object's contents from being
> stale, it's that it reduces the amount of time before we notice a
> change, and avoids us needing to explicitly check. Any code we have
> that can't tolerate the object being stale is broken anyway.
> 
>> Refresh on save() is also arbitrary. Why should the object be
>> updated then rather than at any other time? The timing of an
>> update in thread X is unrelated to the timing of an update in
>> thread Y, but it's a problem whenever it happens.
> 
> Because it's not about cache consistency, it's about the expense 
> required to do any of these things. To save(), we *have* to make
> the round-trip, so why not get a refresh at the same time? In cases
> where we explicitly want to refresh, we call refresh(), but
> otherwise we use natural synchronization points like save() to do
> that.

Ok, it makes more sense in this context.

>> Can anybody see a problem if we didn't fetch the row at all, and
>>  simply updated it? Absent locking or compare-and-swap this is 
>> effectively what we're already doing, and it reduces the db cost
>> of save to a single update statement. The difference would be
>> that the object would remain stale without an explicit refresh().
>> Value munging would remain unaffected.
> 
> At least for instance, I don't want to do away with the implicit 
> refreshing. I would be up for any of these options:
> 
> 1. Documenting which objects do and don't auto-refresh 2. Making
> the case for non-Instance objects to not auto-refresh 3. Making
> them all auto-refresh for consistency, with the appropriate 
> re-tooling of the db/api code to minimize performance impact.

I personally favour having consistent behaviour across the board. How
about updating them all to auto-refresh by default for consistency,
but adding an additional option to save() to disable it for particular
calls? Ideally we'd have the opposite default and fix all callers, but
I suspect that's more likely to bite us in the short term unless we're
confident we can identify all the critical callers.

I also suggest a tactical fix to any object which fetches itself twice
on update (e.g. Aggregate).

> 
>> Additionally, Instance, InstanceGroup, and Flavor perform
>> multiple updates on save(). I would apply the same rule to the
>> sub-updates, and also move them into a single transaction such
>> that the updates are atomic.
> 
> Yep, no complaints about fixing these non-atomic updates, of course
> :)

Thanks,

Matt
- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRjkEcACgkQNEHqGdM8NJC5hACfYimQH2MqcMTnvHc7loqi1QAZ
R2EAoIOSVe83htncBWBIDBxBwFdANajG
=veuA
-END PGP SIGNATURE

Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Dan Smith
> An initial inconsistency I have noticed is that some objects refresh 
> themselves from the database when calling save(), but others don't.

I agree that it would be ideal for all objects to behave the same in
this regard. I expect that in practice, it's not necessary for all
objects to do this, and since it is an extra step/query in some places,
it's not without cost to just always refresh (there is no reason to
refresh a Flavor object AFAIK, for instance).

> The lack of consistency in behaviour is obviously a problem, and I 
> can't think of any good reason for a second select for objects which 
> do that. However, I don't think it is good design for save() to 
> refresh the object at all, and the reason is concurrency. The cached 
> contents of a Nova object are *always* potentially stale. A refresh 
> does nothing to change that, because the contents are again 
> potentially stale as soon as it returns. Handling this requires 
> concurrency primitives which we don't currently have (see the larger 
> context I mentioned above). Refreshing an object's contents might 
> reduce the probability of a race, but it doesn't fix it. Callers who 
> want a refresh after save can always call object.refresh(), but for 
> others it's just wasted hits on the db.

Doing a refresh() after a save() is more than just another DB hit, it's
another RPC round-trip, which is far more expensive.

There is a lot of code in the compute manager (and elsewhere I'm sure)
that expects to get back the refreshed object after a save (our instance
update functions have always exhibited this behavior, so there is code
built to expect it). Any time something could change async on the object
that might affect what we're about to do will benefit from this implicit
refreshing. A good example is when we're doing something long-running on
an instance and it is deleted underneath us. If we didn't get the
deleted=True change after a save(), we might go continue doing a lot of
extra work before we notice it the next time.

It's not that doing so prevents the object's contents from being stale,
it's that it reduces the amount of time before we notice a change, and
avoids us needing to explicitly check. Any code we have that can't
tolerate the object being stale is broken anyway.

> Refresh on save() is also arbitrary. Why should the object be updated
> then rather than at any other time? The timing of an update in thread
> X is unrelated to the timing of an update in thread Y, but it's a
> problem whenever it happens.

Because it's not about cache consistency, it's about the expense
required to do any of these things. To save(), we *have* to make the
round-trip, so why not get a refresh at the same time? In cases where we
explicitly want to refresh, we call refresh(), but otherwise we use
natural synchronization points like save() to do that.

> Can anybody see a problem if we didn't fetch the row at all, and 
> simply updated it? Absent locking or compare-and-swap this is 
> effectively what we're already doing, and it reduces the db cost of 
> save to a single update statement. The difference would be that the 
> object would remain stale without an explicit refresh(). Value 
> munging would remain unaffected.

At least for instance, I don't want to do away with the implicit
refreshing. I would be up for any of these options:

1. Documenting which objects do and don't auto-refresh
2. Making the case for non-Instance objects to not auto-refresh
3. Making them all auto-refresh for consistency, with the appropriate
   re-tooling of the db/api code to minimize performance impact.

> Additionally, Instance, InstanceGroup, and Flavor perform multiple 
> updates on save(). I would apply the same rule to the sub-updates, 
> and also move them into a single transaction such that the updates 
> are atomic.

Yep, no complaints about fixing these non-atomic updates, of course :)

--Dan



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev