Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Matthew Booth

On 13/11/14 19:19, Dan Smith wrote:
 Unfortunately this model doesn't apply to Nova objects, which
 are persisted remotely. Unless I've missed something, SQLA
 doesn't run on Nova Compute at all. Instead, when Nova Compute
 calls object.save() this results in an RPC call to Nova
 Conductor, which persists the object in the DB using SQLA.
 Compute wouldn't be able to use common DB transactions without
 some hairy lifecycle management in Conductor, so Compute apis
 need to be explicitly aware of this.
 
 So just a note to Dan, this is an example of where I keep
 hearing this about Nova objects. I’ve discussed this with Dan
 and if I understand him correctly, I think the idea is that a
 Nova Compute call can be organized such that the objects layer
 interacts with the database layer in a more coarse-grained
 fashion, if that is desired, so if you really need several things
 to happen in one DB transaction, you should organize the relevant
 objects code to work that way.
 
 Instance.save() is a single thing. It implies that
 Instance.metadata, Instance.system_metadata, Instance.info_cache,
 Instance.security_groups, Instance.numa_topology,
 Instance.pci_requests, etc should all be written to the database
 atomically (or fail). We don't do it atomically and in a 
 transaction right now, but only because db/api is broken into lots
 of small pieces (all of which existed before objects).
 
 If there was a good way to do this:
 
 with transaction:
     save_instance_data()
     save_instance_metadata()
     save_instance_system_metadata()
     ...etc
 
 Then we'd do that at the backend, achieve atomicity, and the client
 (the compute node) wouldn't notice, or care, beyond the fact that
 it had assumed that was happening all along. It sounds like Mike's
 facade will provide us a nice way to clean up the db/api calls that
 the objects are using to persist data in such a way that we can do
 the above safely like we should have been doing all along.

Agree.

 
 Does that make sense?
 
 Still for me to get my head around is how often we are in fact 
 organizing the bridge between objects / database such that we
 are using the database most effectively, and not breaking up a
 single logical operation into many individual transactions.   I
 know that Nova objects doesn’t mandate that this occur but I
 still want to learn if perhaps it tends to “encourage” that
 pattern to emerge - it’s hard for me to make that guess right now
 because I haven’t surveyed nova objects very much at all as I’ve
 been really trying to stick with getting database patterns sane
 to start with.
 
 I don't agree that it encourages anything relating to how you
 interact with the database, one way or the other. Almost all of our
 objects are organized in the exact way that previously we had 
 dicts-of-dicts-of-dicts and an RPC call to flush things to the
 database. We've changed very little of that access pattern.
 
 I think we should push back to Matt to provide a description of why
 he thinks that this is a problem.

I don't think it's a problem. It puts a practical limit on the scope
of an 'api call' which can be covered by a single database transaction
though, because it would be difficult to arrange for 2 RPC calls to
both use the same DB transaction on the remote end. I think we agree
on this.

Matt
- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Matthew Booth
On 13/11/14 18:26, Mike Bayer wrote:
 
 On Nov 13, 2014, at 3:52 AM, Nikola Đipanov ndipa...@redhat.com
 wrote:
 
 On 11/13/2014 02:45 AM, Dan Smith wrote:
 I’m not sure if I’m seeing the second SELECT here either but
 I’m less familiar with what I’m looking at.
 compute_node_update() does the one SELECT as we said, then it
 doesn’t look like self._from_db_object() would emit any further
 SQL specific to that row.
 
 I don't think you're missing anything. I don't see anything in
 that object code, or the other db/sqlalchemy/api.py code that
 looks like a second select. Perhaps he was referring to two
 *queries*, being the initial select and the following update?
 
 
 FWIW - I think an example Matt was giving me yesterday was block
 devices where we have:
 
 @require_context
 def block_device_mapping_update(context, bdm_id, values, legacy=True):
     _scrub_empty_str_values(values, ['volume_size'])
     values = _from_legacy_values(values, legacy, allow_updates=True)
     query = _block_device_mapping_get_query(context).filter_by(id=bdm_id)
     query.update(values)
     return query.first()
 
 which gets called from object save()
 
 OK well there, that is still a single UPDATE statement and then a
 SELECT.   It’s using an aggregate UPDATE so there is no load up front
 required.   Unless _from_legacy_values() does something, that’s still
 just UPDATE + SELECT, just not in the usual order.  I’d suggest this
 method be swapped around to load the object first, then use the
 traditional flush process to flush it, as regular flush is a lot more
 reliable, so I’d agree this method is awkward and should be fixed,
 but I’m not sure there’s a second SELECT there.

Indeed, looks like there's just a single select here. Aggregate does,
however, fetch twice. This was the first one I found, and I may have
'seen' this pattern in more places than it actually exists.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Dan Smith
 I don't think it's a problem. It puts a practical limit on the scope
 of an 'api call' which can be covered by a single database transaction
 though, because it would be difficult to arrange for 2 RPC calls to
 both use the same DB transaction on the remote end. I think we agree
 on this.

Sure, but since RPC calls can be queued for *minutes* this is probably
not a realistic problem, right?

--Dan





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-14 Thread Matthew Booth

On 14/11/14 14:40, Dan Smith wrote:
 I don't think it's a problem. It puts a practical limit on the
 scope of an 'api call' which can be covered by a single database
 transaction though, because it would be difficult to arrange for
 2 RPC calls to both use the same DB transaction on the remote
 end. I think we agree on this.
 
 Sure, but since RPC calls can be queued for *minutes* this is
 probably not a realistic problem, right?

More violent agreement ;) I think scoping this to just conductor is
appropriate and useful. Compare and swap at the object level would be
a useful mechanism for safety across multiple rpc calls.

Matt

- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Nikola Đipanov
On 11/13/2014 02:45 AM, Dan Smith wrote:
 I’m not sure if I’m seeing the second SELECT here either but I’m less
 familiar with what I’m looking at. compute_node_update() does the
 one SELECT as we said, then it doesn’t look like
 self._from_db_object() would emit any further SQL specific to that
 row.
 
 I don't think you're missing anything. I don't see anything in that
 object code, or the other db/sqlalchemy/api.py code that looks like a
 second select. Perhaps he was referring to two *queries*, being the
 initial select and the following update?
 

FWIW - I think an example Matt was giving me yesterday was block devices
where we have:

@require_context
def block_device_mapping_update(context, bdm_id, values, legacy=True):
    _scrub_empty_str_values(values, ['volume_size'])
    values = _from_legacy_values(values, legacy, allow_updates=True)
    query = _block_device_mapping_get_query(context).filter_by(id=bdm_id)
    query.update(values)
    return query.first()

which gets called from object save()

N.




Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 13/11/14 08:52, Nikola Đipanov wrote:
 On 11/13/2014 02:45 AM, Dan Smith wrote:
 I’m not sure if I’m seeing the second SELECT here either but I’m less
 familiar with what I’m looking at. compute_node_update() does the
 one SELECT as we said, then it doesn’t look like
 self._from_db_object() would emit any further SQL specific to that
 row.

 I don't think you're missing anything. I don't see anything in that
 object code, or the other db/sqlalchemy/api.py code that looks like a
 second select. Perhaps he was referring to two *queries*, being the
 initial select and the following update?

 
 FWIW - I think an example Matt was giving me yesterday was block devices
 where we have:
 
 @require_context
 def block_device_mapping_update(context, bdm_id, values, legacy=True):
     _scrub_empty_str_values(values, ['volume_size'])
     values = _from_legacy_values(values, legacy, allow_updates=True)
     query = _block_device_mapping_get_query(context).filter_by(id=bdm_id)
     query.update(values)
     return query.first()
 
 which gets called from object save()

Yes, this is one example, another is Aggregate. I already had a big list
in the post and didn't want a second one.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 12/11/14 23:23, Mike Bayer wrote:
 
 On Nov 12, 2014, at 10:56 AM, Matthew Booth mbo...@redhat.com wrote:

 For brevity, I have conflated what happens in object.save() with what
 happens in db.api. Where the code lives isn't relevant here: I'm only
 looking at what happens.

 Specifically, the following objects refresh themselves on save:

 Aggregate
 BlockDeviceMapping
 ComputeNode
 
 Excluding irrelevant complexity, the general model for objects which
 refresh on update is:

 object = select row from object table
 object.update()
 object.save()
 return select row from object table again

 Some objects skip out the second select and return the freshly saved
 object. That is, a save involves an update + either 1 or 2 selects.
 
 If I may inquire as to the irrelevant complexity, I’m trying to pinpoint 
 where you see this happening.

The irrelevant complexity is mostly munging values before they are
inserted into the db. While this needs to be there, I don't think it's
important to the post.

Matt

-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 12/11/14 19:39, Mike Bayer wrote:
 
 On Nov 12, 2014, at 12:45 PM, Dan Smith d...@danplanet.com wrote:
 
 I personally favour having consistent behaviour across the board.
 How about updating them all to auto-refresh by default for
 consistency, but adding an additional option to save() to disable
 it for particular calls?
 
 I think these should be two patches: one to make them all
 auto-refresh, and another to make it conditional. That serves the
 purpose of (a) bisecting a regression to one or the other, and (b)
 we can bikeshed on the interface and appropriateness of the
 don't-refresh flag :)
 
 I also suggest a tactical fix to any object which fetches itself
 twice on update (e.g. Aggregate).
 
 I don't see that being anything other than an obvious win, unless
 there is some obscure reason for it. But yeah, seems like a good
 thing to do.
 
 lets keep in mind my everyone-likes-it-so-far proposal for reader()
 and writer(): https://review.openstack.org/#/c/125181/   (this is
 where it’s going to go as nobody has -1’ed it, so in absence of any
 “no way!” votes I have to assume this is what we’re going with).

FWIW, it got my +1, too. Looks great.

 in this system, the span of session use is implicit within the
 context and/or decorator, and when writer() is specified, a commit()
 can be implicit as well.  IMHO there should be no “.save()” at all,
 at least as far as database writing is concerned. SQLAlchemy
 doesn’t need boilerplate like that - just let the ORM work normally:
 
 @sql.writer
 def some_other_api_method(context):
     someobject = context.session.query(SomeObject)….one()
     someobject.change_some_state(stuff)
 
 # done!
 
 if you want an explicit refresh, then just do so:
 
 @sql.writer
 def some_other_api_method(context):
     someobject = context.session.query(SomeObject)….one()
     someobject.change_some_state(stuff)
 
     context.session.flush()
     context.session.refresh(someobject)
     # do something with someobject

Unfortunately this model doesn't apply to Nova objects, which are
persisted remotely. Unless I've missed something, SQLA doesn't run on
Nova Compute at all. Instead, when Nova Compute calls object.save() this
results in an RPC call to Nova Conductor, which persists the object in
the DB using SQLA. Compute wouldn't be able to use common DB
transactions without some hairy lifecycle management in Conductor, so
Compute apis need to be explicitly aware of this.

However, it absolutely makes sense for a single Conductor api call to
use a single transaction.

 however, seeing as this is all one API method the only reason you’d
 want to refresh() is that you think something has happened between
 that flush() and the refresh() that would actually show up, I can’t
 imagine what that would be looking for, unless maybe some large
 amount of operations took up a lot of time between the flush() and
 the refresh().

Given the above constraints, the problem I'm actually trying to solve is
when another process modifies an object underneath us between multiple,
remote transactions. This is one of the motivations for compare-and-swap
over row locking on read. Another is that the length of some API calls
makes holding a row lock for that long undesirable.
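
For illustration, here is a minimal, self-contained sketch of compare-and-swap
at the DB layer (table and column names are illustrative, not Nova's real
schema): the UPDATE only applies if the row still matches what the caller last
read, and a zero rowcount tells us another writer got there first.

import sqlalchemy as sa

metadata = sa.MetaData()
instances = sa.Table(
    'instances', metadata,
    sa.Column('uuid', sa.String(36), primary_key=True),
    sa.Column('task_state', sa.String(255)),
    sa.Column('vm_state', sa.String(255)),
)

def guarded_update(conn, uuid, expected, values):
    # Apply 'values' only if the row still matches 'expected'.
    stmt = instances.update().where(instances.c.uuid == uuid)
    for col, val in expected.items():
        stmt = stmt.where(instances.c[col] == val)
    result = conn.execute(stmt.values(**values))
    # rowcount == 0 means another writer changed the row since we read it.
    return result.rowcount == 1

engine = sa.create_engine('sqlite://')
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(instances.insert().values(
        uuid='fake-uuid', task_state='spawning', vm_state='building'))
    ok = guarded_update(conn, 'fake-uuid',
                        expected={'task_state': 'spawning'},
                        values={'task_state': None, 'vm_state': 'active'})
    # ok is False if a concurrent writer already moved task_state on.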

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth
On 12/11/14 19:39, Mike Bayer wrote:
 lets keep in mind my everyone-likes-it-so-far proposal for reader()
 and writer(): https://review.openstack.org/#/c/125181/   (this is
 where it’s going to go as nobody has -1’ed it, so in absence of any
 “no way!” votes I have to assume this is what we’re going with).

Dan,

Note that this model, as I understand it, would conflict with storing
context in NovaObject.

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Dan Smith
 On 12/11/14 19:39, Mike Bayer wrote:
 lets keep in mind my everyone-likes-it-so-far proposal for reader()
 and writer(): https://review.openstack.org/#/c/125181/   (this is
 where it’s going to go as nobody has -1’ed it, so in absence of any
 “no way!” votes I have to assume this is what we’re going with).
 
 Dan,
 
 Note that this model, as I understand it, would conflict with storing
 context in NovaObject.

Why do you think that? As you pointed out, the above model is purely
SQLA code, which is run by an object, long after the context has been
resolved, the call has been remoted, etc.

--Dan





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Matthew Booth

On 13/11/14 14:26, Dan Smith wrote:
 On 12/11/14 19:39, Mike Bayer wrote:
 lets keep in mind my everyone-likes-it-so-far proposal for
 reader() and writer(): https://review.openstack.org/#/c/125181/
 (this is where it’s going to go as nobody has -1’ed it, so in
 absence of any “no way!” votes I have to assume this is what
 we’re going with).
 
 Dan,
 
 Note that this model, as I understand it, would conflict with
 storing context in NovaObject.
 
 Why do you think that? As you pointed out, the above model is
 purely SQLA code, which is run by an object, long after the context
 has been resolved, the call has been remoted, etc.

Can we guarantee that the lifetime of a context object in conductor is
a single rpc call, and that the object cannot be referenced from any
other thread? Seems safer just to pass it around.

Matt
- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Dan Smith
 Can we guarantee that the lifetime of a context object in conductor is
 a single rpc call, and that the object cannot be referenced from any
 other thread?

Yes, without a doubt.

--Dan





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Mike Bayer

 On Nov 13, 2014, at 3:52 AM, Nikola Đipanov ndipa...@redhat.com wrote:
 
 On 11/13/2014 02:45 AM, Dan Smith wrote:
 I’m not sure if I’m seeing the second SELECT here either but I’m less
 familiar with what I’m looking at. compute_node_update() does the
 one SELECT as we said, then it doesn’t look like
 self._from_db_object() would emit any further SQL specific to that
 row.
 
 I don't think you're missing anything. I don't see anything in that
 object code, or the other db/sqlalchemy/api.py code that looks like a
 second select. Perhaps he was referring to two *queries*, being the
 initial select and the following update?
 
 
 FWIW - I think an example Matt was giving me yesterday was block devices
 where we have:
 
 @require_context
 def block_device_mapping_update(context, bdm_id, values, legacy=True):
     _scrub_empty_str_values(values, ['volume_size'])
     values = _from_legacy_values(values, legacy, allow_updates=True)
     query = _block_device_mapping_get_query(context).filter_by(id=bdm_id)
     query.update(values)
     return query.first()
 
 which gets called from object save()

OK well there, that is still a single UPDATE statement and then a SELECT.   
It’s using an aggregate UPDATE so there is no load up front required.   Unless 
_from_legacy_values() does something, that’s still just UPDATE + SELECT, just 
not in the usual order.  I’d suggest this method be swapped around to load the 
object first, then use the traditional flush process to flush it, as regular 
flush is a lot more reliable, so I’d agree this method is awkward and should be 
fixed, but I’m not sure there’s a second SELECT there.
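
To make that concrete, a rough sketch of the swapped-around version (whether
_block_device_mapping_get_query() accepts a session argument here is an
assumption on my part, and this is illustrative rather than a tested patch):

@require_context
def block_device_mapping_update(context, bdm_id, values, legacy=True):
    _scrub_empty_str_values(values, ['volume_size'])
    values = _from_legacy_values(values, legacy, allow_updates=True)
    session = get_session()
    with session.begin():
        # Load the row first, then let the ORM flush the change on commit,
        # rather than issuing an aggregate UPDATE and re-selecting.
        bdm_ref = _block_device_mapping_get_query(
            context, session=session).filter_by(id=bdm_id).first()
        bdm_ref.update(values)
    return bdm_ref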





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Mike Bayer

 On Nov 13, 2014, at 5:47 AM, Matthew Booth mbo...@redhat.com wrote:
 
 Unfortunately this model doesn't apply to Nova objects, which are
 persisted remotely. Unless I've missed something, SQLA doesn't run on
 Nova Compute at all. Instead, when Nova Compute calls object.save() this
 results in an RPC call to Nova Conductor, which persists the object in
 the DB using SQLA. Compute wouldn't be able to use common DB
 transactions without some hairy lifecycle management in Conductor, so
 Compute apis need to be explicitly aware of this.


So just a note to Dan, this is an example of where I keep hearing this about 
Nova objects. I’ve discussed this with Dan and if I understand him 
correctly, I think the idea is that a Nova Compute call can be organized such 
that the objects layer interacts with the database layer in a more 
coarse-grained fashion, if that is desired, so if you really need several 
things to happen in one DB transaction, you should organize the relevant 
objects code to work that way.

Still for me to get my head around is how often we are in fact organizing the 
bridge between objects / database such that we are using the database most 
effectively, and not breaking up a single logical operation into many 
individual transactions.   I know that Nova objects doesn’t mandate that this 
occur but I still want to learn if perhaps it tends to “encourage” that pattern 
to emerge - it’s hard for me to make that guess right now because I haven’t 
surveyed nova objects very much at all as I’ve been really trying to stick with 
getting database patterns sane to start with.  





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-13 Thread Dan Smith
 Unfortunately this model doesn't apply to Nova objects, which are 
 persisted remotely. Unless I've missed something, SQLA doesn't run
 on Nova Compute at all. Instead, when Nova Compute calls
 object.save() this results in an RPC call to Nova Conductor, which
 persists the object in the DB using SQLA. Compute wouldn't be able
 to use common DB transactions without some hairy lifecycle
 management in Conductor, so Compute apis need to be explicitly
 aware of this.
 
 So just a note to Dan, this is an example of where I keep hearing
 this about Nova objects. I’ve discussed this with Dan and if I
 understand him correctly, I think the idea is that a Nova Compute
 call can be organized such that the objects layer interacts with the
 database layer in a more coarse-grained fashion, if that is desired,
 so if you really need several things to happen in one DB transaction,
 you should organize the relevant objects code to work that way.

Instance.save() is a single thing. It implies that Instance.metadata,
Instance.system_metadata, Instance.info_cache, Instance.security_groups,
Instance.numa_topology, Instance.pci_requests, etc should all be written
to the database atomically (or fail). We don't do it atomically and in a
transaction right now, but only because db/api is broken into lots of
small pieces (all of which existed before objects).

If there was a good way to do this:

  with transaction:
      save_instance_data()
      save_instance_metadata()
      save_instance_system_metadata()
      ...etc

Then we'd do that at the backend, achieve atomicity, and the client (the
compute node) wouldn't notice, or care, beyond the fact that it had
assumed that was happening all along. It sounds like Mike's facade will
provide us a nice way to clean up the db/api calls that the objects are
using to persist data in such a way that we can do the above safely like
we should have been doing all along.
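
A rough sketch of the shape of that backend helper, with one session shared
by all the sub-saves (the _save_instance_* helpers below are placeholders for
the existing per-table update code, not functions that exist today):

def instance_save_atomic(context, instance_uuid, updates,
                         metadata, system_metadata):
    session = get_session()
    with session.begin():
        # Everything below joins the same transaction, so the sub-saves
        # either all commit or all roll back together.
        instance_ref = _instance_get_by_uuid(context, instance_uuid,
                                             session=session)
        instance_ref.update(updates)
        _save_instance_metadata(context, instance_uuid, metadata,
                                session=session)
        _save_instance_system_metadata(context, instance_uuid,
                                       system_metadata, session=session)
    return instance_ref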

Does that make sense?

 Still for me to get my head around is how often we are in fact
 organizing the bridge between objects / database such that we are
 using the database most effectively, and not breaking up a single
 logical operation into many individual transactions.   I know that
 Nova objects doesn’t mandate that this occur but I still want to
 learn if perhaps it tends to “encourage” that pattern to emerge -
 it’s hard for me to make that guess right now because I haven’t
 surveyed nova objects very much at all as I’ve been really trying to
 stick with getting database patterns sane to start with.

I don't agree that it encourages anything relating to how you interact
with the database, one way or the other. Almost all of our objects are
organized in the exact way that previously we had
dicts-of-dicts-of-dicts and an RPC call to flush things to the database.
We've changed very little of that access pattern.

I think we should push back to Matt to provide a description of why he
thinks that this is a problem.

--Dan





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Dan Smith
 An initial inconsistency I have noticed is that some objects refresh 
 themselves from the database when calling save(), but others don't.

I agree that it would be ideal for all objects to behave the same in
this regard. I expect that in practice, it's not necessary for all
objects to do this, and since it is an extra step/query in some places,
it's not without cost to just always refresh (there is no reason to
refresh a Flavor object AFAIK, for instance).

 The lack of consistency in behaviour is obviously a problem, and I 
 can't think of any good reason for a second select for objects which 
 do that. However, I don't think it is good design for save() to 
 refresh the object at all, and the reason is concurrency. The cached 
 contents of a Nova object are *always* potentially stale. A refresh 
 does nothing to change that, because the contents are again 
 potentially stale as soon as it returns. Handling this requires 
 concurrency primitives which we don't currently have (see the larger 
 context I mentioned above). Refreshing an object's contents might 
 reduce the probability of a race, but it doesn't fix it. Callers who 
 want a refresh after save can always call object.refresh(), but for 
 others it's just wasted hits on the db.

Doing a refresh() after a save() is more than just another DB hit, it's
another RPC round-trip, which is far more expensive.

There is a lot of code in the compute manager (and elsewhere I'm sure)
that expects to get back the refreshed object after a save (our instance
update functions have always exhibited this behavior, so there is code
built to expect it). Any time something could change async on the object
that might affect what we're about to do will benefit from this implicit
refreshing. A good example is when we're doing something long-running on
an instance and it is deleted underneath us. If we didn't get the
deleted=True change after a save(), we might go continue doing a lot of
extra work before we notice it the next time.

It's not that doing so prevents the object's contents from being stale,
it's that it reduces the amount of time before we notice a change, and
avoids us needing to explicitly check. Any code we have that can't
tolerate the object being stale is broken anyway.

 Refresh on save() is also arbitrary. Why should the object be updated
 then rather than at any other time? The timing of an update in thread
 X is unrelated to the timing of an update in thread Y, but it's a
 problem whenever it happens.

Because it's not about cache consistency, it's about the expense
required to do any of these things. To save(), we *have* to make the
round-trip, so why not get a refresh at the same time? In cases where we
explicitly want to refresh, we call refresh(), but otherwise we use
natural synchronization points like save() to do that.

 Can anybody see a problem if we didn't fetch the row at all, and 
 simply updated it? Absent locking or compare-and-swap this is 
 effectively what we're already doing, and it reduces the db cost of 
 save to a single update statement. The difference would be that the 
 object would remain stale without an explicit refresh(). Value 
 munging would remain unaffected.

At least for instance, I don't want to do away with the implicit
refreshing. I would be up for any of these options:

1. Documenting which objects do and don't auto-refresh
2. Making the case for non-Instance objects to not auto-refresh
3. Making them all auto-refresh for consistency, with the appropriate
   re-tooling of the db/api code to minimize performance impact.

 Additionally, Instance, InstanceGroup, and Flavor perform multiple 
 updates on save(). I would apply the same rule to the sub-updates, 
 and also move them into a single transaction such that the updates 
 are atomic.

Yep, no complaints about fixing these non-atomic updates, of course :)

--Dan





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Matthew Booth

On 12/11/14 16:22, Dan Smith wrote:
 An initial inconsistency I have noticed is that some objects
 refresh themselves from the database when calling save(), but
 others don't.
 
 I agree that it would be ideal for all objects to behave the same
 in this regard. I expect that in practice, it's not necessary for
 all objects to do this, and since it is an extra step/query in some
 places, it's not without cost to just always refresh (there is no
 reason to refresh a Flavor object AFAIK, for instance).
 
 The lack of consistency in behaviour is obviously a problem, and
 I can't think of any good reason for a second select for objects
 which do that. However, I don't think it is good design for
 save() to refresh the object at all, and the reason is
 concurrency. The cached contents of a Nova object are *always*
 potentially stale. A refresh does nothing to change that, because
 the contents are again potentially stale as soon as it returns.
 Handling this requires concurrency primitives which we don't
 currently have (see the larger context I mentioned above).
 Refreshing an object's contents might reduce the probability of a
 race, but it doesn't fix it. Callers who want a refresh after
 save can always call object.refresh(), but for others it's just
 wasted hits on the db.
 
 Doing a refresh() after a save() is more than just another DB hit,
 it's another RPC round-trip, which is far more expensive.
 
 There is a lot of code in the compute manager (and elsewhere I'm
 sure) that expects to get back the refreshed object after a save
 (our instance update functions have always exhibited this behavior,
 so there is code built to expect it). Any time something could
 change async on the object that might affect what we're about to do
 will benefit from this implicit refreshing. A good example is when
 we're doing something long-running on an instance and it is deleted
 underneath us. If we didn't get the deleted=True change after a
 save(), we might go continue doing a lot of extra work before we
 notice it the next time.
 
 It's not that doing so prevents the object's contents from being
 stale, it's that it reduces the amount of time before we notice a
 change, and avoids us needing to explicitly check. Any code we have
 that can't tolerate the object being stale is broken anyway.
 
 Refresh on save() is also arbitrary. Why should the object be
 updated then rather than at any other time? The timing of an
 update in thread X is unrelated to the timing of an update in
 thread Y, but it's a problem whenever it happens.
 
 Because it's not about cache consistency, it's about the expense 
 required to do any of these things. To save(), we *have* to make
 the round-trip, so why not get a refresh at the same time? In cases
 where we explicitly want to refresh, we call refresh(), but
 otherwise we use natural synchronization points like save() to do
 that.

Ok, it makes more sense in this context.

 Can anybody see a problem if we didn't fetch the row at all, and
  simply updated it? Absent locking or compare-and-swap this is 
 effectively what we're already doing, and it reduces the db cost
 of save to a single update statement. The difference would be
 that the object would remain stale without an explicit refresh().
 Value munging would remain unaffected.
 
 At least for instance, I don't want to do away with the implicit 
 refreshing. I would be up for any of these options:
 
 1. Documenting which objects do and don't auto-refresh
 2. Making the case for non-Instance objects to not auto-refresh
 3. Making them all auto-refresh for consistency, with the appropriate
    re-tooling of the db/api code to minimize performance impact.

I personally favour having consistent behaviour across the board. How
about updating them all to auto-refresh by default for consistency,
but adding an additional option to save() to disable it for particular
calls? Ideally we'd have the opposite default and fix all callers, but
I suspect that's more likely to bite us in the short term unless we're
confident we can identify all the critical callers.
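
As a sketch of the interface I have in mind, using Aggregate as the example
(the do_refresh flag is hypothetical, purely to show the shape of the option):

@base.remotable
def save(self, context, do_refresh=True):
    updates = self.obj_get_changes()
    db_aggregate = db.aggregate_update(context, self.id, updates)
    if do_refresh:
        # Default: repopulate our fields from the row the backend
        # returned, as the auto-refreshing objects do today.
        self._from_db_object(context, self, db_aggregate)
    else:
        # Caller opted out: just clear the changed-fields tracking.
        self.obj_reset_changes()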

I also suggest a tactical fix to any object which fetches itself twice
on update (e.g. Aggregate).

 
 Additionally, Instance, InstanceGroup, and Flavor perform
 multiple updates on save(). I would apply the same rule to the
 sub-updates, and also move them into a single transaction such
 that the updates are atomic.
 
 Yep, no complaints about fixing these non-atomic updates, of course
 :)

Thanks,

Matt
- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Dan Smith
 I personally favour having consistent behaviour across the board. How
 about updating them all to auto-refresh by default for consistency,
 but adding an additional option to save() to disable it for particular
 calls?

I think these should be two patches: one to make them all auto-refresh,
and another to make it conditional. That serves the purpose of (a)
bisecting a regression to one or the other, and (b) we can bikeshed on
the interface and appropriateness of the don't-refresh flag :)

 I also suggest a tactical fix to any object which fetches itself twice
 on update (e.g. Aggregate).

I don't see that being anything other than an obvious win, unless there
is some obscure reason for it. But yeah, seems like a good thing to do.

--Dan





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Mike Bayer

 On Nov 12, 2014, at 12:45 PM, Dan Smith d...@danplanet.com wrote:
 
 I personally favour having consistent behaviour across the board. How
 about updating them all to auto-refresh by default for consistency,
 but adding an additional option to save() to disable it for particular
 calls?
 
 I think these should be two patches: one to make them all auto-refresh,
 and another to make it conditional. That serves the purpose of (a)
 bisecting a regression to one or the other, and (b) we can bikeshed on
 the interface and appropriateness of the don't-refresh flag :)
 
 I also suggest a tactical fix to any object which fetches itself twice
 on update (e.g. Aggregate).
 
 I don't see that being anything other than an obvious win, unless there
 is some obscure reason for it. But yeah, seems like a good thing to do.

lets keep in mind my everyone-likes-it-so-far proposal for reader() and 
writer(): https://review.openstack.org/#/c/125181/   (this is where it’s going 
to go as nobody has -1’ed it, so in absence of any “no way!” votes I have to 
assume this is what we’re going with).

in this system, the span of session use is implicit within the context and/or 
decorator, and when writer() is specified, a commit() can be implicit as well.  
IMHO there should be no “.save()” at all, at least as far as database writing 
is concerned. SQLAlchemy doesn’t need boilerplate like that - just let the 
ORM work normally:

@sql.writer
def some_other_api_method(context):
    someobject = context.session.query(SomeObject)….one()
    someobject.change_some_state(stuff)

# done!

if you want an explicit refresh, then just do so:

@sql.writer
def some_other_api_method(context):
    someobject = context.session.query(SomeObject)….one()
    someobject.change_some_state(stuff)

    context.session.flush()
    context.session.refresh(someobject)
    # do something with someobject

however, seeing as this is all one API method the only reason you’d want to 
refresh() is that you think something has happened between that flush() and the 
refresh() that would actually show up, I can’t imagine what that would be 
looking for, unless maybe some large amount of operations took up a lot of time 
between the flush() and the refresh().
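
For the curious, the mechanics are roughly like this stripped-down
illustration (not the actual code in the review, which also deals with
engine/slave selection, scoping and configuration):

import functools

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

_engine = create_engine('sqlite://')
_maker = sessionmaker(bind=_engine, expire_on_commit=False)

def writer(fn):
    # Open a session, expose it as context.session for the span of the
    # call, and commit (or roll back) implicitly when the call returns.
    @functools.wraps(fn)
    def wrapper(context, *args, **kwargs):
        session = _maker()
        context.session = session
        try:
            result = fn(context, *args, **kwargs)
            session.commit()
            return result
        except Exception:
            session.rollback()
            raise
        finally:
            context.session = None
            session.close()
    return wrapper

# reader() would look the same, minus the commit().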





Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Mike Bayer

 On Nov 12, 2014, at 10:56 AM, Matthew Booth mbo...@redhat.com wrote:
 
 For brevity, I have conflated what happens in object.save() with what
 happens in db.api. Where the code lives isn't relevant here: I'm only
 looking at what happens.
 
 Specifically, the following objects refresh themselves on save:
 
 Aggregate
 BlockDeviceMapping
 ComputeNode

 Excluding irrelevant complexity, the general model for objects which
 refresh on update is:
 
 object = select row from object table
 object.update()
 object.save()
 return select row from object table again
 
 Some objects skip out the second select and return the freshly saved
 object. That is, a save involves an update + either 1 or 2 selects.

If I may inquire as to the irrelevant complexity, I’m trying to pinpoint where 
you see this happening.

When we talk about updating a ComputeNode, because I’m only slightly familiar 
with Nova’s codebase, I assume we are looking at “def compute_node_update()” on 
line 633 of nova/db/sqlalchemy/api.py ?

if that’s the full extent of it, I’m not seeing the second select:

def compute_node_update(context, compute_id, values):
    """Updates the ComputeNode record with the most recent data."""
    session = get_session()
    with session.begin():
        compute_ref = _compute_node_get(context, compute_id,
                                        session=session)
        values['updated_at'] = timeutils.utcnow()
        datetime_keys = ('created_at', 'deleted_at', 'updated_at')
        convert_objects_related_datetimes(values, *datetime_keys)
        compute_ref.update(values)

    return compute_ref

so “with session.begin()”, when that context ends, will emit the flush of the 
compute_ref, and then commit the transaction.  The Session by default has a 
behavior “expire_on_commit”, which means that when this compute_ref is returned 
to the outside world, the first thing that accesses anything on it *will* emit 
a SELECT for the row again.  However, as far as I can tell the expire_on_commit 
flag is turned off.   get_session() returns from oslo.db’s EngineFacade (the 
subject of my previously mentioned blueprint), and that passes through 
“expire_on_commit” of False by default.  It is definitely False when oslo.db 
does the sessionmaker and I see no code that is setting it to True anywhere.
The save() method is not used here either, but even if it is, NovaBase.save() 
calls into ModelBase.save() which just calls a flush(), shouldn’t be emitting a 
SELECT either.
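
The expire_on_commit behavior is easy to see in isolation with a toy model
(nothing Nova-specific here; echo=True just prints the SQL emitted):

import sqlalchemy as sa
from sqlalchemy import orm
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Node(Base):
    __tablename__ = 'nodes'
    id = sa.Column(sa.Integer, primary_key=True)
    state = sa.Column(sa.String(50))

engine = sa.create_engine('sqlite://', echo=True)
Base.metadata.create_all(engine)

# With expire_on_commit=True (the SQLAlchemy default), touching node.state
# after the commit() expires the instance and emits a fresh SELECT.  With
# expire_on_commit=False, which is what the facade passes through, the
# attributes stay populated from the flush and no extra SELECT appears.
Session = orm.sessionmaker(bind=engine, expire_on_commit=False)

session = Session()
node = Node(state='building')
session.add(node)
session.commit()
print(node.state)   # no SELECT emitted with expire_on_commit=False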

Let me know if a. I’m looking in the right place, b. if this second SELECT is 
actually observed; if it’s occurring I’d like to understand better what we’re 
looking at.






Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Mike Bayer

 On Nov 12, 2014, at 6:23 PM, Mike Bayer mba...@redhat.com wrote:
 
 
 If I may inquire as to the irrelevant complexity, I’m trying to pinpoint 
 where you see this happening.
 
 When we talk about updating a ComputeNode, because I’m only slightly familiar 
 with Nova’s codebase, I assume we are looking at “def compute_node_update()” 
 on line 633 of nova/db/sqlalchemy/api.py ?
 
 if that’s the full extent of it, I’m not seeing the second select:
 
 def compute_node_update(context, compute_id, values):
     """Updates the ComputeNode record with the most recent data."""
     session = get_session()
     with session.begin():
         compute_ref = _compute_node_get(context, compute_id,
                                         session=session)
         values['updated_at'] = timeutils.utcnow()
         datetime_keys = ('created_at', 'deleted_at', 'updated_at')
         convert_objects_related_datetimes(values, *datetime_keys)
         compute_ref.update(values)
 
     return compute_ref
 
 so “with session.begin()”, when that context ends, will emit the flush of the 
 compute_ref, and then commit the transaction.  The Session by default has a 
 behavior “expire_on_commit”, which means that when this compute_ref is 
 returned to the outside world, the first thing that accesses anything on it 
 *will* emit a SELECT for the row again.  However, as far as I can tell the 
 expire_on_commit flag is turned off.   get_session() returns from oslo.db’s 
 EngineFacade (the subject of my previously mentioned blueprint), and that 
 passes through “expire_on_commit” of False by default.  It is definitely 
 False when oslo.db does the sessionmaker and I see no code that is setting it 
 to True anywhere.The save() method is not used here either, but even if 
 it is, NovaBase.save() calls into ModelBase.save() which just calls a 
 flush(), shouldn’t be emitting a SELECT either.
 
 Let me know if a. I’m looking in the right place, b. if this second SELECT is 
 actually observed; if it’s occurring I’d like to understand better what we’re 
 looking at.

ah, we’re talking about the objects level:

class ComputeNode(base.NovaPersistentObject, base.NovaObject):
    # …

    @base.remotable
    def save(self, context, prune_stats=False):
        # NOTE(belliott) ignore prune_stats param, no longer relevant

        updates = self.obj_get_changes()
        updates.pop('id', None)
        self._convert_stats_to_db_format(updates)
        self._convert_host_ip_to_db_format(updates)
        self._convert_supported_instances_to_db_format(updates)

        db_compute = db.compute_node_update(context, self.id, updates)
        self._from_db_object(context, self, db_compute)

I’m not sure if I’m seeing the second SELECT here either but I’m less familiar 
with what I’m looking at. compute_node_update() does the one SELECT as we 
said, then it doesn’t look like self._from_db_object() would emit any further 
SQL specific to that row.I see that it calls upon 
db_compute.get('supported_instances’) but that’s something different.








Re: [openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

2014-11-12 Thread Dan Smith
 I’m not sure if I’m seeing the second SELECT here either but I’m less
 familiar with what I’m looking at. compute_node_update() does the
 one SELECT as we said, then it doesn’t look like
 self._from_db_object() would emit any further SQL specific to that
 row.

I don't think you're missing anything. I don't see anything in that
object code, or the other db/sqlalchemy/api.py code that looks like a
second select. Perhaps he was referring to two *queries*, being the
initial select and the following update?

--Dan


