On Mon, Oct 20, 2014 at 01:38:46PM -0400, Jay Pipes wrote:
> Hi Dan, Dan, Nikola, all Nova devs,
> 
> OK, so in reviewing Dan B's patch series that refactors the virt driver's
> get_available_resource() method [1], I am stuck between two concerns. I like
> (love even) much of the refactoring work involved in Dan's patches. They
> replace a whole bunch of our nested dicts that are used in the resource
> tracker with real objects -- and this is something I've been harping on for
> months that really hinders developer's understanding of Nova's internals.

Yep, as you say one of the problems with understanding WTF.com is going
on in the code is that the interface between resource_tracker.py and the
virt/driver.py was a completely undocumented dict.

Some of the data in the dict got directly copied into the database in
whatever format the virt driver sent it in. Other data fields in the
dict got over-written by the resource tracker. Other fields got converted
into a slightly different format, with extra info added to them.

> However, all of the object classes that Dan B has introduced have been
> unversioned objects -- i.e. they have not derived from
> nova.objects.base.NovaObject. This means that these objects cannot be sent
> over the wire via an RPC API call. In practical terms, this issue has not
> yet reared its head, because the resource tracker still sends a dictified
> JSON representation of the object's fields directly over the wire, in the
> same format as Icehouse, therefore there have been no breakages in RPC API
> compatibility.

If all the data from the virt driver was going straight into the database
or out over the wire, unchanged, then I'd agree that using the versioned
objects would clearly make sense.

When I started the cleanup though, I got the impression that most of the data
from the virt driver got changed/munged in some way before hitting the database
or RPC layer.  There are also the long-standing discussions about the extensible
resource tracker, which would represent data in the database in a completely
generic abstracted way as a list of key/value pairs. So I was imagining that
long term what's put in the database by the resource tracker would be in a
completely different structure than the data coming out of the virt drivers.

Based on that understanding I felt it would be better to define a clear set
of classes solely for the data that's coming out of the virt driver, and
de-couple this from the objects used for storing stuff in the database.

Of course I've not attempted to tackle the full problem space of cleaning up
the entire resource tracker codebase. I just focused on the interface to the
virt drivers. So as you note, in order to maintain compatibility, I was
careful to ensure that the classes I defined were able to serialize into
the same JSON format as is currently used in the horrible undocumented dicts.
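
A minimal sketch of that compatibility approach, assuming a made-up class and
made-up key names (HostMemoryInfo, memory_mb, memory_mb_used are illustrative
only, not the actual virt/hardware.py code): the class exposes real attributes
to callers, while its serialization helpers keep emitting the legacy dict/JSON
layout.

```python
import json


class HostMemoryInfo(object):
    """Hypothetical virt driver result class (not the real Nova code)."""

    def __init__(self, total_mb, used_mb):
        self.total_mb = total_mb
        self.used_mb = used_mb

    def to_dict(self):
        # The key names are assumptions. The point is that they mirror
        # whatever the legacy dict used, so the wire format is unchanged.
        return {"memory_mb": self.total_mb, "memory_mb_used": self.used_mb}

    @classmethod
    def from_dict(cls, data):
        return cls(total_mb=data["memory_mb"],
                   used_mb=data["memory_mb_used"])

    def to_json(self):
        # sort_keys gives a stable representation for comparison/testing.
        return json.dumps(self.to_dict(), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls.from_dict(json.loads(text))
```

Callers inside the compute node can read info.total_mb directly, and only the
code at the database/RPC boundary needs to_dict() or to_json().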

I was not really expecting that the to_dict/from_dict/to_json/from_json
methods in the virt/hardware.py classes would be something we use long term though.
I was just thinking of them as a temporary stepping stone, and that the rest
of the people working on the (extensible) resource tracker would eventually
convert the RT code to directly read the attributes in the hardware.py classes
and use them to populate whatever data format the RT wants to use long term.

In particular what I'd like to see is that the virt driver be decoupled from
long term changes in the resource tracker's data formats. e.g. if someone
comes along in the Lnnnn cycle and decides the resource tracker/scheduler
would be much more effective if the data was persisted in a new format X,
then we ought to avoid having to change the virt/driver.py & virt/hardware.py
APIs/classes. The RT code would just use the existing classes and convert into
whatever fancy new format is better.
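
A rough sketch of that decoupling, under stated assumptions: HostResources
stands in for a virt/hardware.py class, and both converter functions are
hypothetical names for code that would live on the resource tracker side.
Switching the persisted format means writing a new converter; the class the
virt driver returns does not change.

```python
class HostResources(object):
    """Hypothetical stand-in for a class returned by the virt driver."""

    def __init__(self, vcpus, memory_mb, local_gb):
        self.vcpus = vcpus
        self.memory_mb = memory_mb
        self.local_gb = local_gb


def to_legacy_record(res):
    # Resource tracker side: today's flat dict layout (assumed keys).
    return {
        "vcpus": res.vcpus,
        "memory_mb": res.memory_mb,
        "local_gb": res.local_gb,
    }


def to_key_value_record(res):
    # Resource tracker side: a generic key/value layout, as an
    # extensible resource tracker might want. Only this converter is
    # new; HostResources and the virt driver are untouched.
    return [
        {"key": "vcpus", "value": res.vcpus},
        {"key": "memory_mb", "value": res.memory_mb},
        {"key": "local_gb", "value": res.local_gb},
    ]
```

Both converters read the same attributes, so a virt driver written against
HostResources keeps working whichever one the resource tracker picks.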

> The problems with having all these objects not modelled by deriving from
> nova.objects.base.NovaObject are two-fold:
> 
>  * The object's fields/schema cannot be changed -- or rather, cannot be
> changed without introducing upgrade problems.
>  * The objects introduce a different way of serializing the object contents
> than is used in nova/objects -- it's not that much different, but it's
> different, and only has not caused a problem because the serialization
> routines are not yet being used to transfer data over the wire
> 
> So, what to do? Clearly, I think the nova/virt/hardware.py objects are badly
> needed. However, one of (the top?) priorities of the Nova project is
> upgradeability, and by not deriving from nova.objects.base.NovaObject, these
> nova.virt.hardware objects are putting that mission in jeopardy, IMO.
> 
> My proposal is that before we go and approve any BPs or patches that add to
> nova/virt/hardware.py, we first put together a patch series that moves the
> object models in nova/virt/hardware.py to being full-fledged objects in
> nova/objects/*

I think it really depends on how we see the resource tracker data model
evolving over the next few releases. As noted above, I'd really like to
see us be able to evolve the resource tracker & its data model, without
it causing ripple effects into the virt driver implementations. If that
is something we are able to agree on, then it seems to me that long term
we must de-couple the classes used by virt/driver.py for get_available_resource
from the Nova Objects used by the RT for persisting the data in the DB.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev