By coincidence I've just written up a spec [1] that proposes an
admittedly very generic mechanism to solve this problem. I was coming
at it from the perspective of keeping the relative order of PCI device
addresses constant across evacuations. In that spec, I propose letting
the virt driver store blobs of data in the database (with some
caveats, obviously). This can then be used, for example, by the
libvirt driver to persist instance XML throughout an instance's
lifetime.
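
To make the idea concrete, here is a minimal sketch of per-instance blob
storage holding a domain XML, plus a helper that reads back the PCI slots
libvirt assigned to each disk. Everything in it is an assumption for
illustration: `DriverPrivateStore`, `pci_slots`, and the sample XML are
hypothetical and are not the interface proposed in the spec [1]. An
in-memory dict stands in for the database.

```python
import xml.etree.ElementTree as ET


class DriverPrivateStore:
    """Hypothetical per-instance blob store (a dict stands in for the DB)."""

    def __init__(self):
        self._blobs = {}

    def save(self, instance_uuid, blob):
        # Overwrite any previous blob for this instance.
        self._blobs[instance_uuid] = blob

    def load(self, instance_uuid):
        # Returns None if nothing was ever stored for this instance.
        return self._blobs.get(instance_uuid)


def pci_slots(domain_xml):
    """Return (target dev, PCI slot) pairs for the disks in a domain XML."""
    root = ET.fromstring(domain_xml)
    slots = []
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        address = disk.find("address[@type='pci']")
        if target is not None and address is not None:
            slots.append((target.get("dev"), address.get("slot")))
    return slots


# Invented sample XML, trimmed to the elements the helper looks at.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
  </devices>
</domain>
"""

store = DriverPrivateStore()
store.save("hypothetical-instance-uuid", SAMPLE_XML)
print(pci_slots(store.load("hypothetical-instance-uuid")))
```

With the XML persisted this way, the addresses can be read back on a
different host and reused instead of letting libvirt assign fresh ones.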

I agree with Dan Berrange that it's an overkill solution - we can
simply use libvirt itself as the storage mechanism and not have this
duplicate blob floating in the database, confusing us about which
source of truth we really want to use - except for the evacuation
edge case. When evacuating, the source host is unavailable, so we're
unable to retrieve the instance XML from it. Also, in my conversation
with Claudiu (and hopefully he can chime in here and confirm I'm not
putting words in his mouth), the Hyper-V folks are potentially
interested in something like the driver private storage mechanism
proposed in the spec [1] to use with their new VM config export/import
feature [2].
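
The source-of-truth choice described above can be sketched as a small
decision function. This is only an illustration under stated assumptions:
`pick_domain_xml` and its return labels are hypothetical, and how the
libvirt XML or the stored blob is actually fetched is left out.

```python
def pick_domain_xml(libvirt_xml, stored_blob):
    """Choose which copy of the instance XML to use.

    libvirt_xml: XML read from libvirt on the source host, or None when
        the host is unreachable (the evacuation case).
    stored_blob: XML kept in the hypothetical driver private storage,
        or None if nothing was stored.
    Returns a (source label, xml) tuple.
    """
    if libvirt_xml is not None:
        # Normal case: libvirt itself is the storage mechanism.
        return "libvirt", libvirt_xml
    if stored_blob is not None:
        # Evacuation: the source host is gone, fall back to the blob.
        return "driver-private-storage", stored_blob
    # Nothing persisted anywhere: generate the XML from scratch.
    return "regenerate", None


print(pick_domain_xml(None, "<domain/>")[0])
```

The point of the ordering is that the stored blob is consulted only when
libvirt cannot be asked, so there is never a question of which copy wins.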

[1] https://review.openstack.org/#/c/377806/
[2] https://review.openstack.org/#/c/340908/

On Tue, Sep 27, 2016 at 12:31 PM, Daniel P. Berrange
<berra...@redhat.com> wrote:
> On Tue, Sep 27, 2016 at 05:17:29PM +0100, Matthew Booth wrote:
>> Currently the libvirt driver (mostly) considers the nova db canonical. That
>> is, we can throw away libvirt's domain XML at any time and recreate it from
>> Nova. Anywhere that doesn't assume this is a bug, because whatever
>> direction we choose we don't need 2 different sources of truth. The
>> thinking behind this is that we should always know what we told libvirt,
>> and if we lose that information then that's a bug.
>>
>> This is true to a degree, and it's the reason I proposed the persistent
>> instance storage metadata spec: we lose track of how we configured an
>> instance's storage. I realised recently that this isn't the whole story,
>> though. Libvirt also automatically creates a bunch of state for us which we
>> didn't specify explicitly. We lose this every time we drop it and recreate.
>> For example, consider device addressing and ordering:
>>
>> $ nova boot ...
>>
>> We tell libvirt to give us a root disk, config disk, and a memballoon
>> device (amongst other things).
>>
>> Libvirt assigns PCI addresses to all of these things.
>>
>> $ nova volume-attach ...
>>
>> We tell libvirt to create a new disk attached to the given volume.
>>
>> Libvirt assigns it a PCI address.
>>
>> $ nova reboot
>>
>> We throw away libvirt's domain XML and create a new one from scratch.
>>
>> Libvirt assigns new addresses for all of these devices.
>>
>> Before reboot, the device order was: root disk, config disk, memballoon,
>> volume. After reboot the device order is: root disk, volume, config disk,
>> memballoon. Not only have all our devices changed address, which makes
>> Windows sad and paranoid about its licensing, and causes it to offline
>> volumes under certain circumstances, but our disks have been reordered.
>
> It is worth pointing out that we do have the device metadata role
> tagging support now, which lets the guest OS identify devices automatically
> at startup. In theory you could say guests should rely on using that
> on *every* boot, not merely the first boot after provisioning.
>
> I think there is a reasonable case to be made, however, that we should
> maintain a stable device configuration for an instance after its
> initial boot attempt. Arbitrarily changing hardware config on every
> reboot is being gratuitously nasty to guest admins. The example about
> causing Windows to require license reactivation is, on its own, enough
> of a reason to ensure stable hardware once initial provisioning is
> done.
>
>
>> This isn't all we've thrown away, though. Libvirt also gave us a default
>> machine type. When we create a new domain we'll get a new default machine
>> type. If libvirt has been upgraded, e.g. during host maintenance, this isn't
>> necessarily what it was before. Again, this can make our guests sad. Same
>> goes for CPU model, default devices, and probably many more things I
>> haven't thought of.
>
> Yes indeed.
>
>> Also... we lost the storage configuration of the guest: the information I
>> propose to persist in persistent instance storage metadata.
>>
>> We could store all of this information in Nova, but with the possible
>> exception of storage metadata it really isn't at the level of 'management':
>> it's the minutiae of the hypervisor. In order to persist all of these things
>> in Nova we'd have to implement them explicitly, and when libvirt/kvm grows
>> more stuff we'll have to do that too. We'll need to mirror the
>> functionality of libvirt in Nova, feature for feature. This is a red flag
>> for me, and I think it means we should switch to libvirt being canonical.
>>
>> I think we should be able to create a domain, but once created we should
>> never redefine a domain. We can add and remove devices dynamically
>> using libvirt's apis, secure in the knowledge that libvirt will persist
>> this for us. When we upgrade the host, libvirt can ensure we don't break
>> guests which are on it. Evacuate should be pretty much the only reason to
>> start again.
>
> And in fact we do persist the guest XML with libvirt already. We sadly
> never use that info though - we just blindly overwrite it every time
> with newly generated XML.
>
> Fixing this should not be technically difficult for the most part.
>
>> I raised this in the live migration sub-team meeting, and the immediate
>> response was understandably conservative. I think this solves more problems
>> than it creates, though, and it would result in Nova's libvirt driver
>> getting a bit smaller and a bit simpler. That's a big win in my book.
>
> I don't think it'll get significantly smaller/simpler, but it will
> definitely be more intelligent and user friendly to do this IMHO.
> As mentioned above, I think the Windows license reactivation issue
> alone is enough of a reason to do this.
>
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



-- 
--
Artom Lifshitz
