Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

Jay Pipes Thu, 31 May 2018 08:01:39 -0700

On 05/31/2018 05:10 AM, Sylvain Bauza wrote:

After considering the whole approach, discussing with a couple of folks over IRC, here is what I feel the best approach for a seamless upgrade :
 - VGPU inventory will be kept on root RP (for the first type) in Queens so that a compute service upgrade won't impact the DB
 - during Queens, operators can run a DB online migration script (like the ones we currently have in https://github.com/openstack/nova/blob/c2f42b0/nova/cmd/manage.py#L375) that will create a new resource provider for the first type and move the inventory and allocations to it.
 - it's the responsibility of the virt driver code to check whether a child RP with its name being the first type name already exists to know whether to update the inventory against the root RP or the child RP.
Does it work for folks ?

No, sorry, that doesn't work for me. It seems overly complex and fragile, especially considering that VGPUs are not moveable anyway (no support for live migrating them). Same goes for CPU pinning, NUMA topologies, PCI passthrough devices, SR-IOV PF/VFs and all the other "must have" features that have been added to the virt driver over the last 5 years.

My feeling is that we should not attempt to "migrate" any allocations or inventories between root or child providers within a compute node, period.

The virt drivers should simply error out of update_provider_tree() if there are ANY existing VMs on the host AND the virt driver wishes to begin tracking resources with nested providers.
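
To make that check concrete, here is a rough sketch of the guard in plain Python. It is not nova code: FakeProviderTree, HostNotEmpty and the argument names are all made up for illustration, and the real update_provider_tree() takes different arguments. The only point is the condition: refuse to start nested tracking while the host still has VMs.

```python
from dataclasses import dataclass, field


class HostNotEmpty(Exception):
    """Hypothetical error: nested tracking requested on a host with VMs."""


@dataclass
class FakeProviderTree:
    """Toy stand-in for a provider tree (not nova's ProviderTree class)."""
    root: str
    children: dict = field(default_factory=dict)  # child name -> inventory


def update_provider_tree(tree, running_vms, nested_inventories):
    """Create child providers, but only if the host has no running VMs.

    All three arguments are assumptions made for this sketch.
    """
    # "Beginning" nested tracking means the driver wants child providers
    # and the tree does not have any yet.
    beginning_nested = bool(nested_inventories) and not tree.children
    if beginning_nested and running_vms:
        raise HostNotEmpty(
            "%d VM(s) still on %s; empty the host before enabling "
            "nested providers" % (len(running_vms), tree.root))
    for child_name, inventory in nested_inventories.items():
        tree.children[child_name] = dict(inventory)
    return tree
```

With this shape, a compute node that restarts with VMs still on it fails loudly and leaves the tree untouched, and an emptied node simply gets its child providers and inventories.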


The upgrade operation should look like this:

1) Upgrade placement
2) Upgrade nova-scheduler
3) start loop on compute nodes. for each compute node:
 3a) disable nova-compute service on node (to take it out of scheduling)
 3b) evacuate all existing VMs off of node
 3c) upgrade compute node (on restart, the compute node will see no
     VMs running on the node and will construct the provider tree inside
     update_provider_tree() with an appropriate set of child providers
     and inventories on those child providers)
 3d) enable nova-compute service on node

Which is virtually identical to the "normal" upgrade process whenever there are significant changes to the compute node -- such as upgrading libvirt or the kernel. Nested resource tracking is another such significant change and should be dealt with in a similar way, IMHO.
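
For what it's worth, the ordering in steps 1 through 3d can be written down as a small driver loop. This is only a sketch: the cloud object and every method name on it are hypothetical placeholders for whatever deployment tooling is in use, and RecordingCloud exists only so the ordering can be checked without touching a real deployment.

```python
def rolling_upgrade(cloud, compute_nodes):
    """Run the upgrade in the order of steps 1) to 3d).

    `cloud` is a hypothetical object supplied by the operator's tooling.
    """
    cloud.upgrade_placement()                # 1)
    cloud.upgrade_scheduler()                # 2)
    for node in compute_nodes:               # 3)
        cloud.disable_compute_service(node)  # 3a) take it out of scheduling
        cloud.evacuate_all(node)             # 3b) move existing VMs off
        cloud.upgrade_compute(node)          # 3c) restart builds the tree
        cloud.enable_compute_service(node)   # 3d)


class RecordingCloud:
    """Test double that records each call instead of doing anything."""

    def __init__(self):
        self.log = []

    def __getattr__(self, name):
        # Only reached for names that are not real attributes, i.e. the
        # hypothetical operations called by rolling_upgrade().
        def record(*args):
            self.log.append((name,) + args)
        return record
```

Each node is fully re-enabled before the next one is disabled, so only one compute node is out of scheduling at any time.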


Best,
-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
