I’m glad to see this topic getting some focus once again.  :-)

From several of the administrators I talk with, when they think of putting a 
host into maintenance mode, the common requests I hear are:

1. Don’t schedule more VMs to the host
2. Provide an optional way to automatically migrate all (usually active) VMs 
off the host so that users’ workloads remain “unaffected” by the maintenance.

#1 can easily be achieved, as has been mentioned several times, by simply 
disabling the compute service.  #2 involves a little more work, although it is 
certainly possible using the operations nova provides today (e.g., live 
migration).  I believe these discussions have come up repeatedly over the past 
several OpenStack releases, certainly since Grizzly (i.e., when I started 
watching this space).

It seems that the general direction is to have the type of workflow needed for 
#2 outside of nova (which is certainly a valid stance).  To that end, it would 
be fairly straightforward to build some code that logically sits on top of 
nova and that, when a host enters maintenance:

1. Prevents VMs from being scheduled to the host;
2. Maintains state about the maintenance operation (e.g., not in maintenance, 
migrations in progress, in maintenance, or error);
3. Provides mechanisms, upon entering maintenance, to dictate which VMs 
(active, all, none) to migrate, and provides some throttling capabilities to 
prevent hundreds of parallel migrations on densely packed hosts (all done via a 
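
As a rough illustration of the three steps above, here is a minimal Python sketch of such a layer. Everything in it is an assumption made for illustration: `FakeCompute` is a hypothetical in-memory stand-in rather than a real nova client, and the names `HostMaintenance`, `disable_service`, `list_vms`, `live_migrate`, and `max_parallel` are invented here. A real tool would swap the fake for whatever calls the deployment uses to disable the compute service and live-migrate instances.

```python
import enum
from concurrent.futures import ThreadPoolExecutor


class MaintenanceState(enum.Enum):
    NOT_IN_MAINTENANCE = "not in maintenance"
    MIGRATING = "migrations in progress"
    IN_MAINTENANCE = "in maintenance"
    ERROR = "error"


class FakeCompute:
    """Hypothetical in-memory stand-in for nova; not a real client API."""

    def __init__(self, vms):
        self.vms = dict(vms)   # VM name -> "active" or "stopped"
        self.enabled = True    # whether the scheduler may use the host
        self.migrated = []     # VM names that were asked to migrate

    def disable_service(self, host):
        self.enabled = False

    def list_vms(self, host):
        return dict(self.vms)

    def live_migrate(self, vm):
        self.migrated.append(vm)


class HostMaintenance:
    def __init__(self, compute, host, max_parallel=2):
        self.compute = compute
        self.host = host
        self.max_parallel = max_parallel  # throttle: migrations in flight
        self.state = MaintenanceState.NOT_IN_MAINTENANCE

    def enter(self, policy="active"):
        """Drain the host according to policy: 'active', 'all', or 'none'."""
        if policy not in ("active", "all", "none"):
            raise ValueError("unknown policy: %s" % policy)
        # Step 1: keep new VMs off the host.
        self.compute.disable_service(self.host)
        vms = self.compute.list_vms(self.host)
        if policy == "none":
            targets = []
        elif policy == "all":
            targets = sorted(vms)
        else:
            targets = sorted(n for n, s in vms.items() if s == "active")
        # Step 2: record progress so callers can inspect it.
        self.state = MaintenanceState.MIGRATING
        try:
            # Step 3: the pool size caps how many migrations run at once.
            with ThreadPoolExecutor(max_workers=self.max_parallel) as pool:
                list(pool.map(self.compute.live_migrate, targets))
        except Exception:
            self.state = MaintenanceState.ERROR
        else:
            self.state = MaintenanceState.IN_MAINTENANCE
        return self.state
```

With a fake host holding two active VMs and one stopped VM, `HostMaintenance(compute, "host1").enter("active")` would disable the host, request migration of the two active VMs with at most two in flight, and leave the state at `IN_MAINTENANCE`. A failed migration leaves it at `ERROR` instead.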

If anyone has additional questions, comments, or would like to discuss some 
options, please let me know.  If interested, upon request, I could even share a 
video of how such cases might work.  :-)  My colleagues and I have given these 
use cases a lot of thought and consideration and I’d love to talk more about 
them (perhaps a small session in Paris would be possible).

- Joe

On Oct 17, 2014, at 4:18 AM, John Garbutt <j...@johngarbutt.com> wrote:

> On 17 October 2014 02:28, Matt Riedemann <mrie...@linux.vnet.ibm.com> wrote:
>> On 10/16/2014 7:26 PM, Christopher Aedo wrote:
>>> On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov
>>> <mscherba...@mirantis.com> wrote:
>>>>> On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum <cl...@fewbar.com> wrote:
>>>> The idea is not to simply deny or hang requests from clients, but to
>>>> give them "we are in maintenance mode, retry in X seconds"
>>>>> You probably would want 'nova host-servers-migrate <host>'
>>>> yeah for migrations - but as far as I understand, it doesn't help with
>>>> disabling this host in scheduler - there can be a chance that some
>>>> workloads will be scheduled to the host.
>>> Regarding putting a compute host in maintenance mode using "nova
>>> host-update --maintenance enable", it looks like the blueprint and
>>> associated commits were abandoned a year and a half ago:
>>> https://blueprints.launchpad.net/nova/+spec/host-maintenance
>>> It seems that "nova service-disable <host> nova-compute" effectively
>>> prevents the scheduler from trying to send new work there.  Is this
>>> the best approach to use right now if you want to pull a compute host
>>> out of an environment before migrating VMs off?
>>> I agree with Tim and Mike that having something respond "down for
>>> maintenance" rather than ignore or hang would be really valuable.  But
>>> it also looks like that hasn't gotten much traction in the past -
>>> anyone feel like they'd be in support of reviving the notion of
>>> "maintenance mode"?
>>> -Christopher
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> host-maintenance-mode is definitely a thing in nova compute via the os-hosts
>> API extension and the --maintenance parameter, the compute manager code is
>> here [1].  The thing is the only in-tree virt driver that implements it is
>> xenapi, and I believe when you put the host in maintenance mode it's
>> supposed to automatically evacuate the instances to some other host, but you
>> can't target the other host or tell the driver, from the API, which
>> instances you want to evacuate, e.g. all, none, running only, etc.
>> [1]
>> http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990
> We should certainly make that more generic. It doesn't update the VM
> state, so it's really only admin focused in its current form.
> The XenAPI logic only works when using XenServer pools with shared NFS
> storage, if my memory serves me correctly. Honestly, it's a bit of code
> I have planned on removing, along with the rest of the pool support.
> In terms of requiring DB downtime in Nova, the current efforts are
> focusing on avoiding downtime altogether, via expand/contract style
> migrations, with a little help from objects to avoid data migrations.
> That doesn't mean maintenance mode is not useful for other things,
> like an emergency patching of the hypervisor.
> John
