Thanks for starting this discussion. There is a lot to cover/answer.

On Tue, Apr 11, 2017 at 6:35 PM, Matt Riedemann <[email protected]> wrote:
>
> This is not discoverable at the moment, for the end user or cinder, so I'm
> trying to figure out what the failure mode looks like.
>
> This all starts on the cinder side to extend the size of the attached
> volume. Cinder is going to have to see if Nova is new enough to handle this
> (via the available API versions) before accepting the request and resizing
> the volume. Then Cinder sends the event to Nova. This is where it gets
> interesting.
>
> On the Nova side, if all of the computes aren't new enough, we could just
> fail the request outright with a 409. What does Cinder do then? Rollback the
> volume resize?

This means an extend volume operation would need to check for Nova support
first. It also means adding a new API call to fetch and discover such
capabilities per instance (from the associated compute node). If we want to
catch volume size extension errors on the Nova side, we will need to find
another way, since external events are async.
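To make that first point concrete, here is a minimal sketch of the kind of
check Cinder could do before accepting the request, assuming we only key off
Nova's advertised max microversion. The endpoint URL and the 2.51 threshold
are placeholders, and the per-instance capability call I mentioned above does
not exist yet, so it is not shown:

    # Illustrative only: Cinder probing Nova's advertised microversion before
    # accepting an online extend. The "2.51" threshold is a placeholder for
    # whatever microversion would end up advertising support.
    import requests

    NOVA_ENDPOINT = 'http://controller:8774/v2.1/'  # assumed endpoint URL
    REQUIRED_MICROVERSION = (2, 51)                 # hypothetical threshold

    def nova_supports_attached_extend(headers=None):
        """Return True if Nova's max microversion meets the threshold."""
        resp = requests.get(NOVA_ENDPOINT, headers=headers or {})
        resp.raise_for_status()
        max_version = resp.json()['version']['version']  # e.g. "2.53"
        major, minor = (int(part) for part in max_version.split('.'))
        return (major, minor) >= REQUIRED_MICROVERSION

Even if that check passes, it only tells us the API layer understands the
operation, not that the instance's compute node or virt driver can actually
do it, which is exactly the gap below.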
> But let's say the computes are new enough, but the instance is on a compute
> that does not support the operation. Then what? Do we register an instance
> fault and put the instance into ERROR state? Then the admin would need to
> intervene.
>
> Are there other ideas? Until we have capabilities (info) exposed out of the
> API we're stuck with questions like this.
>

Like TommyLike mentioned in a review, AWS introduced live volume
modifications, available on some instance types. On instance types with
limited support, you need to stop/start the instance or detach/attach the
volume. On instances started before a certain date, you need to stop/start
the instance or detach/attach the volume at least once. In all cases, the end
user needs to extend the partition/filesystem inside the instance. They have
the luxury of fully controlling the environment and synchronizing the compute
service with the volume service, possibly even (speculatively) with
bidirectional orchestration/synchronization/communication between the two.

I have that same luxury since I only support one volume backend and virt
driver combination. But I am now starting to grasp the extent of what adding
such a feature requires, especially when it implies cross-service support...
We have a matrix of compute drivers and volume backends to support, and some
combinations might never support online volume extension. There is a desire
for OpenStack clouds to be interoperable, so there is a strong incentive to
make this work for all combinations.

I will still take the liberty to ask: would it be in the realm of possibility
for a deployer to have to explicitly enable this feature? A deployer would
turn it on once all of the services/components they chose to deploy fully
support online volume extension (see the rough sketch at the end of this
mail). I know that won't address cases where a mix of volume backends and
virt drivers is deployed, so we would still need capabilities
discoverability. This includes volume type capabilities discoverability,
which I'm not sure exists today.

Let's not even start on how Horizon will discover such capabilities per
instance/volume. That's another can of worms. =)
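To illustrate what I mean by an explicit switch, something along these lines.
The option name, its default and its location (I put it on the Cinder side
here) are all made up for the sake of discussion:

    # Purely illustrative: an operator-facing opt-in for online extend.
    from oslo_config import cfg

    extend_opts = [
        cfg.BoolOpt('allow_online_volume_extend',
                    default=False,
                    help='Allow extending a volume while it is attached. '
                         'Only enable this when every deployed virt driver '
                         'and volume backend supports online extension.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(extend_opts)

    # The API layer could then refuse online extends up front when the
    # deployer has not opted in, e.g. reject the request if the volume is
    # 'in-use' and CONF.allow_online_volume_extend is False.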
--
Mathieu

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev