mlsorensen opened a new issue, #6746:
URL: https://github.com/apache/cloudstack/issues/6746
<!--
Verify first that your issue/request is not already reported on GitHub.
Also test if the latest release and main branch are affected too.
Always add information AFTER of these HTML comments, but no need to delete
the comments.
-->
##### ISSUE TYPE
* Enhancement Request
##### COMPONENT NAME
~~~
Storage
~~~
##### CLOUDSTACK VERSION
~~~
Any
~~~
##### SUMMARY
Unmapping volumes from a hypervisor host upon VM stop or migration is done
on a best-effort basis. The VM is already stopped, or already migrated, we try
to unmap, but if something goes wrong there is really no recourse or retry and
a warning is logged. This leaves a potential of leaking maps to hosts over
time.
In code review I've also found edge cases where a VM is moved to "Stopped"
state without necessarily cleaning up network or volume resources, these can
also lead to leaked maps over time. Examples are force removing a hypervisor
host with running VMs on it, and possibly any other code that just calls
`vm.setState(State.Stopped)`.
My request is that we be more thorough during VM start in ensuring that our
target host and *only* our target host has access to the volume. Or at least
call the storage plugin involved to let it decide how to do this. It should be
as simple as calling the storage service to "revoke all" just before we grant
access, or allowing for an exclusive grant in the storage API.
For example, with the PowerFlex/ScaleIO storage client there is an
`unmapVolumeFromAllSdcs` that could be called just prior to granting access to
volumes during VM start.
We may need to add a `revokeAllAccess()` method to the
`PrimaryDataStoreDriver`, or add a flag to the existing `revokeAccess` to
indicate that the storage driver should revoke all.
Or alternatively (I think I like this better), the `grantAccess()` call
might gain a flag `boolean exclusive` so the driver can ensure that only one
mapping exists - the one requested. This would be cleaner.
Crucially - we need to avoid exclusive access during the live migration
workflows. It seems safe to ensure exclusive access during VM start, however.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]