There seem to be a few discussions going on here with respect to detaches. One
is what to do on the Nova side about calling os-brick's
disconnect_volume, and another is when (or whether) to call Cinder's
terminate_connection and detach.
My original post was simply to discuss a mechanism to try and figure out
the first problem: when should Nova call os-brick to remove
the local volume, prior to calling Cinder to do anything?
Nova needs to know if it's safe to call disconnect_volume or not. Cinder
already tracks each attachment, and it can return the connection_info
for each attachment with a call to initialize_connection. If two of
those connection_info dicts are the same, it's a shared volume/target,
and Nova shouldn't call disconnect_volume while any other attachment
using that target is left on the host.
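Roughly, the check could look like the sketch below. This is only an
illustration: the client handle, the attachment fields, and the helper
name are assumptions, not existing Nova/Cinder APIs.

def volume_uses_shared_target(cinderclient, volume, connector,
                              local_instance_uuids):
    # Attachments of this volume that live on our compute host;
    # assumes Nova knows the UUIDs of its own instances.
    local = [a for a in volume.attachments
             if a['instance_uuid'] in local_instance_uuids]
    if len(local) < 2:
        return False  # nothing to share a target with
    # Two initialize_connection calls suffice: if two attachments on
    # the same host come back with identical connection_info, the
    # backend exports one target per host, not one per attachment.
    info_a = cinderclient.volumes.initialize_connection(volume, connector)
    info_b = cinderclient.volumes.initialize_connection(volume, connector)
    return info_a == info_b

If that returns True, Nova would hold off on disconnect_volume until the
last attachment on the host goes away.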
On the Cinder side of things, when terminate_connection or detach is called,
the volume manager can find the list of attachments for a volume and
compare that to the attachments on a host. The problem is, Cinder
doesn't track the host along with the instance_uuid in the attachments
table. I plan on allowing that as an API change after microversions
lands, so we know how many times a volume is attached/used on a
particular host. The driver can then decide what to do with it at
terminate_connection/detach time. This helps account for
the differences between the Cinder backends, which we will never get
all aligned to the same model. Each array/backend handles attachments
differently, and only the driver knows if it's safe to remove the target or
not, depending on how many attachments/usages the volume has
on the host itself. This amounts to a reference counter, but we don't
need a separate one, because we have the count in the attachments table
once we allow setting the host and the instance_uuid at the same time.
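As a sketch of what that could look like inside the volume manager once
the host is recorded (all names here are illustrative, not the actual
manager or DB API):

def _attachments_on_host(db, context, volume_id, host):
    # Read the "reference count" straight from the attachments table;
    # assumes attachment rows now record attached_host.
    attachments = db.volume_attachment_get_all(context, volume_id)
    return [a for a in attachments if a['attached_host'] == host]

def terminate_connection(self, context, volume, connector):
    remaining = _attachments_on_host(self.db, context,
                                     volume['id'], connector['host'])
    # Only let the driver tear the target down when this is the last
    # attachment/usage of the volume on that host.
    if len(remaining) <= 1:
        self.driver.terminate_connection(volume, connector)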
Walt
On Tue, Feb 09, 2016 at 11:49:33AM -0800, Walter A. Boring IV wrote:
Hey folks,
One of the challenges we have faced with the ability to attach a single
volume to multiple instances is how to correctly detach that volume. The
issue is a bit complex, but I'll try and explain the problem, and then
describe one approach to solving one part of the detach puzzle.
Problem:
When a volume is attached to multiple instances on the same host, there
are two scenarios:
1) Some Cinder drivers export a new target for every attachment on a
compute host. This means that you will get a new unique volume path on a
host, which is then handed off to the VM instance.
2) Other Cinder drivers export a single target for all instances on a
compute host. This means that every instance on a single host will reuse
the same host volume path.
This problem isn't actually new. It is a problem we already have in Nova
even with single attachments per volume. e.g. with NFS and SMBFS there
is a single mount set up on the host, which can serve up multiple volumes.
We have to avoid unmounting that until no VM is using any volume provided
by that mount point. Except we pretend the problem doesn't exist and just
try to unmount every single time a VM stops, relying on the kernel
failing umount() with EBUSY. Except this has a race condition if one VM
is stopping right as another VM is starting.
There is a patch up to try to solve this for SMBFS:
https://review.openstack.org/#/c/187619/
but I don't much like it, because it only solves the problem for one
driver.
I think we need a general solution that solves the problem for all
cases, including multi-attach.
AFAICT, the only real answer here is to have nova record more info
about volume attachments, so it can reliably decide when it is safe
to release a connection on the host.
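For illustration, the kind of tracking that implies might look like the
following (purely a sketch with made-up names, not actual Nova code):

import collections
import threading

_lock = threading.Lock()
_volumes_per_mount = collections.defaultdict(set)

def connect_volume(mount_point, volume_id, do_mount):
    with _lock:
        if not _volumes_per_mount[mount_point]:
            do_mount(mount_point)       # first user mounts
        _volumes_per_mount[mount_point].add(volume_id)

def disconnect_volume(mount_point, volume_id, do_umount):
    with _lock:
        _volumes_per_mount[mount_point].discard(volume_id)
        if not _volumes_per_mount[mount_point]:
            do_umount(mount_point)      # last user unmounts

Holding the lock across both the check and the (u)mount is what closes
the stop-while-starting race described above.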
Proposed solution:
Nova needs to determine if the volume that's being detached is a shared or
non-shared volume. Here is one way to determine that.
Every Cinder volume has a list of its attachments. Those attachments
contain the instance_uuid that the volume is attached to. I presume
Nova can find which of the volume attachments are on the same host. Then
Nova can call Cinder's initialize_connection for each of those attachments
to get the target's connection_info dictionary. This connection_info
dictionary describes how to connect to the target on the Cinder backend. If
the target is shared, then the connection_info dicts for each
attachment on that host will be identical. Nova would then know that it's a
shared target, and only call os-brick's disconnect_volume if it's the
last attachment on that host. I think at most two calls to Cinder's
initialize_connection would suffice to determine if the volume is a shared
target. This would only need to be done if the volume is multi-attach
capable and there is more than one attachment on the same host where the
detach is happening.
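To make "identical connection_info dicts" concrete, here is what the
comparison might see for an iSCSI backend (all values made up for
illustration):

shared_a = {'driver_volume_type': 'iscsi',
            'data': {'target_iqn': 'iqn.2010-10.org.example:vol-1',
                     'target_portal': '192.168.0.10:3260',
                     'target_lun': 1}}
shared_b = dict(shared_a)       # 2nd attachment, same exported target
assert shared_a == shared_b     # shared target: defer disconnect_volume

per_attach = {'driver_volume_type': 'iscsi',
              'data': {'target_iqn': 'iqn.2010-10.org.example:vol-1',
                       'target_portal': '192.168.0.10:3260',
                       'target_lun': 2}}  # new LUN per attachment
assert shared_a != per_attach   # unique path: disconnect right away

A scenario 1 backend differs in target_lun (or target_iqn) per
attachment; a scenario 2 backend returns the same dict every time.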
As above, we need to solve this more generally than just multi-attach;
even single-attach is flawed today.
Regards,
Daniel