Re: [openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-15 Thread Ildikó Váncsa
Hi All,

I wonder whether we could provide an interface on the API where these kinds of
capabilities can be retrieved? I know we have a support matrix in the
documentation, which is good to have. I asked the question because here we have
a base functionality, which is attaching a volume that Cinder exports. The
multiattach feature is an extension to this, which is provided by Cinder and we
wire it into Nova to provide this functionality for the instances. There is no
question that the API behavior will change with this, but it is more a matter
of which components allow the multiple attachments. In this sense the API
microversion does not provide much information on its own; it can only say
that if you have all the right drivers set up in your environment, then you can
use it. But do we have a way to check this?

Also, in order to be able to introduce multiattach in the N cycle there are two
patches that we have to land for Mitaka [1] [2]. [1] prepares the detach
mechanism to send all the information to Cinder in order to identify the right
attachment. This means passing the attachment_id to Cinder. In case of an
upgrade, when we can have old and new components in the system, it is important
that if a new component attaches a volume for the second time then the detach
called on the old one can be executed properly. [2] is needed for Cinder as
some of the drivers need the host information for tracking the attachments of
the same volume on the same host properly.

> -Original Message-
> From: Matt Riedemann [mailto:mrie...@linux.vnet.ibm.com]
> Sent: January 14, 2016 17:11
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [nova][cinder] How will nova advertise that 
> volume multi-attach is supported?
> 
> 
> 
> On 1/14/2016 9:42 AM, Dan Smith wrote:
> >> It is however not ideal when a deployment is set up such that
> >> multiattach will always fail because a hypervisor is in use which
> >> doesn't support it.  An immediate solution would be to add a policy
> >> so a deployer could disallow it that way which would provide
> >> immediate feedback to a user that they can't do it.  A longer term
> >> solution would be to add capabilities to flavors and have flavors act
> >> as a proxy between the user and various hypervisor capabilities
> >> available in the deployment.  Or we can focus on providing better
> >> async feedback through instance-actions, and other discussed async api 
> >> changes.
> >
> > Presumably a deployer doesn't enable volumes to be set as multi-attach
> > on the cinder side if their nova doesn't support it at all, right? I
> > would expect that is the gating policy element for something global.
> 
> There is no policy in cinder to disallow creating multiattach-able volumes
> [1]. It's just a property on the volume and somewhere in cinder the volume
> drivers support the capability or not.
> 
> From a very quick look at the cinder code, the scheduler has a capabilities
> filter for multiattach so if you try to create a multiattach volume and don't
> have any hosts (volume backends) that support that, you'd fail to create the
> volume with NoValidHost.
> 
> But lvm supports it, so if you have an lvm backend you can create the
> multiattach volume, but that doesn't mean you can use it in nova. So it seems
> like you'd also need the same kind of capabilities filter in the nova
> scheduler for this and that capability from the compute host would come from
> the virt driver, of which only libvirt is going to support it at first.

Do I understand correctly that you mean that we would specify at VM creation 
time that it should go to a compute host where the hypervisor supports 
multiattach?

Thanks,
/Ildikó

[1] https://review.openstack.org/#/c/193134/ 
[2] https://review.openstack.org/#/c/256273/ 

> 
> >
> > Now, if multiple novas share a common cinder, then I guess it gets a
> > little more confusing...
> >
> > --Dan
> >
> > __
> >  OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe:
> > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> 
> [1]
> https://github.com/openstack/cinder/blob/master/cinder/api/v2/volumes.py#L407
> 
> --
> 
> Thanks,
> 
> Matt Riedemann
> 
> 



Re: [openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-14 Thread Matt Riedemann



On 1/14/2016 9:42 AM, Dan Smith wrote:

It is however not ideal when a deployment is set up such that
multiattach will always fail because a hypervisor is in use which
doesn't support it.  An immediate solution would be to add a policy so a
deployer could disallow it that way which would provide immediate
feedback to a user that they can't do it.  A longer term solution would
be to add capabilities to flavors and have flavors act as a proxy
between the user and various hypervisor capabilities available in the
deployment.  Or we can focus on providing better async feedback through
instance-actions, and other discussed async api changes.


Presumably a deployer doesn't enable volumes to be set as multi-attach
on the cinder side if their nova doesn't support it at all, right? I
would expect that is the gating policy element for something global.


There is no policy in cinder to disallow creating multiattach-able 
volumes [1]. It's just a property on the volume and somewhere in cinder 
the volume drivers support the capability or not.


From a very quick look at the cinder code, the scheduler has a 
capabilities filter for multiattach so if you try to create a 
multiattach volume and don't have any hosts (volume backends) that 
support that, you'd fail to create the volume with NoValidHost.


But lvm supports it, so if you have an lvm backend you can create the 
multiattach volume, but that doesn't mean you can use it in nova. So it 
seems like you'd also need the same kind of capabilities filter in the 
nova scheduler for this and that capability from the compute host would 
come from the virt driver, of which only libvirt is going to support it 
at first.
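
A minimal sketch of what such a capabilities filter could look like, in plain Python. The `supports_multiattach` key, the host dictionaries and the `select_host` helper are assumptions made up for illustration; they are not nova's scheduler API.

```python
class NoValidHost(Exception):
    """Stand-in for the scheduler error raised when no host qualifies."""


def filter_hosts(hosts, needs_multiattach):
    """Keep only hosts whose virt driver reports the needed capability.

    Each host is a dict with a 'capabilities' dict; the
    'supports_multiattach' key is a hypothetical name.
    """
    if not needs_multiattach:
        return list(hosts)
    return [
        host for host in hosts
        if host["capabilities"].get("supports_multiattach", False)
    ]


def select_host(hosts, needs_multiattach):
    """Pick the first host that passes the filter, or raise NoValidHost."""
    candidates = filter_hosts(hosts, needs_multiattach)
    if not candidates:
        raise NoValidHost("no compute host supports multiattach")
    return candidates[0]


hosts = [
    {"name": "compute-other", "capabilities": {}},
    {"name": "compute-libvirt",
     "capabilities": {"supports_multiattach": True}},
]
chosen = select_host(hosts, needs_multiattach=True)
```

With only the first host in the list, the same call would raise `NoValidHost`, which matches the failure mode described for cinder volume creation above.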




Now, if multiple novas share a common cinder, then I guess it gets a
little more confusing...

--Dan




[1] 
https://github.com/openstack/cinder/blob/master/cinder/api/v2/volumes.py#L407


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-14 Thread Matt Riedemann



On 1/14/2016 9:42 AM, Dan Smith wrote:

It is however not ideal when a deployment is set up such that
multiattach will always fail because a hypervisor is in use which
doesn't support it.  An immediate solution would be to add a policy so a
deployer could disallow it that way which would provide immediate
feedback to a user that they can't do it.  A longer term solution would
be to add capabilities to flavors and have flavors act as a proxy
between the user and various hypervisor capabilities available in the
deployment.  Or we can focus on providing better async feedback through
instance-actions, and other discussed async api changes.


Presumably a deployer doesn't enable volumes to be set as multi-attach
on the cinder side if their nova doesn't support it at all, right? I
would expect that is the gating policy element for something global.


Is there a policy in cinder for that though? /me looks



Now, if multiple novas share a common cinder, then I guess it gets a
little more confusing...

--Dan




--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-14 Thread Andrew Laski


On Wed, Jan 13, 2016, at 07:25 PM, Dan Smith wrote:
> > While I don't think it's strictly required by the api change guidelines [3]
> > I think the API interactions and behavior here feel different enough to
> > warrant having a microversion. Ideally there should have been some
> > versioning in the cinder api around the multiattach support but that ship
> > has already sailed. Treating the volume attach case in nova as the
> > traditional single attach case by default and having to specify a new
> > microversion to enable using multiple attach will at least make it more
> > explicit to users which I think is a good thing.
> 
> Right, I think the client explicitly saying "I know that there is this
> new thing called multi-attach" or "I should know but I didn't read the
> docs and irresponsibly claim to support this version anyway" is an
> important thing to have. While it doesn't (AFAIK) fall under the
> guidelines for signalling a change as you say, it is a big change
> regardless. There could certainly be clients that have the same
> attachment assumptions as nova currently has.
> 
> The problem is that we can't honor the pre-microversion semantics to
> older clients. Meaning, a client that claims to know nothing about
> multi-attach is going to make the assumptions it was making anyway, and
> we can't un-ring the bell for that client.
> 
> Still, I think it's useful to signal this change if for no other reason
> than it will hopefully catch the attention of careful client authors as
> they bump their maximum supported version declaration.
> 
> > I'm probably overlooking something major but shouldn't nova know if the
> > virt driver supports multiattach? If there are no computes with a
> > compatible setup why not just return an error and not even attempt the
> > cast? I'm guessing all the necessary info isn't in the DB which means there
> > isn't a way to check this up front.
> 
> We don't have that information, and as you hint above, we can have
> multiple virt drivers with varying levels of support in a single
> deployment. However, the inevitable result of "No Valid Host" is a
> little more correct in the case of the virt driver support situation.
> You asked us to do a thing, which was reasonable and supported by nova
> but ... during scheduling we failed to find any computes willing to
> honor the request. That could have been different ten minutes ago, and
> could certainly be different an hour from now. That fits NoValidHost
> properly I think.
> 
> If you've been told by cinder that your volume supports multi-attach,
> and nova is new enough to claim it supports it, returning 400 seems
> unfair and confusing to the user -- the operation should be valid.
> 
> So in summary:
> 
> - I think a microversion is not specifically required, but useful
> - I think a config or dynamic flag to change the API behavior is wrong
> - NoValidHost when no available hypervisors support it seems appropriate

I think NoValidHost is appropriate for now as well.

It is however not ideal when a deployment is set up such that
multiattach will always fail because a hypervisor is in use which
doesn't support it.  An immediate solution would be to add a policy so a
deployer could disallow it that way which would provide immediate
feedback to a user that they can't do it.  A longer term solution would
be to add capabilities to flavors and have flavors act as a proxy
between the user and various hypervisor capabilities available in the
deployment.  Or we can focus on providing better async feedback through
instance-actions, and other discussed async api changes.



> 
> --Dan
> 



Re: [openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-13 Thread Matthew Treinish
On Wed, Jan 13, 2016 at 04:52:37PM -0600, Matt Riedemann wrote:
> tl;dr - do we need a REST API microversion for multi-attach support in nova?

Yes, I think so

> 
> The details:
> 
> The volume multi-attach series in nova, starting here [1], has run into an
> upgrade problem.
> 
> Basically, there is code in liberty which doesn't pass an instance uuid and
> volume id to the block_device_mappings table query to uniquely identify BDMs
> for operations like detach, swap volume and live migration. This is a
> problem when we are trying to introduce support for volume multiattach
> because today nova considers volume/instance as a 1:1 relationship, but this
> introduces a 1:M relationship (oh, and the nova bdm table doesn't have any
> unique constraints enforcing data integrity - oops!).
> 
> So in Mitaka we are for sure going to land the code that makes sure that the
> operations which have a volume and instance get a BDM based on those keys.
> The only one that doesn't is assisted-snapshots, which is a REST API
> limitation, and we're going to block that case from multiattach support
> (when it lands). Dan Smith is starting that cleanup work here [2].
> 
> Ildiko has been asking how multiattach could be merged in nova in mitaka and
> actually supported in the REST API, DB API, compute and virt driver layers.
> Because of this query issue, and rolling upgrades, we could have a mitaka
> API talking to liberty computes. If we allow sending a multiattach volume to
> a liberty compute that already has attachments, that will fail.
> 
> We talked about adding a check in the API layer to see if there are any
> liberty computes running and if not, then we could allow the 2nd attach on a
> multi-attach volume, but that is racy and not fail-safe, which could mean a
> user could get a multi-attach request to pass in one case but fail in
> another (it's a problem related to how the service versions are cached
> in-memory per API worker and isn't something that can be resolved before
> mitaka release).
> 
> Ildiko was asking about adding a configuration option to the compute API
> which basically toggles the multi-attach feature. The thinking is that would
> default to False in mitaka and an operator would flip it to True once all
> the computes are upgraded to mitaka. We don't normally feature-toggle like
> this though, so it's not an attractive option really.
> 
> A lot of this also falls down in the aspect of how a user actually knows
> when the compute endpoint they are talking to supports multi-attach. If
> you're trying to attach a volume to a 2nd instance after it's already
> attached today, you'll get a 400 response.
> 
> If we did the config option idea and it was False, you'd get a 400 response
> for that also, even if you were running mitaka-level compute API (but still
> had liberty computes).
> 
> Normally we advertise capability in the API via microversions. So I'm
> thinking what we need is basically a microversion in the volume attach REST
> API which says:
> 
> 1. if you have a multiattach=True volume
> 2. that is already attached to instance 1
> 3. and you're trying to attach it to instance 2
> 4. that fails *unless* you're requesting a new enough microversion (opt-in
> to the feature).
> 
> That seems clear enough from the client perspective I think.

While I don't think it's strictly required by the api change guidelines [3]
I think the API interactions and behavior here feel different enough to warrant
having a microversion. Ideally there should have been some versioning in the
cinder api around the multiattach support but that ship has already sailed.
Treating the volume attach case in nova as the traditional single attach case
by default and having to specify a new microversion to enable using multiple
attach will at least make it more explicit to users which I think is a good
thing.

> 
> There is another wrinkle, however, which is not unique to this case.
> 
> There are only certain cinder backends (lvm) and nova virt drivers (libvirt)
> that will support multiattach. In the case of nova, you don't know if the
> multiattach will actually be accepted until we've cast off to the compute
> and find out if the virt driver supports it.

I'm probably overlooking something major but shouldn't nova know if the virt
driver supports multiattach? If there are no computes with a compatible setup
why not just return an error and not even attempt the cast? I'm guessing all the
necessary info isn't in the DB which means there isn't a way to check this up
front.

> 
> So even if we have a microversion in the REST API, and all of your computes
> are at >=mitaka, if you're hitting a non-libvirt compute node, it will fail
> (you get an instance fault in the case of volume attach, and a NoValidHost
> in the case of boot from volume).
> 
> I don't think we have a solution to that issue right now. I think John has
> been trying to sort that out somehow with the feature classification effort,
> but I'm unclear on the 

Re: [openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-13 Thread Dan Smith
> While I don't think it's strictly required by the api change guidelines [3]
> I think the API interactions and behavior here feel different enough to
> warrant having a microversion. Ideally there should have been some versioning
> in the cinder api around the multiattach support but that ship has already
> sailed. Treating the volume attach case in nova as the traditional single
> attach case by default and having to specify a new microversion to enable
> using multiple attach will at least make it more explicit to users which I
> think is a good thing.

Right, I think the client explicitly saying "I know that there is this
new thing called multi-attach" or "I should know but I didn't read the
docs and irresponsibly claim to support this version anyway" is an
important thing to have. While it doesn't (AFAIK) fall under the
guidelines for signalling a change as you say, it is a big change
regardless. There could certainly be clients that have the same
attachment assumptions as nova currently has.

The problem is that we can't honor the pre-microversion semantics to
older clients. Meaning, a client that claims to know nothing about
multi-attach is going to make the assumptions it was making anyway, and
we can't un-ring the bell for that client.

Still, I think it's useful to signal this change if for no other reason
than it will hopefully catch the attention of careful client authors as
they bump their maximum supported version declaration.

> I'm probably overlooking something major but shouldn't nova know if the virt
> driver supports multiattach? If there are no computes with a compatible setup
> why not just return an error and not even attempt the cast? I'm guessing all
> the necessary info isn't in the DB which means there isn't a way to check
> this up front.

We don't have that information, and as you hint above, we can have
multiple virt drivers with varying levels of support in a single
deployment. However, the inevitable result of "No Valid Host" is a
little more correct in the case of the virt driver support situation.
You asked us to do a thing, which was reasonable and supported by nova
but ... during scheduling we failed to find any computes willing to
honor the request. That could have been different ten minutes ago, and
could certainly be different an hour from now. That fits NoValidHost
properly I think.

If you've been told by cinder that your volume supports multi-attach,
and nova is new enough to claim it supports it, returning 400 seems
unfair and confusing to the user -- the operation should be valid.

So in summary:

- I think a microversion is not specifically required, but useful
- I think a config or dynamic flag to change the API behavior is wrong
- NoValidHost when no available hypervisors support it seems appropriate

--Dan



[openstack-dev] [nova][cinder] How will nova advertise that volume multi-attach is supported?

2016-01-13 Thread Matt Riedemann

tl;dr - do we need a REST API microversion for multi-attach support in nova?

The details:

The volume multi-attach series in nova, starting here [1], has run into 
an upgrade problem.


Basically, there is code in liberty which doesn't pass an instance uuid 
and volume id to the block_device_mappings table query to uniquely 
identify BDMs for operations like detach, swap volume and live 
migration. This is a problem when we are trying to introduce support for 
volume multiattach because today nova considers volume/instance as a 1:1 
relationship, but this introduces a 1:M relationship (oh, and the nova 
bdm table doesn't have any unique constraints enforcing data integrity - 
oops!).
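
To make the ambiguity concrete, here is a small sketch using in-memory rows in place of the block_device_mappings table. The row layout and both helper functions are illustrative assumptions, not nova's DB API.

```python
# Two rows for the same volume, as a 1:M relationship would produce.
# The column names are assumptions for illustration only.
bdms = [
    {"id": 1, "volume_id": "vol-a", "instance_uuid": "inst-1"},
    {"id": 2, "volume_id": "vol-a", "instance_uuid": "inst-2"},
]


def get_by_volume(rows, volume_id):
    """Volume-only lookup: silently returns the first match."""
    matches = [r for r in rows if r["volume_id"] == volume_id]
    return matches[0] if matches else None


def get_by_volume_and_instance(rows, volume_id, instance_uuid):
    """Lookup keyed on both values, which identifies one attachment."""
    matches = [
        r for r in rows
        if r["volume_id"] == volume_id
        and r["instance_uuid"] == instance_uuid
    ]
    if len(matches) > 1:
        # Nothing in this sketch enforces uniqueness, so check explicitly.
        raise ValueError("duplicate BDM rows for volume/instance pair")
    return matches[0] if matches else None


# A detach for inst-2 that only passes the volume id picks the wrong row.
wrong = get_by_volume(bdms, "vol-a")
right = get_by_volume_and_instance(bdms, "vol-a", "inst-2")
```

Here `wrong` is the row for inst-1 even though the operation was meant for inst-2, which is the kind of mix-up the keyed lookup avoids.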


So in Mitaka we are for sure going to land the code that makes sure that 
the operations which have a volume and instance get a BDM based on those 
keys. The only one that doesn't is assisted-snapshots, which is a REST 
API limitation, and we're going to block that case from multiattach 
support (when it lands). Dan Smith is starting that cleanup work here [2].


Ildiko has been asking how multiattach could be merged in nova in mitaka 
and actually supported in the REST API, DB API, compute and virt driver 
layers. Because of this query issue, and rolling upgrades, we could have 
a mitaka API talking to liberty computes. If we allow sending a 
multiattach volume to a liberty compute that already has attachments, 
that will fail.


We talked about adding a check in the API layer to see if there are any 
liberty computes running and if not, then we could allow the 2nd attach 
on a multi-attach volume, but that is racy and not fail-safe, which 
could mean a user could get a multi-attach request to pass in one case 
but fail in another (it's a problem related to how the service versions 
are cached in-memory per API worker and isn't something that can be 
resolved before mitaka release).
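
A simplified model of why a per-worker cache makes that check inconsistent. The `ApiWorker` class and the version numbers are hypothetical and only mimic the caching behavior described above.

```python
# Hypothetical service version numbers, used only for comparison.
LIBERTY_VERSION = 1
MITAKA_VERSION = 2


class ApiWorker:
    """Simplified API worker that caches the minimum compute version."""

    def __init__(self):
        self._cached_min = None

    def allows_multiattach(self, compute_versions):
        # The minimum is computed once and then reused, so upgrades that
        # happen later are invisible to this worker.
        if self._cached_min is None:
            self._cached_min = min(compute_versions)
        return self._cached_min >= MITAKA_VERSION


# worker_a answers a request while one liberty compute is still running.
computes = [MITAKA_VERSION, LIBERTY_VERSION]
worker_a = ApiWorker()
first_answer = worker_a.allows_multiattach(computes)

# The last liberty compute is upgraded, then worker_b starts fresh.
computes = [MITAKA_VERSION, MITAKA_VERSION]
worker_b = ApiWorker()
answers = (
    worker_a.allows_multiattach(computes),
    worker_b.allows_multiattach(computes),
)
```

The two workers now disagree about the same deployment, so whether a request passes depends on which worker happens to handle it.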


Ildiko was asking about adding a configuration option to the compute API 
which basically toggles the multi-attach feature. The thinking is that 
would default to False in mitaka and an operator would flip it to True 
once all the computes are upgraded to mitaka. We don't normally 
feature-toggle like this though, so it's not an attractive option really.


A lot of this also falls down in the aspect of how a user actually knows 
when the compute endpoint they are talking to supports multi-attach. If 
you're trying to attach a volume to a 2nd instance after it's already 
attached today, you'll get a 400 response.


If we did the config option idea and it was False, you'd get a 400 
response for that also, even if you were running mitaka-level compute 
API (but still had liberty computes).


Normally we advertise capability in the API via microversions. So I'm 
thinking what we need is basically a microversion in the volume attach 
REST API which says:


1. if you have a multiattach=True volume
2. that is already attached to instance 1
3. and you're trying to attach it to instance 2
4. that fails *unless* you're requesting a new enough microversion 
(opt-in to the feature).


That seems clear enough from the client perspective I think.
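
A minimal sketch of those four conditions in Python. The minimum microversion value, the volume dictionary keys and the function names are all assumptions for illustration; no version number was chosen in this thread.

```python
# Hypothetical microversion at which multi-attach becomes opt-in.
MULTIATTACH_MIN_VERSION = (2, 99)


def parse_microversion(value):
    """Turn a string such as '2.12' into a comparable (major, minor) tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def attach_allowed(volume, requested_version):
    """Return True if the attach may proceed, False for a 400 response.

    `volume` is a plain dict with 'multiattach' (bool) and 'attached_to'
    (list of instance ids); both keys are illustrative names.
    """
    if not volume["attached_to"]:
        return True  # first attachment: traditional single-attach case
    if not volume["multiattach"]:
        return False  # volume is already attached and cannot be shared
    # The volume is multiattach-able and already attached, so the client
    # has to opt in by requesting a new enough microversion.
    return parse_microversion(requested_version) >= MULTIATTACH_MIN_VERSION


volume = {"multiattach": True, "attached_to": ["instance-1"]}
old_client = attach_allowed(volume, "2.1")
new_client = attach_allowed(volume, "2.99")
```

Comparing tuples rather than strings avoids treating "2.9" as newer than "2.12".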

There is another wrinkle, however, which is not unique to this case.

There are only certain cinder backends (lvm) and nova virt drivers 
(libvirt) that will support multiattach. In the case of nova, you don't 
know if the multiattach will actually be accepted until we've cast off 
to the compute and find out if the virt driver supports it.


So even if we have a microversion in the REST API, and all of your 
computes are at >=mitaka, if you're hitting a non-libvirt compute node, 
it will fail (you get an instance fault in the case of volume attach, 
and a NoValidHost in the case of boot from volume).


I don't think we have a solution to that issue right now. I think John 
has been trying to sort that out somehow with the feature classification 
effort, but I'm unclear on the details.


But I think the first question is whether or not we require a 
microversion for multiattach support in the REST API purely as a signal 
to client code that a given cloud supports it, at least on certain nodes 
(running lvm + libvirt + mitaka).


[1] https://review.openstack.org/#/c/193133/
[2] https://review.openstack.org/#/c/267169/

--

Thanks,

Matt Riedemann

