Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency

2012-08-15 Thread Vishvananda Ishaya

On Aug 14, 2012, at 10:00 PM, Chuck Thier cth...@gmail.com wrote:

 snip
 I could get behind this, and it was brought up by others in our group
 as a more feasible short-term solution.  I have a couple of concerns
 with this.  It may cause just as much confusion if the api can't
 reliably determine which device a volume is attached to.  I'm also
 curious as to how well this will work with Xen, and hope some of the
 citrix folks will chime in.  From an api standpoint, I think it would
 be fine to make it optional, as any client that is using the old api
 contract will still work as intended.

This will continue to work as well as it currently does with Xen. We can 
reliably determine which devices are known about and pass back the next one; my 
patch under review below does that.
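The next-device logic can be sketched roughly like this (an illustrative helper, not the actual code in the review):

```python
import string

def next_device(used, prefix="/dev/vd"):
    """Return the first unused single-letter device name under `prefix`.

    `used` is the set of device paths the hypervisor already knows about.
    """
    for letter in string.ascii_lowercase:
        candidate = prefix + letter
        if candidate not in used:
            return candidate
    raise ValueError("no free device names under %s[a-z]" % prefix)

# With vda and vdb taken, the next assignment is vdc.
print(next_device({"/dev/vda", "/dev/vdb"}))  # /dev/vdc
```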
 
 (review at https://review.openstack.org/#/c/10908/)
 
 snip
 First question: should we return this magic path somewhere via the api? It 
 would be pretty easy to have horizon generate it but it might be nice to 
 have it show up. If we do return it, do we mangle the device to always show 
 the consistent one, or do we return it as another parameter? guest_device 
 perhaps?
 
 Second question: what should happen if someone specifies /dev/xvda against a 
 kvm cloud or /dev/vda against a xen cloud?
 I see two options:
 a) automatically convert it to the right value and return it
 
 I thought that it already did this, but I would have to go back and
 double check.  But it seemed like for xen at least, if you specify
 /dev/vda, Nova would change it to /dev/xvda.

That may be true; I believe for libvirt we just accept /dev/xvdc, since it is 
interpreted as a label.
 
 
 snip
 
 Xen Server 6.0 has a limit of 16 virtual devices per guest instance.
 Experimentally it also expects those to be /dev/xvda - /dev/xvdp.  You
 can't for example attach a device to /dev/xvdq, even if there are no
 other devices attached to the instance.  If you attempt to do this,
 the volume will go into the attaching state, fail to attach, and then
 fall back to the available state (this can be a bit confusing to new
 users who try to do so).  Does anyone know if there are similar
 limitations for KVM?

There are no limitations like this AFAIK; however, in KVM it is possible
to exhaust virtio minor device numbers by continually detaching and
attaching a device if the guest kernel is < 3.2.

 
 Also if you attempt to attach a volume to a device that already
 exists, it will silently fail and go back to available as well.  In
 this new scheme should it fail like that, or should it attempt to
 attach it to the next available device, or error out?  Perhaps the
 better question here is: for this initial consistency, is the goal to
 be consistent just when there is no device sent, or also when the
 device is sent?

My review above addresses this by raising an error if you try to attach
to an existing device. I think this is preferable: i.e. only do the
auto-assign if it is specifically requested.

 
 There was another idea, also brought up in our group.  Would it be
 possible to add a call that would return a list of available devices
 to be attached to?  In the case of Xen, it would return a list of
 devices /dev/xvda-p that were not used.  In the case of KVM, it would
 just return the next available device name.  At least in this case,
 user interfaces and command line tools could use this to validate the
 input the user provides (or auto generate the device to be used if the
 user doesn't select a device).

This is definitely a possibility, although it seems like a separate feature.
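Chuck's suggested call could look something like this (purely a sketch; the helper name and semantics are hypothetical, not an existing Nova API):

```python
import string

def available_devices(used, hypervisor):
    """List device names a new volume could attach to.

    For xen, XenServer 6.0 only accepts /dev/xvda through /dev/xvdp, so
    return every free slot.  For kvm, only suggest the next free name,
    since the guest picks the real name anyway.
    """
    if hypervisor == "xen":
        candidates = ["/dev/xvd" + c for c in string.ascii_lowercase[:16]]
        return [d for d in candidates if d not in used]
    # kvm: just the next free /dev/vdX
    for c in string.ascii_lowercase:
        d = "/dev/vd" + c
        if d not in used:
            return [d]
    return []

print(available_devices({"/dev/xvda"}, "xen")[:2])  # ['/dev/xvdb', '/dev/xvdc']
```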

Vish


___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency

2012-08-14 Thread Scott Moser
On Mon, 13 Aug 2012, Vishvananda Ishaya wrote:

 Hey Everyone,

 Resulting Issues
 ----------------

 a) The device name only makes sense for linux. FreeBSD will select
 different device names, and windows doesn't even use device names. In
 addition xen uses /dev/xvda and kvm uses /dev/vda

 b) The device sent in kvm will not match where it actually shows up. We
 can consistently guess where it will show up if the guest kernel is >=
 3.2; otherwise we are likely to be wrong, and it may change on a reboot
 anyway.

 Long term solutions
 -------------------

 We probably shouldn't expose a device path, it should be a device number. 
 This is probably the right change long term, but short term we need to make 
 the device name make sense somehow. I want to delay the long term until after 
 the summit, and come up with something that works short-term with our 
 existing parameters and usage.

 The first proposal I have is to make the device parameter optional. The 
 system will automatically generate a valid device name that will be accurate 
 for xen and kvm with guest kernel >= 3.2, but will likely be wrong for old kvm 
 guests in some situations. I think this is definitely an improvement and only 
 a very minor change to an extension api (making a parameter optional, and 
 returning the generated value of the parameter).

 (review at https://review.openstack.org/#/c/10908/)

 The second proposal I have is to use a feature of kvm attach and set the
 device serial number. We can set it to the same value as the device
 parameter. This means that a device attached to /dev/vdb may not always
 be at /dev/vdb (with old kvm guests), but it will at least show up at
 /dev/disk/by-id/virtio-vdb consistently.

This is the right way to do this.
Expose 'serial-number' (or some other name for it) in the API, attach the
device with that serial number and get out of the way.
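The stable path the serial gives you is easy to derive (an illustrative helper; it assumes udev's usual `virtio-<serial>` by-id naming for virtio disks):

```python
def by_id_path(device):
    """Stable guest path for a virtio disk whose serial was set to the
    device parameter, e.g. /dev/vdb -> /dev/disk/by-id/virtio-vdb.

    udev creates the by-id symlink from the serial the hypervisor exposes,
    regardless of what name the guest kernel actually assigned the disk.
    """
    serial = device.rsplit("/", 1)[-1]  # "/dev/vdb" -> "vdb"
    return "/dev/disk/by-id/virtio-" + serial

print(by_id_path("/dev/vdb"))  # /dev/disk/by-id/virtio-vdb
```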

If the user doesn't provide you one, then create a unique one (at least
for that guest) and return it.  For many use cases, a user attaches a
disk, ssh's in, finds the new disk, and uses it.  Don't burden them with
coming up with a naming/uuid scheme for this parameter if they don't want
to.
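Minting a fallback serial when the user supplies none might look like this (a hypothetical sketch; the 20-character cap reflects the short serial length virtio-blk exposes to the guest, so longer values would be truncated anyway):

```python
import uuid

def serial_for_attach(user_serial=None):
    """If the caller supplied a serial, use it as-is; otherwise mint a
    unique one so the guest still gets a stable
    /dev/disk/by-id/virtio-<serial> link to find the disk by.
    """
    if user_serial:
        return user_serial
    return uuid.uuid4().hex[:20]  # virtio serials are short, keep within limits

s = serial_for_attach()
print("/dev/disk/by-id/virtio-" + s)
```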

Does xen have anything like this?  Can you set the serial number of the
xen block device?

 (review coming soon)

 First question: should we return this magic path somewhere via the api?
 It would be pretty easy to have horizon generate it but it might be nice
 to have it show up. If we do return it, do we mangle the device to
 always show the consistent one, or do we return it as another parameter?
 guest_device perhaps?

From the api perspective, I think it makes most sense to call it what it
is.  Don't make any promises or allusions to what the guest OS will do
with it.

 Second question: what should happen if someone specifies /dev/xvda
 against a kvm cloud or /dev/vda against a xen cloud?
 I see two options:
 a) automatically convert it to the right value and return it
 b) fail with an error message

In EC2, this fails with an error message.  I think this is more correct.
The one issue here is that you really cannot, and should not attempt to,
guess or know what the guest has named devices.  That's why we have
this problem in the first place.

So, I don't have strong feelings either way on this.  It's broken to pass
'device=' and assume that means something.

 Third question: what do we do if someone specifies a device value to a
 kvm cloud that we know will not work. For example the vm has /dev/vda
 and /dev/vdb and they request an attach at /dev/vdf. In this case we
 know that it will likely show up at /dev/vdc. I see a few options here
 and none of them are amazing:

 a) let the attach go through as is.
   advantages: it will allow scripts to work without having to manually find 
 the next device.
   disadvantages: the device name will never be correct in the guest
 b) automatically modify the request to attach at /dev/vdc and return it
   advantages: the device name will be correct some of the time (kvm guests 
 with newer kernels)
   disadvantages: sometimes the name is wrong anyway. The user may not expect 
 the device number to change
 c) fail and say, the next disk must be attached at /dev/vdc:
   advantages: explicit
   disadvantages: painful, incompatible, and the place we say to attach may be 
 incorrect anyway (kvm guests with old kernels)

I vote 'a'.
Just be stupid.  Keep it simple; don't believe that you can understand what
device naming convention the guest kernel and udev have decided upon.

Here's an example.  Do you know what happens if I attach 26 devices?
/dev/vd[a-z], then what?  I'm pretty sure it goes to /dev/vd[a-z][a-z],
but it's not worth you trying to know that.  That convention may not be
followed for xen block devices.  At one point (maybe only with scsi
attached disks) letters were never re-used, so an attach, detach, attach
would end up going /dev/vdb, /dev/vdc, /dev/vdd.
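If one did want to model the spill-over convention Scott mentions (vda..vdz, then vdaa), it amounts to base-26 naming; a sketch follows, and it is exactly the kind of guest-side guess he argues against relying on:

```python
def disk_name(index, prefix="vd"):
    """Map a 0-based disk index to the Linux spill-over naming scheme:
    0 -> vda, 25 -> vdz, 26 -> vdaa, 27 -> vdab, ...
    (i.e. bijective base-26 using letters a-z)
    """
    name = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("a") + rem) + name
    return prefix + name

print(disk_name(0), disk_name(25), disk_name(26))  # vda vdz vdaa
```

Whether any given guest, hypervisor, or udev version actually follows this is precisely what the cloud cannot know.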

There is no binary API that Linux and udev promise on this, so 

Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency

2012-08-14 Thread Chuck Thier
Hey Vish,

First, thanks for bringing this up for discussion.  Coincidentally a
similar discussion had come up with our teams, but I had pushed it
aside at the time due to time constraints.  It is a tricky problem to
solve generally for all hypervisors.  See my comments inline:

On Mon, Aug 13, 2012 at 10:35 PM, Vishvananda Ishaya
vishvana...@gmail.com wrote:


 Long term solutions
 -------------------

 We probably shouldn't expose a device path, it should be a device number. 
 This is probably the right change long term, but short term we need to make 
 the device name make sense somehow. I want to delay the long term until after 
 the summit, and come up with something that works short-term with our 
 existing parameters and usage.

I totally agree with delaying the long term discussion, and look
forward to discussing these types of issues more at the summit.


 The first proposal I have is to make the device parameter optional. The 
 system will automatically generate a valid device name that will be accurate 
 for xen and kvm with guest kernel >= 3.2, but will likely be wrong for old kvm 
 guests in some situations. I think this is definitely an improvement and only 
 a very minor change to an extension api (making a parameter optional, and 
 returning the generated value of the parameter).

I could get behind this, and it was brought up by others in our group
as a more feasible short-term solution.  I have a couple of concerns
with this.  It may cause just as much confusion if the api can't
reliably determine which device a volume is attached to.  I'm also
curious as to how well this will work with Xen, and hope some of the
citrix folks will chime in.  From an api standpoint, I think it would
be fine to make it optional, as any client that is using the old api
contract will still work as intended.


 (review at https://review.openstack.org/#/c/10908/)

 The second proposal I have is to use a feature of kvm attach and set the 
 device serial number. We can set it to the same value as the device 
 parameter. This means that a device attached to /dev/vdb may not always be at 
 /dev/vdb (with old kvm guests), but it will at least show up at 
 /dev/disk/by-id/virtio-vdb consistently.

 (review coming soon)

 First question: should we return this magic path somewhere via the api? It 
 would be pretty easy to have horizon generate it but it might be nice to have 
 it show up. If we do return it, do we mangle the device to always show the 
 consistent one, or do we return it as another parameter? guest_device perhaps?

 Second question: what should happen if someone specifies /dev/xvda against a 
 kvm cloud or /dev/vda against a xen cloud?
 I see two options:
 a) automatically convert it to the right value and return it

I thought that it already did this, but I would have to go back and
double check.  But it seemed like for xen at least, if you specify
/dev/vda, Nova would change it to /dev/xvda.

 b) fail with an error message


I don't have a strong opinion either way, as long as it is documented
correctly.  I would suggest though that if it has been converting it
in the past, we continue to do so.
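The conversion Chuck describes amounts to swapping the device prefix to match the hypervisor's convention (a hypothetical helper, not Nova's actual code):

```python
def normalize_device(device, hypervisor):
    """Rewrite a device path to the hypervisor's naming convention:
    /dev/xvdX for xen, /dev/vdX for kvm/virtio.  The trailing letter
    (the slot) is preserved either way.
    """
    name = device.rsplit("/", 1)[-1]
    if name.startswith("xvd"):
        name = name[1:]          # xvdb -> vdb
    if hypervisor == "xen":
        name = "x" + name        # vdb -> xvdb
    return "/dev/" + name

print(normalize_device("/dev/vda", "xen"))   # /dev/xvda
print(normalize_device("/dev/xvda", "kvm"))  # /dev/vda
```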

 Third question: what do we do if someone specifies a device value to a kvm 
 cloud that we know will not work. For example the vm has /dev/vda and 
 /dev/vdb and they request an attach at /dev/vdf. In this case we know that it 
 will likely show up at /dev/vdc. I see a few options here and none of them 
 are amazing:

 a) let the attach go through as is.
   advantages: it will allow scripts to work without having to manually find 
 the next device.
   disadvantages: the device name will never be correct in the guest
 b) automatically modify the request to attach at /dev/vdc and return it
   advantages: the device name will be correct some of the time (kvm guests 
 with newer kernels)
   disadvantages: sometimes the name is wrong anyway. The user may not expect 
 the device number to change
 c) fail and say, the next disk must be attached at /dev/vdc:
   advantages: explicit
   disadvantages: painful, incompatible, and the place we say to attach may be 
 incorrect anyway (kvm guests with old kernels)

I would choose b, as it tries to get things in the correct state.  c
is a bad idea as it would change the overall api behavior, and current
clients wouldn't expect it.

There are also a couple of other interesting tidbits, that may be
related, or at least be worthwhile to know while discussing this.

Xen Server 6.0 has a limit of 16 virtual devices per guest instance.
Experimentally it also expects those to be /dev/xvda - /dev/xvdp.  You
can't for example attach a device to /dev/xvdq, even if there are no
other devices attached to the instance.  If you attempt to do this,
the volume will go into the attaching state, fail to attach, and then
fall back to the available state (this can be a bit confusing to new
users who try to do so).  Does anyone know if there are similar
limitations for KVM?
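Validating up front against the XenServer limits described above, instead of letting the attach silently bounce back to available, could look like this (a sketch; the helper names are illustrative):

```python
import string

# XenServer 6.0 honours only 16 slots: /dev/xvda through /dev/xvdp.
VALID_XEN_DEVICES = {"/dev/xvd" + c for c in string.ascii_lowercase[:16]}

def check_xen_attach(device, used):
    """Fail fast with a clear error, rather than putting the volume into
    'attaching' only to have it fall back to 'available' with no message.
    """
    if device not in VALID_XEN_DEVICES:
        raise ValueError("%s is outside the valid range xvda-xvdp" % device)
    if device in used:
        raise ValueError("%s is already attached" % device)

check_xen_attach("/dev/xvdb", {"/dev/xvda"})  # ok, no exception
```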

Also if you attempt 

Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency

2012-08-13 Thread Nathanael Burton
On Aug 13, 2012 11:37 PM, Vishvananda Ishaya vishvana...@gmail.com
wrote:
 The second proposal I have is to use a feature of kvm attach and set the
device serial number. We can set it to the same value as the device
parameter. This means that a device attached to /dev/vdb may not always be
at /dev/vdb (with old kvm guests), but it will at least show up at
/dev/disk/by-id/virtio-vdb consistently.

What about setting the serial number to the volume_id? At least that way
you could be sure it was the volume you meant, especially in the case where
vdb in the guest ends up not being what you requested. What about other
hypervisors?
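With the volume id as the serial, the guest-side path follows directly (a sketch; note the truncation, since virtio serials are limited to a short fixed length, so only a prefix of a full uuid would survive into the by-id link):

```python
def by_id_for_volume(volume_id):
    """Guest path for a virtio disk whose serial was set to the volume id.

    virtio-blk serials are capped at 20 bytes, so a 36-character uuid is
    truncated; the prefix is still unique enough to identify the volume.
    """
    return "/dev/disk/by-id/virtio-" + volume_id[:20]

vol = "3f2b8c9a-1d4e-4f6a-9b7c-0a1b2c3d4e5f"  # example volume id
print(by_id_for_volume(vol))
```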

 (review coming soon)

 First question: should we return this magic path somewhere via the api?
It would be pretty easy to have horizon generate it but it might be nice to
have it show up. If we do return it, do we mangle the device to always show
the consistent one, or do we return it as another parameter? guest_device
perhaps?

 Second question: what should happen if someone specifies /dev/xvda
against a kvm cloud or /dev/vda against a xen cloud?
 I see two options:
 a) automatically convert it to the right value and return it
 b) fail with an error message

 Third question: what do we do if someone specifies a device value to a
kvm cloud that we know will not work. For example the vm has /dev/vda and
/dev/vdb and they request an attach at /dev/vdf. In this case we know that
it will likely show up at /dev/vdc. I see a few options here and none of
them are amazing:

 a) let the attach go through as is.
   advantages: it will allow scripts to work without having to manually
find the next device.
   disadvantages: the device name will never be correct in the guest
 b) automatically modify the request to attach at /dev/vdc and return it
   advantages: the device name will be correct some of the time (kvm
guests with newer kernels)
   disadvantages: sometimes the name is wrong anyway. The user may not
expect the device number to change
 c) fail and say, the next disk must be attached at /dev/vdc:
   advantages: explicit
   disadvantages: painful, incompatible, and the place we say to attach
may be incorrect anyway (kvm guests with old kernels)

 The second proposal earlier will at least give us a consistent name to
find the volume in all these cases, although b) means we have to check the
return value to find out what that consistent location is like we do when
we don't pass in a device.

 I hope everything is clear, but if more explanation is needed please let
me know. If anyone has alternative/better proposals please tell me. The
last question I think is the most important.

 Vish


 ___
 OpenStack-dev mailing list
 openstack-...@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev