Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency
On Aug 14, 2012, at 10:00 PM, Chuck Thier cth...@gmail.com wrote:

[snip]

> I could get behind this, and it was brought up by others in our group as a more feasible short-term solution. I have a couple of concerns with this. It may cause just as much confusion if the api can't reliably determine which device a volume is attached to. I'm also curious as to how well this will work with Xen, and hope some of the citrix folks will chime in. From an api standpoint, I think it would be fine to make it optional, as any client that is using the old api contract will still work as intended.

This will continue to work as well as it currently does with Xen. We can reliably determine which devices are known about and pass back the next one; my patch under review does exactly that (review at https://review.openstack.org/#/c/10908/).

[snip]

>> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it, but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
>>
>> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud? I see two options:
>>
>> a) automatically convert it to the right value and return it
>
> I thought that it already did this, but I would have to go back and double check. But it seemed like, for xen at least, if you specify /dev/vda, Nova would change it to /dev/xvda.

That may be true. I believe for libvirt we just accept /dev/xvdc since it is just interpreted as a label.

[snip]

> Xen Server 6.0 has a limit of 16 virtual devices per guest instance. Experimentally it also expects those to be /dev/xvda - /dev/xvdp. You can't, for example, attach a device to /dev/xvdq, even if there are no other devices attached to the instance.
> If you attempt to do this, the volume will go in to the attaching state, fail to attach, and then fall back to the available state (this can be a bit confusing to new users who try it). Does anyone know if there are similar limitations for KVM?

There are no limitations like this AFAIK; however, in KVM it is possible to exhaust virtio minor device numbers by continually detaching and attaching a device if the guest kernel is < 3.2.

> Also if you attempt to attach a volume to a device that already exists, it will silently fail and go back to available as well. In this new scheme should it fail like that, or should it attempt to attach it to the next available device, or error out? Perhaps a better question here is: for this initial consistency, is the goal to try to be consistent just when there is no device sent, or to also be consistent when the device is sent as well?

My review above addresses this by raising an error if you try to attach to an existing device. I think this is preferable: i.e. only do the auto-assign if it is specifically requested.

> There was another idea, also brought up in our group. Would it be possible to add a call that would return a list of available devices to be attached to? In the case of Xen, it would return a list of devices /dev/xvda-p that were not used. In the case of KVM, it would just return the next available device name. At least in this case, user interfaces and command line tools could use this to validate the input the user provides (or auto-generate the device to be used if the user doesn't select a device).

This is definitely a possibility, although it seems like a separate feature.

Vish

___
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
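The "pass back the next one" behaviour described above can be sketched roughly as follows. This is an illustrative helper only, not the actual code from review 10908; the /dev/xvda-/dev/xvdp range and the 16-device Xen Server limit come from the thread.

```python
def next_device_name(used_devices, prefix="/dev/xvd", max_devices=16):
    """Return the first unused virtual device name.

    Illustrative sketch of the auto-assignment idea from the thread:
    walk /dev/xvda .. /dev/xvdp and hand back the first free slot,
    erroring out once all slots are taken (as Xen Server 6.0 would).
    """
    for i in range(max_devices):
        candidate = prefix + chr(ord("a") + i)
        if candidate not in used_devices:
            return candidate
    raise ValueError("all %d virtual device slots are in use" % max_devices)
```

For example, with /dev/xvda and /dev/xvdb already attached, the generated name would be /dev/xvdc.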
Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency
On Mon, 13 Aug 2012, Vishvananda Ishaya wrote:

> Hey Everyone,
>
> Resulting Issues:
>
> a) The device name only makes sense for linux. FreeBSD will select different device names, and windows doesn't even use device names. In addition, xen uses /dev/xvda and kvm uses /dev/vda.
>
> b) The device sent in kvm will not match where it actually shows up. We can consistently guess where it will show up if the guest kernel is >= 3.2; otherwise we are likely to be wrong, and it may change on a reboot anyway.
>
> Long term solutions: We probably shouldn't expose a device path, it should be a device number. This is probably the right change long term, but short term we need to make the device name make sense somehow. I want to delay the long term until after the summit, and come up with something that works short-term with our existing parameters and usage.
>
> The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen, and for kvm with guest kernel >= 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement and only a very minor change to an extension api (making a parameter optional, and returning the generated value of the parameter). (review at https://review.openstack.org/#/c/10908/)
>
> The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter. This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently.

This is the right way to do this. Expose 'serial-number' (or some other name for it) in the API, attach the device with that serial number, and get out of the way. If the user doesn't provide you one, then create a unique one (at least for that guest) and return it. For many use cases, a user attaches a disk, ssh's in, finds the new disk, and uses it.
Don't burden them with coming up with a naming/uuid scheme for this parameter if they don't want to. Does xen have anything like this? Can you set the serial number of a xen block device?

> (review coming soon)
>
> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it, but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?

From the api perspective, I think it makes most sense to call it what it is. Don't make any promises or allusions to what the guest OS will do with it.

> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud? I see two options:
>
> a) automatically convert it to the right value and return it
>
> b) fail with an error message

In EC2, this fails with an error message. I think this is more correct. The one issue here is that you really cannot, and should not attempt to, guess or know what the guest has named devices. That's why we have this problem in the first place. So I don't have strong feelings either way on this. It's broken to pass 'device=' and assume that means something.

> Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work? For example, the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:
>
> a) let the attach go through as is. advantages: it will allow scripts to work without having to manually find the next device. disadvantages: the device name will never be correct in the guest
>
> b) automatically modify the request to attach at /dev/vdc and return it. advantages: the device name will be correct some of the time (kvm guests with newer kernels). disadvantages: sometimes the name is wrong anyway.
> The user may not expect the device number to change.
>
> c) fail and say "the next disk must be attached at /dev/vdc". advantages: explicit. disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)

I vote 'a'. Just be stupid. Play it simple; don't believe that you can understand what device naming convention the guest kernel and udev have decided upon. Here's an example. Do you know what happens if I attach 26 devices? /dev/vd[a-z], then what? I'm pretty sure it goes to /dev/vd[a-z][a-z], but it's not worth you trying to know that. That convention may not be followed for xen block devices. At one point (maybe only with scsi-attached disks) letters were never re-used, so an attach, detach, attach would end up going /dev/vdb, /dev/vdc, /dev/vdd. There is no binary api that linux and udev promise on this, so
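For what it's worth, the /dev/vd[a-z][a-z] guess above matches the index-to-name scheme the Linux kernel uses for sd/vd-style disks. A small illustrative function (not Nova code, and the guest kernel always owns the real mapping):

```python
def disk_index_to_name(index, prefix="vd"):
    """Map a 0-based disk index to a Linux-style disk name:
    0 -> vda, 25 -> vdz, 26 -> vdaa, 27 -> vdab, ...

    Illustrative only: this mirrors the kernel's base-26 naming
    convention, which the host should not rely on.
    """
    name = ""
    index += 1
    while index > 0:
        index -= 1
        name = chr(ord("a") + index % 26) + name
        index //= 26
    return prefix + name
```

So the 27th attached disk (index 26) would show up as vdaa rather than anything the host could sensibly have predicted from its own bookkeeping.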
Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency
Hey Vish,

First, thanks for bringing this up for discussion. Coincidentally, a similar discussion had come up with our teams, but I had pushed it aside at the time due to time constraints. It is a tricky problem to solve generally for all hypervisors. See my comments inline:

On Mon, Aug 13, 2012 at 10:35 PM, Vishvananda Ishaya vishvana...@gmail.com wrote:

> Long term solutions: We probably shouldn't expose a device path, it should be a device number. This is probably the right change long term, but short term we need to make the device name make sense somehow. I want to delay the long term until after the summit, and come up with something that works short-term with our existing parameters and usage.

I totally agree with delaying the long term discussion, and look forward to discussing these types of issues more at the summit.

> The first proposal I have is to make the device parameter optional. The system will automatically generate a valid device name that will be accurate for xen, and for kvm with guest kernel >= 3.2, but will likely be wrong for old kvm guests in some situations. I think this is definitely an improvement and only a very minor change to an extension api (making a parameter optional, and returning the generated value of the parameter).

I could get behind this, and it was brought up by others in our group as a more feasible short-term solution. I have a couple of concerns with this. It may cause just as much confusion if the api can't reliably determine which device a volume is attached to. I'm also curious as to how well this will work with Xen, and hope some of the citrix folks will chime in. From an api standpoint, I think it would be fine to make it optional, as any client that is using the old api contract will still work as intended.

> (review at https://review.openstack.org/#/c/10908/)
>
> The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter.
> This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently. (review coming soon)
>
> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it, but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
>
> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud? I see two options:
>
> a) automatically convert it to the right value and return it

I thought that it already did this, but I would have to go back and double check. But it seemed like, for xen at least, if you specify /dev/vda, Nova would change it to /dev/xvda.

> b) fail with an error message

I don't have a strong opinion either way, as long as it is documented correctly. I would suggest, though, that if it has been converting it in the past, we continue to do so.

> Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work? For example, the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:
>
> a) let the attach go through as is. advantages: it will allow scripts to work without having to manually find the next device. disadvantages: the device name will never be correct in the guest
>
> b) automatically modify the request to attach at /dev/vdc and return it. advantages: the device name will be correct some of the time (kvm guests with newer kernels). disadvantages: sometimes the name is wrong anyway.
> The user may not expect the device number to change.
>
> c) fail and say "the next disk must be attached at /dev/vdc". advantages: explicit. disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)

I would choose b, as it tries to get things in the correct state. c is a bad idea, as it would change the overall api behavior, and current clients wouldn't expect it.

There are also a couple of other interesting tidbits that may be related, or at least be worthwhile to know while discussing this. Xen Server 6.0 has a limit of 16 virtual devices per guest instance. Experimentally it also expects those to be /dev/xvda - /dev/xvdp. You can't, for example, attach a device to /dev/xvdq, even if there are no other devices attached to the instance. If you attempt to do this, the volume will go in to the attaching state, fail to attach, and then fall back to the available state (this can be a bit confusing to new users who try it). Does anyone know if there are similar limitations for KVM? Also if you attempt
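For reference, the kvm "serial number" feature discussed in this thread corresponds to the <serial> element of a libvirt disk definition. The sketch below only builds the attach XML as a string; with a live libvirt connection it would be passed to virDomain.attachDevice(). The function name and the raw/block source layout are illustrative assumptions, not Nova's actual code.

```python
def build_disk_xml(source_dev, target_dev, serial):
    """Build a libvirt <disk> element carrying a serial number.

    The <serial> value is what the guest's udev exposes as
    /dev/disk/by-id/virtio-<serial>, regardless of which /dev/vdX
    the guest kernel actually assigns. Sketch only: a real attach
    would hand this XML to virDomain.attachDevice().
    """
    return (
        "<disk type='block' device='disk'>"
        "<driver name='qemu' type='raw'/>"
        "<source dev='%s'/>"
        "<target dev='%s' bus='virtio'/>"
        "<serial>%s</serial>"
        "</disk>"
    ) % (source_dev, target_dev, serial)
```

Setting serial to the same value as the device parameter (proposal two) means build_disk_xml("/dev/sdb", "vdb", "vdb") yields a disk the guest can always find at /dev/disk/by-id/virtio-vdb.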
Re: [Openstack] [openstack-dev] [nova] Disk attachment consistency
On Aug 13, 2012 11:37 PM, Vishvananda Ishaya vishvana...@gmail.com wrote:

> The second proposal I have is to use a feature of kvm attach and set the device serial number. We can set it to the same value as the device parameter. This means that a device attached to /dev/vdb may not always be at /dev/vdb (with old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb consistently.

What about setting the serial number to the volume_id? At least that way you could be sure it was the volume you meant, especially in the case where vdb in the guest ends up not being what you requested. What about other hypervisors?

> (review coming soon)
>
> First question: should we return this magic path somewhere via the api? It would be pretty easy to have horizon generate it, but it might be nice to have it show up. If we do return it, do we mangle the device to always show the consistent one, or do we return it as another parameter? guest_device perhaps?
>
> Second question: what should happen if someone specifies /dev/xvda against a kvm cloud or /dev/vda against a xen cloud? I see two options:
>
> a) automatically convert it to the right value and return it
>
> b) fail with an error message
>
> Third question: what do we do if someone specifies a device value to a kvm cloud that we know will not work? For example, the vm has /dev/vda and /dev/vdb and they request an attach at /dev/vdf. In this case we know that it will likely show up at /dev/vdc. I see a few options here and none of them are amazing:
>
> a) let the attach go through as is. advantages: it will allow scripts to work without having to manually find the next device. disadvantages: the device name will never be correct in the guest
>
> b) automatically modify the request to attach at /dev/vdc and return it. advantages: the device name will be correct some of the time (kvm guests with newer kernels). disadvantages: sometimes the name is wrong anyway.
> The user may not expect the device number to change.
>
> c) fail and say "the next disk must be attached at /dev/vdc". advantages: explicit. disadvantages: painful, incompatible, and the place we say to attach may be incorrect anyway (kvm guests with old kernels)
>
> The second proposal earlier will at least give us a consistent name to find the volume in all these cases, although b) means we have to check the return value to find out what that consistent location is, like we do when we don't pass in a device.
>
> I hope everything is clear, but if more explanation is needed please let me know. If anyone has alternative/better proposals please tell me. The last question I think is the most important.
>
> Vish

___
OpenStack-dev mailing list
openstack-...@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
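If the serial were set to the volume_id as suggested above, the guest-visible by-id path could be predicted along these lines. One caveat worth noting (an assumption based on virtio-blk's fixed serial size, worth verifying against your guest): virtio-blk only reports the first 20 bytes of the serial, so a full 36-character volume UUID gets truncated in the by-id name.

```python
def guest_by_id_path(serial, limit=20):
    """Predict the udev by-id symlink for a virtio disk serial.

    Sketch under the assumption that virtio-blk exposes at most
    `limit` (20) bytes of the serial, so long serials such as full
    volume UUIDs appear truncated under /dev/disk/by-id/.
    """
    return "/dev/disk/by-id/virtio-" + serial[:limit]
```

A short serial like "vdb" survives intact (/dev/disk/by-id/virtio-vdb), while a UUID-valued serial would need to be matched by prefix inside the guest.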