Re: [Openstack] [nova] Disk attachment consistency

2012-08-19 Thread Richard W.M. Jones
On Mon, Aug 13, 2012 at 08:35:28PM -0700, Vishvananda Ishaya wrote:
> a) The device name only makes sense for linux. FreeBSD will select different 
> device names, and windows doesn't even use device names. In addition xen uses 
> /dev/xvda and kvm uses /dev/vda
> 
> b) The device sent in kvm will not match where it actually shows up. We can 
> consistently guess where it will show up if the guest kernel is >= 3.2, 
> otherwise we are likely to be wrong, and it may change on a reboot anyway

Another one -- possibly not a good one, but I'm including it for
completeness -- is that you preformat the disk with a partition and a
filesystem, then use the UUID or LABEL of the filesystem to mount
it[*], i.e.:

  mount UUID=abc-123-456 /data
  mount LABEL=os-data-disk /data

Naturally libguestfs can make these preformatted disks (see the
'virt-format' tool, or do it through the API).
http://libguestfs.org/virt-format.1.html
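
For the API route, a minimal sketch with the libguestfs Python bindings might
look like this (the image path and label are only examples, and the image
file must already exist):

  import guestfs

  g = guestfs.GuestFS()
  g.add_drive_opts("data-disk.img", format="raw")  # example raw image
  g.launch()
  g.part_disk("/dev/sda", "mbr")            # one partition covering the disk
  g.mkfs("ext4", "/dev/sda1")
  g.set_label("/dev/sda1", "os-data-disk")  # mountable via LABEL=os-data-disk
  print(g.vfs_uuid("/dev/sda1"))            # the UUID to hand to the user
  g.shutdown()
  g.close()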

Rich.

[*] I'm assuming here that Windows can find and mount filesystems
using the NTFS ID, but I don't know if that's true ...

-- 
Richard Jones
Red Hat

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [nova] Disk attachment consistency

2012-08-15 Thread John Garbutt
> From: Daniel P. Berrange [mailto:berra...@redhat.com]
> On Wed, Aug 15, 2012 at 03:49:45PM +0100, John Garbutt wrote:
> > You can see what XenAPI exposes here:
> >  http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD
> >
> > I think the only thing you can influence when plugging in the disk is the
> > “userdevice”, which is the disk position: 0,1,2…  When you have attached
> > the disk you can find out the “device” name, such as /dev/xvda
> >
> > I don't know about Xen with libvirt. But from the previous discussion
> > it seems using the disk position would also work with KVM?
> 
> No, this doesn't really work in general. Virtio disks get assigned SCSI
> device numbers on a first-come, first-served basis. In the configuration you
> only have control over the PCI device slot/function. You might assume that
> your disks are probed in PCI device order, and thus get SCSI device numbers
> in that same order. This is not really safe though. Furthermore, if the
> guest has any other kinds of devices, e.g. perhaps they logged into an
> iSCSI target, then all bets are off for what SCSI device you get assigned.
>
> All the host can safely say is
> 
>   - Virtio-blk disks get PCI address domain:bus:slot:function
>   - Virtio-SCSI disks get SCSI address A.B.C.D
>   - Disks have a unique serial string ZZZ
> 
> As a guest OS admin you can use this info to get reliable disk names in
> /dev/disk/by-{path,id}.
Doh, I guess my plan doesn't work then. After checking, apparently the same 
problem is also present with how Xen deals with exposing the "position" to the 
guest VM.

> Relying on /dev/sdXXX is doomed to failure in the long term, even on bare
> metal, and should be avoided wherever possible.
I agree, long term, this is not the way forward. I was just thinking in terms 
of backwards compatibility.

> If your disk has a filesystem on it, you can also get a unique UUID and/or
> filesystem label, which means you can refer to the device from
> /dev/disk/by-{uuid,label} too.
That sounds interesting for those attaching volumes that have a filesystem on 
them. Would it be reasonable to make this the best-practice way for users to 
discover where the volume has been attached?

Maybe we should simply leave nova to report where the disk has been attached? 
The XenAPI driver can attach in the next available slot, and report back what 
it can about the device location. It sounds like the libvirt driver could do 
the same?

We could document the current device parameter (or whatever we call it) as a 
driver-specific "hint", although this doesn't seem very satisfying.

Cheers,
John
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [nova] Disk attachment consistency

2012-08-15 Thread Daniel P. Berrange
On Wed, Aug 15, 2012 at 03:49:45PM +0100, John Garbutt wrote:
> You can see what XenAPI exposes here:
>  http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD
> 
> I think the only thing you can influence when plugging in the disk is the
> “userdevice”, which is the disk position: 0,1,2…  When you have attached
> the disk you can find out the “device” name, such as /dev/xvda
> 
> I don't know about Xen with libvirt. But from the previous discussion it 
> seems using
> the disk position would also work with KVM?

No, this doesn't really work in general. Virtio disks get assigned SCSI device
numbers on a first-come, first-served basis. In the configuration you only have
control over the PCI device slot/function. You might assume that your disks
are probed in PCI device order, and thus get SCSI device numbers in that same
order. This is not really safe though. Furthermore, if the guest has any
other kinds of devices, e.g. perhaps they logged into an iSCSI target, then all
bets are off for what SCSI device you get assigned.

All the host can safely say is

  - Virtio-blk disks get PCI address domain:bus:slot:function
  - Virtio-SCSI disks get SCSI address A.B.C.D
  - Disks have a unique serial string ZZZ

As a guest OS admin you can use this info to get reliable disk names
in /dev/disk/by-{path,id}.

If your disk has a filesystem on it, you can also get a unique UUID
and/or filesystem label, which means you can refer to the device
from /dev/disk/by-{uuid,label} too.
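
To illustrate (a minimal guest-side sketch; the device path is just an
example), a script can enumerate the persistent udev aliases that point at a
given kernel device:

  import os

  def stable_names(kernel_dev):
      # List the /dev/disk/by-* symlinks that resolve to kernel_dev.
      aliases = []
      target = os.path.realpath(kernel_dev)
      for sub in ("by-id", "by-path", "by-uuid", "by-label"):
          d = os.path.join("/dev/disk", sub)
          if not os.path.isdir(d):
              continue
          for name in os.listdir(d):
              link = os.path.join(d, name)
              if os.path.realpath(link) == target:
                  aliases.append(link)
      return aliases

  print(stable_names("/dev/vdb"))  # e.g. ['/dev/disk/by-id/virtio-vdb', ...]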

Relying on /dev/sdXXX is doomed to failure in the long term, even on
bare metal, and should be avoided wherever possible.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [nova] Disk attachment consistency

2012-08-15 Thread John Garbutt
You can see what XenAPI exposes here:
 http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/?c=VBD

I think the only thing you can influence when plugging in the disk is the 
“userdevice”, which is the disk position: 0,1,2…
When you have attached the disk you can find out the “device” name, such as 
/dev/xvda
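
For reference, a hot-attach through the XenAPI Python bindings might look
roughly like this (the host, credentials and name-labels are invented, and
the record fields are approximately those the VBD docs above describe):

  import XenAPI

  session = XenAPI.Session("http://xenserver.example.com")  # example host
  session.login_with_password("root", "secret")
  try:
      vm = session.xenapi.VM.get_by_name_label("my-instance")[0]
      vdi = session.xenapi.VDI.get_by_name_label("my-volume")[0]
      vbd = session.xenapi.VBD.create({
          "VM": vm,
          "VDI": vdi,
          "userdevice": "2",      # the position you can influence
          "bootable": False,
          "mode": "RW",
          "type": "Disk",
          "empty": False,
          "other_config": {},
          "qos_algorithm_type": "",
          "qos_algorithm_params": {},
      })
      session.xenapi.VBD.plug(vbd)               # hot-attach to the running VM
      print(session.xenapi.VBD.get_device(vbd))  # e.g. "xvdc", known only now
  finally:
      session.xenapi.session.logout()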

I don't know about Xen with libvirt. But from the previous discussion it seems 
using the disk position would also work with KVM?

It seems disk position is also suitably OS agnostic, but I may have missed 
something there.

For backwards compatibility, we could make a "best effort" of translating the 
specified device name to a position. But as mentioned already, it seems fraught 
with danger in the general case.

I like the idea of an extra field to help report to the user what the likely 
device name is, if available. It would allow us to spot when the above 
"guessing" did not go the way we had hoped.

Related to this, there is a limitation in Xen (a limitation in the 
blkback/blkfront protocol, I am told) that means you can't cancel the operation 
of removing a disk from a VM. So if the disk is in use, the call may return 
with an exception saying the disk is in use, but as soon as the disk can be 
released, it will be removed anyway. Currently, nova isn't very good at 
expressing to the user that this is what is happening:
https://bugs.launchpad.net/nova/+bug/1030108

Cheers,
John
 
> From: openstack-bounces+john.garbutt=citrix@lists.launchpad.net 
> [mailto:openstack-bounces+john.garbutt=citrix@lists.launchpad.net] On 
> Behalf Of Wangpan
> Sent: 15 August 2012 5:11
> To: Vishvananda Ishaya
> Cc: openstack
> Subject: Re: [Openstack] [nova] Disk attachment consistency
> > This is definitely another solution, although it seems less usable than the 
> > device serial number which can be an arbitrary string. If this works for 
> > xen though, that would be a plus.
> > Vish
> I don't have a Xen hypervisor at hand, so could anybody else try it on Xen? 
> Thanks
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [nova] Disk attachment consistency

2012-08-14 Thread Wangpan

> This is definitely another solution, although it seems less usable than the 
> device serial number which can be an arbitrary string. If this works for xen 
> though, that would be a plus.


> Vish

I don't have a Xen hypervisor at hand, so could anybody else try it on Xen? 
Thanks
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [nova] Disk attachment consistency

2012-08-14 Thread Vishvananda Ishaya

On Aug 14, 2012, at 7:55 PM, "Wangpan" wrote:

> How about using the PCI address as the UUID of target devices in one VM?
> The PCI address is generated by libvirt and we can see it in the VM with
> "ls -la /sys/block/", and it has no dependency on the kernel version; I can
> see it in 2.6.32*.
> When a user attaches a disk to a VM, we find a free target dev such as vdd
> (the user doesn't need to assign one) to attach, and return the PCI address
> to the user. The disk is then consistent when the user sees it on horizon
> and in the VM (with "ls -la /sys/block/").
>  
> libvirt: <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
> ls -la /sys/block/: vda -> ../devices/pci0000:00/0000:00:04.0/virtio1/block/vda

This is definitely another solution, although it seems less usable than the 
device serial number which can be an arbitrary string. If this works for xen 
though, that would be a plus.

Vish

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [nova] Disk attachment consistency

2012-08-14 Thread Wangpan
How about using the PCI address as the UUID of target devices in one VM?
The PCI address is generated by libvirt and we can see it in the VM with
"ls -la /sys/block/", and it has no dependency on the kernel version; I can
see it in 2.6.32*.
When a user attaches a disk to a VM, we find a free target dev such as vdd
(the user doesn't need to assign one) to attach, and return the PCI address to
the user. The disk is then consistent when the user sees it on horizon and in
the VM (with "ls -la /sys/block/").

libvirt: <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
ls -la /sys/block/: vda -> ../devices/pci0000:00/0000:00:04.0/virtio1/block/vda
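
As a rough guest-side illustration of that lookup (plain Python, nothing
nova-specific), the PCI address can be recovered from the /sys/block symlink:

  import os

  def disk_pci_addresses():
      # Map each block device to the PCI address found in its /sys/block
      # symlink target, e.g. .../pci0000:00/0000:00:04.0/virtio1/block/vda
      result = {}
      for dev in os.listdir("/sys/block"):
          target = os.path.realpath(os.path.join("/sys/block", dev))
          for part in target.split("/"):
              # a PCI address looks like 0000:00:04.0 (domain:bus:slot.func)
              if part.count(":") == 2 and "." in part:
                  result[dev] = part
      return result

  print(disk_pci_addresses())  # e.g. {'vda': '0000:00:04.0'}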

[Openstack] [nova] Disk attachment consistency

2012-08-13 Thread Vishvananda Ishaya
Hey Everyone,

Overview
--------

One of the things that we are striving for in nova is interface consistency, 
that is, we'd like someone to be able to use an openstack cloud without knowing 
or caring which hypervisor is running underneath. There is a nasty bit of 
inconsistency in the way that disks are hot attached to vms that shows through 
to the user. I've been debating ways to minimize this and I have some issues I 
need feedback on.

Background
----------

There are three issues contributing to the bad user experience of attaching 
volumes.

1) The api we present for attaching a volume to an instance has a parameter 
called device. This is presented as where to attach the disk in the guest.

2) Xen picks minor device numbers on the host hypervisor side and the guest 
driver follows instructions.

3) KVM picks minor device numbers on the guest driver side and doesn't expose 
them to the host hypervisor side.

Resulting Issues
----------------

a) The device name only makes sense for linux. FreeBSD will select different 
device names, and windows doesn't even use device names. In addition xen uses 
/dev/xvda and kvm uses /dev/vda

b) The device sent in kvm will not match where it actually shows up. We can 
consistently guess where it will show up if the guest kernel is >= 3.2, 
otherwise we are likely to be wrong, and it may change on a reboot anyway


Long term solutions
-------------------

We probably shouldn't expose a device path; it should be a device number. This 
is probably the right change long term, but short term we need to make the 
device name make sense somehow. I want to delay the long-term change until 
after the summit, and come up with something that works short-term with our 
existing parameters and usage.

The first proposal I have is to make the device parameter optional. The system 
will automatically generate a valid device name that will be accurate for xen 
and kvm with guest kernel 3.2, but will likely be wrong for old kvm guests in 
some situations. I think this is definitely an improvement and only a very 
minor change to an extension api (making a parameter optional, and returning 
the generated value of the parameter).

(review at https://review.openstack.org/#/c/10908/)
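
For illustration only (a sketch, not the code under review at the link
above), the device-name generation might look something like:

  import string

  def next_device_name(existing, prefix="vd"):
      # Return the first unused /dev/<prefix>X name, given the device
      # names already assigned to the instance.
      used = set(name.split("/")[-1] for name in existing)
      for letter in string.ascii_lowercase:
          candidate = prefix + letter
          if candidate not in used:
              return "/dev/" + candidate
      raise ValueError("no free device names left")

  print(next_device_name(["/dev/vda", "/dev/vdb"]))  # -> /dev/vdc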

The second proposal I have is to use a feature of kvm attach and set the device 
serial number. We can set it to the same value as the device parameter. This 
means that a device attached to /dev/vdb may not always be at /dev/vdb (with 
old kvm guests), but it will at least show up at /dev/disk/by-id/virtio-vdb 
consistently.

(review coming soon)
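
The mechanism is the serial element on the disk in the libvirt XML. A rough
sketch of such an attach through the libvirt Python bindings (the instance
name and volume path are invented):

  import libvirt

  disk_xml = """
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw'/>
    <source dev='/dev/mapper/volume-0001'/>
    <target dev='vdb' bus='virtio'/>
    <serial>vdb</serial>
  </disk>
  """

  conn = libvirt.open("qemu:///system")
  dom = conn.lookupByName("instance-00000001")
  dom.attachDevice(disk_xml)
  # Wherever the guest kernel actually puts the disk, udev also exposes
  # it as /dev/disk/by-id/virtio-vdb because of the serial.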

First question: should we return this magic path somewhere via the api? It 
would be pretty easy to have horizon generate it but it might be nice to have 
it show up. If we do return it, do we mangle the device to always show the 
consistent one, or do we return it as another parameter? guest_device perhaps?

Second question: what should happen if someone specifies /dev/xvda against a 
kvm cloud or /dev/vda against a xen cloud?
I see two options:
a) automatically convert it to the right value and return it (sketched below)
b) fail with an error message
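
A sketch of what option a) could look like (purely illustrative):

  def normalize_device_name(device, hypervisor):
      # Translate a requested device path to the hypervisor's naming
      # convention: xen exposes /dev/xvdN, kvm/virtio exposes /dev/vdN.
      prefix = {"xen": "xvd", "kvm": "vd"}[hypervisor]
      name = device.split("/")[-1]
      for p in ("xvd", "vd", "sd", "hd"):
          if name.startswith(p):
              name = name[len(p):]
              break
      return "/dev/" + prefix + name

  normalize_device_name("/dev/xvda", "kvm")  # -> '/dev/vda'
  normalize_device_name("/dev/vda", "xen")   # -> '/dev/xvda'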

Third question: what do we do if someone specifies a device value to a kvm 
cloud that we know will not work. For example the vm has /dev/vda and /dev/vdb 
and they request an attach at /dev/vdf. In this case we know that it will 
likely show up at /dev/vdc. I see a few options here and none of them are 
amazing:

a) let the attach go through as is.
  advantages: it will allow scripts to work without having to manually find the 
next device.
  disadvantages: the device name will never be correct in the guest
b) automatically modify the request to attach at /dev/vdc and return it
  advantages: the device name will be correct some of the time (kvm guests with 
newer kernels)
  disadvantages: sometimes the name is wrong anyway. The user may not expect 
the device number to change
c) fail and say, the next disk must be attached at /dev/vdc:
  advantages: explicit
  disadvantages: painful, incompatible, and the place we say to attach may be 
incorrect anyway (kvm guests with old kernels)

The second proposal earlier will at least give us a consistent name to find the 
volume in all these cases, although b) means we have to check the return value 
to find out what that consistent location is, like we do when we don't pass in 
a device.

I hope everything is clear, but if more explanation is needed please let me 
know. If anyone has alternative/better proposals, please tell me. I think the 
last question is the most important.

Vish


___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp