Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-04-02 Thread Matthew Gilliard
>> Thanks for the clarification, is there a bug tracking this in libvirt
>> already?

> Actually I don't think there is one, so feel free to file one

I took the liberty of doing so:
https://bugzilla.redhat.com/show_bug.cgi?id=1208588

On Wed, Mar 18, 2015 at 6:11 PM, Daniel P. Berrange  wrote:
> On Wed, Mar 18, 2015 at 10:59:19AM -0700, Joe Gordon wrote:
>> On Wed, Mar 18, 2015 at 3:09 AM, Daniel P. Berrange 
>> wrote:
>>
>> > On Wed, Mar 18, 2015 at 08:33:26AM +0100, Thomas Herve wrote:
>> > > > Interesting bug.  I think I agree with you that there isn't a good
>> > > > solution currently for instances that have a mix of shared and
>> > > > not-shared storage.
>> > > >
>> > > > I'm curious what Daniel meant by saying that marking the disk
>> > > > shareable is not as reliable as we would want.
>> > >
>> > > I think this is the bug I reported here:
>> > > https://bugs.launchpad.net/nova/+bug/1376615
>> > >
>> > > My initial approach was indeed to mark the disks as shareable: the
>> > > patch (https://review.openstack.org/#/c/125616/) has comments around
>> > > the issues, mainly around I/O cache and SELinux isolation being disabled.
>> >
>> > Yep, those are both show-stopper issues. The only solution is to fix the
>> > libvirt API for this first.
>> >
>>
>> Thanks for the clarification, is there a bug tracking this in libvirt
>> already?
>
> Actually I don't think there is one, so feel free to file one
>
>
> Regards,
> Daniel
> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
>



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-18 Thread Daniel P. Berrange
On Wed, Mar 18, 2015 at 10:59:19AM -0700, Joe Gordon wrote:
> On Wed, Mar 18, 2015 at 3:09 AM, Daniel P. Berrange 
> wrote:
> 
> > On Wed, Mar 18, 2015 at 08:33:26AM +0100, Thomas Herve wrote:
> > > > Interesting bug.  I think I agree with you that there isn't a good
> > > > solution currently for instances that have a mix of shared and
> > > > not-shared storage.
> > > >
> > > > I'm curious what Daniel meant by saying that marking the disk
> > > > shareable is not as reliable as we would want.
> > >
> > > I think this is the bug I reported here:
> > > https://bugs.launchpad.net/nova/+bug/1376615
> > >
> > > My initial approach was indeed to mark the disks as shareable: the
> > > patch (https://review.openstack.org/#/c/125616/) has comments around
> > > the issues, mainly around I/O cache and SELinux isolation being disabled.
> >
> > Yep, those are both show-stopper issues. The only solution is to fix the
> > libvirt API for this first.
> >
> 
> Thanks for the clarification, is there a bug tracking this in libvirt
> already?

Actually I don't think there is one, so feel free to file one


Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-18 Thread Joe Gordon
On Wed, Mar 18, 2015 at 3:09 AM, Daniel P. Berrange 
wrote:

> On Wed, Mar 18, 2015 at 08:33:26AM +0100, Thomas Herve wrote:
> > > Interesting bug.  I think I agree with you that there isn't a good
> > > solution currently for instances that have a mix of shared and
> > > not-shared storage.
> > >
> > > I'm curious what Daniel meant by saying that marking the disk
> > > shareable is not as reliable as we would want.
> >
> > I think this is the bug I reported here:
> > https://bugs.launchpad.net/nova/+bug/1376615
> >
> > My initial approach was indeed to mark the disks as shareable: the
> > patch (https://review.openstack.org/#/c/125616/) has comments around
> > the issues, mainly around I/O cache and SELinux isolation being disabled.
>
> Yep, those are both show-stopper issues. The only solution is to fix the
> libvirt API for this first.
>

Thanks for the clarification, is there a bug tracking this in libvirt
already?


>
> Regards,
> Daniel
> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|


Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-18 Thread Daniel P. Berrange
On Wed, Mar 18, 2015 at 08:33:26AM +0100, Thomas Herve wrote:
> > Interesting bug.  I think I agree with you that there isn't a good solution
> > currently for instances that have a mix of shared and not-shared storage.
> > 
> > I'm curious what Daniel meant by saying that marking the disk shareable is
> > not as reliable as we would want.
> 
> I think this is the bug I reported here: 
> https://bugs.launchpad.net/nova/+bug/1376615
> 
> My initial approach was indeed to mark the disks as shareable: the patch
> (https://review.openstack.org/#/c/125616/) has comments around the issues,
> mainly around I/O cache and SELinux isolation being disabled.

Yep, those are both show-stopper issues. The only solution is to fix the
libvirt API for this first.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-18 Thread Daniel P. Berrange
On Tue, Mar 17, 2015 at 01:33:26PM -0700, Joe Gordon wrote:
> On Thu, Jun 19, 2014 at 1:38 AM, Daniel P. Berrange 
> wrote:
> 
> > On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
> > > I am concerned about how block migration functions when Cinder volumes
> > > are attached to an instance being migrated.  We noticed some unexpected
> > > behavior recently, whereby attached generic NFS-based volumes would
> > > become entirely unsparse over the course of a migration.  After spending
> > > some time reviewing the code paths in Nova, I'm more concerned that this
> > > was actually a minor symptom of a much more significant issue.
> > >
> > > For those unfamiliar, NFS-based volumes are simply RAW files residing on
> > > an NFS mount.  From Libvirt's perspective, these volumes look no
> > > different than root or ephemeral disks.  We are currently not filtering
> > > out volumes whatsoever when making the request into Libvirt to perform
> > > the migration.  Libvirt simply receives an additional flag
> > > (VIR_MIGRATE_NON_SHARED_INC) when a block migration is requested, which
> > > is applied to the entire migration process, not differentiated on a
> > > per-disk basis.  Numerous guards exist within Nova to prevent a
> > > block-based migration from being allowed if the instance disks exist on
> > > the destination; yet volumes remain attached and within the defined XML
> > > during a block migration.
> > >
> > > Unless Libvirt has a lot more logic around this than I am led to
> > > believe, this seems like a recipe for corruption.  It seems as though
> > > this would also impact any type of volume attached to an instance
> > > (iSCSI, RBD, etc.); NFS just happens to be what we were testing.  If I
> > > am wrong and someone can correct my understanding, I would really
> > > appreciate it.  Otherwise, I'm surprised we haven't had more reports of
> > > issues when block migrations are used in conjunction with any attached
> > > volumes.
> >
> > Libvirt/QEMU has no special logic. When told to block-migrate, it will do
> > so for *all* disks attached to the VM in read-write-exclusive mode. It will
> > only skip those marked read-only or read-write-shared mode. Even that
> > distinction is somewhat dubious, and so is not reliably what you would want.
> >
> > It seems like we should just disallow block migrate when any cinder volumes
> > are attached to the VM, since there is never any valid use case for doing
> > block migrate from a cinder volume to itself.
> 
> Digging up this old thread because I am working on getting multi-node live
> migration testing running (https://review.openstack.org/#/c/165182/), and
> just ran into this issue (bug 1398999).
> 
> And I am not sure I agree with this statement. I think there is a valid
> case for doing block migrate with a cinder volume attached to an instance:

To be clear, I'm not saying the use cases for block-migrating Cinder volumes
are invalid, just that with the way libvirt exposes block migration today it
isn't safe for us to allow it, because we don't have the fine-grained control
to make it reliably safe from OpenStack. We need to improve the libvirt API
in this area and then we can support this feature properly.

> * Cloud isn't using a shared filesystem for ephemeral storage
> * Instance is booted from an image, and a volume is attached afterwards. An
> admin wants to take the box the instance is running on offline for
> maintenance with a minimal impact on the instances running on it.
> 
> What is the recommended solution for that use case? If the admin
> disconnects and reconnects the volume themselves, is there a risk of
> impacting what's running on the instance? etc.

Yes, and that sucks, but that's the only safe option today; otherwise
libvirt is going to try copying the data in the cinder volumes itself,
which means it is copying from the volume on one host back into the
very same volume on the other host. IOW it is rewriting all the data
even though the volume is shared between the hosts. This has dangerous
data corruption failure scenarios, as well as being massively wasteful
of CPU and network bandwidth.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-18 Thread Thomas Herve
> Interesting bug.  I think I agree with you that there isn't a good solution
> currently for instances that have a mix of shared and not-shared storage.
> 
> I'm curious what Daniel meant by saying that marking the disk shareable is
> not as reliable as we would want.

I think this is the bug I reported here: 
https://bugs.launchpad.net/nova/+bug/1376615

My initial approach was indeed to mark the disks as shareable: the patch
(https://review.openstack.org/#/c/125616/) has comments around the issues,
mainly around I/O cache and SELinux isolation being disabled.
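
For readers who want the mechanics: the approach amounts to adding a
<shareable/> element to each attached volume's disk definition before
migration. A rough sketch of the idea (an illustrative helper, not the
actual Nova patch; per the review comments, libvirt requires shareable
disks to run with the I/O cache disabled and drops their per-VM
SELinux/sVirt label):

    import xml.etree.ElementTree as ET

    def mark_volume_disks_shareable(domain_xml, volume_serials):
        # Add <shareable/> to every disk whose <serial> matches an
        # attached Cinder volume, so libvirt's block migration skips it.
        root = ET.fromstring(domain_xml)
        for disk in root.findall('./devices/disk'):
            serial = disk.findtext('serial')
            if serial in volume_serials and disk.find('shareable') is None:
                ET.SubElement(disk, 'shareable')
        return ET.tostring(root).decode()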

-- 
Thomas



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-17 Thread Chris Friesen

On 03/17/2015 02:33 PM, Joe Gordon wrote:


Digging up this old thread because I am working on getting multi-node live
migration testing running (https://review.openstack.org/#/c/165182/), and just
ran into this issue (bug 1398999).

And I am not sure I agree with this statement. I think there is a valid case for
doing block migrate with a cinder volume attached to an instance:


* Cloud isn't using a shared filesystem for ephemeral storage
* Instance is booted from an image, and a volume is attached afterwards. An
admin wants to take the box the instance is running on offline for maintenance
with a minimal impact on the instances running on it.

What is the recommended solution for that use case? If the admin disconnects and
reconnects the volume themselves, is there a risk of impacting what's running on
the instance? etc.


Interesting bug.  I think I agree with you that there isn't a good solution 
currently for instances that have a mix of shared and not-shared storage.


I'm curious what Daniel meant by saying that marking the disk shareable is not 
as reliable as we would want.


I think there is definitely a risk if the admin disconnects the volume--whether 
or not that causes problems depends on whether the application can handle that 
cleanly.


I suspect the "proper" cloud-aware strategy would be to just kill it and have 
another instance take over.  But that's not very helpful for 
not-fully-cloud-aware applications.


Also, since you've been playing in this area...do you know if we currently 
properly support all variations on live/cold migration, resize, evacuate, etc. 
for the boot-from-volume case?


Chris



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-03-17 Thread Joe Gordon
On Thu, Jun 19, 2014 at 1:38 AM, Daniel P. Berrange 
wrote:

> On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
> > I am concerned about how block migration functions when Cinder volumes
> > are attached to an instance being migrated.  We noticed some unexpected
> > behavior recently, whereby attached generic NFS-based volumes would
> > become entirely unsparse over the course of a migration.  After spending
> > some time reviewing the code paths in Nova, I'm more concerned that this
> > was actually a minor symptom of a much more significant issue.
> >
> > For those unfamiliar, NFS-based volumes are simply RAW files residing on
> > an NFS mount.  From Libvirt's perspective, these volumes look no
> > different than root or ephemeral disks.  We are currently not filtering
> > out volumes whatsoever when making the request into Libvirt to perform
> > the migration.  Libvirt simply receives an additional flag
> > (VIR_MIGRATE_NON_SHARED_INC) when a block migration is requested, which
> > is applied to the entire migration process, not differentiated on a
> > per-disk basis.  Numerous guards exist within Nova to prevent a
> > block-based migration from being allowed if the instance disks exist on
> > the destination; yet volumes remain attached and within the defined XML
> > during a block migration.
> >
> > Unless Libvirt has a lot more logic around this than I am led to
> > believe, this seems like a recipe for corruption.  It seems as though
> > this would also impact any type of volume attached to an instance
> > (iSCSI, RBD, etc.); NFS just happens to be what we were testing.  If I
> > am wrong and someone can correct my understanding, I would really
> > appreciate it.  Otherwise, I'm surprised we haven't had more reports of
> > issues when block migrations are used in conjunction with any attached
> > volumes.
>
> Libvirt/QEMU has no special logic. When told to block-migrate, it will do
> so for *all* disks attached to the VM in read-write-exclusive mode. It will
> only skip those marked read-only or read-write-shared mode. Even that
> distinction is somewhat dubious, and so is not reliably what you would want.
>
> It seems like we should just disallow block migrate when any cinder volumes
> are attached to the VM, since there is never any valid use case for doing
> block migrate from a cinder volume to itself.



Digging up this old thread because I am working on getting multi-node live
migration testing running (https://review.openstack.org/#/c/165182/), and
just ran into this issue (bug 1398999).

And I am not sure I agree with this statement. I think there is a valid
case for doing block migrate with a cinder volume attached to an instance:


* Cloud isn't using a shared filesystem for ephemeral storage
* Instance is booted from an image, and a volume is attached afterwards. An
admin wants to take the box the instance is running on offline for
maintenance with a minimal impact on the instances running on it.

What is the recommended solution for that use case? If the admin
disconnects and reconnects the volume themselves, is there a risk of
impacting what's running on the instance? etc.


>
> Regards,
> Daniel
> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
>


Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-02-16 Thread Daniel P. Berrange
On Mon, Feb 16, 2015 at 01:39:21PM +1300, Robert Collins wrote:
> On 19 June 2014 at 20:38, Daniel P. Berrange  wrote:
> > On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
> >> I am concerned about how block migration functions when Cinder volumes are
> >> attached to an instance being migrated.  We noticed some unexpected
> >> behavior recently, whereby attached generic NFS-based volumes would become
> >> entirely unsparse over the course of a migration.  After spending some time
> >> reviewing the code paths in Nova, I'm more concerned that this was actually
> >> a minor symptom of a much more significant issue.
> >>
> >> For those unfamiliar, NFS-based volumes are simply RAW files residing on an
> >> NFS mount.  From Libvirt's perspective, these volumes look no different
> >> than root or ephemeral disks.  We are currently not filtering out volumes
> >> whatsoever when making the request into Libvirt to perform the migration.
> >> Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC)
> >> when a block migration is requested, which is applied to the entire
> >> migration process, not differentiated on a per-disk basis.  Numerous
> >> guards exist within Nova to prevent a block-based migration from being
> >> allowed if the instance disks exist on the destination; yet volumes
> >> remain attached and within the defined XML during a block migration.
> >>
> >> Unless Libvirt has a lot more logic around this than I am led to believe,
> >> this seems like a recipe for corruption.  It seems as though this would
> >> also impact any type of volume attached to an instance (iSCSI, RBD, etc.);
> >> NFS just happens to be what we were testing.  If I am wrong and someone can
> >> correct my understanding, I would really appreciate it.  Otherwise, I'm
> >> surprised we haven't had more reports of issues when block migrations are
> >> used in conjunction with any attached volumes.
> >
> > Libvirt/QEMU has no special logic. When told to block-migrate, it will do
> > so for *all* disks attached to the VM in read-write-exclusive mode. It will
> > only skip those marked read-only or read-write-shared mode. Even that
> > distinction is somewhat dubious, and so is not reliably what you would want.
> >
> > It seems like we should just disallow block migrate when any cinder volumes
> > are attached to the VM, since there is never any valid use case for doing
> > block migrate from a cinder volume to itself.
> >
> > Regards,
> > Daniel
> 
> Just ran across this from bug
> https://bugs.launchpad.net/nova/+bug/1398999. Is there some way to
> signal to libvirt that some block devices shouldn't be migrated by it
> but instead are known to be networked etc? Or put another way, how can
> we have our cake and eat it too. It's not uncommon for a VM to be
> cinder-booted but have local storage for swap... and AIUI the fix we
> put in for this bug stops those VMs being migrated. Do you think it
> is tractable (but needs libvirt work), or is it something endemic to
> the problem (e.g. dirty page synchronisation with the VM itself) that
> will be in the way?

It is merely a missing feature in libvirt that no one has had the
time to address yet.
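
Presumably such a feature would take the shape of a per-disk selection
parameter on the migration call, along these lines (hypothetical at the
time of this thread, sketched against the python-libvirt typed-parameter
API; the 'migrate_disks' parameter name is illustrative of the missing
piece):

    import libvirt

    def block_migrate_local_only(dom, dest_uri, local_disks):
        # Hypothetical: block-migrate only the named local disks
        # (e.g. ['vda', 'vdb']) and leave Cinder-backed disks untouched.
        params = {'migrate_disks': local_disks}
        flags = (libvirt.VIR_MIGRATE_LIVE |
                 libvirt.VIR_MIGRATE_PEER2PEER |
                 libvirt.VIR_MIGRATE_NON_SHARED_INC)
        dom.migrateToURI3(dest_uri, params, flags)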

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-02-15 Thread Tony Breeds
On Mon, Feb 16, 2015 at 01:39:21PM +1300, Robert Collins wrote:

> Just ran across this from bug
> https://bugs.launchpad.net/nova/+bug/1398999. Is there some way to
> signal to libvirt that some block devices shouldn't be migrated by it
> but instead are known to be networked etc? Or put another way, how can
> we have our cake and eat it too. It's not uncommon for a VM to be
> cinder-booted but have local storage for swap... and AIUI the fix we
> put in for this bug stops those VMs being migrated. Do you think it
> is tractable (but needs libvirt work), or is it something endemic to
> the problem (e.g. dirty page synchronisation with the VM itself) that
> will be in the way?


I have a half-drafted email for the libvirt devel list proposing this
exact thing: allow an element in the XML that tells libvirt how/if it
can/should migrate a device.  As noted previously, I'm happy to do
qemu/libvirt work to help OpenStack.

Doing this would solve a few other issues in OpenStack without breaking
existing users.

So now I look like I'm jumping on your bandwagon... Phooey.
 
Yours Tony.




Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2015-02-15 Thread Robert Collins
On 19 June 2014 at 20:38, Daniel P. Berrange  wrote:
> On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
>> I am concerned about how block migration functions when Cinder volumes are
>> attached to an instance being migrated.  We noticed some unexpected
>> behavior recently, whereby attached generic NFS-based volumes would become
>> entirely unsparse over the course of a migration.  After spending some time
>> reviewing the code paths in Nova, I'm more concerned that this was actually
>> a minor symptom of a much more significant issue.
>>
>> For those unfamiliar, NFS-based volumes are simply RAW files residing on an
>> NFS mount.  From Libvirt's perspective, these volumes look no different
>> than root or ephemeral disks.  We are currently not filtering out volumes
>> whatsoever when making the request into Libvirt to perform the migration.
>> Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC)
>> when a block migration is requested, which is applied to the entire
>> migration process, not differentiated on a per-disk basis.  Numerous
>> guards exist within Nova to prevent a block-based migration from being
>> allowed if the instance disks exist on the destination; yet volumes
>> remain attached and within the defined XML during a block migration.
>>
>> Unless Libvirt has a lot more logic around this than I am led to believe,
>> this seems like a recipe for corruption.  It seems as though this would
>> also impact any type of volume attached to an instance (iSCSI, RBD, etc.);
>> NFS just happens to be what we were testing.  If I am wrong and someone can
>> correct my understanding, I would really appreciate it.  Otherwise, I'm
>> surprised we haven't had more reports of issues when block migrations are
>> used in conjunction with any attached volumes.
>
> Libvirt/QEMU has no special logic. When told to block-migrate, it will do
> so for *all* disks attached to the VM in read-write-exclusive mode. It will
> only skip those marked read-only or read-write-shared mode. Even that
> distinction is somewhat dubious, and so is not reliably what you would want.
>
> It seems like we should just disallow block migrate when any cinder volumes
> are attached to the VM, since there is never any valid use case for doing
> block migrate from a cinder volume to itself.
>
> Regards,
> Daniel

Just ran across this from bug
https://bugs.launchpad.net/nova/+bug/1398999. Is there some way to
signal to libvirt that some block devices shouldn't be migrated by it
but instead are known to be networked etc? Or put another way, how can
we have our cake and eat it too. It's not uncommon for a VM to be
cinder-booted but have local storage for swap... and AIUI the fix we
put in for this bug stops those VMs being migrated. Do you think it
is tractable (but needs libvirt work), or is it something endemic to
the problem (e.g. dirty page synchronisation with the VM itself) that
will be in the way?

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2014-06-19 Thread Duncan Thomas
I think there are two different processes here making use of libvirt
block migration:

1) Instance migration, which should not do anything with cinder volumes
2) Cinder live migration between backends, which is what I think Ronen
Kat is referring to

On 19 June 2014 11:06, Ronen Kat  wrote:
> The use-case for block migration in Libvirt/QEMU is to allow migration
> between two different back-ends.
> This is basically a host-based volume migration; ESXi has similar
> functionality (storage vMotion), but it is probably not enabled with OpenStack.
> Btw, if the Cinder volume driver can migrate the volume by itself,
> Libvirt/QEMU is not called upon, but if it can't (different vendor boxes
> don't talk to each other), then Cinder asks Nova to help move the data...
>
> If you are missing this host-based process you basically have "data
> lock-in" on a specific back-end - the use case could be storage evacuation,
> or just moving the data to a different box.
>
> Ronen,
>
>
>
> From:"Daniel P. Berrange" 
> To:"OpenStack Development Mailing List (not for usage questions)"
> ,
> Date:        19/06/2014 11:42 AM
> Subject:Re: [openstack-dev] [nova][libvirt] Block migrations and
> Cinder volumes
> 
>
>
>
> On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
>> I am concerned about how block migration functions when Cinder volumes are
>> attached to an instance being migrated.  We noticed some unexpected
>> behavior recently, whereby attached generic NFS-based volumes would become
>> entirely unsparse over the course of a migration.  After spending some
>> time reviewing the code paths in Nova, I'm more concerned that this was
>> actually a minor symptom of a much more significant issue.
>>
>> For those unfamiliar, NFS-based volumes are simply RAW files residing on
>> an NFS mount.  From Libvirt's perspective, these volumes look no different
>> than root or ephemeral disks.  We are currently not filtering out volumes
>> whatsoever when making the request into Libvirt to perform the migration.
>> Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC)
>> when a block migration is requested, which is applied to the entire
>> migration process, not differentiated on a per-disk basis.  Numerous
>> guards exist within Nova to prevent a block-based migration from being
>> allowed if the instance disks exist on the destination; yet volumes
>> remain attached and within the defined XML during a block migration.
>>
>> Unless Libvirt has a lot more logic around this than I am led to believe,
>> this seems like a recipe for corruption.  It seems as though this would
>> also impact any type of volume attached to an instance (iSCSI, RBD, etc.);
>> NFS just happens to be what we were testing.  If I am wrong and someone
>> can correct my understanding, I would really appreciate it.  Otherwise,
>> I'm surprised we haven't had more reports of issues when block migrations
>> are used in conjunction with any attached volumes.
>
> Libvirt/QEMU has no special logic. When told to block-migrate, it will do
> so for *all* disks attached to the VM in read-write-exclusive mode. It will
> only skip those marked read-only or read-write-shared mode. Even that
> distinction is somewhat dubious, and so is not reliably what you would want.
>
> It seems like we should just disallow block migrate when any cinder volumes
> are attached to the VM, since there is never any valid use case for doing
> block migrate from a cinder volume to itself.
>
> Regards,
> Daniel
> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



-- 
Duncan Thomas



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2014-06-19 Thread Ronen Kat
The use-case for block migration in Libvirt/QEMU is to allow migration 
between two different back-ends.
This is basically a host-based volume migration; ESXi has similar
functionality (storage vMotion), but it is probably not enabled with OpenStack.
Btw, if the Cinder volume driver can migrate the volume by itself,
Libvirt/QEMU is not called upon, but if it can't (different vendor boxes
don't talk to each other), then Cinder asks Nova to help move the data...

If you are missing this host-based process you basically have "data
lock-in" on a specific back-end - the use case could be storage
evacuation, or just moving the data to a different box.
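
For reference, that Cinder-driven path is exposed through the volume
migration API; with the cinder CLI the call looks roughly like this
(the host string is deployment-specific, and --force-host-copy forces
the generic host-assisted copy instead of a driver-native move):

    cinder migrate --force-host-copy True <volume-id> <destination-host>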

Ronen,



From: "Daniel P. Berrange" 
To: "OpenStack Development Mailing List (not for usage questions)" ,
Date: 19/06/2014 11:42 AM
Subject: Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes



On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
> I am concerned about how block migration functions when Cinder volumes
> are attached to an instance being migrated.  We noticed some unexpected
> behavior recently, whereby attached generic NFS-based volumes would
> become entirely unsparse over the course of a migration.  After spending
> some time reviewing the code paths in Nova, I'm more concerned that this
> was actually a minor symptom of a much more significant issue.
>
> For those unfamiliar, NFS-based volumes are simply RAW files residing on
> an NFS mount.  From Libvirt's perspective, these volumes look no
> different than root or ephemeral disks.  We are currently not filtering
> out volumes whatsoever when making the request into Libvirt to perform
> the migration.  Libvirt simply receives an additional flag
> (VIR_MIGRATE_NON_SHARED_INC) when a block migration is requested, which
> is applied to the entire migration process, not differentiated on a
> per-disk basis.  Numerous guards exist within Nova to prevent a
> block-based migration from being allowed if the instance disks exist on
> the destination; yet volumes remain attached and within the defined XML
> during a block migration.
>
> Unless Libvirt has a lot more logic around this than I am led to
> believe, this seems like a recipe for corruption.  It seems as though
> this would also impact any type of volume attached to an instance
> (iSCSI, RBD, etc.); NFS just happens to be what we were testing.  If I
> am wrong and someone can correct my understanding, I would really
> appreciate it.  Otherwise, I'm surprised we haven't had more reports of
> issues when block migrations are used in conjunction with any attached
> volumes.

Libvirt/QEMU has no special logic. When told to block-migrate, it will do
so for *all* disks attached to the VM in read-write-exclusive mode. It will
only skip those marked read-only or read-write-shared mode. Even that
distinction is somewhat dubious, and so is not reliably what you would want.

It seems like we should just disallow block migrate when any cinder volumes
are attached to the VM, since there is never any valid use case for doing
block migrate from a cinder volume to itself.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2014-06-19 Thread Daniel P. Berrange
On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
> I am concerned about how block migration functions when Cinder volumes are
> attached to an instance being migrated.  We noticed some unexpected
> behavior recently, whereby attached generic NFS-based volumes would become
> entirely unsparse over the course of a migration.  After spending some time
> reviewing the code paths in Nova, I'm more concerned that this was actually
> a minor symptom of a much more significant issue.
> 
> For those unfamiliar, NFS-based volumes are simply RAW files residing on an
> NFS mount.  From Libvirt's perspective, these volumes look no different
> than root or ephemeral disks.  We are currently not filtering out volumes
> whatsoever when making the request into Libvirt to perform the migration.
> Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC)
> when a block migration is requested, which is applied to the entire
> migration process, not differentiated on a per-disk basis.  Numerous
> guards exist within Nova to prevent a block-based migration from being
> allowed if the instance disks exist on the destination; yet volumes
> remain attached and within the defined XML during a block migration.
> 
> Unless Libvirt has a lot more logic around this than I am led to believe,
> this seems like a recipe for corruption.  It seems as though this would
> also impact any type of volume attached to an instance (iSCSI, RBD, etc.);
> NFS just happens to be what we were testing.  If I am wrong and someone can
> correct my understanding, I would really appreciate it.  Otherwise, I'm
> surprised we haven't had more reports of issues when block migrations are
> used in conjunction with any attached volumes.

Libvirt/QEMU has no special logic. When told to block-migrate, it will do
so for *all* disks attached to the VM in read-write-exclusive mode. It will
only skip those marked read-only or read-write-shared mode. Even that
distinction is somewhat dubious, and so is not reliably what you would want.
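
Put differently, the only per-disk signal libvirt honours is in the disk
XML itself. A rough approximation of the selection rule (illustrative,
not libvirt's actual code; disks are ElementTree elements parsed from
the domain XML):

    def disks_libvirt_would_copy(disks):
        # Every disk is block-migrated unless its XML carries
        # <readonly/> or <shareable/> (read-write-shared).
        return [d for d in disks
                if d.find('readonly') is None
                and d.find('shareable') is None]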

It seems like we should just disallow block migrate when any cinder volumes
are attached to the VM, since there is never any valid use case for doing
block migrate from a cinder volume to itself.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|



[openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

2014-06-18 Thread Rafi Khardalian
I am concerned about how block migration functions when Cinder volumes are
attached to an instance being migrated.  We noticed some unexpected
behavior recently, whereby attached generic NFS-based volumes would become
entirely unsparse over the course of a migration.  After spending some time
reviewing the code paths in Nova, I'm more concerned that this was actually
a minor symptom of a much more significant issue.

For those unfamiliar, NFS-based volumes are simply RAW files residing on an
NFS mount.  From Libvirt's perspective, these volumes look no different
than root or ephemeral disks.  We are currently not filtering out volumes
whatsoever when making the request into Libvirt to perform the migration.
Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC)
when a block migration is requested, which is applied to the entire migration
process, not differentiated on a per-disk basis.  Numerous guards exist within
Nova to prevent a block-based migration from being allowed if the instance
disks exist on the destination; yet volumes remain attached and within the
defined XML during a block migration.
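
To make the all-or-nothing nature of that flag concrete, the libvirt
call boils down to something like the following (a sketch using the
python-libvirt binding, not Nova's actual code):

    import libvirt

    def live_migrate(dom, dest_uri, block_migration):
        # A single flag governs the whole migration; there is no way to
        # say "copy the local root disk but leave the NFS-backed Cinder
        # volume alone".
        flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PEER2PEER
        if block_migration:
            # Incrementally copy *every* writable, non-shareable disk.
            flags |= libvirt.VIR_MIGRATE_NON_SHARED_INC
        dom.migrateToURI(dest_uri, flags, None, 0)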

Unless Libvirt has a lot more logic around this than I am led to believe,
this seems like a recipe for corruption.  It seems as though this would
also impact any type of volume attached to an instance (iSCSI, RBD, etc.);
NFS just happens to be what we were testing.  If I am wrong and someone can
correct my understanding, I would really appreciate it.  Otherwise, I'm
surprised we haven't had more reports of issues when block migrations are
used in conjunction with any attached volumes.

I have ideas on how we can address the issue if we can reach some consensus
that the issue is valid, but we'll discuss those if/when we get to
that point.

Regards,
Rafi