Yes, that's the blueprint where we're tracking it. I've added work
items to the blueprint based on the spec we've got so far.

On Mon, Nov 25, 2013 at 6:37 AM, Mike Scherbakov
<[email protected]> wrote:
> Dmitry,
> are we tracking this effort in the blueprint
> https://blueprints.launchpad.net/fuel/+spec/ceph-live-migration ?
>
> Can you add the work items which need to be done to the Work Items
> section, so we can track everything there?
>
> Thanks,
>
>
> On Wed, Nov 20, 2013 at 1:55 AM, Andrey Korolyov <[email protected]>
> wrote:
>>
>> On 11/20/2013 12:32 AM, Dmitry Borodaenko wrote:
>> > Yes, we were able to live-migrate an instance today. After
>> > re-migration back to the original node, the instance began reporting
>> > weird I/O errors on some commands; Ryan is re-testing to check
>> > whether the same problem occurs again or whether it was a
>> > Cirros-specific fluke.
>> >
>> > Here's our task list based on research so far:
>> > 1) patch Nova to add CoW from images to instance boot drives as per
>> > OSCI-773 (see the rbd sketch below).
>> > 2) patch Nova to disable the shared filesystem check for live
>> > migration of non-volume-backed instances (we have a hack in place;
>> > I'm working on a proper patch).
>> > 3) patch Nova to remove 'rbd ls' from the rbd driver as per Ceph
>> > #6693, found by Andrey K.
>> > 4) patch the Ceph manifests to create a new 'compute' Ceph user,
>> > keyring, and pool for Nova (we tested with the images user so far),
>> > and to use the 'compute' user instead of 'volumes' when defining the
>> > libvirt secret.
>> > 5) figure out TLS and TCP auth configuration for libvirt: we had to
>> > disable it to make live migrations work; we have to investigate how
>> > to make them work in a more secure configuration and patch the Ceph
>> > manifests accordingly.
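>> >
>> > For item (1), a minimal sketch of the RBD layering that CoW boot
>> > drives rely on (pool and image names here are placeholders, not
>> > necessarily what the OSCI-773 patch uses):
>> >
>> >   # snapshot the Glance image and protect the snapshot for cloning
>> >   rbd snap create images/<image-id>@snap
>> >   rbd snap protect images/<image-id>@snap
>> >   # clone it into the compute pool as the instance boot drive
>> >   rbd clone images/<image-id>@snap compute/<instance-id>_disk
>> >
>> > The clone shares unmodified blocks with the parent snapshot, so the
>> > instance boots without a full copy of the image.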
>>
>> I suppose the patch should go into libvirt.pp, not the Ceph
>> manifests. I have mixed feelings on the topic: on one hand, there is
>> no actual reason to wrap migration traffic inside an intranet in a
>> secure layer (SSH/TLS), and of course I do the same at Flops. On the
>> other hand, we should take care not to release something insecure
>> even if the rest of the stack is nowhere near the same level of
>> security. Implementation complexity aside, the most proper way to do
>> this without a performance impact is, of course, the TCP+TLS
>> transport. Keep in mind that there is no privilege separation in
>> libvirt, so once we open a plain TCP socket for migrations, any user
>> with any kind of access to the local subnet can gain complete control
>> over libvirtd and the VMs under it. Generally, we _may_ release with
>> the simple TCP transport, but a patch/upgrade with a proper TLS
>> implementation should then go in the MUST section of one of the
>> upcoming releases. The TLS transport will require a single-point CA,
>> either just a bunch of files rsynced across the controllers or
>> something more mature like [1], but either way the amount of work is
>> small compared to the resulting improvement in OpenStack security.
>> The community may blame us for the most simplistic and
>> straightforward implementation, but it is the shortest and smartest
>> way to get this on board right now.
>>
>> [1.] http://www.ejbca.org/
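>>
>> For reference, a rough sketch of what a TLS-only libvirtd.conf could
>> look like (paths are the libvirt defaults; generating and
>> distributing the certificates is the part that needs the CA work):
>>
>>   # /etc/libvirt/libvirtd.conf -- TLS transport for migration
>>   listen_tls = 1    # listen on the TLS port (16514 by default)
>>   listen_tcp = 0    # no plain TCP socket
>>   ca_file   = "/etc/pki/CA/cacert.pem"
>>   cert_file = "/etc/pki/libvirt/servercert.pem"
>>   key_file  = "/etc/pki/libvirt/private/serverkey.pem"
>>
>> The insecure configuration described above would be listen_tcp = 1
>> with auth_tcp = "none".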
>> > 6) patch the Ceph manifests to modify nova.conf (enable the RBD
>> > backend, configure the Ceph pool and user credentials, etc.).
>> > 7) patch the OpenStack manifests to open the libvirt qemu/kvm live
>> > migration ports between compute nodes (see the iptables sketch
>> > below), and report a Nova bug about live migration being silently
>> > cancelled without reporting the libvirt connection failure.
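>> >
>> > For item (7), roughly the rules the manifests will need; the port
>> > numbers are the libvirt/QEMU defaults, and the management network
>> > CIDR is a placeholder:
>> >
>> >   # libvirtd remote transport: 16509 plain TCP, 16514 TLS
>> >   iptables -A INPUT -s <mgmt-net-cidr> -p tcp --dport 16509 -j ACCEPT
>> >   iptables -A INPUT -s <mgmt-net-cidr> -p tcp --dport 16514 -j ACCEPT
>> >   # QEMU memory transfer during live migration
>> >   iptables -A INPUT -s <mgmt-net-cidr> -p tcp --dport 49152:49215 -j ACCEPT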
>> >
>> > Can anyone help with item (5) above?
>> >
>> >
>> > On Tue, Nov 19, 2013 at 2:53 AM, Mike Scherbakov
>> > <[email protected]> wrote:
>> >> I'd like to keep all the issues on the subject in a single email
>> >> thread, so here is what I copy-pasted from A. Korolyov:
>> >>> http://tracker.ceph.com/issues/6693
>> >>
>> >> Also, I don't see any reason for keeping this conversation private, so
>> >> I'm
>> >> adding fuel-dev.
>> >>
>> >> Dmitry - any successes so far in your research?
>> >>
>> >>
>> >> On Tue, Nov 19, 2013 at 1:53 AM, Dmitry Borodaenko
>> >> <[email protected]> wrote:
>> >>>
>> >>> The reason it's not a limitation for a volume-backed instance is
>> >>> this misguided conditional:
>> >>>
>> >>>
>> >>> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3922
>> >>>
>> >>> It assumes that only a volume-backed instance without ephemeral
>> >>> disks can be live-migrated without shared storage. I also found
>> >>> many other places in Nova's live migration code making the same
>> >>> assumption. What I have not found so far is any real reason for
>> >>> shared storage to be required for anything other than backing the
>> >>> instance's boot drive, which is no longer a concern with the
>> >>> Ephemeral RBD patch. I'll try to disable this and other similar
>> >>> checks and see whether that makes live migration work for an
>> >>> instance backed by RBD.
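>> >>>
>> >>> Paraphrased, the conditional boils down to something like this (my
>> >>> restatement of its shape, not the exact Nova code):
>> >>>
>> >>>   # without shared storage, only a volume-backed instance with no
>> >>>   # local (ephemeral) disks is allowed to live-migrate
>> >>>   if not (shared_storage or
>> >>>           (dest_check_data['is_volume_backed'] and
>> >>>            not has_local_disks)):
>> >>>       reason = _("Live migration can not be used without shared "
>> >>>                  "storage.")
>> >>>       raise exception.InvalidSharedStorage(reason=reason, path=source)
>> >>>
>> >>> With the Ephemeral RBD patch, the "local" disks actually live in
>> >>> Ceph, so this condition is stricter than it needs to be.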
>> >>>
>> >>> If that's the case and there are no other blockers in nova, libvirt or
>> >>> qemu, fixing this in Nova will indeed be relatively straightforward.
>> >>>
>> >>> -Dmitry
>> >>>
>> >>> On Mon, Nov 18, 2013 at 9:37 AM, Mike Scherbakov
>> >>> <[email protected]> wrote:
>> >>>> If an instance boots from a volume, Nova should not have such a
>> >>>> limitation. So if it does, it might be easier to fix Nova instead.
>> >>>>
>> >>>>
>> >>>> On Mon, Nov 18, 2013 at 8:56 PM, Dmitry Borodaenko
>> >>>> <[email protected]> wrote:
>> >>>>>
>> >>>>> I used patched packages built by the OSCI team per Jira
>> >>>>> OSCI-773. There are two more patches on the branch mentioned in
>> >>>>> the thread on ceph-users; I still need to review and test those.
>> >>>>>
>> >>>>> We have seen the same error reported in that thread about shared
>> >>>>> storage: Nova requires /var/lib/nova to be shared between all
>> >>>>> compute nodes for live migrations. I am still waiting for Haomai
>> >>>>> to confirm whether he was able to overcome this limitation. If
>> >>>>> not, we will have to add GlusterFS or CephFS, which is too much
>> >>>>> work for the 4.0 timeframe.
>> >>>>>
>> >>>>> On Nov 18, 2013 1:32 AM, "Mike Scherbakov"
>> >>>>> <[email protected]>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Dmitry - sorry for the late response.
>> >>>>>> This is good news - I remember the time when we were
>> >>>>>> experimenting with DRBD, and now we will have Ceph, which should
>> >>>>>> be way better for the purposes we need it for.
>> >>>>>>
>> >>>>>>> works with the patched Nova packages
>> >>>>>> What patches did you apply? Is the OSCI team already aware?
>> >>>>>>
>> >>>>>> Now that we've merged havana into master, what are your
>> >>>>>> estimates for enabling all of this? We had a meeting with Roman
>> >>>>>> and David, and we really want to have live migration enabled in
>> >>>>>> 4.0 (see #6 here:
>> >>>>>> https://mirantis.jira.com/wiki/display/PRD/4.0+-+Mirantis+OpenStack+release+home+page)
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Nov 13, 2013 at 12:39 AM, Dmitry Borodaenko
>> >>>>>> <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>> Ephemeral storage in Ceph works with the patched Nova
>> >>>>>>> packages; we can start updating our Ceph manifests as soon as
>> >>>>>>> we have the havana branch merged into fuel master!
>> >>>>>>>
>> >>>>>>> ---------- Forwarded message ----------
>> >>>>>>> From: Dmitry Borodaenko <[email protected]>
>> >>>>>>> Date: Tue, Nov 12, 2013 at 12:38 PM
>> >>>>>>> Subject: Re: Ephemeral RBD with Havana and Dumpling
>> >>>>>>> To: [email protected]
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> And to answer my own question, I was missing a meaningful error
>> >>>>>>> message: what the ObjectNotFound exception I got from librados
>> >>>>>>> didn't
>> >>>>>>> tell me was that I didn't have the images keyring file in
>> >>>>>>> /etc/ceph/
>> >>>>>>> on my compute node. After 'ceph auth get-or-create client.images >
>> >>>>>>> /etc/ceph/ceph.client.images.keyring' and reverting the images
>> >>>>>>> caps back to their original state, it all works!
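>> >>>>>>>
>> >>>>>>> For the record, once we move off the wide-open images caps, a
>> >>>>>>> locked-down user for Nova could look something like this (the
>> >>>>>>> 'compute' pool name is just my placeholder, mirroring the
>> >>>>>>> client.volumes caps quoted below):
>> >>>>>>>
>> >>>>>>>   ceph auth get-or-create client.compute \
>> >>>>>>>     mon 'allow r' \
>> >>>>>>>     osd 'allow class-read object_prefix rbd_children, allow rwx pool=compute, allow rx pool=images'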
>> >>>>>>>
>> >>>>>>> On Tue, Nov 12, 2013 at 12:19 PM, Dmitry Borodaenko
>> >>>>>>> <[email protected]> wrote:
>> >>>>>>>> I can get ephemeral storage for Nova to work with the RBD
>> >>>>>>>> backend, but I don't understand why it only works with the
>> >>>>>>>> admin cephx user. With a different user, starting a VM fails,
>> >>>>>>>> even if I set its caps to 'allow *'.
>> >>>>>>>>
>> >>>>>>>> Here's what I have in nova.conf:
>> >>>>>>>> libvirt_images_type=rbd
>> >>>>>>>> libvirt_images_rbd_pool=images
>> >>>>>>>> rbd_secret_uuid=fd9a11cc-6995-10d7-feb4-d338d73a4399
>> >>>>>>>> rbd_user=images
>> >>>>>>>>
>> >>>>>>>> The secret UUID is defined following the same steps as for Cinder
>> >>>>>>>> and
>> >>>>>>>> Glance:
>> >>>>>>>> http://ceph.com/docs/master/rbd/libvirt/
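>> >>>>>>>>
>> >>>>>>>> That is, roughly the following, with secret.xml containing the
>> >>>>>>>> UUID above as described in the Ceph docs:
>> >>>>>>>>
>> >>>>>>>>   virsh secret-define --file secret.xml
>> >>>>>>>>   virsh secret-set-value \
>> >>>>>>>>     --secret fd9a11cc-6995-10d7-feb4-d338d73a4399 \
>> >>>>>>>>     --base64 $(ceph auth get-key client.images)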
>> >>>>>>>>
>> >>>>>>>> BTW, the rbd_user option doesn't seem to be documented
>> >>>>>>>> anywhere; is that a documentation bug?
>> >>>>>>>>
>> >>>>>>>> And here's what 'ceph auth list' tells me about my cephx users:
>> >>>>>>>>
>> >>>>>>>> client.admin
>> >>>>>>>>         key: AQCoSX1SmIo0AxAAnz3NffHCMZxyvpz65vgRDg==
>> >>>>>>>>         caps: [mds] allow
>> >>>>>>>>         caps: [mon] allow *
>> >>>>>>>>         caps: [osd] allow *
>> >>>>>>>> client.images
>> >>>>>>>>         key: AQC1hYJS0LQhDhAAn51jxI2XhMaLDSmssKjK+g==
>> >>>>>>>>         caps: [mds] allow
>> >>>>>>>>         caps: [mon] allow *
>> >>>>>>>>         caps: [osd] allow *
>> >>>>>>>> client.volumes
>> >>>>>>>>         key: AQALSn1ScKruMhAAeSETeatPLxTOVdMIt10uRg==
>> >>>>>>>>         caps: [mon] allow r
>> >>>>>>>>         caps: [osd] allow class-read object_prefix rbd_children,
>> >>>>>>>> allow
>> >>>>>>>> rwx pool=volumes, allow rx pool=images
>> >>>>>>>>
>> >>>>>>>> Setting rbd_user to images or volumes doesn't work.
>> >>>>>>>>
>> >>>>>>>> What am I missing?
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Dmitry Borodaenko
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Dmitry Borodaenko
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Dmitry Borodaenko
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Mike Scherbakov
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Mike Scherbakov
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Dmitry Borodaenko
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Mike Scherbakov
>> >
>> >
>> >
>>
>
>
>
> --
> Mike Scherbakov
> #mihgen



-- 
Dmitry Borodaenko
