On Wed, Jul 22, 2015 at 3:52 PM, Oleg Gelbukh <ogelb...@mirantis.com> wrote:

> Greetings,
>
> While working on an upgrade of OpenStack with the Fuel installer, I ran
> into a requirement to re-add OSD devices with an existing data set to a
> Ceph cluster using the Puppet module. The node is reinstalled during the
> upgrade, so the disks used for OSDs are not mounted at Puppet runtime.
>
> The current version of the Ceph module in fuel-library only supports the
> addition of new OSD devices. Mounted devices are skipped. Unmounted devices
> with a Ceph UUID in the GPT label are passed to the 'ceph-deploy osd
> prepare' command, which formats the device and recreates the file system,
> so all existing data is lost.
>
> I proposed a patch to add support for OSD devices with an existing data set:
> https://review.openstack.org/#/c/203639/2
>
> However, this fix is very straightforward and doesn't account for various
> corner cases, as Mykola Golub pointed out in review. As this problem seems
> rather significant to me, I'd like to bring this discussion to a broader
> audience.
>
> So, here's the comment with my replies inline:

Oleg,

Sorry for the delay. I saw your message but missed that, apart from my
comments, it contained your replies. See my comments below.
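
Just to recap the behaviour we are discussing, here is a rough illustration
from my side (not taken from the patch; the device name is only an example
and the partition type GUID is quoted from memory):

  # ceph-disk marks an OSD data partition with a dedicated GPT partition
  # type GUID, which is how an unmounted device gets recognised as "Ceph":
  sgdisk --info=1 /dev/sdb | grep -i 'Partition GUID code'
  # expected for OSD data (if I remember the type code right):
  #   4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D
  #
  # Today such a device is passed straight to
  #   ceph-deploy osd prepare <node>:/dev/sdb
  # which re-creates the filesystem and loses the existing data set.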

>
> I am not sure just reactivating disks that have a filesystem is a safe
> approach:
>
> 1) If you are deploying a mix of new and restored disks, you may end up
> with conflicting OSDs joining the cluster with the same ID.
> 2) It makes sense to restore OSDs only if a monitor (cluster) is restored;
> otherwise activation of old OSDs will fail.
> 3) It might happen that the partition contains a valid filesystem by
> accident (e.g. the user reused disks/hosts from another cluster) -- it will
> not join the cluster because of a wrong fsid and credentials, but the
> deployment will unexpectedly fail.
>
> 1) As far as I can tell, OSD device IDs are assigned by the Ceph cluster
> based on the already existing devices. So, if some ID is stored on the
> device, then either a device with that ID already exists in the cluster and
> no new device will get the same ID, or the cluster doesn't know about a
> device with that ID, which means we have already lost the data placement
> anyway.

I was thinking here about the case where you are restoring the cluster from
scratch, re-adding OSD devices to the osd map. So agreed: in your case, if
the OSDs have not been removed from the cluster map, it should work.
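
For reference, with the filestore layout the identity we are talking about
sits in small plain-text files on the data partition, so it is cheap to
inspect before deciding anything. A rough sketch (the device name is only
an example):

  # Mount the candidate data partition read-only and look at what it
  # says about itself (filestore layout assumed).
  mkdir -p /mnt/osd-probe
  mount -o ro /dev/sdb1 /mnt/osd-probe   # /dev/sdb1 is just an example
  cat /mnt/osd-probe/ceph_fsid           # fsid of the cluster it belonged to
  cat /mnt/osd-probe/whoami              # OSD id it was assigned
  cat /mnt/osd-probe/fsid                # UUID of this particular OSD
  umount /mnt/osd-probe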

> 2) This can be fixed by adding a check that ensures that the fsid parameter
> in ceph.conf on the node and the cluster fsid on the device are equal.
> Otherwise the device is treated as a new device, i.e. passed to
> 'ceph-deploy osd prepare'.

Yes, I think that after successfully mounting the device we should check
both that the cluster ID matches and that the OSD ID matches what is in the
cluster map.
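
Something along these lines, I mean (untested sketch; it assumes the
partition is still mounted at /mnt/osd-probe as in the snippet above, and
that ceph.conf has the usual 'fsid = <uuid>' line in [global]):

  # Cluster fsid recorded on the disk vs. fsid in ceph.conf on the node.
  disk_fsid=$(cat /mnt/osd-probe/ceph_fsid)
  conf_fsid=$(awk '/^[[:space:]]*fsid[[:space:]]*=/ {print $NF}' /etc/ceph/ceph.conf)
  # OSD id recorded on the disk vs. ids currently in the osd map.
  osd_id=$(cat /mnt/osd-probe/whoami)
  if [ "$disk_fsid" = "$conf_fsid" ] && ceph osd ls | grep -qx "$osd_id"; then
      echo "OSD $osd_id belongs to this cluster -- activate, do not prepare"
  else
      echo "foreign or unknown OSD -- treat the device as new (prepare)"
  fi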

> 3) This situation would be covered by the previous check, in my
> understanding.

Yes, if you add a check like the one above, and a failed check causes the
device to be redeployed as a new one, it should work.

>
> Is it possible to pass information that the cluster is being restored using
> partition preservation? Because I think a much safer approach is:
>
> 1) Pass some flag from the user saying that we are restoring the cluster.
> 2) Restore the controller (monitor) and abort the deployment if it fails.
> 3) When deploying an osd host, if the 'restore' flag is present, skip the
> prepare step and only try to activate all disks if possible (we might want
> to ignore activate errors and continue with the other disks, so that we
> restore as many OSDs as possible).
>
> The case I want to support with this change is not restoration of the whole
> cluster, but rather support for reinstallation of an OSD node's operating
> system. For this case, the approach you propose actually seems more correct
> than my implementation. For a node being reinstalled we do not expect new
> devices, only ones with an existing data set, so we don't need to check for
> that specifically, but can rather just skip prepare for all devices.

If this is for the case of restoring a single OSD node, I think you can go
forward with your approach. If it were meant for the case where a whole
cluster needs to be recovered, then I would prefer mine.

I just thought about the whole-cluster restore case because recently some
people were asking me about the possibility of restoring after the whole
cluster has been lost.
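
In both cases the activation step itself can stay simple and tolerant,
something like this (sketch only; the device list is an example and I
assume ceph-disk is available on the node):

  # Try to bring back every OSD data partition we can find; do not abort
  # the whole deployment just because one disk fails to activate.
  for dev in /dev/sdb1 /dev/sdc1 /dev/sdd1; do   # example device list
      if ceph-disk activate "$dev"; then
          echo "activated $dev"
      else
          echo "WARNING: could not activate $dev, skipping" >&2
      fi
  done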

>
> We still need to check that the value of fsid on the disk is consistent
> with the cluster's fsid.
>
> Which issues should we anticipate with this kind of approach?

Apart from the issues already mentioned, which you agreed to address, I
think nothing. I am looking forward to reviewing your updated patch :-)

>
> Another question that is still unclear to me is whether someone really
> needs support for a hybrid use case where new and existing unmounted OSD
> devices are mixed in one OSD node.

I don't think we need to support it, but it is not forbidden for users...
We don't know what state the cluster a user is trying to restore is in;
I could imagine it having both old and new OSD disks.
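
If we ever wanted to cover that hybrid case, the per-device decision could
probably boil down to something like this (rough sketch from my side;
probe_osd() just repeats the ceph_fsid check from above, and the disk list,
partition numbering and hostname handling are my assumptions):

  # Hypothetical dispatch for a host that mixes new and previously used
  # disks: activate what matches our cluster, prepare the rest as new.
  probe_osd() {
      local part=$1 mnt conf_fsid rc=1
      mnt=$(mktemp -d)
      if mount -o ro "$part" "$mnt" 2>/dev/null; then
          conf_fsid=$(awk '/^[[:space:]]*fsid[[:space:]]*=/ {print $NF}' /etc/ceph/ceph.conf)
          [ -f "$mnt/ceph_fsid" ] && \
              [ "$(cat "$mnt/ceph_fsid")" = "$conf_fsid" ] && rc=0
          umount "$mnt"
      fi
      rmdir "$mnt"
      return $rc
  }

  for disk in /dev/sdb /dev/sdc; do                # example disk list
      if probe_osd "${disk}1"; then                # data assumed on partition 1
          ceph-disk activate "${disk}1"            # keep the existing data set
      else
          ceph-deploy osd prepare "$(hostname)":"$disk"  # treat as a new disk
      fi
  done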

>
> --
> Best regards,
> Oleg Gelbukh
>


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
