I came up with a new theory for how to delete a large pool sanely and
without impacting the cluster heavily.  I haven't tested this yet, but it
just occurred to me as I was planning to remove a large pool of my own,
again.

First you need to stop all IO to the pool being deleted.  Next you stop an
OSD; if the OSD is filestore you delete the pool's PG folders, and if it's
bluestore you use ceph-objectstore-tool to remove them.  Start the OSD back
up and move on to the next one (or do a full host at a time, just some sane
method of working through all of your OSDs).  Before long (probably almost
immediately) the cluster is freaking out about inconsistent PGs and lost
data... PERFECT, we're deleting a pool, we want lost data.  As long as no
traffic is going to the pool, you shouldn't see any blocked requests in the
cluster because of this.  Once you're done manually deleting the pool's PGs
from the OSDs, you mark all of those PGs lost to the cluster and then
delete the now-empty pool, which happens instantly.
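
Roughly what I have in mind for the per-OSD loop, as an untested sketch:
the pool ID, OSD number, and the final mark-lost/recreate step are only
illustrative and will vary by release, and this assumes bluestore OSDs
driven with ceph-objectstore-tool (on a filestore OSD you would instead
remove the <pgid>_head directories under the OSD's current/ directory).

POOL_ID=42      # id of the doomed pool, from `ceph osd pool ls detail`
OSD=12          # repeat for every OSD, or script a whole host at a time
ceph osd set noout                  # don't backfill while the OSD is down
systemctl stop ceph-osd@$OSD
# remove only the PGs that belong to the doomed pool
for pg in $(ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-$OSD \
              --op list-pgs | grep "^${POOL_ID}\."); do
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-$OSD \
        --pgid $pg --op remove --force   # newer releases want --force
done
systemctl start ceph-osd@$OSD
ceph osd unset noout
# after every OSD is done: mark/recreate the pool's PGs (e.g.
# `ceph pg force_create_pg <pgid>` on older releases), then drop the pool:
ceph osd pool delete <pool-name> <pool-name> --yes-i-really-really-mean-it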

I intend to test this out in our staging environment and I'll update here.
I expect to have to do some things at the end to get the pool to delete
properly, possibly forcibly recreating the PGs or something.  All in all,
though, I think this should work nicely, if tediously.  Does anyone see any
gotchas that I haven't thought about here?  My biggest question is why Ceph
doesn't do something similar under the hood when deleting a pool.  It took
almost a month the last time I deleted a large pool.

On Fri, May 25, 2018 at 7:04 AM Paul Emmerich <[email protected]>
wrote:

> Also, upgrade to luminous and migrate your OSDs to bluestore before using
> erasure coding.
> Luminous + Bluestore performs so much better for erasure coding than any
> of the old configurations.
>
> Also, I've found that deleting a large number of objects is far less
> stressful on a Bluestore OSD than on a Filestore OSD.
>
> Paul
>
>
> 2018-05-22 19:28 GMT+02:00 David Turner <[email protected]>:
>
>> From my experience, that would cause you some trouble as it would throw
>> the entire pool into the deletion queue to be processed as it cleans up
>> the disks and everything.  I would suggest getting an object listing from
>> `rados -p .rgw.buckets ls` and iterating on that with some scripts around
>> the `rados -p .rgw.buckets rm <obj-name>` command, which you could stop,
>> restart at a faster pace, slow down, etc.  Once the objects in the pool
>> are gone, you can delete the empty pool without any problems.  I like
>> this option because it makes it simple to stop if you're impacting your
>> VM traffic.
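>>
>> Roughly what I mean, as an untested sketch (the temp file path and the
>> sleep pacing are only placeholders to illustrate the throttling):
>>
>> rados -p .rgw.buckets ls > /tmp/rgw-buckets-objects.txt
>> while read -r obj; do
>>     rados -p .rgw.buckets rm "$obj"
>>     sleep 0.01    # tune or drop the sleep to speed up / slow down
>> done < /tmp/rgw-buckets-objects.txt
>> # ctrl-c it whenever VM traffic suffers and re-run to continue; names
>> # that were already removed just return an error and are skipped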
>>
>>
>> On Tue, May 22, 2018 at 11:05 AM Simon Ironside <[email protected]>
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> I have an older cluster (Hammer 0.94.7) with a broken radosgw service
>>> that I'd just like to blow away before upgrading to Jewel, after which
>>> I'll start again with EC pools.
>>>
>>> I don't need the data but I'm worried that deleting the .rgw.buckets
>>> pool will cause performance degradation for the production RBD pool used
>>> by VMs.  .rgw.buckets is a replicated pool (size=3) with ~14TB of data
>>> in 5.3M objects, a little over half the data in the whole cluster.
>>>
>>> Is deleting this pool simply using ceph osd pool delete likely to cause
>>> me a performance problem? If so, is there a way I can do it better?
>>>
>>> Thanks,
>>> Simon.
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com