Hrmmm... one gotcha might be that the deleted PGs will try to backfill from the rest of the cluster when you bring an OSD back online. Setting nobackfill/norecover would stop that, but it would also prevent the other PGs on the OSD, from other pools, from catching back up... There has to be a way around that. Maybe mark the PGs lost first and then delete them from disk?
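
Roughly the sequence I'm picturing (completely untested, and "12" is just a stand-in for the id of the pool being deleted; I'm also not 100% sure mark_unfound_lost is the right lost-marking mechanism here, since it only applies once the cluster actually reports unfound objects on an active PG):

    # freeze recovery so the deleted PGs don't repopulate from peers
    ceph osd set nobackfill
    ceph osd set norecover

    # collect the PG ids belonging to the doomed pool (id 12 here)
    ceph pg dump pgs_brief | awk '$1 ~ /^12\./ {print $1}' > /tmp/doomed-pgs

    # ... delete the PG copies from the OSDs, as sketched below ...

    # declare the data gone for good
    while IFS= read -r pgid; do
        ceph pg "$pgid" mark_unfound_lost delete
    done < /tmp/doomed-pgs

    ceph osd unset nobackfill
    ceph osd unset norecover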
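
For the per-OSD deletion step you describe below, I'd expect something along these lines (again untested; osd.3 is a placeholder, pool id 12 as above, and I believe recent ceph-objectstore-tool versions want --force with --op remove):

    systemctl stop ceph-osd@3

    # filestore: the PG directories can be removed directly
    rm -rf /var/lib/ceph/osd/ceph-3/current/12.*_head

    # bluestore: ceph-objectstore-tool removes the PGs instead
    for pgid in $(ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
            --op list-pgs | grep '^12\.'); do
        ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
            --pgid "$pgid" --op remove --force
    done

    systemctl start ceph-osd@3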
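
And for comparison, the object-by-object route David suggested further down the thread can be as simple as a loop you can stop, throttle, or resume at will (pool name .rgw.buckets as in Simon's case):

    # list once, then delete at whatever pace the cluster tolerates
    rados -p .rgw.buckets ls > /tmp/objs
    while IFS= read -r obj; do
        rados -p .rgw.buckets rm "$obj"
    done < /tmp/objs

    # the now-empty pool then deletes instantly
    ceph osd pool delete .rgw.buckets .rgw.buckets --yes-i-really-really-mean-it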
On Tue, Jun 19, 2018 at 5:47 PM David Turner <[email protected]> wrote:

> I came up with a new theory for how to delete a large pool sanely and
> without impacting the cluster heavily. I haven't tested this yet, but it
> just occurred to me as I was planning to remove a large pool of my own,
> again.
>
> First you need to stop all IO to the pool to be deleted. Next you stop an
> OSD; if the OSD is filestore you delete the PG folders, or you use
> ceph-objectstore-tool to do it if it's bluestore. Start the OSD and move
> on to the next one (or do a full host at a time, just some sane method to
> go through all of your OSDs). Before long (probably fairly immediately)
> the cluster is freaking out about inconsistent PGs and lost data...
> PERFECT, we're deleting a pool, we want lost data. As long as no traffic
> is going to the pool, you shouldn't see any blocked requests in the
> cluster due to this. When you're done manually deleting the PGs for the
> pool from the OSDs, you mark all of the PGs lost to the cluster and
> delete the now-empty pool, which happens instantly.
>
> I intend to test this out in our staging environment and I'll update
> here. I expect to have to do some things at the end to get the pool to
> delete properly, possibly forcibly recreate the PGs or something. All in
> all, though, I think this should work nicely... if not tediously. Does
> anyone see any gotchas that I haven't thought about here? I know my
> biggest question is why Ceph doesn't do something similar under the hood
> when deleting a pool. It took almost a month the last time I deleted a
> large pool.
>
> On Fri, May 25, 2018 at 7:04 AM Paul Emmerich <[email protected]> wrote:
>
>> Also, upgrade to Luminous and migrate your OSDs to BlueStore before
>> using erasure coding. Luminous + BlueStore performs so much better for
>> erasure coding than any of the old configurations.
>>
>> Also, I've found that deleting a large number of objects is far less
>> stressful on a BlueStore OSD than on a Filestore OSD.
>>
>> Paul
>>
>> 2018-05-22 19:28 GMT+02:00 David Turner <[email protected]>:
>>
>>> From my experience, that would cause you some trouble, as it would
>>> throw the entire pool into the deletion queue to be processed as it
>>> cleans up the disks and everything. I would suggest taking a pool
>>> listing from `rados -p .rgw.buckets ls` and iterating on it with some
>>> scripts around the `rados -p .rgw.buckets rm <obj-name>` command that
>>> you could stop, restart at a faster pace, slow down, etc. Once the
>>> objects in the pool are gone, you can delete the empty pool without
>>> any problems. I like this option because it makes it simple to stop
>>> if you're impacting your VM traffic.
>>>
>>> On Tue, May 22, 2018 at 11:05 AM Simon Ironside <[email protected]>
>>> wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I have an older cluster (Hammer 0.94.7) with a broken radosgw service
>>>> that I'd just like to blow away before upgrading to Jewel, after
>>>> which I'll start again with EC pools.
>>>>
>>>> I don't need the data but I'm worried that deleting the .rgw.buckets
>>>> pool will cause performance degradation for the production RBD pool
>>>> used by VMs. .rgw.buckets is a replicated pool (size=3) with ~14TB of
>>>> data in 5.3M objects, a little over half the data in the whole
>>>> cluster.
>>>>
>>>> Is deleting this pool simply using `ceph osd pool delete` likely to
>>>> cause me a performance problem? If so, is there a way I can do it
>>>> better?
>>>>
>>>> Thanks,
>>>> Simon.
>>
>> --
>> Paul Emmerich
>>
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>
>> croit GmbH
>> Freseniusstr. 31h
>> 81247 München
>> www.croit.io
>> Tel: +49 89 1896585 90
