We recently deleted a bucket that was no longer needed and held 400TB of
data, since our cluster is getting quite full.  That should free up about
30% of the cluster's used space, but over the last week we've seen only a
small fraction of that freed.  I left `radosgw-admin --rgw-realm=local gc
process` running over the weekend to try to help, but it didn't seem to
put a dent in it.  Our regular ingestion is outpacing the garbage
collection, even though that ingestion is less than 2% growth at its
maximum.

As of yesterday our gc list was over 350GB when dumped into a file (I had
to stop it as the disk I was redirecting the output to was almost full).
In the future I will use the --bypass-gc option to avoid the cleanup, but
is there a way to speed up the gc once you're in this position?  There were
about 8M objects that were deleted from this bucket.  I've come across a
few references to the rgw-gc settings in the config, but nothing that
explained the times well enough for me to feel comfortable doing anything
with them.
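
One way to size the backlog without dumping the whole gc list to disk is
to count entries as they stream past.  A minimal sketch, assuming that
`radosgw-admin gc list` pretty-prints JSON with one `"tag"` field per
pending entry on its own line (an assumption about the output format that
should be checked against a few lines of real output first).  The
function names are made up for this example.

```python
import subprocess
from typing import Iterable, Sequence


def count_gc_entries(lines: Iterable[str], marker: str = '"tag"') -> int:
    """Count lines containing the marker without keeping the listing in memory."""
    return sum(1 for line in lines if marker in line)


def count_pending_gc(
    cmd: Sequence[str] = ("radosgw-admin", "--rgw-realm=local", "gc", "list"),
) -> int:
    """Stream the gc list from the CLI and count entries (hypothetical helper).

    Assumption: each pending entry contributes exactly one line containing
    the marker; adjust the marker if the real output differs.
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        return count_gc_entries(proc.stdout)
```

Calling `count_pending_gc()` on a host with admin credentials would
return the count; nothing is written to disk.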

On Tue, Jul 25, 2017 at 4:01 PM Bryan Stillwell <bstillw...@godaddy.com>
wrote:

> Excellent, thank you!  It does exist in 0.94.10!  :)
>
>
>
> Bryan
>
>
>
> *From: *Pavan Rallabhandi <prallabha...@walmartlabs.com>
> *Date: *Tuesday, July 25, 2017 at 11:21 AM
> *To: *Bryan Stillwell <bstillw...@godaddy.com>, "ceph-users@lists.ceph.com"
> <ceph-users@lists.ceph.com>
> *Subject: *Re: [ceph-users] Speeding up garbage collection in RGW
>
>
>
> I’ve just realized that the option is present in Hammer (0.94.10) as well,
> you should try that.
>
>
>
> *From: *Bryan Stillwell <bstillw...@godaddy.com>
> *Date: *Tuesday, 25 July 2017 at 9:45 PM
> *To: *Pavan Rallabhandi <prallabha...@walmartlabs.com>, "ceph-users@lists.ceph.com"
> <ceph-users@lists.ceph.com>
> *Subject: *EXT: Re: [ceph-users] Speeding up garbage collection in RGW
>
>
>
> Unfortunately, we're on hammer still (0.94.10).  That option looks like it
> would work better, so maybe it's time to move the upgrade up in the
> schedule.
>
>
>
> I've been playing with the various gc options and I haven't seen the kind
> of speedup we would need to remove the objects in a reasonable amount of
> time.
>
>
>
> Thanks,
>
> Bryan
>
>
>
> *From: *Pavan Rallabhandi <prallabha...@walmartlabs.com>
> *Date: *Tuesday, July 25, 2017 at 3:00 AM
> *To: *Bryan Stillwell <bstillw...@godaddy.com>, "ceph-users@lists.ceph.com"
> <ceph-users@lists.ceph.com>
> *Subject: *Re: [ceph-users] Speeding up garbage collection in RGW
>
>
>
> If your Ceph version is >= Jewel, you can try the `--bypass-gc` option in
> radosgw-admin, which removes the tail objects as well, without marking
> them for GC.
>
>
>
> Thanks,
>
>
>
> On 25/07/17, 1:34 AM, "ceph-users on behalf of Bryan Stillwell" <
> ceph-users-boun...@lists.ceph.com on behalf of bstillw...@godaddy.com>
> wrote:
>
>
>
>     I'm in the process of cleaning up a test that an internal customer did
> on our production cluster that produced over a billion objects spread
> across 6000 buckets.  So far I've been removing the buckets like this:
>
>
>
>     printf %s\\n bucket{1..6000} | xargs -I{} -n 1 -P 32 radosgw-admin bucket rm --bucket={} --purge-objects
>
>
>
>     However, the disk usage doesn't seem to be getting reduced at the same
> rate the objects are being removed.  From what I can tell a large number of
> the objects are waiting for garbage collection.
>
>
>
>     When I first read the docs it sounded like the garbage collector would
> only remove 32 objects every hour, but after looking through the logs I'm
> seeing about 55,000 objects removed every hour.  That's about 1.3 million a
> day, so at this rate it'll take a couple years to clean up the rest!  For
> comparison, the purge-objects command above is removing (but not GC'ing)
> about 30 million objects a day, so a much more manageable 33 days to finish.
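
The rates quoted above can be checked with a quick back-of-envelope
calculation.  This is only a sketch: it assumes exactly one billion
objects are left to process (the message says "over a billion") and that
both rates stay constant.  The names are made up for this example.

```python
def days_to_finish(total_objects: float, objects_per_day: float) -> float:
    """Days needed to work through total_objects at a constant daily rate."""
    return total_objects / objects_per_day


TOTAL_OBJECTS = 1_000_000_000   # assumption: lower bound of "over a billion"
GC_PER_HOUR = 55_000            # GC rate observed in the logs
PURGE_PER_DAY = 30_000_000      # rate of the bucket rm --purge-objects run

gc_per_day = GC_PER_HOUR * 24
gc_days = days_to_finish(TOTAL_OBJECTS, gc_per_day)
purge_days = days_to_finish(TOTAL_OBJECTS, PURGE_PER_DAY)

print(f"GC rate:    {gc_per_day:,} objects/day")
print(f"GC only:    {gc_days:.0f} days (~{gc_days / 365:.1f} years)")
print(f"Purge only: {purge_days:.0f} days")
```

At these rates the garbage collector handles roughly 1.3 million objects a
day, which works out to a little over two years, against about 33 days
for the purge itself.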
>
>
>
>     I've done some digging and it appears like I should be changing these
> configuration options:
>
>
>
>     rgw gc max objs (default: 32)
>     rgw gc obj min wait (default: 7200)
>     rgw gc processor max time (default: 3600)
>     rgw gc processor period (default: 3600)
>
>
>
>     A few questions I have though are:
>
>
>
>     Should 'rgw gc processor max time' and 'rgw gc processor period'
> always be set to the same value?
>
>
>
>     Which would be better, increasing 'rgw gc max objs' to something like
> 1024, or reducing the 'rgw gc processor' times to something like 60 seconds?
>
>
>
>     Any other guidance on the best way to adjust these values?
>
>
>
>     Thanks,
>
>     Bryan
>
>
>
>
>
>     _______________________________________________
>
>     ceph-users mailing list
>
>     ceph-users@lists.ceph.com
>
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com