That helps a little bit, but overall the process would take years at this rate:
# for i in {1..3600}; do ceph df -f json-pretty | grep -A7 '".rgw.buckets"' | grep objects; sleep 60; done
"objects": 1660775838
"objects": 1660775733
"objects": 1660775548
"objects": 1660774825
"objects": 1660774790
"objects": 1660774735
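To put a number on "years", here is a rough Python sketch that turns the six samples above into a drain estimate. It assumes the samples are exactly 60 seconds apart (per the `sleep 60`), that the rate stays constant, and that everything left in the pool is waiting to be removed; the function names are made up for illustration.

```python
# Object counts copied from the 'ceph df' samples above, taken 60 seconds apart.
samples = [1660775838, 1660775733, 1660775548, 1660774825, 1660774790, 1660774735]
SAMPLE_INTERVAL_S = 60
SECONDS_PER_DAY = 86400


def removal_rate_per_day(counts, interval_s):
    """Average objects removed per day across the sampled window."""
    elapsed_s = (len(counts) - 1) * interval_s
    removed = counts[0] - counts[-1]
    return removed / elapsed_s * SECONDS_PER_DAY


def years_to_drain(remaining, per_day):
    """Years needed to remove 'remaining' objects at a constant daily rate."""
    return remaining / per_day / 365


rate = removal_rate_per_day(samples, SAMPLE_INTERVAL_S)
years = years_to_drain(samples[-1], rate)
print(f"{rate:,.0f} objects/day, roughly {years:.1f} years to drain")
```

Five minutes is a very short window, so treat the result as an order-of-magnitude figure only.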
This is on a hammer cluster. Would upgrading to Jewel or Luminous speed up
this process at all?
Bryan
From: Yehuda Sadeh-Weinraub <[email protected]>
Date: Wednesday, October 25, 2017 at 11:32 AM
To: Bryan Stillwell <[email protected]>
Cc: David Turner <[email protected]>, Ben Hines <[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
Some of the options there won't do much for you as they'll only affect
newer object removals. I think the default number of gc objects is
just inadequate for your needs. You can try manually running
'radosgw-admin gc process' concurrently (2 or 3 processes to start
with) and see if it makes any dent there. I think one of the problems
is that the gc omaps grew so much that operations on them are too
slow.
Yehuda
On Wed, Oct 25, 2017 at 9:05 AM, Bryan Stillwell
<[email protected]> wrote:
We tried various options like the ones Ben mentioned to speed up the garbage
collection process and were unsuccessful. Luckily, we had the ability to
create a new cluster and move all the data that wasn't part of the POC which
created our problem.
One of the things we ran into was that the .rgw.gc pool became too large to handle
drive failures without taking down the cluster. We eventually had to move that
pool to SSDs just to get the cluster healthy. It was not obvious it was
getting large though, because this is what it looked like in the 'ceph df'
output:
NAME     ID  USED  %USED  MAX AVAIL  OBJECTS
.rgw.gc  17  0     0      235G       2647
However, if you look at the SSDs we used (repurposed journal SSDs to get out of
the disaster) in 'ceph osd df' you can see quite a bit of data is being used:
410 0.20000 1.00000 181G 23090M 158G 12.44 0.18
411 0.20000 1.00000 181G 29105M 152G 15.68 0.22
412 0.20000 1.00000 181G 110G 72223M 61.08 0.86
413 0.20000 1.00000 181G 42964M 139G 23.15 0.33
414 0.20000 1.00000 181G 33530M 148G 18.07 0.26
415 0.20000 1.00000 181G 38420M 143G 20.70 0.29
416 0.20000 1.00000 181G 92215M 93355M 49.69 0.70
417 0.20000 1.00000 181G 64730M 118G 34.88 0.49
418 0.20000 1.00000 181G 61353M 121G 33.06 0.47
419 0.20000 1.00000 181G 77168M 105G 41.58 0.59
That's ~560G of omap data for the .rgw.gc pool that isn't being reported in
'ceph df'.
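The ~560G figure can be checked by summing the used-space column of the ten rows above. A small Python sketch, assuming the fifth column of each `ceph osd df` row is the used space, that `M` and `G` are binary units, and that these SSDs hold nothing but the .rgw.gc pool:

```python
# Used-space column from the 'ceph osd df' rows above (OSDs 410-419).
used = ["23090M", "29105M", "110G", "42964M", "33530M",
        "38420M", "92215M", "64730M", "61353M", "77168M"]

# Assumption: 1G = 1024M, as ceph prints binary units.
UNIT_TO_GIB = {"M": 1 / 1024, "G": 1.0}


def to_gib(size):
    """Convert a ceph-style size such as '23090M' or '110G' to GiB."""
    return float(size[:-1]) * UNIT_TO_GIB[size[-1]]


total_gib = sum(to_gib(s) for s in used)
print(f"total used on the gc SSDs: about {total_gib:.0f}G")
```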
Right now the cluster is still around while we wait to verify the new cluster
isn't missing anything. So if there is anything the RGW developers would like
to try on it to speed up the gc process, we should be able to do that.
Bryan
From: ceph-users
<[email protected]>
on behalf of David Turner <[email protected]>
Date: Tuesday, October 24, 2017 at 4:07 PM
To: Ben Hines <[email protected]>
Cc: "[email protected]"
<[email protected]>
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
Thank you so much for chiming in, Ben.
Can you explain what each setting value means? I believe I understand min wait,
that's just how long to wait before allowing the object to be cleaned up. gc
max objs is how many will be cleaned up during each period? gc processor
period is how often it will kick off gc to clean things up? And gc processor
max time is the longest the process can run after the period starts? Is that
about right for that? I read somewhere saying that prime numbers are optimal
for gc max objs. Do you know why that is? I notice you're using one there.
What is lc max objs? I couldn't find a reference for that setting.
Additionally, do you know if the radosgw-admin gc list is ever cleaned up, or
is it an ever growing list? I got up to 3.6 Billion objects in the list before
I killed the gc list command.
On Tue, Oct 24, 2017 at 4:47 PM Ben Hines
<[email protected]> wrote:
I agree the settings are rather confusing. We also have many millions of
objects and had this trouble, so I set these rather aggressive gc settings on
our cluster which result in gc almost always running. We also use lifecycles to
expire objects.
rgw lifecycle work time = 00:01-23:59
rgw gc max objs = 2647
rgw lc max objs = 2647
rgw gc obj min wait = 300
rgw gc processor period = 600
rgw gc processor max time = 600
-Ben
On Tue, Oct 24, 2017 at 9:25 AM, David Turner
<[email protected]> wrote:
As I'm looking into this more and more, I'm realizing how big of a problem
garbage collection has been in our clusters. The biggest cluster has over 1
billion objects in its gc list (the command is still running, it just recently
passed by the 1B mark). Does anyone have any guidance on what to do to
optimize the gc settings to hopefully/eventually catch up on this as well as
stay caught up once we are? I'm not expecting an overnight fix, but something
that could feasibly be caught up within 6 months would be wonderful.
On Mon, Oct 23, 2017 at 11:18 AM David Turner
<[email protected]> wrote:
We recently deleted a bucket that was no longer needed that had 400TB of data
in it to help as our cluster is getting quite full. That should free up about
30% of our cluster used space, but in the last week we haven't seen even a
fraction of that free up yet. To try to help, I left
`radosgw-admin --rgw-realm=local gc process` running on the cluster over the
weekend, but it didn't seem to put a dent in it. Our regular ingestion is
faster than the garbage collection is cleaning stuff up, even though our
regular ingestion is less than 2% growth at its maximum.
As of yesterday our gc list was over 350GB when dumped into a file (I had to
stop it as the disk I was redirecting the output to was almost full). In the
future I will use the --bypass-gc option to avoid the cleanup, but is there a
way to speed up the gc once you're in this position? There were about 8M
objects that were deleted from this bucket. I've come across a few references
to the rgw-gc settings in the config, but nothing that explained the times well
enough for me to feel comfortable doing anything with them.
On Tue, Jul 25, 2017 at 4:01 PM Bryan Stillwell
<[email protected]> wrote:
Excellent, thank you! It does exist in 0.94.10! :)
Bryan
From: Pavan Rallabhandi
<[email protected]>
Date: Tuesday, July 25, 2017 at 11:21 AM
To: Bryan Stillwell <[email protected]>,
"[email protected]"
<[email protected]>
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
I’ve just realized that the option is present in Hammer (0.94.10) as well, you
should try that.
From: Bryan Stillwell <[email protected]>
Date: Tuesday, 25 July 2017 at 9:45 PM
To: Pavan Rallabhandi
<[email protected]>,
"[email protected]"
<[email protected]>
Subject: EXT: Re: [ceph-users] Speeding up garbage collection in RGW
Unfortunately, we're on hammer still (0.94.10). That option looks like it
would work better, so maybe it's time to move the upgrade up in the schedule.
I've been playing with the various gc options and I haven't seen the kind of
speedup we would need to remove the objects in a reasonable amount of time.
Thanks,
Bryan
From: Pavan Rallabhandi
<[email protected]>
Date: Tuesday, July 25, 2017 at 3:00 AM
To: Bryan Stillwell <[email protected]>,
"[email protected]"
<[email protected]>
Subject: Re: [ceph-users] Speeding up garbage collection in RGW
If your Ceph version is >=Jewel, you can try the `--bypass-gc` option in
radosgw-admin, which would remove the tail objects as well without marking
them to be GCed.
Thanks,
On 25/07/17, 1:34 AM, "ceph-users on behalf of Bryan Stillwell"
<[email protected] on
behalf of [email protected]> wrote:
I'm in the process of cleaning up a test that an internal customer did on
our production cluster that produced over a billion objects spread across 6000
buckets. So far I've been removing the buckets like this:
printf %s\\n bucket{1..6000} | xargs -I{} -n 1 -P 32 radosgw-admin bucket rm --bucket={} --purge-objects
However, the disk usage doesn't seem to be getting reduced at the same
rate the objects are being removed. From what I can tell a large number of the
objects are waiting for garbage collection.
When I first read the docs it sounded like the garbage collector would
only remove 32 objects every hour, but after looking through the logs I'm
seeing about 55,000 objects removed every hour. That's about 1.3 million a
day, so at this rate it'll take a couple years to clean up the rest! For
comparison, the purge-objects command above is removing (but not GC'ing) about
30 million objects a day, so a much more manageable 33 days to finish.
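The arithmetic behind those two estimates, as a rough Python sketch. The total is rounded down to one billion objects (the post says "over a billion"), and both rates are assumed to hold steady; the helper name is only for illustration.

```python
HOURS_PER_DAY = 24


def days_to_finish(total_objects, objects_per_day):
    """Days needed to process total_objects at a constant daily rate."""
    return total_objects / objects_per_day


total = 1_000_000_000                 # "over a billion objects", rounded down
gc_per_day = 55_000 * HOURS_PER_DAY   # gc rate seen in the logs, per day
purge_per_day = 30_000_000            # 'bucket rm --purge-objects' rate

gc_days = days_to_finish(total, gc_per_day)
purge_days = days_to_finish(total, purge_per_day)
print(f"gc: {gc_days:.0f} days (about {gc_days / 365:.1f} years)")
print(f"purge: {purge_days:.0f} days")
```

At these rates garbage collection is the bottleneck by a factor of more than 20, which is why the disk usage lags so far behind the bucket removals.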
I've done some digging and it appears like I should be changing these
configuration options:
rgw gc max objs (default: 32)
rgw gc obj min wait (default: 7200)
rgw gc processor max time (default: 3600)
rgw gc processor period (default: 3600)
A few questions I have though are:
Should 'rgw gc processor max time' and 'rgw gc processor period' always be
set to the same value?
Which would be better, increasing 'rgw gc max objs' to something like
1024, or reducing the 'rgw gc processor' times to something like 60 seconds?
Any other guidance on the best way to adjust these values?
Thanks,
Bryan
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com