Hi all

I'm seeing some behaviour I'd like to check on a Luminous (12.2.10) cluster
that I'm running for rbd and rgw (mostly SATA filestore OSDs with NVMe
journals, plus a few SATA-only bluestore OSDs).  There's a set of dedicated
SSD OSDs running bluestore for the .rgw.buckets.index pool, which also hold
the .rgw.gc pool.

There's a long-running upload of small files, which I think is causing a
large amount of leveldb compaction on the filestore nodes and rocksdb
compaction on the bluestore nodes.  The .rgw.buckets bluestore nodes were
exhibiting noticeably higher load than the filestore nodes, although this
seems to have been resolved after setting the following options for the
bluestore SATA OSDs:

bluestore cache size hdd = 10737418240
osd memory target = 10737418240
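
For reference, this is how I've been confirming that the values actually
took effect on a running OSD via the admin socket (the OSD id here is just
an example):

# ceph daemon osd.12 config show | grep -E 'bluestore_cache_size_hdd|osd_memory_target'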

However, the bluestore nodes are still showing significantly higher CPU
wait and disk I/O than the filestore nodes.  Is there anything else I
should be looking at tuning for bluestore, or is this expected due to the
loss of the kernel page cache that filestore benefits from?
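
In case it helps, this is roughly how I've been checking where the
bluestore memory on those OSDs is actually going (again, the OSD id is
just an example):

# ceph daemon osd.12 dump_mempools

which shows the bluestore_cache_data / bluestore_cache_onode usage
alongside the other mempools.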

Whilst the upload has been running, a "radosgw-admin orphans find" was also
being executed, although this was stopped manually before completion, as a
significant backlog of garbage collection had built up.  Looking into this,
it appears that most of the outstanding garbage collection relates to a
single bucket, which was found to contain a large number of
multipart/shadow objects.  These are now listed in the radosgw-admin gc
list:

# radosgw-admin gc list | grep -c '"oid":'
224557347
# radosgw-admin gc list | grep '"oid":' | grep -v -c "default.1084171934.99"
3674322
# radosgw-admin gc list | head -1000 | grep '"oid":' | grep 1084171934
                "oid": "default.1084171934.99__multipart_ServerImageBackup/95C48F007C44E36C-00-00.mrimg.tmp.2~MZ7fyct8yAWCUX82e9F-j9q-UJcnheP.1",
                "oid": "default.1084171934.99__shadow_ServerImageBackup/95C48F007C44E36C-00-00.mrimg.tmp.2~MZ7fyct8yAWCUX82e9F-j9q-UJcnheP.1_1",
                "oid": "default.1084171934.99__shadow_ServerImageBackup/95C48F007C44E36C-00-00.mrimg.tmp.2~MZ7fyct8yAWCUX82e9F-j9q-UJcnheP.1_2",
                "oid": "default.1084171934.99__shadow_ServerImageBackup/95C48F007C44E36C-00-00.mrimg.tmp.2~MZ7fyct8yAWCUX82e9F-j9q-UJcnheP.1_3",
                "oid": "default.1084171934.99__shadow_ServerImageBackup/95C48F007C44E36C-00-00.mrimg.tmp.2~MZ7fyct8yAWCUX82e9F-j9q-UJcnheP.1_4",
                "oid": "default.1084171934.99__multipart_ServerImageBackup/95C48F007C44E36C-00-00.mrimg.tmp.2~MZ7fyct8yAWCUX82e9F-j9q-UJcnheP.2",

Despite running multiple "radosgw-admin gc process" commands alongside our
radosgw processes, which has helped clean up garbage collection in the
past, our gc list is currently continuing to grow.  After reading through
some historic posts on this list, I believe I could loop through the gc
list manually, use the rados rm command to remove the objects from the
.rgw.buckets pool, and then remove the garbage collection entries (a rough
sketch of what I have in mind is below) - is this a reasonable approach?
Are there any recommendations for dealing with a garbage collection list of
this size?
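
The sort of loop I had in mind looks roughly like this - entirely
untested, and the jq path into the gc list JSON and the --include-all
flag are assumptions on my part:

# dump the gc list once, then delete the underlying rados objects
radosgw-admin gc list --include-all > /tmp/gc-list.json
jq -r '.[].objs[].oid' /tmp/gc-list.json | while read -r oid; do
    rados -p .rgw.buckets rm "$oid"
done

with cleanup of the gc entries themselves left as a separate step once the
rados objects are gone.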

If there's any additional information I should provide for context here,
please let me know.

Thanks for any help
Chris