Hi Ben,

Are you compacting the relevant osds periodically? ceph tell osd.x
compact (for the three osds holding the bilog) would help reshape the
rocksdb levels so they at least perform better for a little while
until the next round of bilog trims.
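
For example, a rough sketch (the pool name, index object name, and
OSD IDs below are placeholders; find the real acting set with
ceph osd map first):

  # find which OSDs hold the PG for this bucket index object
  ceph osd map default.rgw.buckets.index .dir.<bucket-instance-id>
  # then compact each OSD in that acting set, one at a time
  ceph tell osd.11 compact
  ceph tell osd.42 compact
  ceph tell osd.87 compact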

Otherwise, I have deleted ~50M object indices in one step in the
past, probably back in the luminous days IIRC. It will likely lock up
the relevant osds for a while as the omap is removed. If you dare
take that step, it might help to set nodown; that might keep other
osds from flapping and creating more work.
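
If you do go that route, one possible sequence (only a sketch; the
bucket name is a placeholder, and this assumes you delete the
abandoned bucket outright with radosgw-admin):

  # keep OSDs from being marked down during the heavy omap removal
  ceph osd set nodown
  # remove the abandoned bucket and purge its remaining objects
  radosgw-admin bucket rm --bucket=<bucket-name> --purge-objects
  # clear the flag once the cluster settles
  ceph osd unset nodown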

Cheers, Dan

______________________________
Clyso GmbH | https://www.clyso.com


On Tue, Apr 25, 2023 at 2:45 PM Ben.Zieglmeier
<ben.zieglme...@target.com> wrote:
>
> Hi All,
>
> We have a RGW cluster running Luminous (12.2.11) that has one object with an 
> extremely large OMAP database in the index pool. Listomapkeys on the object 
> returned 390 Million keys to start. Through bilog trim commands, we’ve 
> whittled that down to about 360 Million. This is a bucket index for a 
> regrettably unsharded bucket. There are only about 37K objects actually in 
> the bucket, but through years of neglect, the bilog has grown completely out of 
> control. We’ve hit some major problems trying to deal with this particular 
> OMAP object. We just crashed 4 OSDs when a bilog trim caused enough churn to 
> knock one of the OSDs housing this PG out of the cluster temporarily. The OSD 
> disks are 6.4TB NVMe, but are split into 4 partitions, each housing their own 
> OSD daemon (collocated journal).
>
> We want to be rid of this large OMAP object, but are running out of options 
> to deal with it. Reshard outright does not seem like a viable option, as we 
> believe the deletion would deadlock OSDs and could cause impact. Continuing 
> to run `bilog trim` 1000 records at a time has been what we’ve done, but this 
> also seems to be impacting performance/stability. We are seeking 
> options to remove this problematic object without creating additional 
> problems. It is quite likely this bucket is abandoned, so we could remove the 
> data, but I fear even the deletion of such a large OMAP could bring OSDs down 
> and cause potential for metadata loss (the other bucket indexes on that same 
> PG).
>
> Any insight available would be highly appreciated.
>
> Thanks.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io