> Hi all,
> Yesterday I marked out all the osds on one node in our new cluster to
> reconfigure them with WAL/DB on their NVMe devices, but it is taking
> ages to rebalance.

> > ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
> > ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> Since the cluster is currently hardly loaded, backfilling can take up
> all the unused bandwidth as far as I'm concerned...
> Is it a good idea to give the above commands or other commands to speed
> up the backfilling? (e.g. like increasing "osd max backfills")

OSD max backfills is going to have a very large effect on recovery time,
so that would be the obvious knob to twist first. Check what it defaults
to now, then raise it to 4, 8, 12, 16 in steps and see that it doesn't
slow rebalancing down too much.
Spinning drives without any SSD/NVMe journal/WAL/DB should perhaps have
1 or 2 at most; SSDs can take more than that, and NVMe even more, before
the gains diminish.
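
Roughly, the stepwise approach could look like the sketch below. It only
prints the injectargs command for each step, so you can run them one at a
time and watch "ceph -s" in between. The function name backfill_ramp_cmds
is made up for illustration, and osd.0 in the comment is just an example
OSD id; the step values are the ones suggested above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the injectargs command for each
# osd-max-backfills step passed as an argument.
backfill_ramp_cmds() {
    for n in "$@"; do
        printf "ceph tell 'osd.*' injectargs '--osd-max-backfills %s'\n" "$n"
    done
}

# Check the current value first, on the host where the OSD runs
# (osd.0 is only an example id):
#   ceph daemon osd.0 config get osd_max_backfills

# Print the ramp; run each printed command by hand and check "ceph -s"
# before moving on to the next step.
backfill_ramp_cmds 4 8 12 16
```

For spinning drives without a fast journal/WAL/DB device you would
presumably call it with 1 2 instead.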

