Hi Frank,

These are the values we used to reduce the recovery impact before Luminous.

#reduce recovery impact
osd max backfills
osd recovery max active
osd recovery max single start
osd recovery op priority
osd recovery threads
osd backfill scan max
osd backfill scan min

I do not know how many OSDs and PGs you have in your cluster, but backfill performance depends on the number of OSDs, the number of PGs and the objects per PG.
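As a rough illustration of that dependence, here is a back-of-the-envelope sketch in Python. The object count and the rate of about 8 objects/sec per PG are taken from Frank's message quoted below; the PG count and the number of PGs backfilling at the same time are purely hypothetical placeholders, since the thread does not state them, and the model ignores everything except simple division.

```python
def objects_per_pg(total_objects: int, pg_num: int) -> float:
    """Average number of objects stored in each PG of a pool."""
    return total_objects / pg_num


def backfill_hours(objects_to_move: int, concurrent_pgs: int,
                   rate_per_pg: float) -> float:
    """Hours needed to move objects_to_move objects when concurrent_pgs
    PGs backfill in parallel at rate_per_pg objects/sec each."""
    return objects_to_move / (concurrent_pgs * rate_per_pg) / 3600.0


# From the thread: object count of con-fs2-meta2 and the observed rate.
META2_OBJECTS = 118_389_897
OBSERVED_RATE = 8.0          # objects/sec per PG

# Hypothetical values, NOT from the thread.
PG_NUM = 64                  # assumed PG count of the pool
CONCURRENT_PGS = 8           # assumed number of PGs backfilling at once

per_pg = objects_per_pg(META2_OBJECTS, PG_NUM)
print(f"objects per PG: {per_pg:.0f}")
print(f"hours to backfill one PG: {backfill_hours(int(per_pg), 1, OBSERVED_RATE):.1f}")
print(f"hours to move the whole pool: "
      f"{backfill_hours(META2_OBJECTS, CONCURRENT_PGS, OBSERVED_RATE):.1f}")
```

With few PGs and many objects per PG, the per-PG rate becomes the limit, which is why either more PGs or a higher per-PG rate would shorten the rebalance.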

Regards, Joachim

___________________________________

Clyso GmbH

On 03.12.2020 at 12:35, 胡 玮文 wrote:
Hi,

There is an “OSD recovery priority” dialog box in the web dashboard. The
configuration options it changes include:

osd_max_backfills
osd_recovery_max_active
osd_recovery_max_single_start
osd_recovery_sleep

Tuning these options may help. The “High” priority corresponds to 4, 4, 4, 0,
respectively. Some of these also have _ssd/_hdd variants.
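For anyone who wants to apply the same values without the dashboard, a minimal sketch in Python that only prints the corresponding commands. It assumes a release with the centralized configuration database, where `ceph config set <who> <option> <value>` is available; the helper name `config_set_commands` and the choice of `osd` as the target are illustrative assumptions, not something the dashboard is known to do internally.

```python
# Values reported above for the dashboard's "High" recovery priority.
# The first option is spelled osd_max_backfills, as in Ceph's option list.
HIGH_PROFILE = {
    "osd_max_backfills": 4,
    "osd_recovery_max_active": 4,
    "osd_recovery_max_single_start": 4,
    "osd_recovery_sleep": 0,
}


def config_set_commands(profile, who="osd"):
    """Build one 'ceph config set' argument list per option (hypothetical helper)."""
    return [["ceph", "config", "set", who, option, str(value)]
            for option, value in profile.items()]


# Print the commands for review instead of running them.
for cmd in config_set_commands(HIGH_PROFILE):
    print(" ".join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; the commands can be reviewed and pasted on an admin node.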

On 3 Dec 2020, at 17:11, Frank Schilder <[email protected]> wrote:

Hi all,

I have the opposite problem to the one discussed in "slow down keys/s in recovery": I
need to increase the number of objects in flight during rebalance. All remapped PGs are
already in state backfilling, but it looks like no more than 8 objects/sec are
transferred per PG at a time. The pool sits on high-performance SSDs and could easily
handle a transfer of 100 or more objects/sec simultaneously. Is there any way to increase
the number of transfers/sec or simultaneous transfers? Increasing the options
osd_max_backfills and osd_recovery_max_active has no effect.

Background: The pool in question (con-fs2-meta2) is the default data pool of a 
ceph fs, which stores exclusively the kind of meta data that goes into this 
pool. Storage consumption is reported as 0, but the number of objects is huge:

    NAME                     ID     USED        %USED     MAX AVAIL     OBJECTS
    con-fs2-meta1            12     216 MiB      0.02       933 GiB      13311115
    con-fs2-meta2            13         0 B         0       933 GiB     118389897
    con-fs2-data             14     698 TiB     72.15       270 TiB     286826739

Unfortunately, there were no recommendations on dimensioning PG numbers for
this pool, so I used the same PG count for con-fs2-meta1 and con-fs2-meta2. In
hindsight, this was potentially a bad idea: the meta2 pool should have a much
higher PG count or a much more aggressive recovery policy.

I now need to rebalance PGs on meta2 and it is going way too slow compared with 
the performance of the SSDs it is located on. In a way, I would like to keep 
the PG count where it is, but increase the recovery rate for this pool by a 
factor of 10. Please let me know what options I have.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
