Hello All,
I'm writing because I'm trying to find a way to rebuild an OSD disk without
impacting the performance of the cluster, since my applications are very
latency sensitive.

1_ I found a way to reuse an OSD ID so the cluster doesn't rebalance every
time I lose a disk.
To that end, my cluster is permanently running with the noout flag set.
The point here is to do the disk swap as fast as I can.
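For reference, the replacement flow looks roughly like this (a sketch
assuming Luminous or later with ceph-volume; /dev/sdX is just a placeholder):

ceph osd set noout                            # already set permanently in my case
ceph osd destroy 352 --yes-i-really-mean-it   # keeps the OSD ID and CRUSH entry
ceph-volume lvm create --osd-id 352 --data /dev/sdX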

2_ After reusing the OSD ID, I leave the OSD up and running, but with ZERO
weight.
For example:

root@DC4-ceph03-dn03:/var/lib/ceph/osd/ceph-352# ceph osd tree | grep 352
352   1.81999                 osd.352               up        0   1.00000

At this point everything is good.
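The 0 in the REWEIGHT column is set before the OSD takes any data, e.g.:

ceph osd reweight 352 0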

3_ Then I start the reweight. Using "osd reweight" I'm not touching the
crushmap, and I do the reweight very gradually (a scripted version is
sketched below).
Example:
ceph osd reweight 352 0.001

But even doing the reweight this way, I'm sometimes hurting latency.
The more PGs the cluster is recovering, the worse the impact.
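
A scripted version of the gradual reweight could look like this (a rough
sketch; the step values and sleep interval are just examples, and since
noout keeps the cluster in HEALTH_WARN I check the PG states instead of the
health flag):

for w in 0.1 0.25 0.5 0.75 1.0; do
    ceph osd reweight 352 $w
    # wait for recovery/backfill to drain before the next step
    while ceph pg stat | grep -Eq 'recover|backfill'; do
        sleep 60
    done
done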

Tunings I have already applied:

ceph tell osd.* injectargs "--osd_max_backfills 1"
ceph tell osd.* injectargs "--osd_recovery_max_active 1"
ceph tell osd.* injectargs '--osd-max-recovery-threads 1'
ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
ceph tell osd.* injectargs '--osd-client-op-priority 63'
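
Note that injectargs changes are runtime-only, so a restarted OSD loses
them; the same values can also be kept in ceph.conf, for example:

[osd]
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1
osd_client_op_priority = 63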

The question is: are there more parameters I can change to make the OSD
rebuild even more gradual?

I really appreciate your help, thanks in advance.

Agustin Trolli
Storage Team
Mercadolibre.com