On 05/29/2015 04:47 PM, Samuel Just wrote:
Many people have reported that they need to lower the osd recovery config 
options to minimize the impact of recovery on client io.  We are talking about 
changing the defaults as follows:

osd_max_backfills to 1 (from 10)
osd_recovery_max_active to 3 (from 15)
osd_recovery_op_priority to 1 (from 10)
osd_recovery_max_single_start to 1 (from 5)
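
For anyone who wants to try the proposed values ahead of a default change, a sketch of the equivalent ceph.conf settings (the option names are as listed above; this assumes you restart the OSDs afterwards, or alternatively inject the values at runtime with `ceph tell osd.* injectargs`):

```
[osd]
osd_max_backfills = 1
osd_recovery_max_active = 3
osd_recovery_op_priority = 1
osd_recovery_max_single_start = 1
```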

We'd like a bit of feedback first though.  Is anyone happy with the current 
configs?  Is anyone using something between these values and the current 
defaults?  What kind of workload?  I'd guess that lowering osd_max_backfills to 
1 is probably a good idea, but I wonder whether lowering 
osd_recovery_max_active and osd_recovery_max_single_start will cause small 
objects to recover unacceptably slowly.

Thoughts?

We ran recovery tests last year around the time firefly was released. The basic gist was that as you increase client IO, the ratio of backfill to client IO changes for a given combination of priority settings. I.e., you can tune around 10/15/10/5 or 1/3/1/1, but in each case the ratio of client to recovery IO appears to scale with the amount of client IO, even past the super-saturation point. I believe users will have a hard time finding optimal settings, as clusters at the saturation point will behave differently than those in heavy super-saturation.


http://nhm.ceph.com/Ceph_3XRep_Backfill_Recovery_Results.pdf
http://nhm.ceph.com/Ceph_62EC_Backfill_Recovery_Results.pdf
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com