The main setting you can control is osd_max_backfills. Its default is 1. I watch iostat on my OSD hosts as I slowly increment that setting, to leave enough overhead on the disks for client activity while the cluster moves all of its data around.
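A minimal sketch of that loop, for anyone who wants to try it (the value 2 is just an example - raise it one step at a time while watching utilization):

    iostat -x 5                                           # watch disk utilization on an OSD host
    ceph tell osd.* injectargs '--osd-max-backfills 2'    # runtime override, applied cluster-wide
    ceph daemon osd.0 config get osd_max_backfills        # verify on one daemon (run on its host)

Note that injectargs only changes the running value; put osd max backfills in ceph.conf as well if you want it to persist across daemon restarts.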
On Wed, Jul 19, 2017, 11:45 AM Richard Hesketh <[email protected]> wrote:

> In my case my cluster is under very little active load and so I have never
> had to be concerned about recovery operations impacting on client traffic.
> In fact, I generally tune up from the defaults (increase osd max backfills)
> to improve recovery speed when I'm doing major changes, because there's
> plenty of spare capacity in the cluster; and either way I'm in the
> fortunate position where I can place a higher value on having a HEALTH_OK
> cluster ASAP than on the client I/O being consistent.
>
> Rich
>
> On 19/07/17 16:27, Laszlo Budai wrote:
> > Hi Rich,
> >
> > Thank you for your answer. This is good news to hear :)
> > Regarding the reconfiguration you've done: if I understand correctly,
> > you changed it all at once (download the crush map, edit it - add all
> > the new OSDs - and upload the new map to the cluster). How did you
> > control the impact of the recovery/backfilling operation on your
> > clients' data traffic? What settings did you use to avoid slow requests?
> >
> > Kind regards,
> > Laszlo
> >
> > On 19.07.2017 17:40, Richard Hesketh wrote:
> >> On 19/07/17 15:14, Laszlo Budai wrote:
> >>> Hi David,
> >>>
> >>> Thank you for that reference about CRUSH. It's a nice one.
> >>> There I could read about expanding the cluster, but in one of my cases
> >>> we want to do more: we want to move from host failure domain to chassis
> >>> failure domain. Our concern is: how will ceph behave for those PGs where
> >>> all three replicas are currently in the same chassis? Because in this
> >>> case, according to the new CRUSH map, two replicas are in the wrong place.
> >>>
> >>> Kind regards,
> >>> Laszlo
> >>
> >> Changing crush rules resulting in PGs being remapped works exactly the
> >> same way as changes in crush weights causing remapped data. The PGs will
> >> be remapped in accordance with the new crushmap/rules and then recovery
> >> operations will copy them over to the new OSDs as usual. Even if a PG is
> >> entirely remapped, the OSDs that were originally hosting it will operate
> >> as an acting set and continue to serve I/O and replicate data until
> >> copies on the new OSDs are ready to take over - ceph won't throw an upset
> >> because the acting set doesn't comply with the crush rules. I have done,
> >> for instance, a crush rule change which resulted in an entire pool being
> >> entirely remapped - switching the cephfs metadata pool from an HDD root
> >> to an SSD root rule, so every single PG was moved to a completely
> >> different set of OSDs - and it all continued to work fine while recovery
> >> took place.
> >>
> >> Rich
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
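For reference, the edit-all-at-once cycle Laszlo describes (including a host -> chassis failure domain change) is the standard getcrushmap/crushtool/setcrushmap round trip. A sketch, with illustrative file names; be aware that injecting the new map kicks off the remapping immediately:

    ceph osd getcrushmap -o crushmap.bin          # fetch the compiled CRUSH map
    crushtool -d crushmap.bin -o crushmap.txt     # decompile to editable text
    # edit crushmap.txt: add chassis buckets, move the host buckets under
    # them, and change the rule's "step chooseleaf firstn 0 type host"
    # to "step chooseleaf firstn 0 type chassis"
    crushtool -c crushmap.txt -o crushmap.new     # recompile
    ceph osd setcrushmap -i crushmap.new          # inject; backfill begins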
