In my case the cluster is under very little active load, so I have never had to worry about recovery operations impacting client traffic. In fact, I generally tune up from the defaults (e.g. increase osd max backfills) to improve recovery speed when I'm doing major changes, because there's plenty of spare capacity in the cluster; and either way I'm in the fortunate position where I can place a higher value on having a HEALTH_OK cluster ASAP than on client I/O remaining consistent.
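For concreteness, here is roughly what that tuning looks like; the values are illustrative rather than recommendations, and you'd normally revert them once recovery finishes:

    # Runtime override on all OSDs (not persisted across restarts):
    ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'

    # Revert to the defaults afterwards:
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'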
Rich

On 19/07/17 16:27, Laszlo Budai wrote:
> Hi Rich,
>
> Thank you for your answer. This is good news to hear :)
> Regarding the reconfiguration you've done: if I understand correctly, you
> changed it all at once (download the crush map, edit it - add all
> the new OSDs - and upload the new map to the cluster). How did you control
> the impact of the recovery/backfilling operation on your clients' data
> traffic? What settings have you used to avoid slow requests?
>
> Kind regards,
> Laszlo
>
>
> On 19.07.2017 17:40, Richard Hesketh wrote:
>> On 19/07/17 15:14, Laszlo Budai wrote:
>>> Hi David,
>>>
>>> Thank you for that reference about CRUSH. It's a nice one.
>>> There I could read about expanding the cluster, but in one of my cases we
>>> want to do more: we want to move from host failure domain to chassis
>>> failure domain. Our concern is: how will ceph behave for those PGs where
>>> all three replicas are currently in the same chassis? According to the
>>> new CRUSH map, two of those replicas will be in the wrong place.
>>>
>>> Kind regards,
>>> Laszlo
>>
>> Changing crush rules so that PGs are remapped works exactly the same
>> way as a change in crush weights causing remapped data. The PGs will be
>> remapped in accordance with the new crushmap/rules, and recovery
>> operations will then copy them over to the new OSDs as usual. Even if a
>> PG is entirely remapped, the OSDs that originally hosted it will remain
>> the acting set and continue to serve I/O and replicate data until the
>> copies on the new OSDs are ready to take over - ceph won't throw an
>> upset because the acting set doesn't comply with the crush rules. I
>> have done, for instance, a crush rule change which resulted in an
>> entire pool being remapped - switching the cephfs metadata pool from an
>> HDD-root rule to an SSD-root rule, so every single PG moved to a
>> completely different set of OSDs - and it all continued to work fine
>> while recovery took place.
>>
>> Rich
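For reference, the whole-map edit Laszlo describes (download, modify, re-upload) to move from host to chassis failure domain would look roughly like the sketch below. The file names are placeholders, and injecting the new map starts remapping and backfill immediately:

    # Fetch and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # In crushmap.txt: define chassis buckets, move the host buckets into
    # them, and change the replicated rule's placement step from
    #   step chooseleaf firstn 0 type host
    # to
    #   step chooseleaf firstn 0 type chassis

    # Recompile and inject; PGs remap and recovery begins at once
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin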
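The rule switch Rich mentions for the cephfs metadata pool would, on Luminous with device classes, be along these lines; the rule and pool names here are assumptions, and on earlier releases you'd instead point the rule at a separate SSD root defined in the CRUSH map:

    # Create a replicated rule restricted to SSD OSDs (Luminous device classes)
    ceph osd crush rule create-replicated replicated_ssd default host ssd

    # Repoint the pool at the new rule; every PG remaps, as described above
    ceph osd pool set cephfs_metadata crush_rule replicated_ssd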
