In my case my cluster is under very little active load, so I have never had
to be concerned about recovery operations impacting client traffic. In fact,
I generally tune up from the defaults (increase osd max backfills) to improve
recovery speed when I'm doing major changes, because there's plenty of spare
capacity in the cluster; and either way I'm in the fortunate position where I
can place a higher value on having a HEALTH_OK cluster ASAP than on client
I/O performance staying consistent.
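
(For illustration, the sort of thing I run before a big change - the
values here are examples rather than recommendations, and I put them
back to the defaults once the cluster is HEALTH_OK again:)

    ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'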

Rich

On 19/07/17 16:27, Laszlo Budai wrote:
> Hi Rich,
> 
> Thank you for your answer. This is good news to hear :)
> Regarding the reconfiguration you've done: if I understand correctly, you
> changed it all at once (i.e. downloaded the crush map, edited it to add
> all the new OSDs, and uploaded the new map to the cluster). How did you
> control the impact of the recovery/backfill operations on your clients'
> data traffic? What settings did you use to avoid slow requests?
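> 
> (For context, by "download, edit, upload" I mean the standard crushtool
> round-trip - file names here are just examples:
> 
>     ceph osd getcrushmap -o crush.bin
>     crushtool -d crush.bin -o crush.txt
>     ... edit crush.txt to add the new OSDs/buckets ...
>     crushtool -c crush.txt -o crush-new.bin
>     ceph osd setcrushmap -i crush-new.bin
> )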
> 
> Kind regards,
> Laszlo
> 
> 
> On 19.07.2017 17:40, Richard Hesketh wrote:
>> On 19/07/17 15:14, Laszlo Budai wrote:
>>> Hi David,
>>>
>>> Thank you for that reference about CRUSH. It's a nice one.
>>> There I could read about expanding the cluster, but in one of my cases we
>>> want to do more: we want to move from a host failure domain to a chassis
>>> failure domain. Our concern is: how will ceph behave for those PGs where
>>> all three replicas are currently in the same chassis? According to the
>>> new CRUSH map, two of those replicas would be in the wrong place.
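>>>
>>> (For illustration, the change we have in mind is the chooseleaf step in
>>> the replicated rule - sketched here, not our actual rule:
>>>
>>>     # before: replicas spread across hosts
>>>     step chooseleaf firstn 0 type host
>>>     # after: replicas spread across chassis
>>>     step chooseleaf firstn 0 type chassis
>>> )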
>>>
>>> Kind regards,
>>> Laszlo
>>
>> Changing crush rules so that PGs are remapped works exactly the same way
>> as a change in crush weights causing remapped data. The PGs will be
>> remapped in accordance with the new crushmap/rules, and then recovery
>> operations will copy them over to the new OSDs as usual. Even if a PG is
>> entirely remapped, the OSDs that were originally hosting it will remain
>> the acting set and continue to serve I/O and replicate data until the
>> copies on the new OSDs are ready to take over - ceph won't complain that
>> the acting set doesn't comply with the crush rules. I have, for instance,
>> done a crush rule change which remapped an entire pool - switching the
>> cephfs metadata pool from an HDD-root rule to an SSD-root rule, so every
>> single PG moved to a completely different set of OSDs - and it all
>> continued to work fine while recovery took place.
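>>
>> (Mechanically, that switch is just pointing the pool at the new rule -
>> pool and rule names here are illustrative:
>>
>>     ceph osd pool set cephfs_metadata crush_ruleset <ssd-rule-id>
>>
>> or "crush_rule <rule-name>" instead of "crush_ruleset <id>" on
>> Luminous and later.)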
>>
>> Rich
