>>> Fast write enabled would mean that the primary OSD sends #size copies to the
>>> entire active set (including itself) in parallel and sends an ACK to the
>>> client as soon as min_size ACKs have been received from the peers (including
>>> itself). In this way, one can tolerate (size-min_size) slow(er) OSDs (slow
>>> for whatever reason) without suffering performance penalties immediately
>>> (only after too many requests started piling up, which will show as a slow
>>> requests warning).
>>>
>> What happens if an error occurs on the slowest OSD after the min_size 
>> ACK has already been sent to the client?
>>
>This should not be different from what exists today, unless of course
>the error happens on the local/primary OSD.

Can this be addressed with reasonable effort? I don't expect this to be a 
quick fix, and it should be tested. However, beating the tail-latency statistics 
with the extra redundancy should be worth it. I observe latency fluctuations: 
OSDs become randomly slow for whatever reason over short time intervals and 
then return to normal.

A reason for this could be DB compaction. I think during compaction latency 
tends to spike.

A fast-write option would effectively remove the impact of this.
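The proposed behavior could be sketched roughly as follows. This is a hypothetical illustration of the ACK logic only, not Ceph internals: the primary submits the write to all `size` replicas in parallel and acknowledges the client as soon as `min_size` replica ACKs arrive, letting stragglers finish in the background. All names, sizes, and latencies here are made up for the example.

```python
import concurrent.futures
import random
import time

SIZE = 3       # replicas in the active set (pool "size")
MIN_SIZE = 2   # ACKs required before answering the client ("min_size")

def replica_write(osd_id: int) -> int:
    """Simulate one replica write with a randomly varying latency."""
    time.sleep(random.uniform(0.01, 0.05))
    return osd_id

def fast_write() -> list:
    """Return as soon as MIN_SIZE of SIZE replica writes have completed."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=SIZE)
    futures = [pool.submit(replica_write, i) for i in range(SIZE)]
    acked = []
    for fut in concurrent.futures.as_completed(futures):
        acked.append(fut.result())
        if len(acked) >= MIN_SIZE:
            break
    # The client would be ACKed at this point; the remaining slow
    # replica writes continue in the background.
    pool.shutdown(wait=False)
    return acked

print(fast_write())
```

The point of the sketch is that the client-visible latency is the MIN_SIZE-th fastest replica, not the slowest one, which is exactly what would mask short-lived slowdowns such as compaction spikes.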

Best regards and thanks for considering this!
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io