Thank you, Anthony, for sharing your knowledge. I am very happy.
On Sat, Jan 13, 2024 at 23:36, Anthony D'Atri <anthony.da...@gmail.com> wrote:

> There are nuances, but in general the higher the sum of k+m, the lower
> the performance, because *every* operation has to hit that many drives,
> which is especially impactful with HDDs. So there's a tradeoff between
> storage efficiency and performance. And as you've seen, larger parity
> groups especially mean slower recovery/backfill.
>
> There's also a modest benefit to choosing values of m and k that have
> small prime factors, but I wouldn't worry too much about that.
>
> You can find EC efficiency tables on the net:
>
> https://docs.netapp.com/us-en/storagegrid-116/ilm/what-erasure-coding-schemes-are.html
>
> I should really add a table to the docs; making a note to do that.
>
> There's a nice calculator at the OSNEXUS site:
>
> https://www.osnexus.com/ceph-designer
>
> The overhead factor is (k+m) / k.
>
> So for a 4,2 profile, that's 6 / 4 = 1.5.
> For 6,2, 8 / 6 ≈ 1.33.
> For 10,2, 12 / 10 = 1.2.
>
> And so forth. As k increases, the incremental efficiency gain sees
> diminishing returns, but performance continues to decrease.
>
> Think of m as the number of copies you can lose without losing data,
> and m-1 as the number you can lose / have down and still have data
> *available*.
>
> I also suggest that the number of failure domains (in your case, OSD
> nodes) be *at least* k+m+1, so in your case you want k+m to be at
> most 9.
>
> With RBD and many CephFS implementations, we mostly have relatively
> large RADOS objects that are striped over many OSDs.
>
> When using RGW especially, one should attend to average and median S3
> object size. There's an analysis of the potential for space
> amplification in the docs, so I won't repeat it here in detail. This
> sheet visually demonstrates it:
>
> https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit#gid=358760253
>
> Basically, for an RGW bucket pool (or for a CephFS data pool storing
> unusually small objects), if you have a lot of S3 objects in the
> multiples-of-KB size range, you waste a significant fraction of the
> underlying storage. This is exacerbated by EC, and the larger the sum
> of k+m, the more waste.
>
> When people ask me about replication vs. EC and the EC profile, the
> first question I ask is what they're storing. When EC isn't a
> non-starter, I tend to recommend 4,2 as a profile until/unless someone
> has specific needs and can understand the tradeoffs. This lets you
> store roughly 2x the data of 3x replication while not going overboard
> on the performance hit.
>
> If you care about your data, do not set m=1.
>
> If you need to survive the loss of many drives, say if your cluster
> spans multiple buildings or sites, choose a larger value of m. There
> are people running profiles like 4,6 because they have unusual and
> specific needs.
>
> On Jan 13, 2024, at 10:32 AM, Phong Tran Thanh <tranphong...@gmail.com> wrote:
>
> Hi Ceph users!
>
> I need to determine which erasure code values (k and m) to choose for
> a Ceph cluster with 10 nodes.
>
> I am using the Reef release with RBD. Furthermore, when using a larger
> k, for example EC 6+2 vs. EC 4+2, which erasure coding performance is
> better, and what are the criteria for choosing the appropriate erasure
> coding? Please help me.
>
> Email: tranphong...@gmail.com
> Skype: tranphong079
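To make sure I understand the math, here is a minimal Python sketch of the arithmetic above: the (k+m)/k overhead factor, the "at least k+m+1 failure domains" rule of thumb, and a simplified model of the small-object padding effect that the spreadsheet demonstrates. The 4 KiB allocation unit is only an assumption for illustration (BlueStore's default min_alloc_size varies by release and media), not a value from this thread.

import math

def overhead_factor(k: int, m: int) -> float:
    """Raw-to-usable overhead of a k,m EC profile: (k+m)/k."""
    return (k + m) / k

def min_failure_domains(k: int, m: int) -> int:
    """Rule of thumb from above: at least k+m+1 failure domains."""
    return k + m + 1

def small_object_amplification(object_bytes: int, k: int, m: int,
                               alloc_bytes: int = 4096) -> float:
    """Raw bytes consumed per logical byte when each of the k data
    chunks (and each of the m parity chunks) is padded up to the
    allocation unit. Simplified model; alloc_bytes=4096 is assumed."""
    chunk = math.ceil(object_bytes / k)                    # per-chunk payload
    padded = math.ceil(chunk / alloc_bytes) * alloc_bytes  # size on disk
    return (k + m) * padded / object_bytes

for k, m in [(4, 2), (6, 2), (10, 2)]:
    print(f"EC {k},{m}: overhead {overhead_factor(k, m):.2f}, "
          f"needs >= {min_failure_domains(k, m)} failure domains, "
          f"16 KiB object costs {small_object_amplification(16 * 1024, k, m):.1f}x raw")

Under these assumptions, a 16 KiB object costs 1.5x raw space under 4,2 but 3.0x under 10,2, even though 10,2 has the lower nominal overhead, which illustrates why a larger k+m can waste more space on small objects. If I go with the recommended 4,2, I believe the profile would be created with something like: ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host (the profile name here is just my own placeholder).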
--
Best regards,
----------------------------------------------------------------------------
*Tran Thanh Phong*
Email: tranphong...@gmail.com
Skype: tranphong079
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io