On Sun, 28 Apr 2019 at 21:45, Igor Podlesny <ceph-u...@poige.ru> wrote:

> On Sun, 28 Apr 2019 at 16:14, Paul Emmerich <paul.emmer...@croit.io>
> wrote:
> > Use k+m for PG calculation; that value also shows up as "erasure size"
> > in ceph osd pool ls detail
>
> So does it mean that for PG calculation those 2 pools are equivalent:
>
> 1) EC(4, 2)
> 2) replicated, size 6
>

Correct


>
> ? Sounds weird to be honest. Replicated with size 6 means each logical
> data unit is stored 6 times, so what needed a single PG now requires 6 PGs.
> And with EC(4, 2) there's still only a 1.5x overhead in terms of raw
> occupied space -- how come the PG calculation needs adjusting to 6
> instead of 1.5 then?
>

A single logical data unit (an object in Ceph terms) will be allocated to a
single PG. For a replicated pool of size n this PG will simply be stored on
n OSDs. For an EC(k+m) pool this PG will get stored on k+m OSDs, with the
difference that this single PG will contain different parts of the data on
the different OSDs.
http://docs.ceph.com/docs/master/architecture/#erasure-coding provides a
good overview of how this is actually achieved.
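For illustration, here is a minimal Python sketch of the common PG-sizing
rule of thumb (roughly 100 PGs per OSD, rounded to a power of two). The
function name and the 100-PGs-per-OSD target are my own assumptions, not
something stated in this thread; the point is only that the "size" used is
the replica count for a replicated pool and k+m for an EC pool:

    import math

    def suggested_pg_num(num_osds: int, size: int,
                         target_pgs_per_osd: int = 100) -> int:
        """Round (num_osds * target) / size to the nearest power of two.

        `size` is the replica count for a replicated pool, or k+m for an
        erasure-coded pool.
        """
        raw = num_osds * target_pgs_per_osd / size
        power = max(1, round(math.log2(raw)))
        return 2 ** power

    # Hypothetical cluster with 12 OSDs: both pools come out the same.
    print(suggested_pg_num(12, size=6))      # replicated, size 6 -> 256
    print(suggested_pg_num(12, size=4 + 2))  # EC(4, 2), k+m = 6  -> 256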


> Also, why does CEPH documentation say "It is equivalent to a
> replicated pool of size __two__" when describing EC(2, 1) example?
>

This relates to fault tolerance: a replicated pool of size 2 can lose one
OSD without data loss, and so can an EC(2+1) pool.
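To make the comparison concrete, here is a small sketch contrasting fault
tolerance with raw-space overhead. The numbers follow directly from the
definitions (replicated size n tolerates n-1 failures at overhead n;
EC(k, m) tolerates m failures at overhead (k+m)/k); the helper functions
themselves are just illustrative:

    def replicated(n: int) -> dict:
        """Fault tolerance and raw-space overhead of a replicated pool."""
        return {"tolerated_failures": n - 1, "overhead": float(n)}

    def erasure(k: int, m: int) -> dict:
        """Fault tolerance and raw-space overhead of an EC(k, m) pool."""
        return {"tolerated_failures": m, "overhead": (k + m) / k}

    print(replicated(2))   # {'tolerated_failures': 1, 'overhead': 2.0}
    print(erasure(2, 1))   # {'tolerated_failures': 1, 'overhead': 1.5}

Same tolerance to a single OSD loss, which is the sense in which the
documentation calls EC(2, 1) "equivalent" to a replicated pool of size two,
even though the raw-space overhead differs.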


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com