On 02/03/2019 01:02, Ravi Patel wrote:
Hello,

My question is about how CRUSH distributes chunks throughout the cluster with erasure coded pools. Currently, we have 4 OSD nodes with 36 drives (OSD daemons) per node. If we use crush_failure_domain=host, then we are necessarily limited to k=3,m=1 or k=2,m=2. We would like to explore k>3, m>2 modes of coding but are unsure how the CRUSH rule set will distribute the chunks if we set the crush_failure_domain to OSD.

Ideally, we would like CRUSH to distribute the chunks hierarchically so as to spread them evenly across the nodes, rather than, for example, ending up with all chunks on a single node.

Are chunks spread evenly by default? If not, how might we go about configuring this?

You can write your own CRUSH rules to distribute chunks hierarchically. For example, you can have a k=6, m=2 code together with a rule that guarantees that each node gets two chunks. This means that if you lose a node you do not lose data (though, depending on your min_size setting, your pool might be unavailable at that point until you replace the node or add a new one and the chunks can be recovered). You would accomplish this with a rule that looks like this:

rule ec8 {
        id <some free id>
        type erasure
        min_size 7
        max_size 8
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 4 type host
        step chooseleaf indep 2 type osd
        step emit
}

This means the rule will first pick 4 hosts, then pick 2 OSDs per host, for a total of 8 OSDs. This is appropriate for k=6 m=2 codes as well as k=5 m=2 codes (the latter will just leave one of the selected OSDs unused), hence min_size 7 and max_size 8.
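
In case it's useful, here's roughly how you'd inject a rule like that and then create a pool on top of it. The profile name, pool name and PG counts below are just placeholders, so adjust to taste:

# export and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# (edit crushmap.txt and paste in the ec8 rule above)
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new

# create a k=6 m=2 profile and an EC pool that uses the custom rule
ceph osd erasure-code-profile set ec62 k=6 m=2
ceph osd pool create ecpool 128 128 erasure ec62 ec8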

If you just set crush_failure_domain to OSD, then the rule will pick random OSDs without regard for the hosts; you will be able to use effectively any EC width you want, but there will be no guarantee of data durability if you lose a whole host.
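
For comparison, the failure-domain=osd approach is just an erasure code profile setting, and Ceph will generate a flat rule for you when the pool is created (again, the names here are only examples):

ceph osd erasure-code-profile set ec53osd k=5 m=3 crush-failure-domain=osd
ceph osd pool create ecpool-osd 128 128 erasure ec53osd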

--
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub