Hi all,

I'd like to set up a small cluster (5 nodes) using erasure coding, with k=5 and m=3. Normally you would need a minimum of 8 nodes (preferably 9 or more) for this.
Then I found this blog post: https://ceph.com/planet/erasure-code-on-small-clusters/

This sounded ideal to me, so I started building a test setup using the 5+3 profile. I changed the erasure ruleset to:

    rule erasure_ruleset {
        ruleset X
        type erasure
        min_size 8
        max_size 8
        step take default
        step choose indep 4 type host
        step choose indep 2 type osd
        step emit
    }

I created a pool, and now every PG has 8 shards on 4 hosts with 2 shards each. Perfect.

Then I tested a node failure. No problem again: all PGs stay active (most are undersized+degraded, but still active). After 10 minutes the OSDs on the failed node were all marked out, as expected. I waited for the data to be recovered to the other (fifth) node, but that doesn't happen; there is no recovery whatsoever. Only when I completely remove the down+out OSDs from the cluster is the data recovered.

My guess is that "step choose indep 4 type host" chooses 4 hosts beforehand to store data on.

Would it be possible to do something like this: create a 5+3 EC profile where every host holds a maximum of 2 shards (so 4 hosts are needed), and in case of a node failure, recover the data from the failed node to the fifth node?

Thank you in advance,
Caspar
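P.S. To make the numbers concrete, here is a rough Python sketch of the arithmetic behind this layout. It only counts shards and hosts; it does not model how CRUSH actually maps PGs to OSDs, and the function names are made up for illustration.

```python
import math


def hosts_needed(k: int, m: int, max_shards_per_host: int) -> int:
    """Minimum number of hosts needed to place k+m shards with a per-host cap."""
    return math.ceil((k + m) / max_shards_per_host)


def readable_after_host_failures(k: int, m: int, max_shards_per_host: int,
                                 failed_hosts: int) -> bool:
    """Worst case: every failed host held the maximum number of shards.

    The data is still reconstructible if at least k shards remain.
    """
    lost = failed_hosts * max_shards_per_host
    return (k + m) - lost >= k


k, m = 5, 3
total_hosts = 5  # size of the test cluster described above

print(hosts_needed(k, m, 1))                     # one shard per host
print(hosts_needed(k, m, 2))                     # two shards per host
print(total_hosts - hosts_needed(k, m, 2))       # spare hosts left for recovery
print(readable_after_host_failures(k, m, 2, 1))  # one host down
print(readable_after_host_failures(k, m, 2, 2))  # two hosts down
```

With two shards per host, 4 hosts are enough for the 8 shards, which leaves one of my 5 hosts as a possible recovery target. Losing one host costs at most 2 shards, so 6 remain, which is still at least k=5.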
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com