On 09/09/2014 14:21, Lei Dong wrote:
> Thanks Loic!
>
> Actually I've found that increasing choose_local_fallback_tries can
> help (chooseleaf_tries helps less significantly), but I'm afraid that
> when an OSD failure happens and a new acting set must be found, it may
> again fail to find enough racks. So I'm trying to find a more reliable
> way to handle OSD failure.
>
> My profile is nothing special other than k=8 m=3.
So your goal is to make it so losing 3 OSDs simultaneously does not mean
losing data. By forcing each rack to hold at most 2 OSDs for a given
object, you make it so losing a full rack does not mean losing data
either.

Are these racks in the same room in the datacenter? In the event of a
catastrophic failure that permanently destroys one rack, how realistic
is it that the other racks are unharmed? If the rack is destroyed by
fire and stands in a row with the other racks, there is a very high
chance that they will also be damaged. Note that I am not a system
architect nor a system administrator: I may be completely wrong ;-)

If it turns out that the probability of a single rack failing entirely
and independently of the others is negligible, it may not be necessary
to write a complex ruleset, and the default ruleset may be good enough.

My 2cts

> Thanks again!
>
> Leidong
>
>> On September 9, 2014, at 7:53 PM, "Loic Dachary" <[email protected]> wrote:
>>
>> Hi,
>>
>> It is indeed possible that a mapping fails when there are just enough
>> racks to match the constraint. And the probability of a bad mapping
>> increases when the number of PGs increases, because more mappings are
>> needed. You can tell CRUSH to try harder with
>>
>> step set_chooseleaf_tries 10
>>
>> Be careful though: increasing this number will change the mapping. It
>> will not just fix the bad mappings you're seeing, it will also change
>> the mappings that succeeded with a lower value. Once you've set this
>> parameter, it cannot be modified.
>>
>> Would you mind sharing the erasure code profile you plan to work with?
>>
>> Cheers
>>
>>> On 09/09/2014 12:39, Lei Dong wrote:
>>> Hi ceph users:
>>>
>>> I want to create a customized crush rule for my EC pool (with
>>> replica_size = 11) to distribute replicas across 6 different racks.
>>>
>>> I use the following rule at first:
>>>
>>> Step take default                 // root
>>> Step choose firstn 6 type rack    // 6 racks; I have exactly 6 racks
>>> Step chooseleaf indep 2 type osd  // 2 osds per rack
>>> Step emit
>>>
>>> It looks fine and works fine when the PG num is small. But when the
>>> PG num increases, there are always some PGs which cannot take all
>>> 6 racks. It looks like "Step choose firstn 6 type rack" sometimes
>>> returns only 5 racks. After some investigation, I think it may be
>>> caused by collisions between choices.
>>>
>>> Then I came up with another solution to avoid collisions, like this:
>>>
>>> Step take rack0
>>> Step chooseleaf indep 2 type osd
>>> Step emit
>>> Step take rack1
>>> ...
>>> (manually take every rack)
>>>
>>> This won't cause rack collisions, because I specify each rack by
>>> name. But the problem is that an osd in rack0 will always be the
>>> primary osd, because I choose from rack0 first.
>>>
>>> So the question is: what is the recommended way to meet such a need
>>> (distribute 11 replicas into 6 racks evenly to survive a rack
>>> failure)?
>>>
>>> Thanks!
>>> LeiDong
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> [email protected]
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre

--
Loïc Dachary, Artisan Logiciel Libre
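Putting the original rule together with the retry tuning discussed in the thread, a complete ruleset might look like the sketch below. This is only an illustration: the rule name and ruleset id are placeholders, and the retry values are starting points to experiment with, not verified settings for this cluster.

```
rule ecpool_six_racks {
    ruleset 1
    type erasure
    min_size 11
    max_size 11
    step set_chooseleaf_tries 10     // retry leaf choices harder
    step set_choose_tries 100        // retry bucket choices harder
    step take default                // root
    step choose firstn 6 type rack   // all 6 racks
    step chooseleaf indep 2 type osd // 2 osds per rack
    step emit
}
```

As Loic notes above, changing the tries values remaps PGs that previously succeeded as well as the failing ones, so this should be tested with crushtool before being injected into a live cluster.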
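The collision effect can be illustrated with a toy model. The sketch below is not CRUSH's actual placement algorithm, just a coupon-collector simulation under the assumption that each retry draws a rack uniformly at random: with a small retry budget, many placement groups fail to reach 6 distinct racks, and a larger budget makes failures rare.

```python
import random

def distinct_racks(n_racks=6, want=6, max_tries=10, seed=0):
    """Draw pseudo-random racks until `want` distinct ones are found
    or `max_tries` draws are exhausted (a stand-in for CRUSH retries)."""
    rng = random.Random(seed)
    chosen = set()
    for _ in range(max_tries):
        chosen.add(rng.randrange(n_racks))
        if len(chosen) == want:
            break
    return chosen

def failure_rate(max_tries, pgs=10000):
    # One independent trial per simulated placement group.
    fails = sum(1 for pg in range(pgs)
                if len(distinct_racks(max_tries=max_tries, seed=pg)) < 6)
    return fails / pgs

print(failure_rate(10))   # many PGs miss a rack with a small retry budget
print(failure_rate(50))   # almost none with a larger budget
```

This is consistent with the behaviour reported above: the rule is correct, but with only the default number of retries some PGs end up with 5 racks, and raising the tries makes the bad mappings disappear.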
