On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
> We have a similar issue, but we wanted three copies across two racks. Turns
> out, that we increased size to 4 and left min_size at 2. We didn't want to
> risk having fewer than two copies, and if we only had three copies, losing a
> rack would block I/O. Once we expand to a third rack, we will adjust our rule
> and go to size 3. Searching the mailing list and docs proved difficult, so
> I'll include my rule so that you can use it as a basis. You should be able to
> just change rack to host and host to osd. If you want to keep only three
> copies, the "extra" OSD chosen just won't be used as Gregory mentions.
> Technically this rule should have "max_size 4", but I won't set a pool over 4
> copies so I didn't change it here.
>
> If anyone has a better way of writing this rule (or one that would work for
> both a two rack and 3+ rack configuration as mentioned above), I'd be open to
> it. This is the first rule that I've really written on my own.
>
> rule replicated_ruleset {
>     ruleset 0
>     type replicated
>     min_size 1
>     max_size 10
>     step take default
>     step choose firstn 2 type rack
>     step chooseleaf firstn 2 type host
>     step emit
> }
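A rule like this can be sanity-checked offline with crushtool before it goes live on the cluster. The sketch below assumes the compiled map is saved to a placeholder file named crushmap.bin and that the rule is ruleset 0, as in the example above:

```shell
# Grab the cluster's current CRUSH map (filename is a placeholder).
ceph osd getcrushmap -o crushmap.bin

# Ask CRUSH which OSDs the rule returns for a sample of inputs,
# requesting 3 replicas (the intended pool size).
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings

# Flag any inputs CRUSH could not map to the requested 3 OSDs.
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings
```

If --show-bad-mappings prints nothing, every sampled placement received the requested number of OSDs.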
Thank you Robert. Your example was very helpful. I didn't realize you could
nest the choose and chooseleaf steps together. I thought chooseleaf effectively
handled that for you already. This makes a bit more sense now.
My rule looks like this now:
rule host_rule {
    ruleset 2
    type replicated
    min_size 2
    max_size 3
    step take default
    step choose firstn 2 type host
    step chooseleaf firstn 2 type osd
    step emit
}
And the cluster is reporting the pool as clean, finally. If I understand
correctly, the rule now selects up to 4 OSDs (2 on each host), but with the
pool size at 3 only three of them actually store replicas: 2 on one host and
1 on the other, with the extra OSD left unused as Gregory described.
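For the third-rack expansion Robert mentions, the rule gets simpler: a chooseleaf step against racks places each replica in a distinct rack, so no over-selection trick is needed. A sketch, with the ruleset number as a placeholder:

```
rule replicated_3rack {
    ruleset 3            # placeholder id
    type replicated
    min_size 2
    max_size 3
    step take default
    # firstn 0 means "as many as the pool size"; each leaf (OSD)
    # is picked under a distinct rack.
    step chooseleaf firstn 0 type rack
    step emit
}
```

With pool size 3 and three racks, this yields one replica per rack.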
> On Mon, Apr 20, 2015 at 11:50 AM, Gregory Farnum <[email protected]> wrote:
> It's actually pretty hacky: you configure your CRUSH rule to return
> two OSDs from each host, but set your size to 3. You'll want to test
> this carefully with your installed version to make sure that works,
> though — older CRUSH implementations would crash if you did that. :(
>
> In slightly more detail, you'll need to change it so that instead of
> using "chooseleaf" you "choose" 2 hosts, and then choose or chooseleaf
> 2 OSDs from each of those hosts. If you search the list archives for
> CRUSH threads you'll find some other discussions about doing precisely
> this, and I think the CRUSH documentation should cover the more
> general bits of how the language works.
> -Greg
Thank you Greg, I had trouble searching for discussions related to this. The
Google was not being friendly, or I wasn't issuing a good query. My
understanding of choose vs. chooseleaf and of using multiple choose* steps in
a rule will send me back to the docs for the remainder of my day.
Thanks,
Colin
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com