On 04/20/2015 01:46 PM, Robert LeBlanc wrote:
>
>
> On Mon, Apr 20, 2015 at 2:34 PM, Colin Corr <[email protected]> wrote:
>
>
>
> On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
> > We have a similar issue, but we wanted three copies across two racks. We
> > ended up increasing size to 4 and leaving min_size at 2. We didn't want to
> > risk having fewer than two copies, and if we only had three copies, losing a
> > rack would block I/O. Once we expand to a third rack, we will adjust our rule
> > and go to size 3. Searching the mailing list and docs proved difficult, so
> > I'll include my rule so that you can use it as a basis. You should be able to
> > just change rack to host and host to osd. If you want to keep only three
> > copies, the "extra" OSD chosen just won't be used, as Gregory mentions.
> > Technically this rule should have "max_size 4", but I won't set a pool over 4
> > copies, so I didn't change it here.
> >
> > If anyone has a better way of writing this rule (or one that would work
> > for both a two-rack and a 3+ rack configuration, as mentioned above), I'd be
> > open to it. This is the first rule that I've really written on my own.
> >
> > rule replicated_ruleset {
> >         ruleset 0
> >         type replicated
> >         min_size 1
> >         max_size 10
> >         step take default
> >         step choose firstn 2 type rack
> >         step chooseleaf firstn 2 type host
> >         step emit
> > }
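> >
> > As a rough sketch, the pool settings that pair with this rule look like the
> > following (the pool name "rbd" is only an example):
> >
> >     ceph osd pool set rbd size 4        # four copies total, two per rack
> >     ceph osd pool set rbd min_size 2    # block I/O if fewer than two copies remain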
>
> Thank you Robert. Your example was very helpful. I didn't realize you
> could nest the choose and chooseleaf steps together. I thought chooseleaf
> effectively handled that for you already. This makes a bit more sense now.
>
>
> I'm still a little fuzzy on it myself, but not having an emit step between
> the choose and chooseleaf steps makes chooseleaf operate on the items chosen
> by choose instead of picking new items from all available entities. I
> couldn't get crushtool --test --simulate to work properly to confirm this
> (http://tracker.ceph.com/issues/11224), but it is working properly in our
> cluster. Just FYI, min_size and max_size do not change your pools; they only
> specify which pool sizes the rule applies to. Technically, if the pool size
> (replica count) is less than 2 or greater than 3, this rule would not be
> selected.
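>
> For what it's worth, a plain --test run (without --simulate) along these
> lines should show where copies would land; the rule number, replica count,
> and file name here are just examples:
>
>     ceph osd getcrushmap -o crushmap.bin
>     crushtool -i crushmap.bin --test --rule 0 --num-rep 4 --show-mappings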
Thanks for the help. Reading your comments and re-reading the documentation has
helped me understand how the rule language works; I had a few misconceptions.
Any thoughts as to what conditions would cause us to end up with more than the
specified number of replicas? Is it for recovery scenarios or like a safety
rail for flapping OSDs?
It would seem that the default min_size and max_size values (1 and 10) are
sufficient for this rule, just as you demonstrated in your rule.
rule host_rule {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 2 type host
        step chooseleaf firstn 2 type osd
        step emit
}
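For anyone searching the archives later, loading an updated rule and pointing a
pool at it goes roughly like this (pool and file names are just examples):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt to add or change host_rule, then recompile and inject it
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
    ceph osd pool set rbd crush_ruleset 2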
> My rule looks like this now:
> rule host_rule {
>         ruleset 2
>         type replicated
>         min_size 2
>         max_size 3
>         step take default
>         step choose firstn 2 type host
>         step chooseleaf firstn 2 type osd
>         step emit
> }
>
> And the cluster is reporting the pool as clean, finally. If I understand
> correctly, we will now potentially have as many as 4 replicas of an object in
> the pool, 2 on each host.
>
>
> You will only have 4 replicas if you set the size of your pool to 4;
> otherwise, if it is left at the default, it will be three. The rule will
> support up to 4 replicas.
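>
> In other words (pool name is only an example):
>
>     ceph osd pool get rbd size      # show the current replica count
>     ceph osd pool set rbd size 4    # only then will this rule place four copies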