> In both cases, you only get 2 replicas on the remaining 2 hosts.
OK, I was able to reproduce this with crushtool.
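In case the exact invocation is useful: I simulated the down host by
zeroing the weights of its OSDs in crushtool's test mode and counting the
incomplete mappings, roughly like this (file name and device numbers are
just placeholders for my test map):

# crushtool --test -i my.map --rule 0 --num-rep 3 --weight 0 0 --weight 1 0 --show-bad-mappings

--show-bad-mappings lists every input that maps to fewer than --num-rep
OSDs, which is where the "only 2 replicas" results show up.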
> The difference is if you have 4 hosts with 2 osds. In the choose case, you
> have some fraction of the data that chose the down host in the first step
> (most of the attempts, actually!) and then couldn't find a usable osd,
> leaving you with only 2 replicas. With chooseleaf that doesn't happen.

This is also reproducible.
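For anyone following along, the rule variants I compared look like this;
the choose version is my reconstruction of the setup described above, not
a rule from my actual map:

rule data_choose {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type host
    step choose firstn 1 type osd
    step emit
}

The chooseleaf rule (see the map below) does the same descent in a single
step, so it can retry a different host when the first pick yields no
usable OSD.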
> The other difference is if you have one of the two OSDs on the host marked
> out. In the choose case, the remaining OSD will get allocated 2x the data;
> in the chooseleaf case, usage will remain proportional with the rest of
> the cluster and the data from the out OSD will be distributed across other
> OSDs (at least when there are > 3 hosts!).
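That case also seems easy to check with crushtool by reweighting a single
device to 0, e.g. (again, the device number is picked arbitrarily):

# crushtool --test -i my.map --rule 0 --num-rep 3 --weight 1 0 --show-utilization

With the chooseleaf rule below, osd.0 should then stay roughly in line
with the other devices instead of taking 2x, as described above.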
I see, but the data distribution does not seem optimal in that case. For
example, using this crush map:
# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host prox-ceph-1 {
    id -2        # do not change unnecessarily
    # weight 7.260
    alg straw
    hash 0       # rjenkins1
    item osd.0 weight 3.630
    item osd.1 weight 3.630
}
host prox-ceph-2 {
    id -3        # do not change unnecessarily
    # weight 7.260
    alg straw
    hash 0       # rjenkins1
    item osd.2 weight 3.630
    item osd.3 weight 3.630
}
host prox-ceph-3 {
    id -4        # do not change unnecessarily
    # weight 3.630
    alg straw
    hash 0       # rjenkins1
    item osd.4 weight 3.630
}
host prox-ceph-4 {
    id -5        # do not change unnecessarily
    # weight 3.630
    alg straw
    hash 0       # rjenkins1
    item osd.5 weight 3.630
}
root default {
    id -1        # do not change unnecessarily
    # weight 21.780
    alg straw
    hash 0       # rjenkins1
    item prox-ceph-1 weight 7.260    # 2 OSDs
    item prox-ceph-2 weight 7.260    # 2 OSDs
    item prox-ceph-3 weight 3.630    # 1 OSD
    item prox-ceph-4 weight 3.630    # 1 OSD
}

# rules
rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
crushtool shows the following utilization:
# crushtool --test -i my.map --rule 0 --num-rep 3 --show-utilization
device 0: 423
device 1: 452
device 2: 429
device 3: 452
device 4: 661
device 5: 655
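For reference, those numbers sum to 3072 (1024 inputs x 3 replicas), and
all six devices have weight 3.630, so a perfectly proportional split would
be 512 per device; osd.4 and osd.5 get roughly 28-29% more than that.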
Any explanation for that? Maybe related to the small number of devices?