After updating the CRUSH rule from
rule cephfs_ec {
    id 1
    type erasure
    min_size 8
    max_size 8
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 4 type host
    step choose indep 2 type osd
    step emit
}
to
rule cephfs_ec {
    id 1
    type erasure
    min_size 8
    max_size 12
    #step set_chooseleaf_tries 6
    step set_choose_tries 100
    step take default
    step choose indep 6 type host
    step choose indep 2 type osd
    step emit
}
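If I read the fan-out right, the first rule yields exactly eight placement slots with no spare hosts, while the second yields twelve slots for the eight shards. A quick sketch of that arithmetic (toy code, not Ceph internals; the numbers come from the rules and the isa_62 profile below):

```python
# Toy arithmetic for the two rules above (not Ceph code).

def rule_slots(num_hosts, osds_per_host):
    """Slots produced by 'choose indep H type host' then 'choose indep N type osd'."""
    return num_hosts * osds_per_host

ec_width = 6 + 2                 # isa_62 profile: k=6, m=2 -> 8 shards per PG
old_slots = rule_slots(4, 2)     # old rule: 4 hosts x 2 OSDs
new_slots = rule_slots(6, 2)     # new rule: 6 hosts x 2 OSDs

print(old_slots, new_slots)      # 8 12
# Old rule: slots == shards, so every mapping spans exactly 4 hosts and
# upmap has no room to involve a 5th or 6th host.
assert old_slots == ec_width
# New rule: 4 spare slots, so shards can land on any of the 6 hosts.
assert new_slots > ec_width
```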
upmap is no longer complaining and is working with the six hosts.
It seems that with the first rule upmap does not stop picking hosts after
the first four, and the check complains as soon as it gets a fifth host.
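For what it's worth, the check seems to boil down to counting the distinct hosts a PG spans. A rough Python sketch of my understanding (not the actual verify_upmap source; the host/OSD layout is made up to match my cluster):

```python
# Rough sketch of what verify_upmap seems to check (not the real Ceph
# source): how many distinct host buckets a PG's OSD set spans, versus
# the number of hosts the CRUSH rule picks.

def count_host_buckets(osd_to_host, osds):
    """Number of distinct hosts covered by a PG's OSDs."""
    return len({osd_to_host[o] for o in osds})

# Hypothetical layout matching my cluster: 6 hosts x 4 OSDs,
# osd.0-3 on ceph1, osd.4-7 on ceph2, and so on.
osd_to_host = {o: "ceph%d" % (o // 4 + 1) for o in range(24)}

# A PG placed by the old rule: 2 OSDs on each of 4 hosts.
acting = [0, 1, 4, 5, 8, 9, 12, 13]
assert count_host_buckets(osd_to_host, acting) == 4   # within "desired 4"

# An upmap that moves one shard to osd.16 (on ceph5) spans 5 hosts,
# which matches the "number of buckets 5 exceeds desired 4" message.
remapped = [0, 1, 4, 5, 8, 9, 12, 16]
print(count_host_buckets(osd_to_host, remapped))      # 5
```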
Is this a bug or intended behaviour?
Regards
Eric
On Tue, Sep 17, 2019 at 3:55 PM Eric Dold <[email protected]> wrote:
> With Ceph 14.2.4 it's the same.
> The upmap balancer is still not working.
>
> Any ideas?
>
> On Wed, Sep 11, 2019 at 11:32 AM Eric Dold <[email protected]> wrote:
>
>> Hello,
>>
>> I'm running Ceph 14.2.3 on six hosts with four OSDs each. I recently
>> expanded this cluster from four hosts.
>>
>> The cluster is running fine, but I get this in my logs:
>>
>> Sep 11 11:02:41 ceph1 ceph-mon[1333]: 2019-09-11 11:02:41.953
>> 7f26023a6700 -1 verify_upmap number of buckets 5 exceeds desired 4
>> Sep 11 11:02:41 ceph1 ceph-mon[1333]: 2019-09-11 11:02:41.953
>> 7f26023a6700 -1 verify_upmap number of buckets 5 exceeds desired 4
>> Sep 11 11:02:41 ceph1 ceph-mon[1333]: 2019-09-11 11:02:41.953
>> 7f26023a6700 -1 verify_upmap number of buckets 5 exceeds desired 4
>>
>> It looks like the balancer is not doing any work.
>>
>> Here are some infos about the cluster:
>>
>> ceph1 ~ # ceph osd crush rule ls
>> replicated_rule
>> cephfs_ec
>> ceph1 ~ # ceph osd crush rule dump replicated_rule
>> {
>>     "rule_id": 0,
>>     "rule_name": "replicated_rule",
>>     "ruleset": 0,
>>     "type": 1,
>>     "min_size": 1,
>>     "max_size": 10,
>>     "steps": [
>>         {
>>             "op": "take",
>>             "item": -1,
>>             "item_name": "default"
>>         },
>>         {
>>             "op": "chooseleaf_firstn",
>>             "num": 0,
>>             "type": "host"
>>         },
>>         {
>>             "op": "emit"
>>         }
>>     ]
>> }
>>
>> ceph1 ~ # ceph osd crush rule dump cephfs_ec
>> {
>>     "rule_id": 1,
>>     "rule_name": "cephfs_ec",
>>     "ruleset": 1,
>>     "type": 3,
>>     "min_size": 8,
>>     "max_size": 8,
>>     "steps": [
>>         {
>>             "op": "set_chooseleaf_tries",
>>             "num": 5
>>         },
>>         {
>>             "op": "set_choose_tries",
>>             "num": 100
>>         },
>>         {
>>             "op": "take",
>>             "item": -1,
>>             "item_name": "default"
>>         },
>>         {
>>             "op": "choose_indep",
>>             "num": 4,
>>             "type": "host"
>>         },
>>         {
>>             "op": "choose_indep",
>>             "num": 2,
>>             "type": "osd"
>>         },
>>         {
>>             "op": "emit"
>>         }
>>     ]
>> }
>>
>> ceph1 ~ # ceph osd erasure-code-profile ls
>> default
>> isa_62
>> ceph1 ~ # ceph osd erasure-code-profile get default
>> k=2
>> m=1
>> plugin=jerasure
>> technique=reed_sol_van
>> ceph1 ~ # ceph osd erasure-code-profile get isa_62
>> crush-device-class=
>> crush-failure-domain=osd
>> crush-root=default
>> k=6
>> m=2
>> plugin=isa
>> technique=reed_sol_van
>>
>> The idea with four hosts was that the EC profile would take two OSDs on
>> each host for the eight buckets.
>> Now with six hosts I guess two hosts will have two buckets on two OSDs
>> and four hosts will each have one bucket for a piece of data.
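(Checking that guess: eight shards spread over six hosts does come out as two hosts with two shards and four hosts with one. A quick sketch of the arithmetic:)

```python
# Quick arithmetic for spreading 8 EC shards (k=6, m=2) over 6 hosts.
shards, hosts = 8, 6
base, extra = divmod(shards, hosts)                  # (1, 2)
spread = [base + 1] * extra + [base] * (hosts - extra)
print(spread)                                        # [2, 2, 1, 1, 1, 1]
assert sum(spread) == shards
```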
>>
>> Any idea how to resolve this?
>>
>> Regards
>> Eric
>>
>
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]