>
>>> This has left me with a single sad pg:
>>> [WRN] PG_AVAILABILITY: Reduced data availability: 1 pg inactive
>>> pg 1.0 is stuck inactive for 33m, current state unknown, last acting []
>>>
>> .mgr pool perhaps.
>
> I think so
>
>>> ceph osd tree shows that CRUSH picked up my racks OK, eg.
>>> -3 45.11993 rack B4
>>> -2 45.11993 host moss-be1001
>>> 1 hdd 3.75999 osd.1 up 1.00000 1.00000
>> Please send the entire first 10 lines or so of `ceph osd tree`
>
> root@moss-be1001:/# ceph osd tree
> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
> -7 176.11194 rack F3
> -6 176.11194 host moss-be1003
> 2 hdd 7.33800 osd.2 up 1.00000 1.00000
> 3 hdd 7.33800 osd.3 up 1.00000 1.00000
> 6 hdd 7.33800 osd.6 up 1.00000 1.00000
> 9 hdd 7.33800 osd.9 up 1.00000 1.00000
> 12 hdd 7.33800 osd.12 up 1.00000 1.00000
> 13 hdd 7.33800 osd.13 up 1.00000 1.00000
> 16 hdd 7.33800 osd.16 up 1.00000 1.00000
> 19 hdd 7.33800 osd.19 up 1.00000 1.00000
Yep. Your racks and thus hosts and OSDs aren’t under the `default` or any
other root, so they won’t get picked by any CRUSH rule.
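You can see that from the rule itself, e.g. (assuming your pools are still on
the stock replicated rule):

    ceph osd crush rule dump replicated_rule

The `step take` item in there will be `default`, a root none of your racks
currently sit under, so the rule has nothing to map PGs onto.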
>
>>>
>>> I passed this config to bootstrap with --config:
>>>
>>> [global]
>>> osd_crush_chooseleaf_type = 3
>> Why did you set that? 3 is an unusual value. AIUI most of the time the
>> only reason to change this option is if one is setting up a single-node
>> sandbox - and perhaps localpools create a rule using it. I suspect this is
>> at least part of your problem.
>
> I wanted to have rack as the failure domain rather than host, i.e. to ensure
> that each replica goes in a different rack (academic at the moment as I have
> 3 hosts, one in each rack, but important for future expansion).
You do that with the CRUSH rule, not with osd_crush_chooseleaf_type. Set that
back to the default value of `1`. This option is marked `dev` for a reason ;)
And the replication rule:
    rule replicated_rule {
        id 0
        type replicated
        step take default
        step chooseleaf firstn 0 type rack   # `rack` here is what selects the failure domain
        step emit
    }
> I could presumably fix this up by editing the crushmap (to put the racks into
> the default bucket)
That would probably help:
`ceph osd crush move F3 root=default`
but I think you'd also need to revert `osd_crush_chooseleaf_type`. It might be
better to wipe and redeploy, so you know this behaviour won't resurface down
the road when you add or replace hardware.
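If you do fix it in place, the whole sequence would be roughly (rack names
taken from your `ceph osd tree` output; the last step assumes bootstrap landed
the option in the mon config database):

    ceph osd crush move B4 root=default
    ceph osd crush move F3 root=default
    # ...and the same for your third rack
    ceph config rm global osd_crush_chooseleaf_type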
>
>>> Once the cluster was up I used an osd spec file that looked like:
>>> service_type: osd
>>> service_id: rrd_single_NVMe
>>> placement:
>>>   label: "NVMe"
>>> spec:
>>>   data_devices:
>>>     rotational: 1
>>>   db_devices:
>>>     model: "NVMe"
>> Is it your intent to use spinners for payload data and SSD for metadata?
>
> Yes.
You might want to set `db_slots` accordingly; by default I think it'll be 1:1,
which probably isn't what you intend.
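For example, something along these lines (the 8 is purely illustrative, i.e.
eight HDDs sharing each NVMe device for their DBs):

    service_type: osd
    service_id: rrd_single_NVMe
    placement:
      label: "NVMe"
    spec:
      data_devices:
        rotational: 1
      db_devices:
        model: "NVMe"
      db_slots: 8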
>
> Regards,
>
> Matthew
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]