Dear All
We created an erasure-coded pool with k=4, m=2 and failure-domain=host, but we
have only 6 OSD nodes.
Is it correct that recovery will be blocked by the CRUSH rule if a node is
down?
After rebooting all nodes we noticed that recovery was slow (about half an
hour), even though all pools are currently empty (new install).
This is odd...
Could it be related to k+m being equal to the number of nodes (4+2=6)?
Note that "step set_choose_tries 100" was already present in the EC CRUSH rule:
rule ewos1-prod_cinder_ec {
        id 2
        type erasure
        min_size 3
        max_size 6
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default class nvme
        step chooseleaf indep 0 type host
        step emit
}
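As a cross-check (a sketch; it assumes rule id 2 is the rule above and that the pool size is 6, i.e. k+m), the rule's placements can be tested offline with crushtool. With only 6 hosts and one of them down, this should report bad mappings, since CRUSH cannot find 6 distinct hosts:

```shell
# Extract the cluster's current CRUSH map to a file, then simulate
# placements for rule 2 with 6 shards. --show-bad-mappings prints every
# input for which CRUSH failed to pick 6 distinct hosts.
ceph osd getcrushmap -o /tmp/crushmap.bin
crushtool -i /tmp/crushmap.bin --test --rule 2 --num-rep 6 --show-bad-mappings
```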
ceph osd erasure-code-profile set ec42 k=4 m=2 crush-root=default \
    crush-failure-domain=host crush-device-class=nvme
ceph osd pool create ewos1-prod_cinder_ec 256 256 erasure ec42
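For completeness, the resulting profile and the pool's settings can be inspected like this (the pool's min_size is the value that gates I/O, not the min_size in the CRUSH rule; for EC pools it is commonly k+1, i.e. 5 here):

```shell
# Show the ec42 profile as the cluster stored it.
ceph osd erasure-code-profile get ec42
# Show which profile the pool uses and its effective min_size.
ceph osd pool get ewos1-prod_cinder_ec erasure_code_profile
ceph osd pool get ewos1-prod_cinder_ec min_size
```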
ceph version 12.2.10-543-gfc6f0c7299 (fc6f0c7299e3442e8a0ab83260849a6249ce7b5f) luminous (stable)
  cluster:
    id:     b5e30221-a214-353c-b66b-8c37b4349123
    health: HEALTH_WARN
            noout flag(s) set
            Reduced data availability: 125 pgs inactive, 32 pgs peering

  services:
    mon: 3 daemons, quorum ewos1-osd1-prod,ewos1-osd3-prod,ewos1-osd5-prod
    mgr: ewos1-osd5-prod(active), standbys: ewos1-osd3-prod, ewos1-osd1-prod
    osd: 24 osds: 24 up, 24 in
         flags noout

  data:
    pools:   4 pools, 1600 pgs
    objects: 0 objects, 0B
    usage:   24.3GiB used, 43.6TiB / 43.7TiB avail
    pgs:     7.812% pgs not active
             1475 active+clean
             93   activating
             32   peering
Which k and m values are preferred on 6 nodes?
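For comparison (hypothetical candidate profiles, not a recommendation), the trade-off on 6 hosts can be worked out: "spare hosts" below means hosts left over after placing k+m shards on distinct hosts, i.e. hosts available for re-mapping shards after a host failure. With k+m equal to the host count there is no spare, so full redundancy cannot be restored while a host is down.

```python
# Compare erasure-code profiles on a 6-host cluster with failure-domain=host.
HOSTS = 6

def profile_stats(k, m, hosts=HOSTS):
    shards = k + m
    overhead = shards / k      # raw capacity used per unit of usable capacity
    spare = hosts - shards     # hosts free to receive re-mapped shards
    return {"k": k, "m": m, "overhead": round(overhead, 2), "spare_hosts": spare}

for k, m in [(4, 2), (3, 2), (2, 2)]:
    print(profile_stats(k, m))
```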
By the way, we plan to use this EC pool as a second RBD pool in OpenStack; the
main (first) RBD pool is replicated with size=3, and the cluster is NVMe SSD only.
Thanks for your help!
Best Regards
Francois Scheurer
