Good morning folks,

As a newbie to Ceph, yesterday was the first time I configured my CRUSH map, added a CRUSH rule, and created my first pool using this rule.
Since then I get the status HEALTH_WARN with the following output:
~~~
$ sudo ceph status
cluster:
id: 47c108bd-db66-4197-96df-cadde9e9eb45
health: HEALTH_WARN
Degraded data redundancy: 128 pgs undersized
1 pools have pg_num > pgp_num
services:
mon: 3 daemons, quorum ccp-tcnm01,ccp-tcnm02,ccp-tcnm03
mgr: ccp-tcnm01(active), standbys: ccp-tcnm03, ccp-tcnm02
osd: 3 osds: 3 up, 3 in
data:
pools: 1 pools, 128 pgs
objects: 0 objects, 0 bytes
usage: 3088 MB used, 3068 GB / 3071 GB avail
pgs: 128 active+undersized
~~~
The pool was created by running `sudo ceph osd pool create joergsfirstpool 128 replicated replicate_datacenter`.
I figured out that I forgot to set the value for pgp_num accordingly, so I did that by running `sudo ceph osd pool set joergsfirstpool pgp_num 128`. As you can see in the following output, 15 PGs were remapped, but 113 still remain in active+undersized.
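For reference, these are the commands I am using to double-check the pool's current settings (plain `ceph osd pool get` calls, output omitted here):
~~~
# check placement group counts, replica size and the CRUSH rule of the pool
$ sudo ceph osd pool get joergsfirstpool pg_num
$ sudo ceph osd pool get joergsfirstpool pgp_num
$ sudo ceph osd pool get joergsfirstpool size
$ sudo ceph osd pool get joergsfirstpool crush_rule
~~~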
~~~
$ sudo ceph status
cluster:
id: 47c108bd-db66-4197-96df-cadde9e9eb45
health: HEALTH_WARN
Degraded data redundancy: 113 pgs undersized
services:
mon: 3 daemons, quorum ccp-tcnm01,ccp-tcnm02,ccp-tcnm03
mgr: ccp-tcnm01(active), standbys: ccp-tcnm03, ccp-tcnm02
osd: 3 osds: 3 up, 3 in; 15 remapped pgs
data:
pools: 1 pools, 128 pgs
objects: 0 objects, 0 bytes
usage: 3089 MB used, 3068 GB / 3071 GB avail
pgs: 113 active+undersized
15 active+clean+remapped
~~~
My questions are:
1. What does active+undersized actually mean? I did not find anything
about it in the documentation on docs.ceph.com.
2. Why were only 15 PGs remapped after I corrected the mistake with the wrong pgp_num value?
3. What's wrong here, and what do I have to do to get the cluster back to active+clean again? (The commands I have been using to inspect the stuck PGs are listed right after these questions.)
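In case it matters, this is what I have been running so far to look at the undersized PGs (the PG id in the last command is just an example):
~~~
# show the detailed health warning
$ sudo ceph health detail
# list all PGs that are currently in the undersized state
$ sudo ceph pg ls undersized
# show which OSDs a single PG maps to, e.g. PG 1.0
$ sudo ceph pg map 1.0
~~~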
For further information, you can find my current CRUSH map below:
~~~
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host ccp-tcnm01 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
# weight 1.000
alg straw2
hash 0 # rjenkins1
item osd.1 weight 1.000
}
host ccp-tcnm03 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
# weight 1.000
alg straw2
hash 0 # rjenkins1
item osd.2 weight 1.000
}
datacenter dc1 {
id -9 # do not change unnecessarily
id -12 class hdd # do not change unnecessarily
# weight 2.000
alg straw2
hash 0 # rjenkins1
item ccp-tcnm01 weight 1.000
item ccp-tcnm03 weight 1.000
}
host ccp-tcnm02 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
# weight 1.000
alg straw2
hash 0 # rjenkins1
item osd.0 weight 1.000
}
datacenter dc3 {
id -10 # do not change unnecessarily
id -11 class hdd # do not change unnecessarily
# weight 1.000
alg straw2
hash 0 # rjenkins1
item ccp-tcnm02 weight 1.000
}
root default {
id -1 # do not change unnecessarily
id -2 class hdd # do not change unnecessarily
# weight 3.000
alg straw2
hash 0 # rjenkins1
item dc1 weight 2.000
item dc3 weight 1.000
}
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
rule replicate_datacenter {
id 1
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type datacenter
step emit
}
# end crush map
~~~
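In case it helps, this is how I have been testing the replicate_datacenter rule offline with crushtool (the file names are just what I use locally):
~~~
# export and decompile the cluster's current CRUSH map
$ sudo ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt
# simulate which OSDs rule 1 (replicate_datacenter) would select for 3 replicas
$ crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings
~~~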
Best regards,
Joerg
