Hi,

I have a problem with Ceph 17.2.6: CephFS with MDS daemons is running, but I
am seeing some unusual behavior.

I created a data pool with the default CRUSH rule, but the data is stored on
only 3 specific OSDs while the other OSDs stay empty. PG auto-scaling is also
enabled, but pg_num does not grow as the pool gets bigger. I increased pg_num
manually, but that did not solve the problem and I got a warning that PGs are
not balanced across the OSDs.

How do I solve this problem? Is this a bug? I did not have this problem in
previous versions.
I solved this problem. There were several identical CRUSH rules in the CRUSH
map, all ending in

step chooseleaf firstn 0 type host

I think this confused the balancer and the autoscaler, and the output of
"ceph osd pool autoscale-status" was empty. After removing the extra CRUSH
rules, the autoscaler started working.
However, moving data from the full OSDs to the empty ones is slow. To reduce
the weight of the filled OSDs and push data toward the other OSDs, I ran
"ceph osd reweight-by-utilization"; I hope this works.

Is there a solution that makes autoscaling and the backfilling of placement
groups faster?
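Two knobs that should help here on Quincy are the mClock recovery profile and
the upmap balancer; a sketch, assuming default settings:

# give recovery/backfill a larger share of OSD IO (Quincy schedules with mClock)
ceph config set osd osd_mclock_profile high_recovery_ops

# let the balancer move PGs with upmap instead of reweights
ceph balancer mode upmap
ceph balancer on
ceph balancer status

# remove the override again once backfill has finished
ceph config rm osd osd_mclock_profile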
-------------------------------
[root@opcsdfpsbpp0201 ~]# ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "r3-host",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -2,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 2,
        "rule_name": "r3",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -2,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
------------------------------------
# ceph osd status | grep back
23  opcsdfpsbpp0211  1900G  147G  0  0  0  0  backfillfull,exists,up
48  opcsdfpsbpp0201  1900G  147G  0  0  0  0  backfillfull,exists,up
61  opcsdfpsbpp0205  1900G  147G  0  0  0  0  backfillfull,exists,up
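Side note on the three backfillfull OSDs above: an OSD over the backfillfull
ratio refuses to accept new backfill, which is what the HEALTH_WARN below
refers to; since these three are the ones being drained, the warning should
clear as data moves off them. The thresholds can be checked, and raised a
little if backfill gets stuck and there is real free space (the value here is
only an example):

# current nearfull / backfillfull / full thresholds
ceph osd dump | grep ratio

# only if backfill is stuck: raise the backfillfull threshold slightly (example value)
ceph osd set-backfillfull-ratio 0.92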
----------------------------------------------
Every 2.0s: ceph -s                       opcsdfpsbpp0201: Sun Jun 18 11:44:29 2023

  cluster:
    id:     79a2627c-0821-11ee-a494-00505695c58c
    health: HEALTH_WARN
            3 backfillfull osd(s)
            6 pool(s) backfillfull

  services:
    mon: 3 daemons, quorum opcsdfpsbpp0201,opcsdfpsbpp0205,opcsdfpsbpp0203 (age 6d)
    mgr: opcsdfpsbpp0201.vttwxa(active, since 5d), standbys: opcsdfpsbpp0205.tpodbs, opcsdfpsbpp0203.jwjkcl
    mds: 1/1 daemons up, 2 standby
    osd: 74 osds: 74 up (since 7d), 74 in (since 7d); 107 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   6 pools, 359 pgs
    objects: 599.64k objects, 2.2 TiB
    usage:   8.1 TiB used, 140 TiB / 148 TiB avail
    pgs:     923085/1798926 objects misplaced (51.313%)
             252 active+clean
             87  active+remapped+backfill_wait
             20  active+remapped+backfilling

  io:
    client:   255 B/s rd, 0 op/s rd, 0 op/s wr
    recovery: 33 MiB/s, 8 objects/s

  progress:
    Global Recovery Event (5h)
      [===================.........] (remaining: 2h)