Hi,
A few weeks ago we started using the PG autoscaler on our pools.
We are running version 16.2.7.
It may be a coincidence, but over the same period we have started to see
mgr progress module failures:
“””
[root@naret-monitor01 ~]# ceph -s
  cluster:
    id:     63334166-d991-11eb-99de-40a6b72108d0
    health: HEALTH_ERR
            Module 'progress' has failed: ('346ee7e0-35f0-4fdf-960e-a36e7e2441e4',)
            1 pool(s) full

  services:
    mon: 3 daemons, quorum naret-monitor01,naret-monitor02,naret-monitor03 (age 5d)
    mgr: naret-monitor02.ciqvgv(active, since 6d), standbys: naret-monitor03.escwyg, naret-monitor01.suwugf
    mds: 1/1 daemons up, 2 standby
    osd: 760 osds: 760 up (since 4d), 760 in (since 4d); 10 remapped pgs
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   32 pools, 6250 pgs
    objects: 977.79M objects, 3.6 PiB
    usage:   5.7 PiB used, 5.1 PiB / 11 PiB avail
    pgs:     4602612/5990777501 objects misplaced (0.077%)
             6214 active+clean
             25   active+clean+scrubbing+deep
             10   active+remapped+backfilling
             1    active+clean+scrubbing

  io:
    client:   243 MiB/s rd, 292 MiB/s wr, 1.68k op/s rd, 842 op/s wr
    recovery: 430 MiB/s, 109 objects/s

  progress:
    Global Recovery Event (14h)
      [===========================.] (remaining: 70s)
“””
In the mgr logs I see:
“””
debug 2022-10-20T23:09:03.859+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 2 has overlapping roots: {-60, -1}
debug 2022-10-20T23:09:03.863+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 3 has overlapping roots: {-60, -1, -2}
debug 2022-10-20T23:09:03.866+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 5 has overlapping roots: {-60, -1, -2}
debug 2022-10-20T23:09:03.870+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-60, -1, -2}
debug 2022-10-20T23:09:03.873+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 10 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.877+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 11 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.880+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 12 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.884+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 13 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.887+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 14 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.891+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 15 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.894+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 26 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.898+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 28 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.901+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 29 has overlapping roots: {-105, -60, -1, -2}
debug 2022-10-20T23:09:03.905+0000 7fba5f300700 0 [pg_autoscaler ERROR root] pool 30 has overlapping roots: {-105, -60, -1, -2}
...
“””
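To make the pattern in these messages easier to see, here is a minimal Python sketch that groups the pools by the set of overlapping root IDs they report. It only parses the log text; it does not query the cluster. The function name group_pools_by_roots and the SAMPLE string are my own illustration, and the sample holds just a few of the lines quoted above (including one that was wrapped in the log).

```python
import re
from collections import defaultdict

# Matches the pg_autoscaler message text. [^}]* also matches newlines,
# so entries that were wrapped across two lines are still captured.
PATTERN = re.compile(r"pool (\d+) has overlapping roots: \{([^}]*)\}")


def group_pools_by_roots(log_text):
    """Map each set of overlapping CRUSH root IDs to the pools reporting it.

    Hypothetical helper for illustration; not part of Ceph.
    """
    groups = defaultdict(set)
    for pool_id, roots in PATTERN.findall(log_text):
        # int() ignores surrounding whitespace, including line breaks.
        root_ids = frozenset(int(r) for r in roots.split(","))
        groups[root_ids].add(int(pool_id))
    return {roots: sorted(pools) for roots, pools in groups.items()}


# A few lines copied from the mgr log excerpt above (not the full log).
SAMPLE = """\
debug 2022-10-20T23:09:03.859+0000 7fba5f300700 0 [pg_autoscaler ERROR root]
pool 2 has overlapping roots: {-60, -1}
debug 2022-10-20T23:09:03.863+0000 7fba5f300700 0 [pg_autoscaler ERROR root]
pool 3 has overlapping roots: {-60, -1, -2}
debug 2022-10-20T23:09:03.873+0000 7fba5f300700 0 [pg_autoscaler ERROR root]
pool 10 has overlapping roots: {-105, -60,
-1, -2}
debug 2022-10-20T23:09:03.877+0000 7fba5f300700 0 [pg_autoscaler ERROR root]
pool 11 has overlapping roots: {-105, -60, -1, -2}
"""

if __name__ == "__main__":
    for roots, pools in group_pools_by_roots(SAMPLE).items():
        print(sorted(roots), "->", pools)
```

Run against the full mgr log, this would show at a glance which pools share the same combination of roots.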
Do you have any explanation or fix for these errors?
Regards,
Giuseppe