Hi,

I have a problem with Ceph 17.2.6: CephFS with MDS daemons is running, but I
am seeing some unusual behavior.

I created a data pool with the default CRUSH rule, but the data is stored on
only 3 specific OSDs while the other OSDs stay empty. PG auto-scaling is also
enabled, but pg_num does not grow as the pool gets bigger. I increased pg_num
manually, but that did not solve the problem and I got a warning that PGs are
not balanced across the OSDs.

How do I solve this problem? Is this a bug? I did not have this problem in
previous versions.
I solved this problem. There were several identical CRUSH rules in the CRUSH
map, all ending in

step chooseleaf firstn 0 type host

I think this confused the balancer and the autoscaler, and the output of
"ceph osd pool autoscale-status" was empty. After removing the extra CRUSH
rules, the autoscaler started working.
However, moving data from the full OSDs to the empty ones is slow. To reduce
the weight of the filled OSDs and push data toward the other OSDs, I ran
"ceph osd reweight-by-utilization"; I hope this works.

Is there a solution that makes autoscaling and the backfilling of placement
groups faster?
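Two knobs that should help here on Quincy are the mClock recovery profile and
the upmap balancer; a sketch, assuming default settings:

# give recovery/backfill a larger share of OSD IO (Quincy schedules with mClock)
ceph config set osd osd_mclock_profile high_recovery_ops

# let the balancer move PGs with upmap instead of reweights
ceph balancer mode upmap
ceph balancer on
ceph balancer status

# remove the override again once backfill has finished
ceph config rm osd osd_mclock_profile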
-------------------------------
[root@opcsdfpsbpp0201 ~]# ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "r3-host",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -2,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 2,
        "rule_name": "r3",
        "type": 1,
        "steps": [
            {
                "op": "take",
                "item": -2,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
------------------------------------
# ceph osd status | grep back
23  opcsdfpsbpp0211  1900G  147G  0  0  0  0  backfillfull,exists,up
48  opcsdfpsbpp0201  1900G  147G  0  0  0  0  backfillfull,exists,up
61  opcsdfpsbpp0205  1900G  147G  0  0  0  0  backfillfull,exists,up
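Side note on the three backfillfull OSDs above: an OSD over the backfillfull
ratio refuses to accept new backfill, which is what the HEALTH_WARN below
refers to; since these three are the ones being drained, the warning should
clear as data moves off them. The thresholds can be checked, and raised a
little if backfill gets stuck and there is real free space (the value here is
only an example):

# current nearfull / backfillfull / full thresholds
ceph osd dump | grep ratio

# only if backfill is stuck: raise the backfillfull threshold slightly (example value)
ceph osd set-backfillfull-ratio 0.92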
----------------------------------------------
Every 2.0s: ceph -s                       opcsdfpsbpp0201: Sun Jun 18 11:44:29 2023

  cluster:
    id:     79a2627c-0821-11ee-a494-00505695c58c
    health: HEALTH_WARN
            3 backfillfull osd(s)
            6 pool(s) backfillfull

  services:
    mon: 3 daemons, quorum opcsdfpsbpp0201,opcsdfpsbpp0205,opcsdfpsbpp0203 (age 6d)
    mgr: opcsdfpsbpp0201.vttwxa(active, since 5d), standbys: opcsdfpsbpp0205.tpodbs, opcsdfpsbpp0203.jwjkcl
    mds: 1/1 daemons up, 2 standby
    osd: 74 osds: 74 up (since 7d), 74 in (since 7d); 107 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   6 pools, 359 pgs
    objects: 599.64k objects, 2.2 TiB
    usage:   8.1 TiB used, 140 TiB / 148 TiB avail
    pgs:     923085/1798926 objects misplaced (51.313%)
             252 active+clean
             87  active+remapped+backfill_wait
             20  active+remapped+backfilling

  io:
    client:   255 B/s rd, 0 op/s rd, 0 op/s wr
    recovery: 33 MiB/s, 8 objects/s

  progress:
    Global Recovery Event (5h)
      [===================.........] (remaining: 2h)