-> $ ceph balancer status
{
    "active": true,
    "last_optimize_duration": "0:00:00.001667",
    "last_optimize_started": "Mon Mar  2 04:08:35 2026",
    "mode": "upmap",
    "no_optimization_needed": true,
    "optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect",
    "plans": []
}

-> $ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         1.07428  root default
-3         0.34180      host podster1
 9    ssd  0.04880          osd.9          up   1.00000  1.00000
10    ssd  0.29300          osd.10         up   1.00000  1.00000
-7         0.39069      host podster2
 0    ssd  0.09769          osd.0          up   1.00000  1.00000
 4    ssd  0.29300          osd.4          up   1.00000  1.00000
-5         0.34180      host podster3
 1    ssd  0.04880          osd.1          up   1.00000  1.00000
 5    ssd  0.29300          osd.5          up   1.00000  1.00000

-> $ ceph osd dump | grep ratio
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85

I've changed upmap_max_deviation = 1
I also did:
-> $ ceph osd reweight osd.0 1.0
and ceased all clients' activity to FSes
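For the record, the deviation change mentioned above is a balancer-module mgr option; it was applied along these lines (a sketch, assuming the standard mgr config path):

```shell
# Tighten the upmap balancer's target: allow at most 1 PG of
# deviation per OSD (the default is 5). This is the change
# referred to above as "upmap_max_deviation = 1".
ceph config set mgr mgr/balancer/upmap_max_deviation 1

# Verify the setting took effect:
ceph config get mgr mgr/balancer/upmap_max_deviation
```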

I've extended the disk devices Ceph uses twice, because 'health' complained about 'backfillfull' like below - back then only the _pg_ part was present; the current state is now:

-> $ ceph health detail
HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull; 5 pool(s) backfillfull
[WRN] OSD_BACKFILLFULL: 1 backfillfull osd(s)
    osd.0 is backfill full
[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull
    pg 3.1b is active+remapped+backfill_toofull, acting [4,5,10]
    pg 3.21 is active+remapped+backfill_toofull, acting [4,5,10]
    pg 3.23 is active+remapped+backfill_toofull, acting [10,5,4]
[WRN] POOL_BACKFILLFULL: 5 pool(s) backfillfull
    pool '.mgr' is backfillfull
    pool 'cephfs.APKI.meta' is backfillfull
    pool 'cephfs.APKI.data' is backfillfull
    pool 'cephfs.MONERO.meta' is backfillfull
    pool 'cephfs.MONERO.data' is backfillfull

Each time I "extended" the disk devices, the cluster went on to use osd.0 and fill it up.
Yes everything is small in this cluster, it's a lab.
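For context, here is roughly how such an in-place extension of a BlueStore OSD is usually done after growing the backing device (the path and weight value below are illustrative, not a transcript of what I ran):

```shell
# After enlarging the underlying block device, let BlueStore
# claim the new space (run on the OSD host; path is illustrative):
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0

# Note: the CRUSH weight does NOT follow the new size automatically;
# if desired it can be raised to match (weight is in TiB, so
# 150 GiB ~= 0.1465):
ceph osd crush reweight osd.0 0.1465
```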

-> $ ceph -w
  cluster:
    id:     9f4f9dba-72c7-11f0-8052-525400519d29
    health: HEALTH_WARN
            1 backfillfull osd(s)
            Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull
            5 pool(s) backfillfull

  services:
    mon: 3 daemons, quorum podster3,podster2,podster1 (age 2d) [leader: podster3]
    mgr: podster1.qzojrl(active, since 2d), standbys: podster3.kyyolr
    mds: 2/2 daemons up, 2 standby
    osd: 6 osds: 6 up (since 12h), 6 in (since 2d); 3 remapped pgs

  data:
    volumes: 2/2 healthy
    pools:   5 pools, 289 pgs
    objects: 102.74k objects, 355 GiB
    usage:   1.1 TiB used, 523 GiB / 1.6 TiB avail
    pgs:     737/308229 objects misplaced (0.239%)
             286 active+clean
             3   active+remapped+backfill_toofull


2026-03-02T04:10:00.000125+0000 mon.podster3 [WRN] overall HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull; 5 pool(s) backfillfull
2026-03-02T04:20:00.000091+0000 mon.podster3 [WRN] overall HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull; 5 pool(s) backfillfull
2026-03-02T04:26:38.174003+0000 mon.podster3 [WRN] Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)
2026-03-02T04:26:44.210790+0000 mon.podster3 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg peering)
2026-03-02T04:30:00.000140+0000 mon.podster3 [WRN] overall HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull; 5 pool(s) backfillfull
2026-03-02T04:40:00.000118+0000 mon.podster3 [WRN] overall HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull; 5 pool(s) backfillfull
2026-03-02T04:50:00.000114+0000 mon.podster3 [WRN] overall HEALTH_WARN 1 backfillfull osd(s); Low space hindering backfill (add storage if this doesn't resolve itself): 3 pgs backfill_toofull; 5 pool(s) backfillfull

If the cluster is doing something good - something it should be doing to heal - it's barely happening, it's so slow - and that's with no client activity now plus, as you said, storage capacities that are minute in comparison to anything production.
7 hours later, still:
-> $ ceph osd df tree | egrep '(osd.0|ID)'
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL   %USE   VAR   PGS  STATUS  TYPE NAME
 0    ssd  0.09769   1.00000  150 GiB  140 GiB   88 GiB  1.0 MiB  2.0 GiB  10 GiB  93.29  1.37   72      up          osd.0

Now, 5 hours later - since I started this draft - I ran:
-> $ ceph osd reweight-by-utilization
Before that command above, I had also noticed:
-> $ ceph config get mgr mgr/balancer/begin_weekday
0
-> $ ceph config get mgr mgr/balancer/end_weekday
0
which was set by the deployment process (cephadm bootstrap) and made me wonder:
does that mean that auto-rebalance runs only on Sunday?
I changed: end_weekday = 6
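For completeness, the weekday window is a balancer-module mgr option, so the change above amounts to (my understanding of the semantics - 0 meaning Sunday - is an assumption worth double-checking against the balancer docs):

```shell
# Inspect the balancer's scheduling window; both were 0 here,
# as set at bootstrap time:
ceph config get mgr mgr/balancer/begin_weekday
ceph config get mgr mgr/balancer/end_weekday

# Widen the window so the balancer may run any day of the week:
ceph config set mgr mgr/balancer/end_weekday 6
```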

_reweight-by-utilization_, I notice, changed REWEIGHT for osd.0, and that did something, I think: _active+remapped+backfill_toofull_ is now gone from the 'pgs' part of the health report.
RAW USE & DATA are down, but still:
-> $ ceph osd df tree | egrep '(osd.0|ID)'
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL   %USE   VAR   PGS  STATUS  TYPE NAME
 0    ssd  0.09769   0.90002  150 GiB  134 GiB   82 GiB  1.1 MiB  2.0 GiB  16 GiB  89.51  1.31   68      up          osd.0

and when compared to other hosts' OSDs, which use "identical" disk drives:
-> $ ceph osd df tree | egrep '(osd\.[0,1,9] |ID)'
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL   %USE   VAR   PGS  STATUS  TYPE NAME
 9    ssd  0.04880   1.00000  150 GiB   52 GiB   51 GiB  624 KiB  1.4 GiB  98 GiB  34.83  0.51   43      up          osd.9
 0    ssd  0.09769   0.90002  150 GiB  134 GiB   82 GiB  1.1 MiB  2.0 GiB  16 GiB  89.51  1.31   68      up          osd.0
 1    ssd  0.04880   1.00000  150 GiB   53 GiB   53 GiB  526 KiB  254 MiB  97 GiB  35.40  0.52   44      up          osd.1

Perhaps the cluster only goes as far as making _backfill_toofull_ go away, and then "gives up"?
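One way to check whether the balancer still sees room for improvement, rather than having "given up", is to ask it for a plan directly; a sketch ("myplan" is an arbitrary name):

```shell
# Score the current distribution (lower is better; 0 is perfect):
ceph balancer eval

# Manual plans require automatic balancing to be off:
ceph balancer off

# Compute a plan and show what it would do, without executing it:
ceph balancer optimize myplan
ceph balancer show myplan

# Execute it with `ceph balancer execute myplan`, or discard it:
ceph balancer rm myplan
ceph balancer on
```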
The "other" disk-drives:
-> $ ceph osd df tree | egrep '(osd\.(5|4|10)\  |ID)'
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
10    ssd  0.29300   1.00000  400 GiB  307 GiB  305 GiB  2.0 MiB  2.1 GiB   93 GiB  76.68  1.12  246      up          osd.10
 4    ssd  0.29300   1.00000  400 GiB  275 GiB  273 GiB  3.2 MiB  2.1 GiB  125 GiB  68.83  1.01  221      up          osd.4
 5    ssd  0.29300   1.00000  400 GiB  305 GiB  303 GiB  3.0 MiB  2.5 GiB   95 GiB  76.27  1.12  245      up          osd.5

It seems that _host podster2_ balances its OSDs 4 & 0 "differently" from what the other two hosts do - if so, then why?
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
