We are using JJ’s ceph balancer to move PGs around.

There’s nothing visibly unusual about the balance of the osds relative to each 
other or the configured full ratio.

When MAX AVAIL hits 0 B, we iterate over each down-weighted OSD, bumping its 
crush weight up slightly until MAX AVAIL is calculated correctly again. 
This process lets us isolate the problem to specific OSDs. When we check a 
problematic OSD's configuration and reported stats, they look otherwise 
normal, and setting the OSD to out resolves the issue.
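The probing loop looks roughly like this (the OSD ids and the weight increment are placeholder examples, not our exact values):

```shell
# Rough sketch of the probing procedure: for each down-weighted OSD,
# nudge its crush weight up slightly, wait for pgmap stats to refresh,
# and check whether MAX AVAIL recovers in `ceph df`.
for osd in 1419 2111; do
    ceph osd crush reweight osd.${osd} 0.05
    sleep 30   # allow stats to refresh before re-checking
    ceph df | grep mf1fs_data
done
```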

I suppose we can switch back to the built-in balancer and see if the problem 
recurs.
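For reference, switching back to the built-in balancer would be along these lines (upmap mode assumed):

```shell
# Disable any external balancing first, then re-enable the mgr balancer
# module in upmap mode and confirm it is active.
ceph balancer mode upmap
ceph balancer on
ceph balancer status
```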

From: Anthony D'Atri <anthony.da...@gmail.com>
Date: Tuesday, 19 August 2025 at 12:04 pm
To: Justin Mammarella <justin.mammare...@unimelb.edu.au>
Cc: Ceph Users <ceph-users@ceph.io>
Subject: Re: [ceph-users] [EXT] Re: MAX_AVAIL becomes 0 bytes when setting osd 
crush weight to low value.
External email: Please exercise caution

Got it.  If you were using, say, a rack failure domain and had weighted down 
one or more racks such that there were no longer at least 9 racks with normal 
weight that might have been a factor.

The "max avail" figures are calculated based on the configured full ratio, 
relative to the single most-full OSD.  Is your balancer on?  Do you have any 
legacy reweight values that are < 1.000 ?  What does `ceph osd df | tail` show 
for std deviation?  Does `ceph osd df` show a wide spread in fullness such that 
some outlier OSD might be perturbing your results?
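For intuition, that computation can be sketched roughly as follows. This is a simplified illustration of the principle, not Ceph's actual get_rule_avail code, and the data layout (`size`, `used`, `crush_weight` fields) is hypothetical:

```python
def pool_max_avail(osds, full_ratio=0.95, k=9, m=3):
    """Simplified sketch: each OSD's headroom (space left before it hits
    full_ratio) is divided by the share of new data CRUSH would send to it;
    the pool's MAX AVAIL is bounded by the worst OSD, scaled by EC overhead.

    osds: list of dicts with 'size', 'used' (bytes) and 'crush_weight'.
    """
    total_weight = sum(o["crush_weight"] for o in osds)
    best = float("inf")
    for o in osds:
        share = o["crush_weight"] / total_weight  # fraction of new writes
        if share <= 0:
            continue  # an OSD with weight 0 receives no new data
        headroom = full_ratio * o["size"] - o["used"]
        best = min(best, max(headroom, 0.0) / share)
    return best * k / (k + m)  # usable capacity after EC overhead
```

The point of the sketch is that a single outlier OSD (very full, or with an oddly small weight relative to the data it still holds) can pull the whole pool's MAX AVAIL down.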

> On Aug 18, 2025, at 6:46 PM, Justin Mammarella 
> <justin.mammare...@unimelb.edu.au> wrote:
>
> Our failure domain is host.
> We currently have 46 hosts, 6 of them have osds that are weighted down to 
> near 0.
>
> And a correction to my original email, we are using EC 6 + 3
>
>
> From: Anthony D'Atri <anthony.da...@gmail.com>
> Date: Tuesday, 19 August 2025 at 2:14 am
> To: Justin Mammarella <justin.mammare...@unimelb.edu.au>
> Cc: Ceph Users <ceph-users@ceph.io>
> Subject: [EXT] Re: [ceph-users] MAX_AVAIL becomes 0 bytes when setting osd 
> crush weight to low value.
> External email: Please exercise caution
>
> How many failure domains do you have? The downweighted hosts, are they spread 
> across failure domains?
>
>> On Aug 18, 2025, at 10:28 AM, Justin Mammarella 
>> <justin.mammare...@unimelb.edu.au> wrote:
>>
>> Hello,
>>
>> We’re seeing the MAX_AVAIL value in ceph df instantaneously drop to 0 Bytes 
>> / 100% full when specific osds have their crush weight
>> set to low values.
>>
>> The osds are otherwise healthy, and ceph osd df does not show their 
>> utilization to be above 70%.
>>
>> ceph version 19.2.2
>>
>> CLASS   SIZE    AVAIL    USED    RAW USED  %RAW USED
>> mf1hdd  19 PiB  8.9 PiB  10 PiB  10 PiB    53.83
>>
>> to
>>
>> POOL        ID  PGS    STORED   OBJECTS  USED    %USED   MAX AVAIL
>> mf1fs_data  1   16384  6.8 PiB  2.51G    10 PiB  100.00  0 B
>>
>> We’re running a 9+3 EC pool.
>>
>> This cluster has 1139 osds / 46 host.
>>
>> We’re in the process of downsizing the cluster, and draining nodes via 
>> crush reweight is part of our normal operations.
>>
>> It happened once a few weeks ago, and we isolated the issue to the weight 
>> on a single osd. Now it’s happening during rebalance on multiple osds; at 
>> some point the movement of PGs triggers an edge case that causes the MAX 
>> AVAIL calculation to fail if the crush weight is too low.
>>
>> Example crush weights
>>
>>
>> 1418    mf1hdd      0.02000          osd.1418             up   1.00000  1.00000
>> 1419    mf1hdd      0.02000          osd.1419             up         0  1.00000
>> 2110    mf1hdd      0.02000          osd.2110             up   1.00000  1.00000
>> 2111    mf1hdd      0.02000          osd.2111             up         0  1.00000
>> 2112  nvmemeta      0.02000          osd.2112             up   1.00000  1.00000
>>
>> Any ideas before I file a bug report?
>>
>> Thank you
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io