Thank you very much for the detailed explanation. Will wait then; based on the 
current speed, about 5 more hours. Let's see.

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---------------------------------------------------

-----Original Message-----
From: Janne Johansson <icepic...@gmail.com>
Sent: Friday, October 14, 2022 5:26 PM
To: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
Cc: Ceph Users <ceph-users@ceph.io>
Subject: Re: [ceph-users] Low space hindering backfill and 2 backfillfull osd(s)


Den fre 14 okt. 2022 kl 12:10 skrev Szabo, Istvan (Agoda)
<istvan.sz...@agoda.com>:
> I've added 5 more nodes to my cluster and got this issue.
> HEALTH_WARN 2 backfillfull osd(s); 17 pool(s) backfillfull; Low space
> hindering backfill (add storage if this doesn't resolve itself): 4 pgs 
> backfill_toofull OSD_BACKFILLFULL 2 backfillfull osd(s)
>     osd.150 is backfill full
>     osd.178 is backfill full
>
> I read on the mailing list that I might need to increase the pg count on some 
> pools to get smaller PGs.
> I also read that I might need to reweight the mentioned full OSDs to 1.2 until 
> it's ok, then set them back.
> Which would be the best solution?


It is not unusual to see "backfill_toofull", especially if the reason for 
expanding was that space was getting tight.

When you add new drives, a lot of PGs need to move, not only from "old OSDs to 
new" but in all possible directions.
As an example, if you had 16 PGs and three hosts (A,B and C), the PGs would end 
up something like:

A 1,4,7,10,13,16
B 2,5,8,11,14
C 3,6,9,12,15
(5-6 PGs per host)

Then you add host D and E, now it should become something like:

A 1,6,11,16
B 2,7,12
C 3,8,13
D 4,9,14
E 5,10,15
(3-4 PGs per host)
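The two layouts above can be reproduced with a small round-robin toy model. Note this is only an illustration: Ceph's real CRUSH placement is pseudo-random, not round-robin, but the round-robin version gives exactly the lists shown.

```python
def assign(pgs, hosts):
    """Toy placement (not CRUSH): PG at index i goes to host i mod len(hosts)."""
    placement = {h: [] for h in hosts}
    for i, pg in enumerate(pgs):
        placement[hosts[i % len(hosts)]].append(pg)
    return placement

pgs = list(range(1, 17))  # 16 PGs, numbered 1..16

print(assign(pgs, list("ABC")))
# A: [1, 4, 7, 10, 13, 16], B: [2, 5, 8, 11, 14], C: [3, 6, 9, 12, 15]

print(assign(pgs, list("ABCDE")))
# A: [1, 6, 11, 16], B: [2, 7, 12], C: [3, 8, 13], D: [4, 9, 14], E: [5, 10, 15]
```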

From here we can see that A will keep PGs 1 and 16, B will keep PG 2, and C 
keeps PG 3, but more or less ALL the other PGs will be moving about.
D and E will of course get PGs because they are added, but A will also send PG 7 
to host B, B sends PG 8 to host C, and so on.
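Comparing the two toy layouts makes the keep/move split explicit. Again, this is the round-robin illustration, not CRUSH:

```python
def pg_to_host(pgs, hosts):
    # Toy placement (not CRUSH): map each PG to a host, round-robin.
    return {pg: hosts[i % len(hosts)] for i, pg in enumerate(pgs)}

pgs = list(range(1, 17))
old = pg_to_host(pgs, list("ABC"))     # before adding hosts D and E
new = pg_to_host(pgs, list("ABCDE"))   # after adding them

stay = [pg for pg in pgs if old[pg] == new[pg]]
move = [pg for pg in pgs if old[pg] != new[pg]]

print(stay)  # [1, 2, 3, 16] -- only these four PGs don't move
print(move)  # the other twelve all relocate, many between OLD hosts:
print(old[7], "->", new[7])  # A -> B
print(old[8], "->", new[8])  # B -> C
```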

If A,B and C are almost full and you add new OSDs (D and E), the cluster will 
try to schedule *all* the moves.

Of course, PGs 4, 5, 9, 10, 14 and 15 can just start copying at any time, since 
D and E are empty when they arrive, but the cluster will also ask A to send PG 7 
to B, and B will try to send PG 8 to C. If PG 7 would push B past the 
backfillfull limit, or if PG 8 would push host C past it, those moves are paused 
with the state backfill_toofull and the PGs are just left "misplaced"/"remapped".

In the meantime, the other moves are going to get handled, and sooner or later 
hosts B and C will have moved off enough data that PGs 7 and 8 can move to 
their correct places, though this may mean they are among the last to move.

The reality is not 100% as simple as this. The straw2 bucket placement 
algorithm tries to prevent some of this, there may be cases where two of the 
old hosts send PGs to each other, basically just swapping them around, and the 
fact that any PG is made up of EC k+m (or #replica) parts makes this 
explanation a bit too simple. But in broad terms, this is why you get "errors" 
when adding new empty drives. It is perfectly ok, and will fix itself as soon 
as the other moves have created enough space for the queued-toofull moves to 
be performed without driving an OSD over the limits.

--
May the most significant bit of your life be positive.

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io