Hi Eugen,
I'm afraid the description of your observation is a bit unclear about the order
of events, and this might be the reason for the few replies. To put a bit more
structure on what exactly happened when, let me lay out what I did and did not
understand:
Before adding the hosts you have situation
1)
  default
    DCA
      host A1 ... AN
    DCB
      host B1 ... BM
Now you add the K+L new hosts; they go into the default root, and we have situation
2)
  default
    host C1 ... CK, D1 ... DL
    DCA
      host A1 ... AN
    DCB
      host B1 ... BM
As a last step, you move the hosts to their final locations and we arrive at
situation
3)
  default
    DCA
      host A1 ... AN, C1 ... CK
    DCB
      host B1 ... BM, D1 ... DL
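If I understand the procedure correctly, that 2->3 step would be done with
something like the commands below. This is only a sketch: the bucket type
"datacenter" and the bucket/host names are assumptions on my side and have to
match your actual CRUSH tree:

  # move a new host out of the default root into its target datacenter bucket
  ceph osd crush move C1 root=default datacenter=DCA
  ceph osd crush move D1 root=default datacenter=DCB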
Please correct me if this is wrong. Assuming it is correct, I conclude the following.
Now, from your description it is not clear to me on which of the two transitions,
1->2 or 2->3, you observe
- peering and/or
- unknown PGs.
We use a somewhat similar procedure, except that we have a second root (a
separate, disjoint tree) for new hosts/OSDs; a small sketch of that variant
follows the two transitions below. In terms of peering it is the same, and if
everything is configured correctly this is what I would expect to happen (it is
what happens when we add OSDs/hosts):
- transition 1->2, hosts get added: no peering, no remapped objects, nothing,
  just new OSDs doing nothing
- transition 2->3, hosts get moved: peering starts and remapped objects appear,
  and eventually all PGs are active+clean again
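For completeness, a minimal sketch of the staging-root variant mentioned above
(the root name "staging" and the host names are just examples, not taken from
your cluster):

  # create a separate root that no CRUSH rule references; OSDs under it hold no data
  ceph osd crush add-bucket staging root
  # new hosts go under the staging root first (like 1->2: no peering, no remapping)
  ceph osd crush move C1 root=staging
  # moving a host to its final location triggers peering/remapping (like 2->3)
  ceph osd crush move C1 root=default datacenter=DCA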
Unknown PGs should not occur (except maybe temporarily when the primary changes
or a PG is slow to respond/report its status??). The crush bug with too few
set_choose_tries shows up when one has *just enough hosts* for the EC profile;
it should not show up when all PGs are active+clean and one *adds hosts*.
Persistent unknown PGs (to my understanding, unknown means "has no reporting
primary", correct?) can only occur if the number of PGs changes (autoscaler
messing around??), because all PGs were active+clean before. The crush bug
leads to incomplete PGs, so PGs can go incomplete, but they should always have
an acting primary.
This is assuming no OSDs went down/out during the process.
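If you want to rule the set_choose_tries issue in or out, you can test the EC
rule offline against the current map with crushtool. Again just a sketch, the
rule id and --num-rep have to match your EC rule and profile (I simply assume
rule 1 and k+m=10 here):

  # extract the current crush map and test the EC rule for bad mappings
  ceph osd getcrushmap -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 1 --num-rep 10 --show-bad-mappings
  # no output means crush found a complete OSD set for every tested PG input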
Can you please check whether my interpretation is correct and describe at which
step exactly things start to diverge from my expectations?
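To pin this down, the output of the following at the point where things go
wrong would help a lot (the commands should exist on any recent release, but
please double-check against your version):

  ceph osd tree                    # actual bucket layout at each step
  ceph osd pool autoscale-status   # did the autoscaler change pg_num?
  ceph pg ls unknown               # which PGs are stuck unknown
  ceph health detail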
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Eugen Block <[email protected]>
Sent: Thursday, May 23, 2024 12:05 PM
To: [email protected]
Subject: [ceph-users] Re: unknown PGs after adding hosts in different subtree
Hi again,
I'm still wondering if I misunderstand some of the ceph concepts.
Let's assume the choose_tries value is too low and ceph can't find
enough OSDs for the remapping. I would expect that there are some PG
chunks in remapping state or unknown or whatever, but why would it
affect the otherwise healthy cluster in such a way?
Even if ceph doesn't know where to put some of the chunks, I wouldn't
expect inactive PGs and a service interruption.
What am I missing here?
Thanks,
Eugen
Zitat von Eugen Block <[email protected]>:
> Thanks, Konstantin.
> It's been a while since I was last bitten by the choose_tries being
> too low... Unfortunately, I won't be able to verify that... But I'll
> definitely keep that in mind, or at least I'll try to. :-D
>
> Thanks!
>
> Zitat von Konstantin Shalygin <[email protected]>:
>
>> Hi Eugen
>>
>>> On 21 May 2024, at 15:26, Eugen Block <[email protected]> wrote:
>>>
>>> step set_choose_tries 100
>>
>> I think you should try to increase set_choose_tries to 200.
>> Last year we had a Pacific EC 8+2 deployment of 10 racks, and even
>> with 50 hosts the value of 100 did not work for us.
>>
>>
>> k
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]