On 23.02.2018 at 01:05, Gregory Farnum wrote:
> 
> 
> On Wed, Feb 21, 2018 at 2:46 PM Oliver Freyermuth
> <freyerm...@physik.uni-bonn.de> wrote:
> 
>     Dear Cephalopodians,
> 
>     in a Luminous 12.2.3 cluster with a pool with:
>     - 192 Bluestore OSDs total
>     - 6 hosts (32 OSDs per host)
>     - 2048 total PGs
>     - EC profile k=4, m=2
>     - CRUSH failure domain = host
>     which results in 2048*6/192 = 64 PGs per OSD on average,
>     I run into issues with PG overdose protection.
> 
>     When I reinstall one OSD host (zapping all disks) and recreate the OSDs
>     one by one with ceph-volume, they usually come back "slowly",
>     i.e. one after the other.
> 
>     This means the first OSD will initially be assigned all 2048 PGs
>     (to fulfill the "failure domain = host" requirement), thus breaking through
>     the default osd_max_pg_per_osd_hard_ratio of 2.
>     We also use the default mon_max_pg_per_osd, i.e. 200.
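>     If I understand the overdose protection correctly, the hard limit works out
>     to mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio = 200 * 2 = 400 PG
>     shards per OSD, so a single recreated OSD that is temporarily asked to hold
>     all 2048 PGs of the pool is far beyond it.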
> 
>     This appears to cause the previously active (but of course undersized+degraded)
>     PGs to enter an "activating+remapped" state, and hence they become unavailable.
>     Thus, data availability is reduced. All this is caused by adding an OSD!
> 
>     Of course, as more and more OSDs are added until all 32 are back online,
>     the situation relaxes.
>     Still, I observe that some PGs get stuck in this "activating" state, and I
>     can't figure out the actual reason from the logs or by dumping the PGs.
>     Waiting does not help: the PGs stay "activating" and the data stays inaccessible.
> 
> 
> Can you upload logs from each of the OSDs that are (and should be, but aren't)
> involved with one of the PGs this happens to? (ceph-post-file) And can you
> create a ticket about it?
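> 
> Something along these lines should give us what we need (the PG id, OSD id and
> log path below are just placeholders for the ones actually affected):
> 
>   # identify a stuck PG and the OSDs it maps (or should map) to
>   ceph pg dump_stuck inactive
>   ceph pg 2.1ab query
> 
>   # raise the log level on the involved OSDs, reproduce, then upload the logs
>   ceph tell osd.12 injectargs '--debug_osd 20 --debug_ms 1'
>   ceph-post-file /var/log/ceph/ceph-osd.12.log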

I'll reproduce the issue over the weekend and then capture the logs. At least I did
not see anything in them so far, but I am also not yet very used to reading them.

What I can already confirm for sure is that after setting
osd_max_pg_per_osd_hard_ratio = 32
in ceph.conf ([global] section) and deploying new OSD hosts with that, the problem
has vanished completely. I have already tested this with two machines.
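
For reference, this is the kind of snippet I mean; the exact value is somewhat
arbitrary, anything that lifts the limit well above the pool's 2048 PGs should do:

  [global]
  # assuming the hard limit is mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio,
  # this raises it to 200 * 32 = 6400 PGs per OSD, safely above the 2048 PGs a
  # single OSD may temporarily have to hold
  osd_max_pg_per_osd_hard_ratio = 32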

Cheers,
Oliver

> 
> Once you have a good map, all the PGs should definitely activate themselves.
> -Greg
> 
> 
>     Waiting a bit and manually restarting the ceph-osd services on the reinstalled
>     host seems to bring them back.
>     Also, adjusting osd_max_pg_per_osd_hard_ratio to something large (e.g. 10)
>     appears to prevent the issue.
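>     (On a running cluster, something like
>         ceph tell osd.* injectargs '--osd_max_pg_per_osd_hard_ratio 10'
>     should adjust this on the fly, if I am not mistaken; for OSDs that are only
>     being created now, the ceph.conf entry is what matters, and it also keeps
>     the change across restarts.)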
> 
>     So my best guess is that this is related to PG overdose protection.
>     Any ideas on how to best overcome this, or similar observations?
> 
>     It would be nice to be able to reinstall an OSD host without temporarily
>     making data unavailable; right now the only thing that comes to mind is to
>     effectively disable PG overdose protection.
> 
>     Cheers,
>             Oliver
> 
