On Wed, Feb 21, 2018 at 2:46 PM Oliver Freyermuth <[email protected]> wrote:
> Dear Cephalopodians,
>
> in a Luminous 12.2.3 cluster with a pool with:
> - 192 Bluestore OSDs total
> - 6 hosts (32 OSDs per host)
> - 2048 total PGs
> - EC profile k=4, m=2
> - CRUSH failure domain = host
> which results in 2048*6/192 = 64 PGs per OSD on average, I run into issues
> with PG overdose protection.
>
> In case I reinstall one OSD host (zapping all disks), and recreate the
> OSDs one by one with ceph-volume,
> they will usually come back "slowly", i.e. one after the other.
>
> This means the first OSD will initially be assigned all 2048 PGs (to
> fulfill the "failure domain host" requirement),
> thus breaking through the default osd_max_pg_per_osd_hard_ratio of 2.
> We also use the default mon_max_pg_per_osd, i.e. 200.
>
> This appears to cause the previously active (but of course
> undersized+degraded) PGs to enter an "activating+remapped" state,
> and hence they become unavailable.
> Thus, data availability is reduced. All this is caused by adding an OSD!
>
> Of course, as more and more OSDs are added until all 32 are back online,
> this situation is relaxed.
> Still, I observe that some PGs get stuck in this "activating" state, and
> can't seem to figure out from logs or by dumping them
> what's the actual reason. Waiting does not help, PGs stay "activating",
> data stays inaccessible.
>

Can you upload logs from each of the OSDs that are (and should be, but
aren't) involved with one of the PGs this happens to? (ceph-post-file)
And create a ticket about it?

Once you have a good map, all the PGs should definitely activate themselves.
-Greg

> Waiting a bit and manually restarting the ceph-OSD-services on the
> reinstalled host seems to bring them back.
> Also, adjusting osd_max_pg_per_osd_hard_ratio to something large (e.g. 10)
> appears to prevent the issue.
>
> So my best guess is that this is related to PG overdose protection.
> Any ideas on how to best overcome this / similar observations?
>
> It would be nice to be able to reinstall an OSD host without temporarily
> making data unavailable,
> right now the only thing which comes to my mind is to effectively disable
> PG overdose protection.
>
> Cheers,
> Oliver
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
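
The arithmetic in the quoted report can be checked with a small sketch. It is a simplified model, not Ceph code. It assumes the hard cap is the product mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio, that every PG places exactly one shard on each of the 6 hosts (k+m = 6, failure domain = host), and that PGs spread evenly over the OSDs that are up on the rebuilt host. All function names are hypothetical.

```python
import math


def avg_pgs_per_osd(pg_num: int, shards_per_pg: int, num_osds: int) -> float:
    """Average PG shards per OSD in the healthy cluster."""
    return pg_num * shards_per_pg / num_osds


def hard_limit(mon_max_pg_per_osd: int = 200, hard_ratio: float = 2.0) -> float:
    """Assumed hard cap: mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio."""
    return mon_max_pg_per_osd * hard_ratio


def pgs_per_osd_on_host(pg_num: int, osds_up: int) -> float:
    """PGs per OSD on the rebuilt host, assuming one shard of every PG
    lands on this host and shards spread evenly over its OSDs."""
    return pg_num / osds_up


def min_osds_under_limit(pg_num: int, limit: float) -> int:
    """Smallest number of OSDs on the host that keeps each at or below the cap."""
    return math.ceil(pg_num / limit)


if __name__ == "__main__":
    pg_num = 2048
    print("average PGs per OSD:", avg_pgs_per_osd(pg_num, 6, 192))
    for ratio in (2.0, 10.0):
        limit = hard_limit(200, ratio)
        print(f"hard ratio {ratio}: cap {limit}, "
              f"first OSD alone holds {pgs_per_osd_on_host(pg_num, 1)}, "
              f"OSDs needed: {min_osds_under_limit(pg_num, limit)}")
```

Under these assumptions the default settings give a cap of 400 PGs per OSD, so the rebuilt host needs about 6 OSDs up before each one is below the cap. With a hard ratio of 10 the cap is 2000, which a single OSD holding 2048 PGs would still slightly exceed in this model until a second OSD joins.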
