Figure out which OSDs are too full:

    ceph osd df tree
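If you would rather filter that output than read it by eye, here is a minimal sketch. It assumes that `ceph osd df --format json` returns an object with a `nodes` list whose entries carry `name`, `type` and `utilization` (in percent); those field names are an assumption and may differ between releases. The helper name `osds_above` and the sample data are hypothetical and do not come from this cluster.

```python
import json


def osds_above(osd_df, threshold_pct):
    """Return (name, utilization) pairs for OSDs above threshold_pct, fullest first.

    osd_df is the parsed JSON of `ceph osd df --format json` (assumed layout).
    """
    # With `ceph osd df tree` the list may also contain hosts and roots, so keep OSDs only.
    osds = [n for n in osd_df.get("nodes", []) if n.get("type", "osd") == "osd"]
    too_full = [(n["name"], n["utilization"]) for n in osds
                if n["utilization"] > threshold_pct]
    return sorted(too_full, key=lambda item: item[1], reverse=True)


# Hypothetical sample data, for illustration only.
SAMPLE = json.loads("""
{"nodes": [
  {"name": "osd.0", "type": "osd", "utilization": 78.2},
  {"name": "osd.1", "type": "osd", "utilization": 86.1},
  {"name": "osd.2", "type": "osd", "utilization": 88.4},
  {"name": "hostA", "type": "host", "utilization": 90.0}
]}
""")

if __name__ == "__main__":
    for name, util in osds_above(SAMPLE, 85.0):
        print(f"{name}: {util:.1f}% used")
```

On a live cluster the input would come from the command's JSON output (for example read with `json.load(sys.stdin)`) instead of `SAMPLE`.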
Then you can either reduce their weight:

    ceph osd reweight <id> 0.9

Or increase the threshold above which an OSD is considered too full for backfills. How this is configured depends on the version; I think in your version it is still:

    ceph pg set_backfillfull_ratio 0.XX

It's probably currently configured to 0.85.

Paul

2018-06-12 11:06 GMT+02:00 Steininger, Herbert <herbert_steinin...@psych.mpg.de>:

> Hi Guys,
>
> I've inherited a CephFS-Cluster, I'm fairly new to CephFS.
> The Cluster was down and I managed somehow to bring it up again.
> But now there are some Problems that I can't fix that easily.
> This is what 'ceph -s' is giving me as Info:
>
> [root@pcl241 ceph]# ceph -s
>     cluster cde1487e-f930-417a-9403-28e9ebf406b8
>      health HEALTH_WARN
>             2 pgs backfill_toofull
>             1 pgs degraded
>             1 pgs stuck degraded
>             2 pgs stuck unclean
>             1 pgs stuck undersized
>             1 pgs undersized
>             recovery 260/29731463 objects degraded (0.001%)
>             recovery 798/29731463 objects misplaced (0.003%)
>             2 near full osd(s)
>             crush map has legacy tunables (require bobtail, min is firefly)
>             crush map has straw_calc_version=0
>      monmap e8: 3 mons at {cephcontrol=172.22.12.241:6789/0,slurmbackup=172.22.20.4:6789/0,slurmmaster=172.22.20.3:6789/0}
>             election epoch 48, quorum 0,1,2 cephcontrol,slurmmaster,slurmbackup
>       fsmap e2288: 1/1/1 up {0=pcl241=up:active}
>      osdmap e10865: 6 osds: 6 up, 6 in; 2 remapped pgs
>             flags nearfull
>       pgmap v14103169: 320 pgs, 3 pools, 30899 GB data, 9678 kobjects
>             92929 GB used, 22139 GB / 112 TB avail
>             260/29731463 objects degraded (0.001%)
>             798/29731463 objects misplaced (0.003%)
>                  316 active+clean
>                    2 active+clean+scrubbing+deep
>                    1 active+undersized+degraded+remapped+backfill_toofull
>                    1 active+remapped+backfill_toofull
> [root@pcl241 ceph]#
>
>
> [root@pcl241 ceph]# ceph osd tree
> ID  WEIGHT    TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY
>  -1 109.13992 root default
>  -2         0     host A1214-2950-01
>  -3         0     host A1214-2950-02
>  -4         0     host A1214-2950-04
>  -5         0     host A1214-2950-05
>  -6         0     host A1214-2950-03
>  -7  18.18999     host cuda002
>   1  18.18999         osd.1               up  1.00000          1.00000
>  -8  18.18999     host cuda001
>   0  18.18999         osd.0               up  1.00000          1.00000
>  -9  18.18999     host cuda005
>   3  18.18999         osd.3               up  1.00000          1.00000
> -10  18.18999     host cuda003
>   4  18.18999         osd.4               up  1.00000          1.00000
> -11  18.18999     host cuda004
>   2  18.18999         osd.2               up  1.00000          1.00000
> -12  18.18999     host A1214-2950-06
>   5  18.18999         osd.5               up  1.00000          1.00000
> -13         0     host pe9
>
>
> Could someone please put me in the right Direction about what to do to fix
> the Problems?
> It seems that two OSD are full, but how can I solve that, if I don't have
> additional hardware available?
> Also it seems that the Cluster has different ceph-versions running (Hammer
> and Jewel), how to solve that?
> Ceph-(mds/-mon/-osd) is running on Scientific Linux.
> If more Info is needed, just let me know.
>
> Thanks in Advance,
> Steininger Herbert
>
> ---
> Herbert Steininger
> Leiter EDV
> Administrator
> Max-Planck-Institut für Psychiatrie - EDV
> Kraepelinstr. 2-10
> 80804 München
> Tel +49 (0)89 / 30622-368
> Mail herbert_steinin...@psych.mpg.de
> Web http://www.psych.mpg.de
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90