PS: The cluster is currently running with size 2. I used PGCalc on the Ceph website, which by default targets 200 PGs per OSD. I read about the protection in the docs and only later realized I should have targeted 100 PGs per OSD instead.
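A back-of-the-envelope way to see where that lands, using only numbers from this thread (averages hide imbalance, so treat this as a rough check, not a substitute for the per-OSD counts):

  # average PG copies per OSD = total PGs * pool size / number of OSDs
  echo $(( 1536 * 2 / 25 ))   # = 122 (integer division); device-class splits and
                              # backfill can push individual OSDs far above this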
2018-05-17 13:35 GMT+02:00 Kevin Olbrich <k...@sv01.de>:
> Hi!
>
> Thanks for your quick reply.
> Before I read your mail, I applied the following conf to my OSDs:
> ceph tell 'osd.*' injectargs '--osd_max_pg_per_osd_hard_ratio 32'
>
> Status is now:
>   data:
>     pools:   2 pools, 1536 pgs
>     objects: 639k objects, 2554 GB
>     usage:   5211 GB used, 11295 GB / 16506 GB avail
>     pgs:     7.943% pgs not active
>              5567/1309948 objects degraded (0.425%)
>              252327/1309948 objects misplaced (19.262%)
>              1030 active+clean
>               351 active+remapped+backfill_wait
>               107 activating+remapped
>                33 active+remapped+backfilling
>                15 activating+undersized+degraded+remapped
>
> A little bit better, but still some non-active PGs.
> I will investigate your other hints!
>
> Thanks
> Kevin
>
> 2018-05-17 13:30 GMT+02:00 Burkhard Linke
> <Burkhard.Linke@computational.bio.uni-giessen.de>:
>
>> Hi,
>>
>> On 05/17/2018 01:09 PM, Kevin Olbrich wrote:
>>
>>> Hi!
>>>
>>> Today I added some new OSDs (nearly doubling the count) to my luminous
>>> cluster. I then changed pg(p)_num from 256 to 1024 for that pool
>>> because it was complaining about too few PGs. (I have since noticed
>>> that this should have been done in small steps.)
>>>
>>> This is the current status:
>>>
>>>   health: HEALTH_ERR
>>>           336568/1307562 objects misplaced (25.740%)
>>>           Reduced data availability: 128 pgs inactive, 3 pgs peering,
>>>           1 pg stale
>>>           Degraded data redundancy: 6985/1307562 objects degraded
>>>           (0.534%), 19 pgs degraded, 19 pgs undersized
>>>           107 slow requests are blocked > 32 sec
>>>           218 stuck requests are blocked > 4096 sec
>>>
>>>   data:
>>>     pools:   2 pools, 1536 pgs
>>>     objects: 638k objects, 2549 GB
>>>     usage:   5210 GB used, 11295 GB / 16506 GB avail
>>>     pgs:     0.195% pgs unknown
>>>              8.138% pgs not active
>>>              6985/1307562 objects degraded (0.534%)
>>>              336568/1307562 objects misplaced (25.740%)
>>>              855 active+clean
>>>              517 active+remapped+backfill_wait
>>>              107 activating+remapped
>>>               31 active+remapped+backfilling
>>>               15 activating+undersized+degraded+remapped
>>>                4 active+undersized+degraded+remapped+backfilling
>>>                3 unknown
>>>                3 peering
>>>                1 stale+active+clean
>>
>> You need to resolve the unknown/peering/activating PGs first. You have
>> 1536 PGs; assuming replication size 3, this makes 4608 PG copies. Given
>> 25 OSDs and the heterogeneous host sizes, I assume that some OSDs hold
>> more than 200 PGs. There's a threshold for the number of PGs; reaching
>> this threshold keeps the OSDs from accepting new PGs.
>>
>> Try to increase the threshold (mon_max_pg_per_osd /
>> max_pg_per_osd_hard_ratio / osd_max_pg_per_osd_hard_ratio, not sure
>> about the exact one, consult the documentation) to allow more PGs on
>> the OSDs. If this is the cause of the problem, the peering and
>> activating states should be resolved within a short time.
>>
>> You can also check the number of PGs per OSD with 'ceph osd df'; the
>> last column is the current number of PGs.
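For reference, a minimal sketch of the checks described above; option names are the luminous ones, and the admin-socket query has to run on the host that carries the OSD in question:

  # PGs currently held per OSD - the last column ("PGS"):
  ceph osd df

  # what the running daemon thinks the limits are:
  ceph daemon osd.0 config get mon_max_pg_per_osd
  ceph daemon osd.0 config get osd_max_pg_per_osd_hard_ratio

  # raise the hard ratio at runtime (as done above); persist it in ceph.conf
  # or revert it once the cluster is healthy again:
  ceph tell 'osd.*' injectargs '--osd_max_pg_per_osd_hard_ratio 32'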
>>>
>>> OSD tree:
>>>
>>> ID  CLASS WEIGHT   TYPE NAME                       STATUS REWEIGHT PRI-AFF
>>>  -1       16.12177 root default
>>> -16       16.12177     datacenter dc01
>>> -19       16.12177         pod dc01-agg01
>>> -10        8.98700             rack dc01-rack02
>>>  -4        4.03899                 host node1001
>>>   0   hdd  0.90999                     osd.0           up  1.00000 1.00000
>>>   1   hdd  0.90999                     osd.1           up  1.00000 1.00000
>>>   5   hdd  0.90999                     osd.5           up  1.00000 1.00000
>>>   2   ssd  0.43700                     osd.2           up  1.00000 1.00000
>>>   3   ssd  0.43700                     osd.3           up  1.00000 1.00000
>>>   4   ssd  0.43700                     osd.4           up  1.00000 1.00000
>>>  -7        4.94899                 host node1002
>>>   9   hdd  0.90999                     osd.9           up  1.00000 1.00000
>>>  10   hdd  0.90999                     osd.10          up  1.00000 1.00000
>>>  11   hdd  0.90999                     osd.11          up  1.00000 1.00000
>>>  12   hdd  0.90999                     osd.12          up  1.00000 1.00000
>>>   6   ssd  0.43700                     osd.6           up  1.00000 1.00000
>>>   7   ssd  0.43700                     osd.7           up  1.00000 1.00000
>>>   8   ssd  0.43700                     osd.8           up  1.00000 1.00000
>>> -11        7.13477             rack dc01-rack03
>>> -22        5.38678                 host node1003
>>>  17   hdd  0.90970                     osd.17          up  1.00000 1.00000
>>>  18   hdd  0.90970                     osd.18          up  1.00000 1.00000
>>>  24   hdd  0.90970                     osd.24          up  1.00000 1.00000
>>>  26   hdd  0.90970                     osd.26          up  1.00000 1.00000
>>>  13   ssd  0.43700                     osd.13          up  1.00000 1.00000
>>>  14   ssd  0.43700                     osd.14          up  1.00000 1.00000
>>>  15   ssd  0.43700                     osd.15          up  1.00000 1.00000
>>>  16   ssd  0.43700                     osd.16          up  1.00000 1.00000
>>> -25        1.74799                 host node1004
>>>  19   ssd  0.43700                     osd.19          up  1.00000 1.00000
>>>  20   ssd  0.43700                     osd.20          up  1.00000 1.00000
>>>  21   ssd  0.43700                     osd.21          up  1.00000 1.00000
>>>  22   ssd  0.43700                     osd.22          up  1.00000 1.00000
>>>
>>> The crush rule is set to chooseleaf rack and (temporarily!) to size 2.
>>> Why are PGs stuck in peering and activating?
>>> "ceph df" shows that only 1.5 TB are used on the pool, residing on the
>>> HDDs - which would perfectly fit the crush rule... (?)
>>
>> Size 2 within the crush rule or size 2 for the two pools?
>>
>> Regards,
>> Burkhard
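Regarding the size-2 question at the end of the quoted thread, both interpretations can be checked directly; "hdd-pool" below is only a placeholder name, substitute the actual names from 'ceph osd pool ls':

  # per-pool size/min_size and the crush_rule each pool uses:
  ceph osd pool ls detail
  ceph osd pool get hdd-pool size

  # the rules themselves, including the chooseleaf step and its failure domain (rack):
  ceph osd crush rule dump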