ceph version 12.2.8-pve1 on Proxmox.
ceph osd df tree: https://pastebin.com/e68fJ5fM
I added `debug mgr = 4/5` to the [global] section in ceph.conf on the active mgr and restarted the mgr service. Is this correct? I noticed some config settings in the mgr logs, and changed the config to:

"mgr/balancer/max_misplaced": "1",
"mgr/balancer/mode": "upmap",
"mgr/balancer/upmap_max_deviation": "0.0001",
"mgr/balancer/upmap_max_iterations": "1000"

After this I get a larger plan. I tried running the upmap commands manually, but for some reason no data is getting moved. All PGs are active+clean and a few are scrubbing. Maybe it won't run until scrubbing is over? I pasted a snippet of the mgr logs I found interesting. Given that the new plan looks good now, the problem seems to be that the upmap directives are being ignored? The exact commands I ran and the checks I'm doing are at the end of this mail, below the quote.

On Wed, Feb 6, 2019 at 2:15 AM Konstantin Shalygin <[email protected]> wrote:
> I previously enabled upmap and used automatic balancing with "ceph balancer
> on". I got very good results and the OSDs ended up with perfectly
> distributed PGs.
>
> Now, after adding several new OSDs, auto balancing does not seem to be
> working anymore. OSDs have 30-50% usage where previously all had almost
> the same %.
>
> I turned off the auto balancer and tried manually running a plan:
>
> # ceph balancer reset
> # ceph balancer optimize myplan
> # ceph balancer show myplan
> ceph osd pg-upmap-items 41.1 106 125 95 121 84 34 36 99 72 126
> ceph osd pg-upmap-items 41.5 12 121 65 3 122 52 5 126
> ceph osd pg-upmap-items 41.b 117 99 65 125
> ceph osd pg-upmap-items 41.c 49 121 81 131
> ceph osd pg-upmap-items 41.e 61 82 73 52 122 46 84 118
> ceph osd pg-upmap-items 41.f 71 127 15 121 56 82
> ceph osd pg-upmap-items 41.12 81 92
> ceph osd pg-upmap-items 41.17 35 127 71 44
> ceph osd pg-upmap-items 41.19 81 131 21 119 18 52
> ceph osd pg-upmap-items 41.25 18 52 37 125 40 3 41 34 71 127 4 128
>
> After running this plan there is no difference and still a huge imbalance
> on the OSDs. Creating a new plan gives the same plan again.
>
> # ceph balancer eval
> current cluster score 0.015162 (lower is better)
>
> Balancer eval shows quite a low number, so it seems to think the PG
> distribution is already optimized?
>
> Since I'm not getting this working again, I looked into the offline
> optimization at http://docs.ceph.com/docs/mimic/rados/operations/upmap/
>
> I have 2 pools: a replicated pool using 3 OSDs with the "10k" device
> class, while the remaining OSDs have the "hdd" device class.
>
> The resulting out.txt creates a much larger plan, but it would map a lot
> of PGs to the "10k" OSDs (where they should not be), and I can't find any
> way to exclude these 3 OSDs.
>
> Any ideas how to proceed?
>
> Please, paste (on pastebin) your `ceph osd df tree`. What is your ceph
> version?
>
> Also, you can enable balancer debug messages by setting `debug mgr = 4/5`
> in your ceph.conf.
>
>
> k
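For reference, since the luminous balancer module reads its settings from the config-key store, this is roughly how I applied the values above (key names taken from the mgr log; "config-key dump" to verify):

# ceph config-key set "mgr/balancer/max_misplaced" "1"
# ceph config-key set "mgr/balancer/mode" "upmap"
# ceph config-key set "mgr/balancer/upmap_max_deviation" "0.0001"
# ceph config-key set "mgr/balancer/upmap_max_iterations" "1000"
# ceph config-key dump | grep balancer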
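In case it matters for the "no data is getting moved" part: as far as I understand, pg-upmap-items entries only take effect if they are actually recorded in the osdmap and all clients are luminous or newer, so I'm checking that with something like this (the grep patterns assume the luminous "ceph osd dump" output format):

# ceph osd dump | grep pg_upmap_items
# ceph osd dump | grep require_min_compat_client
# ceph features

If require_min_compat_client is below luminous, I believe the monitors refuse the upmap entries, and it can be raised with:

# ceph osd set-require-min-compat-client luminous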
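And regarding excluding the 3 "10k" OSDs from the offline optimizer: the docs I linked earlier mention an --upmap-pool option for osdmaptool, so limiting the optimization to the hdd-backed pool might be a workaround (<hdd-pool> is a placeholder for the actual pool name):

# ceph osd getmap -o om
# osdmaptool om --upmap out.txt --upmap-pool <hdd-pool>
# source out.txt

That should keep the optimizer from generating mappings for the replicated pool that lives on the "10k" OSDs, assuming each pool sits entirely on one device class.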
