Re: [ceph-users] ceph df space usage confusion - balancing needed?
Am 27.10.18 um 04:12 schrieb Linh Vu:
> Should be fine as long as your "mgr/balancer/max_misplaced" is reasonable. I
> find the default value of 0.05 decent enough, although from experience that
> seems like 0.05% rather than 5% as suggested here:
> http://docs.ceph.com/docs/luminous/mgr/balancer/

Ok! I did actually choose 0.01. Interestingly, during the initial large
rebalancing, it went up to more than 2 % of misplaced objects (in small steps)
until I decided to stop the balancer for a day to give the cluster enough time
to adapt.

> You can also choose to turn it on only during certain hours when the cluster
> might be less busy. The config-keys are there somewhere (there's a post by
> Dan van der Ster on the ML about them) but they don't actually work in 12.2.8
> at least, when I tried them. I suggest just use cron to turn the balancer on
> and off.

I found that mail in the archives. Indeed, that seems helpful. I'll start with
permanently leaving the balancer on for now and observe whether it has any
impact. Since we rarely change the cluster's layout, it should effectively just
sit there silently most of the time. (A rough sketch of what the cron approach
could look like is at the very bottom of this mail.)

Thanks!
Oliver

>
> --
> *From:* Oliver Freyermuth
> *Sent:* Friday, 26 October 2018 9:32:14 PM
> *To:* Linh Vu; Janne Johansson
> *Cc:* ceph-users@lists.ceph.com; Peter Wienemann
> *Subject:* Re: [ceph-users] ceph df space usage confusion - balancing needed?
>
> Dear Cephalopodians,
>
> thanks for all your feedback!
>
> I finally "pushed the button" and let upmap run for ~36 hours.
> Previously, we had ~63 % usage of our CephFS with only 50 % raw usage, now,
> we see only 53.77 % usage.
>
> That's as close as I expect things to ever become, and we gained about 70 TiB
> of free storage by that, which is almost one file server.
> So the outcome is really close to perfection :-).
>
> I'm leaving the balancer active now in upmap mode. Any bad experiences with
> leaving it active "forever"?
>
> Cheers and many thanks again,
> Oliver
>
> Am 23.10.18 um 01:14 schrieb Linh Vu:
>> Upmap is awesome. I ran it on our new cluster before we started ingesting
>> data, so that the PG count is balanced on all OSDs. After ingesting about
>> 315TB, it's still beautifully balanced. Note: we have a few nodes with 8TB
>> OSDs, and the rest on 10TBs.
>>
>> # ceph osd df plain
>> ID  CLASS    WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
>>   0 mf1hdd   7.27739  1.0 7.28TiB 2.06TiB 5.21TiB 28.34 1.01 144
>>   1 mf1hdd   7.27739  1.0 7.28TiB 2.07TiB 5.21TiB 28.38 1.02 144
>>   2 mf1hdd   7.27739  1.0 7.28TiB 2.03TiB 5.24TiB 27.96 1.00 142
>>   3 mf1hdd   7.27739  1.0 7.28TiB 2.06TiB 5.21TiB 28.37 1.02 144
>>   4 mf1hdd   7.27739  1.0 7.28TiB 2.03TiB 5.24TiB 27.96 1.00 142
>>   5 mf1hdd   7.27739  1.0 7.28TiB 2.02TiB 5.26TiB 27.73 0.99 141
>>   6 mf1hdd   7.27739  1.0 7.28TiB 2.03TiB 5.24TiB 27.94 1.00 142
>>   7 mf1hdd   7.27739  1.0 7.28TiB 2.06TiB 5.21TiB 28.35 1.02 144
>>   8 mf1hdd   7.27739  1.0 7.28TiB 2.02TiB 5.26TiB 27.76 0.99 141
>>   9 mf1hdd   7.27739  1.0 7.28TiB 2.04TiB 5.24TiB 27.97 1.00 142
>>  10 mf1hdd   7.27739  1.0 7.28TiB 2.06TiB 5.21TiB 28.35 1.02 144
>>  11 mf1hdd   7.27739  1.0 7.28TiB 2.04TiB 5.24TiB 27.99 1.00 142
>>  12 mf1hdd   7.27739  1.0 7.28TiB 2.02TiB 5.26TiB 27.75 0.99 141
>>  13 mf1hdd   7.27739  1.0 7.28TiB 2.03TiB 5.24TiB 27.96 1.00 142
>>  14 mf1hdd   7.27739  1.0 7.28TiB 2.02TiB 5.26TiB 27.78 0.99 141
>>  15 mf1hdd   7.27739  1.0 7.28TiB 2.07TiB 5.21TiB 28.38 1.02 144
>> 224 nvmemeta 0.02179  1.0 22.3GiB 1.52GiB 20.8GiB 6.82 0.24 185
>> 225 nvmemeta 0.02179  1.0 22.4GiB 1.49GiB 20
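P.S. for anyone finding this thread in the archives: the cron-based toggle Linh
suggested would look roughly like the sketch below. This is only a sketch
pieced together from the Luminous balancer documentation (the exact config-key
name may differ between releases), not something I have verified on 12.2.8:

ceph config-key set mgr/balancer/max_misplaced 0.01   # limit misplaced ratio to 1 %
ceph balancer mode upmap
ceph balancer status                                  # shows the mode and whether it is active

and in the crontab of an admin node, if you only want it busy at night:

0 22 * * * /usr/bin/ceph balancer on
0  6 * * * /usr/bin/ceph balancer off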
Re: [ceph-users] ceph df space usage confusion - balancing needed?
7.99 1.00 174 > 137 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.82 1.00 173 > 138 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.81 1.00 173 > 139 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.84 1.00 173 > 140 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.81 1.00 173 > 141 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.82 1.00 173 > 142 mf1hdd 8.91019 1.0 8.91TiB 2.50TiB 6.41TiB 28.00 1.00 174 > 143 mf1hdd 8.91019 1.0 8.91TiB 2.48TiB 6.43TiB 27.82 1.00 173 > 240 nvmemeta 0.02179 1.0 22.3GiB 1.61GiB 20.7GiB 7.22 0.26 184 > 241 nvmemeta 0.02179 1.0 22.4GiB 1.43GiB 20.9GiB 6.41 0.23 182 > TOTAL 1.85PiB 528TiB 1.33PiB 27.93 > MIN/MAX VAR: 0.23/1.02 STDDEV: 7.10 > > ---------------------- > *From:* ceph-users on behalf of Oliver > Freyermuth > *Sent:* Sunday, 21 October 2018 6:57:49 AM > *To:* Janne Johansson > *Cc:* ceph-users@lists.ceph.com; Peter Wienemann > *Subject:* Re: [ceph-users] ceph df space usage confusion - balancing needed? > > Ok, I'll try out the balancer end of the upcoming week then (after we've > fixed a HW-issue with one of our mons > and the cooling system). > > Until then, any further advice and whether upmap is recommended over > crush-compat (all clients are Luminous) are welcome ;-). > > Cheers, > Oliver > > Am 20.10.18 um 21:26 schrieb Janne Johansson: >> Ok, can't say "why" then, I'd reweigh them somewhat to even it out, >> 1.22 -vs- 0.74 in variance is a lot, so either a balancer plugin for >> the MGRs, a script or just a few manual tweaks might be in order. >> >> Den lör 20 okt. 2018 kl 21:02 skrev Oliver Freyermuth >> : >>> >>> All OSDs are of the very same size. One OSD host has slightly more disks >>> (33 instead of 31), though. >>> So also that that can't explain the hefty difference. >>> >>> I attach the output of "ceph osd tree" and "ceph osd df". >>> >>> The crush rule for the ceph_data pool is: >>> rule cephfs_data { >>> id 2 >>> type erasure >>> min_size 3 >>> max_size 6 >>> step set_chooseleaf_tries 5 >>> step set_choose_tries 100 >>> step take default class hdd >>> step chooseleaf indep 0 type host >>> step emit >>> } >>> So that only considers the hdd device class. EC is done with k=4 m=2. >>> >>> So I don't see any imbalance on the hardware level, but only a somewhat >>> uneven distribution of PGs. >>> Am I missing something, or is this really just a case for the ceph balancer >>> plugin? >>> I'm just a bit astonished this effect is so huge. >>> Maybe our 4096 PGs for the ceph_data pool are not enough to get an even >>> distribution without balancing? >>> But it yields about 100 PGs per OSD, as you can see... >>> >>> -- >>> # ceph osd tree >>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF >>> -1 826.26428 root default >>> -3 0.43700 host mon001 >>> 0 ssd 0.21799 osd.0 up 1.0 1.0 >>> 1 ssd 0.21799 osd.1 up 1.0 1.0 >>> -5 0.43700 host mon002 >>> 2 ssd 0.21799 osd.2 up 1.0 1.0 >>> 3 ssd 0.21799 osd.3 up 1.0 1.0 >>> -31 1.81898 host mon003 >>> 230 ssd 0.90999 osd.230 up 1.0 1.0 >>> 231 ssd 0.90999 osd.231 up 1.0 1.0 >>> -10 116.64600 host osd001 >>> 4 hdd 3.64499 osd.4 up 1.0 1.0 >>> 5 hdd 3.64499 osd.5 up 1.0 1.0 >>> 6 hdd 3.64499 osd.6 up 1.0 1.0 >>> 7 hdd 3.64499 osd.7 up 1.0 1.0 >>> 8 hdd 3.64499 osd.8 up 1.0 1.0 >>> 9 hdd 3.64499 osd.9 up 1.0 1.0 >>> 10 hdd 3.64499 osd.10 up 1.0 1.0 >>> 11 hdd 3.64499 osd.11 up 1.0 1.0 >>> 12 hdd 3.64499 osd.12 up 1.0 1.0 >>> 13 hdd 3.64499 osd.13 up 1.0 1.000
Re: [ceph-users] ceph df space usage confusion - balancing needed?
> upmap has been amazing and balanced my clusters far better than anything else
> I've ever seen. I would go so far as to say that upmap can achieve a perfect
> balance.

> Upmap is awesome. I ran it on our new cluster before we started ingesting
> data, so that the PG count is balanced on all OSDs.

Guys, do you remember: does the mainline or the EL kernel client (krbd) already
support upmap?

k
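What we will probably do before flipping that switch is check the connected
clients first. If I understand the tooling correctly, something like the
following should answer it for a given cluster (standard Luminous commands,
please double-check on your own setup):

ceph features                                    # lists the release/feature bits of currently connected clients
ceph osd set-require-min-compat-client luminous  # refuses if pre-luminous clients are still connected

upmap needs all clients to be luminous-capable, and as far as I know the second
command is also a prerequisite for switching the balancer to upmap mode.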
Re: [ceph-users] ceph df space usage confusion - balancing needed?
I haven't had crush-compat do anything helpful for balancing my clusters.
upmap has been amazing and balanced my clusters far better than anything else
I've ever seen. I would go so far as to say that upmap can achieve a perfect
balance. It seems to evenly distribute the PGs of each pool onto all OSDs that
pool lives on, with a maximum difference of 1 PG depending on how evenly the
pool's PG count divides by the number of OSDs. As a side note, your OSD CRUSH
weights should be left at the default weights for their size for upmap to be
as effective as it can be. (A rough sketch of the commands I use to run it in
a supervised way is at the bottom of this mail.)

On Sat, Oct 20, 2018 at 3:58 PM Oliver Freyermuth <
freyerm...@physik.uni-bonn.de> wrote:

> Ok, I'll try out the balancer end of the upcoming week then (after we've
> fixed a HW-issue with one of our mons and the cooling system).
>
> Until then, any further advice and whether upmap is recommended over
> crush-compat (all clients are Luminous) are welcome ;-).
>
> Cheers,
> Oliver
>
> Am 20.10.18 um 21:26 schrieb Janne Johansson:
> > Ok, can't say "why" then, I'd reweigh them somewhat to even it out,
> > 1.22 -vs- 0.74 in variance is a lot, so either a balancer plugin for
> > the MGRs, a script or just a few manual tweaks might be in order.
> >
> > Den lör 20 okt. 2018 kl 21:02 skrev Oliver Freyermuth
> > :
> >>
> >> All OSDs are of the very same size. One OSD host has slightly more
> >> disks (33 instead of 31), though.
> >> So also that can't explain the hefty difference.
> >>
> >> I attach the output of "ceph osd tree" and "ceph osd df".
> >>
> >> The crush rule for the ceph_data pool is:
> >> rule cephfs_data {
> >>     id 2
> >>     type erasure
> >>     min_size 3
> >>     max_size 6
> >>     step set_chooseleaf_tries 5
> >>     step set_choose_tries 100
> >>     step take default class hdd
> >>     step chooseleaf indep 0 type host
> >>     step emit
> >> }
> >> So that only considers the hdd device class. EC is done with k=4 m=2.
> >>
> >> So I don't see any imbalance on the hardware level, but only a somewhat
> >> uneven distribution of PGs.
> >> Am I missing something, or is this really just a case for the ceph
> >> balancer plugin?
> >> I'm just a bit astonished this effect is so huge.
> >> Maybe our 4096 PGs for the ceph_data pool are not enough to get an even
> >> distribution without balancing?
> >> But it yields about 100 PGs per OSD, as you can see...
> >> > >> -- > >> # ceph osd tree > >> ID CLASS WEIGHTTYPE NAME STATUS REWEIGHT PRI-AFF > >> -1 826.26428 root default > >> -3 0.43700 host mon001 > >> 0 ssd 0.21799 osd.0 up 1.0 1.0 > >> 1 ssd 0.21799 osd.1 up 1.0 1.0 > >> -5 0.43700 host mon002 > >> 2 ssd 0.21799 osd.2 up 1.0 1.0 > >> 3 ssd 0.21799 osd.3 up 1.0 1.0 > >> -31 1.81898 host mon003 > >> 230 ssd 0.90999 osd.230 up 1.0 1.0 > >> 231 ssd 0.90999 osd.231 up 1.0 1.0 > >> -10 116.64600 host osd001 > >> 4 hdd 3.64499 osd.4 up 1.0 1.0 > >> 5 hdd 3.64499 osd.5 up 1.0 1.0 > >> 6 hdd 3.64499 osd.6 up 1.0 1.0 > >> 7 hdd 3.64499 osd.7 up 1.0 1.0 > >> 8 hdd 3.64499 osd.8 up 1.0 1.0 > >> 9 hdd 3.64499 osd.9 up 1.0 1.0 > >> 10 hdd 3.64499 osd.10 up 1.0 1.0 > >> 11 hdd 3.64499 osd.11 up 1.0 1.0 > >> 12 hdd 3.64499 osd.12 up 1.0 1.0 > >> 13 hdd 3.64499 osd.13 up 1.0 1.0 > >> 14 hdd 3.64499 osd.14 up 1.0 1.0 > >> 15 hdd 3.64499 osd.15 up 1.0 1.0 > >> 16 hdd 3.64499 osd.16 up 1.0 1.0 > >> 17 hdd 3.64499 osd.17 up 1.0 1.0 > >> 18 hdd 3.64499 osd.18 up 1.0 1.0 > >> 19 hdd 3.64499 osd.19 up 1.0 1.0 > >> 20 hdd 3.64499 osd.20 up 1.0 1.0 > >> 21 hdd 3.64499 osd.21 up 1.0 1.0 > >> 22 hdd 3.64499 osd.22 up 1.0 1.0 > >> 23 hdd 3.64499 osd.23 up 1.0 1.0 > >> 24 hdd 3.64499 osd.24 up 1.0 1.0 > >> 25 hdd 3.64499 osd.25 up 1.0 1.0 > >> 26 hdd 3.64499 osd.26 up 1.0 1.0 > >> 27 hdd 3.64499 osd.27 up 1.0 1.0 > >> 28 hdd 3.64499 osd.28 up 1.0 1.0 > >> 29 hdd 3.64499 osd.29 up 1.0 1.0 > >> 30 hdd 3.64499 osd.30 up 1.0 1.0 > >> 31 hdd 3.64499 osd.31 up 1.0 1.0 > >> 32 hdd
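The supervised way I usually run it, before leaving the module switched on
permanently, looks roughly like this (Luminous balancer CLI from memory, so
double-check the sub-command names against your release; "myplan" is just an
arbitrary plan name):

ceph balancer mode upmap
ceph balancer eval              # score of the current distribution, lower is better
ceph balancer optimize myplan   # compute a plan
ceph balancer show myplan       # inspect the pg-upmap-items it would inject
ceph balancer eval myplan       # expected score after applying the plan
ceph balancer execute myplan    # apply it once
ceph balancer on                # or let the module keep optimizing automatically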
Re: [ceph-users] ceph df space usage confusion - balancing needed?
Ok, I'll try out the balancer end of the upcoming week then (after we've fixed a HW-issue with one of our mons and the cooling system). Until then, any further advice and whether upmap is recommended over crush-compat (all clients are Luminous) are welcome ;-). Cheers, Oliver Am 20.10.18 um 21:26 schrieb Janne Johansson: > Ok, can't say "why" then, I'd reweigh them somewhat to even it out, > 1.22 -vs- 0.74 in variance is a lot, so either a balancer plugin for > the MGRs, a script or just a few manual tweaks might be in order. > > Den lör 20 okt. 2018 kl 21:02 skrev Oliver Freyermuth > : >> >> All OSDs are of the very same size. One OSD host has slightly more disks (33 >> instead of 31), though. >> So also that that can't explain the hefty difference. >> >> I attach the output of "ceph osd tree" and "ceph osd df". >> >> The crush rule for the ceph_data pool is: >> rule cephfs_data { >> id 2 >> type erasure >> min_size 3 >> max_size 6 >> step set_chooseleaf_tries 5 >> step set_choose_tries 100 >> step take default class hdd >> step chooseleaf indep 0 type host >> step emit >> } >> So that only considers the hdd device class. EC is done with k=4 m=2. >> >> So I don't see any imbalance on the hardware level, but only a somewhat >> uneven distribution of PGs. >> Am I missing something, or is this really just a case for the ceph balancer >> plugin? >> I'm just a bit astonished this effect is so huge. >> Maybe our 4096 PGs for the ceph_data pool are not enough to get an even >> distribution without balancing? >> But it yields about 100 PGs per OSD, as you can see... >> >> -- >> # ceph osd tree >> ID CLASS WEIGHTTYPE NAME STATUS REWEIGHT PRI-AFF >> -1 826.26428 root default >> -3 0.43700 host mon001 >> 0 ssd 0.21799 osd.0 up 1.0 1.0 >> 1 ssd 0.21799 osd.1 up 1.0 1.0 >> -5 0.43700 host mon002 >> 2 ssd 0.21799 osd.2 up 1.0 1.0 >> 3 ssd 0.21799 osd.3 up 1.0 1.0 >> -31 1.81898 host mon003 >> 230 ssd 0.90999 osd.230 up 1.0 1.0 >> 231 ssd 0.90999 osd.231 up 1.0 1.0 >> -10 116.64600 host osd001 >> 4 hdd 3.64499 osd.4 up 1.0 1.0 >> 5 hdd 3.64499 osd.5 up 1.0 1.0 >> 6 hdd 3.64499 osd.6 up 1.0 1.0 >> 7 hdd 3.64499 osd.7 up 1.0 1.0 >> 8 hdd 3.64499 osd.8 up 1.0 1.0 >> 9 hdd 3.64499 osd.9 up 1.0 1.0 >> 10 hdd 3.64499 osd.10 up 1.0 1.0 >> 11 hdd 3.64499 osd.11 up 1.0 1.0 >> 12 hdd 3.64499 osd.12 up 1.0 1.0 >> 13 hdd 3.64499 osd.13 up 1.0 1.0 >> 14 hdd 3.64499 osd.14 up 1.0 1.0 >> 15 hdd 3.64499 osd.15 up 1.0 1.0 >> 16 hdd 3.64499 osd.16 up 1.0 1.0 >> 17 hdd 3.64499 osd.17 up 1.0 1.0 >> 18 hdd 3.64499 osd.18 up 1.0 1.0 >> 19 hdd 3.64499 osd.19 up 1.0 1.0 >> 20 hdd 3.64499 osd.20 up 1.0 1.0 >> 21 hdd 3.64499 osd.21 up 1.0 1.0 >> 22 hdd 3.64499 osd.22 up 1.0 1.0 >> 23 hdd 3.64499 osd.23 up 1.0 1.0 >> 24 hdd 3.64499 osd.24 up 1.0 1.0 >> 25 hdd 3.64499 osd.25 up 1.0 1.0 >> 26 hdd 3.64499 osd.26 up 1.0 1.0 >> 27 hdd 3.64499 osd.27 up 1.0 1.0 >> 28 hdd 3.64499 osd.28 up 1.0 1.0 >> 29 hdd 3.64499 osd.29 up 1.0 1.0 >> 30 hdd 3.64499 osd.30 up 1.0 1.0 >> 31 hdd 3.64499 osd.31 up 1.0 1.0 >> 32 hdd 3.64499 osd.32 up 1.0 1.0 >> 33 hdd 3.64499 osd.33 up 1.0 1.0 >> 34 hdd 3.64499 osd.34 up 1.0 1.0 >> 35 hdd 3.64499 osd.35 up 1.0 1.0 >> -13 116.64600 host osd002 >> 36 hdd 3.64499 osd.36 up 1.0 1.0 >> 37 hdd 3.64499 osd.37 up 1.0 1.0 >> 38 hdd 3.64499 osd.38 up 1.0 1.0 >> 39 hdd 3.64499 osd.39 up 1.0 1.0 >> 40 hdd 3.64499 osd.40 up 1.0 1.0 >> 41 hdd 3.64499 osd.41 up 1.0 1.0 >> 42 hdd 3.64499 osd.42 up 1.0 1.0 >> 43 hdd 3.64499 osd.43 up 1.0 1.0 >> 44 hdd 3.64499 osd.44 up
Re: [ceph-users] ceph df space usage confusion - balancing needed?
All OSDs are of the very same size. One OSD host has slightly more disks (33 instead of 31), though. So also that that can't explain the hefty difference. I attach the output of "ceph osd tree" and "ceph osd df". The crush rule for the ceph_data pool is: rule cephfs_data { id 2 type erasure min_size 3 max_size 6 step set_chooseleaf_tries 5 step set_choose_tries 100 step take default class hdd step chooseleaf indep 0 type host step emit } So that only considers the hdd device class. EC is done with k=4 m=2. So I don't see any imbalance on the hardware level, but only a somewhat uneven distribution of PGs. Am I missing something, or is this really just a case for the ceph balancer plugin? I'm just a bit astonished this effect is so huge. Maybe our 4096 PGs for the ceph_data pool are not enough to get an even distribution without balancing? But it yields about 100 PGs per OSD, as you can see... -- # ceph osd tree ID CLASS WEIGHTTYPE NAME STATUS REWEIGHT PRI-AFF -1 826.26428 root default -3 0.43700 host mon001 0 ssd 0.21799 osd.0 up 1.0 1.0 1 ssd 0.21799 osd.1 up 1.0 1.0 -5 0.43700 host mon002 2 ssd 0.21799 osd.2 up 1.0 1.0 3 ssd 0.21799 osd.3 up 1.0 1.0 -31 1.81898 host mon003 230 ssd 0.90999 osd.230 up 1.0 1.0 231 ssd 0.90999 osd.231 up 1.0 1.0 -10 116.64600 host osd001 4 hdd 3.64499 osd.4 up 1.0 1.0 5 hdd 3.64499 osd.5 up 1.0 1.0 6 hdd 3.64499 osd.6 up 1.0 1.0 7 hdd 3.64499 osd.7 up 1.0 1.0 8 hdd 3.64499 osd.8 up 1.0 1.0 9 hdd 3.64499 osd.9 up 1.0 1.0 10 hdd 3.64499 osd.10 up 1.0 1.0 11 hdd 3.64499 osd.11 up 1.0 1.0 12 hdd 3.64499 osd.12 up 1.0 1.0 13 hdd 3.64499 osd.13 up 1.0 1.0 14 hdd 3.64499 osd.14 up 1.0 1.0 15 hdd 3.64499 osd.15 up 1.0 1.0 16 hdd 3.64499 osd.16 up 1.0 1.0 17 hdd 3.64499 osd.17 up 1.0 1.0 18 hdd 3.64499 osd.18 up 1.0 1.0 19 hdd 3.64499 osd.19 up 1.0 1.0 20 hdd 3.64499 osd.20 up 1.0 1.0 21 hdd 3.64499 osd.21 up 1.0 1.0 22 hdd 3.64499 osd.22 up 1.0 1.0 23 hdd 3.64499 osd.23 up 1.0 1.0 24 hdd 3.64499 osd.24 up 1.0 1.0 25 hdd 3.64499 osd.25 up 1.0 1.0 26 hdd 3.64499 osd.26 up 1.0 1.0 27 hdd 3.64499 osd.27 up 1.0 1.0 28 hdd 3.64499 osd.28 up 1.0 1.0 29 hdd 3.64499 osd.29 up 1.0 1.0 30 hdd 3.64499 osd.30 up 1.0 1.0 31 hdd 3.64499 osd.31 up 1.0 1.0 32 hdd 3.64499 osd.32 up 1.0 1.0 33 hdd 3.64499 osd.33 up 1.0 1.0 34 hdd 3.64499 osd.34 up 1.0 1.0 35 hdd 3.64499 osd.35 up 1.0 1.0 -13 116.64600 host osd002 36 hdd 3.64499 osd.36 up 1.0 1.0 37 hdd 3.64499 osd.37 up 1.0 1.0 38 hdd 3.64499 osd.38 up 1.0 1.0 39 hdd 3.64499 osd.39 up 1.0 1.0 40 hdd 3.64499 osd.40 up 1.0 1.0 41 hdd 3.64499 osd.41 up 1.0 1.0 42 hdd 3.64499 osd.42 up 1.0 1.0 43 hdd 3.64499 osd.43 up 1.0 1.0 44 hdd 3.64499 osd.44 up 1.0 1.0 45 hdd 3.64499 osd.45 up 1.0 1.0 46 hdd 3.64499 osd.46 up 1.0 1.0 47 hdd 3.64499 osd.47 up 1.0 1.0 48 hdd 3.64499 osd.48 up 1.0 1.0 49 hdd 3.64499 osd.49 up 1.0 1.0 50 hdd 3.64499 osd.50 up 1.0 1.0 51 hdd 3.64499 osd.51 up 1.0 1.0 52 hdd 3.64499 osd.52 up 1.0 1.0 53 hdd 3.64499 osd.53 up 1.0 1.0 54 hdd 3.64499 osd.54 up 1.0 1.0 55 hdd 3.64499
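P.S.: Putting rough numbers on the "about 100 PGs per OSD" above (a
back-of-the-envelope estimate, assuming I am counting our OSDs correctly):
with k=4, m=2, each of the 4096 PGs of cephfs_data places 6 shards, i.e.
4096 * 6 = 24576 PG shards. Spread over roughly 225 hdd OSDs (the ~824 TiB of
raw capacity minus the few TiB of ssd, at 3.645 TiB per disk) that gives
24576 / 225 ≈ 109 PG shards per OSD on average. So the granularity itself
looks fine; the shards are simply not placed evenly without a balancer.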
Re: [ceph-users] ceph df space usage confusion - balancing needed?
Yes, if you have uneven sizes I guess you could end up in a situation where
you have lots of 1TB OSDs and a number of 2TB OSDs, but pool replication
forces the pool to keep one PG replica on the 1TB OSDs. Then it is only
possible to state "this pool can't write more than X G", and when the pool is
full there would still be free space left on some of the 2TB OSDs which the
pool can't utilize. Probably the same for uneven OSD hosts, if you have those.

Den lör 20 okt. 2018 kl 20:28 skrev Oliver Freyermuth :
>
> Dear Janne,
>
> yes, of course. But since we only have two pools here, this cannot explain
> the difference.
> The metadata is replicated (3 copies) across ssd drives, and we have < 3 TB
> of total raw storage for that.
> So looking at the raw space usage, we can ignore that.
>
> All the rest is used for the ceph_data pool. So the ceph_data pool, in terms
> of raw storage, is about 50 % used.
>
> But in terms of storage shown for that pool, it's almost 63 % %USED.
> So I guess this can purely be from bad balancing, correct?
>
> Cheers,
> Oliver
>
> Am 20.10.18 um 19:49 schrieb Janne Johansson:
> > Do mind that drives may have more than one pool on them, so RAW space
> > is what it says, how much free space there is. Then the avail and
> > %USED on per-pool stats will take replication into account, it can
> > tell how much data you may write into that particular pool, given that
> > pool's replication or EC settings.
> >
> > Den lör 20 okt. 2018 kl 19:09 skrev Oliver Freyermuth
> > :
> >>
> >> Dear Cephalopodians,
> >>
> >> as many others, I'm also a bit confused by "ceph df" output
> >> in a pretty straightforward configuration.
> >>
> >> We have a CephFS (12.2.7) running, with 4+2 EC profile.
> >>
> >> I get:
> >>
> >> # ceph df
> >> GLOBAL:
> >>     SIZE AVAIL RAW USED %RAW USED
> >>     824T 410T  414T     50.26
> >> POOLS:
> >>     NAME            ID USED %USED MAX AVAIL OBJECTS
> >>     cephfs_metadata 1  452M 0.05  860G      365774
> >>     cephfs_data     2  275T 62.68 164T      75056403
> >>
> >> So about 50 % of raw space is used, but already ~63 % of filesystem space
> >> is used.
> >> Is this purely from imperfect balancing?
> >> In "ceph osd df", I do indeed see OSD usages spreading from 65.02 % usage
> >> down to 37.12 %.
> >>
> >> We did not yet use the balancer plugin.
> >> We don't have any pre-luminous clients.
> >> In that setup, I take it that "upmap" mode would be recommended - correct?
> >> Any "gotchas" using that on luminous?
> >>
> >> Cheers,
> >> Oliver

--
May the most significant bit of your life be positive.
Re: [ceph-users] ceph df space usage confusion - balancing needed?
Dear Janne,

yes, of course. But since we only have two pools here, this cannot explain the
difference.
The metadata is replicated (3 copies) across ssd drives, and we have < 3 TB of
total raw storage for that.
So looking at the raw space usage, we can ignore that.

All the rest is used for the ceph_data pool. So the ceph_data pool, in terms of
raw storage, is about 50 % used.

But in terms of storage shown for that pool, it's almost 63 % %USED.
So I guess this can purely be from bad balancing, correct? (I have tried the
back-of-the-envelope arithmetic below, after the quoted part.)

Cheers,
Oliver

Am 20.10.18 um 19:49 schrieb Janne Johansson:
> Do mind that drives may have more than one pool on them, so RAW space
> is what it says, how much free space there is. Then the avail and
> %USED on per-pool stats will take replication into account, it can
> tell how much data you may write into that particular pool, given that
> pool's replication or EC settings.
>
> Den lör 20 okt. 2018 kl 19:09 skrev Oliver Freyermuth
> :
>>
>> Dear Cephalopodians,
>>
>> as many others, I'm also a bit confused by "ceph df" output
>> in a pretty straightforward configuration.
>>
>> We have a CephFS (12.2.7) running, with 4+2 EC profile.
>>
>> I get:
>>
>> # ceph df
>> GLOBAL:
>>     SIZE AVAIL RAW USED %RAW USED
>>     824T 410T  414T     50.26
>> POOLS:
>>     NAME            ID USED %USED MAX AVAIL OBJECTS
>>     cephfs_metadata 1  452M 0.05  860G      365774
>>     cephfs_data     2  275T 62.68 164T      75056403
>>
>> So about 50 % of raw space is used, but already ~63 % of filesystem space
>> is used.
>> Is this purely from imperfect balancing?
>> In "ceph osd df", I do indeed see OSD usages spreading from 65.02 % usage
>> down to 37.12 %.
>>
>> We did not yet use the balancer plugin.
>> We don't have any pre-luminous clients.
>> In that setup, I take it that "upmap" mode would be recommended - correct?
>> Any "gotchas" using that on luminous?
>>
>> Cheers,
>> Oliver
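P.S.: The promised back-of-the-envelope check on our numbers (hedged, this is
my reading of how "ceph df" derives these values, not an authoritative
description):

- RAW USED: 275 TiB of data in a k=4, m=2 pool occupies 275 * 6/4 = 412.5 TiB
  of raw space, which matches the 414 TiB RAW USED nicely (the remainder being
  metadata and overhead).
- %USED: 275 / (275 + 164) = 62.6 %, so the reported 62.68 % is apparently just
  USED / (USED + MAX AVAIL).
- MAX AVAIL seems to be derived from the OSD that will fill up first (ours is
  already at 65.02 %), not from the average. With perfect balance, MAX AVAIL
  should be roughly the 410 TiB of raw AVAIL times 4/6 ≈ 273 TiB instead of
  164 TiB, and %USED would then be 275 / (275 + 273) ≈ 50 %, in line with the
  raw usage.

So the discrepancy really does seem to come purely from the uneven PG
distribution.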
Re: [ceph-users] ceph df space usage confusion - balancing needed?
Do mind that drives may have more than one pool on them, so RAW space is what
it says: how much free space there is overall. The AVAIL and %USED in the
per-pool stats, on the other hand, take replication into account: they tell
you how much data you may still write into that particular pool, given that
pool's replication or EC settings.

Den lör 20 okt. 2018 kl 19:09 skrev Oliver Freyermuth :
>
> Dear Cephalopodians,
>
> as many others, I'm also a bit confused by "ceph df" output
> in a pretty straightforward configuration.
>
> We have a CephFS (12.2.7) running, with 4+2 EC profile.
>
> I get:
>
> # ceph df
> GLOBAL:
>     SIZE AVAIL RAW USED %RAW USED
>     824T 410T  414T     50.26
> POOLS:
>     NAME            ID USED %USED MAX AVAIL OBJECTS
>     cephfs_metadata 1  452M 0.05  860G      365774
>     cephfs_data     2  275T 62.68 164T      75056403
>
> So about 50 % of raw space is used, but already ~63 % of filesystem space
> is used.
> Is this purely from imperfect balancing?
> In "ceph osd df", I do indeed see OSD usages spreading from 65.02 % usage
> down to 37.12 %.
>
> We did not yet use the balancer plugin.
> We don't have any pre-luminous clients.
> In that setup, I take it that "upmap" mode would be recommended - correct?
> Any "gotchas" using that on luminous?
>
> Cheers,
> Oliver

--
May the most significant bit of your life be positive.
[ceph-users] ceph df space usage confusion - balancing needed?
Dear Cephalopodians,

as many others, I'm also a bit confused by "ceph df" output
in a pretty straightforward configuration.

We have a CephFS (12.2.7) running, with 4+2 EC profile.

I get:

# ceph df
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    824T 410T  414T     50.26
POOLS:
    NAME            ID USED %USED MAX AVAIL OBJECTS
    cephfs_metadata 1  452M 0.05  860G      365774
    cephfs_data     2  275T 62.68 164T      75056403

So about 50 % of raw space is used, but already ~63 % of filesystem space
is used.
Is this purely from imperfect balancing?
In "ceph osd df", I do indeed see OSD usages spreading from 65.02 % usage
down to 37.12 %.

We did not yet use the balancer plugin.
We don't have any pre-luminous clients.
In that setup, I take it that "upmap" mode would be recommended - correct?
Any "gotchas" using that on luminous?

Cheers,
Oliver
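P.S.: For completeness, the numbers above come from looking at the per-OSD
utilization; something like the following should quantify the imbalance
(hedged, the exact output layout may differ per release):

# ceph osd df tree
(per-OSD utilization; the summary line at the bottom shows MIN/MAX VAR and STDDEV)

# ceph mgr module enable balancer
# ceph balancer eval
(even without turning the balancer on, "eval" should give a single score for
the current distribution, lower being better)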