Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

Assuming I understand it correctly: "pg_upmap_items 6.0 [40,20]" refers to
replacing (upmapping?) osd.40 with osd.20 in the acting set of placement
group 6.0. Assuming it's a 3-replica PG, the other two OSDs in the set
remain unchanged from the CRUSH calculation. "pg_upmap_items 6.6
[45,46,59,56]" describes two upmap replacements for PG 6.6: replacing 45
with 46, and 59 with 56.

Hope that helps.

Cheers,
Tom

> -----Original Message-----
> From: ceph-users On Behalf Of jes...@krogh.cc
> Sent: 30 December 2018 22:04
> To: Konstantin Shalygin
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Balancing cluster with large disks - 10TB HDD
>
> [quoted message trimmed - it appears in full as the next message below]
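For anyone who wants to sanity-check that reading on a live cluster,
comparing the upmap entry against the PG's actual mapping should show the
substitution. A sketch (the PG id is taken from the example above; the
output is illustrative, not from this cluster):

$ ceph osd dump | grep 'pg_upmap_items 6.0'
pg_upmap_items 6.0 [40,20]
$ ceph pg map 6.0
osdmap e41045 pg 6.0 (6.0) -> up [20,31,67] acting [20,31,67]

osd.20 sits where CRUSH would have put osd.40; the other two OSDs are
whatever CRUSH computed on its own.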
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

>> I would still like to have a log somewhere to grep and inspect what the
>> balancer/upmap actually does - and when - in my cluster. Or some ceph
>> commands that deliver some monitoring capabilities .. any suggestions?
>
> Yes, in the ceph-mgr log, when the log level is DEBUG.

Tried the docs .. something like:
ceph tell mds ... does not seem to work.
http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/

> You can get your cluster's upmaps via `ceph osd dump | grep upmap`.

Got it -- but I really need the README .. it shows the map ..
...
pg_upmap_items 6.0 [40,20]
pg_upmap_items 6.1 [59,57,47,48]
pg_upmap_items 6.2 [59,55,75,9]
pg_upmap_items 6.3 [22,13,40,39]
pg_upmap_items 6.4 [23,9]
pg_upmap_items 6.5 [25,17]
pg_upmap_items 6.6 [45,46,59,56]
pg_upmap_items 6.8 [60,54,16,68]
pg_upmap_items 6.9 [61,69]
pg_upmap_items 6.a [51,48]
pg_upmap_items 6.b [43,71,41,29]
pg_upmap_items 6.c [22,13]
..

But .. I don't have any PGs that should have only 2 replicas .. nor any
with 4 .. how should this be interpreted?

Thanks.

--
Jesper
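A possible route to those logs, sketched with an assumed mgr id (the
balancer lives in ceph-mgr, not in the MDS, which is why `ceph tell mds`
leads nowhere here). On the active mgr host, with default log paths:

$ ceph daemon mgr.bison config set debug_mgr 4/5
$ grep -i balancer /var/log/ceph/ceph-mgr.bison.log

"bison" is only a guess at the mgr name here - substitute your own; `ceph
daemon` has to run on the host that holds that mgr's admin socket.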
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

On 12/30/18 6:48 PM, Marc Roos wrote:
> You mean the values in the reweight column or the weight column? Because
> from the commands in this thread I am assuming the weight column. Does
> this mean that the upmap is handling disk sizes automatically?

Reweight, not weight. Weight is the weight of a bucket. Reweight is "I
have unbalanced buckets, so I need a local adjustment". Upmap is not about
disk size; please consult these slides from Dan [1].

> Currently I am using the balancer (turned off) in crush-compat mode and
> have a few 8TB disks mixed with 4TB disks.

upmap balancing mode doesn't balance by size; it balances by PGs.

k

[1] https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer
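Since upmap optimizes the PG distribution rather than bytes, the
balancer's own scoring is the thing to watch. A small sketch, assuming the
Luminous balancer module is enabled (the score value is illustrative):

$ ceph balancer eval
current cluster score 0.013255 (lower is better)

`ceph balancer eval <pool>` scores a single pool the same way.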
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

>> 4. Revert all your reweights.
>
> Done

You mean the values in the reweight column or the weight column? Because
from the commands in this thread I am assuming the weight column. Does
this mean that the upmap is handling disk sizes automatically? Currently I
am using the balancer (turned off) in crush-compat mode and have a few 8TB
disks mixed with 4TB disks.

ID CLASS WEIGHT    TYPE NAME    STATUS REWEIGHT PRI-AFF
-1       120.64897 root default
-2        30.48000     host c01
 0   hdd   8.0            osd.0    up  1.0      1.0
 3   hdd   3.0            osd.3    up  0.86809  1.0
 6   hdd   4.0            osd.6    up  1.0      1.0
 7   hdd   3.0            osd.7    up  0.86809  1.0
 8   hdd   4.0            osd.8    up  1.0      1.0

# sets the weight (mostly disk size, first column):
ceph osd crush reweight osd.6 4

# sets the reweight (2nd column):
ceph osd reweight osd.6 1
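For what it's worth, the two columns compose: CRUSH distributes data by
the crush weight, and the 0-1 reweight then probabilistically rejects that
fraction of placements, so the effective share is roughly weight x
reweight. Taking two rows from the tree above:

osd.3: 3.0 x 0.86809 -> effective ~2.60
osd.0: 8.0 x 1.0     -> effective  8.00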
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

> I would still like to have a log somewhere to grep and inspect what the
> balancer/upmap actually does - and when - in my cluster. Or some ceph
> commands that deliver some monitoring capabilities .. any suggestions?

Yes, in the ceph-mgr log, when the log level is DEBUG.

You can get your cluster's upmaps via `ceph osd dump | grep upmap`.

k
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

Hi.

.. Just an update - this looks awesome .. and in an 8x5 company, Christmas
is a good period to rebalance a cluster :-)

>> I'll try it out again - last I tried it complained about older clients -
>> it should be better now.
>
> upmap is supported since kernel 4.13.
>
>> Second - should the reweights be set back to 1 then?
>
> Yes, also:
>
> 1. `ceph osd crush tunables optimal`

Done

> 2. All your buckets should be straw2, but in case: `ceph osd crush
> set-all-straw-buckets-to-straw2`

Done

> 3. Your hosts are imbalanced: elefant & capone have only eight 10TB's,
> the other hosts - 12. So I recommend replacing the 8TB spinners with
> 10TB, or just shuffling them between hosts, like 2x8TB + 10x10TB.

Yes, we initially thought we could go with 3 osd-hosts .. but then found
out that EC-pools required more -- and then added.

> 4. Revert all your reweights.

Done

> 5. Let the balancer do its work: `ceph balancer mode upmap`, `ceph
> balancer on`.

So far - works awesome --

$ sudo qms/server_documentation/ceph/ceph-osd-data-distribution hdd
hdd
    N   Min    Max    Median  Avg        Stddev
x   72  50.82  55.65  52.88   52.916944  1.0002586

As compared to the best I got with reweighting:

$ sudo qms/server_documentation/ceph/ceph-osd-data-distribution hdd
hdd
    N   Min    Max    Median  Avg        Stddev
x   72  45.36  54.98  52.63   52.131944  2.0746672

It took about 24 hours to rebalance -- and moved quite some TBs around.

I would still like to have a log somewhere to grep and inspect what the
balancer/upmap actually does - and when - in my cluster. Or some ceph
commands that deliver some monitoring capabilities .. any suggestions?

--
Jesper
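In the meantime, the imbalance itself can be watched without the private
script used above. A hedged one-liner (it assumes the Luminous `ceph osd
df` column layout, with %USE in column 8):

$ ceph osd df | awk '$2 == "hdd" { n++; s += $8; ss += $8*$8 }
    END { if (n) printf "n=%d avg=%.2f stddev=%.4f\n", n, s/n, sqrt(ss/n - (s/n)^2) }'

That reproduces the Avg/Stddev figures for the hdd class only.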
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

> I'll try it out again - last I tried it complained about older clients -
> it should be better now.

upmap is supported since kernel 4.13.

> Second - should the reweights be set back to 1 then?

Yes, also:

1. `ceph osd crush tunables optimal`
2. All your buckets should be straw2, but in case: `ceph osd crush
   set-all-straw-buckets-to-straw2`
3. Your hosts are imbalanced: elefant & capone have only eight 10TB's, the
   other hosts - 12. So I recommend replacing the 8TB spinners with 10TB,
   or just shuffling them between hosts, like 2x8TB + 10x10TB.
4. Revert all your reweights (a loop for this step is sketched below).
5. Let the balancer do its work: `ceph balancer mode upmap`, `ceph
   balancer on`.

k
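A minimal sketch of step 4, assuming every override should go back to 1.0
(expect data movement while CRUSH re-settles, so consider doing it before
turning the balancer on):

$ for id in $(ceph osd ls); do ceph osd reweight "$id" 1.0; done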
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

> Have a look at this thread on the mailing list:
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg46506.html

Ok, done .. how do I see that it actually works?

Second - should the reweights be set back to 1 then?

Jesper
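One way to watch it, assuming the mgr balancer module is active (output
shapes are illustrative of Luminous):

$ ceph balancer status
{
    "active": true,
    "plans": [],
    "mode": "upmap"
}
$ ceph osd dump | grep -c pg_upmap
12

The second command counts the upmap exceptions the balancer has injected
into the osdmap; that number growing (and `ceph -s` showing objects
misplaced and then recovering) is the balancer at work.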
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

On mik, 2018-12-26 at 16:30 +0100, jes...@krogh.cc wrote:
> $ sudo ceph osd set-require-min-compat-client luminous
> Error EPERM: cannot set require_min_compat_client to luminous: 54
> connected client(s) look like jewel (missing 0x800); add
> --yes-i-really-mean-it to do it anyway
>
> We've standardized on the 4.15 kernel client on all CephFS clients,
> those are the 54 - would it be safe to ignore the above warning?
> Otherwise - which kernel do I need to go to?

Have a look at this thread on the mailing list:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg46506.html
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

> On mik, 2018-12-26 at 13:14 +0100, jes...@krogh.cc wrote:
>> Thanks for the insight and links.
>>
>> > As I can see you are on Luminous. Since Luminous the Balancer plugin
>> > is available [1]; you should use it instead of reweights, especially
>> > in upmap mode [2]
>>
>> I'll try it out again - last I tried it complained about older clients -
>> it should be better now.
>
> require_min_compat_client luminous is required for you to take advantage
> of upmap.

$ sudo ceph osd set-require-min-compat-client luminous
Error EPERM: cannot set require_min_compat_client to luminous: 54
connected client(s) look like jewel (missing 0x800); add
--yes-i-really-mean-it to do it anyway

We've standardized on the 4.15 kernel client on all CephFS clients, those
are the 54 - would it be safe to ignore the above warning? Otherwise -
which kernel do I need to go to?
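`ceph features` breaks down what the cluster believes each connected
client supports, which helps judge whether the warning is ignorable
(output trimmed and illustrative; the feature-bit value is an assumption,
not taken from this cluster):

$ ceph features
{
    ...
    "client": {
        "group": {
            "features": "0x7010fb86aa42ada",
            "release": "jewel",
            "num": 54
        }
    }
}

Recent kernel clients can report a jewel-era feature set while still
understanding upmap (upmap is supported since kernel 4.13, per elsewhere
in this thread) - the mail-archive thread linked in the reply above
discusses exactly that situation.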
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

On mik, 2018-12-26 at 13:14 +0100, jes...@krogh.cc wrote:
> Thanks for the insight and links.
>
> > As I can see you are on Luminous. Since Luminous the Balancer plugin
> > is available [1]; you should use it instead of reweights, especially
> > in upmap mode [2]
>
> I'll try it out again - last I tried it complained about older clients -
> it should be better now.

require_min_compat_client luminous is required for you to take advantage
of upmap.

> [crush tunables, rule dump and pool details trimmed - they appear in
> full in the original message below]
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

Thanks for the insight and links.

> As I can see you are on Luminous. Since Luminous the Balancer plugin is
> available [1]; you should use it instead of reweights, especially in
> upmap mode [2]

I'll try it out again - last I tried it complained about older clients -
it should be better now.

> Also, maybe I can catch other crush mistakes: can I see `ceph osd crush
> show-tunables`, `ceph osd crush rule dump`, `ceph osd pool ls detail`?

Here:

$ sudo ceph osd crush show-tunables
{
    "choose_local_tries": 0,
    "choose_local_fallback_tries": 0,
    "choose_total_tries": 50,
    "chooseleaf_descend_once": 1,
    "chooseleaf_vary_r": 1,
    "chooseleaf_stable": 0,
    "straw_calc_version": 1,
    "allowed_bucket_algs": 54,
    "profile": "hammer",
    "optimal_tunables": 0,
    "legacy_tunables": 0,
    "minimum_required_version": "hammer",
    "require_feature_tunables": 1,
    "require_feature_tunables2": 1,
    "has_v2_rules": 1,
    "require_feature_tunables3": 1,
    "has_v3_rules": 0,
    "has_v4_buckets": 1,
    "require_feature_tunables5": 0,
    "has_v5_rules": 0
}

$ sudo ceph osd crush rule dump
[
    {
        "rule_id": 0,
        "rule_name": "replicated_ruleset_hdd",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "replicated_ruleset_hdd_fast",
        "ruleset": 1,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -28,
                "item_name": "default~hdd_fast"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 2,
        "rule_name": "replicated_ruleset_ssd",
        "ruleset": 2,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -21,
                "item_name": "default~ssd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 3,
        "rule_name": "cephfs_data_ec42",
        "ruleset": 3,
        "type": 3,
        "min_size": 3,
        "max_size": 6,
        "steps": [
            {
                "op": "set_chooseleaf_tries",
                "num": 5
            },
            {
                "op": "set_choose_tries",
                "num": 100
            },
            {
                "op": "take",
                "item": -1,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_indep",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]

$ sudo ceph osd pool ls detail
pool 6 'kube' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 41045 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]
pool 15 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 41045 flags hashpspool stripe_width 0 application rgw
pool 17 'default.rgw.users.keys' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 last_change 41045 lfor 0/36590 flags hashpspool stripe_width 0 application rgw
pool 18 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 last_change 41045 lfor 0/36595 flags hashpspool stripe_width 0 application rgw
pool 19 'default.rgw.users.uid' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 last_change 41045 lfor 0/36608 flags hashpspool stripe_width 0 application rgw
pool 20 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 41045 flags hashpspool stripe_width 0 application rbd
pool 26 'default.rgw.data.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 41045 flags hashpspool stripe_width 0 application rgw
pool 27 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 41045 flags hashpspool stripe_width 0 application rgw
pool 28 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

> Please, paste your `ceph osd df tree` and `ceph osd dump | head -n 12`.

$ sudo ceph osd df tree
ID  CLASS    WEIGHT    REWEIGHT SIZE  USE    AVAIL  %USE  VAR  PGS TYPE NAME
 -8          639.98883        - 639T  327T   312T   51.24 1.00   - root default
-10          111.73999        - 111T  58509G 55915G 51.13 1.00   -     host bison
 78 hdd_fast   0.90900  1.0      930G  1123M   929G  0.12 0.00   0         osd.78
 79 hdd_fast   0.81799  1.0      837G  1123M   836G  0.13 0.00   0         osd.79
 20 hdd        9.09499  0.95000 9313G  4980G  4333G 53.47 1.04 204         osd.20
 28 hdd        9.09499  1.0     9313G  4612G  4700G 49.53 0.97 200         osd.28
 29 hdd        9.09499  1.0     9313G  4848G  4465G 52.05 1.02 211         osd.29
 33 hdd        9.09499  1.0     9313G  4759G  4553G 51.10 1.00 207         osd.33
 34 hdd        9.09499  1.0     9313G  4613G  4699G 49.54 0.97 195         osd.34
 35 hdd        9.09499  0.89250 9313G  4954G  4359G 53.19 1.04 206         osd.35
 36 hdd        9.09499  1.0     9313G  4724G  4588G 50.73 0.99 200         osd.36
 37 hdd        9.09499  1.0     9313G  5013G  4300G 53.83 1.05 214         osd.37
 38 hdd        9.09499  0.92110 9313G  4962G  4350G 53.28 1.04 206         osd.38
 39 hdd        9.09499  1.0     9313G  4960G  4353G 53.26 1.04 214         osd.39
 40 hdd        9.09499  1.0     9313G  5022G  4291G 53.92 1.05 216         osd.40
 41 hdd        9.09499  0.88235 9313G  5037G  4276G 54.09 1.06 203         osd.41
  7 ssd        0.87299  1.0      893G 18906M   875G  2.07 0.04 124         osd.7
 -7          102.74084        - 102T  54402G 50805G 51.71 1.01   -     host bonnie
  0 hdd        7.27699  0.87642 7451G  4191G  3259G 56.25 1.10 175         osd.0
  1 hdd        7.27699  0.86200 7451G  3837G  3614G 51.49 1.01 163         osd.1
  2 hdd        7.27699  0.74664 7451G  3920G  3531G 52.61 1.03 169         osd.2
 11 hdd        7.27699  0.77840 7451G  3983G  3467G 53.46 1.04 169         osd.11
 13 hdd        9.09499  0.76595 9313G  4894G  4419G 52.55 1.03 201         osd.13
 14 hdd        9.09499  1.0     9313G  4350G  4963G 46.71 0.91 189         osd.14
 16 hdd        9.09499  0.92635 9313G  4879G  4434G 52.39 1.02 204         osd.16
 18 hdd        9.09499  0.67932 9313G  4634G  4678G 49.76 0.97 190         osd.18
 22 hdd        9.09499  0.93053 9313G  5085G  4228G 54.60 1.07 218         osd.22
 31 hdd        9.09499  0.88536 9313G  5152G  4160G 55.33 1.08 221         osd.31
 42 hdd        9.09499  0.84232 9313G  4796G  4516G 51.51 1.01 199         osd.42
 43 hdd        9.09499  0.87662 9313G  4656G  4657G 50.00 0.98 191         osd.43
  6 ssd        0.87299  1.0      894G 20643M   874G  2.25 0.04 134         osd.6
 -6          102.74100        - 102T  53627G 51580G 50.97 0.99   -     host capone
  3 hdd        7.27699  0.84938 7451G  4028G  3422G 54.07 1.06 171         osd.3
  4 hdd        7.27699  0.83890 7451G  3909G  3542G 52.46 1.02 167         osd.4
  5 hdd        7.27699  1.0     7451G  3389G  4061G 45.49 0.89 151         osd.5
  9 hdd        7.27699  1.0     7451G  3710G  3740G 49.80 0.97 161         osd.9
 15 hdd        9.09499  1.0     9313G  4952G  4360G 53.18 1.04 206         osd.15
 17 hdd        9.09499  0.95000 9313G  4865G  4448G 52.24 1.02 202         osd.17
 23 hdd        9.09499  1.0     9313G  4984G  4329G 53.52 1.04 223         osd.23
 24 hdd        9.09499  1.0     9313G  4847G  4466G 52.05 1.02 202         osd.24
 25 hdd        9.09499  0.89929 9313G  4909G  4404G 52.71 1.03 205         osd.25
 30 hdd        9.09499  0.92787 9313G  4740G  4573G 50.90 0.99 202         osd.30
 74 hdd        9.09499  0.93146 9313G  4709G  4603G 50.57 0.99 199         osd.74
 75 hdd        9.09499  1.0     9313G  4559G  4753G 48.96 0.96 194         osd.75
  8 ssd        0.87299  1.0      893G 19593M   874G  2.14 0.04 129         osd.8
-16          102.74100        - 102T  53985G 51222G 51.31 1.00   -     host elefant
 19 hdd        7.27699  1.0     7451G  3665G  3786G 49.19 0.96 152         osd.19
 21 hdd        7.27699  0.89539 7451G  4102G  3349G 55.05 1.07 169         osd.21
 64 hdd        7.27699  0.89275 7451G  3956G  3494G 53.10 1.04 171         osd.64
 65 hdd        7.27699  0.92513 7451G  3976G  3475G 53.36 1.04 171         osd.65
 66 hdd        9.09499  1.0     9313G  4674G  4638G 50.20 0.98 199         osd.66
 67 hdd        9.09499  1.0     9313G  4737G  4575G 50.87 0.99 201         osd.67
 68 hdd        9.09499  0.89973 9313G  4946G  4366G 53.11 1.04 211         osd.68
 69 hdd        9.09499  1.0     9313G  4648G  4665G 49.91 0.97 204         osd.69
 70 hdd        9.09499  0.89526 9313G  4907G  4405G 52.69 1.03 209         osd.70
 71 hdd        9.09499  0.84923 9313G  4690G  4622G 50.37 0.98 198         osd.71
 72 hdd        9.09499  0.87547 9313G  4976G  4336G 53.43 1.04 211         osd.72
 73 hdd        9.09499  1.0     9313G  4683G  4630G 50.29 0.98 200         osd.73
 10 ssd        0.87299  1.0      893G 19158M   875G  2.09 0.04 126         osd.10
-14          110.01300        - 110T  58498G 54157G 51.93 1.01   -     host flodhest
 27 hdd        9.09499  1.0     9313G  4602G  4710G 49.42 0.96 199         osd.27
 32 hdd        9.09499  0.92557 9313G  5028G  4285G 53.99 1.05 215         osd.32
 54 hdd        9.09499  0.90724 9313G  4897G  4415G 52.59 1.03 203         osd.54
 55 hdd        9.09499  1.0     9313G  4867G  4446G 52.26 1.02 198         osd.55
 56 hdd        9.09499  1.0     9313G  4827G  4485G 51.84 1.01 202         osd.56
Re: [ceph-users] Balancing cluster with large disks - 10TB HDD

> We hit an OSD_FULL last week on our cluster - with an average
> utilization of less than 50% .. thus hugely imbalanced. This has driven
> us to go for adjusting pg's upwards and reweighting the osd's more
> aggressively.
>
> Question: What do people see as an "acceptable" variance across OSDs?
>
>     N   Min    Max    Median  Avg        Stddev
> x   72  45.49  56.25  52.35   51.878889  2.1764343
>
> 72 x 10TB drives. It seems hard to get further down -- thus churn will
> most likely make it hard for us to stay at this level.
>
> Currently we have ~158 PGs / OSD .. which by my math gives 63GB/pg if
> they were fully utilizing the disk - which leads me to think that
> somewhat smaller pg's would give the balancing an easier job. Would it
> be ok to go closer to 300 PGs/OSD? - would it be sane? I can see that
> the default max is 300, but I have a hard time finding out if this is
> "recommendable" or just a "tunable".
>
> * We've now seen OSD_FULL trigger irrecoverable kernel bugs in the
> CephFS kernel client on our 4.15 kernels - multiple times - a forced
> reboot is the only way out. We're on the Ubuntu kernels .. I haven't
> done the diff to upstream (yet) and I don't intend to run our production
> cluster disk-full anywhere in the near future to test it out.

Please, paste your `ceph osd df tree` and `ceph osd dump | head -n 12`.

k
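A hedged back-of-envelope for the 300-PGs/OSD question (numbers from the
message above; this assumes the 72 HDD OSDs and replicated size 3 pools -
EC pools count k+m shards per PG instead):

  PG replicas per OSD = sum over pools (pg_num x copies) / number of OSDs

  today:  ~158 replicas/OSD x 72 OSDs / 3 copies = ~3800 PGs in total
  target:  300 replicas/OSD x 72 OSDs / 3 copies =  7200 PGs in total

so roughly a doubling of pg_num across the big pools. At ~50% utilization
of a 10TB drive that also halves the data per PG, from about 31GB to about
16GB, which gives any balancing mechanism finer-grained pieces to move.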