Re: [ceph-users] Ceph stuck creating pool
Hi,

Yes, I can confirm it was a networking problem, not a Ceph problem. After changing the network on the cluster-network virtual NICs, everything started working fine.

Thanks very much for the help,
Guilherme
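For anyone hitting the same symptom, a quick way to double-check the cluster network after a change like this (a sketch; the 10.10.10.x addresses are placeholders for whatever cluster subnet the lab actually uses):

    # "ceph osd dump" prints one line per OSD; each line carries the OSD's
    # public and cluster addresses, and the cluster address must be reachable
    # from every other OSD host.
    ceph osd dump | grep '^osd\.'

    # From each OSD host, ping the cluster-network address of the other hosts
    # (placeholder addresses):
    ping -c 3 10.10.10.11
    ping -c 3 10.10.10.12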
Re: [ceph-users] Ceph stuck creating pool
Hi David,

Yes, I can ping the hosts over the cluster network. This is a test lab built on Hyper-V. I think you are right; there is probably a problem with the cluster network. I will check and let you know the results.

Thanks very much,
Guilherme Lima
Systems Administrator
Re: [ceph-users] Ceph stuck creating pool
My guess is a networking problem. Do you have VLANs, a cluster network vs. public network in ceph.conf, etc. configured? Can you ping between all of your storage nodes on all of their IPs? All of your OSDs communicate with the mons on the public network, but they communicate with each other for peering on the cluster network. My guess is that your public network is working fine, but that your cluster network might be having an issue, causing the new PGs to never be able to peer.
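For reference, the split described above lives in the [global] section of ceph.conf; a minimal sketch (the subnets are placeholders for this lab's actual ranges):

    [global]
    # Clients and mons reach the OSDs on the public network.
    public network  = 192.168.1.0/24
    # OSD-to-OSD traffic (peering, replication) uses the cluster network.
    cluster network = 10.10.10.0/24

If the cluster network is unreachable between OSD hosts, peering stalls exactly as described here while everything on the public network still looks healthy.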
Re: [ceph-users] Ceph stuck creating pool
Here it is:

size: 3
min_size: 2
crush_rule: replicated_rule

[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]

Thanks,
Guilherme
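As a side note, one way to tell a CRUSH mapping problem from a peering problem (a sketch; pool and PG IDs will differ):

    # pgs_brief lists each PG's state and its up/acting OSD sets.
    ceph pg dump pgs_brief | head -20

If the up/acting sets are populated (e.g. [0,7,11]) but the PGs still sit in creating+peering, the rule is mapping replicas fine and the problem is more likely OSD-to-OSD communication.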
Re: [ceph-users] Ceph stuck creating pool
This looks like something wrong with the crush rule.

What's the size, min_size and crush_rule of this pool?

ceph osd pool get POOLNAME size
ceph osd pool get POOLNAME min_size
ceph osd pool get POOLNAME crush_ruleset

How is the crush rule?

ceph osd crush rule dump

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
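If the rule itself is suspect, it can also be exercised offline with crushtool; a sketch, assuming the compiled map is saved to a local file:

    # Grab the compiled CRUSH map from the cluster.
    ceph osd getcrushmap -o crushmap.bin

    # Ask crushtool to map inputs with rule 0 at 3 replicas and report any
    # it cannot place.
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings

No output from --show-bad-mappings means the rule can place three replicas across the hosts, which would point the investigation away from CRUSH.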
[ceph-users] Ceph stuck creating pool
Hi,

I have installed a virtual Ceph cluster lab, using Ceph Luminous v12.2.1. It consists of 3 mon nodes and 3 OSD nodes, and each OSD node has 3 x 250 GB OSDs.

My osd tree:

ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       2.19589 root default
-3       0.73196     host osd1
 0   hdd 0.24399         osd.0      up  1.00000 1.00000
 6   hdd 0.24399         osd.6      up  1.00000 1.00000
 9   hdd 0.24399         osd.9      up  1.00000 1.00000
-5       0.73196     host osd2
 1   hdd 0.24399         osd.1      up  1.00000 1.00000
 7   hdd 0.24399         osd.7      up  1.00000 1.00000
10   hdd 0.24399         osd.10     up  1.00000 1.00000
-7       0.73196     host osd3
 2   hdd 0.24399         osd.2      up  1.00000 1.00000
 8   hdd 0.24399         osd.8      up  1.00000 1.00000
11   hdd 0.24399         osd.11     up  1.00000 1.00000

After creating a new pool, it is stuck in creating+peering and creating+activating.

  cluster:
    id:     d20fdc12-f8bf-45c1-a276-c36dfcc788bc
    health: HEALTH_WARN
            Reduced data availability: 256 pgs inactive, 143 pgs peering
            Degraded data redundancy: 256 pgs unclean

  services:
    mon: 3 daemons, quorum mon2,mon3,mon1
    mgr: mon1(active), standbys: mon2, mon3
    osd: 9 osds: 9 up, 9 in

  data:
    pools:   1 pools, 256 pgs
    objects: 0 objects, 0 bytes
    usage:   10202 MB used, 2239 GB / 2249 GB avail
    pgs:     100.000% pgs not active
             143 creating+peering
             113 creating+activating

Can anyone help me find the issue?

Thanks,
Guilherme
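In case it helps others reading the archive, a few commands that narrow down where PGs like these are stuck (a sketch; the PG ID below is a placeholder, take a real one from the dump_stuck output):

    # Summarise which PGs are inactive and what state they are in.
    ceph health detail
    ceph pg dump_stuck inactive

    # Query one of the stuck PGs; the recovery_state section shows which
    # OSDs it is waiting on during peering (1.2f is a placeholder PG ID).
    ceph pg 1.2f query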